Recent Releases of ojibwemorph
ojibwemorph - OjibweMorph-v0.1.0
This is a pre-release version! It is very close to the planned "initial release" version, with almost all major features now implemented.
What’s Inside
- Zip files of the three repos used to create this FST:
  - OjibweLexicon, with lexical information in Ojibwe
  - OjibweMorph, with morphological information in Ojibwe
  - ParserTools, with the FST-building infrastructure
- Inside OjibweMorph.zip, the FST is already built for you! OjibweMorph/FST/generated/ojibwe.fomabin (or, as an alternate format, ojibwe.att)
Installation
There are two installation paths:
- The simplest route to using the FST: download the pre-generated FST and perform a small amount of set-up.
- The more involved route: get all the necessary pieces and generate the FST yourself.
Note for Windows users: Due to some complications with compiling the FST on Windows, we recommend you take the easy route and avoid building on your own system.
The Easy Route
- Download the pre-generated FST and related files: OjibweMorph.zip
- Install foma on your system. This program will be used to run the FST file. Some brief instructions for this are provided in our ParserTools repository.
The Build-It-Yourself Route
All the instructions for building the FST yourself are provided in the OjibweMorph documentation. Essentially, you will clone the three repos that together create the FST: OjibweLexicon, OjibweMorph, and ParserTools. You'll be directed to hop over to the ParserTools docs as well to get set up to use the code in that repository.
Regardless of which installation method you used, you can then check out these instructions in OjibweMorph on how to use the FST!
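As a rough illustration of what using the FST can look like from Python, the sketch below pipes word forms through foma's `flookup` utility and groups its tab-separated output by input form. It assumes `flookup` is on your PATH and that the zip file was unpacked so the FST sits at the path shown. The helper names `parse_flookup_output` and `analyze` are hypothetical and are not part of OjibweMorph or ParserTools; the OjibweMorph instructions remain the authoritative guide.

```python
import subprocess

# Path inside OjibweMorph.zip (adjust to wherever you unzipped it).
FST_PATH = "OjibweMorph/FST/generated/ojibwe.fomabin"


def parse_flookup_output(text):
    """Group flookup's 'form<TAB>analysis' lines by input form.

    flookup prints '+?' in the analysis column when the FST has no result,
    so such forms are kept with an empty list of analyses.
    """
    results = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate the results for each input
        form, _, analysis = line.partition("\t")
        analyses = results.setdefault(form, [])
        if analysis and analysis != "+?":
            analyses.append(analysis)
    return results


def analyze(forms, fst_path=FST_PATH):
    """Run flookup over a list of word forms (requires foma to be installed)."""
    completed = subprocess.run(
        ["flookup", fst_path],
        input="\n".join(forms) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_flookup_output(completed.stdout)
```

With foma installed, `analyze(["ninzhiishiibiman"])` would return a dictionary mapping that form to whatever analyses the FST produces for it.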
What’s new
Almost everything! The verbs and nouns have been updated and further honed, and all other parts of speech have been added.
Major features
- Verb inflection
  - Paradigms: VAI, VII, VTA, VTI, VAIO
  - Orders: Independent, conjunct, and imperative, as well as changed conjunct
  - Modes: Neutral, preterit, dubitative
  - All known inflectional classes for each paradigm
  - Handling of reflexive and reciprocal restrictions, plural-only verb restrictions, and irregular verb izhi
- Noun inflection
  - Paradigms: NA, NI, NAD, NID
  - All known inflectional classes for each paradigm
  - Forms for possession, diminutives, pejorative, preterit, and basic suffixes, both individually and in possible stacked combinations
  - Handling of irregular forms such as irregular diminutives
- All types of preverbs and prenouns handled, including directional, lexical, quantificational, relative, subordinating, and tense
- Handling of other parts of speech including adverbs, numerals, particles, and pronouns
- Lexical counts (not including functional elements):
  - 17,067 verb stems
  - 3,323 noun stems
  - 713 adverbs
  - 523 numerals
  - 166 lexical prenouns/preverbs
  - 121 pronouns
  - 81 particles
  - 38 proper nouns
- Basic derivational processes, including -magad augment, reflexive forms, and reciprocal forms, with functionality to add more as needed
- Ability to add multiple lexical sources and custom sets of stems/lemmas, including excluding specific forms as needed
- Tests
  - Spreadsheet-based (paradigm) tests
    - These are a smaller set of tests which check that the inflected forms in the OjibweMorph spreadsheets are given the correct analysis by the FST.
  - OPD-based tests
    - These are a larger set of tests (60,000+ forms) which check that the inflected forms in the Ojibwe People's Dictionary (OPD) are given the correct analysis by the FST.
  - Corpus-based tests
    - These tests check that the FST outputs an analysis (though not necessarily the right one) for inflected forms found in the example sentences provided in the OPD.
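The test types differ mainly in their pass criterion. The sketch below contrasts the two criteria, assuming the FST's output has already been collected into a dictionary from word form to list of analyses, and assuming "correct" means the expected analysis appears among the outputs. The function names and sample data are hypothetical and do not reproduce the actual test code.

```python
def has_correct_analysis(fst_output, form, expected):
    """Spreadsheet/OPD-style check: the expected analysis must be among the outputs."""
    return expected in fst_output.get(form, [])


def has_any_analysis(fst_output, form):
    """Corpus-style check: any analysis at all counts as a pass."""
    return len(fst_output.get(form, [])) > 0


# Hypothetical FST output for illustration: form -> list of analyses.
fst_output = {
    "ninzhiishiibiman": ["zhiishiib+NA+1SgPoss+Poss+ObvSg"],
    "notaword": [],
}

# Gold pairs of (form, expected analysis), as a spreadsheet or the OPD would supply.
gold = [("ninzhiishiibiman", "zhiishiib+NA+1SgPoss+Poss+ObvSg")]

passed = sum(has_correct_analysis(fst_output, form, expected) for form, expected in gold)
coverage = sum(has_any_analysis(fst_output, form) for form in fst_output) / len(fst_output)
print(f"{passed}/{len(gold)} gold forms correct, coverage {coverage:.0%}")
```

The corpus-style check is deliberately weaker: it measures coverage on running text where no gold analysis is available.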
Beta features
There are a number of features that are in early stages of development, or that have not been fully tested and vetted:
- Participle forms are incomplete
- Some uncommon argument combinations are missing
Not-yet-implemented
These are features that still need to be added, and are on the docket for future releases:
- Clitic support (e.g. writing dash as -sh when it is reduced)
- Proper name inflection (obviative and preterit forms)
- Singular-only restriction on impersonal VIIs
- Singular-only restriction on mass nouns
- Distributive locative forms for nouns
- Preterit-dubitative forms for verbs
- Productive reduplication rules
- Suffix-internal morpheme boundary markings
Published by anna-stacey about 1 year ago
ojibwemorph - OjibweMorph-v0.0.1
This release supports:
- Nouns (full set of forms for 17 model lexemes)
- Verbs (full set of forms for 15279 verb lexemes from the OPD dictionary)
- A subset of all preverbs (subordinate, tense & directional)
The analysis of a verb form like "gibimi-ookwandanziinaawaaban" looks like:
PVDir/bimi+bookwandan+VTI+Ind+Neg+Prt+2PlSubj+0SgObj
Here PVDir/bimi+, +VTI, +Ind, +Neg, +Prt, +2PlSubj, and +0SgObj are multi-character symbols.
The analysis of a noun form like "ninzhiishiibiman" looks like:
zhiishiib+NA+1SgPoss+Poss+ObvSg
Here +NA, +1SgPoss, +Poss and +ObvSg are multi-character symbols.
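Because the multi-character symbols follow a regular shape, an analysis string can be taken apart with plain string operations. The sketch below is one possible way to do that. It assumes prefix tags end in `+` and are either `ChCnj+` or of the form `PV.../...+`, that suffix tags start with `+`, and that the lemma itself contains no `+`. The function name `split_analysis` is hypothetical.

```python
def split_analysis(analysis):
    """Split an analysis into (prefix tags, lemma, suffix tags)."""
    parts = analysis.split("+")
    prefixes = []
    i = 0
    # Consume leading prefix tags; always leave at least one part for the lemma.
    while i < len(parts) - 1 and (
        parts[i] == "ChCnj" or (parts[i].startswith("PV") and "/" in parts[i])
    ):
        prefixes.append(parts[i] + "+")
        i += 1
    lemma = parts[i]
    suffixes = ["+" + tag for tag in parts[i + 1:]]
    return prefixes, lemma, suffixes


verb = split_analysis("PVDir/bimi+bookwandan+VTI+Ind+Neg+Prt+2PlSubj+0SgObj")
noun = split_analysis("zhiishiib+NA+1SgPoss+Poss+ObvSg")
print(verb)
print(noun)
```

For the two examples above, this yields the lemma `bookwandan` with one preverb tag and six suffix tags, and the lemma `zhiishiib` with no prefix tags and four suffix tags.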
Full list of preverb tags:
ChCnj+
PVDir/ani+ PVDir/awi+ PVDir/baa+ PVDir/babaa+ PVDir/bi+ PVDir/bibaa+
PVDir/biiji+ PVDir/bimi+ PVDir/ni+ PVDir/o+ PVDir/ombi+ PVDir/wi+
PVDir/zaagiji+ PVSub/a+ PVSub/e+ PVSub/gaa+ PVTense/daa+ PVTense/ga+
PVTense/gii+ PVTense/gii'+ PVTense/wii+ PVTense/wii'+
Full list of verb tags by category:
Paradigm: +VAI +VAIO +VAI_Pl +VII +VII_Pl +VTA +VTI
Order: +Cnj +Imp +Ind
Negation: +Neg +Pos
Mode: +Del +Dub +DubPrt +Neu +Prb +Prt +Sim
Subject: +0PlObvSubj +0PlSubj +0SgObvSubj +%0SgSubj +2PlSubj +2SgSubj +2plSubj +3PlObvSubj +3PlProxSubj +3SgObvSubj +3SgProxSubj +ExclSubj +InclSubj +XSubj
Object: +0PlObj +0SgObj +1SgObj +1SgSubj +2PlObj +2SgObj +2plObj +3PlObvObj +3PlProxObj +3SgObvObj +3SgProxObj +ExclObj +InclObj
Alternative (indicates non-Border Lakes forms): +Alt
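The category lists above can be turned into a small lookup table, for instance to label each tag of a verb analysis. This is only a sketch: the table copies the Paradigm, Order, Negation, Mode, and Alternative tags from the lists, while subject and object tags are recognized by their `Subj`/`Obj` ending, which is an assumption made here for brevity rather than anything the FST itself does. The names `VERB_TAG_CATEGORIES` and `categorize_verb_tag` are hypothetical.

```python
# Tag inventories copied from the category lists above.
VERB_TAG_CATEGORIES = {
    "Paradigm": "+VAI +VAIO +VAI_Pl +VII +VII_Pl +VTA +VTI".split(),
    "Order": "+Cnj +Imp +Ind".split(),
    "Negation": "+Neg +Pos".split(),
    "Mode": "+Del +Dub +DubPrt +Neu +Prb +Prt +Sim".split(),
    "Alternative": ["+Alt"],
}


def categorize_verb_tag(tag):
    """Return the category name for a verb tag, or None if it is unknown."""
    for category, tags in VERB_TAG_CATEGORIES.items():
        if tag in tags:
            return category
    # Assumption: person/number tags are identified by their ending.
    if tag.endswith("Subj"):
        return "Subject"
    if tag.endswith("Obj"):
        return "Object"
    return None


tags = "+VTI +Ind +Neg +Prt +2PlSubj +0SgObj".split()
labelled = [(tag, categorize_verb_tag(tag)) for tag in tags]
print(labelled)
```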
Full list of noun tags by category:
Paradigm: +NA +NAD +NI
PersPoss: +1SgPoss +2PlPoss +2SgPoss +3PlObvPoss +3PlProxPoss +3SgObvPoss +3SgProxPoss +ExclPoss +InclPoss
Dim: +Dim
Poss: +Poss
Pej: +Pej
Pret: +Pret
Basic: +Loc +ObvPl +ObvSg +Pl +ProxPl +ProxSg +Sg +Voc
Changes from previous version:
- Added nouns
- Added Subj and Obj identifiers to subject and object tags
- Eliminated compound Sg/Pl tags like +3SgProx/3PlProx. Instead, we have two analyses for the word form, one +3SgProxSubj analysis and one +3PlProxSubj analysis (note also the addition of the Subj identifier).
- Flag diacritics have been eliminated from the compiled FST.
Published by mpsilfve about 2 years ago