One rule per line, each rule has eight fields:
E.g.
cecita z|za ^NCFS NCFS00A 0 0 0 0 -
les * ^V * 0 1 0 1 $$+les:$$+PP
The first line (cecita) states a suffix rule that will be applied to unknown words, to see whether a valid feminine singular noun is obtained when substituting the suffix cecita with z or za. This is the case of crucecita (diminutive of cruz). If such a base form is found, the original word is analyzed as a diminutive suffixed form. No retokenization is performed.
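The lookup performed by the cecita rule can be sketched in Python as follows. This is only an illustration of the idea, not the analyzer's actual implementation: the function name apply_suffix_rule, the toy dictionary, and the tag NCFS000 given to cruz are assumptions made for the example.

```python
import re

# Toy dictionary (hypothetical): form -> list of (lemma, tag) entries.
TOY_DICT = {
    "cruz": [("cruz", "NCFS000")],
}

def apply_suffix_rule(word, suffix, replacements, tag_cond, out_tag, dictionary):
    """Return the (lemma, tag) analyses obtained by substituting `suffix`.

    `replacements` lists the alternatives separated by "|"; "*" stands for
    the empty string. `tag_cond` is a regular expression that the tag of
    the dictionary entry must match. An `out_tag` of "*" keeps that tag.
    """
    if not word.endswith(suffix):
        return []
    root = word[: len(word) - len(suffix)]
    analyses = []
    for repl in replacements.split("|"):
        candidate = root + ("" if repl == "*" else repl)
        for lemma, tag in dictionary.get(candidate, []):
            if re.search(tag_cond, tag):
                analyses.append((lemma, tag if out_tag == "*" else out_tag))
    return analyses

# crucecita - cecita = cru; cru + z = cruz, which is in the toy dictionary.
print(apply_suffix_rule("crucecita", "cecita", "z|za", "^NCFS", "NCFS00A", TOY_DICT))
# prints [('cruz', 'NCFS00A')]
```

The sketch covers only the suffix, its replacements, the tag condition, and the output tag; the numeric flags of the rule are ignored.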
The second rule (les) applies to all words and checks whether a valid verb form is obtained when the suffix les is removed. This is the case for words such as viles (which may mean "I saw them", but is also the plural of the adjective vil). In this case, the retokenization info states that if the verb tag is eventually selected for this word, it may be retokenized into two words: the base verb form (referred to as $$, vi in the example) plus the word les. The tags for these new words are given after the colon: the base form must keep its PoS tag (this is what the second $$ means), and the second word may take any tag starting with PP that it has in the dictionary.
So, the word viles would obtain its adjective analysis from the dictionary, plus its verb + clitic pronoun analysis from the suffix rule:
viles vil AQ0CP0 ver VMIS1S0
The second analysis will carry the retokenization information, so if the PoS tagger eventually selects the VMI analysis (and the TaggerRetokenize option is set), the word will be retokenized into:
vi ver VMIS1S0
les ellos PP3CPD00
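As a rough sketch of how the retokenization info $$+les:$$+PP could be interpreted, consider the Python fragment below. It is an assumption-laden illustration rather than the tagger's real code: the function name retokenize and the toy dictionary are hypothetical, and for simplicity only the first dictionary tag matching the required prefix is used.

```python
# Toy dictionary (hypothetical): form -> list of (lemma, tag) entries.
TOY_DICT = {"les": [("ellos", "PP3CPD00")]}

def retokenize(retok_info, base_form, base_lemma, base_tag, dictionary):
    """Split a word into (form, lemma, tag) tokens following `retok_info`.

    The forms come before the colon and the tags after it, both joined
    by "+". "$$" as a form stands for the base form; "$$" as a tag means
    that the base form keeps its PoS tag. Any other tag is treated as a
    prefix that a dictionary tag of the new word must start with.
    """
    forms, tags = retok_info.split(":")
    tokens = []
    for form, tag in zip(forms.split("+"), tags.split("+")):
        if form == "$$":
            tokens.append((base_form, base_lemma, base_tag if tag == "$$" else tag))
        else:
            for lemma, dict_tag in dictionary.get(form, []):
                if dict_tag.startswith(tag):
                    tokens.append((form, lemma, dict_tag))
                    break  # simplification: keep the first matching tag only
    return tokens

for form, lemma, tag in retokenize("$$+les:$$+PP", "vi", "ver", "VMIS1S0", TOY_DICT):
    print(form, lemma, tag)
# prints:
# vi ver VMIS1S0
# les ellos PP3CPD00
```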
2008-01-24