Usage example

Assuming we have the following input file mytext.txt:

El gato come pescado. Pero a Don Jaime no le gustan los gatos.

we could issue the command:

analyzer -f myconfig.cfg <mytext.txt >mytext.mrf

Let's assume that myconfig.cfg is the file presented in section 2.2.2. Given the options there, the output would be in morfo format (i.e. morphological analysis but no PoS tagging). The expected results are:

El el DA0MS0 1
gato gato NCMS000 1
come comer VMIP3S0 0.75 comer VMM02S0 0.25
pescado pescado NCMS000 0.833333 pescar VMP00SM 0.166667
. . Fp 1
Pero pero CC 0.99878 pero NCMS000 0.00121951 Pero NP00000 0.00121951
a a NCFS000 0.0054008 a SPS00 0.994599
Don_Jaime Don_Jaime NP00000 1
no no NCMS000 0.00231911 no RN 0.997681
le él PP3CSD00 1
gustan gustar VMIP3P0 1
los el DA0MP0 0.975719 lo NCMP000 0.00019425 él PP3MPA00 0.024087
gatos gato NCMP000 1
. . Fp 1
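
To make the layout of these lines explicit, here is a minimal Python sketch that reads one line of morfo output. It assumes, based only on the sample above, that fields are separated by whitespace and that each word form is followed by one or more (lemma, tag, probability) triples. The function names parse_morfo_line and best_analysis are hypothetical helpers, not part of the analyzer.

```python
def parse_morfo_line(line):
    """Split one morfo-format line into (form, [(lemma, tag, prob), ...]).

    Assumption: whitespace-separated fields, the first being the word form,
    followed by lemma/tag/probability triples as in the sample output.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    form, rest = fields[0], fields[1:]
    if not rest or len(rest) % 3 != 0:
        raise ValueError("expected lemma/tag/probability triples: %r" % line)
    analyses = [
        (rest[i], rest[i + 1], float(rest[i + 2]))
        for i in range(0, len(rest), 3)
    ]
    return form, analyses


def best_analysis(analyses):
    """Return the analysis with the highest probability.

    This is only a convenience for inspecting the output; it does not
    reproduce how the PoS tagger chooses a tag in context.
    """
    return max(analyses, key=lambda a: a[2])
```

For instance, parse_morfo_line("come comer VMIP3S0 0.75 comer VMM02S0 0.25") yields the form "come" with two analyses, of which best_analysis selects the VMIP3S0 reading.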

If we also wanted PoS tagging, we could have issued the command:

analyzer -f myconfig.cfg --outf tagged <mytext.txt >mytext.tag

to obtain the tagged output:

El el DA0MS0
gato gato NCMS000
come comer VMIP3S0
pescado pescado NCMS000
. . Fp
Pero pero CC
a a SPS00
Don_Jaime Don_Jaime NP00000
no no RN
le él PP3CSD00
gustan gustar VMIP3P0
los el DA0MP0
gatos gato NCMP000
. . Fp

We can also ask for the synsets of the tagged words:

analyzer -f myconfig.cfg --outf sense --sense=all <mytext.txt >mytext.sen

obtaining the output:

El el DA0MS0
gato gato NCMS000 01630731:07221232:01631653
come comer VMIP3S0 00794578:00793267
pescado pescado NCMS000 05810856:02006311
. . Fp
Pero pero CC
a a SPS00
Don_Jaime Don_Jaime NP00000
no no RN
le él PP3CSD00
gustan gustar VMIP3P0 01244897:01213391:01241953
los el DA0MP0
gatos gato NCMP000 01630731:07221232:01631653
. . Fp
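
The sense output can be read back in a similar way. The sketch below assumes, from the sample alone, that each line holds a form, a lemma and a tag, optionally followed by a fourth field listing synset identifiers separated by colons. The name parse_sense_line is a hypothetical helper. Since the fourth field is optional, the same function would also accept lines of the tagged output.

```python
def parse_sense_line(line):
    """Split one line into (form, lemma, tag, [synset, ...]).

    Assumption: three whitespace-separated fields, plus an optional
    fourth field of colon-separated synset identifiers.
    """
    fields = line.split()
    if len(fields) not in (3, 4):
        raise ValueError("expected 3 or 4 fields: %r" % line)
    form, lemma, tag = fields[:3]
    # Words without senses (articles, punctuation, ...) get an empty list.
    synsets = fields[3].split(":") if len(fields) == 4 else []
    return form, lemma, tag, synsets
```

Synset identifiers are kept as strings so that leading zeros are preserved.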

Alternatively, to avoid repeating the steps already performed, we could feed the output of the morphological analyzer to the tagger:

analyzer -f myconfig.cfg --inpf morfo --outf tagged <mytext.mrf >mytext.tag

See options InputFormat and OutputFormat in section 2.2.1 for details on the valid input and output formats.
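
The two-step run (morphological analysis first, then tagging from the saved morfo file) could also be scripted. The following Python sketch only wraps the command lines shown in this section. It assumes that an analyzer executable is on the PATH and that the file names are those used in this example. The functions analyzer_cmd, run_analyzer and two_step_run are hypothetical helpers.

```python
import subprocess


def analyzer_cmd(config, outf=None, inpf=None):
    """Build the argument list for one analyzer call.

    Only the options used in this section (-f, --inpf, --outf) are handled.
    """
    cmd = ["analyzer", "-f", config]
    if inpf is not None:
        cmd += ["--inpf", inpf]
    if outf is not None:
        cmd += ["--outf", outf]
    return cmd


def run_analyzer(cmd, src, dst):
    """Run cmd with src as standard input and dst as standard output."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        subprocess.run(cmd, stdin=fin, stdout=fout, check=True)


def two_step_run(config="myconfig.cfg"):
    """Analyze mytext.txt, then tag the saved morfo output."""
    # Step 1: output format is taken from the configuration file.
    run_analyzer(analyzer_cmd(config), "mytext.txt", "mytext.mrf")
    # Step 2: reuse the morfo file as input and ask for tagged output.
    run_analyzer(analyzer_cmd(config, outf="tagged", inpf="morfo"),
                 "mytext.mrf", "mytext.tag")
```

Passing the arguments as a list avoids shell quoting issues, and check=True makes a failing analyzer call raise an exception instead of silently leaving an incomplete output file.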

2008-01-24