The Chinese and English parsers are designed specifically for those two languages and by default use the Penn Chinese Treebank and Penn Treebank labels, respectively. You can specify alternative label sets by modifying
zpar/src/chinese/tags.h for POS tags,
zpar/src/chinese/dep.h for dependency labels, and
zpar/src/chinese/cfg.h for constituent labels. These are hard-coded; the English versions are placed in
On the other hand, you can compile a
generic version of ZPar, which takes any tags in the training data and compiles them into tag sets automatically. The generic tag sets are slower than the hard-coded tag sets. The files are placed in
To compile individual models with these tags, use
generic in place of
english. For example,
make generic.conparser. The implementations are found in
src/common/GENERIC_CONPARSER_IMPL. The generic ZPar can be compiled by
The generic parsers can be used with different languages and treebank formats; for example, the generic depparser can process CoNLL data in 13 languages.
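To illustrate what it means for tag sets to be built from the training data rather than hard-coded, here is a minimal Python sketch. It is not ZPar's implementation and does not use any ZPar code. It assumes the training data is in the ten-column, tab-separated CoNLL-X layout (POSTAG in column 5, DEPREL in column 8); the function name collect_label_sets and the sample sentences are hypothetical and exist only for this example.

```python
# Illustrative only: NOT ZPar code. Assumes CoNLL-X style input
# (ten tab-separated columns, blank line between sentences).

# 0-based positions of the POSTAG and DEPREL columns in CoNLL-X.
POSTAG_COL = 4
DEPREL_COL = 7


def collect_label_sets(lines):
    """Map each POS tag and dependency label to an integer code,
    numbered in order of first appearance in the data."""
    pos_tags = {}
    dep_labels = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # blank line marks a sentence boundary
        fields = line.split("\t")
        if len(fields) <= DEPREL_COL:
            raise ValueError("expected CoNLL-X columns, got: %r" % line)
        # setdefault leaves an existing code untouched and otherwise
        # assigns the next free integer.
        pos_tags.setdefault(fields[POSTAG_COL], len(pos_tags))
        dep_labels.setdefault(fields[DEPREL_COL], len(dep_labels))
    return pos_tags, dep_labels


# Hypothetical two-sentence training sample.
SAMPLE = [
    "1\tThe\tthe\tDT\tDT\t_\t2\tdet\t_\t_",
    "2\tparser\tparser\tNN\tNN\t_\t3\tnsubj\t_\t_",
    "3\truns\trun\tVBZ\tVBZ\t_\t0\troot\t_\t_",
    "",
    "1\tIt\tit\tPRP\tPRP\t_\t2\tnsubj\t_\t_",
    "2\tworks\twork\tVBZ\tVBZ\t_\t0\troot\t_\t_",
]

pos_tags, dep_labels = collect_label_sets(SAMPLE)
print(pos_tags)
print(dep_labels)
```

A label set built this way is only known at run time and has to be looked up through a table, whereas a hard-coded set can be fixed when the program is compiled. That is one plausible reason for the speed difference between the generic and the hard-coded tag sets.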
Since ZPar 0.7, the generic ZPar system is the default ZPar. Run
make zpar to compile it.
Usage of the generic system can be found in the Quick Start Manual.