Publication information
Case study of BushBank concept
Authors | |
---|---|
Year of publication | 2011 |
Type | Article in proceedings |
Conference | The 25th Pacific Asia Conference on Language, Information and Computation |
Faculty / MU workplace | |
Citation | |
Field | Linguistics |
Keywords | corpus; rapid development; annotation; treebank |
Description | In this paper, we present a new type of annotated corpus, called BushBank, which improves the handling of ambiguity in natural language. Unlike traditional approaches, in which data are disambiguated directly, a BushBank defers disambiguation until later, based on application needs. This has a major impact on the structures used in the corpus, since ordinary syntactic trees disallow ambiguity. Our approach was tested on 10,000 sentences with more than a hundred annotators when creating the Czech BushBank. The paper contains information about creating such a resource and the methods used to obtain high inter-annotator agreement. |
Related projects: |