Extrakce korpusových příkladů pro valenční slovník

Publication details

Title in English	Extraction of Corpus Examples for Valency Lexicon
Authors	BAISA Vít
Year of publication	2011
Type	Article in Proceedings
Conference	Korpusová lingvistika, 3: Gramatika a značkování korpusů
MU Faculty or unit	Faculty of Informatics
Field	Linguistics
Keywords	valency lexicon; VerbaLex; corpus; valency frame; CQL
Description	The valency lexicon VerbaLex is built from various lexical sources, but real corpus data is missing among them. Currently, VerbaLex contains about 10,000 verb lemmata, 20,000 literals (lemmata with their sense numbers) and roughly the same number of valency frames. In most cases, the examples for individual valency frames were made up artificially. Our goal is to add real corpus examples to this rich lexicographic resource. The article summarizes a procedure that transforms valency frames into CQL queries. These queries are then used to search for real sentences corresponding to the transformed valency frames. The procedure is simple and relatively effective, and it is followed by a necessary manual selection of acceptable examples. We describe in detail all steps of the procedure, the results and their quality, and the obstacles we faced during the extraction of examples for valency frames.
Related projects	Centrum komputační lingvistiky
