Computing Idioms Frequency in Text Corpora

Bušta, Jan

Publication details

Authors	BUŠTA Jan
Year of publication	2008
Type	Article in Proceedings
Conference	Proceedings of Recent Advances in Slavonic Natural Language Processing 2008
MU Faculty or unit	Faculty of Informatics
Citation
web	https://nlp.fi.muni.cz/raslan/2008/papers/12.pdf
Field	Linguistics
Keywords	frequency of idioms; headwords; text corpora; Czech language
Description	Idioms are phrases whose meaning is not composed from the meanings of the individual words in the phrase. They are a natural example of a violation of the principle of compositionality, which makes idioms a meaning-mining problem in natural language processing. Counting the frequency of phrases such as idioms in corpora has one main aim: to find out which phrases are used often and which are used less often. This is a first step toward capturing the meaning of whole phrases rather than just of each word, which improves natural language understanding.
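Related projects:	Centrum komputační lingvistiky (Centre for Computational Linguistics); Prostředky tvorby komplexní báze znalostí pro komunikaci se sémantickým webem v přirozeném jazyce (Tools for building a comprehensive knowledge base for communication with the Semantic Web in natural language)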
