Project information
A New Machine Translation-based approach to Parallel Corpora Alignment
- Project Identification
- MUNI/IGA/1334/2021
- Project Period
- 1/2022 - 12/2022
- Investor / Programme / Project type
- Masaryk University
- Internal grant agency MU
- MU Faculty or unit
- Faculty of Informatics
The project develops a new automatic method for aligning parallel corpora. The approach will be based on neural machine translation and previously aligned corpora. The method will be tested on a Czech-English parallel corpus of Faculty news, whose alignment will be improved as a result.
Publications
Total number of publications: 4
2022
-
HFT: High Frequency Tokens for Low-Resource NMT
Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022), year: 2022
-
MUNI-NLP Systems for Lower Sorbian-German and Lower Sorbian-Upper Sorbian Machine Translation @ WMT22
Proceedings of the Seventh Conference on Machine Translation, year: 2022
-
Piötòst Ché Niènt, Mèi Piötòst - A Manually Revised Lombard-Italian Parallel Corpus
Proceedings of the Sixteenth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2022, year: 2022
2021
-
Evaluating the State-of-the-Art Sentence Alignment System on Literary Texts
Recent Advances in Slavonic Natural Language Processing (RASLAN 2021), year: 2021