THE PROBLEM OF CREATING A PARALLEL CORPUS OF THE KARAKALPAK AND TURKISH LANGUAGES

Authors

  • Ótemisov Aziz Zarlıqbaevich
  • Sharbaev Jaras Bayrambay ulı

Keywords:

Linguistic corpus, corpus, parallel corpus, text alignment, tokenization, stemming, Lingtrain Alignment Studio, Hunalign, ABBYY Aligner, Trados, WinAlign, Wordfast, GIZA++, AntConc.

Abstract

This article discusses the tasks involved in creating a parallel corpus of the Karakalpak and Turkish languages. It covers the preparation of texts, the assembly of a set of ready-made texts, and the stages, methods, and software tools used to align paragraphs, sentences, words, and phrases.

Published

2025-08-04
