The Prediction of Next Word in Balochi Language Using N-gram Model
DOI:
https://doi.org/10.30537/sjcms.v7i2.1273
Abstract
Balochi is among the oldest languages and is spoken by approximately 10 million people worldwide. Compared with languages such as English, Urdu, and French, it has received little attention in Natural Language Processing (NLP) research. Next-word prediction is an NLP technique that can also support standardization and corpus collection. This research aims to provide a next-word prediction system and an unambiguous corpus for the Balochi language. N-gram models of several orders (unigram, bigram, trigram, quad-gram, and higher) are used for next-word prediction. The trained model was evaluated both extrinsically and intrinsically and then embedded in an application. Such a system plays a crucial role in keyboard input: it helps users type faster and helps native users make fewer typing errors in less time. The results show that the five-gram model has the highest performance at 93%, while the quad-gram model reaches 80% and the trigram model 76%.
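The abstract does not describe the implementation, so the following is only a minimal sketch of how n-gram next-word prediction can work, assuming simple frequency counts and a back-off from the longest available context to shorter ones. The function names `train_ngram` and `predict_next` are hypothetical, and the English tokens are a placeholder for a tokenized Balochi corpus; none of this is taken from the paper.

```python
from collections import Counter, defaultdict


def train_ngram(tokens, n):
    """Count which word follows each (n-1)-word context in the token list."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def predict_next(models, history, k=3):
    """Suggest up to k next words, trying the highest-order model first.

    Backing off to shorter contexts is an assumption of this sketch,
    not a detail confirmed by the abstract.
    """
    for n in sorted(models, reverse=True):
        if n - 1 > len(history):
            continue  # not enough typed words for this order
        context = tuple(history[len(history) - (n - 1):])
        followers = models[n].get(context)
        if followers:
            return [word for word, _ in followers.most_common(k)]
    return []


# Placeholder corpus (English, for illustration only).
tokens = "the cat sat on the mat and the cat ate".split()
models = {n: train_ngram(tokens, n) for n in range(1, 4)}  # unigram..trigram
print(predict_next(models, ["on", "the"]))
```

Higher-order models such as the quad-gram and five-gram reported above would be added by extending the range of `n`; with a small corpus most long contexts are unseen, which is why a fallback to shorter contexts is useful in a keyboard setting.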
License
Copyright (c) 2023 Sukkur IBA Journal of Computing and Mathematical Sciences
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The SJCMS holds the rights to all published papers. Authors are required to transfer copyright to the journal to ensure that the paper is published solely in SJCMS; however, authors and readers can freely read, download, copy, distribute, print, search, or link to the full texts of its articles and use them for any other lawful purpose.
The SJCMS is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.