NLP MODULE FOR BULGARIAN TEXT PROCESSING

  • Stoyan Cherecharov, Plovdiv University Paisii Hilendarski, Faculty of Mathematics and Informatics
  • Hristo Krushkov, Plovdiv University Paisii Hilendarski, Faculty of Mathematics and Informatics
  • Mariana Krushkova, Plovdiv University Paisii Hilendarski, Faculty of Mathematics and Informatics
Keywords: natural language processing, computational linguistics, software modules, web-based systems

Abstract

The widespread use of web-based information systems and the shortage of highly skilled developers motivate the search for methods and approaches that optimize the building of such systems. This paper describes a model for creating web-based information systems from a core of reusable, independent, and installable base modules. Such a system is easily adapted to a client’s needs and can be extended with specific modules that interact with the rest of the system by following certain rules. The approach allows flexible and rapid development of applications, from small to extremely large web-based systems, simply by adding modules with the required functionality. Growing demand from Bulgarian customers for such systems is the reason for building a base module for automatic processing of Bulgarian text. This paper presents a module that performs automatic morphological analysis and synthesis, verifies syntactic agreement, automatically places stress, and processes complex verb forms, among other functions. The described functionality can be integrated with other modules through a suitable interface.

Published
2017-09-24