ANALYSIS OF DEEP LEARNING MODELS FOR TEXT INFORMATION CLASSIFICATION TASKS

Authors

  • Anton Kontsevoi, Vinnytsia National Technical University
  • Oleg Bisikalo, Vinnytsia National Technical University

DOI:

https://doi.org/10.31649/1999-9941-2022-55-3-13-20

Keywords:

text classification, sentiment analysis, question answering, news categorisation, deep learning, natural language inference, topic classification

Abstract

Text analysis as a whole is a relatively new field of study. Fields such as marketing, product management, research, and management already rely on analysing and extracting information from textual data. This paper deals with text classification technology, one of the most important parts of text analysis. Text classification, or text categorisation, is the activity of labelling natural-language texts with appropriate categories from a predetermined set. Put simply, it is the process of assigning generic tags, drawn from a set of predefined categories, to unstructured text. Categorising content and products in this way helps users find items and navigate a website or app more easily. As a classic problem in natural language processing (NLP), text classification aims to assign labels or tags to text units such as sentences, queries, paragraphs, and documents. It has a wide range of applications, including question answering, spam detection, sentiment analysis, news categorisation, user intent classification, content moderation, and more. Text data can come from a variety of sources, including web pages, emails, chats, social media, tickets, insurance claims, user feedback, and customer service questions and answers. Text is an extremely rich source of information, but extracting useful data from it is usually difficult and time-consuming because of the unstructured nature of natural language. Deep learning-based models have surpassed classical machine learning approaches in various text classification tasks, including sentiment analysis, news categorisation, question answering, and natural language inference. In this paper, we provide a comprehensive review of the most widespread deep learning-based models for text classification developed in recent years and discuss their technical contributions, similarities, and strengths.
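To make the task concrete, the sketch below shows a minimal deep learning text classifier in PyTorch: token embeddings are averaged and mapped to scores over a predefined label set. The vocabulary, label set, and hyperparameters are illustrative assumptions for this example only and are not taken from the paper.

# Minimal illustrative sketch (not the paper's method): a tiny PyTorch text classifier.
# The label set, vocabulary and hyperparameters below are assumed for the example.
import torch
import torch.nn as nn

LABELS = ["negative", "positive"]                      # predefined category set
VOCAB = {"<unk>": 0, "great": 1, "boring": 2, "movie": 3, "plot": 4}

def encode(text):
    # Map whitespace-tokenised text to vocabulary indices; unknown words map to <unk>.
    return torch.tensor([VOCAB.get(tok, 0) for tok in text.lower().split()],
                        dtype=torch.long)

class TextClassifier(nn.Module):
    # Averaged word embeddings followed by a linear layer over the label set.
    def __init__(self, vocab_size, embed_dim, num_classes):
        super().__init__()
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)   # mean-pools token embeddings
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids, offsets):
        return self.fc(self.embedding(token_ids, offsets))        # one score per label

model = TextClassifier(len(VOCAB), embed_dim=16, num_classes=len(LABELS))
tokens = encode("great movie")
offsets = torch.tensor([0])                            # a single document starting at index 0
logits = model(tokens, offsets)                        # shape: [1, len(LABELS)]
print(LABELS[logits.argmax(dim=1).item()])             # untrained, so the prediction is arbitrary

The models reviewed in the paper follow the same label-assignment scheme but replace the averaged-embedding encoder with more expressive architectures (for example convolutional, recurrent, or Transformer-based encoders) whose parameters are learned from labelled data.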

Author Biographies

Anton Kontsevoi, Vinnytsia National Technical University

Post-graduate student, Faculty for Intellectual Information Technologies and Automation

Oleg Bisikalo, Vinnytsia National Technical University

Dr.Sc. (Eng.), Professor, Faculty for Intellectual Information Technologies and Automation


Published

2022-11-02

How to Cite

[1] A. Kontsevoi and O. Bisikalo, “ANALYSIS OF DEEP LEARNING MODELS FOR TEXT INFORMATION CLASSIFICATION TASKS”, ІТКІ, vol. 55, no. 3, pp. 13–20, Nov. 2022.

Issue

Vol. 55 No. 3 (2022)

Section

Information technology and coding theory
