Review of natural language processing methods: evolution, architectures, and application domains

Authors

DOI:

https://doi.org/10.34185/1562-9945-3-164-2026-21

Keywords:

artificial intelligence, natural language processing, large language models, transformer architectures, neuroevolution, sentiment analysis, bibliometric analysis, large reasoning models, machine learning, research automation

Abstract

The global Natural Language Processing (NLP) market is growing at a rapid, near-exponential rate, reflecting strong commercial interest and the digital economy's reliance on AI algorithms. However, scaling neural network parameters from millions to hundreds of billions has introduced new computational and ecological challenges. Furthermore, models inherit the biases present in their training data, which raises critical ethical and safety concerns. Novel, highly efficient architectures and transparent optimization methods are urgently needed to overcome these limitations.

The primary goal of this research is to conduct a comprehensive analytical review of NLP methods, to trace their architectural evolution from traditional statistical machine learning models to modern Large Language Models (LLMs) and Large Reasoning Models (LRMs), and to systematize their current practical application domains.

The evolution of NLP is conceptually divided into three fundamental paradigms: rule-based symbolic systems, statistical methods (e.g., TF-IDF, SVM, n-grams), and deep learning. A tectonic shift occurred with the introduction of the Transformer architecture, which replaced recurrent blocks with self-attention mechanisms and thereby enabled parallel processing of a sequence. Because self-attention compares every token with every other token, its computational cost grows quadratically with context length. To address this cost for long contexts, the study analyzes non-traditional optimization approaches, primarily Neuroevolution and Neural Architecture Search (NAS), which automatically search for efficient transformer blocks. The paper also systematizes the practical application of NLP in healthcare, software engineering, and business analytics, where the reported accuracy varies by domain, and highlights the necessity of Retrieval-Augmented Generation (RAG) technologies.
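
To make the quadratic cost of self-attention concrete, the following is a minimal sketch in plain Python and is not taken from the paper. It assumes identity query, key, and value projections and a single attention head, which are simplifications; the function names `softmax` and `self_attention` and the toy `tokens` input are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections), so no learned weights are involved.
    """
    n, d = len(x), len(x[0])
    # Every token is scored against every other token: an n x n matrix.
    # This matrix is the source of the quadratic cost in context length.
    scores = [
        [sum(q * k for q, k in zip(x[i], x[j])) / math.sqrt(d) for j in range(n)]
        for i in range(n)
    ]
    weights = [softmax(row) for row in scores]
    # Each output vector is a weighted average of all input vectors.
    output = [
        [sum(weights[i][j] * x[j][c] for j in range(n)) for c in range(d)]
        for i in range(n)
    ]
    return output, weights


# Toy input: three tokens with two-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = self_attention(tokens)
```

Under these assumptions, `weights` holds one row per token and one column per token, so doubling the number of tokens quadruples the number of scores to compute. All rows can be computed independently of one another, which is what allows parallel processing, in contrast to a recurrent block that must consume tokens one at a time.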

NLP has become the most dynamic segment of artificial intelligence. The review observes a fundamental shift from statistical imitation to algorithmic reasoning, evidenced by LRMs such as DeepSeek-R1, which are capable of self-verifying their logic and mitigating factual hallucinations. As models approach the limits of silicon architecture, the future lies in combining neural networks with evolutionary algorithms. Future research must focus on overcoming ethical challenges, ensuring digital equality for low-resource languages, and advancing neurosymbolic AI and quantum computing adaptations.

Published

2026-04-30