THE ASSESSMENT OF IMPACT OF PRE-FILTERING ON RETRIEVAL QUALITY IN RAG SYSTEMS WITH VECTOR SEARCH

I.V. Klymenko; Y.A. Lebid

doi:10.34185/1991-7848.itmm.2026.01.035

Authors

I.V. Klymenko https://orcid.org/0000-0001-5149-3974
Y.A. Lebid https://orcid.org/0009-0007-4277-2083

DOI:

https://doi.org/10.34185/1991-7848.itmm.2026.01.035

Keywords:

computer systems, information technologies, data mining, artificial intelligence, RAG, machine-based benchmarking, generative language models

Abstract

The paper analyzes modern approaches to evaluating Retrieval-Augmented Generation (RAG) systems that integrate vector search with answer generation by large language models (LLMs). It examines classical retrieval quality metrics alongside LLM-oriented generation quality metrics, including their application within frameworks such as RAGAS, ARES, VERA, and MIRAGE. A computational experiment was conducted using a Google Cloud Platform (GCP) Firestore collection with vector search over a dataset of IT professionals' CVs, comparing standard vector search against search enhanced by pre-filtering on metadata. The results demonstrate that pre-filtering increases the proportion of relevant documents in the context, reduces retrieval latency, and enables larger context sizes without proportional degradation in generation quality. The experimental findings confirm the dependence of RAG system answer quality on the purity and relevance of the retrieved context.
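The comparison described in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' Firestore setup: the toy documents, the `role` metadata field, the two-dimensional embeddings, and the helper names `top_k` and `precision_at_k` are all assumptions made for illustration. The sketch ranks documents by cosine similarity with and without a metadata pre-filter and reports the share of relevant documents in the retrieved context.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(docs, query_vec, k, metadata_filter=None):
    """Return the k nearest documents; optionally pre-filter on metadata first."""
    candidates = [d for d in docs if metadata_filter is None or metadata_filter(d)]
    ranked = sorted(
        candidates,
        key=lambda d: cosine(d["embedding"], query_vec),
        reverse=True,
    )
    return ranked[:k]


def precision_at_k(retrieved, relevant_ids):
    """Share of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for d in retrieved if d["id"] in relevant_ids)
    return hits / len(retrieved)


# Hypothetical toy corpus standing in for the CV dataset (assumption).
DOCS = [
    {"id": "cv1", "role": "backend", "embedding": [0.9, 0.1]},
    {"id": "cv2", "role": "frontend", "embedding": [0.95, 0.05]},
    {"id": "cv3", "role": "backend", "embedding": [0.7, 0.3]},
    {"id": "cv4", "role": "data", "embedding": [0.85, 0.15]},
]
QUERY = [1.0, 0.0]          # hypothetical query embedding
RELEVANT = {"cv1", "cv3"}   # hypothetical ground truth: backend CVs

plain = top_k(DOCS, QUERY, k=2)
filtered = top_k(DOCS, QUERY, k=2, metadata_filter=lambda d: d["role"] == "backend")

print("plain   ", [d["id"] for d in plain], precision_at_k(plain, RELEVANT))
print("filtered", [d["id"] for d in filtered], precision_at_k(filtered, RELEVANT))
```

In this toy setting the unfiltered search admits a semantically close but irrelevant document, while the pre-filtered search keeps only candidates that satisfy the metadata constraint. This is the effect on context purity that the abstract reports, though the numbers here carry no evidential weight.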

References

Lewis P., Perez E., Piktus A., Petroni F., Karpukhin V., Goyal N., Küttler H., Lewis M., Yih W., Rocktäschel T., Riedel S., Kiela D. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems. – 2020. Vol. 33. P. 9459-9474. DOI: 10.48550/arXiv.2005.11401

Es S., James J., Espinosa-Anke L., Schockaert S. RAGAS: Automated Evaluation of Retrieval Augmented Generation. Computer Science. Computation and Language. – 2023. DOI: 10.48550/arXiv.2309.15217

Saad-Falcon J., Khattab O., Potts C., Zaharia M. ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2024. P. 3464-3483. DOI: 10.48550/arXiv.2311.09476

Yu Z., Gan Z., Zhang Y., Tong X., Liu H., Liu Q. Evaluation of Retrieval-Augmented Generation: A Survey. Computer Science. Computation and Language. – 2024. DOI: 10.48550/arXiv.2405.07437


© 2025 Information technologies in metallurgy and machine building. All Rights Reserved.