Mathematical optimisation model for searching duplicate string objects in the memory snapshot

Authors

  • Huk N.
  • Mitikov N.

DOI:

https://doi.org/10.34185/1562-9945-6-155-2024-23

Keywords:

mathematical model; optimisation; algorithm; performance; memory snapshot; duplication; string

Abstract

The purpose of this paper is to identify increased memory usage in software applications. The modern software development cycle focuses on functionality and often ignores optimal resource usage. Limited physical scaling sets an upper bound on a system's capacity to process requests. The presence of unchanged objects that hold the same information is a sign of increased memory consumption. Avoiding duplicate objects in memory allows more rational use of the existing resource and increases the amount of information that can be processed. Existing scientific publications focus on memory leaks and pay limited attention to excessive memory usage, because there is no unified model for detecting it. Existing design patterns include the ‘object pool’ pattern, but they leave the decision on whether to implement it to engineers without providing a mathematical basis. The paper presents the development of a mathematical model for detecting duplicate immutable objects of the String type in a memory snapshot. Industrial systems that require hundreds of GB of RAM to operate and hold millions of objects in RAM are analysed. At this scale of data, the duplicate detection process itself needs to be optimised. The research method is to analyse memory snapshots of highly loaded systems using software developed on .NET technology with the ClrMD library. A memory snapshot reflects the state of the process under study at a given time and contains all objects, threads and operations being performed. The ClrMD library makes it possible to examine objects and their types programmatically, read field values, and build graphs of relationships between objects. Based on the results of the study, an optimisation is proposed that speeds up the search for duplicates several times over.
The scientific contribution of the study is a mathematically sound approach that significantly reduces the use of memory resources and optimises computing processes. The practical usefulness of the model is confirmed by the optimisation results achieved by following its recommendations: reduced hosting costs (and therefore more cost-effective deployment and use of software systems in industrial environments) and an increase in the volume of data processed.
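
To illustrate the kind of duplicate search the abstract describes, here is a minimal Python sketch. It is not the authors' model or their .NET/ClrMD code: it assumes the string values have already been extracted from a snapshot into a plain list, and the names `estimate_size` and `find_duplicates`, as well as the per-object cost (`header_bytes`, `bytes_per_char`), are hypothetical placeholders chosen only for illustration.

```python
from collections import Counter


def estimate_size(value: str, header_bytes: int = 24, bytes_per_char: int = 2) -> int:
    """Illustrative per-object cost (assumption, not a real runtime layout)."""
    return header_bytes + bytes_per_char * len(value)


def find_duplicates(strings):
    """Group equal string values and estimate memory wasted by extra copies.

    Returns (value, copies, wasted_bytes) tuples, largest waste first.
    """
    counts = Counter(strings)  # one pass: value -> number of copies
    report = []
    for value, copies in counts.items():
        if copies > 1:
            # Every copy beyond the first could be replaced by a shared instance.
            wasted = (copies - 1) * estimate_size(value)
            report.append((value, copies, wasted))
    report.sort(key=lambda row: row[2], reverse=True)
    return report


# Hypothetical string values taken from a snapshot.
snapshot = ["en-US", "en-US", "en-US", "OK", "OK", "unique"]
report = find_duplicates(snapshot)
for value, copies, wasted in report:
    print(f"{value!r}: {copies} copies, ~{wasted} bytes reclaimable")
```

Ranking by estimated reclaimable bytes rather than by copy count puts long, frequently repeated strings first, which is where sharing a single instance would plausibly save the most memory.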

Published

2025-02-02