GRAPHICAL AND ANALYTICAL METHODS FOR PROCESSING “BIG DATA” BASED ON THE ANALYSIS OF THEIR PROPERTIES

Authors

  • Olena Ihorivna Syrotkina
  • Mykhailo Oleksandrovych Aleksieiev
  • Iryna Mykhailivna Udovyk

DOI:

https://doi.org/10.34185/1562-9945-3-122-2019-10

Keywords:

big data, data structure, ordered set of arbitrary cardinality, m-tuples, Boolean graph, minimization of time and computing resources

Abstract

This article addresses the development of mathematical methods that reduce the time and computing resources required to process “big data.” One approach to this problem is the creation of NoSQL systems, whose advantages include flexible data models, horizontal scaling, parallel processing, and fast retrieval of results. For “big data” analysis, other methods have also been developed, such as machine learning, artificial intelligence, distributed processing of streams and events, and visual data research technology.
The aim of the research is to develop mathematical methods for processing “big data” based on a system analysis of the properties of the data structure known as “m-tuples based on ordered sets of arbitrary cardinality (OSAC).”
The data structure “m-tuples based on OSAC” is a Boolean (the set of all subsets of a basis set with cardinality n), ordered by right-side enumeration of the elements of the basis set, with the index value of each tuple element running from its lower boundary to its upper one. We formulated certain properties of this data structure, which follow from the logical rules used to construct it, and described mathematical methods based on these properties. Drawings of Boolean graphs illustrate the properties: the outlined vertices of each graph correspond to the declared properties of the data structure. We derived analytical dependencies that determine the Boolean elements in question. Finding these elements does not require executing algorithms for the operations of intersection, union, and membership, because the desired result is already determined by the properties.
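To make the structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "right-side enumeration" corresponds to lexicographic order in which the rightmost index changes fastest, that the basis set is indexed 1..n, and that the empty tuple (m = 0) is included; the function names `m_tuples` and `boolean_levels` are hypothetical.

```python
from itertools import combinations


def m_tuples(n, m):
    """Return all m-tuples of indices from 1..n, each strictly increasing.

    itertools.combinations yields them in lexicographic order, so the
    rightmost index runs from its lower bound to its upper bound first.
    """
    return list(combinations(range(1, n + 1), m))


def boolean_levels(n):
    """Group the power set of {1..n} by tuple length m (assumed layout)."""
    return {m: m_tuples(n, m) for m in range(n + 1)}


levels = boolean_levels(4)
for m, tuples in levels.items():
    for j, t in enumerate(tuples, start=1):
        # Each element is addressed by its position j within level m.
        print(f"m={m} j={j} tuple={t}")
```

Under these assumptions every subset of the basis set is addressed by a pair (j, m): m is the tuple length and j is the position of the tuple within its level.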
The properties of the data structure “m-tuples based on OSAC” allow us to determine some interdependencies between m-tuples from their location in the structure, which is given by a pair of indices (j, m), without executing computing algorithms. In this case, the time estimate for obtaining results changes from a cubic O(n^3) to a linear O(n) dependency.
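As an illustration of locating a tuple from its indices alone, the sketch below recovers the m-tuple at position j directly from (n, m, j) using standard lexicographic unranking of combinations. This is a swapped-in textbook technique shown under assumptions, not the analytical dependencies derived in the article: it assumes 1-based positions, a basis set indexed 1..n, and lexicographic ordering within each level. The name `tuple_at` is hypothetical.

```python
from math import comb


def tuple_at(n, m, j):
    """Return the j-th (1-based) m-tuple of 1..n in lexicographic order.

    The tuple is computed from binomial counts; no other tuple of the
    level is generated or scanned.
    """
    if not 1 <= j <= comb(n, m):
        raise ValueError("j is outside the range of level m")
    result = []
    rank = j - 1   # zero-based rank still to be consumed
    x = 1          # smallest candidate for the next position
    for k in range(m, 0, -1):  # k = number of positions left to fill
        # comb(n - x, k - 1) tuples start with x at this position;
        # skip whole blocks until the rank falls inside one.
        while comb(n - x, k - 1) <= rank:
            rank -= comb(n - x, k - 1)
            x += 1
        result.append(x)
        x += 1
    return tuple(result)


print(tuple_at(4, 2, 4))
```

The candidate value x only ever increases, so the loops advance it at most n times per lookup. A membership question such as "does element 3 belong to the tuple at (j, m)?" then reduces to `3 in tuple_at(n, m, j)` without building the rest of the structure.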

Published

2019-10-10