OF THE INFLUENCE OF A NETWORK INTERFACE ON THE EFFICIENCY OF MODULAR MULTIPROCESSOR SYSTEMS

The paper develops an approach to defining a methodology for evaluating the efficiency of modular multiprocessor computing systems. The main attention is paid to how the network interface affects this efficiency. The architecture of the multiprocessor system's network interface and its basic operating modes are analyzed. To evaluate the processes occurring in the system during the transmission of information flows, the network system bandwidth and the switch throughput were compared, which made it possible to determine the preconditions for an optimal selection of the network interface components of the multiprocessor computing system. The research also made it possible to derive analytical relations for determining the optimal number of system nodes in different operating modes. A coefficient is derived that expresses the coherency of the selected processors, the network interface and the size of the computing area. The derived analytical relations also show that the optimal number of blades in a multiprocessor computing system, which provides its highest speed, decreases as the computing power of its processors increases. It is shown that the less time is spent directly on solving a specific problem, the more likely the network data exchange among the nodes of the multiprocessor computing system is to impede the overall computation process.

One of the main problems of using a multiprocessor computing system to solve the problems considered in this paper can be stated as follows. We have a difference mesh of dimension M, and the computation time of the problem on a single-processor system is denoted by t; this is a key parameter. It is necessary to significantly reduce the computing time while keeping the value of M. Consequently, we consider the problem in which the computation time is reduced by increasing the number of nodes in the multiprocessor computing system. In this case, the computing area is evenly distributed among the nodes of the multiprocessor computing system.
Problem statement. Of significant interest in the practice of parallel computing is the evaluation of the possible increase in productivity, considering the qualitative characteristics of the software product itself and the technical capabilities of the multiprocessor computing system. Ideally, one would expect a problem to be solved n times faster on n processors than on one processor. In fact, however, such acceleration can hardly ever be achieved. The reason is well illustrated by Amdahl's law (Gene Amdahl) [4], which relates the potential acceleration of computations under parallelization to the proportion of operations that are performed sequentially. An a priori estimation of the proportion f of sequential operations is not easy (the concepts of sequential and parallel operations are difficult to formalize and allow ambiguous interpretations). It is virtually impossible to estimate this value by simply analyzing the program text.
We emphasize that this estimate can only be made by real computations using different numbers of processors. This circumstance was taken into account while evaluating the effectiveness of the developed multiprocessor computing system. For that matter, in the research of efficiency indicators, Amdahl's law was applied to solve the inverse problem, which consists in determining the value of f (the part of the algorithm's operations that cannot be parallelized) from experimental data on the system's performance. That made it possible to quantify the achieved parallelism efficiency. Taking the above into account, a general approach was developed to evaluate the performance of a modular multiprocessor computing system.
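The inverse use of Amdahl's law described above can be illustrated with a minimal sketch. It assumes the textbook form of the law, S = 1 / (f + (1 - f) / N); the function names are hypothetical, and the example values (a speedup of 4.28 on N = 8 nodes) are the figures quoted in the conclusions of this paper.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Speedup predicted by Amdahl's law for sequential fraction f on n processors."""
    return 1.0 / (f + (1.0 - f) / n)


def sequential_fraction(speedup: float, n: int) -> float:
    """Inverse problem: recover f from a measured speedup on n > 1 processors.

    Obtained by solving S = 1 / (f + (1 - f) / n) for f.
    """
    if n < 2:
        raise ValueError("at least two processors are needed to estimate f")
    return (n / speedup - 1.0) / (n - 1.0)


# Assumed inputs: speedup and node count quoted in the conclusions.
f_est = sequential_fraction(4.28, 8)
print(f"estimated sequential fraction f = {f_est:.3f}")
print(f"speedup recomputed from f       = {amdahl_speedup(f_est, 8):.2f}")
```

With these inputs the estimated f is about 0.12. Since the law in this form has no separate term for communication, the recovered f is an effective value that also absorbs the cost of data exchange among the nodes.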
It is also known that the efficiency of parallelizing computations depends essentially on many factors. One of the most important is the specificity of data transfer among adjacent nodes of the multiprocessor computing system, since this slowest part of the algorithm can undo the effect of increasing the number of processors used.

These issues are considered crucial in the simulation of a wide range of problems using modular multiprocessor computing systems, and today they are being successfully addressed by many researchers [5-9].
The unresolved parts of the problem. Existing methods of analyzing the efficiency of multiprocessor computing systems do not allow determining the optimal number of nodes for solving a certain class of problems. At the same time, the impact of the network interface on the efficiency of modular multiprocessor computing systems has not been properly researched. In addition, the main analytical relations for evaluating the efficiency of a multiprocessor computing system are not expressed through the parameters of the system being studied.

Research purpose. The purpose of the research is to further develop the approach to defining a methodology for evaluating the effectiveness of a modular multiprocessor computing system. The main attention is paid to how the network interface of the developed system influences this indicator. It is also necessary to derive analytical relations for determining the optimal number of nodes for different modes of its operation. To simplify the estimation of the efficiency of the multiprocessor computing system, the main analytical relations must be expressed through its parameters.
The main research results. For the class of problems solved in this paper, all computations are performed on a difference grid. Therefore, when analyzing the efficiency of the multiprocessor computing system, the most important parameter was the time to compute a single iteration ($T_{it}$) over the computing field. For the multiprocessor computing system, this indicator was determined on the basis of a ratio in which $T_n = T_{it}/N$ is the computation time of a single iteration on N computing nodes, in seconds, and $T_{ex}$ is the time of boundary data exchange among the system nodes, in seconds. The computation time of a single iteration when N computing nodes are used in the system can be specified by formula (2).

In expression (2), Ei is the array length of the boundary field of the computations, which also determines the length of the difference grid along the abscissa; Ey is the length of the difference grid along the ordinate; KR is the size of one difference cell of type Real*8 (64 bits); and the parameter Vc is the computing speed achieved by the proposed processor when solving such problems.
The value $T_{ex}$ was determined by formula (3). In expression (3), the value m is equal to one for the one-way mode of boundary data exchange or to two for the two-way mode; Vp is the network bandwidth in the system, Gbit/s; k is the number of network communication channels operating simultaneously (the number of computing networks); and d denotes the half-duplex (d = 1) or duplex (d = 2) mode of the computer network in the multiprocessor computing system.
Under these conditions one can compute the total computation time of a single iteration, which includes the computation time of a single iteration on N nodes of the multiprocessor computing system and the time of boundary data exchange, which depends on the number of nodes N; this is relation (4). The analysis of relation (4) shows that distributing the computation area among the nodes reduces the number of computations performed by each blade.
Because the nodes of the multiprocessor computing system work in parallel, the total computing time of an iteration decreases. At the same time, as the number of nodes in the system grows, the volume of boundary data also increases, and thus the time for information exchange among the nodes increases.
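The trade-off between a shrinking compute term and a growing exchange term can be illustrated with a toy model. The functional forms used here (compute time a / N, exchange time b * N) and the constants a and b are illustrative assumptions only; they are not the relations (2)-(4) of this paper, and the function names are hypothetical.

```python
import math


def iteration_time(n: int, a: float, b: float) -> float:
    """Assumed model: compute term a / n plus exchange term b * n, in seconds."""
    return a / n + b * n


def best_node_count(a: float, b: float, n_max: int) -> int:
    """Integer node count in 1..n_max that minimises the assumed iteration time."""
    return min(range(1, n_max + 1), key=lambda n: iteration_time(n, a, b))


# Hypothetical constants, chosen only to make the behaviour visible.
a, b = 64.0, 1.0
for n in (1, 2, 4, 8, 16, 32):
    print(f"N = {n:2d}: compute = {a / n:6.2f} s, exchange = {b * n:6.2f} s, "
          f"total = {iteration_time(n, a, b):6.2f} s")

n_star = best_node_count(a, b, 64)
print("best integer N:", n_star)
print("continuous optimum sqrt(a / b):", math.sqrt(a / b))
```

In this assumed model the total time stops decreasing at the node count where the compute and exchange terms become equal; adding nodes beyond that point makes each iteration slower.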
To compute the acceleration and efficiency of the system, the commonly accepted concepts of the theory of parallel computations were taken as the basis. An analytical relation was derived for estimating the efficiency of the multiprocessor computing system through its parameters.
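As a minimal sketch of the commonly accepted definitions, speedup can be taken as the ratio of single-node time to N-node time, and efficiency as speedup divided by N. These are the standard textbook definitions, not the parameter-based relation derived in this paper; the function names are hypothetical, and the timings are those reported in the conclusions (83.11 s on one computer, 19.52 s on eight nodes).

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Speedup: time on one node divided by time on N nodes."""
    return t_single / t_parallel


def efficiency(t_single: float, t_parallel: float, n: int) -> float:
    """Parallel efficiency: speedup per node, a value between 0 and 1 in practice."""
    return speedup(t_single, t_parallel) / n


# Timings quoted in the conclusions of this paper.
t1, t8, nodes = 83.11, 19.52, 8
print(f"speedup    = {speedup(t1, t8):.2f}")
print(f"efficiency = {efficiency(t1, t8, nodes):.2f}")
```

For these timings the ratio comes out at roughly 4.26, close to the 4.28 stated in the conclusions, and the efficiency at roughly 0.53, meaning that about half of the aggregate capacity of the eight nodes is converted into useful acceleration.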
The performance indicators of the multiprocessor computing system were determined both by the formulas (3-5) given above and by means of experimental computations. The results obtained were observed to coincide, which is explained by the nature of the computations.
The network system bandwidth and the switch throughput were compared. This procedure was necessary for an optimal selection of the network interface components of the multiprocessor computing system. For the convenience of the research, the total bandwidth of the multiprocessor system's network according to the manufacturer's specification ($V_S$) was introduced. It is defined through N, the number of nodes in the system, and $V_p$, the throughput of its network protocol, Gbit/s. With this approach it is possible to compare the total network interface bandwidth of the system ($V_S$) with the switch bandwidth ($V_b$). For further analysis of the network interface of the multiprocessor computing system, the throughput coefficient of the network system ($k_s$) was introduced; its value was determined by a ratio, with formula (7) taken into account.
To extend this approach, the concept of the switching capacity coefficient ($k_b$) was proposed and a formula was derived for its definition.
To give a broad picture of the processes under research, some definitions were introduced, and a more detailed analysis of the main network characteristics of the multiprocessor computing system was then performed on their basis.

Definition 1. The network interface deficiency mode of a multiprocessor computing system is the mode of its network operation defined by the following inequality:

Establishing the optimal number of nodes in the cluster system. It should be noted that the higher-order-accuracy numerical-analytical schemes considered in the papers [12,9] serve as the computational methods for solving the heat conduction problem. To determine the boundary data exchange time in a cluster system working in the network interface deficiency mode, relation (3) is rewritten for that mode. To compute the boundary data exchange time in the cluster system in the network interface surplus mode, a relation of a separate form is applied.

«Системні технології» 3 (134) 2021, «System technologies», ISSN 1562-9945 (Print), ISSN 2707-7977 (Online)

Further, considering the multiprocessor computing system under the conditions of the performed experiment, we establish the number of nodes that provides the most effective solution to the problem. It is shown in [5,3] that the computational speed keeps increasing approximately until the moment when

$T_{calc} \approx T_{ex}$. (12)

Thus, on the basis of relation (12), it is possible to compute the number of nodes in the cluster computing system needed to solve the problem effectively. Note that this research phase aims to reduce the total computation time by parallelizing the program. Obviously, the overall grid size does not depend on the number of computing nodes in the cluster system.

Taking into account relation (9), we obtain analytical expressions for determining the optimal number of nodes of a cluster system: expression (13) for operation in the network interface deficiency mode and expression (14) for the network interface surplus mode. Using expressions (13) and (14), we obtain two equations in N that determine the optimal number of nodes in a cluster system, at which the total computation time required for solving the problem is minimal.
Thus, equation (13) reduces to the quadratic form (15).
For the convenience of analysis, we rewrite equation (15) in terms of a coefficient λ. This value can be interpreted as the coefficient of coherency between the capabilities of the selected processors, the network interface and the size of the computing area when the system operates in the network interface deficiency mode. In addition, it should be emphasized that matching the capabilities of the cluster system to the nature of the problems to be solved requires coordinating all the components included in λ. Let us analyze this coefficient. At first glance, the result is somewhat paradoxical: neither the coherency coefficient λ nor the optimal number of blades in the cluster system depends on the size of the data area. This can be explained by the fact that the computation domain was distributed among the cluster system nodes at a constant size. This means that the ratio of the time spent on processing the data in this area to the transfer time also remained unchanged and independent of its size. The second very important conclusion is that the optimal number of blades in the cluster system, which provides its highest speed, decreases as the computing power of its processors increases. This becomes quite clear when one considers that the less time is spent directly on solving a specific problem, the more likely the network data exchange among the cluster system nodes is to impede the overall computation process.
Thus, equation (15) has two roots, one negative and one positive. Given the physical conditions of the problem, the positive root is adopted; its value is eight, hence N = 8.
Note that this result satisfies the inequality that establishes the conditions for the cluster system to function in the network interface deficiency mode [14]. Equation (11) is reduced to a cubic form.
For the convenience of analysis, it is rewritten in terms of a coefficient that can be considered the coefficient of coherency between the capabilities of the selected processors, the network interface and the size of the computing area when the system operates in the network interface surplus mode. Let us analyze the value of this coefficient. One can conclude that the optimal number of blades in the cluster system, which provides its highest performance, depends on the size of the computing area, the switching capabilities and the computing power of the processors of which the cluster system is composed. Varying these parameters allows selecting the appropriate number of blades when operating the system in the network interface surplus mode.
Solving equation (14) yields three roots: two complex and one real. The real root corresponds to the number of nodes N = 33. However, analysis of this result indicates that it does not satisfy the condition for the cluster system to function in the network interface surplus mode [5].
Having analyzed the simulation results, we can conclude that under the conditions of the problem being researched the optimal number of blades of the cluster system is N = 8.
Conclusions. Improving existing technological processes and creating new ones requires considerable expense for a large number of field experiments on laboratory, experimental and industrial equipment, as well as under industrial conditions. Multiprocessor computing systems make it possible to reduce the amount of experimental research and the time needed to conduct it while still obtaining the information necessary for the design and implementation of technological developments.
The class of problems considered in this paper is solved by a multiprocessor computing system. Thus, we have the preconditions for quantifying the efficiency of the multiprocessor computing system. In this problem, the optimal number of nodes in the cluster system, with the maximum parallelism efficiency, is N = 8. A cluster of this size solves the problem 4.28 times faster than a single computer. According to the computed data, the proposed cluster mode not only improved system efficiency but also significantly reduced the computing time, from 83.11 to 19.52 seconds. At the same time, note that if such an acceleration of computations is not sufficient to provide control of temperature fields, then more powerful processors must be applied. For these reasons, further research will be aimed at addressing such issues.