IREK – AESM: Institutional Repository of Economic Knowledge

The Impact of Data Pre-Processing on the Assessment of the Similarity of Trend Functions


dc.contributor.author Coanda, Ilie
dc.date.accessioned 2024-02-29T08:33:26Z
dc.date.available 2024-02-29T08:33:26Z
dc.date.issued 2023-09
dc.identifier.isbn 978-9975-167-39-0 (PDF).
dc.identifier.uri https://irek.ase.md:443/xmlui/handle/123456789/3096
dc.description COANDA, Ilie. The Impact of Data Pre-Processing on the Assessment of the Similarity of Trend Functions. In: Competitiveness and Innovation in the Knowledge Economy [online]: 27th International Scientific Conference: Conference Proceeding, September 22-23, 2023. Chişinău: ASEM, 2023, pp. 426-430. ISBN 978-9975-167-39-0 (PDF). en_US
dc.description.abstract This paper proposes an approach to applying cleaning, completion, and smoothing techniques to large volumes of data intended for analysis. Depending on the field and on the method of data collection and recording, data can be classified into at least two categories: precise data (recorded by automated techniques, without any influence of the human factor) and approximate data (where a human participates in collection or recording at some stage of the activity). When relatively many people participate in the same activity, the precision of the records will differ from that of records made in an automated way. This work aims to highlight the impact that the quality of preliminary processing (smoothing, cleaning, etc.) of the primary data has on the analysis. In the case studies, the object of research is a set of time series of data collected on the spread of an epidemic. Recording such a phenomenon is a fitting example of data collection with intense human participation, which is characterized by frequent deviations from the prescribed regulations. Consequently, some data may be recorded with a delay, and people affected by the disease may report to a doctor at different times. Such phenomena can create anomalies in the data structure. To highlight the impact of different methods of smoothing and completing the primary data, approximating functions were obtained for each time series after the series had been "corrected" in one of two ways: a) by averaging neighboring data; b) by excluding "suspicious" data.
As a result, two sets of approximating functions are obtained (the approximating functions can be obtained by non-linear regression). By applying techniques for evaluating the similarity of functions, the distance (level of similarity) between the functions within each set is calculated. Next, a hierarchical clustering can be obtained for each of the two sets of approximating functions. By comparing the two hierarchical clusterings, the impact of "correction" methods a) and b) can be evaluated. DOI: https://doi.org/10.53486/cike2023.44; UDC: 004.6:61; JEL: C63, I21, I23, I25, I29 en_US
dc.language.iso en en_US
dc.publisher ASEM en_US
dc.subject cleaning en_US
dc.subject smoothing en_US
dc.subject impact en_US
dc.subject similarity en_US
dc.subject functions en_US
dc.subject regression en_US
dc.title The Impact of Data Pre-Processing on the Assessment of the Similarity of Trend Functions en_US
dc.type Article en_US
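
The workflow outlined in the abstract can be sketched in Python as follows. This is an illustrative sketch only, not the author's implementation, and it rests on several assumptions that the record does not confirm: a low-degree polynomial fitted by least squares stands in for the non-linear regression, "suspicious" points are flagged by a hypothetical local-median rule, similarity is measured as the root-mean-square gap between two fitted functions on a common grid, and the hierarchy is built by plain single-linkage merging. All function names and default parameters are made up for illustration.

```python
from statistics import median

def smooth_neighbors(y, half_window=1):
    """Correction (a): replace each value by the mean of itself and its neighbors."""
    out = []
    for i in range(len(y)):
        window = y[max(0, i - half_window): i + half_window + 1]
        out.append(sum(window) / len(window))
    return out

def drop_suspicious(t, y, k=3.0):
    """Correction (b): drop points far from a local median (hypothetical rule)."""
    n = len(y)
    resid = []
    for i, v in enumerate(y):
        h = min(2, i, n - 1 - i)  # symmetric window that shrinks at the edges
        resid.append(abs(v - median(y[i - h: i + h + 1])))
    scale = median(resid) or 1.0  # fall back to 1.0 if the typical residual is 0
    keep = [i for i, r in enumerate(resid) if r <= k * scale]
    return [t[i] for i in keep], [y[i] for i in keep]

def fit_poly(t, y, degree=2):
    """Least-squares polynomial fit (assumed stand-in for the paper's regression)."""
    n = degree + 1
    # Normal equations A c = b for coefficients c[0] + c[1]*t + ...
    A = [[sum(x ** (i + j) for x in t) for j in range(n)] for i in range(n)]
    b = [sum(v * x ** i for x, v in zip(t, y)) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef

def evaluate(coef, x):
    return sum(a * x ** i for i, a in enumerate(coef))

def distance(c1, c2, grid):
    """Root-mean-square gap between two fitted functions on a common grid."""
    sq = sum((evaluate(c1, x) - evaluate(c2, x)) ** 2 for x in grid)
    return (sq / len(grid)) ** 0.5

def single_linkage(names, dist):
    """Agglomerative clustering; returns the merges as (left, right, distance)."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        pairs = [(min(dist[a][b] for a in ca for b in cb), i, j)
                 for i, ca in enumerate(clusters)
                 for j, cb in enumerate(clusters) if i < j]
        d, i, j = min(pairs)
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

def cluster_after(series, correct, grid):
    """series: {name: (t, y)}; correct: function (t, y) -> corrected (t, y)."""
    fits = {name: fit_poly(*correct(t, y)) for name, (t, y) in series.items()}
    dist = {a: {b: distance(fits[a], fits[b], grid) for b in fits} for a in fits}
    return single_linkage(list(fits), dist)
```

Under these assumptions, calling cluster_after(series, lambda t, y: (t, smooth_neighbors(y)), grid) and cluster_after(series, drop_suspicious, grid) on the same collection of time series yields two merge sequences. Differences between them in merge order or merge distance give a rough indication of how strongly the choice between corrections a) and b) affects the resulting hierarchy.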

