The UU-test (Master thesis)

Χασάνη, Παρασκευή

Recognizing unimodal data distributions is of great significance in statistics, machine learning and data science. Well-known distributions such as the Gaussian, Student’s t, Gamma, Chi-square, Exponential and Cauchy are typical examples of unimodal distributions, and the uniform distribution is considered an extreme unimodal case. The characteristic property of a unimodal distribution is that data values are gathered around a single value (peak), which is the mode of the distribution. Because of this property, the data can be characterized as homogeneous, forming a single coherent group. Unimodality tests have been proposed to decide whether a set of data values is unimodal, thus providing useful knowledge about the structure of the data. For example, if a dataset is unimodal, its values already form a single group, so applying a clustering method is unnecessary. Current unimodality tests decide exclusively on the existence (or not) of a single mode and do not address the statistical modeling of the data. We propose a new unimodality test, called the Unimodal-Uniform test (UU-test), to decide whether a set of data values has been generated by a unimodal distribution. The method uses the empirical cumulative distribution function (ecdf) and attempts to obtain a unimodal piecewise linear approximation of the ecdf under the constraint that the data corresponding to each linear segment follow the uniform distribution. An attractive feature of the proposed approach is that it not only decides on unimodality but also produces a generative model of the unimodal data in the form of a mixture of uniform distributions. Thus, it can be used for statistical data modeling. Modeling unimodal data is typically performed by fitting a single specific unimodal distribution, usually a Gaussian. This approach lacks flexibility, since it cannot efficiently model data samples generated by asymmetric distributions. The uniform mixture model produced by the UU-test is able to model unimodal distributions of arbitrary shape. Our experimental evaluation shows that the UU-test is effective both in deciding unimodality/multimodality and in providing accurate statistical models of unimodal data.
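The link between a piecewise linear ecdf and a mixture of uniforms can be illustrated with a minimal Python sketch. It is not the thesis implementation: the breakpoints are assumed to be given (the abstract does not say how the UU-test finds them), the Kolmogorov-Smirnov distance is used only as a stand-in uniformity check, and all function names are hypothetical. Each linear segment of the approximation corresponds to one uniform component whose weight is the fraction of data in that segment, and the approximation is unimodal when the segment densities (slopes) first rise and then fall.

```python
import random


def uniform_mixture(data, breakpoints):
    """Return (left, right, weight) per interval between consecutive breakpoints.

    Assumption: breakpoints are supplied by the caller and cover the data range.
    The weight is the fraction of points in the interval, i.e. the rise of the
    ecdf over that segment.
    """
    n = len(data)
    last = len(breakpoints) - 2
    comps = []
    for i, (a, b) in enumerate(zip(breakpoints, breakpoints[1:])):
        # Half-open intervals, except the last one, which includes its right end.
        count = sum(1 for x in data if a <= x < b or (i == last and x == b))
        comps.append((a, b, count / n))
    return comps


def densities(comps):
    """Density of each uniform component: the slope of its ecdf segment."""
    return [w / (b - a) for a, b, w in comps]


def is_unimodal(dens):
    """True if the densities rise (weakly) to one peak and then fall (weakly)."""
    peak = dens.index(max(dens))
    rising = all(dens[i] <= dens[i + 1] for i in range(peak))
    falling = all(dens[i] >= dens[i + 1] for i in range(peak, len(dens) - 1))
    return rising and falling


def ks_uniform_statistic(points, a, b):
    """Kolmogorov-Smirnov distance between the ecdf of points and Uniform(a, b).

    Stand-in for a uniformity test; a real test would compare this statistic
    with a critical value.
    """
    xs = sorted(points)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        u = (x - a) / (b - a)  # Uniform(a, b) cdf at x
        d = max(d, (i + 1) / n - u, u - i / n)
    return d


def sample(comps, k, rng):
    """Draw k values from the mixture: pick a component, then a uniform value."""
    intervals = [(a, b) for a, b, _ in comps]
    weights = [w for _, _, w in comps]
    chosen = rng.choices(intervals, weights=weights, k=k)
    return [rng.uniform(a, b) for a, b in chosen]


# Toy data and hand-picked breakpoints (both illustrative assumptions).
data = [0.5, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7, 3.5]
model = uniform_mixture(data, [0, 1, 2, 3, 4])
print(model)
print(is_unimodal(densities(model)))
print(sample(model, 5, random.Random(0)))
```

The sampling step is what makes the fitted mixture a generative model: new values are produced by choosing a segment with probability equal to its weight and then drawing uniformly inside it.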
Alternative title / Subtitle: deciding on distribution unimodality using tests of uniformity
detecting unimodality of distributions using uniformity tests
Institution and School/Department of submitter: University of Ioannina. School of Engineering. Department of Computer Science and Engineering
Subject classification: Tests
Keywords: Unimodality, Uniform, Test, Distribution
