TY - GEN
T1 - Outliers Detection in Multi-label Datasets
AU - Bello, Marilyn
AU - Nápoles, Gonzalo
AU - Morera, Rafael
AU - Vanhoof, Koen
AU - Bello, Rafael
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG. Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020
Y1 - 2020
AB - In many knowledge discovery applications, finding outliers, i.e., objects that behave in an unexpected way or have abnormal properties, is more interesting than finding inliers in a dataset. Outlier detection is important for many applications, including intrusion detection, credit card fraud detection, and the detection of criminal activity in e-commerce. Several outlier detection methods have been proposed, many of them from the perspective of Rough Set Theory, but so far none of them is specifically intended for multi-label datasets. In this paper, we propose a method that measures the degree of anomaly of an object in a multi-label dataset. This score quantifies how irregular the object is with respect to the rest of the dataset. In addition, a method for generating anomalies in this type of dataset is proposed. Using the resulting synthetic datasets, the efficacy of the proposed method is demonstrated. The results show the superiority of our proposal over other methods in the literature adapted to multi-label problems.
KW - Knowledge discovery
KW - Multi-label datasets
KW - Outlier detection
KW - Outlier generation
KW - Rough set theory
UR - http://www.scopus.com/inward/record.url?scp=85092646846&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-60884-2_5
DO - 10.1007/978-3-030-60884-2_5
M3 - Conference contribution
AN - SCOPUS:85092646846
SN - 9783030608835
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 65
EP - 75
BT - Advances in Soft Computing - 19th Mexican International Conference on Artificial Intelligence, MICAI 2020, Proceedings
A2 - Martínez-Villaseñor, Lourdes
A2 - Ponce, Hiram
A2 - Herrera-Alcántara, Oscar
A2 - Castro-Espinoza, Félix A.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 19th Mexican International Conference on Artificial Intelligence, MICAI 2020
Y2 - 12 October 2020 through 17 October 2020
ER -