6. Outlier Detection

Outlier detection (OD) requires observing all samples and aims to detect those that deviate significantly from the majority distribution. OD approaches are therefore usually transductive rather than inductive.

6.1 Density-based Method

[BMC-2014] Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range.
Authors: Xiang Wan, Wenqian Wang, Jiming Liu, Tiejun Tong
Institution: Hong Kong Baptist University; Northwestern University

[SIGMOD-2000] LOF: identifying density-based local outliers.
Authors: Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander
Institution: University of Munich; University of British Columbia

[PAKDD-2002] Enhancing effectiveness of outlier detections for low density patterns.
Authors: Jian Tang, Zhixiang Chen, Ada Wai-chee Fu, David W. Cheung
Institution: Chinese University of Hong Kong; University of Texas; University of Hong Kong

[ACM-2009] LoOP: local outlier probabilities.
Authors: Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek
Institution: Ludwig-Maximilians University

[DMKD-2012] Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection.
Authors: Erich Schubert, Arthur Zimek, Hans-Peter Kriegel
Institution: Ludwig-Maximilians-University; University of Alberta

[ACM-1981] Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography.
Authors: Martin A. Fischler, Robert C. Bolles
Institution: SRI International

[WIREs-2011] Robust statistics for outlier detection.
Authors: Peter J. Rousseeuw, Mia Hubert
Institution: Katholieke Universiteit Leuven

[NeurIPS-2018] Efficient anomaly detection via matrix sketching.
Authors: Vatsal Sharan, Parikshit Gopalan, Udi Wieder
Institution: Stanford University; VMware Research

6.2 Distance-based Method

6.2.1 Cluster-based Method

The most basic OD methods model the entire dataset with a Gaussian distribution and flag samples that lie more than three standard deviations from the mean.
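
The three-standard-deviation rule can be sketched in a few lines of Python. This is a minimal illustration rather than code from any of the listed papers: the function name `three_sigma_outliers`, the parameter `k`, and the toy data are assumptions made here, and the population standard deviation is used for simplicity.

```python
from statistics import mean, pstdev


def three_sigma_outliers(values, k=3.0):
    """Return indices of values more than k standard deviations from the mean.

    Hypothetical helper for illustration; k=3.0 mirrors the rule in the text.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over all samples
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]


# Toy data (assumed): twenty identical readings and one extreme value.
data = [10.0] * 20 + [100.0]
flagged = three_sigma_outliers(data)
```

Because the mean and standard deviation are computed from the very samples being scored, the sketch is transductive in the sense used above. Note that extreme values also inflate the estimated standard deviation, so on small datasets the rule may miss outliers.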

[KDD-1996] A density-based algorithm for discovering clusters in large spatial databases with noise.
Authors: Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
Institution: University of Munich

[ECML-2007] Class noise mitigation through instance weighting.
Authors: Umaa Rebbapragada, Carla E. Brodley
Institution: Tufts University

6.2.2 Graph-based Method

Similar to the "three standard deviations" rule, which assumes the data follow a normal distribution, the interquartile range can also be used to identify outliers.
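
A minimal sketch of an interquartile-range rule follows. The text does not specify the fence width; the multiplier `k=1.5` is the common Tukey convention and is an assumption here, as are the function name `iqr_outliers`, the choice of the `"inclusive"` quantile method, and the toy data.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is an assumed convention.
    """
    # quantiles() sorts internally; n=4 yields the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Toy data (assumed): nine evenly spaced values and one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
flagged = iqr_outliers(data)
```

Quartiles are less sensitive to extreme values than the mean and standard deviation, which is one reason this rule is often preferred when the normality assumption is doubtful.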

[DMKD-2014] Graph based anomaly detection and description: a survey.
Authors: Leman Akoglu, Hanghang Tong, Danai Koutra
Institution: Stony Brook University; City University of New York; Carnegie Mellon University

[SIGKDD-2003] Graph-based anomaly detection.
Authors: Caleb C. Noble, Diane J. Cook
Institution: University of Texas

[ICTAI-2007] Spatial outlier detection: a graph-based approach.
Authors: Yufeng Kou, Chang-Tien Lu, Raimundo F. Dos Santos
Institution: Virginia Polytechnic Institute and State University

[ICCSE-2012] A graph-based clustering algorithm for anomaly intrusion detection.
Authors: Zhou Mingqiang, Huang Hui, Wang Qian
Institution: Chongqing University

[ACM-2020] Webly supervised image classification with metadata: Automatic noisy label correction via visual-semantic graph.
Authors: Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang
Institution: SenseTime Research; Rice University; The Chinese University of Hong Kong; Shanghai Jiao Tong University

6.3 Classification-based Method

[-2002] One-class classification: Concept learning in the absence of counter-examples.
Authors: David M.J. Tax
Institution: Technische Universiteit Delft

[ICML-2018] Deep one-class classification.
Authors: Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, Marius Kloft
Institution: Humboldt University; Hasso Plattner Institute; TU Kaiserslautern; TU Berlin; University of Edinburgh; DFKI GmbH; Singapore University of Technology and Design

[ICDM-2008] Isolation forest.
Authors: Fei Tony Liu, Kai Ming Ting, Zhi-Hua Zhou
Institution: Monash University; Nanjing University

[CVPR-2017] Learning from noisy labels with distillation.
Authors: Yuncheng Li, Jianchao Yang, Yale Song, Liangliang Cao, Jiebo Luo, Li-Jia Li
Institution: Snap Inc.; Yahoo Research

[ICLR-2020] SELF: Learning to filter noisy labels with self-ensembling.
Authors: Duc Tam Nguyen, Chaithanya Kumar Mummadi, Thi Phuong Nhung Ngo, Thi Hoai Phuong Nguyen, Laura Beggel, Thomas Brox
Institution: University of Freiburg; Bosch Research; Bosch Center for AI; Karlsruhe Institute of Technology

[NeurIPS-2018] Co-teaching: Robust training of deep neural networks with extremely noisy labels.
Authors: Bo Han, Quanming Yao, Xingrui Yu, Gang Niu, Miao Xu, Weihua Hu, Ivor Tsang, Masashi Sugiyama
Institution: University of Technology Sydney; RIKEN; 4Paradigm Inc.; Stanford University; University of Tokyo

[ECCV-2020] Webly supervised image classification with self-contained confidence.
Authors: Jingkang Yang, Litong Feng, Weirong Chen, Xiaopeng Yan, Huabin Zheng, Ping Luo, Wayne Zhang
Institution: SenseTime Research; Rice University; The Chinese University of Hong Kong; The University of Hong Kong