Inaccurate Labels in Weakly-Supervised Deep Learning: Automatic Identification and Correction and Their Impact on Classification Performance.

Abstract:

In data-driven, deep learning-based modeling, data quality can substantially influence classification performance, so correct labeling of the training data is critical. In weakly-supervised learning, a key challenge is dealing with potentially inaccurate or mislabeled training data. In this paper, we propose an automated methodological framework that identifies mislabeled data using two metric functions: cross-entropy loss, which indicates the divergence between a prediction and the ground truth, and the influence function, which reflects the dependence of a model on the data. After correcting the identified mislabels, we measured their impact on classification performance. We also compared the mislabeling effects in three experiments on two different real-world clinical questions. A total of 10,500 images were studied in the contexts of clinical breast density category classification and breast cancer malignancy diagnosis. We used intentionally flipped labels as mislabels to evaluate the proposed method at varying proportions of mislabeled data included in model training. We also compared the effects of our method with two published schemes for breast density category classification. Experimental results show that when the dataset contains 10% mislabeled data, our method can automatically identify up to 98% of these mislabeled data by examining the top 30% of the full dataset. Furthermore, we show that correcting the identified mislabels improves classification performance. Our method provides a feasible solution for dealing with inaccurate labels in weakly-supervised deep learning modeling.
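
The loss-based half of the screening idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy probabilities, and the use of NumPy are assumptions made for illustration, and the influence-function metric is not covered here. The sketch ranks samples by per-sample cross-entropy loss against their given (possibly wrong) labels, flags the top fraction for review, and measures how many intentionally flipped labels land in the flagged set.

```python
import numpy as np


def per_sample_cross_entropy(probs, labels, eps=1e-12):
    """Cross-entropy of each predicted distribution against its given label."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Probability the model assigns to the label each sample currently carries.
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(np.clip(picked, eps, 1.0))


def flag_suspects(probs, labels, review_fraction=0.3):
    """Indices of the highest-loss samples, most suspicious first."""
    losses = per_sample_cross_entropy(probs, labels)
    n_review = int(np.ceil(review_fraction * len(losses)))
    order = np.argsort(-losses, kind="stable")
    return order[:n_review]


def recall_of_flips(flagged, flipped):
    """Fraction of known flipped labels that fall inside the flagged set."""
    flipped = {int(i) for i in flipped}
    if not flipped:
        return 0.0
    hits = flipped.intersection(int(i) for i in flagged)
    return len(hits) / len(flipped)


# Toy data (hypothetical): 10 binary predictions, with the label of sample 4 flipped.
toy_probs = np.array([
    [0.90, 0.10], [0.80, 0.20], [0.20, 0.80], [0.10, 0.90], [0.95, 0.05],
    [0.30, 0.70], [0.85, 0.15], [0.05, 0.95], [0.70, 0.30], [0.15, 0.85],
])
toy_labels = np.array([0, 0, 1, 1, 1, 1, 0, 1, 0, 1])  # index 4 should be 0
toy_flagged = flag_suspects(toy_probs, toy_labels, review_fraction=0.3)
print("flagged for review:", toy_flagged)
print("recall of flipped labels:", recall_of_flips(toy_flagged, [4]))
```

In this toy case the flipped sample receives by far the largest loss, so it appears first in the review list. On real data the predicted probabilities would have to come from a trained model, and the review fraction is a tunable choice rather than a fixed rule.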

authors

Hao D,Zhang L,Sumkin J,Mohamed A,Wu S

doi

10.1109/JBHI.2020.2974425

subject

Has Abstract

pub_date

2020-09-01 00:00:00

pages

2701-2710

issue

9

issn

2168-2194

eissn

2168-2208

journal_volume

24

pub_type

Journal Article
  • Inverse estimation of multiple muscle activations from joint moment with muscle synergy extraction.

    abstract::Human movement is produced resulting from synergetic combinations of multiple muscle contractions. The resultant joint movement can be estimated through the related multiple-muscle activities, which is formulated as the forward problem. Neuroprosthetic applications may benefit from cocontraction of agonist and antagon...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1109/JBHI.2014.2342274

    authors: Li Z,Guiraud D,Hayashibe M

    update_date:2015-01-01 00:00:00

  • MRI-based segmentation of pubic bone for evaluation of pelvic organ prolapse.

    abstract::Pelvic organ prolapse (POP) is a major women's health problem. Its diagnosis through magnetic resonance imaging (MRI) has become popular due to current inaccuracies of clinical examination. The diagnosis of POP on MRI consists of identifying reference points on pelvic bone structures for measurement and evaluation. Ho...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2014.2302437

    authors: Onal S,Lai-Yuen SK,Bao P,Weitzenfeld A,Hart S

    update_date:2014-07-01 00:00:00

  • Arrhythmia discrimination using a smart phone.

    abstract::We hypothesize that our smartphone-based arrhythmia discrimination algorithm with data acquisition approach reliably differentiates between normal sinus rhythm (NSR), atrial fibrillation (AF), premature ventricular contractions (PVCs) and premature atrial contraction (PACs) in a diverse group of patients having these ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2015.2418195

    authors: Chong JW,Esa N,McManus DD,Chon KH

    update_date:2015-05-01 00:00:00

  • A chance-constrained programming approach to preoperative planning of robotic cardiac surgery under task-level uncertainty.

    abstract::In this paper, a novel formulation for robust surgical planning of robotics-assisted minimally invasive cardiac surgery based on patient-specific preoperative images is proposed. In this context, robustness is quantified in terms of the likelihood of intraoperative collisions and of joint limit violations. The propose...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2014.2315798

    authors: Azimian H,Naish MD,Kiaii B,Patel RV

    update_date:2015-03-01 00:00:00

  • Unsupervised eye blink artifact denoising of EEG data with modified multiscale sample entropy, Kurtosis, and wavelet-ICA.

    abstract::Brain activities commonly recorded using the electroencephalogram (EEG) are contaminated with ocular artifacts. These activities can be suppressed using a robust independent component analysis (ICA) tool, but its efficiency relies on manual intervention to accurately identify the independent artifactual components. In...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2014.2333010

    authors: Mahajan R,Morshed BI

    update_date:2015-01-01 00:00:00

  • A survey on ambient-assisted living tools for older adults.

    abstract::In recent years, we have witnessed a rapid surge in assisted living technologies due to a rapidly aging society. The aging population, the increasing cost of formal health care, the caregiver burden, and the importance that the individuals place on living independently, all motivate development of innovative-assisted ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/jbhi.2012.2234129

    authors: Rashidi P,Mihailidis A

    update_date:2013-05-01 00:00:00

  • Automatic annotation of seismocardiogram with high-frequency precordial accelerations.

    abstract::Seismocardiogram (SCG) is the low-frequency vibrations signal recorded from the chest using accelerometers. Peaks on dorsoventral and sternal SCG correspond to specific cardiac events. Prior research work has shown the potential of extracting such peaks for various types of monitoring and diagnosis applications. Howev...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2014.2360156

    authors: Khosrow-khavar F,Tavakolian K,Blaber AP,Zanetti JM,Fazel-Rezai R,Menon C

    update_date:2015-07-01 00:00:00

  • A Visually Interpretable Deep Learning Framework for Histopathological Image-based Skin Cancer Diagnosis.

    abstract::Owing to the high incidence rate and the severe impact of skin cancer, the precise diagnosis of malignant skin tumors is a significant goal, especially considering treatment is normally effective if the tumor is detected early. Limited published histopathological image sets and the lack of an intuitive correspondence ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2021.3052044

    authors: Jiang S,Li H,Jin Z

    update_date:2021-01-15 00:00:00

  • Transfer Learning for Multicenter Classification of Chronic Obstructive Pulmonary Disease.

    abstract::Chronic obstructive pulmonary disease (COPD) is a lung disease that can be quantified using chest computed tomography scans. Recent studies have shown that COPD can be automatically diagnosed using weakly supervised learning of intensity and texture distributions. However, up till now such classifiers have only been e...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2017.2769800

    authors: Cheplygina V,Pena IP,Pedersen JH,Lynch DA,Sorensen L,de Bruijne M

    update_date:2018-09-01 00:00:00

  • Automatic Analysis of Food Intake and Meal Microstructure Based on Continuous Weight Measurements.

    abstract::The structure of the cumulative food intake (CFI) curve has been associated with obesity and eating disorders. Scales that record the weight loss of a plate from which a subject eats food are used for capturing this curve; however, their measurements are contaminated by additive noise and are distorted by certain type...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2018.2812243

    authors: Papapanagiotou V,Diou C,Ioakimidis I,Sodersten P,Delopoulos A

    update_date:2019-03-01 00:00:00

  • A Smartphone Application for Automated Decision Support in Cognitive Task Based Evaluation of Central Nervous System Motor Disorders.

    abstract:BACKGROUND AND OBJECTIVE:New technology enables constant boost to the powers of mobile devices, which in the previous years have transformed from simple mobile phones to smart phones. Computational powers of these electronics enable actions that previously were possible only for computers. By the use of special applica...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2019.2891729

    authors: Lauraitis A,Maskeliunas R,Damasevicius R,Polap D,Wozniak M

    update_date:2019-09-01 00:00:00

  • Neonatal Heart and Lung Sound Quality Assessment for Robust Heart and Breathing Rate Estimation for telehealth Applications.

    abstract::With advances in digital stethoscopes, internet of things, signal processing and machine learning, chest sounds can be easily collected and transmitted to the cloud for remote monitoring and diagnosis. However, low quality of recordings complicates remote monitoring and diagnosis, particularly for neonatal care. This ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2020.3047602

    authors: Grooby E,He J,Kiewsky J,Fattahi D,Zhou L,King A,Ramanathan A,Malhotra A,Dumont GA,Marzbanrad F

    update_date:2020-12-28 00:00:00

  • Low-Dimensional Subject Representation-based Transfer Learning in EEG Decoding.

    abstract::Recently, the advances in passive brain-computer interfaces (BCIs) based on electroencephalogram (EEG) have shed light on real-world neuromonitoring technologies. However, human variability in the EEG activities hinders the development of practical applications of EEG-based BCI. To tackle this problem, many transfer-l...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2020.3025865

    authors: Jeng PY,Wei CS,Jung TP,Wang LC

    update_date:2020-09-22 00:00:00

  • The Removal of EOG Artifacts From EEG Signals Using Independent Component Analysis and Multivariate Empirical Mode Decomposition.

    abstract::The recorded electroencephalography (EEG) signals are usually contaminated by electrooculography (EOG) artifacts. In this paper, by using independent component analysis (ICA) and multivariate empirical mode decomposition (MEMD), the ICA-based MEMD method was proposed to remove EOG artifacts (EOAs) from multichannel EE...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2015.2450196

    authors: Wang G,Teng C,Li K,Zhang Z,Yan X

    update_date:2016-09-01 00:00:00

  • Modeling Consistent Dynamics of Cardiogenic Vibrations in Low-Dimensional Subspace.

    abstract::The seismocardiogram (SCG) measures the movement of the chest wall in response to underlying cardiovascular events. Though this signal contains clinically-relevant information, its morphology is both patient-specific and highly transient. In light of recent work suggesting the existence of population-level patterns in...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2020.2980979

    authors: Zia J,Kimball J,Hersek S,Inan OT

    update_date:2020-07-01 00:00:00

  • RACE-Net: A Recurrent Neural Network for Biomedical Image Segmentation.

    abstract::The level set based deformable models (LDM) are commonly used for medical image segmentation. However, they rely on a handcrafted curve evolution velocity that needs to be adapted for each segmentation task. The Convolutional Neural Networks (CNN) address this issue by learning robust features in a supervised end-to-e...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2018.2852635

    authors: Chakravarty A,Sivaswamy J

    update_date:2019-05-01 00:00:00

  • Wireless gigabit data telemetry for large-scale neural recording.

    abstract::Implantable wireless neural recording from a large ensemble of simultaneously acting neurons is a critical component to thoroughly investigate neural interactions and brain dynamics from freely moving animals. Recent researches have shown the feasibility of simultaneously recording from hundreds of neurons and suggest...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2015.2416202

    authors: Kuan YC,Lo YK,Kim Y,Chang MC,Liu W

    update_date:2015-05-01 00:00:00

  • Laryngeal Tumor Detection and Classification in Endoscopic Video.

    abstract::The development of the narrow-band imaging (NBI) has been increasing the interest of medical specialists in the study of laryngeal microvascular network to establish diagnosis without biopsy and pathological examination. A possible solution to this challenging problem is presented in this paper, which proposes an auto...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2014.2374975

    authors: Barbalata C,Mattos LS

    update_date:2016-01-01 00:00:00

  • Statistical Metamodeling and Sequential Design of Computer Experiments to Model Glyco-Altered Gating of Sodium Channels in Cardiac Myocytes.

    abstract::Glycan structures account for up to 35% of the mass of cardiac sodium ( Nav ) channels. To question whether and how reduced sialylation affects Nav activity and cardiac electrical signaling, we conducted a series of in vitro experiments on ventricular apex myocytes under two different glycosylation conditions, reduced...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2015.2458791

    authors: Du D,Yang H,Ednie AR,Bennett ES

    update_date:2016-09-01 00:00:00

  • Multiple-Time-Series Clinical Data Processing for Classification With Merging Algorithm and Statistical Measures.

    abstract::A description of patient conditions should consist of the changes in and combination of clinical measures. Traditional data-processing method and classification algorithms might cause clinical information to disappear and reduce prediction performance. To improve the accuracy of clinical-outcome prediction by using mu...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2014.2357719

    authors: Tseng YJ,Ping XO,Liang JD,Yang PM,Huang GT,Lai F

    update_date:2015-05-01 00:00:00

  • PulseGAN: Learning to generate realistic pulse waveforms in remote photoplethysmography.

    abstract::Remote photoplethysmography (rPPG) is a non-contact technique for measuring cardiac signals from facial videos. High-quality rPPG pulse signals are urgently demanded in many fields, such as health monitoring and emotion recognition. However, most of the existing rPPG methods can only be used to get average heart rate ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2021.3051176

    authors: Song R,Chen H,Cheng J,Li C,Liu Y,Chen X

    update_date:2021-01-12 00:00:00

  • Length-of-Stay Prediction for Pediatric Patients With Respiratory Diseases Using Decision Tree Methods.

    abstract::Accurate prediction of a patient's length-of-stay (LOS) in the hospital enables an efficient and effective management of hospital beds. This paper studies LOS prediction for pediatric patients with respiratory diseases using three decision tree methods: Bagging, Adaboost, and Random forest. A data set of 11,206 record...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2020.2973285

    authors: Ma F,Yu L,Ye L,Yao DD,Zhuang W

    update_date:2020-09-01 00:00:00

  • Designing a robust activity recognition framework for health and exergaming using wearable sensors.

    abstract::Detecting human activity independent of intensity is essential in many applications, primarily in calculating metabolic equivalent rates and extracting human context awareness. Many classifiers that train on an activity at a subset of intensity levels fail to recognize the same activity at other intensity levels. This...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2013.2287504

    authors: Alshurafa N,Xu W,Liu JJ,Huang MC,Mortazavi B,Roberts CK,Sarrafzadeh M

    update_date:2014-09-01 00:00:00

  • Accurate Joint-Alignment of Indocyanine Green and Fluorescein Angiograph Sequences for Treatment of Subretinal Lesions.

    abstract::In ophthalmology, aligning images in indocyanine green and fluorescein angiograph sequences is important for the treatment of subretinal lesions. This paper introduces an algorithm that is tailored to align jointly in a common reference space all the images in an angiogram sequence containing both modalities. To overc...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2016.2538265

    authors: Chia-Ling Tsai,Hung-Chuan Hsu,Xin-Chang Wu,Shih-Jen Chen,Wei-Yang Lin

    update_date:2017-05-01 00:00:00

  • Cluster-based analysis for personalized stress evaluation using physiological signals.

    abstract::Technology development in wearable sensors and biosignal processing has made it possible to detect human stress from the physiological features. However, the intersubject difference in stress responses presents a major challenge for reliable and accurate stress estimation. This research proposes a novel cluster-based ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article, Randomized Controlled Trial

    doi:10.1109/JBHI.2014.2311044

    authors: Xu Q,Nwe TL,Guan C

    update_date:2015-01-01 00:00:00

  • A Workflow-Driven Formal Methods Approach to the Generation of Structured Checklists for Intrahospital Patient Transfers.

    abstract::Intrahospital transfers are a common but hazardous aspect of hospital care, with a large number of incidents posing a threat to patient safety. A growing body of work advocates the use of checklists for minimizing intrahospital transfer risk, but the majority of existing checklists are not guaranteed to be error-free ...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2016.2579881

    authors: Manataki A,Fleuriot J,Papapanagiotou P

    update_date:2017-07-01 00:00:00

  • GP-CNN-DTEL: Global-Part CNN Model With Data-Transformed Ensemble Learning for Skin Lesion Classification.

    abstract::Precise skin lesion classification is still challenging due to two problems, i.e., (1) inter-class similarity and intra-class variation of skin lesion images, and (2) the weak generalization ability of single Deep Convolutional Neural Network trained with limited data. Therefore, we propose a Global-Part Convolutional...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2020.2977013

    authors: Tang P,Liang Q,Yan X,Xiang S,Zhang D

    update_date:2020-10-01 00:00:00

  • ELEMENT: Multi-Modal Retinal Vessel Segmentation Based on a Coupled Region Growing and Machine Learning Approach.

    abstract::Vascular structures in the retina contain important information for the detection and analysis of ocular diseases, including age-related macular degeneration, diabetic retinopathy and glaucoma. Commonly used modalities in diagnosis of these diseases are fundus photography, scanning laser ophthalmoscope (SLO) and fluor...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2020.2999257

    authors: Rodrigues EO,Conci A,Liatsis P

    update_date:2020-12-01 00:00:00

  • Enhancing Heart-Beat-Based Security for mHealth Applications.

    abstract::In heart-beat-based security, a security key is derived from the time difference between consecutive heart beats (the inter-pulse interval, IPI), which may, subsequently, be used to enable secure communication. While heart-beat-based security holds promise in mobile health (mHealth) applications, there currently exist...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2015.2496151

    authors: Seepers RM,Strydis C,Sourdis I,De Zeeuw CI

    update_date:2017-01-01 00:00:00

  • A Realistic Framework for Investigating Decision Making in the Brain With High Spatiotemporal Resolution Using Simultaneous EEG/fMRI and Joint ICA.

    abstract::Human decision making is a multidimensional construct, driven by a complex interplay between external factors, internal biases, and computational capacity constraints. Here, we propose a layered approach to experimental design in which multiple tasks-from simple to complex-with additional layers of complexity introduc...

    journal_title:IEEE journal of biomedical and health informatics

    pub_type: Journal Article

    doi:10.1109/JBHI.2016.2590434

    authors: Kyathanahally SP,Franco-Watkins A,Zhang X,Calhoun VD,Deshpande G

    update_date:2017-05-01 00:00:00