5. Graph-based semi-supervised learning

By Konstantin Avrachenkov, INRIA Sophia-Antipolis, France, k.avrachenkov@inria.fr | Maximilien Dreveton, Inria Sophia-Antipolis, France, maximilien.dreveton@gmail.com

Published: 06 Oct 2022

© 2022 Konstantin Avrachenkov | Maximilien Dreveton

Abstract

Semi-supervised learning (SSL) aims to achieve superior learning performance by combining unlabelled and labelled data. Because the amount of unlabelled data is typically large compared to the amount of labelled data, SSL methods are relevant when unsupervised learning performs poorly or when obtaining a large amount of labelled data for supervised learning is too costly. Unfortunately, many standard semi-supervised learning techniques have been shown not to use the unlabelled data efficiently, leading to unsatisfactory or unstable performance (Chapelle et al., 2006, Chapter 4; Ben-David et al., 2008; Cozman et al., 2002). Moreover, noise in the labelled data may further degrade their performance. In practice, this noise often comes from a tired or non-diligent expert carrying out the labelling task.