Combining statistical & deep learning models for semi-supervised visual recognition


Nouboukpo, Adama (2024). Combining statistical & deep learning models for semi-supervised visual recognition. Thesis. Gatineau, Université du Québec en Outaouais, Département d’informatique et d’ingénierie, 116 p.


Abstract

This thesis addresses two important and complex problems in visual recognition: semantic image segmentation and visual anomaly detection. We propose several approaches to these problems that combine statistical modeling (mixture models) with Deep Learning, with the aim of overcoming existing theoretical and algorithmic limitations in these areas. In particular, the thesis explores how the advantages of mixture models can enhance the performance of Deep Learning on these problems.
Firstly, we address fully unsupervised segmentation (without labels), incorporating constraints related to the spatial relationships between image data. The proposed unsupervised model automatically groups image pixels into coherent regions without relying on any prior knowledge. This is achieved through hierarchical Gaussian mixture reduction, which results in an optimal number of components that best fit and represent the image structure. In the specific case of foreground image segmentation, the process typically yields two super-clusters, each composed of multiple components, representing the foreground and background regions, respectively.
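To make the idea of hierarchical mixture reduction concrete, the following is a minimal sketch, not the method of the thesis. It assumes 1-D Gaussian components over grayscale intensity, represented as hypothetical `(weight, mean, variance)` tuples, and it greedily merges the cheapest pair until a target number of components remains. The moment-preserving merge is standard; the log-variance merge cost and the fixed `target` are choices made here for illustration only. Unlike the super-clusters described above, which keep their member components, this sketch collapses each merged pair into a single Gaussian.

```python
import math

def merge(c1, c2):
    """Moment-preserving merge of two 1-D Gaussian components.

    Each component is a (weight, mean, variance) tuple. The merged
    component keeps the total weight, mean, and variance of the pair.
    """
    w1, m1, v1 = c1
    w2, m2, v2 = c2
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return (w, m, v)

def merge_cost(c1, c2):
    """Illustrative cost: weighted increase in log-variance caused by merging.

    This criterion is an assumption of the sketch, not taken from the thesis.
    """
    w, _, v = merge(c1, c2)
    return 0.5 * (w * math.log(v)
                  - c1[0] * math.log(c1[2])
                  - c2[0] * math.log(c2[2]))

def reduce_mixture(components, target):
    """Greedily merge the cheapest pair until `target` components remain."""
    comps = list(components)
    while len(comps) > target:
        # Evaluate every pair and pick the one that is cheapest to merge.
        _, i, j = min(
            (merge_cost(comps[i], comps[j]), i, j)
            for i in range(len(comps))
            for j in range(i + 1, len(comps))
        )
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)]
        comps.append(merged)
    return comps

# Hypothetical mixture: two dark and two bright intensity modes.
pixel_mixture = [
    (0.25, 0.10, 0.01),
    (0.25, 0.15, 0.01),
    (0.25, 0.80, 0.01),
    (0.25, 0.85, 0.01),
]
reduced = reduce_mixture(pixel_mixture, target=2)
```

With the toy mixture above, the two dark modes and the two bright modes are each merged, which mirrors the two-group foreground/background outcome in spirit. A criterion that selects the number of components automatically would replace the fixed `target`.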
Secondly, we pursue our goal of enhancing image segmentation in a semi-supervised manner. To this end, we propose a weakly semi-supervised approach based on a Gaussian mixture model that uses scarce labeled samples along with multi-level constraints to achieve semantic segmentation. These labels, which may be user-provided or automatically generated, guide the segmentation process. The multi-level constraints play a crucial role in forming coherent and well-structured regions (pixel groups), effectively preventing poor segmentation. Moreover, they help capture redundancy within the image and drastically reduce the complexity of subsequent processing by shifting the focus from pixel-level to region-level classification. Preliminary results are very encouraging.
Thirdly, building on advances in deep learning, we address semantic segmentation by using a modified version of our semi-supervised mixture model to regularize deep neural networks for more effective skin lesion segmentation. Specifically, we introduce a new semi-supervised framework designed to achieve more accurate and robust skin lesion segmentation. The proposed method uses both local and global spatial consistency to effectively exploit a limited set of labeled images along with a larger set of unlabeled images. This third proposition outperforms current methods, showing strong potential for semi-supervised medical image segmentation where annotated data is scarce.
Lastly, this thesis presents a specialized deep generative model for anomaly detection, focusing particularly on image-based outlier/out-of-distribution detection using a variational autoencoder that incorporates a Gaussian Mixture Model in its latent space. Anomaly detection is of growing significance in the context of big data, where identifying outliers or novel occurrences is crucial, and our proposed approach is designed to be both effective and efficient. It addresses the complexities inherent in modern, large-scale data environments, contributing to enhanced data-driven decision-making processes.
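As a rough illustration of how a Gaussian mixture in a latent space can support out-of-distribution detection, the sketch below scores a latent vector by its negative log-likelihood under a diagonal-covariance mixture. This is an assumption-laden sketch: the abstract does not state the scoring rule used in the thesis (which might instead rely on reconstruction error or a variational bound), the encoder that would produce the latent vector is omitted, and all names and parameter values are hypothetical.

```python
import math

def log_gaussian_diag(z, mean, var):
    """Log-density of a Gaussian with diagonal covariance at point z."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(z, mean, var)
    )

def gmm_log_likelihood(z, weights, means, variances):
    """Log-likelihood of z under a mixture, computed with log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(z, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def anomaly_score(z, weights, means, variances):
    """Higher score means z is less likely under the latent mixture."""
    return -gmm_log_likelihood(z, weights, means, variances)

# Hypothetical 2-D latent mixture with two components.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

inlier = [0.1, -0.2]      # close to the first component
outlier = [10.0, -10.0]   # far from both components
is_flagged = (anomaly_score(outlier, weights, means, variances)
              > anomaly_score(inlier, weights, means, variances))
```

In practice a threshold on the score would be chosen on held-out normal data; the comparison above only shows that a distant latent vector receives a higher score than one near a mixture component.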

Document type:	Thesis
Thesis supervisor:	Allili, Mohand Saïd
Departments and school, research units and services:	Informatique et ingénierie
Date deposited:	04 Oct 2024 19:44
Last modified:	04 Oct 2024 19:44
URI:	https://di.uqo.ca/id/eprint/1692
