Please use this identifier to cite or link to this item: https://hdl.handle.net/10216/106477
Author(s): Mafalda Falcão Torres Veiga de Ferreira
Title: Statistical Comparison of Different Machine-Learning Approaches for Malaria Parasites Detection in Microscopic Images
Issue Date: 2017-07-11
Abstract: Malaria is a severe public health problem across the world, particularly in developing countries (≈80% of the cases occur in Africa), putting at special risk the most unprotected groups of society: children and pregnant women. Since it can be caused by 4 different species of parasites, each having different stages of evolution, reaching the right diagnosis without access to costly equipment is complex. Consequently, research has focused on speeding up its diagnosis and lowering its cost by resorting to automatic machine classification of microscopic images. Still, most approaches rely on simplistic, single-model classifiers, and the literature consistently lacks a systematic statistical comparison that supports a particular technique or feature.
Hence, this dissertation aims to: (i) design and execute a statistical comparison of different ML approaches for the detection of such parasites in microscopic images, (ii) identify which features are more relevant for prediction, and (iii) identify which models and techniques achieve the best results, balancing precision and recall. Before approaching the stated problem, an initial statistical analysis of the dataset was performed to discover its proportions and to detect highly correlated features.
Once the data was understood, a framework was developed that (i) optimizes the parameter values of the considered classification and feature selection algorithms, (ii) statistically compares different machine learning approaches on the same dataset under cross-validation, using several metrics to measure the variation in performance and to evaluate which metric is consistent with the data, and, finally, (iii) performs a statistical hypothesis test to confirm that the model with the best performance is distinct from all the others considered in this study.
As a result, an improvement over the established baseline is obtained by using an Fdr feature selection method followed by an Ada Boosting classifier with 350 estimators.
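The abstract does not name the hypothesis test used in step (iii). As a minimal sketch of how two models evaluated on the same cross-validation folds might be compared, the Python example below runs an exact paired sign-flip permutation test on per-fold scores. The choice of test, the function name `paired_permutation_test`, and the fold scores are all assumptions made for illustration; none of them come from the dissertation.

```python
from itertools import product
from statistics import mean


def paired_permutation_test(scores_a, scores_b):
    """Exact two-sided paired sign-flip permutation test on per-fold scores.

    Returns (observed mean difference, p-value).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same folds")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = mean(diffs)
    # Under the null hypothesis the sign of each per-fold difference is
    # arbitrary, so enumerate all 2^k sign assignments (feasible for the
    # usual k = 5 or k = 10 folds).
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        permuted = mean(s * d for s, d in zip(signs, diffs))
        total += 1
        # Small tolerance so floating-point noise does not drop exact ties.
        if abs(permuted) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical per-fold F1 scores for two models on the same 10 folds
# (made-up numbers, not results from the dissertation).
f1_model_a = [0.91, 0.89, 0.93, 0.90, 0.92, 0.88, 0.94, 0.90, 0.91, 0.92]
f1_model_b = [0.87, 0.88, 0.90, 0.86, 0.89, 0.87, 0.90, 0.88, 0.87, 0.89]

diff, p_value = paired_permutation_test(f1_model_a, f1_model_b)
print(f"mean F1 difference = {diff:.3f}, p = {p_value:.4f}")
```

With k folds, the smallest p-value this test can return is 2 / 2^k, which occurs when every fold favors the same model. When the best model is compared against several others, as the abstract describes, the resulting p-values would also need a multiple-comparison correction.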
Description: Malaria is a serious public health problem worldwide, particularly in developing countries (about 80% of cases occur in Africa), putting the lives of millions of people at risk. Because it is caused by 4 different species of parasites, each with different stages of evolution, reaching the correct diagnosis without access to expensive equipment is complex. In recent years, research has focused on speeding up this diagnosis and reducing its cost by classifying microscopic images with Machine Learning processes. Even so, most approaches still lack a systematic statistical comparison in the literature that supports a particular technique or feature. The objectives of this dissertation are therefore to: (i) design and execute a statistical comparison of different ML approaches for the detection of such parasites in microscopic images, (ii) identify which features are most relevant for prediction, (iii) identify which models and techniques achieve the best results, balancing precision and recall, and (iv) investigate the applicability of deep learning and ensemble techniques. The successful completion of this dissertation will equip developing countries with faster, cheaper, and more accurate diagnostic tools, directly improving the lives and health of populations.
Subject: Electrical engineering, Electronic engineering, Information engineering
Scientific areas: Engineering and technology::Electrical engineering, Electronic engineering, Information engineering
TID identifier: 201800578
URI: https://hdl.handle.net/10216/106477
Document Type: Dissertation
Rights: openAccess
License: https://creativecommons.org/licenses/by-sa/4.0/
Appears in Collections: FEUP - Dissertação

Files in This Item:
File: 205566.pdf
Description: Statistical Comparison of Different Machine-Learning Approaches for Malaria Parasites Detection in Microscopic Images
Size: 70.91 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.