Denis Musinguzi

Research Engineer

Research Interests

Multimodal Machine Learning, Automatic Speech Recognition, AI for Healthcare


Projects

Paligemma-CXR: A multi-task multimodal model for TB chest X-ray interpretation.

This project aims to develop a unified multi-task multimodal model that performs classification, object detection, segmentation, report generation, and visual question answering (VQA) on chest X-ray images of patients suspected of having tuberculosis (TB).

To support this, we curated a comprehensive multi-task dataset. The initial dataset contains segmentation masks for TB-related pathologies and diagnostic classification labels. From this base dataset, we extracted additional labels for report generation, object detection, and VQA, enabling us to train the model across all tasks using a consistent source of annotated data.

The model is built on the PaliGemma architecture, which we fine-tune jointly across all tasks. Our experiments show that this multi-task approach outperforms task-specific models trained individually.


ASR for Africa

The goal of this project is to develop automatic speech recognition (ASR) models for multiple low-resource African languages. We train models on progressively larger amounts of speech data to explore how data volume influences ASR performance and to determine the minimum data required to build high-quality models. In addition, we examine the impact of integrating a language model into the ASR pipeline and analyze how it affects recognition accuracy. Finally, we conduct a detailed error analysis to identify common failure modes and understand the underlying causes of model errors.


Exploring the Role and Feasibility of Natural Language Processing Techniques to Improve Mental Health Services in Uganda and Tanzania

This project aims to develop natural language processing (NLP) techniques to enhance the quality of mental health services. As part of the team, I am working on developing language models that assess the quality of conversations between patients and mental health professionals at a call center.

The models are designed to generate both a quality score and a corresponding explanation for the score, helping to provide actionable feedback that can improve patient care. We use transcripts from real call center interactions to train and evaluate the models.


Drug Resistance Prediction

The goal of this project is to develop predictive models that determine whether tuberculosis (TB) patients will convert from positive to negative status after being placed on an initial drug regimen. The models are trained on demographic and clinical data collected at baseline and are used to predict treatment effectiveness at month 2 and month 5.

In addition to using tabular data alone, we are also developing a multimodal approach that combines clinical data and chest X-ray images to improve prediction accuracy.



Previous Education and Professional Experiences

Education

1. Master of Science in Electrical and Computer Engineering-Advanced, Carnegie Mellon University Africa, August 2022 - May 2024

2. Bachelor of Science in Electrical Engineering, Makerere University, August 2017 - May 2022

Experience

1. Research Engineer, Marconi Research and Innovation Lab, May 2024 - Present

2. Graduate Teaching Assistant, Language Technologies Institute, Carnegie Mellon University, August 2023 - May 2024

3. Powertrain Systems Engineer, Kiira Motors Corporation, July 2022 - January 2024

4. Research Assistant, Kiira Motors Corporation, July 2020 - June 2022