Master of Science in Data Science and Analytics

Permanent URI for this collection: https://hdl.handle.net/20.500.11951/1204



Predicting client retention in an urban HIV clinic – a machine learning approach
(Uganda Christian University, 2025-05) Jonathan Melvin Ikapule
Retention in HIV care is critical to viral suppression, improved health outcomes, and reduced transmission; however, retention rates remain suboptimal in urban Uganda, with some studies reporting rates below 60%. This study aimed to identify retention predictors and develop a machine learning model to predict retention among people living with HIV (PLHIV) using routinely collected patient-level data. A retrospective cohort study was conducted using data from electronic medical records (EMR) from three urban HIV clinics in Kampala (January 2021 - December 2023). Clients who died or were transferred out were excluded, yielding 22,213 clients. Data included demographic, clinical, and visit-related variables, as well as engineered features like duration on antiretroviral therapy, distance to clinic, and viral suppression history. Retention was defined as attending a scheduled appointment within 90 days. Six classification algorithms were trained and evaluated using a 70:30 split and SMOTE (a technique to balance data). Accuracy, precision, recall, and F1 score assessed model performance. XGBoost outperformed other models, achieving an accuracy of 88% and an F1 score of 0.85. Key predictors, identified using SHAP values for feature importance, included duration on ART, weight, age, baseline CD4, distance to the clinic, and ART adherence. These findings demonstrate the feasibility of using EMR data and machine learning to support data-driven decision-making in HIV programs. Machine learning models integrated into EMR systems can enable real-time identification of clients at risk of disengaging from care, guiding targeted interventions. This study highlights the potential of data science to improve HIV service delivery, although further validation in diverse contexts is needed. Keywords: Antiretroviral Therapy, Classification, EMR, Retention, SHAP, SMOTE, Supervised Learning, XGBoost, Urban Clinic, Uganda.
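The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line between a real minority sample and one of its nearest minority neighbours. The feature names, toy values, and the `smote` helper are assumptions for illustration and are not taken from the thesis; in practice oversampling is usually applied to the training portion of a split only.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic

# Hypothetical toy features: (years on ART, distance to clinic in km)
minority = [(0.5, 12.0), (1.0, 15.0), (0.8, 9.0), (1.5, 14.0)]
new_points = smote(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two real points, it always stays inside the range already spanned by the minority class.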
A Machine Learning approach for identifying at risk pupils and recommending support strategies: a case study of primary schools in Mukono District, Uganda
(Uganda Christian University, 2026-05-28) Charles Jovans Galiwango
Academic vulnerability and pupil dropout remain persistent challenges in Ugandan primary education, despite high enrollment rates. Current school support systems are often reactive, intervening only after academic failure has occurred. This study developed a predictive early warning system to proactively identify pupils at risk of academic failure in Mukono District, Uganda. A mixed-methods approach was used, analysing structured records of pupils from Primary 4–6 and conducting interviews with teachers and administrators. The study first identified key behavioural and socioeconomic predictors of academic risk through statistical analysis. Four machine learning models were then evaluated and compared to determine the most effective approach for predicting vulnerability. The analysis revealed that behavioural indicators, specifically disciplinary issues, incomplete homework, and poor attendance, were the strongest predictors of academic risk. Among the models tested, Logistic Regression proved most suitable, achieving a recall of 0.833 and ROC-AUC of 0.941 on unseen test data, while providing interpretable predictions crucial for educational settings. Based on these findings, a three-tiered intervention framework was developed, classifying pupils by risk level and linking specific risk factors to tailored support strategies. The study concludes that a simple, interpretable predictive model using routinely collected school data can effectively identify vulnerable pupils early. The proposed framework offers Ugandan primary schools a practical, proactive tool for targeted intervention, shifting support from crisis management to prevention. This research contributes a feasible, evidence-based approach to enhancing educational equity and retention in resource-constrained settings.
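The two metrics reported above, recall and ROC-AUC, can be computed from first principles as in the sketch below. The labels, risk scores, and the 0.5 decision threshold are made-up values for illustration, not the study's data.

```python
def recall(y_true, y_pred):
    """Share of truly at-risk pupils (label 1) that the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative
    (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = at risk) and predicted risk scores
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.3, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(recall(y_true, y_pred))   # 0.75
print(roc_auc(y_true, scores))  # 0.9375
```

Recall depends on the chosen threshold, whereas ROC-AUC summarises ranking quality across all thresholds, which is why the two are commonly reported together.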
Predictive maintenance of centrifugal water pumps using machine learning: a case study of National Water and Sewerage Corporation
(Uganda Christian University, 2026-05-28) Quinton Ssebaggala
This thesis explores effective predictive maintenance strategies for centrifugal water pumps, focusing on Uganda’s National Water and Sewerage Corporation (NWSC) and similar large-scale water providers. It aims to improve water supply reliability for over 21 million people and to reduce the 185,000 annual customer complaints caused by 70% pump failures and 8-12 hours of operational downtime. However, despite advances in machine learning, tailored predictive maintenance approaches for water pumps in Uganda are understudied. Thus, this study developed a machine learning-based predictive maintenance model for centrifugal pumps using real-time operational data from NWSC; 13 pumps from 3 pump stations (Gunhill, Katosi and uyenga) were analyzed. The study presents a comprehensive machine learning-based predictive maintenance framework for estimating pump failure. The process integrates data preprocessing, extraction of statistical time-domain condition indicators, and evaluation of five machine learning algorithms (XGBoost, LightGBM, CatBoost, Random Forest, and a Voting Ensemble), applied to shift maintenance from a monthly health check to real-time monitoring and to provide deeper insights into pump availability and health for future years. The primary objective was to accurately classify the pumps’ operational status into five distinct states: CHANGE, CRITICAL, OFF, OPERATIONAL, and WARNING. The results demonstrate that the Extreme Gradient Boosting (XGBoost) model achieved superior predictive performance, yielding an accuracy of 74% in detecting pump failure before further damage occurred. Thus, leveraging machine learning for predictive maintenance enabled NWSC to detect anomalies in the centrifugal pumps, such as inconsistencies in flow rates, pressure fluctuations, and vibration abnormalities, which helped reduce maintenance costs (10-40%), equipment failure (70-75%), and downtime (35-45%), and increased production capacity (25%), thus improving the well-being of the people of Uganda and promoting SDG 6 (Clean Water and Sanitation). Keywords: Predictive Maintenance, Machine Learning, Centrifugal Pumps, National Water and Sewerage Corporation, Arduino, Classification Models
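The "statistical time-domain condition indicators" mentioned in the abstract are typically simple summary statistics computed over a window of sensor readings. The sketch below shows a few common ones; which indicators the thesis actually used is not stated, so the selection, the `condition_indicators` helper, and the toy signal are all assumptions.

```python
import math

def condition_indicators(signal):
    """Common time-domain features for one window of sensor readings."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    variance = sum((x - mean) ** 2 for x in signal) / n
    kurtosis = sum((x - mean) ** 4 for x in signal) / (n * variance ** 2)
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,  # spikiness relative to overall energy
        "kurtosis": kurtosis,        # heavy tails often indicate impacts
    }

# Hypothetical vibration window: an idealised square wave of amplitude 1
window = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
features = condition_indicators(window)
print(features)
```

Feature vectors of this kind, one per window, are what a classifier would then map to operational states.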
Predicting final CGPA using pre-admission data: proactive insights for academic excellence at Uganda Christian University
(Uganda Christian University, 2026-06-29) Simon Fred Lubambo
Higher education institutions increasingly rely on data-driven approaches to improve student support, academic planning, and decision-making. However, the adoption of predictive analytics in Sub-Saharan African universities remains limited, despite the availability of admission records that could inform early academic guidance. This study developed and evaluated a machine learning model for predicting students’ final Cumulative Grade Point Average (CGPA) using pre-admission academic and demographic data from Uganda Christian University. The study employed a quantitative research design using historical student records extracted from the university’s Management Information System. The dataset included O-Level and A-Level academic performance indicators, demographic attributes, and programme-related variables. Several machine learning models were trained and evaluated, with the Random Forest Regressor selected as the best-performing model after hyperparameter optimisation. Model performance was assessed using Mean Absolute Error, Root Mean Squared Error, and the coefficient of determination. To support transparency and responsible use, SHAP-based interpretability, sensitivity analysis, and subgroup fairness evaluation were incorporated. The findings showed that final CGPA can be predicted from admission-time data with moderate but useful accuracy. Prior academic performance, particularly average O-Level grade, weighted A-Level performance, and UCE credits, emerged as the strongest predictors of final CGPA. Fairness analysis across gender, campus, and academic level indicated generally consistent model performance, although continued monitoring is necessary for underrepresented groups. The study further demonstrated how predicted CGPA bands and programme-fit simulations can support proactive academic advising, early identification of students requiring support, and evidence-informed programme guidance. 
The study concludes that interpretable and fair machine learning models can provide practical value in Ugandan higher education when used as decision-support tools rather than deterministic placement mechanisms. By using pre-admission data already available within institutional systems, universities can strengthen academic advising, improve student support, and promote more evidence-based planning.
Data-driven precision public health: leveraging machine learning to track and reduce zero-dose and partially vaccinated children in Nakifuma, Uganda
(Uganda Christian University, 2026) Kenneth Michael Ogwok
Despite global progress, 14.3 million infants remain zero-dose (ZD) and 5.6 million are partially vaccinated (PV) worldwide (World Health Organization, 2024). In Uganda, where full immunization coverage stands at only 54% (Uganda Bureau of Statistics, 2022), precision public health approaches are urgently needed. This study applies data science to develop a community-level risk profiling framework in a resource-limited Ugandan setting. This study aimed to: (1) identify socio-demographic, health system, and behavioral factors distinguishing ZD, PV, and fully immunized (FI) children; (2) develop and validate machine learning (ML) models predicting vaccination status; and (3) propose data-driven interventions to increase FI coverage. A mixed-methods, cross-sectional study sampled 115 children under five and their caregivers in Nakifuma Sub-county. For objective one, 35 variables were analyzed using chi-square and Mann-Whitney U tests to identify significant predictors. For objective two, four supervised ML algorithms were trained on a stratified 70:30 split and evaluated using precision, recall, F1-score, and AUC. For objective three, validated model-derived risk scores informed targeted, parish-level interventions. The presence of ZD children (10.4%) was associated with negative attitudes of health workers (p=0.013), waiting time >60 minutes (p=0.021), importance of vaccines (p=0.018), and non-parent caregivers (p=0.026). The presence of PV children (40.9%) was associated with increasing child age (p<0.001) and vaccine stock-out (p=0.031), while FI children (48.7%) possessed vaccination cards (p=0.005). The best-performing algorithm was Random Forest, with an F1-score of 0.97 for ZD, 0.74 for PV, and 0.94 for FI.
The clustering of ZD/PV children beyond 2 km from health facilities was used to design a three-tier intervention matrix covering health worker sensitization, supply chain interventions, and SMS reminders. ML models were effective in triaging zero-dose, partially vaccinated, and fully immunized children. The precision public health strategy has immense scope for achieving 90% full immunization by 2030 in Uganda.
Detection of Banana Fusarium Wilt & Black Sigatoka: A Deep Learning Approach for Smallholder Farms in Central Uganda
(Uganda Christian University, 2025-10-20) Peter Mulindwa
Bananas are a vital food and income source in Central Uganda, yet their cultivation is severely threatened by destructive diseases such as Fusarium Wilt and Black Sigatoka. Smallholder farmers, who form the backbone of Uganda’s agricultural sector, rely heavily on manual disease identification methods, which are time-consuming, error-prone, and largely ineffective for early intervention. This thesis proposes a hybrid deep learning approach that integrates Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), and Gray Level Co-occurrence Matrix (GLCM) texture features to provide accurate, efficient, and scalable detection of banana leaf diseases using image-based classification. A dataset of over 17,000 annotated banana leaf images was sourced from the Lacuna Banana project. Rigorous preprocessing, including resizing, normalisation, and augmentation, was applied to enhance model robustness. Texture features extracted through GLCM were combined with spatial features learned by CNN and ViT models to improve classification sensitivity. Several models were developed and evaluated, including a custom CNN, InceptionV3 with transfer learning, and a ViT-based architecture. Evaluation metrics such as accuracy, precision, recall, and F1-score were used to assess model performance. The Vision Transformer outperformed other individual models with 99% classification accuracy, while the proposed hybrid model achieved a balanced accuracy of 98%, with substantial precision and recall across all disease categories. The integration of GLCM features significantly improved the detection of texture-specific diseases like Black Sigatoka. This research contributes a robust, interpretable, and field-deployable AI-based diagnostic tool that aligns with Uganda’s national goals for data-driven agricultural development and with the Sustainable Development Goals related to food security and sustainable farming.
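To make the GLCM texture features concrete, the sketch below builds a normalised gray level co-occurrence matrix for a tiny image and derives one texture statistic (contrast) from it. The 4x4 image, the four gray levels, and the horizontal one-pixel offset are assumptions for illustration; the thesis does not state its GLCM settings here.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix: entry [i][j] is the share of pixel
    pairs where a pixel of level i has a neighbour of level j at (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Weights each co-occurrence by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

# Hypothetical 4x4 image quantised to 4 gray levels
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
print(contrast(p))  # 7/12, about 0.583
```

Statistics such as contrast, homogeneity, and energy computed this way form a short numeric vector that can be concatenated with features learned by a CNN or ViT.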
Predicting Postpartum Hemorrhage in Pregnant Mothers in Low-income Settings. A Machine Learning Approach
(Uganda Christian University, 2025-10-14) Tom Eganyu
Postpartum hemorrhage (PPH) remains a significant contributor to maternal mortality globally, particularly in low-income countries where awareness and research on its severity, risk factors, and predictive modeling are limited. This project analyzed 2,094 deliveries to identify PPH risk factors and developed a machine learning model for prediction. The Extreme Gradient Boosting model demonstrated superior performance with an AUC of 97.0%, accuracy of 96.0%, precision of 96.0%, and recall of 97.0%. Key risk factors associated with increased PPH were number of ANC visits, weight of baby (g), duration of labour, cervical tear, episiotomy, and perineal tears, each with a reported p-value of 0.00. The study successfully established a validated machine learning model capable of predicting mothers at risk of PPH.
Improving Employee Retention by Predicting Employee Attrition using Machine Learning Techniques: Case Study: Centenary Bank Ltd Uganda
(Uganda Christian University, 2025-10-13) Andrew Ronnie Engirot
Employee retention is a critical factor in the success and sustainability of organizations, ensuring that valuable human capital remains engaged, satisfied, and motivated over the long term. High turnover rates can significantly disrupt productivity, damage organizational culture, and inflate operational costs, underscoring the importance of retaining top talent. Continuity in operations is maintained when employees feel valued and supported, fostering a positive work environment where collaboration and high performance are encouraged. In contrast, frequent turnover can lead to instability and decreased morale, ultimately hindering productivity and organizational cohesion. Retaining top talent not only maintains operational continuity but also provides a competitive edge in the marketplace. Organizations with strong employee retention rates attract prospective hires more effectively and are better positioned to develop deep expertise within their workforce, contributing to long-term success. In today’s digital age, where social media and online reviews can quickly shape a company’s reputation, prioritizing employee satisfaction and well-being enhances brand image and appeals to both job seekers and consumers. From a financial perspective, employee retention contributes to significant cost savings. The expenses associated with recruiting, hiring, and training new employees are substantial. By retaining existing employees, organizations can allocate resources more efficiently, as long-term employees are typically more productive and require less supervision. This not only reduces direct costs but also improves overall organizational efficiency and effectiveness. HR analytics has emerged as a powerful tool in predicting and enhancing employee retention. By adopting a data-driven approach, HR analytics involves the collection, analysis, and interpretation of data related to employee behaviors and performance to inform strategic decisions. 
This approach combines HR-specific data, such as employee demographics, performance metrics, and engagement surveys, with financial and operational data to generate comprehensive insights into workforce trends. One of the key roles of HR analytics in employee retention is identifying predictors or drivers of turnover. By analyzing historical turnover data alongside various HR metrics, organizations can detect patterns and trends indicating employees are at risk of leaving. Predictive models employing algorithms and machine learning techniques analyze large datasets to forecast potential attrition, enabling proactive measures to address these risks. Additionally, HR analytics facilitates sentiment analysis and engagement surveys to assess employee satisfaction and pinpoint areas requiring improvement. Techniques such as natural language processing (NLP) and text analytics allow for the examination of unstructured data from employee feedback, performance reviews, and social media, providing deep insights into employee sentiment and morale. Beyond prediction, HR analytics informs the development of targeted retention strategies. By understanding the underlying factors contributing to attrition, organizations can implement personalized development opportunities, improve communication between managers and employees, and adjust compensation and benefits packages to better align with employee expectations. These tailored interventions aim to enhance engagement, satisfaction, and ultimately, retention. The objectives of this project are to enhance employee retention by leveraging machine learning techniques to predict and mitigate employee attrition. Through the analysis of historical employee data and the application of predictive modeling, the project seeks to identify key factors contributing to turnover and develop actionable insights for proactive retention strategies. 
The project aims to build predictive models that accurately anticipate staff attrition by examining historical data on demographics, job categories, performance measures, and other relevant variables. Furthermore, the study intends to identify critical organizational factors that predict employee attrition. By analyzing the output of predictive models, the research aims to pinpoint specific risk factors associated with higher turnover rates, such as job dissatisfaction, inadequate compensation, or lack of career advancement opportunities. Based on these insights, the project will formulate targeted intervention strategies to address identified risk factors and reduce employee churn. Recommendations will focus on enhancing employee retention, engagement, and satisfaction. The effectiveness of these interventions will be evaluated by monitoring key performance indicators such as employee satisfaction ratings, attrition rates, and retention metrics. The goal is to assess the impact of implemented strategies on workforce stability and organizational performance over time. Additionally, the project aims to establish a framework for the ongoing evaluation and refinement of retention strategies and predictive models. Through continuous data analysis and feedback mechanisms, the project seeks to iteratively enhance the effectiveness of retention initiatives and improve the accuracy of predictive models, ensuring that retention efforts evolve to meet the organization’s changing needs.
A Text-based Poultry Health System: An Interactive Disease Detection and Prescription Recommendations
(Uganda Christian University, 2025-09-23) Ritah Nakimuli
Poultry farming is vital to Uganda’s economy, providing income for many rural households. However, broiler chicken farmers struggle with early disease detection and management, leading to significant flock losses and financial hardship. Although advanced diagnostic tools exist, they are often too expensive and complicated for small-scale farmers in rural areas to access. This research presents a multilingual, symptom-based poultry disease prediction system, a lightweight, mobile-friendly machine learning solution that addresses the limitations of existing diagnostic tools. By allowing farmers to input observable symptoms like bird behavior, droppings, and flock age through a simple text-based interface, it eliminates the need for costly equipment, lab tests, or other traditional methods. Several machine learning algorithms were tested to identify the best method for disease prediction, including SVM, Random Forest, XGBoost, and KNN. KNN and SVM performed best, each achieving 96% accuracy and 97% precision, with Random Forest close behind. XGBoost performed poorly, with only 11% accuracy. Although SVM matched KNN in accuracy, it struggled with real-world probability calibration. KNN, on the other hand, provided reliable and interpretable confidence scores, making it the preferred choice for deployment. The final application is deployed using the Streamlit framework, enabling seamless access across desktop and mobile browsers. It provides real-time disease predictions, along with tailored prescriptions and prevention strategies. Additional features include a QR code for easy sharing, which enhances both the user experience and accessibility. This project bridges the gap between advanced AI and the practical realities of low-resource agricultural settings.
Data-driven Analysis and Prediction of Human Rights Violations Against Human Rights Defenders: A Case Study of Eastern Africa
(Uganda Christian University, 2025-09-29) Esther Asiimire Bagombeka
Despite the growing availability of big data and machine learning, human rights monitoring in Eastern Africa remains largely dependent on retrospective reports, eyewitness testimonies, and qualitative assessments, which lack the ability to anticipate future violations. The absence of real-time data processing and predictive analytics limits the ability of policymakers and advocacy groups to implement proactive intervention strategies. As a result, human rights organizations often respond reactively, only after violations occur, rather than deploying preemptive measures to protect human rights defenders (HRDs). In this research, a quantitative research design was adopted, utilising a cross-sectional approach to analyse patterns in human rights violations. Data was collected from recognized human rights organisations, human rights databases, and global news agencies. The research employed descriptive analytics to identify trends, K-Means clustering to categorize high-risk regions, and predictive modeling to forecast future violations. Seasonal Autoregressive Integrated Moving Average (SARIMA) was used to model long-term seasonal trends, while Recurrent Neural Networks (RNN) captured short-term fluctuations and nonlinear patterns in the data. The Predictive Human Rights Violations Model (PHRVM) emerged as the most effective, balancing structural seasonality and real-time variations, resulting in higher accuracy and improved forecasting reliability compared to individual models. The findings revealed that human rights violations followed distinct temporal and geographic trends, peaking around election periods, protest seasons, and government crackdowns. While the PHRVM outperformed other forecasting methods during training (MAE: 0.081, RMSE: 0.087), testing revealed an increase in prediction error, with MAE rising to 0.684 and RMSE to 1.109.
A paired t-test confirmed that the model significantly outperformed a naïve baseline forecast (p < 0.05), validating its predictive capability. This research concluded that human rights violations follow recognizable patterns, making it possible to anticipate high-risk periods and optimize protection efforts for HRDs. This helps policymakers and advocacy groups to anticipate risks and implement preventive measures before violations escalate. The PHRVM’s success shows the potential of AI-driven forecasting in social science research, offering a more systematic approach to tracking civic space restrictions. However, for predictive models to be more effective in real-world applications, further refinement is needed, including the integration of real-time data sources such as social media monitoring, remote sensing technologies, and expanded human rights reporting networks. Strengthening these capabilities will enhance model accuracy, responsiveness, and impact, ensuring that human rights organisations can move from reactive responses to preventative protection strategies.
A Machine Learning Approach for Accurate Valuation of Imports in Uganda
(Uganda Christian University, 2025-09-30) Paul Sentongo
Accurate customs valuation is central to revenue mobilization, trade compliance, and economic stability in Uganda, where import duties contribute nearly one-third of domestic tax revenue. Yet persistent inefficiencies in conventional valuation methods, such as reliance on importer-declared invoice values, outdated price databases, and manual adjudication, have resulted in systemic undervaluation, misinvoicing, and annual revenue losses exceeding USD 200 million. This thesis investigates the potential of machine learning (ML) to transform customs valuation by developing and deploying predictive models trained on more than 70,000 import declaration records from Uganda Revenue Authority’s ASYCUDA system (2020–2024). Three supervised ML algorithms, namely Random Forest, Extreme Gradient Boosting (XGBoost), and Artificial Neural Networks (ANN), were implemented following a rigorous pipeline that included exploratory data analysis, feature engineering, and model optimization. All models demonstrated strong predictive performance (R² > 0.93), with Random Forest achieving near-perfect accuracy (R² = 0.997, MAE = UGX 560.35, RMSE = UGX 1,868.23). Compared to Uganda’s current average-based approach (MAE = UGX 124,797.76), this represents a 99.55% reduction in error, underscoring the transformative capacity of ML for valuation precision. Beyond model benchmarking, the study contributes technically by operationalizing the Random Forest model into a Streamlit-based prototype web application, offering real-time decision support for customs officers. Empirically, it provides the first quantified evidence of ML’s potential to address valuation fraud and inefficiencies in Uganda. Practically, it establishes a replicable framework for low-resource settings, integrating ML with existing trade platforms such as ASYCUDA.
The findings have significant policy implications: adopting ML-driven valuation can curtail revenue leakages, enhance compliance with WTO Customs Valuation Agreements, and support Uganda’s Vision 2040 and National Development Plan III goals for domestic revenue mobilization. Limitations such as reliance on secondary data, exclusion of informal trade, and simulation-based deployment highlight opportunities for future research. These include incorporating regional datasets, exploring explainable AI techniques (e.g., SHAP, LIME) to improve transparency, and piloting ML integration within operational customs systems. This thesis thus advances the discourse on AI in public sector modernization, demonstrating that machine learning is not merely a technical innovation but a strategic enabler for fiscal sustainability, trade integrity, and digital transformation in Uganda’s customs administration.
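The 99.55% error reduction quoted in the abstract follows directly from the two reported MAE values, as this short check shows. Only the two figures from the abstract are used; the variable names are illustrative.

```python
baseline_mae = 124_797.76  # UGX, current average-based approach (from the abstract)
model_mae = 560.35         # UGX, Random Forest model (from the abstract)

# Relative reduction in mean absolute error, as a percentage
reduction_pct = (baseline_mae - model_mae) / baseline_mae * 100
print(f"{reduction_pct:.2f}%")  # 99.55%
```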
A Data-Driven NLP Skills Gap Analysis of Uganda’s TVET Curriculum and its Effects on Graduate Employability
(Uganda Christian University, 2025-09-23) Patrick Atuhe
This thesis evaluates the outcomes of the revisions to Uganda’s Technical and Vocational Education and Training (TVET) curriculum, focusing on graduate employability. The study applies data science methodologies, particularly Natural Language Processing (NLP), to assess how well the current curriculum aligns with industry needs. Data were collected from 350 TVET graduates, from 50 employers who provided feedback on over 1,250 graduates, and from 30 stakeholders who analyzed the curriculum. An NLP-based recommendation system was developed using TF-IDF and cosine similarity to quantify alignment between skills taught and those required in the workforce. Findings reveal significant gaps in digital skills, technical preparedness, and alignment with evolving industry expectations. Employers reported a 68% deficiency in digital competencies, with a mean curriculum-employer similarity score of 0.42. The NLP system achieved an F1-score of 0.87, outperforming manual reviews in skill-gap identification. The study provides actionable recommendations for curriculum reform, including the integration of digital tools, periodic review mechanisms, and the use of real-time feedback loops from the industry. These insights contribute to national development goals such as Uganda Vision 2040 by enhancing TVET effectiveness and workforce readiness.
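A curriculum-employer similarity score of the kind reported above can be obtained by representing each text as a TF-IDF vector and taking the cosine of the angle between the vectors. The sketch below does this in plain Python with a smoothed IDF; the example texts, the whitespace tokeniser, and the helper names are assumptions, and the thesis may have used a library implementation with different settings.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n = len(tokenised)
    df = Counter(term for tokens in tokenised for term in set(tokens))
    # Smoothed IDF so terms present in every document keep a non-zero weight
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append({t: (freq / len(tokens)) * idf[t] for t, freq in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical skill descriptions
curriculum = "welding electrical installation technical drawing workshop safety"
job_advert = "electrical installation solar systems spreadsheet skills workshop safety"
unrelated = "poultry feed formulation"

vec_curr, vec_job, vec_other = tfidf_vectors([curriculum, job_advert, unrelated])
print(round(cosine(vec_curr, vec_job), 2))
print(cosine(vec_curr, vec_other))
```

A score near 1 means the two texts emphasise the same terms, while a score of 0 means they share none, so a mean of 0.42 indicates only partial overlap.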
Tackling Dropout Rates of Students in Uganda: An Exploration of Machine Learning and Data-driven Approaches
(Uganda Christian University, 2025-10-02) Diana Kirabo Nakimbugwe
Uganda’s education sector faced notable challenges, including high dropout rates and poor student outcomes. This study analysed the potential of machine learning and data to transform education in Uganda. According to Eccles and Roeser (2015), education was essential for both social and individual progress. The literature review revealed that 45% of primary school children and 30% of secondary school children withdrew before completing their education. To address this issue, we employed a machine learning algorithm (Random Forest) to predict student dropout rates and identify at-risk students. Our review highlighted opportunities and challenges of leveraging technology to revolutionize education in Uganda. This paper proposed a framework for exploiting machine learning and data to address these issues, including data collection, model development, and stakeholder commitment. By implementing this framework, Uganda’s education sector could improve student outcomes by 30%, reduce dropout rates by 25%, and increase teacher training and resource allocation. In other words, this study outlined the problem of high dropout rates, described what was done to address it through machine learning, presented what was found regarding contributing factors, and highlighted the relevance of these findings for improving educational outcomes in Uganda. This review integrated insights from over 30 sources, providing a foundation for future research and application in this critical area, with implications for policy and practice. Primary Keywords: Machine learning, Data analytics, Educational outcomes, Dropout rates, Student performance.
Secondary Keywords: Predictive modeling, Random Forest algorithm, Data-driven decision-making, Educational data mining, Learning analytics, Ugandan education, East Africa, Educational technology. Descriptors (MeSH Terms): Education, Machine learning, Data analytics, Student dropouts, Academic achievement, Educational measurement, Educational technology, and Africa.
Forecasting Emerging Skill Demands with Machine Learning to Inform Curriculum Development in Uganda’s Higher Education
(Uganda Christian University, 2025-09-24) Denis Wanyama
Rapid technological advancement and evolving industry demands have widened the skill gap in Uganda’s labor market. Higher education institutions often struggle to keep pace with these changes, leading to mismatches between graduate competencies and employer expectations. This study uses machine learning techniques to forecast emerging skill demands and inform the development of data-driven curricula in Ugandan universities. Drawing on more than one million job postings from 2021 to 2023, the research applies natural language processing (NLP), time series forecasting (ARIMA and Holt-Winters), and clustering algorithms to analyze labor market trends. Exploratory Data Analysis (EDA) revealed high-demand skills, while Holt-Winters outperformed ARIMA (MAE: 9.05 vs. 23.87), capturing the seasonal nature of skill fluctuations. Key findings indicate a growing demand for roles such as interaction designers, network administrators, user experience professionals, and social media managers. In-demand technical skills include Python, Google Analytics, CSS, Tableau, AWS, and Sketch. The increasing emphasis on digital literacy and soft skills underscores the need for more flexible and adaptive curricula. This study offers actionable recommendations for curriculum reform, including integrating technical skills, developing continuous learning pathways, and enhancing academic-industry collaboration. By applying machine learning to labor market analysis, the research equips universities, policymakers, and stakeholders with the information needed to align higher education with the demands of Uganda’s evolving digital economy. Keywords: Machine Learning, Labor Market Trends, Skill Forecasting, Higher Education Curriculum, Uganda, ARIMA, Holt-Winters, Data Science, skill mismatch.
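The kind of MAE comparison reported above can be sketched with a minimal additive Holt-Winters forecaster in plain Python. The perfectly seasonal toy series, the smoothing parameters, the simple initialisation, and the naive last-value baseline (used here in place of ARIMA) are all assumptions for illustration, not the study's data or configuration.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing with a simple initialisation."""
    level = sum(series[:season_len]) / season_len
    trend = (sum(series[season_len:2 * season_len]) - sum(series[:season_len])) / season_len ** 2
    seasonal = [series[i] - level for i in range(season_len)]
    for t, value in enumerate(series):
        s = seasonal[t % season_len]
        last_level = level
        level = alpha * (value - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (value - level) + (1 - gamma) * s
    n = len(series)
    return [level + (h + 1) * trend + seasonal[(n + h) % season_len] for h in range(horizon)]

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical quarterly counts of job postings mentioning one skill
history = [10, 20, 30, 40] * 3
actual_next = [10, 20, 30, 40]

hw_forecast = holt_winters_additive(history, season_len=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
naive_forecast = [history[-1]] * 4  # repeat the last observed value

print(mae(actual_next, hw_forecast), mae(actual_next, naive_forecast))
```

On a series with a repeating seasonal pattern, a method that models seasonality explicitly recovers the pattern, while a non-seasonal baseline cannot, which is the behaviour the abstract attributes to Holt-Winters.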
Contextualising AI Ethics in Uganda’s Microcredit With Adaptive Sensitive Re-weighting
(Uganda Christian University, 2025-08-12) Emmanuel Isabirye
This research tackles the pressing ethical concerns of using Artificial Intelligence (AI) in Uganda’s microcredit sector by developing an Adaptive Sensitive Reweighting (ASR) model to mitigate algorithmic bias and promote equitable access to credit. Traditional credit scoring models, and AI algorithms trained on Western-biased data, discriminate against marginalized groups because they are based on formal financial records, reinforcing structural disadvantages. Through iterative engagement with Ugandan policymakers, lenders, borrowers, and AI experts, we identify the most significant ethical concerns and specify context-specific fairness metrics. The ASR approach adaptively adjusts weights for sensitive features like collateral values and transaction history during model training to enhance fairness. Experimental outcomes on a typical credit scoring dataset demonstrate ASR’s success: the inclusion rate of disadvantaged borrowers is enhanced by 15% with predictive accuracy maintained, and significant improvements on key fairness metrics. The research provides actionable policy recommendations on implementing ASR-based AI systems in Uganda’s microfinance sector to drive financial inclusion and sustainable development. This study contributes to emerging Majority World scholarship on AI ethics by demonstrating the necessity of situating ethical frameworks and valuing stakeholder perspectives to develop equitable, inclusive AI systems. Our findings offer valuable insights for policymakers, microfinance institutions, and AI practitioners who aim to implement responsible AI in Uganda’s microcredit sector.
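The abstract does not give the details of the ASR model, so the sketch below shows a different, well-known technique from the same family: classic reweighing as a preprocessing step, where each training example receives the weight P(group) x P(label) / P(group, label). The group names, labels, and the `reweighing_weights` helper are hypothetical and are not the author's method.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Weight each example so that group and label become statistically
    independent in the weighted training set."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

def weighted_approval_rate(group, groups, labels, weights):
    """Weighted share of approved loans (label 1) within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    approved = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    return approved / total

# Hypothetical data: group A is approved far more often than group B
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]  # 1 = loan approved

weights = reweighing_weights(groups, labels)
print(weighted_approval_rate("A", groups, labels, weights))  # 0.5
print(weighted_approval_rate("B", groups, labels, weights))  # 0.5
```

The weights can be passed to any learner that accepts per-sample weights; an adaptive scheme would presumably update them during training rather than fixing them once, but that part is not reproduced here.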
