Master of Science in Data Science and Analytics

Permanent URI for this collection: https://hdl.handle.net/20.500.11951/1204


Recent Submissions

  • Item
    Predicting postpartum hemorrhage in pregnant mothers in low-income settings. A machine learning approach.
    (Uganda Christian University, 2025-10-14) Tom Eganyu
    Postpartum hemorrhage (PPH) remains a significant contributor to maternal mortality globally, particularly in low-income countries where awareness and research on its severity, risk factors, and predictive modeling are limited. This project analyzed 2094 deliveries to identify PPH risk factors and developed a machine learning model for prediction. The Extreme Gradient Boosting model demonstrated superior performance with an AUC of 97.0%, accuracy of 96.0%, precision of 96.0%, and recall of 97.0%. Key identified risk factors associated with increased PPH include: Number of ANC Visits (P-Value: 0.00); Weight of Baby (g) (P-Value: 0.00); Duration of Labour (P-Value: 0.00); Cervical Tear (P-Value: 0.00); Episiotomy (P-Value: 0.00); and Perineal Tears (P-Value: 0.00). The study successfully established a validated machine learning model capable of predicting mothers at risk of PPH.
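    Illustrative sketch (not the study's code): the four metrics quoted in this abstract can be computed from labels, thresholded predictions, and model scores as shown below. The toy labels and scores, the 0.5 threshold, and the function names are assumptions made for the example; the AUC here uses the pairwise-ranking definition (the share of positive/negative pairs in which the positive case scores higher).

```python
def confusion_counts(labels, predictions):
    """Count TP, FP, TN, FN for binary labels (1 = positive class)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(labels, predictions):
    """Return accuracy, precision, and recall as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(labels, predictions)
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not drawn from the study.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.4, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(labels, predictions)
auc = roc_auc(labels, scores)
print(metrics, round(auc, 3))
```

    Accuracy, precision, and recall depend on the chosen threshold, while AUC depends only on how the scores rank the cases.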
  • Item
    Improving Employee Retention by Predicting Employee Attrition using Machine Learning Techniques: Case Study: Centenary Bank Ltd Uganda
    (Uganda Christian University, 2025-10-13) Engirot Andrew Ronnie
    Employee retention is a critical factor in the success and sustainability of organizations, ensuring that valuable human capital remains engaged, satisfied, and motivated over the long term. High turnover rates can significantly disrupt productivity, damage organizational culture, and inflate operational costs, underscoring the importance of retaining top talent. Continuity in operations is maintained when employees feel valued and supported, fostering a positive work environment where collaboration and high performance are encouraged. In contrast, frequent turnover can lead to instability and decreased morale, ultimately hindering productivity and organizational cohesion. Retaining top talent not only maintains operational continuity but also provides a competitive edge in the marketplace. Organizations with strong employee retention rates attract prospective hires more effectively and are better positioned to develop deep expertise within their workforce, contributing to long-term success. In today’s digital age, where social media and online reviews can quickly shape a company’s reputation, prioritizing employee satisfaction and well-being enhances brand image and appeals to both job seekers and consumers. From a financial perspective, employee retention contributes to significant cost savings. The expenses associated with recruiting, hiring, and training new employees are substantial. By retaining existing employees, organizations can allocate resources more efficiently, as long-term employees are typically more productive and require less supervision. This not only reduces direct costs but also improves overall organizational efficiency and effectiveness. HR analytics has emerged as a powerful tool in predicting and enhancing employee retention. By adopting a data-driven approach, HR analytics involves the collection, analysis, and interpretation of data related to employee behaviors and performance to inform strategic decisions. 
This approach combines HR-specific data, such as employee demographics, performance metrics, and engagement surveys, with financial and operational data to generate comprehensive insights into workforce trends. One of the key roles of HR analytics in employee retention is identifying predictors or drivers of turnover. By analyzing historical turnover data alongside various HR metrics, organizations can detect patterns and trends indicating employees are at risk of leaving. Predictive models employing algorithms and machine learning techniques analyze large datasets to forecast potential attrition, enabling proactive measures to address these risks. Additionally, HR analytics facilitates sentiment analysis and engagement surveys to assess employee satisfaction and pinpoint areas requiring improvement. Techniques such as natural language processing (NLP) and text analytics allow for the examination of unstructured data from employee feedback, performance reviews, and social media, providing deep insights into employee sentiment and morale. Beyond prediction, HR analytics informs the development of targeted retention strategies. By understanding the underlying factors contributing to attrition, organizations can implement personalized development opportunities, improve communication between managers and employees, and adjust compensation and benefits packages to better align with employee expectations. These tailored interventions aim to enhance engagement, satisfaction, and ultimately, retention. The objectives of this project are to enhance employee retention by leveraging machine learning techniques to predict and mitigate employee attrition. Through the analysis of historical employee data and the application of predictive modeling, the project seeks to identify key factors contributing to turnover and develop actionable insights for proactive retention strategies. 
The project aims to build predictive models that accurately anticipate staff attrition by examining historical data on demographics, job categories, performance measures, and other relevant variables. Furthermore, the study intends to identify critical organizational factors that predict employee attrition. By analyzing the output of predictive models, the research aims to pinpoint specific risk factors associated with higher turnover rates, such as job dissatisfaction, inadequate compensation, or lack of career advancement opportunities. Based on these insights, the project will formulate targeted intervention strategies to address identified risk factors and reduce employee churn. Recommendations will focus on enhancing employee retention, engagement, and satisfaction. The effectiveness of these interventions will be evaluated by monitoring key performance indicators such as employee satisfaction ratings, attrition rates, and retention metrics. The goal is to assess the impact of implemented strategies on workforce stability and organizational performance over time. Additionally, the project aims to establish a framework for the ongoing evaluation and refinement of retention strategies and predictive models. Through continuous data analysis and feedback mechanisms, the project seeks to iteratively enhance the effectiveness of retention initiatives and improve the accuracy of predictive models, ensuring that retention efforts evolve to meet the organization’s changing needs.
  • Item
    A Text-based Poultry Health System: An Interactive Disease Detection and Prescription Recommendations
    (Uganda Christian University, 2025-09-23) Ritah Nakimuli
    Poultry farming is vital to Uganda’s economy, providing income for many rural households. However, broiler chicken farmers struggle with early disease detection and management, leading to significant flock losses and financial hardship. Although advanced diagnostic tools exist, they are often too expensive and complicated for small-scale farmers in rural areas to access. This research presents a multilingual, symptom-based poultry disease prediction system, a lightweight, mobile-friendly machine learning solution that addresses the limitations of existing diagnostic tools. By allowing farmers to input observable symptoms like bird behavior, droppings, and flock age through a simple text-based interface, it eliminates the need for costly equipment, lab tests, or other traditional methods. Several machine learning algorithms were tested to identify the best method for disease prediction, including SVM, Random Forest, XGBoost, and KNN. KNN and SVM performed best, each achieving 96% accuracy and 97% precision, with Random Forest close behind. XGBoost performed poorly, with only 11% accuracy. Although SVM matched KNN in accuracy, it struggled with real-world probability calibration. KNN, on the other hand, provided reliable and interpretable confidence scores, making it the preferred choice for deployment. The final application is deployed using the Streamlit framework, enabling seamless access across desktop and mobile browsers. It provides real-time disease predictions, along with tailored prescriptions and prevention strategies. Additional features include a QR code for easy sharing, which enhances both the user experience and accessibility. This project bridges the gap between advanced AI and the practical realities of low-resource agricultural settings.
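    Illustrative sketch (not the deployed system): one common way a k-nearest-neighbours classifier yields an interpretable confidence score is to report the share of the k nearest neighbours that voted for the winning class. The binary symptom encoding, the placeholder labels `disease_a` and `disease_b`, and the value k = 3 are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(training_data, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.

    training_data: list of (feature_vector, label) pairs.
    Returns (predicted_label, confidence), where confidence is the
    fraction of the neighbours that voted for the predicted label.
    """
    if not training_data:
        raise ValueError("training_data must not be empty")
    neighbours = sorted(
        training_data, key=lambda item: math.dist(item[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbours)


# Hypothetical encoding: [lethargy, abnormal_droppings, coughing] as 0/1 flags.
# Labels are placeholders, not real symptom-to-disease mappings.
training_data = [
    ([1, 1, 0], "disease_a"),
    ([0, 1, 0], "disease_a"),
    ([1, 0, 1], "disease_b"),
    ([0, 0, 1], "disease_b"),
]

label, confidence = knn_predict(training_data, [1, 1, 0], k=3)
print(label, round(confidence, 2))
```

    Because the confidence is a simple vote share, a farmer-facing interface can explain it directly, for example as "2 of the 3 most similar recorded cases had this disease".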
  • Item
    Data-driven Analysis and Prediction of Human Rights Violations Against Human Rights Defenders: A Case Study of Eastern Africa
    (Uganda Christian University, 2025-09-29) Bagombeka Esther Asiimire
    Despite the growing availability of big data and machine learning, human rights monitoring in the region remains largely dependent on retrospective reports, eyewitness testimonies, and qualitative assessments, which lack the ability to anticipate future violations. The absence of real-time data processing and predictive analytics limits the ability of policymakers and advocacy groups to implement proactive intervention strategies. As a result, human rights organizations often respond reactively, only after violations occur, rather than deploying preemptive measures to protect human rights defenders (HRDs). In this research, a quantitative research design was adopted, utilising a cross-sectional approach to analyse patterns in human rights violations. Data was collected from recognized human rights organisations, human rights databases, and global news agencies. The research employed descriptive analytics to identify trends, K-Means clustering to categorize high-risk regions, and predictive modeling to forecast future violations. Seasonal Autoregressive Integrated Moving Average (SARIMA) was used to model long-term seasonal trends, while Recurrent Neural Networks (RNN) captured short-term fluctuations and nonlinear patterns in the data. The Predictive Human Rights Violations Model (PHRVM) emerged as the most effective, balancing structural seasonality and real-time variations, resulting in higher accuracy and improved forecasting reliability compared to individual models. The findings revealed that human rights violations followed distinct temporal and geographic trends, peaking around election periods, protest seasons, and government crackdowns. While the PHRVM outperformed other forecasting methods during training (MAE: 0.081, RMSE: 0.087), testing revealed an increase in prediction error, with MAE rising to 0.684 and RMSE increasing to 1.109.
A paired t-test confirmed that the model significantly outperformed a naïve baseline forecast (p < 0.05), validating its predictive capability. This research concluded that human rights violations follow recognizable patterns, making it possible to anticipate high-risk periods and optimize protection efforts for HRDs. This helps policymakers and advocacy groups anticipate risks and implement preventive measures before violations escalate. The PHRVM’s success shows the potential of AI-driven forecasting in social science research, offering a more systematic approach to tracking civic space restrictions. However, for predictive models to be more effective in real-world applications, further refinement is needed, including the integration of real-time data sources such as social media monitoring, remote sensing technologies, and expanded human rights reporting networks. Strengthening these capabilities will enhance model accuracy, responsiveness, and impact, ensuring that human rights organisations can move from reactive responses to preventative protection strategies.
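    Illustrative sketch (not the study's code): the two error measures quoted in this abstract, mean absolute error (MAE) and root mean squared error (RMSE), can be computed as below. The toy series is hypothetical. The last lines only reuse the training and testing figures reported in the abstract to express the gap between them as a ratio.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical toy counts of incidents per period, not the study's data.
actual = [3, 5, 2, 7]
predicted = [2, 5, 4, 6]
toy_mae = mae(actual, predicted)
toy_rmse = rmse(actual, predicted)
print(toy_mae, round(toy_rmse, 3))

# Figures as reported in the abstract (training vs. testing).
train_mae, test_mae = 0.081, 0.684
train_rmse, test_rmse = 0.087, 1.109
mae_ratio = test_mae / train_mae
rmse_ratio = test_rmse / train_rmse
print(round(mae_ratio, 1), round(rmse_ratio, 1))
```

    A test error several times larger than the training error is usually worth checking for overfitting, which is consistent with the abstract's call for further refinement.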
  • Item
    A Machine Learning Approach for Accurate Valuation of Imports in Uganda
    (Uganda Christian University, 2025-09-30) Sentongo Paul
    Accurate customs valuation is central to revenue mobilization, trade compliance, and economic stability in Uganda, where import duties contribute nearly one-third of domestic tax revenue. Yet persistent inefficiencies in conventional valuation methods, such as reliance on importer-declared invoice values, outdated price databases, and manual adjudication, have resulted in systemic undervaluation, misinvoicing, and annual revenue losses exceeding USD 200 million. This thesis investigates the potential of machine learning (ML) to transform customs valuation by developing and deploying predictive models trained on more than 70,000 import declaration records from the Uganda Revenue Authority’s ASYCUDA system (2020–2024). Three supervised ML algorithms, namely Random Forest, Extreme Gradient Boosting (XGBoost), and Artificial Neural Networks (ANN), were implemented following a rigorous pipeline that included exploratory data analysis, feature engineering, and model optimization. All models demonstrated strong predictive performance (R² > 0.93), with Random Forest achieving near-perfect accuracy (R² = 0.997, MAE = UGX 560.35, RMSE = UGX 1,868.23). Compared to Uganda’s current average-based approach (MAE = UGX 124,797.76), this represents a 99.55% reduction in error, underscoring the transformative capacity of ML for valuation precision. Beyond model benchmarking, the study contributes technically by operationalizing the Random Forest model into a Streamlit-based prototype web application, offering real-time decision support for customs officers. Empirically, it provides the first quantified evidence of ML’s potential to address valuation fraud and inefficiencies in Uganda. Practically, it establishes a replicable framework for low-resource settings, integrating ML with existing trade platforms such as ASYCUDA.
The findings have significant policy implications: adopting ML-driven valuation can curtail revenue leakages, enhance compliance with WTO Customs Valuation Agreements, and support Uganda’s Vision 2040 and National Development Plan III goals for domestic revenue mobilization. Limitations, such as reliance on secondary data, exclusion of informal trade, and simulation-based deployment, highlight opportunities for future research. These include incorporating regional datasets, exploring explainable AI techniques (e.g., SHAP, LIME) to improve transparency, and piloting ML integration within operational customs systems. This thesis thus advances the discourse on AI in public sector modernization, demonstrating that machine learning is not merely a technical innovation but a strategic enabler for fiscal sustainability, trade integrity, and digital transformation in Uganda’s customs administration.
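    Worked check (illustrative): the 99.55% error reduction quoted in this abstract follows from the two MAE figures it reports, taking the average-based approach as the baseline. The function name is an assumption made for the example; the two input values are taken from the abstract.

```python
def relative_error_reduction(baseline_mae, model_mae):
    """Percentage by which the model's MAE is lower than the baseline's."""
    if baseline_mae <= 0:
        raise ValueError("baseline_mae must be positive")
    return (baseline_mae - model_mae) / baseline_mae * 100


baseline_mae = 124_797.76  # UGX, average-based approach (from the abstract)
random_forest_mae = 560.35  # UGX, Random Forest model (from the abstract)

reduction = relative_error_reduction(baseline_mae, random_forest_mae)
print(f"{reduction:.2f}%")
```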
  • Item
    A Data-Driven NLP Skills Gap Analysis of Uganda’s TVET Curriculum and its Effects on Graduate Employability
    (Uganda Christian University, 2025-09-23) Patrick Atuhe
    This thesis evaluates the outcomes of the revisions to Uganda’s Technical and Vocational Education and Training (TVET) curriculum, focusing on graduate employability. The study applies data science methodologies, particularly Natural Language Processing (NLP), to assess how well the current curriculum aligns with industry needs. Data was collected from 350 TVET graduates, from 50 employers who assessed over 1,250 graduates, and from 30 stakeholders who analyzed the curriculum. An NLP-based recommendation system was developed using TF-IDF and cosine similarity to quantify alignment between skills taught and those required in the workforce. Findings reveal significant gaps in digital skills, technical preparedness, and alignment with evolving industry expectations. Employers reported a 68% deficiency in digital competencies, with a mean curriculum-employer similarity score of 0.42. The NLP system achieved an F1-score of 0.87, outperforming manual reviews in skill-gap identification. The study provides actionable recommendations for curriculum reform, including the integration of digital tools, periodic review mechanisms, and the use of real-time feedback loops from the industry. These insights contribute to national development goals such as Uganda Vision 2040 by enhancing TVET effectiveness and workforce readiness.
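    Illustrative sketch (not the thesis's system): TF-IDF weighting followed by cosine similarity, the two techniques named in this abstract, can score how closely a curriculum description matches a job description. The whitespace tokeniser, the smoothed IDF variant, and the three sample texts are assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document.

    Uses term frequency normalised by document length and a smoothed
    inverse document frequency: ln((1 + N) / (1 + df)) + 1.
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical texts, not taken from the study's data.
curriculum = "welding electrical wiring carpentry"
job_posting = "electrical wiring python spreadsheets"
unrelated = "bookkeeping marketing"

vec_curriculum, vec_job, vec_unrelated = tfidf_vectors(
    [curriculum, job_posting, unrelated]
)
alignment = cosine_similarity(vec_curriculum, vec_job)
print(round(alignment, 3))
```

    A score near 1 means the two texts emphasise the same terms, and a score of 0 means they share none, so a mean score such as the 0.42 reported above sits between those extremes.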
  • Item
    TACKLING DROPOUT RATES OF STUDENTS IN UGANDA: AN EXPLORATION OF MACHINE LEARNING AND DATA-DRIVEN APPROACHES
    (Uganda Christian University, 2025-10-02) NAKIMBUGWE DIANA KIRABO
    Uganda’s education sector faced notable challenges, including high dropout rates and poor student outcomes. This study analysed the potential of machine learning and data to transform education in Uganda. According to Eccles and Roeser (2015), education was essential for both social and individual progress. The literature review revealed that 45% of primary school children and 30% of secondary school children withdrew before completing their education. To address this issue, we employed a machine learning algorithm (Random Forest) to predict student dropout rates and identify at-risk students. Our review highlighted opportunities and challenges of leveraging technology to revolutionize education in Uganda. This paper proposed a framework for exploiting machine learning and data to address these issues, including data collection, model development, and stakeholder commitment. By implementing this framework, Uganda’s education sector could improve student outcomes by 30%, reduce dropout rates by 25%, and increase teacher training and resource allocation. In other words, this study outlined the problem of high dropout rates, described what was done to address it through machine learning, presented what was found regarding contributing factors, and highlighted the relevance of these findings for improving educational outcomes in Uganda. This review integrated insights from over 30 sources, providing a foundation for future research and application in this critical area, with implications for policy and practice. Keywords. Primary Keywords: Machine learning, Data analytics, Educational outcomes, Dropout rates, Student performance.
Secondary Keywords: Predictive modeling, Random Forest algorithm, Data-driven decision-making, Educational data mining, Learning analytics, Ugandan education, East Africa, Educational technology. Descriptors (MeSH Terms): Education, Machine learning, Data analytics, Student dropouts, Academic achievement, Educational measurement, Educational technology, and Africa.
  • Item
    Forecasting Emerging Skill Demands with Machine Learning to Inform Curriculum Development in Uganda’s Higher Education
    (Uganda Christian University, 2025-09-24) Wanyama Denis
    Rapid technological advancement and evolving industry demands have widened the skill gap in Uganda’s labor market. Higher education institutions often struggle to keep pace with these changes, leading to mismatches between graduate competencies and employer expectations. This study uses machine learning techniques to forecast emerging skill demands and inform the development of data-driven curricula in Ugandan universities. Drawing on more than one million job postings from 2021 to 2023, the research applies natural language processing (NLP), time series forecasting (ARIMA and Holt-Winters), and clustering algorithms to analyze labor market trends. Exploratory Data Analysis (EDA) revealed high-demand skills, while Holt-Winters outperformed ARIMA (MAE: 9.05 vs. 23.87), capturing the seasonal nature of skill fluctuations. Key findings indicate a growing demand for roles such as interaction designers, network administrators, user experience professionals, and social media managers. In-demand technical skills include Python, Google Analytics, CSS, Tableau, AWS, and Sketch. The increasing emphasis on digital literacy and soft skills underscores the need for more flexible and adaptive curricula. This study offers actionable recommendations for curriculum reform, including integrating technical skills, developing continuous learning pathways, and enhancing academic-industry collaboration. By applying machine learning to labor market analysis, the research equips universities, policymakers, and stakeholders with the information needed to align higher education with the demands of Uganda’s evolving digital economy. Keywords: Machine Learning, Labor Market Trends, Skill Forecasting, Higher Education Curriculum, Uganda, ARIMA, Holt-Winters, Data Science, skill mismatch.
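    Illustrative sketch (not the study's code): additive Holt-Winters smoothing, one of the two forecasting methods compared in this abstract, tracks a level, a trend, and one seasonal offset per position in the cycle. The initialisation scheme, the smoothing parameters, and the toy series are assumptions made for the example; a production analysis would more likely use a library implementation with fitted parameters.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    alpha, beta, gamma are the smoothing weights (each in [0, 1]) for the
    level, trend, and seasonal components respectively.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialisation from the first two seasons (an assumed scheme).
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    seasonal = [value - level for value in first]

    # Update the three components for every observation after season one.
    for t in range(m, len(series)):
        value = series[t]
        season = seasonal[t % m]
        previous_level = level
        level = alpha * (value - season) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (value - level) + (1 - gamma) * season

    n = len(series)
    return [
        level + h * trend + seasonal[(n + h - 1) % m]
        for h in range(1, horizon + 1)
    ]


# Hypothetical quarterly counts of postings mentioning a skill (toy data).
postings = [10, 20, 30, 40] * 3
forecast = holt_winters_additive(
    postings, season_length=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=4
)
print(forecast)
```

    The explicit seasonal component is what lets this family of models follow recurring within-year swings, which matches the abstract's remark about the seasonal nature of skill fluctuations.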
  • Item
    CONTEXTUALIZING AI ETHICS IN UGANDA’S MICROCREDIT WITH ADAPTIVE SENSITIVE REWEIGHTING
    (Uganda Christian University, 2025-08-12) Isabirye Emmanuel
    This research tackles the pressing ethical concerns of using Artificial Intelligence (AI) in Uganda’s microcredit sector by developing an Adaptive Sensitive Reweighting (ASR) model to mitigate algorithmic bias and promote equitable access to credit. Traditional credit scoring models, and AI algorithms trained on Western-biased data, discriminate against marginalized groups because they are based on formal financial records, reinforcing structural disadvantages. Through iterative engagement with Ugandan policymakers, lenders, borrowers, and AI experts, we identify the most significant ethical concerns and specify context-specific fairness metrics. The ASR approach adaptively adjusts weights for sensitive features like collateral values and transaction history during model training to enhance fairness. Experimental outcomes on a typical credit scoring dataset demonstrate ASR’s success: the inclusion rate of disadvantaged borrowers improves by 15% while predictive accuracy is maintained, with significant improvements on key fairness metrics. The research provides actionable policy recommendations on implementing ASR-based AI systems in Uganda’s microfinance sector to drive financial inclusion and sustainable development. This study contributes to emerging Majority World scholarship on AI ethics by demonstrating the necessity of situating ethical frameworks and valuing stakeholder perspectives to develop equitable, inclusive AI systems. Our findings offer valuable insights for policymakers, microfinance institutions, and AI practitioners who aim to implement responsible AI in Uganda’s microcredit sector.
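    Illustrative sketch (not the ASR model): the abstract does not specify how ASR computes its weights, so the example below shows a simpler, static relative of the idea, the classic "reweighing" preprocessing scheme. Each training sample receives the weight P(group) x P(label) / P(group, label), which makes approval rates equal across groups once the weights are applied. The group names, labels, and function names are assumptions made for the example.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-sample weights that decorrelate group membership from the label.

    weight(g, y) = count(g) * count(y) / (n * count(g, y))
    """
    if not groups or len(groups) != len(labels):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels (1 = approved) within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    return positive / total


# Hypothetical toy data: group "a" is approved more often than group "b".
groups = ["a", "a", "a", "a", "b", "b"]
labels = [1, 1, 1, 0, 1, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "a")
rate_b = weighted_positive_rate(groups, labels, weights, "b")
print([round(w, 3) for w in weights], round(rate_a, 3), round(rate_b, 3))
```

    These weights would typically be passed to a learner as sample weights. An adaptive scheme such as the one the abstract describes would presumably update the weights during training rather than fix them once up front.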