Research - Ritta Neg Mfa - Academic Website

Research & Projects

My research focuses on advancing fairness, transparency, and accountability in artificial intelligence and machine learning. I develop benchmarks, evaluation frameworks, and practical tools to assess how AI systems make decisions and how they can be improved to reduce bias. By integrating technical rigor with ethical considerations, my work aims to help researchers and practitioners build trustworthy, socially responsible AI systems that promote equitable outcomes in high-impact domains such as healthcare, finance, and hiring.

Research Interests

Fairness in Artificial Intelligence and Machine Learning
Evaluation of Large Language Models (LLMs)
Trustworthy and Responsible AI

Projects

Fairness in Machine Learning: A Literature Review on Challenges, Toolkits, and Frameworks.

https://ml4sts.com/fairml-bestpractices/#checklist-of-best-practices

This work synthesizes insights from over 40 peer-reviewed studies (2018–2025) to provide actionable guidance for building fair and responsible AI systems. It examines technical, organizational, and societal challenges in automated decision-making across domains such as hiring, lending, healthcare, and law enforcement, addressing key questions about the definition, barriers, and implementation of fairness in practice. Building on this analysis, the research introduces a practitioner-oriented fairness checklist and evaluation rubric to help developers systematically integrate fairness throughout the machine learning lifecycle. These tools translate theoretical principles into practical guidance, bridging the gap between fairness research and real-world applications. Ultimately, the work advances transparency, accountability, and ethical AI development.

GitHub

LLM-based Code Evaluation for Fairness.

This project develops an LLM evaluation framework that assesses how effectively AI agents generate fair and reliable machine learning solutions. The framework operates across three levels: code, reasoning, and model outcomes. At the code level, it combines automated static analysis using a custom Flake8 fairness linter with rubric-based LLM evaluations to assess adherence to fairness best practices. At the reasoning level, LLMs evaluate agent logs to determine whether the decisions and justifications reflect sound and ethical fairness considerations. At the outcome level, the framework measures the fairness and performance of the resulting models using established metrics. Through statistical and embedding-based analyses, this work examines consistency, alignment, and the relationship between fairness-aware coding practices, reasoning quality, and real-world model performance.
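As a rough illustration of what an outcome-level check can look like, the sketch below computes per-group selection rates and the demographic parity difference for a set of binary predictions. The metrics the framework actually uses are not listed on this page, so the choice of demographic parity, the function names, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Return the fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(y_pred, groups):
    """Gap between the highest and lowest group selection rates (0 = parity)."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical predictions for two demographic groups, "a" and "b".
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(selection_rates(y_pred, groups))
print(demographic_parity_difference(y_pred, groups))
```

A value near zero means the model selects members of each group at similar rates; a real audit would pair this with performance metrics and other fairness criteria, since no single number captures fairness on its own.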

Improvements to DaST: Data-Free Substitute Training.

This project implements the DaST framework for data-free substitute training in adversarial attacks, enhanced with a total variation regularizer and a mode-seeking loss to improve the diversity and quality of synthesized data. These improvements strengthen attack effectiveness and transferability, enabling more robust evaluation of neural network vulnerabilities without requiring access to the original training data.

GitHub
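To make the two added loss terms concrete, here is a minimal sketch in plain Python. The project's exact formulations are not given on this page, so this assumes an anisotropic total variation penalty and a mode-seeking term defined as the inverse of the ratio between image distance and latent distance; the function names and toy inputs are hypothetical. An actual implementation would operate on batched tensors in a deep learning framework.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D image given as a list of rows.

    Sums absolute differences between vertically and horizontally adjacent
    pixels, so smooth images score low and noisy images score high.
    """
    height, width = len(image), len(image[0])
    tv = 0.0
    for i in range(height):
        for j in range(width):
            if i + 1 < height:
                tv += abs(image[i + 1][j] - image[i][j])
            if j + 1 < width:
                tv += abs(image[i][j + 1] - image[i][j])
    return tv


def mean_abs_distance(a, b):
    """Mean absolute difference between two equal-length flat sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def flatten(image):
    """Flatten a 2D image (list of rows) into a single list of pixels."""
    return [value for row in image for value in row]


def mode_seeking_loss(img1, img2, z1, z2, eps=1e-5):
    """Assumed mode-seeking term: inverse of image distance over latent distance.

    Minimizing it rewards a generator for mapping different latent codes
    to visibly different images, which discourages mode collapse.
    """
    ratio = mean_abs_distance(flatten(img1), flatten(img2)) / mean_abs_distance(z1, z2)
    return 1.0 / (ratio + eps)


# Toy 2x2 "images" and latent codes, for illustration only.
smooth = [[0.5, 0.5], [0.5, 0.5]]
noisy = [[0.0, 1.0], [1.0, 0.0]]
z1, z2 = [0.0, 0.0], [1.0, 1.0]

print(total_variation(smooth), total_variation(noisy))
print(mode_seeking_loss(smooth, noisy, z1, z2))   # distinct outputs: small loss
print(mode_seeking_loss(smooth, smooth, z1, z2))  # collapsed outputs: large loss
```

In this sketch the total variation term pushes synthesized samples toward spatial smoothness, while the mode-seeking term penalizes a generator that produces the same image for different latent codes.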

Prototype-Based Interpretability for Fairness Auditing.

This project asks whether prototype-based interpretable models encode demographic shortcuts when trained on facial attribute classification, and whether prototype visualizations can expose these biases more transparently than post-hoc explanation methods like GradCAM or LIME. The core problem is not only that bias exists, but that it hides inside black-box models, which makes it difficult to diagnose, communicate, or fix. A model can appear accurate overall while relying on unfair patterns tied to gender, age, or makeup rather than the true target attribute. By using prototypes to inspect why a model makes a prediction, this work connects interpretability and fairness into a single auditing framework.

Publications

LLM-based Code Evaluation for Fairness.

https://openreview.net/forum?id=PR0s3xSecr
Ritta Neg Mfa, Surbhi Rathore, Ayman Sandouk, Sarah M Brown (2025). Women in Machine Learning Workshop @ NeurIPS 2025.
Introduces a multi-level framework for evaluating AI agents on their ability to generate fair code.