Vinitra Swamy

EPFL · Microsoft AI · UC Berkeley · Scholé AI · vinitra@schole.ai

Hi there! I'm an AI researcher, CEO, and co-founder of the recent AI-for-education spinoff Scholé AI. I earned my PhD in Computer Science at EPFL in 2025, co-advised by Prof. Tanja Käser at the ML for Education Lab and Prof. Martin Jaggi at the ML and Optimization Lab, receiving the Patrick Denantes Memorial Prize for EPFL's best computer science thesis.

Before Switzerland, I spent two years at Microsoft AI as a lead engineer for the Open Neural Network eXchange (ONNX) project, collaborating with over 30 leading tech companies (including Nvidia, Intel, Amazon, and Meta) to establish an AI interoperability standard.

My claim to fame (😂) is graduating at 20 as the youngest M.S. in Computer Science recipient in UC Berkeley’s history. Since then, I’ve lectured courses in AI and machine learning for UC Berkeley, University of Washington, and Harvard, and spent several summers in industry internships.

I love people, data, and working on exciting problems at the intersection of the two:

ML for education (personalized learning, autograding, knowledge tracing)
Explainable and interpretable AI
Generalized learning (transfer learning, multimodal learning)

Thank you for taking time from your day to find out what I do with mine!

Selected Research

✨🎓 My PhD thesis on "A Human-Centric Approach to Explainable AI for Personalized Education" is now publicly available!

For a full list of publications, please visit my Google Scholar page. During my PhD, I was fortunate to be recognized by Stanford, UCSD, and UChicago as a "Rising Star in Data Science", and awarded the G-Research PhD Prize, the IC Distinguished Service Award three times, and most recently the Patrick Denantes Memorial Prize for EPFL's best CS thesis.

User-Centric Interpretability Through Mixture-of-Experts (ICLR)

Vinitra Swamy, Syrielle Montariol, Julian Blackwell, Jibril Frej, Martin Jaggi, Tanja Käser

An interpretable-by-design mixture-of-experts model architecture

We present InterpretCC (interpretable conditional computation), a family of intrinsically interpretable neural networks at a unique point in the design space that optimizes for ease of human understanding and explanation faithfulness, while maintaining comparable performance to state-of-the-art models. InterpretCC achieves this through adaptive sparse activation of features before prediction, allowing the model to use a different, minimal set of features for each instance. We extend this idea into an interpretable, global mixture-of-experts (MoE) model that allows users to specify topics of interest, discretely separates the feature space for each data point into topical subnetworks, and adaptively and sparsely activates these topical subnetworks for prediction. We apply InterpretCC for text, time series and tabular data across several real-world datasets, demonstrating comparable performance with non-interpretable baselines and outperforming intrinsically interpretable baselines. Through a user study involving 56 teachers, InterpretCC explanations are found to have higher actionability and usefulness over other intrinsically interpretable approaches.

[Paper] [OpenReview - ✨ Top 5% of scores! ✨] [Pre-Print] [Code]

2025

iLLuMinaTE: From Explanations to Action (AAAI)

Vinitra Swamy*, Davide Romano*, Bhargav Srinivasa Desikan, Oana-Maria Camburu, Tanja Käser

Using LLMs and social science theories of explanation to communicate XAI insights to students

We introduce iLLuMinaTE, a zero-shot, chain-of-prompts LLM-XAI pipeline inspired by Miller's cognitive model of explanation that ensures explanations for state-of-the-art AI models are understandable for non-technical users such as educators and students. iLLuMinaTE is designed to deliver theory-driven, actionable feedback to students in online courses. iLLuMinaTE navigates three main stages - causal connection, explanation selection, and explanation presentation - with variations drawing from eight social science theories (e.g. Abnormal Conditions, Pearl's Model of Explanation, Necessity and Robustness Selection, Contrastive Explanation). We extensively evaluate 21,915 natural language explanations of iLLuMinaTE extracted from three LLMs (GPT-4o, Gemma2-9B, Llama3-70B), with three different underlying XAI methods (LIME, Counterfactuals, MC-LIME), across students from three diverse online courses. Our evaluation involves analyses of explanation alignment to the social science theory, understandability of the explanation, and a real-world user preference study with 114 university students containing a novel actionability simulation. We find that students prefer iLLuMinaTE explanations over traditional explainers 89.52% of the time.

[Paper] [Pre-Print] [Code]

2025

🏆 AI or Human? Evaluating Student Feedback Perceptions in Higher Education (ECTEL, ✨Best Paper✨)

Tanya Nazaretsky, Paola Mejia-Domenzain, Vinitra Swamy, Jibril Frej, Tanja Käser

A study on how knowing whether feedback is provided by AI or a human affects students' perceptions

Feedback plays a crucial role in learning by helping individuals understand and improve their performance. Yet, providing timely, personalized feedback in higher education presents a challenge due to the large and diverse student population, often resulting in delayed and generic feedback. Recent advances in generative Artificial Intelligence (AI) offer a solution for delivering timely and scalable feedback. However, little is known about students' perceptions of AI feedback. In this paper, we investigate how the identity of the feedback provider affects students' perception, focusing on the comparison between AI-generated and human-created feedback. Our approach involves students evaluating feedback in authentic educational settings both before and after disclosing the feedback provider’s identity, aiming to assess the influence of this knowledge on their perception. Our study with 457 students across diverse academic programs and levels reveals that students’ ability to differentiate between AI and human feedback depends on the task at hand. Disclosing the identity of the feedback provider affects students’ preferences, leading to a greater preference for human-created feedback and a decreased evaluation of AI-generated feedback. Moreover, students who failed to identify the feedback provider correctly tended to rate AI feedback higher, whereas those who succeeded preferred human feedback. These tendencies are similar across academic levels, genders, and fields of study. Our results highlight the complexity of integrating AI into educational feedback systems and underline the importance of considering student perceptions in AI-generated feedback adoption in higher education.

[Pre-Print] [Paper] [News Coverage]

2024

Viewpoint: Future of Human-Centric XAI (JAIR)

Vinitra Swamy, Jibril Frej, Tanja Käser

Defining XAI needs for human-centric applications (healthcare, education, etc.)

Current approaches in human-centric XAI (e.g. predictive tasks in healthcare, education, or personalized ads) tend to rely on a single explainer. This is a concerning trend given systematic disagreement in explainability methods applied to the same points and underlying black-box models. We propose to shift from post-hoc explainability to designing interpretable neural network architectures; moving away from approximation techniques in human-centric and high impact applications. We identify five needs of human-centric XAI (real-time, accurate, actionable, human-interpretable, and consistent) and propose two schemes for interpretable-by-design neural network workflows (adaptive routing for interpretable conditional computation and diagnostic benchmarks for iterative model learning). We postulate that the future of human-centric XAI is neither in explaining black-boxes nor in reverting to traditional, interpretable models, but in neural networks that are intrinsically interpretable.

[Pre-Print] [Paper]

2024

🏆 Interpret3C: Interpretable Student Clustering (AIED, ✨Best LBR Paper✨)

Isadora Salles, Paola Mejia-Domenzain*, Vinitra Swamy*, Julian Blackwell, Tanja Käser

Interpretable-by-design clustering, showcased for students in a large online course

Interpret3C (Interpretable Conditional Computation Clustering) is a novel clustering pipeline that incorporates interpretable neural networks (NNs) in an unsupervised learning context. This method leverages adaptive gating in NNs to select features for each student. Then, clustering is performed using the most relevant features per student, enhancing clusters’ relevance and interpretability. We use Interpret3C to analyze the behavioral clusters considering individual feature importances in a MOOC with over 5,000 students. This research contributes to the field by offering a scalable, robust clustering methodology and an educational case study that respects individual student differences and improves interpretability for high-dimensional data.

[Pre-Print] [Code]

2024

Student Answer Forecasting: Answer Choice Prediction for Language Learning (EDM)

Elena Gado, Tommaso Martorella, Luca Zunino, Paola Mejia-Domenzain, Vinitra Swamy, Jibril Frej, Tanja Käser

Using LLMs to personalize student answer forecasting through in-context student embeddings

Intelligent Tutoring Systems (ITS) enhance personalized learning by predicting student responses to provide immediate and personalized instruction. However, recent research has primarily focused on the correctness of the answer rather than the student's performance on specific answer choices, limiting insights into students' thought processes and potential misconceptions. To address this gap, we present MCQStudentBert, an answer forecasting model that leverages the capabilities of Large Language Models (LLMs) to integrate contextual understanding of students' answering history along with the text of the questions and answers. By predicting the specific answer choices students are likely to make, practitioners can easily extend the model to new answer choices or remove answer choices for the same multiple-choice question (MCQ) without retraining the model. In particular, we compare MLP, LSTM, BERT, and Mistral 7B architectures to generate embeddings from students' past interactions, which are then incorporated into a finetuned BERT's response-forecasting mechanism. We apply our pipeline to a dataset of language learning MCQ, gathered from an ITS with over 10,000 students to explore the predictive accuracy of MCQStudentBert, which incorporates student interaction patterns, in comparison to correct answer prediction and traditional mastery-learning feature-based approaches. This work opens the door to more personalized content, modularization, and granular support.

[Pre-Print] [Code]

2024

✨🗞️ MultiModN (NeurIPS)

Vinitra Swamy*, Malika Satayeva*, Jibril Frej, Thierry Bossy, Thijs Vogels, Martin Jaggi, Tanja Käser*, Mary-Anne Hartley*

A modular, multimodal, interpretable neural network architecture that is robust to missing data

We present MultiModN, a multimodal, modular network that fuses latent representations in a sequence of any number, combination, or type of modality while providing granular real-time predictive feedback on any number or combination of predictive tasks. MultiModN's composable pipeline is interpretable-by-design, as well as innately multi-task and robust to the fundamental issue of biased missingness. We perform four experiments on several benchmark MM datasets across 10 real-world tasks (predicting medical diagnoses, academic performance, and weather), and show that MultiModN's sequential MM fusion does not compromise performance compared with a baseline of parallel fusion. By simulating the challenging bias of missing not-at-random (MNAR), this work shows that, contrary to MultiModN, parallel fusion baselines erroneously learn MNAR and suffer catastrophic failure when faced with different patterns of MNAR at inference.

[Paper] [✨ News Coverage ✨] [Pre-Print] [Code] [Poster]

2023

✨🗞️ MEDITRON-70B: Scaling Medical Pretraining for LLMs (Nature Preprint)

EPFL LLM Team, Yale Medicine, ICRC, Clinical Evaluation Group

Developing open medical foundation models (MEDITRON)

We democratize large-scale medical AI systems by developing MEDITRON: a suite of open-source LLMs and LMMs with 7B and 70B parameters adapted to the medical domain. MEDITRON extends pretraining on a comprehensively curated medical corpus that includes biomedical literature and internationally recognized clinical practice guidelines. Evaluations using standard medical reasoning benchmarks show significant improvements over all current open-access models and several state-of-the-art commercial LLMs that are orders of magnitude larger, more expensive to host, and closed-source. Enhanced with visual processing capabilities, our MEDITRON-V model also outperforms all open-access models and much larger closed-source models on multimodal reasoning tasks for various biomedical imaging modalities. Beyond traditional benchmarks, we also create a novel and physician-driven adversarial question dataset grounded in real-world clinical settings, and a comprehensive 17-metric evaluation rubric to assess alignment and contextualization to real-world clinical practice. Applying this framework to MEDITRON-70B's responses, sixteen independent physicians found a high level of alignment across all metrics, including medical accuracy, safety, fairness, communication, and interpretation. The MEDITRON suite is a significant step forward in closing the technological gap between closed- and open-source medical foundation models.

[ArXiv Pre-Print] [✨ News Coverage ✨] [Journal Pre-Print] [Code] [Models + Datasets]

2023

Unraveling Downstream Bias from LLMs (EMNLP Findings)

Thiemo Wambsganss*, Xiaotian Su*, Vinitra Swamy, Parsa Seyed Neshai, Roman Rietsche, Tanja Käser

A study on LLM bias transfer in AI writing support for students

We investigate how bias transfers through an AI writing support pipeline in a large-scale user study with 231 students writing business case peer reviews in German. Students are divided into five groups with different levels of writing support (traditional ML suggestions, control group with no assistance, finetuned versions of GPT-2, GPT-3, and GPT-3.5). Using GenBit, WEAT, and SEAT, we evaluate the gender bias at various stages of the pipeline: in model embeddings, in suggestions generated by the models, and in reviews written by students. Our results demonstrate that there is no significant difference in gender bias between the resulting peer reviews of groups with and without LLM suggestions. Our research is therefore optimistic about the use of AI writing support in the classroom, showcasing a context where bias in LLMs does not transfer to students' responses.

[Paper] [Pre-Print] [Code]

2023

🏆 Trusting the Explainers (LAK, ✨Honorable Mention✨)

Vinitra Swamy, Sijia Du, Mirko Marras, Tanja Käser

Validating eXplainable AI with university professors for AI-assisted course design

We use human experts to validate explainable AI approaches in the context of student success prediction. Our pairwise analyses cover five course pairs (nine datasets from Coursera, EdX, and Courseware) that differ in one educationally relevant aspect and popular instance-based explainers. We quantitatively compare the distances between the explanations across courses and methods, then validate the explanations of LIME, SHAP, and a counterfactual-based confounder with 26 semi-structured interviews of university-level educators regarding which features they believe contribute most to student success, which explanations they trust most, and how they could transform these insights into actionable course design decisions. Our results show that quantitatively, explainers significantly disagree with each other about what is important, and qualitatively, experts themselves do not agree on which explanations are most trustworthy.

[Paper] [Pre-Print] [Code]

2023

RIPPLE: Concept-Based Interpretation for Raw Time Series (AAAI)

Mohammad Asadi, Vinitra Swamy, Jibril Frej, Julien Vignoud, Mirko Marras, Tanja Käser

In-hoc explainability for graph neural networks, using raw multivariate time series clickstreams

We present RIPPLE, utilizing irregular multivariate time series modeling with graph neural networks to achieve comparable or better accuracy with raw time series clickstreams in comparison to hand-crafted features. Furthermore, we extend concept activation vectors for interpretability in raw time series models. Our experimental analysis on 23 MOOCs with millions of combined interactions over six behavioral dimensions shows that models designed with our approach can (i) beat state-of-the-art time series baselines with no feature extraction and (ii) provide interpretable insights for personalized interventions.

[Paper] [Pre-Print] [Slides] [Code]

2023

Bias at a Second Glance (COLING)

Thiemo Wambsganss, Vinitra Swamy, Roman Rietsche, Tanja Käser

Measuring bias propagation in LLMs using Swiss-German student peer-reviews

We analyze bias across text and through multiple architectures on a corpus of 9,165 German peer-reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical aspect ratings from the peer-review recipient as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected data-set. In contrast to our initial expectations, we found that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models find substantial conceptual, racial, and gender bias and have significant changes in bias across conceptual and racial axes during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN sustainability goal (quality education) with a novel dataset, an understanding of biases in natural language education data, and the potential harms of not counteracting biases in language models for educational tasks.

[Paper] [Pre-Print] [Code + Data]

2022

✨🗞️ Evaluating the Explainers (EDM)

Vinitra Swamy, Bahar Radhmehr, Natasa Krco, Mirko Marras, Tanja Käser

Evaluation of systematic disagreement in post-hoc explainers for education

We compare five explainers for black-box neural nets (LIME, PermutationSHAP, KernelSHAP, DiCE, CEM) on the downstream task of student performance prediction for five massive open online courses. Our experiments demonstrate that the families of explainers do not agree with each other on feature importance for the same Bidirectional LSTM models with the same representative set of students. We use Principal Component Analysis, Jensen-Shannon distance, and Spearman's rank-order correlation to quantitatively cross-examine explanations across methods and courses. Our results come to the concerning conclusion that the choice of explainer contains systematic bias and is in fact paramount to the interpretation of the predictive results, even more so than the data the model is trained on.

[Paper] [✨ News Coverage ✨] [Pre-Print] [Slides] [Code]

2022

Meta Transfer Learning (Learning@Scale)

Vinitra Swamy, Mirko Marras, Tanja Käser

Generalizable models for early success prediction in MOOCs

We tackle the problem of transferability across MOOCs from different domains and topics, focusing on models for early success prediction. In this paper, we present and analyze three novel strategies for creating generalizable models: 1) pre-training a model on a large set of diverse courses, 2) leveraging the pre-trained model by including meta features about courses to orient downstream tasks, and 3) fine-tuning the meta transfer learning model on previous course iterations. Our experiments on 26 MOOCs with over 145,000 combined enrollments and millions of interactions show that models combining interaction clickstreams and meta information have comparable or better performance than models which have access to previous iterations of the course. With these models, we enable educators to warm-start their predictions for new and ongoing courses.

[Paper] [Pre-Print] [Slides] [Code]

2022

Interpreting LMs Through KG Extraction (NeurIPS XAI)

Vinitra Swamy, Angelika Romanou, Martin Jaggi

Global explainability for LLMs through temporal extraction of knowledge graphs

While transformer-based language models are undeniably useful, it is a challenge to quantify their performance beyond traditional accuracy metrics. In this paper, we compare BERT-based language models (DistilBERT, BERT, RoBERTa) through snapshots of acquired knowledge at sequential stages of the training process. We contribute a quantitative framework to compare language models through knowledge graph extraction and showcase a part-of-speech analysis to identify the linguistic strengths of each model variant. Using these metrics, machine learning practitioners can compare models, diagnose their models' behavioral strengths and weaknesses, and identify new targeted datasets to improve model performance.

Published at eXplainable AI for Debugging and Diagnosis Workshop at NeurIPS 2021.

[Paper] [Poster] [Code]

2021

ONNX: Open Neural Network eXchange (Microsoft AI)

Open source standard for ML model interoperability

Open Neural Network Exchange (ONNX) is an open source ecosystem for machine learning interoperability, empowering AI developers to choose the right tools as their project evolves. Founded by Microsoft and Facebook, and now supported by over 30 other companies, ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers. In addition to the code and model contributions, we presented several research talks about model operationalization and acceleration with ONNX and ONNX Runtime at Microsoft MLADS (ML, AI, Data Science Conference) and UW eScience Institute.

[ONNX Model Zoo] [ONNX + Azure ML Tutorials] [MLADS Notebooks] [MLADS Slides] [UW Slides]

2020

✨🗞️ ML for Humanitarian Data: Tag Prediction using the HXL Standard (KDD)

Vinitra Swamy (Microsoft AI), Elisa Chen, Anish Vankayalapati, Abhay Aggarwal, Chloe Liu (UC Berkeley), Vani Mandava (MSR), Simon Johnson (UN)

A case study in NLP for humanitarian data interoperability

We present a simple yet effective machine learning model to predict tags for datasets from the United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA) with the labels and attributes of the Humanitarian Exchange Language (HXL) Standard for data interoperability. This paper details the methodology used to predict the corresponding tags and attributes for a given dataset with an accuracy of 94% for HXL header tags and an accuracy of 92% for descriptive attributes. Compared to previous work, our workflow provides a 14% accuracy increase and is a novel case study of using ML to enhance humanitarian data.

[Paper] [✨ News Coverage ✨] [Slides] [Poster] [Code]

2019

Pedagogy, Infrastructure, and Analytics for Data Science Education at Scale (MSc Thesis)

Vinitra Swamy, David Culler

Tools for scaling data science course offerings from 100 to 1500 students a semester

A detailed research report on autograding, analytics, and scaling JupyterHub infrastructure highlighted in use for thousands of students taking Data 8 at UC Berkeley. Thesis presented as a graduate student affiliated with RISELab, after helping develop UC Berkeley data science's software infrastructure stack including JupyterHub, autograding with OkPy, Gradescope, and authentication for 1000s of students. Collaborated with Yuvi Panda, Ryan Lovett, Chris Holdgraf, and Gunjan Baid on a talk detailing the infrastructure stack at JupyterCon 2017.

[Thesis] [Blog] [Code] [JupyterCon Slides] [JupyterCon Speaker Profile]

2018

Deep Knowledge Tracing for Student Code Progression (AIED)

Vinitra Swamy, Samuel Lau, Allen Guo, Madeline Wu, Wilton Wu, Zachary Pardos, David Culler

Modeling student learning through iterative code attempts in Jupyter notebooks

Knowledge Tracing is a body of learning science literature that seeks to model student knowledge acquisition through their interaction with coursework. This paper uses a recurrent neural network (LSTM) and free-form code attempts to model student knowledge in large scale computer science classes. This work has applications in hint generation for code assignments, student modeling, and plagiarism detection in interactive programming environments.

[Paper] [Poster]

2018

Industry

Scholé AI

Co-founder and CEO

Working on personalized learning at scale. Check out our company website!

2025 - Current

Microsoft AI

AI Software Engineer

Working on a framework for deep learning / ML framework interoperability (ONNX) alongside an ecosystem of converters, containers, and inference engines.

Lead of the inter-company ONNX Special Interest Group (SIG) for Model Zoo and Tutorials with Microsoft, Intel, Facebook, IBM, Nvidia, Red Hat, and other academic and industry collaborators.

Presented and represented Microsoft AI at several conferences: WIDS 2020, Microsoft //Build 2019, KDD 2019, Microsoft Research Faculty Summit 2019, UC Berkeley AI for Social Impact Conference 2018, Women in Cloud Summit 2018, RISECamp 2018

2018 - 2020

Berkeley Institute for Data Science (BIDS), RISELab

Research Assistant

Worked on projects in AI + Systems with an application area of data science education. Project areas include JupyterHub architecture, custom deployments, OkPy autograding integration, Jupyter notebook extensions, and D3.js / Plotly visualizations for data science explorations of funding and enrollment data.

[BIDS] [RISELab]

2015 - 2018

IBM Research

Research Scientist Intern, Machine Learning

Worked on the CSi2 project as a Machine Learning Research Scientist intern on the Hybrid Cloud team. The CSi2 algorithm is an ensemble machine learning algorithm to detect inactivity of VMs as well as suggest a course of action (e.g. termination, snapshot). It is projected to save IBM Research at least $3.2 million with 95.12% recall and 88% F1 score (>> industry standard) and is being implemented into the Watson Services Platform. Collaborated with Neeraj Asthana, Sai Zheng, Ivan D'ell Era, Aman Chanana. Presented an exit talk and filed 2 patents.

2017

LinkedIn

Software Engineering Intern

Interned at LinkedIn headquarters with the Growth Division's Search Engine Optimization (SEO) Team the summer before entering UC Berkeley. Worked on fullstack testing infrastructure for the public profile pages, as well as a Hadoop project; outside of assigned work, helped plan LinkedIn’s DevelopHER Hackathon and worked on several Market Research/User Experience Design initiatives.

2015

Google

Intern, Made w/ Code Ambassador

Spent a summer learning computer science fundamentals and shadowing engineers through the CAPE high school internship program at Google Headquarters in Mountain View, CA. Chosen as a Google Ambassador for Computer Science following the experience. Worked with Google, Salesforce, and AT&T to introduce coding to over 15,000 girls across California with the Made w/ Code Initiative.

2011

Education

École Polytechnique Fédérale de Lausanne

PhD in Computer Science

President of EPFL PhDs in Computer Science (EPIC)
Advised by Prof. Tanja Käser at the ML4ED Lab and Prof. Martin Jaggi at the MLO Lab
EDIC Computer Science Fellowship Recipient
EPFL IC Distinguished Service Award (3-time Recipient)
Elected CS PhD Representative (EDIC Committee under Prof. Ed Bugnion)

2020 - 2025

University of California, Berkeley

Master's in Electrical Engineering and Computer Science

President of Computer Science Honor Society (UPE)
✨🗞️Head Graduate Student Instructor of Data 8 (Foundations of Data Science)
Research Assistant, Graduate Opportunity Fellow at RISELab
Advisor: Dean of Data Sciences, David Culler

2017 - 2018

University of California, Berkeley

Bachelor's in Computer Science

EECS Award of Excellence in Undergraduate Teaching and Leadership
UC Berkeley Alumni Leadership Scholar
Graduated 2 years early

2015 - 2017

Teaching Experience

Agentic AI Intensives

Harvard University

Adjunct Teaching Faculty to 100s of professionals around the world through HDSI's AI-enabled online classroom. Forbes recently recommended the course as the top way to learn agents in 2026!

[Course Announcement] [Agentic AI Intensives]

2025 - Current

Machine Learning, Data Analysis, Databases

EPFL

Fall 2022, 2023: TA for AICC 1 - Advanced Information, Computation, and Communication with Tanja Käser
Spring 2022, 2023: TA for CS 421 - Machine Learning for Behavioral Data with Tanja Käser
Fall 2021: TA for CS 401 - Applied Data Analysis with Bob West
Spring 2021: TA for CS 322 - Introduction to Database Systems with Anastasia Ailamaki and Christoph Koch

2020 - 2024

CSE/STAT 416: Introduction to Machine Learning

University of Washington, Seattle

Lecturer to 100+ upper-division undergraduate and graduate students on a practical introduction to machine learning. Modules include regression, classification, clustering, retrieval, recommender systems, and deep learning, with a focus on an intuitive understanding grounded in real-world applications.

[CSE 416 Website]

Summer 2020

Data 8: Foundations of Data Science

UC Berkeley

✨🗞️ Lecturer to 250+ undergraduate students on fundamentals of statistical inference, computer programming, and inferential thinking.

[Data 8 Website] [Course Offering] [Course Materials / Code]

Summer 2018

Data 8: Foundations of Data Science

UC Berkeley

✨🗞️ TA / Head Graduate Student Instructor (GSI) of Data 8 for 4 semesters, responsible for management of 1000+ undergraduates, 40 GSIs, 30 tutors, and 100+ lab assistants each semester.
✨🗞️ Helped create data science curriculum material for lecture and domain-specific seminar courses.
✨🗞️ In charge of developing JupyterHub infrastructure for 1500+ active users (Jupyter servers on a Docker/Kubernetes backend, deployed across cloud providers including Google Cloud, Azure, and AWS).

2016 - 2018

Organizing Team

HEXED Workshop Organizer @ EDM 2024
WiML Program Chair @ ICML 2022
FATED Workshop Organizer @ EDM 2022

Reviewer / Program Committee

AIED Program Committee 2023, 2024
AIED 2021*, 2022* (Subreviewer for Tanja Käser)
EMNLP BlackBoxNLP 2021, 2022, 2023
NeurIPS 2024
NeurIPS GenBench 2022, 2023, GAIED 2023, and XAI Workshops 2023, 2024
EACL 2022
EDM Program Committee 2023, 2024
Journal of Educational Data Mining (JEDM) 2022, 2023
LAK 2022*, 2023* (Subreviewer for Tanja Käser)
Editor for Springer Series on Big Data Management (Educational Data Science)

Working Groups

Fairness Working Group @ EDM 2022
WiML Workshop Team @ NeurIPS 2021
Lead of the 2020 ONNX SIG for Models and Tutorials

2023

Awards

Patrick Denantes Memorial Prize for EPFL's best CS thesis
Rising Stars in Data Science 2024
2024 G-Research PhD Prize
Winner of the 2024 Learning Engineering Tools Competition
EPFL IC Distinguished Service Award 2021, 2022, 2023
EPFL Computer Science (EDIC) Fellowship
UC Berkeley EECS Award of Excellence for Teaching and Leadership
UC Berkeley Graduate Opportunity Fellowship
Kairos Society Entrepreneurship Fellow, UC Berkeley
President of UPE, UC Berkeley Computer Science Honor Society
UC Berkeley Alumni Leadership Scholar
NASA-Conrad Foundation Spirit of Innovation Cybertechnology Finalist
Girl Scout Gold Award: Bridging the Digital Divide
Google International Trailblazer in Computer Science

Speaking Engagements

Fall 2024: Talk at Rising Stars in Data Science Workshop at UC San Diego on the "Future of human-centric eXplainable AI"
Summer 2024: Invited Talk at Microsoft Research Cambridge in Cambridge, UK on evaluating post-hoc explainers, InterpretCC, MultiModN, and iLLuMinaTE
Summer 2024: Speaker at the Data Makers Fest 2024 in Porto, Portugal on the future of Explainable AI
Summer 2024: Speaker at the LauzHack Deep Learning Bootcamp on Advanced Topics: Explainable AI
Spring 2023: Speaker at the SMART-AI Workshop for the WHO on Interpretable AI
Spring 2024: Speaker at the Applied XAI track of Applied Machine Learning Days 2024
Spring 2023: Speaker at EDIC Open House for the ML4ED Laboratory
Spring 2023: Instructor at the BeLEARN center for the JDPLS program (ETH Zurich, EPFL) on Student Modeling
Fall 2023: Speaker at Red Cross LLM Day at the ICRC Headquarters on Evaluating and Interpreting LLMs
Fall 2023: Speaker at AWS Research Day on Personalized, Trustworthy Human-Centric Computing: AI for Education
Summer 2022: Speaker at Oxford ML "Un-Workshop" Series on Evaluating Explainable AI
Summer 2022: Opening Remarks at the FATED workshop at EDM 2022 (Durham, UK)
Spring 2022: Speaker at Women in Data Science (WIDS 2022) Silicon Valley: Explainable AI
Fall 2021: Spotlight Talk at NeurIPS Inaugural eXplainable AI for Debugging and Diagnosis Workshop
Fall 2021: Presenter at the Tamil Internet Conference (INFITT) on "TamilBERT: Natural Language Modeling for Tamil"
Spring 2021: Presenter at the EDIC Orientation for PhDs, EPFL
Spring 2021: UC Berkeley Data Science Alumni Panel (Data 8)
Fall 2020: Featured Guest on the Tech Gals Podcast (Episode 3)
Fall 2020: Speaker at the ONNX Workshop
Spring 2020: Speaker at Women in Data Science Conference (WIDS 2020) Silicon Valley: Interoperable AI (ONNX)
Spring 2020: Speaker at the Linux Foundation (LF) AI Day
Fall 2019: Presenter at Microsoft Bay Area AI Meetup
Summer 2019: Guest on the Microsoft AI Show (Channel 9)
Spring 2019: Speaker at Microsoft Machine Learning and Data Science Conference (MLADS) (Redmond)
Summer 2018: Presenter at Artificial Intelligence in Education 2018 (London)
Summer 2018: Speaker at UC Berkeley's Data Science Undergraduate Pedagogy and Practice Workshop (Berkeley)
Spring 2018: Brilliance of Berkeley Panelist on the "Transformative Powers of Data" (LA '18)
Fall 2017: Opening Panelist at SalesForce Dreamforce Conference (SF)
Summer 2017: Speaker at JupyterCon (NYC)
Spring 2017: Presenter at Berkeley Institute for Data Science Research Showcase (Berkeley)
Fall 2016: Panelist at SF BusinessWeek Conference (SF)
Summer 2016: Conference organizing team at Algorithms for Modern Massive Data Sets (MMDS) (Berkeley)