Vinitra Swamy

PhD Candidate at EPFL · Deep Learning Research

Hello! I am an AI researcher and PhD student working on deep learning model explainability at École Polytechnique Fédérale de Lausanne (EPFL). I'm co-advised by Prof. Tanja Käser at the ML4Ed Lab and Prof. Martin Jaggi at the MLO Lab.

Before moving to Switzerland, I worked at Microsoft AI as a lead engineer for the Open Neural Network eXchange project.

I graduated 2 years early with my B.A. and M.S. in Computer Science at UC Berkeley (’17, ’18, go bears!) and served as a machine learning lecturer for the Berkeley Division of Data Sciences and the University of Washington CSE Department.

I love people, data, and working on exciting problems at the intersection of the two:

  • Explainable and interpretable AI
  • Natural language models
  • ML for education (autograding, knowledge tracing, scalable infrastructure)
  • Running a data science course with 1000+ passionate undergrads

Thank you for taking time out of your day to find out what I do with mine!


Reviewer / Program Committee

BlackBoxNLP 2021 @ EMNLP
LAK 2022 (Subreviewer for Tanja Käser)
AIED 2021 (Subreviewer for Tanja Käser)

Working Groups

Fairness Working Group @ EDM 2021
Lead of the 2020 ONNX SIG for Models and Tutorials


Interpreting Language Models Through Knowledge Graph Extraction

École Polytechnique Fédérale de Lausanne (EPFL)

While transformer-based language models are undeniably useful, it is a challenge to quantify their performance beyond traditional accuracy metrics. In this paper, we compare BERT-based language models (DistilBERT, BERT, RoBERTa) through snapshots of acquired knowledge at sequential stages of the training process. We contribute a quantitative framework to compare language models through knowledge graph extraction and showcase a part-of-speech analysis to identify the linguistic strengths of each model variant. Using these metrics, machine learning practitioners can compare models, diagnose their models' behavioral strengths and weaknesses, and identify new targeted datasets to improve model performance.

Collaborated with Angelika Romanou and Professor Martin Jaggi. Talk, poster, and paper published at eXplainable AI for Debugging and Diagnosis at NeurIPS 2021.

[Paper] [Poster] [Code]
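A toy sketch of the comparison idea (illustrative only, not the paper's implementation): once triples have been extracted from each model's snapshot, the models can be compared quantitatively by the overlap of their knowledge graphs. The example triples below are assumptions for illustration.

```python
# Compare two language models by the overlap of knowledge-graph triples
# extracted from each. Triples are (subject, relation, object) tuples.

def triple_overlap(kg_a, kg_b):
    """Jaccard similarity between two sets of extracted triples."""
    a, b = set(kg_a), set(kg_b)
    if not (a | b):
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical snapshots of triples extracted from two model checkpoints.
kg_distilbert = {("Paris", "capital_of", "France"),
                 ("oxygen", "element_of", "air")}
kg_roberta = {("Paris", "capital_of", "France"),
              ("Berlin", "capital_of", "Germany")}

print(triple_overlap(kg_distilbert, kg_roberta))  # 1 shared of 3 total
```

The same measure can be computed per relation type (or per part-of-speech slice) to localize where two checkpoints agree or diverge.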


ONNX: Open Neural Network eXchange

Open Neural Network Exchange (ONNX) is an open standard for machine learning interoperability. Founded by Microsoft and Facebook, and now supported by over 30 other companies, ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

[ONNX + Azure ML Tutorials]

Microsoft MLADS Conference (Machine Learning, AI, and Data Science)

Gave a talk to data scientists and engineers at MLADS Spring 2019 on model operationalization and acceleration with ONNX alongside Emma Ning, Spandan Tiwari, Nathan Yan, and Lara Haidar-Ahmad.

[Notebooks] [Slides]

University of Washington eScience Institute

Overview of AI model interoperability with ONNX and ONNX Runtime for data scientists and researchers at University of Washington, Seattle.



Machine Learning for Humanitarian Data: Tag Prediction using the HXL Standard

Microsoft AI & Research, United Nations OCHA, UC Berkeley

We present a machine learning model to predict tags for datasets from the United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA) with the labels and attributes of the Humanitarian Exchange Language (HXL) Standard for data interoperability. This paper details the methodology used to predict the corresponding tags and attributes for a given dataset with an accuracy of 94% for HXL header tags and an accuracy of 92% for descriptive attributes. Compared to previous work, our workflow provides a 14% accuracy increase and is a novel case study of using ML to enhance humanitarian data.

Collaborated with Elisa Chen, Anish Vankayalapati, Abhay Aggarwal, Chloe Liu (UC Berkeley); Vani Mandava (Microsoft Research); Simon Johnson (UN OCHA). Talk, poster, and short paper published at Social Impact Track at KDD 2019 in Anchorage, Alaska.

[Paper] [Talk Abstract] [Slides] [Poster] [Code]
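To make the task concrete, here is a heavily simplified sketch of header-tag prediction (an assumption for illustration; the published system is a trained classifier, not keyword rules, and the keyword lists below are made up):

```python
# Toy HXL tag prediction: map a dataset column header to an HXL hashtag
# by keyword matching. Illustrative only.

HXL_KEYWORDS = {
    "#adm1": ["province", "region", "state"],
    "#affected": ["affected", "victims"],
    "#date": ["date", "year", "month"],
    "#org": ["organisation", "organization", "agency"],
}

def predict_hxl_tag(header: str, default: str = "#meta") -> str:
    """Return the first HXL tag whose keywords appear in the header."""
    h = header.lower()
    for tag, keywords in HXL_KEYWORDS.items():
        if any(k in h for k in keywords):
            return tag
    return default

print(predict_hxl_tag("Affected persons"))       # #affected
print(predict_hxl_tag("Reporting organization")) # #org
```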


Automating Infrastructure for Data Science Education at Scale

Master's Thesis: Pedagogy, Infrastructure, and Analytics for Data Science Education at Scale

A detailed research report on autograding, analytics, and scaling JupyterHub infrastructure, deployed for thousands of students taking Data 8 at UC Berkeley. Presented as a graduate student affiliated with RISELab.


Deep Knowledge Tracing for Student Code Progression

Knowledge tracing is a body of learning science literature that models student knowledge acquisition through interaction with coursework. This project uses a recurrent neural network (LSTM) to improve prediction of student performance in large-scale computer science classes.

Collaborated with Samuel Lau, Allen Guo, Madeline Wu, Wilton Wu, Professor Zachary Pardos, and Professor David Culler on a short paper published at Artificial Intelligence in Education / International Festival of Learning 2018 in London, England.

[Paper] [Poster]
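For intuition, the standard DKT input encoding can be sketched as follows (an assumed textbook formulation, not necessarily this project's exact pipeline): each interaction (exercise id, correctness) becomes a one-hot vector of length 2 × num_exercises, which the LSTM consumes step by step.

```python
# One-hot encoding of a student interaction for deep knowledge tracing.
# Incorrect attempts occupy indices [0, num_exercises); correct attempts
# occupy [num_exercises, 2 * num_exercises).

def encode_interaction(exercise_id: int, correct: bool, num_exercises: int):
    vec = [0] * (2 * num_exercises)
    vec[exercise_id + (num_exercises if correct else 0)] = 1
    return vec

# A student attempts exercise 1 (wrong), then exercise 1 again (right).
seq = [encode_interaction(1, False, 3), encode_interaction(1, True, 3)]
print(seq[0])  # [0, 1, 0, 0, 0, 0]
print(seq[1])  # [0, 0, 0, 0, 1, 0]
```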

Project Jupyter: Scaling JupyterHub architecture for Data 8

Helped develop the software infrastructure stack for UC Berkeley's data science program, including JupyterHub, autograding with OkPy, Gradescope integration, and authentication for thousands of students.

[Blog] [Code]

Collaborated with Yuvi Panda, Ryan Lovett, Chris Holdgraf, and Gunjan Baid on a talk detailing the infrastructure stack at JupyterCon 2017.

[Slides] [Speaker Profile]
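The kind of deployment described above can be sketched as a minimal JupyterHub configuration (illustrative assumptions, not the actual Data 8 config): each user's notebook server is spawned in a Docker container with capped resources so many concurrent users fit on a cluster.

```python
# Minimal jupyterhub_config.py sketch for a containerized deployment.

c = get_config()  # noqa: F821 — injected by JupyterHub at startup

# Spawn each single-user server in its own Docker container.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "jupyter/scipy-notebook:latest"

# Cap per-user resources so hundreds of concurrent users fit on the cluster.
c.Spawner.mem_limit = "1G"
c.Spawner.cpu_limit = 0.5
```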


CSi2: Idle Server Identification

IBM Research, T.J. Watson Research Center

Recent studies have shown that "zombie" virtual machines in hybrid/private clouds waste millions of dollars' worth of resources. CSi2 is an ensemble machine learning algorithm that detects VM inactivity and suggests a course of action such as termination or snapshotting. It is projected to save IBM Research at least $3.2 million, achieves 95.12% recall and an 88% F1 score (well above industry standard), and is being implemented in the Watson Services Platform. Two patents have been filed.

Collaborated with Neeraj Asthana, Sai Zheng, Ivan Dell'Era, and Aman Chanana.
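As a point of contrast with the ensemble approach, here is the naive baseline it improves on (an illustrative assumption; the CSi2 algorithm itself is not public): flag a VM as idle when its mean CPU utilization over a monitoring window stays below a threshold.

```python
# Naive idle-VM detection baseline: threshold on mean CPU utilization.
from statistics import mean

def is_idle(cpu_samples, threshold=0.05):
    """True if average CPU utilization (as a fraction) is below threshold."""
    return mean(cpu_samples) < threshold

print(is_idle([0.01, 0.02, 0.00, 0.03]))  # True: likely a zombie VM
print(is_idle([0.40, 0.55, 0.60]))        # False: active workload
```

A single threshold like this misses bursty-but-idle machines and penalizes low-utilization active ones, which is why combining multiple signals in an ensemble pays off.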


Neural Style Transfer for Non-Parallel Text

Natural Language Processing

Expanded on an MIT CSAIL paper by Shen et al. to improve the accuracy of neural style transfer for unaligned text using author disambiguation algorithms.

Collaborated with Vasilis Oikonomou and Professor David Bamman.



Blog Post: Deep Sentiment Analysis

Natural Language Processing

A Distill-style (interactive visualization) introduction and literature review of neural networks for Natural Language Processing, specifically in the field of analyzing sentiment and emotion.

Collaborated with Stefan Palombo and Michael Brenndoerfer.



Deep Causal Reward Design

Fairness in Machine Learning

Exploring reward design for reinforcement learning through the framework of causality and fairness. A class project, later expanded by collaborators into a short paper at the CausalML workshop at NeurIPS 2018.



Other Research Projects

Deep DJ: Musical Score Generation for Video

Extracted sentiment from video frames, experimented with GANs for audio, and ultimately applied neural style transfer to audio to generate unique musical tracks for video.

[Blog] [Code]

Goodly Labs: Deciding Force

Advised on a data science project extracting key information about police activity from news articles.

[Project Overview] [Website]

BIDS: Ecosystem Mapping Initiative

Built an ETL pipeline and web scraper to map collaborations between professors and researchers across institutions.



Industry Experience


Wingman Campus Fund

Partner of the Campus Fund, the student venture arm of Wingman Ventures. We evaluate and make pre-seed investments in student startups and learn about VC from talented mentors and entrepreneurs. The Campus Fund operates in Switzerland through ETH Zurich, EPFL, and HSG.

[Wingman Campus Fund]


Software Engineer, AI Frameworks


Worked on ONNX, an open standard for deep learning and ML framework interoperability, alongside an ecosystem of converters, containers, and inference engines.

Led the inter-company ONNX Special Interest Group (SIG) for the Model Zoo and Tutorials with Microsoft, Intel, Facebook, IBM, NVIDIA, Red Hat, and other academic and industry collaborators.

Attended several conferences as a representative of Microsoft AI: WIDS 2020, Microsoft //Build 2019, KDD 2019, Microsoft Research Faculty Summit 2019, UC Berkeley AI for Social Impact Conference 2018, Women in Cloud Summit 2018, RISECamp 2018

2018 - 2020

Research Assistant

Berkeley Institute for Data Science (BIDS), RISELab

Worked on projects in AI + systems applied to data science education. Project areas include JupyterHub architecture, custom deployments, OkPy autograding integration, Jupyter notebook extensions, and D3.js/Plotly visualizations for exploring funding and enrollment data.


2015 - 2018

Research Scientist Intern, Machine Learning

IBM Research

Worked on the CSi2 project as a Machine Learning Research Scientist intern on the Hybrid Cloud team. Presented an exit talk and filed 2 patents.


Software Engineering Intern


Interned at LinkedIn headquarters with the Growth Division's Search Engine Optimization (SEO) team the summer before entering UC Berkeley. Worked on full-stack testing infrastructure for the public profile pages as well as a Hadoop project; outside of assigned work, helped plan LinkedIn's DevelopHER Hackathon and worked on several market research / user experience design initiatives.


CAPE Intern, Made w/ Code Ambassador


Spent a summer learning computer science fundamentals and shadowing engineers through the CAPE internship program at Google Headquarters in Mountain View, CA. Chosen as a Google Ambassador for Computer Science following the experience. Worked with Google, Salesforce, and AT&T to introduce coding to over 15,000 girls across California with the Made w/ Code Initiative.



École Polytechnique Fédérale de Lausanne

PhD in Computer Science
2020 - Current

University of California, Berkeley

Master's in Electrical Engineering and Computer Science
  • President of Computer Science Honor Society (UPE)
  • Head Graduate Student Instructor of Data 8 (Foundations of Data Science)
  • Research Assistant, Graduate Opportunity Fellow at RISELab
  • Claim to fame: Graduated at 20 as youngest graduate of the M.S. in EECS program in Berkeley history
  • Advisor: Dean of Data Sciences, David Culler
2017 - 2018

University of California, Berkeley

Bachelor's in Computer Science
  • EECS Award of Excellence in Undergraduate Teaching and Leadership
  • UC Berkeley Alumni Leadership Scholar
2015 - 2017

Teaching Experience

Machine Learning, Data Analysis, Databases


2020 - 2024

CSE/STAT 416: Introduction to Machine Learning

University of Washington, Seattle
  • Lecturer to 100+ upper-division undergraduate and graduate students on a practical introduction to machine learning. Modules include regression, classification, clustering, retrieval, recommender systems, and deep learning, with a focus on an intuitive understanding grounded in real-world applications.

[CSE 416 Website]

Summer 2020

Data 8: Foundations of Data Science

UC Berkeley
  • Lecturer to 250+ undergraduate students on fundamentals of statistical inference, computer programming, and inferential thinking.

[Data 8 Website] [Course Offering] [Course Materials / Code]

Summer 2018

Data 8: Foundations of Data Science

UC Berkeley
  • TA / Head Graduate Student Instructor (GSI) of Data 8 for 4 semesters, responsible for management of 1000+ undergraduates, 40 GSIs, 30 tutors, and 100+ lab assistants each semester.
  • Helped create data science curriculum material for lecture and domain-specific seminar courses.
  • Developed JupyterHub infrastructure for 1500+ active users (Jupyter servers on a Docker/Kubernetes backend across cloud providers including Google Cloud, Azure, and AWS).
2016 - 2018


  • EPFL Computer Science (EDIC) Fellowship
  • UC Berkeley EECS Award of Excellence for Teaching and Leadership
  • Google International Trailblazer in Computer Science
  • UC Berkeley Graduate Opportunity Fellowship
  • Kairos Society Entrepreneurship Fellow, UC Berkeley
  • President of UPE, UC Berkeley Computer Science Honor Society
  • UC Berkeley Alumni Leadership Scholar
  • NASA-Conrad Foundation Spirit of Innovation Cybertechnology Finalist
  • Girl Scout Gold Award: Bridging the Digital Divide

Speaking Engagements

  • Fall 2021: Spotlight Talk at NeurIPS Inaugural eXplainable AI for Debugging and Diagnosis Workshop
  • Fall 2021: Presenter at the Tamil Internet Conference (INFITT) on "TamilBERT: Natural Language Modeling for Tamil"
  • Spring 2021: Presenter at the EDIC Orientation for PhDs, EPFL
  • Spring 2021: UC Berkeley Data Science Alumni Panel
  • Fall 2020: Featured Guest on the Tech Gals Podcast (Episode 3)
  • Fall 2020: Speaker at the ONNX Workshop
  • Spring 2020: Speaker at Women in Data Science Conference (WIDS 2020 Silicon Valley)
  • Spring 2020: Speaker at the Linux Foundation (LF) AI Day
  • Fall 2019: Presenter at Microsoft Bay Area AI Meetup
  • Summer 2019: Guest on the Microsoft AI Show (Channel 9)
  • Spring 2019: Speaker at Microsoft Machine Learning and Data Science Conference (MLADS) (Redmond)
  • Summer 2018: Presenter at Artificial Intelligence in Education 2018 (London)
  • Summer 2018: Speaker at UC Berkeley's Data Science Undergraduate Pedagogy and Practice Workshop (Berkeley)
  • Fall 2017: Opening Panelist at Salesforce Dreamforce Conference (SF)
  • Summer 2017: Speaker at JupyterCon (NYC)
  • Spring 2017: Presenter at Berkeley Institute for Data Science Research Showcase (Berkeley)
  • Fall 2016: Panelist at SF BusinessWeek Conference (SF)
  • Summer 2016: Conference organizing team at Algorithms for Modern Massive Data Sets (MMDS) (Berkeley)