Vinitra Swamy

UC Berkeley · Data 8 Summer Instructor · Deep Learning Research · vinitra@berkeley.edu

Summer Instructor (to 250+ undergrads) for UC Berkeley's 2018 offering of Data 8
Graduate Data Science Research Assistant at RISE Lab
President of CS Honor Society UC Berkeley UPE

Hello! I'm currently a data science teaching instructor at UC Berkeley. I completed my Bachelor's in Computer Science (also at UC Berkeley) 2 years early and finished my M.S. in Electrical Engineering and Computer Sciences (EECS) in 2018 as a UC Berkeley Graduate Opportunity Fellow, advised by Dean of Data Sciences David Culler.

My research focuses on Deep Learning for computing education at scale (JupyterHub, Kubernetes, Docker). I'd love to talk about data science pedagogy, scaling technical infrastructure for 40,000-student edX MOOCs, NLP research, Deep Knowledge Tracing, identifying the best boba order, poker strategy, good books, great screenplays, and pretty much anything under the sun. Feel free to reach me using the buttons below.

Next steps? I will be joining Microsoft AI + Research full-time after my summer as a data science lecturer. Thank you for taking time out of your day to find out what I do with mine!

Research

CSi2 - Idle Server Identification

IBM Research, T.J. Watson Research Center

Recent studies have shown that "zombie" virtual machines in hybrid/private clouds waste millions of dollars' worth of resources. CSi2 is an ensemble machine learning algorithm that detects inactive VMs and suggests a course of action, such as termination or snapshot. It is projected to save IBM Research at least $3.2 million, with 95.12% recall and an 88% F1 score (well above the industry standard), and is being implemented in the Watson Services Platform. Two patents have been filed, and papers are in progress.

Collaborated with Neeraj Asthana, Sai Zheng, Ivan Dell'Era, Aman Chanana.

Summer 2017

Deep Knowledge Tracing for Student Code Progression

Master's Thesis

Knowledge Tracing is a body of learning science literature that seeks to model students' knowledge acquisition through their interactions with coursework. This project uses a recurrent neural network (LSTM) to improve prediction of student performance in large-scale computer science classes.

Collaborated with Samuel Lau, Allen Guo, Madeline Wu, Wilton Wu, Professor Zachary Pardos, Professor David Culler.

2017 - 2018

Neural Style Transfer for Non-Parallel Text

Natural Language Processing

Expanded on an MIT CSAIL paper by Shen et al. to improve the accuracy of neural style transfer for unaligned text using author disambiguation algorithms.

Collaborated with Vasilis Oikonomou and Professor David Bamman.


2017

Blog Post: Deep Sentiment Analysis

Natural Language Processing

A Distill-style (interactive visualization) literature review of neural networks for Natural Language Processing, specifically in the field of analyzing sentiment and emotion.

Collaborated with Stefan Palombo and Michael Brenndoerfer.


2017

Deep Causal Reward Design

Fairness in Machine Learning

Exploring reward design for reinforcement learning through the framework of causality and fairness.

2017

Experience

Research Scientist Intern (Machine Learning)

IBM Research

Worked on the CSi2 project as a Machine Learning Research Scientist intern on the Hybrid Cloud team. Presented an exit talk, filed 2 patents, and am currently working on a research paper.

Summer 2017

Teaching: Foundations of Data Science

UC Berkeley Data 8

Head GSI of Data 8 for 3 semesters, responsible for management of 1000+ undergraduates, 40 GSIs, 30 tutors, and 100+ lab assistants. Helped create data science curriculum material for the large lecture and domain-specific seminar courses. Also in charge of maintaining JupyterHub infrastructure for 1500+ active users (with Docker/Kubernetes/Google Compute Engine backend).

Fall 2016 - Present

Software Engineering Intern

LinkedIn

Interned at LinkedIn headquarters with the Growth Division's Search Engine Optimization (SEO) Team the summer before entering UC Berkeley. Worked on full-stack testing infrastructure for the public profile pages, as well as a minor Hadoop project. Outside of assigned work, helped plan LinkedIn’s DevelopHER Hackathon and worked on several Market Research/User Experience Design initiatives.

June 2015 - Aug 2015

CAPE Intern, Made w/ Code Ambassador

Google

Spent a summer learning the fundamentals of Computer Science, algorithms, and computational thinking at Google Headquarters in Mountain View, CA. Chosen as a Google Ambassador for Computer Science following the experience. Worked with Google, Salesforce, and AT&T to introduce coding to over 15,000 girls across California with the Made w/ Code Initiative.

July 2010 - December 2011

Education

University of California Berkeley

Master's in Electrical Engineering and Computer Sciences

President of Computer Science Honor Society (UPE)
Head Graduate Student Instructor of Foundations of Data Science
Advisor: Dean of Data Sciences, Friesen Professor in EECS, David Culler

August 2017 - May 2018

University of California Berkeley

Bachelor's in Computer Science

August 2015 - August 2017

Skills and Technical Proficiencies

Programming Languages & Tools
  • Python
  • C++
  • Java
  • SQL

Data Science Toolkit
  • Machine Learning and Data Manipulation with scikit-learn, NumPy, SciPy, pandas, R
  • Deep Learning with PyTorch, TensorFlow, Keras
  • Visualization with D3.js, Plotly, Matplotlib, Seaborn, Tableau
  • Natural Language Processing with NLTK, Word2Vec

Software Engineering Toolkit
  • Containerized services for quick deployment (Docker, Kubernetes)
  • Clean, commented, modularized code
  • Understanding of testing infrastructure
  • Version control proficiency (Git, GitHub, SVN)
  • Text editing with Sublime, Atom, Vim, IDEs

Speaking Engagements

  • Lecturer for UC Berkeley's Data 8 Summer 2018, alongside Fahad Kamran (250+ students)
  • Presenter at Artificial Intelligence in Education 2018 in London, England
  • Speaker at UC Berkeley's Data Science Undergraduate Pedagogy and Practice Workshop
  • Speaker at JupyterCon NYC 2017
  • Opening Panelist at Salesforce Dreamforce 2017
  • Panelist at SF BusinessWeek Conference 2016
  • Presenter at Berkeley Institute for Data Science Research Showcase
  • Organizer at MMDS 2016
  • Speaker at Google CAPE Award Ceremony

Awards & Certifications

  • Google Trailblazer in Computer Science
  • UC Berkeley EECS Award of Excellence for Teaching and Leadership
  • UC Berkeley Graduate Opportunity Fellowship Recipient
  • Cal Alumni Leadership Scholar
  • NASA-Conrad Foundation Spirit of Innovation Finalist
  • Girl Scout Gold Award: Bridging the Digital Divide