Hello! I am an AI engineer and researcher working on deep learning framework interoperability at Microsoft. I currently work on the Open Neural Network eXchange project (ONNX), an open format to represent deep learning models and translate between AI frameworks.
Before moving to Seattle, I completed my B.A. and M.S. in Computer Science at UC Berkeley (’17, ’18, go bears!), graduating two years early, and was a summer lecturer for the Division of Data Sciences.
I love people, data, and working on exciting problems at the intersection of the two.
Thank you for taking time out of your day to find out what I do with mine!
Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models, both deep learning and traditional ML. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring).
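As a rough illustration of what "a computation graph with built-in operators" means, here is a toy sketch in plain Python. It does not use the onnx package or its real API; the names `OPS`, `Node`, and `run_graph` are made up for this example, and tensors are simplified to flat lists of floats.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operator registry (not the ONNX operator set):
# operator name -> plain-Python implementation on lists of floats.
OPS: Dict[str, Callable[..., List[float]]] = {
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}


@dataclass
class Node:
    op_type: str       # which built-in operator to apply
    inputs: List[str]  # names of the values this node reads
    output: str        # name of the value this node produces


def run_graph(nodes: List[Node], feeds: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Evaluate nodes in order; assumes they are already topologically sorted."""
    values = dict(feeds)
    for node in nodes:
        if node.op_type not in OPS:
            raise ValueError(f"unknown operator: {node.op_type}")
        args = [values[name] for name in node.inputs]
        values[node.output] = OPS[node.op_type](*args)
    return values


# y = Relu(x * w + b), expressed as three named nodes.
graph = [
    Node("Mul", ["x", "w"], "xw"),
    Node("Add", ["xw", "b"], "z"),
    Node("Relu", ["z"], "y"),
]
result = run_graph(graph, {"x": [1.0, -2.0], "w": [3.0, 3.0], "b": [0.5, 0.5]})
print(result["y"])  # [3.5, 0.0]
```

Because the model is just named nodes plus a shared operator vocabulary, any tool that understands the same vocabulary can execute or translate it, which is the idea behind exchanging models between frameworks.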
Gave a talk to data scientists and engineers at MLADS Spring 2019 on model operationalization and acceleration with ONNX alongside Emma Ning, Spandan Tiwari, Nathan Yan, and Lara Haidar-Ahmad.
Overview of AI model interoperability with ONNX and ONNX Runtime for data scientists and researchers at University of Washington, Seattle.
Microsoft //Build 2019, KDD 2019, Microsoft Research Faculty Summit 2019, UC Berkeley AI for Social Impact Conference 2018, Women in Cloud Summit 2018, RISECamp 2018
We present a machine learning model to predict tags for datasets from the United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA) with the labels and attributes of the Humanitarian Exchange Language (HXL) Standard for data interoperability. This paper details the methodology used to predict the corresponding tags and attributes for a given dataset with an accuracy of 94% for HXL header tags and an accuracy of 92% for descriptive attributes. Compared to previous work, our workflow provides a 14% accuracy increase and is a novel case study of using ML to enhance humanitarian data.
Collaborated with Elisa Chen, Anish Vankayalapati, Abhay Aggarwal, Chloe Liu (UC Berkeley); Vani Mandava (Microsoft Research); Simon Johnson (UN OCHA). Talk, poster, and short paper published at the Social Impact Workshop at KDD 2019 in Anchorage, Alaska.
A detailed research report on autograding, analytics, and scaling JupyterHub infrastructure, as used by thousands of students taking Data 8 at UC Berkeley. Presented as a graduate student affiliated with RISELab.
Knowledge Tracing is a body of learning science literature that seeks to model students' knowledge acquisition through their interaction with coursework. This project uses a recurrent neural network (LSTM) to optimize prediction of student performance in large-scale computer science classes.
Collaborated with Samuel Lau, Allen Guo, Madeline Wu, Wilton Wu, Professor Zachary Pardos, and Professor David Culler on a short paper published at Artificial Intelligence in Education / International Festival of Learning 2018 in London, England.
Helped develop UC Berkeley data science's software infrastructure stack, including JupyterHub, autograding with OkPy, Gradescope, and authentication for thousands of students.
Collaborated with Yuvi Panda, Ryan Lovett, Chris Holdgraf, and Gunjan Baid on a talk detailing the infrastructure stack at JupyterCon 2017.
Recent studies have shown that "zombie" virtual machines in hybrid/private clouds waste millions of dollars' worth of resources. The CSi2 algorithm is an ensemble machine learning algorithm that detects inactive VMs and suggests a course of action such as termination or snapshot. It is projected to save IBM Research at least $3.2 million, with 95.12% recall and an 88% F1 score (well above the industry standard), and is being implemented into the Watson Services Platform. Two patents have been filed and papers are being finalized.
Collaborated with Neeraj Asthana, Sai Zheng, Ivan D'ell Era, Aman Chanana.
Expanded on an MIT CSAIL paper by Shen et al. to improve the accuracy of neural style transfer for unaligned text using author disambiguation algorithms.
Collaborated with Vasilis Oikonomou and Professor David Bamman.
A Distill-style (interactive visualization) introduction and literature review of neural networks for Natural Language Processing, specifically in the field of analyzing sentiment and emotion.
Collaborated with Stefan Palombo and Michael Brenndoerfer.
Exploring reward design for reinforcement learning through the framework of causality and fairness. Class project later expanded by collaborators into a short paper at the CausalML workshop at NeurIPS 2018.
Extracting sentiment from video frames, experimenting with GANs for audio, and ultimately using a neural style transfer technique for audio to generate unique musical tracks for video.
Advising on a data science project to extract key information surrounding police activity from news articles.
ETL pipeline and web scraper to build a graph of collaborations between professors and researchers across institutions.
Working on a framework for deep learning / ML framework interoperability (ONNX) alongside an ecosystem of converters, containers, and inference engines.
Worked on projects in AI + Systems with an application area of data science education. Project areas include JupyterHub architecture, custom deployments, OkPy autograding integration, Jupyter notebook extensions, and D3.js / Plotly visualizations for data science explorations of funding and enrollment data.
Worked on the CSi2 project as a Machine Learning Research Scientist intern on the Hybrid Cloud team. Presented an exit talk and filed 2 patents.
Interned at LinkedIn headquarters with the Growth Division's Search Engine Optimization (SEO) Team the summer before entering UC Berkeley. Worked on full-stack testing infrastructure for the public profile pages, as well as a minor Hadoop project; outside of assigned work, helped plan LinkedIn’s DevelopHER Hackathon and worked on several Market Research/User Experience Design initiatives.
Spent a summer learning the fundamentals of Computer Science, algorithms, and computational thinking at Google Headquarters in Mountain View, CA. Chosen as a Google Ambassador for Computer Science following the experience. Worked with Google, Salesforce, and AT&T to introduce coding to over 15,000 girls across California with the Made with Code initiative.