Supratik Saha

Machine Learning Engineer

Personal Profile

I am a passionate machine learning engineer with industry experience in building data products.

I design and create revenue-generating solutions that require predictive modeling, natural language processing, and deep learning.

I have led cross-functional teams, worked closely with executive leadership, and effectively managed client relationships.

I am interested in senior-level machine learning and data science positions in which I can continue to grow.

Key Skills

  • Deep Learning
  • NLP
  • Machine Learning
  • Computer Vision
  • CNN
  • RNN
  • LSTM
  • Python
  • SQL
  • SAS
  • Java
  • Keras
  • PyTorch
  • TensorFlow
  • Git
  • Tableau
  • AWS
  • Google Cloud Platform
  • Microservices
  • Kafka
  • Kubernetes

Projects

Distracted Driver Detection - OpenCV, Keras, VGG-16

Jul 2020 – Aug 2020

  • Developed a set of models used to identify distracted drivers and classify mode of distraction from a set of driver images
  • Used OpenCV, Keras, TensorFlow and pre-trained VGG-16 networks
  • Model log loss - 0.22855

Quora Duplicate Question Pairs - LSTM, GloVe

Jun 2020 – Jul 2020

  • Built a deep learning NLP model that identifies duplicate question pairs on Quora
  • Used NLP and non-NLP feature extraction techniques, Keras, LSTM and GloVe embedding vectors
  • Model log loss - 0.13499

Forecasting Walmart Store Sales - LightGBM, Keras

Aug 2020 – Sep 2020

  • Developed a set of models used to forecast uncertainty distributions in retail sales of Walmart stores
  • Used an ensemble of LightGBM and Keras models with embeddings to forecast the uncertainty distributions
  • Model weighted scaled pinball loss - 0.25547

Dog Breed Image Classification - PyTorch

Apr 2020 – May 2020

  • Created an app that identifies the dog breed in an image or identifies that the image is of a human and assigns a dog breed that resembles the person
  • Built the project using Python, CNNs, PyTorch and transfer learning
  • Identified dogs or humans successfully in over 90% of cases

Courses and Certifications

Natural Language Processing with Attention Models

Coursera   |   Dec 2020

Certificate Link

Machine Learning Engineer Nanodegree

Udacity   |   May 2020

Certificate Link

Extreme Gradient Boosting with XGBoost

DataCamp   |   May 2019

Certificate Link

Work Experience

SAP - Sr. Data Scientist/Machine Learning Engineer

Greater LA Area, CA   |   Jan 2020 - Present

Tools & Languages: Python, SQL, Machine Learning, Linear Optimization, Google Cloud Platform, Java, Kafka, Docker, Kubernetes, Jenkins, Argo CD, Git, Grafana, Test Driven Development

  • Designed & engineered a Machine Learning product to recommend trade promotions for 2 pilot retailers that reduced manufacturer effort to match trade claims by 90%
  • Leading a team of 6 Machine Learning Engineers at Eureka – a cloud solutions startup that is at the forefront of the transition of SAP from on-premise to cloud
  • Responsible for conceptualizing, designing, co-writing, testing & deploying 3 microservices in Python & Java that form the core of the new K-native asynchronous distributed software

Cox Auto (Kelley Blue Book) - Sr. Data Scientist

Greater LA Area, CA   |   Jan 2017 - Jan 2020

Tools & Languages: Python, SQL, Gradient Boosted Trees, Deep Learning, Generalized Linear Models, XGBoost, SAS, Tableau, Enterprise Miner, Enterprise Guide, SageMaker, AWS Lambda, AWS Redshift, Git

  • Developed machine learning products for auto OEM clients earning $500,000 in revenue
  • Forecasted used-car prices at a bi-monthly cadence for 10,000 different combinations using an ensemble of deep learning and LightGBM models, which facilitated lease vehicle pricing for OEMs and dealers
  • Promoted to lead of the data science team responsible for the residual product portfolio at Kelley Blue Book, processing data assets in the range of 500 million to 1 billion transactions
  • Conceptualized interactive Tableau dashboards as the front end of ML products for clients

Deloitte Consulting - Sr. Consultant - Analytics

Greater New York City, NY; Atlanta, GA   |   Jul 2016 - Jan 2017

Tools & Languages: Python, SQL, Tableau

  • Designed a 3-year strategy roadmap to build the analytics org of a medical devices client
  • Recommended 6 high-impact data science projects, after evaluating data from different client departments, to kick-start and prioritize revenue-earning and cost-saving initiatives
  • Assessed sales impact by performing a Price Volume Mix analysis for a CPG client using Python & Tableau

UNC - Tech Lead – Data Science (Grad Practicum)

Research Triangle Park, NC   |   Sep 2015 - May 2016

Tools & Languages: Python, SQL, Gradient Boosted Trees, Text Mining, NLP, Generalized Linear Models, K-Means Clustering, SAS, R Shiny, Tableau, Enterprise Miner, Enterprise Guide

  • Mined text data from 7 years of patient surveys to predict safety incidents
  • Devised an app using R Shiny to enable automatic incident flagging
  • Led a team of 5 data scientists as the technical lead
  • Built decision tree models to predict the likelihood of response to emergency room questionnaires

Education

North Carolina State University

Master of Science in Analytics   |   Jun 2015 - May 2016

West Bengal University of Technology

Bachelor of Technology - Computer Science & Engineering