-
Principal Machine Learning Engineer · Cerby
Sep 2025 to Present
Led the design of AI-driven workflow code generation using transformer-based language models and graph machine learning, modeling user interactions, DOM structure, and execution flows as graphs to generate, validate, and evolve CHAOS automations. Architected multimodal ML systems leveraging Scout and Replayer data (DOM, rrweb events, screenshots, videos, and failure traces) to power error classification, failure clustering, and drift detection in production workflows. Built and owned scalable end-to-end ML infrastructure on AWS, including data pipelines, SageMaker training pipelines, model deployment, and monitoring, enabling reliable iteration toward self-healing automations.
-
Principal Machine Learning Engineer · Vesta Corporation
Jan 2025 to Sep 2025
Led the design and deployment of graph-based fraud detection systems using GNNs and custom graph algorithms, including JA-GNN, which outperformed state-of-the-art methods on proprietary and public datasets and improved risk attribution by over 20% across multi-block transaction networks. Built distributed graph pipelines with PySpark, DGL, Neo4j, and Redis, including a custom in-memory graph library enabling real-time subgraph extraction, similarity scoring, and risk propagation.
-
Senior Machine Learning Engineer · Vesta Corporation
Jan 2022 to Jan 2025
Engineered scalable data pipelines for temporal and graph-based feature generation over billions of transactions, enabling efficient training and inference for large-scale fraud models. Delivered end-to-end ML solutions on Azure, containerized and deployed on Azure Kubernetes Service (AKS), integrating graph outputs into downstream risk models and real-time scoring workflows. Collaborated with investigators and cross-functional teams to convert user stories into graph-native features, and deployed models using open-source ML frameworks such as PyTorch and DGL.
-
Machine Learning and Optimization Engineer (Intern) · Waters Corporation
Jan 2021 to Jul 2021
Built and optimized end-to-end ML pipelines that reduced liquid chromatograph equilibration time by 30%, applying advanced signal decomposition to time-series sensor data. Integrated Amazon Redshift with PySpark-based ETL pipelines and deployed a Streamlit application on EC2, streamlining real-time data ingestion and SME collaboration for model lifecycle management.
-
Teaching Assistant · Northeastern University
Aug 2020 to Jun 2021
Teaching Assistant for CS4100: Artificial Intelligence (Prof. Christopher Amato) and CS5100: Artificial Intelligence (Prof. Stacy Marsella).
-
Data Scientist · Maersk Tankers
Mar 2019 to Dec 2019
Developed HPC-ready genetic optimization models for vessel routing, integrating climatic and onboard sensor data to achieve a 14% improvement in fuel efficiency. Implemented a Flask-based web app to support the transition of vessels from HSFO to VLSFO, containerized and deployed on Kubernetes for real-time, high-performance fuel management. Partnered with operations and technology teams to develop distributed, data-driven solutions using PySpark pipelines, optimizing maritime logistics and reducing operational costs.
-
Data Analyst · Accenture Solutions Private Limited
Nov 2016 to Mar 2019
Built and deployed predictive pricing models using Spark MLlib on high-volume utility data, enabling real-time rate adjustments for electric and gas clients. Collaborated with data engineers and domain experts to streamline data pipelines and integrate open-source ML frameworks for more efficient model experimentation.