Full-Stack Data Scientist · ML · MLOps · Systems Architect
Senior-level data scientist and sole architect of two production systems at Devon Energy: a real-time frac data acquisition platform with integrated AI agents, and an ML-driven forecasting platform (decision trees + logistic regression, full Azure MLOps) supporting a $350M procurement program. Broad ML coverage across classical models, deep learning, and LLMs, with the communication skills to bridge data science and business stakeholders.
Selected Outcomes
$350M
Procurement program powered by ML forecasting model (decision trees + logistic regression)
$2.5M
Projected annual savings from the frac data acquisition platform
$40K/mo
Recurring savings from eliminating third-party vendor software
97%+
Field data reliability after pipeline and device overhaul (up from <80%)
Systems I've Built
The systems below are production platforms I designed and shipped end-to-end — ML models, cloud architecture, data pipelines, authentication, APIs, frontend, testing, and deployment. Each is in daily use by engineering and operations teams at Devon Energy.
Sole architect of an end-to-end ingestion, validation, and AI-assisted backfill platform that replaced a third-party vendor pipeline. It supports live frac fleet operations across Devon's field footprint and combines integrated agents, automated fallback, an LLM-powered backfill engine, and a React monitoring app used daily by completions engineers.
React/TypeScript + Node.js platform on Azure supporting a $350M casing procurement program. Built a production forecasting model combining decision trees and logistic regression with end-to-end MLOps. The model is continuously re-tuned against drilling and completion schedule changes to produce exact per-SKU counts and delivery windows for the forward year.
Fine-tuned and deployed a BERT-based question-answering pipeline extracting structured fields from 5,000+ scanned TIFF images per day at RMS. Replaced a manual review workflow, reducing turnaround and human error while achieving an F1 score of 0.90.
Built AI-powered validation agents using Pydantic AI for casing pressure analysis, with structured input/output models and custom tool functions. Authored an agentic coding toolkit with Snowflake integration plugins, query-validation hooks, and developer productivity automations. Integrated the OpenAI SDK for intelligent data enrichment.
ML Coverage
I've shipped across classical ML, deep learning, and LLM agents — production, not just notebooks. My view: LLMs are one tool in the box. The right model depends on the problem, the data, and what you can operate at scale.
Tree-based models (decision trees, random forest, gradient boosting), logistic regression, linear regression, clustering (k-means, hierarchical), time-series forecasting, scikit-learn, XGBoost
BERT fine-tuning, PyTorch, Hugging Face Transformers, CNN architectures, OpenCV, Tesseract OCR
OpenAI SDK, Pydantic AI, LLM-based agents for structured extraction (PDF/CSV → schemas), tool-using agents, prompt engineering, structured I/O
Automated retraining, drift monitoring, model observability, containerized deployment on Azure, production error handling, A/B-style model comparison
Regression analysis, hypothesis testing, distributional analysis, exploratory data analysis, time-series decomposition, feature engineering
Power BI, Sisense, Recharts, Plotly.js, MUI Data Grid — dashboards and operator-facing UIs for non-technical stakeholders
Experience
Skills
Production-grade ML, cloud, and full-stack — enterprise code with error handling, testing, and observability.
Python, SQL, TypeScript, JavaScript, Bash · Snowflake, OSIsoft PI, PySpark, Pandas, Hive/Hadoop-style SQL-on-big-data patterns
Decision trees, logistic regression, random forest, XGBoost, clustering, BERT, PyTorch, Hugging Face, LLM agents, OpenAI SDK, Pydantic AI, scikit-learn, time-series forecasting
Automated retraining, drift monitoring, model observability, containerized deployment, production error handling, regression testing on model outputs
App Service · Blob Storage (SAS URLs) · Container Registry · Azure AD / MSAL · Logic Apps · Microsoft Graph API
PySpark pipelines, high-frequency ingestion (~80 files/fleet/day), gap/spike validation, stream processing into time-series historians
React 18, TypeScript, Vite, Material-UI, Recharts, Plotly.js, React Query, React Hook Form, Zod
Node.js, Express, Flask, REST APIs, node-cron, multi-agent pipeline design
Docker (multi-stage), Nginx, Azure Container Registry, CI/CD, Playwright (E2E), Vitest, Testing Library
Power BI, Sisense, Recharts, Plotly.js, MUI Data Grid — dashboards for technical and non-technical audiences
Regression, hypothesis testing, distributional analysis, exploratory analysis, feature engineering
Education
In Progress · Started October 2025
M.S. Data Science
B.S. Computer Science
$12,000 prize · 2020
Contact
Looking for teams solving real production problems with ML — predictive, prescriptive, or agentic. Happy to walk through system architecture, talk through specific problems you're solving, or share more detail on any of the work above.