Shrey Patel
Portfolio 2026
Open to work
Data & ML Engineer

Building the infrastructure for intelligence

Streaming lakehouses · Real-time fraud pipelines · LLM safety systems

Where I've built

Data Engineer

Dec 2025 — Present
Sepal AI → Anthropic (Vendor)

Designing closed-loop LLM evaluation frameworks with synthetic observability data and automated grading pipelines. Built representative test data generation for safety-critical RCA workflows. Reduced broken evaluation runs by ~70%.

LLM Safety · Great Expectations · CI/CD · Synthetic Data · ClickHouse · Grafana · Prometheus

Data & AI Engineer

Aug 2025 — Dec 2025
EncryptMyWork

Multi-stage classification for abuse detection and NSFW content moderation. Client-side LLM moderation with CLIP/MiniLM-L6-v2 embeddings. HIPAA/GLBA compliant; PHI/PII recall ≥98% at ≤3% FPR (F1 0.95–0.97).

PyTorch · CLIP · FastAPI · Django · Kafka · Flink

AI Data Engineer

Nov 2024 — Present
Walmart

Azure Lakehouse processing 5+ TB daily across $500M+ in transactions. Reduced data latency by 65%, cutting reporting time from 48h to <6h. Real-time fraud detection with Spark Structured Streaming. AI recommendation systems: 1M+ interactions, +22% engagement.

Azure · Delta Lake · Databricks · PySpark · MLflow · Terraform

Data Engineer

Jan 2021 — Aug 2023
Kenexai (fka Ridgeant Technologies)

Built partner analytics backbone on Snowflake + Iceberg + dbt + Airflow. Deployed dynamic pricing on SageMaker driving +35% annual revenue. Real-time DDP handling billions of events/day at 99.9% availability.

Snowflake · Iceberg · dbt · Airflow · SageMaker · Kafka

Software Engineer Intern

Apr 2020 — Dec 2020
ZF Group

Prototyped streaming ETL in Flink, built FastAPI prediction services with OpenTelemetry tracing, and set up CI/CD with reusable GitHub Actions.

Flink · FastAPI · OpenTelemetry · GCP · BigQuery

Selected Works

01

Rust HFT Risk Engine

Real-time risk engine streaming synthetic venue data with pre-trade credit and price collar checks. Events are PTP-timestamped and written to ClickHouse with deterministic replay.

Rust · Tokio · ClickHouse · Prometheus · Grafana
<100µs p50 Latency
<500µs p99 Latency
10M+ Zero Loss
02

Latent Diffusion Engine

Latent-diffusion pipeline with point-in-time content embeddings, trained on 10 H100 nodes with ArtBench + Open Images data.

PyTorch · U-Net · VAE · CLIP · ONNX · TensorRT
−20% FID
1.84s Inference
10×H100 HPC Scale
03

Notion Data Spine Clone

CRDT collaborative editing on Kafka, Postgres, and gRPC with backpressure handling and idempotent writes.

Kafka · Postgres · gRPC · CRDTs · Docker · Prometheus
3K Ops/sec
<30ms Median Lat
1000+ Concurrent
04

HPC Image Captioning

Memory-efficient DDP training for CNN-RNN captioning using gradient checkpointing, with crash recovery, across DenseNet169, InceptionV3, and ResNet50 encoders.

PyTorch DDP · Slurm · MPI · CNN-RNN
+15% BLEU
−40% Memory
−50% Train Time

Credentials

Certifications

OCI Multicloud Architect Professional
Oracle · 2025

Enterprise multi-cloud architecture design across AWS, Azure, and OCI platforms.

OCI Cloud Database Services Professional
Oracle · 2025

Cloud database design, migration patterns, and autonomous database management.

OCI Data Science Professional
Oracle · 2025

Machine learning model lifecycle, feature stores, and model deployment on OCI.

OCI Generative AI Professional
Oracle · 2025

LLM integration, RAG architecture, and prompt engineering on Oracle Cloud.

NVIDIA Certified Professional GenAI/LLMs
NVIDIA · 2026

GPU-accelerated LLM inference, TensorRT optimization, and multi-node training.

Databricks ML Professional
Databricks · 2026

MLflow lifecycle, feature engineering, AutoML, and production MLOps on Unity Catalog.

McKinsey Forward Program
McKinsey.org · 2025

Strategic problem-solving, structured communication, and leadership development.

AWS Academy Cloud Foundations
AWS · 2021

Core AWS services, security, architecture, and cloud economics fundamentals.

AWS Academy Machine Learning
AWS · 2021

SageMaker model training, deployment pipelines, and ML infrastructure on AWS.

Education

Sep 2023 — May 2025
Northeastern University
M.S. Computer Software Engineering
GPA: 3.4/4.0

Generative AI, HPC with Deep Learning, Big Data & Indexing. Co-founder at CareWallet.

Jun 2018 — May 2022
Ganpat University
B.Tech Computer Science · Big Data
GPA: 3.8/4.0

Probability & Statistics, Cloud Computing. 2× GCP Quest Leader. Google Code-in Mentor.

Contributions

ShreyPatel4
@ShreyPatel4
249 Commits · 42 PRs · 19 Repos
Python 47% · Shell 29% · JavaScript 24%

Word of mouth

"Shrey is rare in that he combines strong engineering depth with clear product thinking. He's thoughtful about schemas, data quality, and monitoring..."

Harsh Kakadiya
Lead Data Engineer, Kenexai

"He managed end-to-end data workflows — handling everything from source extraction and complex transformations to delivering well-structured data marts."

Chandan Kamal
Lead Data Engineer, Kenexai

"His proactive stance on identifying and mitigating potential challenges underscores his critical role in our risk management strategies."

Grant Chau
Graduate Research Assistant, Dartmouth

"Shrey offered insights that added strength to our technological foundation, paving the way for operations that scale."

Evan Smith
Co-Founder, Eden

Let's talk

Or let Spectra set up a call (Voice coming soon)

Or reach me directly: patelshrey77@gmail.com