Naveed Mohiuddin
Data Engineer
Hands-on experience in production ETL, distributed processing, and data modeling across AWS and Azure — backed by two AWS certifications and a Master's in Computer Science.
About Me
Engineer, not just a resume.
I'm a Data Engineer with experience building cloud-native data platforms that serve real business needs — from policy and claims pipelines at GEICO to serverless lakehouse architectures on AWS. My work sits at the intersection of software engineering and data infrastructure: I design ETL systems, optimize distributed processing, model dimensional data, and automate everything I can.
I hold a Master's in Computer Science from Illinois Institute of Technology and a Bachelor's from Osmania University. I'm AWS certified in both Solutions Architecture and Data Engineering, and I've worked across AWS and Azure stacks in production settings. I care about building systems that are reliable, cost-efficient, and maintainable — not just technically interesting.
Tech Stack
Tools I work with daily.
Production-tested across AWS, Azure, and open-source data ecosystems.
Certifications
AWS Certified, production validated.
Industry-recognized credentials demonstrating cloud architecture and data engineering depth.
AWS Certified Data Engineer – Associate
Validates expertise in designing and maintaining data pipelines on AWS, covering data stores, data processing, security, and governance.
AWS Certified Solutions Architect – Associate
Demonstrates ability to design secure, scalable, and cost-optimized cloud architectures using AWS services.
Experience
Where I've built things that matter.
Production systems, real data, measurable outcomes.
Data Engineer
Jul 2025 – Present · Benda Infotech — Remote, US
- Architected serverless ETL pipelines on AWS (S3, Lambda, Glue) processing 1M+ records daily at 99.9% reliability
- Developed PySpark transformation jobs in Glue, cutting batch processing time by 40% through join optimization and partition tuning
- Designed dimensional star-schema models loaded into Redshift, boosting dashboard query performance by 35%
- Implemented Airflow scheduling with SLA monitoring, reducing manual intervention by 60%
- Optimized Athena queries via partition pruning and compression, lowering execution costs 30–40%
- Built CI/CD pipelines that cut release cycles from 2 days to hours while enforcing 99%+ data-quality validation pass rates
Software Engineer (Data Engineering)
Feb 2022 – Jul 2023 · Applied Information Sciences · Client: GEICO — Hyderabad, India → Remote, US
- Engineered Azure Data Factory pipelines ingesting policy, claims, and CRM data (10M+ records) into ADLS Gen2
- Built distributed PySpark transformations in Databricks, reducing runtime by 30% via auto-scaling and partition optimization
- Implemented Delta Lake curated layers with SCD Type 2 MERGE logic, reducing reconciliation issues by 25%
- Developed Kafka-based streaming workflows that cut data latency from hours to near real-time
- Automated CI/CD deployment of ADF pipelines via Azure DevOps, reducing deployment errors by 40%
- Delivered analytics-ready datasets to Synapse, improving reporting turnaround by 50%
Projects
Engineering projects, not just exercises.
Each project solves a real data engineering problem end-to-end.
AWS Lakehouse & Analytics Platform
End-to-end serverless data lake with dimensional modeling and query-optimized analytics layer on AWS.
- Serverless ETL with Glue and PySpark for schema evolution and data cleansing
- Dimensional models loaded into Redshift with Airflow-orchestrated workflows
- Cost-optimized Athena queries with partitioning and Iceberg table format
- Automated ingestion from public APIs with deduplication logic
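As a minimal sketch of the deduplication step above (illustrative only — the function and field names `id` and `updated_at` are assumptions, and the production pipeline implements this pattern with PySpark window functions rather than plain Python):

```python
from datetime import datetime

def dedupe_latest(records, key="id", ts="updated_at"):
    """Keep only the most recent record per business key.

    Mirrors the keep-latest-row-per-key pattern used when landing
    API extracts, before loading curated tables.
    """
    latest = {}
    for rec in records:
        k = rec[key]
        # Replace the stored record if this one is newer.
        if k not in latest or rec[ts] > latest[k][ts]:
            latest[k] = rec
    return list(latest.values())

# Hypothetical raw extracts: id 1 appears twice with different timestamps.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "status": "open"},
    {"id": 1, "updated_at": datetime(2024, 2, 1), "status": "closed"},
    {"id": 2, "updated_at": datetime(2024, 1, 5), "status": "open"},
]
print(dedupe_latest(rows))
```

In Spark the same logic is typically expressed as `row_number()` over a window partitioned by the key and ordered by the timestamp descending, keeping row 1.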
Big Data Processing with Spark
Large-scale distributed data processing on Hadoop clusters using Spark DataFrames and GCP Dataproc.
- Distributed processing using Spark DataFrames on HDFS-backed clusters
- Schema validation, data cleansing, and aggregation pipelines
- Optimized shuffle operations and join strategies for performance
- Deployed on GCP Dataproc for managed cluster orchestration
Why Hire Me
What I bring to your team.
Not just skills on paper — here's how they translate to real value.
Production Engineering Experience
Hands-on work building and supporting data systems at enterprise scale — not just tutorials or coursework.
AWS Certified
Two AWS certifications validating cloud architecture and data engineering competency.
Full-Stack Data Skills
End-to-end capability across ingestion, transformation, modeling, orchestration, and analytics.
Measurable Impact
Consistent track record of reducing costs, improving performance, and automating manual processes.
Modern Tooling
Fluent in PySpark, Airflow, Delta Lake, Kafka, and both AWS and Azure ecosystems.
Strong Academic Foundation
Master's in CS from Illinois Institute of Technology, combining depth with practical engineering.
Interested in discussing data engineering opportunities?
I'm actively looking for Data Engineer roles. Let's talk about how I can contribute to your team.
Contact
Let's connect.
I'm actively open to Data Engineer opportunities. If my background aligns with your team's needs, I'd love to hear from you.