Naveed Mohiuddin
Data Engineer
Hands-on experience in production ETL, distributed processing, and data modeling across AWS and Azure — backed by two AWS certifications and a Master's in Computer Science.
About Me
Engineer, not just a resume.
I'm a Data Engineer with experience building cloud-native data platforms that serve real business needs — from policy and claims pipelines at GEICO to serverless lakehouse architectures on AWS. My work sits at the intersection of software engineering and data infrastructure: I design ETL systems, optimize distributed processing, model dimensional data, and automate everything I can.
I hold a Master's in Computer Science from Illinois Institute of Technology and a Bachelor's from Osmania University. I'm AWS certified in both Solutions Architecture and Data Engineering, and I've worked across AWS and Azure stacks in production settings. I care about building systems that are reliable, cost-efficient, and maintainable — not just technically interesting.
Tech Stack
Tools I work with daily.
Production-tested across AWS, Azure, and open-source data ecosystems.
Certifications
AWS Certified, production validated.
Industry-recognized credentials demonstrating cloud architecture and data engineering depth.
AWS Certified Data Engineer – Associate
Validates expertise in designing and maintaining data pipelines on AWS, covering data stores, data processing, security, and governance.
AWS Certified Solutions Architect – Associate
Demonstrates ability to design secure, scalable, and cost-optimized cloud architectures using AWS services.
Experience
Where I've built things that matter.
Production systems, real data, measurable outcomes.
Data Engineer
Jul 2025 – Present · Benda Infotech — Remote, US
- Architected serverless ETL pipelines on AWS (S3, Lambda, Glue) processing 1M+ records daily at 99.9% reliability
- Developed PySpark transformation jobs in Glue, cutting batch processing time by 40% through join optimization and partition tuning
- Designed dimensional star-schema models loaded into Redshift, boosting dashboard query performance by 35%
- Implemented Airflow scheduling with SLA monitoring, reducing manual intervention by 60%
- Optimized Athena queries via partition pruning and compression, lowering execution costs 30–40%
- Built CI/CD pipelines that cut release cycles from 2 days to hours while enforcing 99%+ data-quality validation pass rates
Software Engineer (Data Engineering)
Feb 2022 – Jul 2023 · Applied Information Sciences · Client: GEICO — Hyderabad, India → Remote, US
- Engineered Azure Data Factory pipelines ingesting policy, claims, and CRM data (10M+ records) into ADLS Gen2
- Built distributed PySpark transformations in Databricks, reducing runtime by 30% via auto-scaling and partition optimization
- Implemented Delta Lake curated layers with SCD Type 2 MERGE logic, reducing reconciliation issues by 25%
- Developed Kafka-based streaming workflows that cut data latency from hours to near real-time
- Automated CI/CD deployment of ADF pipelines via Azure DevOps, reducing deployment errors by 40%
- Delivered analytics-ready datasets to Synapse, improving reporting turnaround by 50%
Projects
Engineering projects, not just exercises.
Each project solves a real data engineering problem end-to-end.
AWS Lakehouse & Analytics Platform
End-to-end serverless data lake with dimensional modeling and query-optimized analytics layer on AWS.
- Serverless ETL with Glue and PySpark for schema evolution and data cleansing
- Dimensional models loaded into Redshift with Airflow-orchestrated workflows
- Cost-optimized Athena queries with partitioning and Iceberg table format
- Automated ingestion from public APIs with deduplication logic
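As a minimal sketch of the deduplication step above (illustrative only — the function and field names `id` and `updated_at` are assumptions, and the production pipeline implements this pattern with PySpark window functions rather than plain Python):

```python
from datetime import datetime

def dedupe_latest(records, key="id", ts="updated_at"):
    """Keep only the most recent record per business key.

    Mirrors the keep-latest-row-per-key pattern used when landing
    API extracts, before loading curated tables.
    """
    latest = {}
    for rec in records:
        k = rec[key]
        # Replace the stored record if this one is newer.
        if k not in latest or rec[ts] > latest[k][ts]:
            latest[k] = rec
    return list(latest.values())

# Hypothetical raw extracts: id 1 appears twice with different timestamps.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "status": "open"},
    {"id": 1, "updated_at": datetime(2024, 2, 1), "status": "closed"},
    {"id": 2, "updated_at": datetime(2024, 1, 5), "status": "open"},
]
print(dedupe_latest(rows))
```

In Spark the same logic is typically expressed as `row_number()` over a window partitioned by the key and ordered by the timestamp descending, keeping row 1.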
Big Data Processing with Spark
Large-scale distributed data processing on Hadoop clusters using Spark DataFrames and GCP Dataproc.
- Distributed processing using Spark DataFrames on HDFS-backed clusters
- Schema validation, data cleansing, and aggregation pipelines
- Optimized shuffle operations and join strategies for performance
- Deployed on GCP Dataproc for managed cluster orchestration
Why Hire Me
What I bring to your team.
Not just skills on paper — here's how they translate to real value.
Production Engineering Experience
Hands-on work building and supporting data systems at enterprise scale — not just tutorials or coursework.
AWS Certified
Two AWS certifications validating cloud architecture and data engineering competency.
Full-Stack Data Skills
End-to-end capability across ingestion, transformation, modeling, orchestration, and analytics.
Measurable Impact
Consistent track record of reducing costs, improving performance, and automating manual processes.
Modern Tooling
Fluent in PySpark, Airflow, Delta Lake, Kafka, and both AWS and Azure ecosystems.
Strong Academic Foundation
Master's in CS from Illinois Institute of Technology, combining depth with practical engineering.
Interested in discussing data engineering opportunities?
I'm actively looking for Data Engineer roles. Let's talk about how I can contribute to your team.
Contact
Let's connect.
I'm actively open to Data Engineer opportunities. If my background aligns with your team's needs, I'd love to hear from you.