AWS Certified Data Engineer · AWS Certified Solutions Architect · Seattle, WA

Naveed Mohiuddin

Data Engineer

Hands-on experience in production ETL, distributed processing, and data modeling across AWS and Azure — backed by two AWS certifications and a Master's in Computer Science.

$ naveed.skills("AWS", "Azure", "PySpark", "Databricks")
3+
Years Experience
2
AWS Certifications
MS
Computer Science
AWS + Azure
Multi-Cloud

About Me

Engineer, not just a resume.

I'm a Data Engineer with experience building cloud-native data platforms that serve real business needs — from policy and claims pipelines at GEICO to serverless lakehouse architectures on AWS. My work sits at the intersection of software engineering and data infrastructure: I design ETL systems, optimize distributed processing, model dimensional data, and automate everything I can.

I hold a Master's in Computer Science from Illinois Institute of Technology and a Bachelor's from Osmania University. I'm AWS certified in both Solutions Architecture and Data Engineering, and I've worked across AWS and Azure stacks in production settings. I care about building systems that are reliable, cost-efficient, and maintainable — not just technically interesting.

MS Computer Science
Illinois Institute of Technology
BE Computer Science
Osmania University

Tech Stack

Tools I work with daily.

Production-tested across AWS, Azure, and open-source data ecosystems.

Data Engineering
ETL/ELT · Data Modeling · Distributed Processing · Batch & Streaming Pipelines · Medallion Architecture · Star Schema · SCD Type 2 · Data Quality
Cloud Platforms
AWS S3 · AWS Glue · Redshift · Lambda · Athena · EC2 · EventBridge · Azure Data Factory · Databricks · Synapse Analytics · ADLS Gen2
Big Data
Apache Spark · PySpark · Kafka · HDFS · Hadoop · Hive · Delta Lake · Apache Iceberg
Orchestration & DevOps
Apache Airflow · CI/CD · Azure DevOps · Git · GitHub · Docker
Programming
Python · SQL · PySpark · Java · Linux · Bash
Analytics & BI
Power BI · Athena · Synapse · Redshift Spectrum

Certifications

AWS Certified, production validated.

Industry-recognized credentials demonstrating cloud architecture and data engineering depth.

DEA

AWS Certified Data Engineer – Associate

Validates expertise in designing and maintaining data pipelines, data stores, data processing, security, and governance on AWS.

S3 · Glue · Redshift · Athena · Lake Formation · Kinesis
SAA

AWS Certified Solutions Architect – Associate

Demonstrates ability to design secure, scalable, and cost-optimized cloud architectures using AWS services.

EC2 · VPC · IAM · Lambda · CloudFormation · RDS

Experience

Where I've built things that matter.

Production systems, real data, measurable outcomes.

Data Engineer

Jul 2025 – Present

Benda Infotech · Remote, US

AWS S3 · Glue · Lambda · Redshift · Athena · PySpark · Airflow · CI/CD
  • Architected serverless ETL pipelines on AWS (S3, Lambda, Glue) processing 1M+ records daily at 99.9% reliability
  • Developed PySpark transformation jobs in Glue, cutting batch processing time by 40% through join optimization and partition tuning
  • Designed dimensional star-schema models loaded into Redshift, boosting dashboard query performance by 35%
  • Implemented Airflow scheduling with SLA monitoring, reducing manual intervention by 60%
  • Optimized Athena queries via partition pruning and compression, lowering execution costs 30–40%
  • Built CI/CD pipelines that reduced release cycles from 2 days to hours with 99%+ data quality validation
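The Athena cost savings above come from partition pruning: filtering on the partition column lets the engine skip whole S3 prefixes instead of scanning them. A minimal plain-Python sketch of the idea (the bucket paths and `prune_partitions` helper are illustrative, not the actual pipeline code):

```python
from datetime import date

# Toy catalog of S3 partitions keyed by partition date. In Athena/Glue
# the partition keys live in the Glue Data Catalog; a dict stands in here.
PARTITIONS = {
    date(2024, 1, 1): "s3://bucket/events/dt=2024-01-01/",
    date(2024, 1, 2): "s3://bucket/events/dt=2024-01-02/",
    date(2024, 2, 1): "s3://bucket/events/dt=2024-02-01/",
}

def prune_partitions(start: date, end: date) -> list:
    """Return only the partition prefixes a date-filtered query needs.

    A predicate on the partition column (WHERE dt BETWEEN start AND end)
    lets the engine skip every other prefix entirely, which is what
    reduces bytes scanned and therefore query cost.
    """
    return [path for dt, path in sorted(PARTITIONS.items())
            if start <= dt <= end]

# A January-only query touches two prefixes; February is never scanned.
jan_only = prune_partitions(date(2024, 1, 1), date(2024, 1, 31))
```

The same principle is why compressing and partitioning the underlying Parquet files compounds the savings: fewer prefixes scanned, and fewer bytes per prefix.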

Software Engineer (Data Engineering)

Feb 2022 – Jul 2023

Applied Information Sciences · Client: GEICO · Hyderabad, India → Remote, US

Azure Data Factory · Databricks · PySpark · Delta Lake · ADLS Gen2 · Synapse · Kafka · Power BI
  • Engineered Azure Data Factory pipelines ingesting policy, claims, and CRM data (10M+ records) into ADLS Gen2
  • Built distributed PySpark transformations in Databricks, reducing runtime by 30% via auto-scaling and partition optimization
  • Implemented Delta Lake curated layers with SCD Type 2 MERGE logic, reducing reconciliation issues by 25%
  • Developed Kafka-based streaming workflows that cut data latency from hours to near real-time
  • Automated CI/CD deployment of ADF pipelines via Azure DevOps, reducing deployment errors by 40%
  • Delivered analytics-ready datasets to Synapse, improving reporting turnaround by 50%
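The SCD Type 2 MERGE mentioned above expires the current version of a changed dimension row and appends a new one, so history is never overwritten. A hedged plain-Python sketch of those semantics (the real implementation used a Delta Lake MERGE; the row shape and `scd2_merge` name here are illustrative):

```python
from datetime import date

def scd2_merge(dim_rows, incoming, today):
    """Apply SCD Type 2 semantics to a list-of-dicts dimension table.

    dim_rows: dicts with keys id, value, valid_from, valid_to, current.
    incoming: dicts with keys id, value.
    Mirrors the effect of a Delta MERGE's whenMatched (expire + insert
    new version) and whenNotMatched (insert) branches.
    """
    current_by_id = {r["id"]: r for r in dim_rows if r["current"]}
    out = list(dim_rows)
    for row in incoming:
        cur = current_by_id.get(row["id"])
        if cur is None:
            # New business key: insert an open-ended current version.
            out.append({"id": row["id"], "value": row["value"],
                        "valid_from": today, "valid_to": None, "current": True})
        elif cur["value"] != row["value"]:
            # Changed attribute: close the old version, append the new one.
            cur["current"] = False
            cur["valid_to"] = today
            out.append({"id": row["id"], "value": row["value"],
                        "valid_from": today, "valid_to": None, "current": True})
        # Unchanged rows are left untouched, which keeps the merge idempotent.
    return out
```

Keeping unchanged rows untouched is what makes re-running the merge safe, and is one reason reconciliation issues drop with this pattern.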

Projects

Engineering projects, not just exercises.

Each project solves a real data engineering problem end-to-end.

AWS Lakehouse & Analytics Platform

End-to-end serverless data lake with dimensional modeling and query-optimized analytics layer on AWS.

Problem: Needed a cost-efficient, scalable pipeline to ingest, transform, and serve large datasets for analytical workloads without managing servers.
  • Serverless ETL with Glue and PySpark for schema evolution and data cleansing
  • Dimensional models loaded into Redshift with Airflow-orchestrated workflows
  • Cost-optimized Athena queries with partitioning and Iceberg table format
  • Automated ingestion from public APIs with deduplication logic
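Deduplication on API ingestion typically means keeping only the newest record per business key. A minimal sketch of that logic in plain Python (the `dedupe_latest` helper and field names are illustrative assumptions, not the project's actual code):

```python
def dedupe_latest(records, key="id", version="updated_at"):
    """Keep only the newest record per key (highest version value wins).

    Equivalent in spirit to a Spark row_number() over a window
    partitioned by `key` and ordered by `version` descending,
    filtered to the first row.
    """
    latest = {}
    for rec in records:
        k = rec[key]
        if k not in latest or rec[version] > latest[k][version]:
            latest[k] = rec
    return list(latest.values())
```

A single pass with a dict keeps this O(n), which matters when the same API page can be fetched more than once across incremental runs.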
AWS S3 · Glue · Athena · Redshift · Airflow · PySpark · Iceberg
View on GitHub

Big Data Processing with Spark

Large-scale distributed data processing on Hadoop clusters using Spark DataFrames and GCP Dataproc.

Problem: Processing and analyzing massive datasets that exceeded single-machine capacity, requiring distributed computing with optimized joins.
  • Distributed processing using Spark DataFrames on HDFS-backed clusters
  • Schema validation, data cleansing, and aggregation pipelines
  • Optimized shuffle operations and join strategies for performance
  • Deployed on GCP Dataproc for managed cluster orchestration
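The join-strategy optimization above is easiest to see with a broadcast hash join: when one side is small, it is shipped whole to every executor as a hash map, so the large side never shuffles. A single-process illustration of the idea (assumed names; Spark does this via `broadcast()` hints, not this code):

```python
def broadcast_hash_join(big, small, key):
    """Hash-join a large iterable against a small in-memory table.

    Build phase: hash the small side once. Probe phase: stream the big
    side and emit merged rows on key matches. This is the same shape as
    Spark's broadcast hash join, minus the distribution.
    """
    lookup = {}
    for row in small:
        lookup.setdefault(row[key], []).append(row)
    for row in big:
        for match in lookup.get(row[key], []):
            yield {**match, **row}
```

Avoiding the shuffle of the large side is where most of the wall-clock savings come from; the trade-off is that the small side must fit in each worker's memory.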
Apache Spark · HDFS · GCP Dataproc · SQL · Python
View on GitHub

Why Hire Me

What I bring to your team.

Not just skills on paper — here's why it translates to real value.

Production Engineering Experience

Hands-on work building and supporting data systems at enterprise scale — not just tutorials or coursework.

AWS Certified

Two AWS certifications validating cloud architecture and data engineering competency.

Full-Stack Data Skills

End-to-end capability across ingestion, transformation, modeling, orchestration, and analytics.

Measurable Impact

Consistent track record of reducing costs, improving performance, and automating manual processes.

Modern Tooling

Fluent in PySpark, Airflow, Delta Lake, Kafka, and both AWS and Azure ecosystems.

Strong Academic Foundation

Master's in CS from Illinois Institute of Technology, combining depth with practical engineering.

Interested in discussing data engineering opportunities?

I'm actively looking for Data Engineer roles. Let's talk about how I can contribute to your team.

Contact

Let's connect.

I'm actively open to Data Engineer opportunities. If my background aligns with your team's needs, I'd love to hear from you.

Send a message