Giridhar Reddy

Welcome to my portfolio

Hello! I'm Giridhar
Reddy.

I'm an
Ex-Walmart Global Tech · Data Engineer

Building ETL/ELT pipelines at scale with PySpark, Databricks & Delta Lake on Azure & GCP. Processed 10M+ daily retail transactions in production.

View My Work ↗ GitHub LinkedIn
10M+
Daily transactions
in production
70%
Scan reduction via
incremental ingestion
4h→90m
Pipeline runtime
improvement
99%+
SLA adherence
across batch cycles

About Me

Building Data
Pipelines at Scale

I build data pipelines that actually work in production — not just in demos. For the past year and a half, I've been engineering ETL/ELT pipelines that process millions of daily retail transactions, working across Azure Databricks, Google Dataproc, GCP, and Azure Data Factory. Most recently at Walmart Global Tech, I worked on finance-critical pipelines where data quality wasn't optional — late data or bad records had real business consequences.

What I actually spend my time
on

Designing incremental ingestion logic that cuts processing time, structuring Medallion/layered architectures so data flows cleanly from raw to consumption, and making sure downstream teams — whether in finance or analytics — get what they need on time and in the right shape.
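The layered flow described above can be sketched in plain Python. This is a minimal illustration of the idea, not production code: the stage names, record shapes, and validation rules here are hypothetical, and the real pipelines run as PySpark jobs over Delta tables.

```python
# Medallion-style layering: each layer consumes only the output of the
# previous one, so bad records are filtered once and never reach BI.
# All names and record shapes below are hypothetical examples.

def to_curated(raw_rows):
    # Curated layer: drop malformed records, normalize types.
    curated = []
    for row in raw_rows:
        if row.get("txn_id") is None or row.get("amount") is None:
            continue  # bad record is excluded, not silently coerced
        curated.append({"txn_id": str(row["txn_id"]),
                        "store": row.get("store", "unknown"),
                        "amount": float(row["amount"])})
    return curated

def to_consumption(curated_rows):
    # Consumption layer: aggregate per store for downstream BI teams.
    totals = {}
    for row in curated_rows:
        totals[row["store"]] = totals.get(row["store"], 0.0) + row["amount"]
    return totals

raw = [
    {"txn_id": 1, "store": "A", "amount": "10.5"},
    {"txn_id": None, "store": "A", "amount": "3.0"},  # malformed: dropped
    {"txn_id": 2, "store": "B", "amount": "4.5"},
]
print(to_consumption(to_curated(raw)))  # {'A': 10.5, 'B': 4.5}
```

The point of the layering is that validation happens exactly once, at the raw-to-curated boundary, so every downstream consumer sees the same cleaned data.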

Currently pursuing an M.Sc. in Data Science at Vellore Institute of Technology — passionate about real-time pipelines, cloud-native stacks, and scalable analytics systems.


Get In Touch ↗
10M+
Daily retail transactions processed
70%
Scan reduction achieved
99%+
SLA adherence maintained
5
Industry certifications earned
Core Skills
PySpark · Databricks · Delta Lake · Azure Data Factory · Google Dataproc · BigQuery · Apache Kafka · Apache Airflow · Python · SQL · Spark SQL · Unity Catalog · GCS · Snowflake
Work Experience

Where I've Worked.

Walmart Global Tech India Featured
Data Engineer
Apr 2024 – Jun 2025 📍 Chennai, India
  • Engineered production-grade Spark-based ETL pipelines on Google Dataproc and Azure Databricks, processing 10M+ daily store and eCommerce sales transactions through Delta Lake for finance reporting and reconciliation.
  • Designed partition-based incremental ingestion with rolling 7-day timestamp windows — eliminating full dataset scans and reducing pipeline runtime from 4 hours to 90 minutes.
  • Built and maintained layered data architecture (Raw → Curated → Enriched → Consumption) ensuring analytics-ready datasets for downstream BI teams.
  • Implemented a hybrid cloud architecture (GCP + Azure) with Azure Data Factory and Azure Databricks, orchestrated via Automic for SLA-driven batch scheduling.
  • Maintained 99%+ SLA adherence across high-volume daily cycles, proactively tracking pipeline failures and data quality issues via Jira.
PySpark · Databricks · Delta Lake · GCP · Azure · BigQuery · ADF · Automic · Jira
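The rolling-window ingestion pattern behind the runtime improvement above can be sketched in plain Python. The cutoff arithmetic is the core idea; in the actual pipelines it becomes a partition filter on a PySpark read so only recent partitions are scanned. Function and field names here are hypothetical.

```python
from datetime import datetime, timedelta

def incremental_filter(records, run_time, window_days=7):
    """Keep only records inside a rolling window ending at run_time.

    Mirrors partition-based incremental ingestion: instead of scanning
    the full history, only partitions newer than the cutoff are read.
    """
    cutoff = run_time - timedelta(days=window_days)
    return [r for r in records if r["event_ts"] >= cutoff]

now = datetime(2025, 6, 1)
records = [
    {"txn_id": 1, "event_ts": datetime(2025, 5, 30)},  # inside window
    {"txn_id": 2, "event_ts": datetime(2025, 5, 20)},  # outside window
]
print([r["txn_id"] for r in incremental_filter(records, now)])  # [1]
```

Applied as a partition predicate, the same filter lets the query engine skip old partitions entirely, which is where the full-scan elimination comes from.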
Wenger & Watson Private Limited
Data Scientist
Nov 2023 – Jan 2024 📍 Bengaluru, India
  • Built end-to-end ETL workflows in Hive and BigQuery for reporting and analytics teams, streamlining data delivery pipelines.
  • Managed incremental and historical datasets using Apache Hudi, ensuring accuracy and preventing dataset duplication across analytics pipelines.
  • Troubleshot Spark jobs and Kafka stream failures to minimize downtime and maintain reliability for critical analytics workflows.
Hive · BigQuery · Apache Hudi · Spark · Kafka
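The deduplication described above comes down to an upsert keyed on a record id, keeping the latest version of each row. A plain-Python sketch of that idea follows; names are hypothetical, and Apache Hudi implements the same semantics with record keys and a precombine field.

```python
def upsert(existing, incoming, key="id", ts="updated_at"):
    """Merge incoming rows into existing, keeping the newest per key.

    Mimics an upsert with a precombine timestamp: duplicates collapse
    to the row with the latest ts, so reprocessing a batch never
    double-counts records.
    """
    merged = {row[key]: row for row in existing}
    for row in incoming:
        cur = merged.get(row[key])
        if cur is None or row[ts] >= cur[ts]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])

table = [{"id": 1, "val": "a", "updated_at": 1}]
batch = [
    {"id": 1, "val": "a2", "updated_at": 2},  # newer version of id 1
    {"id": 2, "val": "b", "updated_at": 1},   # brand-new record
]
print(upsert(table, batch))
```

Because the merge is idempotent, replaying an already-ingested batch leaves the table unchanged, which is what keeps incremental and historical loads from drifting apart.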
Technical Stack

Skills & Tools.

Core Platform
Databricks · Delta Lake · PySpark · Spark SQL · Unity Catalog
Cloud — Azure
Azure Data Factory · ADLS Gen2 · Azure Databricks · Microsoft Fabric
Cloud — GCP
Google Dataproc · BigQuery · GCS · Snowflake
Streaming & Orchestration
Apache Kafka · Apache Airflow · Spark Streaming · Automic
Languages & Databases
Python · SQL · Hive · Hadoop · MySQL · PostgreSQL · Apache Hudi
DevOps & Tools
Git · Jira · Docker · CI/CD · OLAP · OLTP
Projects

Featured Work.

From GitHub · auto-updated
Credentials

Certified Expertise.

🎓 IBM Data Engineer Professional Certificate
IBM / Coursera
Issued: 2025 · No Expiry
Skills: Python, SQL, PySpark, ETL, NoSQL
Details: End-to-end data engineering with IBM tools and cloud platforms.
🪟 Microsoft Certified: Fabric Data Engineer Associate
Microsoft
Issued: 2026 · Expires: 2027
Skills: Microsoft Fabric, Lakehouse, Data Pipelines
Details: Design and implement data engineering solutions on Microsoft Fabric.
🧱 Academy Accreditation — Databricks Fundamentals
Databricks
Issued: 2025 · No Expiry
Skills: Databricks, Delta Lake, Unity Catalog
Details: Core Lakehouse concepts and Databricks platform fundamentals.
🌬️ Astronomer Certification — Apache Airflow Fundamentals
Astronomer
Issued: 2024 · No Expiry
Skills: Apache Airflow, DAGs, Pipeline Orchestration
Details: Build and manage production-grade data pipelines with Airflow.
🗄️ Databases and SQL for Data Science with Python
IBM / Coursera
Issued: 2025 · No Expiry
Skills: SQL, Python, RDBMS, Data Analysis
Details: Advanced SQL and database design for data science use cases.
📐 Introduction to Relational Databases (RDBMS)
IBM / Coursera
Issued: 2025 · No Expiry
Skills: RDBMS, PostgreSQL, MySQL, Schema Design
Details: Foundational relational database concepts and SQL.
Education

Academic Background.

Dec 2025 – Jan 2028
Master's Degree
Data Science
Vellore Institute of Technology (VIT) · Vellore, Tamil Nadu, India
🟢 In Progress
Aug 2019 – Jun 2023
B.Tech
Computer Science & Engineering
Sri Chandrasekharendra Saraswathi Vishwa Mahavidyalaya (SCSVMV) · Kanchipuram, Tamil Nadu, India
✅ Completed

Get In Touch

Let's
Work.

I'm looking for a Data Engineer role — preferably with cloud-native stacks, real-scale data, and teams that care about doing things properly. Open to full-time opportunities across India.

Giridhar.scsvmv@gmail.com