Giridhar Reddy

Welcome to my portfolio

Hello! I'm Giridhar
Reddy.

I'm an
Ex-Walmart Global Tech · Data Engineer

Building ETL/ELT pipelines at scale with PySpark, Databricks & Delta Lake on Azure & GCP. Processed 10M+ daily retail transactions in production.

View My Work ↗ GitHub LinkedIn
10M+
Daily transactions
in production
70%
Scan reduction via
incremental ingestion
4h→90m
Pipeline runtime
improvement
99%+
SLA adherence
across batch cycles

About Me

Building Data
Pipelines at Scale

I build data pipelines that actually work in production — not just in demos. For the past year and a half, I've been engineering ETL/ELT pipelines that process millions of daily retail transactions, working across Azure Databricks, Google Dataproc, GCP, and Azure Data Factory. Most recently at Walmart Global Tech, I worked on finance-critical pipelines where data quality wasn't optional — late data or bad records had real business consequences.

What I actually spend my time
on

Designing incremental ingestion logic that cuts processing time, structuring Medallion/layered architectures so data flows cleanly from raw to consumption, and making sure downstream teams — whether in finance or analytics — get what they need on time and in the right shape.
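The layered flow described above can be sketched in plain Python. This is a minimal illustration of the idea, not production code: the stage names, record shapes, and validation rules here are hypothetical, and the real pipelines run as PySpark jobs over Delta tables.

```python
# Medallion-style layering: each layer consumes only the output of the
# previous one, so bad records are filtered once and never reach BI.
# All names and record shapes below are hypothetical examples.

def to_curated(raw_rows):
    # Curated layer: drop malformed records, normalize types.
    curated = []
    for row in raw_rows:
        if row.get("txn_id") is None or row.get("amount") is None:
            continue  # bad record is excluded, not silently coerced
        curated.append({"txn_id": str(row["txn_id"]),
                        "store": row.get("store", "unknown"),
                        "amount": float(row["amount"])})
    return curated

def to_consumption(curated_rows):
    # Consumption layer: aggregate per store for downstream BI teams.
    totals = {}
    for row in curated_rows:
        totals[row["store"]] = totals.get(row["store"], 0.0) + row["amount"]
    return totals

raw = [
    {"txn_id": 1, "store": "A", "amount": "10.5"},
    {"txn_id": None, "store": "A", "amount": "3.0"},  # malformed: dropped
    {"txn_id": 2, "store": "B", "amount": "4.5"},
]
print(to_consumption(to_curated(raw)))  # {'A': 10.5, 'B': 4.5}
```

The point of the layering is that validation happens exactly once, at the raw-to-curated boundary, so every downstream consumer sees the same cleaned data.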

Currently pursuing an M.Sc. in Data Science at Vellore Institute of Technology — passionate about real-time pipelines, cloud-native stacks, and scalable analytics systems.


Get In Touch ↗
10M+
Daily retail transactions processed
70%
Scan reduction achieved
99%+
SLA adherence maintained
5
Industry certifications earned
Core Skills
PySpark · Databricks · Delta Lake · Azure Data Factory · Google Dataproc · BigQuery · Apache Kafka · Apache Airflow · Python · SQL · Spark SQL · Unity Catalog · GCS · Snowflake
Work Experience

Where I've Worked.

Walmart Global Tech India Featured
Data Engineer
Apr 2024 – Jun 2025 📍 Chennai, India
  • Engineered production-grade Spark-based ETL pipelines on Google Dataproc and Azure Databricks, processing 10M+ daily store and eCommerce sales transactions through Delta Lake for finance reporting and reconciliation.
  • Designed partition-based incremental ingestion with rolling 7-day timestamp windows — eliminating full dataset scans and reducing pipeline runtime from 4 hours to 90 minutes.
  • Built and maintained layered data architecture (Raw → Curated → Enriched → Consumption) ensuring analytics-ready datasets for downstream BI teams.
  • Implemented a hybrid cloud architecture (GCP + Azure) with Azure Data Factory and Azure Databricks, orchestrated via Automic for SLA-driven batch scheduling.
  • Maintained 99%+ SLA adherence across high-volume daily cycles, proactively tracking pipeline failures and data quality issues via Jira.
PySpark · Databricks · Delta Lake · GCP · Azure · BigQuery · ADF · Automic · Jira
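The rolling-window ingestion pattern behind the runtime improvement above can be sketched in plain Python. The cutoff arithmetic is the core idea; in the actual pipelines it becomes a partition filter on a PySpark read so only recent partitions are scanned. Function and field names here are hypothetical.

```python
from datetime import datetime, timedelta

def incremental_filter(records, run_time, window_days=7):
    """Keep only records inside a rolling window ending at run_time.

    Mirrors partition-based incremental ingestion: instead of scanning
    the full history, only partitions newer than the cutoff are read.
    """
    cutoff = run_time - timedelta(days=window_days)
    return [r for r in records if r["event_ts"] >= cutoff]

now = datetime(2025, 6, 1)
records = [
    {"txn_id": 1, "event_ts": datetime(2025, 5, 30)},  # inside window
    {"txn_id": 2, "event_ts": datetime(2025, 5, 20)},  # outside window
]
print([r["txn_id"] for r in incremental_filter(records, now)])  # [1]
```

Applied as a partition predicate, the same filter lets the query engine skip old partitions entirely, which is where the full-scan elimination comes from.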
Wenger & Watson Private Limited
Data Scientist
Nov 2023 – Jan 2024 📍 Bengaluru, India
  • Built end-to-end ETL workflows in Hive and BigQuery for reporting and analytics teams, streamlining data delivery pipelines.
  • Managed incremental and historical datasets using Apache Hudi, ensuring accuracy and preventing dataset duplication across analytics pipelines.
  • Troubleshot Spark jobs and Kafka stream failures to minimize downtime and maintain reliability for critical analytics workflows.
Hive · BigQuery · Apache Hudi · Spark · Kafka
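The deduplication described above comes down to an upsert keyed on a record id, keeping the latest version of each row. A plain-Python sketch of that idea follows; names are hypothetical, and Apache Hudi implements the same semantics with record keys and a precombine field.

```python
def upsert(existing, incoming, key="id", ts="updated_at"):
    """Merge incoming rows into existing, keeping the newest per key.

    Mimics an upsert with a precombine timestamp: duplicates collapse
    to the row with the latest ts, so reprocessing a batch never
    double-counts records.
    """
    merged = {row[key]: row for row in existing}
    for row in incoming:
        cur = merged.get(row[key])
        if cur is None or row[ts] >= cur[ts]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])

table = [{"id": 1, "val": "a", "updated_at": 1}]
batch = [
    {"id": 1, "val": "a2", "updated_at": 2},  # newer version of id 1
    {"id": 2, "val": "b", "updated_at": 1},   # brand-new record
]
print(upsert(table, batch))
```

Because the merge is idempotent, replaying an already-ingested batch leaves the table unchanged, which is what keeps incremental and historical loads from drifting apart.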
Technical Stack

Skills & Tools.

Core Platform
Databricks · Delta Lake · PySpark · Spark SQL · Unity Catalog
Cloud — Azure
Azure Data Factory · ADLS Gen2 · Azure Databricks · Microsoft Fabric
Cloud — GCP
Google Dataproc · BigQuery · GCS · Snowflake
Streaming & Orchestration
Apache Kafka · Apache Airflow · Spark Streaming · Automic
Languages & Databases
Python · SQL · Hive · Hadoop · MySQL · PostgreSQL · Apache Hudi
DevOps & Tools
Git · Jira · Docker · CI/CD · OLAP · OLTP
Projects

Featured Work.

From GitHub · auto-updated
Credentials

Certified Expertise.

🎓 IBM Data Engineer Professional Certificate
IBM / Coursera
Issued: 2025 · No Expiry
Skills: Python, SQL, PySpark, ETL, NoSQL
Details: End-to-end data engineering with IBM tools and cloud platforms.
🪟 Microsoft Certified: Fabric Data Engineer Associate
Microsoft
Issued: 2026 · Expires: 2027
Skills: Microsoft Fabric, Lakehouse, Data Pipelines
Details: Design and implement data engineering solutions on Microsoft Fabric.
🧱 Academy Accreditation — Databricks Fundamentals
Databricks
Issued: 2025 · No Expiry
Skills: Databricks, Delta Lake, Unity Catalog
Details: Core Lakehouse concepts and Databricks platform fundamentals.
🌬️ Astronomer Certification — Apache Airflow Fundamentals
Astronomer
Issued: 2024 · No Expiry
Skills: Apache Airflow, DAGs, Pipeline Orchestration
Details: Build and manage production-grade data pipelines with Airflow.
🗄️ Databases and SQL for Data Science with Python
IBM / Coursera
Issued: 2025 · No Expiry
Skills: SQL, Python, RDBMS, Data Analysis
Details: Advanced SQL and database design for data science use cases.
📐 Introduction to Relational Databases (RDBMS)
IBM / Coursera
Issued: 2025 · No Expiry
Skills: RDBMS, PostgreSQL, MySQL, Schema Design
Details: Foundational relational database concepts and SQL.
Education

Academic Background.

Dec 2025 – Jan 2028
Master's Degree
Data Science
Vellore Institute of Technology (VIT) · Vellore, Tamil Nadu, India
🟢 In Progress
Aug 2019 – Jun 2023
B.Tech
Computer Science & Engineering
Sri Chandrasekharendra Saraswathi Vishwa Mahavidyalaya (SCSVMV) · Kanchipuram, Tamil Nadu, India
✅ Completed

Get In Touch

Let's
Work.

I'm looking for a Data Engineer role — preferably with cloud-native stacks, real-scale data, and teams that care about doing things properly. Open to full-time opportunities across India.

Giridhar.scsvmv@gmail.com