My Blog

The 60-Minute Protocol for Staying Sharp in the Age of AI

Feb 20, 2026

mental-models, artificial-intelligence, neural-networks
Engineers in 2026 Won’t Be Hired for Syntax. They’ll Be Hired for Leverage

Jan 11, 2026

distributed-systems, ai-agent, llm
I Built an AI Code Reviewer in a Weekend — Here’s the Exact Prompt

Nov 26, 2025

code-review, prompt-engineering, big-data
Integrating LLMs and AI Agents into Data Engineering Workflows

Sep 19, 2025

ai, ai-agent, llm
A Practical Guide to Spark Serialization and Deserialization

Sep 14, 2025

big-data, serialization, spark
Zero-ETL & Cloud-Native Architectures: Building Real-Time Data Systems

Aug 22, 2025

streaming, cloud-native, real-time-analytics
Why Every Serious Data Engineer Should Understand Bloom Filters and HyperLogLog

Jul 11, 2025

data-structures, big-data, bloom-filter
Embedding-Based Retrieval Is Making Search Smarter

Jul 6, 2025

vector, embedding, artificial-intelligence
MLOps and Data Engineering: Bridging the Gap for Machine Learning Pipelines

Jul 2, 2025

mlops, data-engineering, feature-engineering
Understanding Spark’s Catalyst Optimizer: Demystifying Query Optimization

Jun 17, 2025

spark, apache-spark, spark-optimization
Build Your First Baby Agent with OpenAI in 20 Minutes

May 27, 2025

ai-agent, chatgpt, openai
Say Goodbye to Dirty Data: Build Trustworthy Pipelines with These Pro Tips

May 21, 2025

Data Engineering, Data Quality, Data Pipelines
No SQL? No Problem: Ask Your Database Questions in Plain English

Apr 20, 2025

Data Engineering, NLP, MySQL
Catching Sneaky Data Drift Before It Wreaks Havoc

Apr 7, 2025

Data Engineering, Machine Learning, Data Quality
Your Spark Executors Are Wasting Memory — Here’s How to Fix It

Mar 24, 2025

spark, distributed-systems, memory-improvement
Building a Data Lakehouse with Iceberg, Spark, and AWS Glue

Mar 8, 2025

Data Engineering, Apache Iceberg, Apache Spark
From Data Lake to Lakehouse: A Migration Guide with Delta

Feb 11, 2025

Data Engineering, Delta Lake, Apache Spark
Mastering CDC in Delta Tables: A Use-case in Spark

Feb 5, 2025

Data Engineering, CDC, Delta Lake
Indexing Strategies: B-Trees, Hash Indexes, Bitmaps & Beyond

Jan 30, 2025

indexing, sql, big-data
Handling Bottlenecks in Spark Streaming: Lessons Learned

Jan 20, 2025

Data Engineering, Spark Streaming, Performance Optimization
Demystifying Event-Driven Architecture with AWS

Jan 9, 2025

Data Engineering, Event-Driven Architecture, AWS
Hands-on Cloud: Build a Serverless To-Do List App on AWS

Dec 24, 2024

Cloud Computing, AWS, Serverless
Zstd vs Snappy vs Gzip: The Compression King for Parquet Has Arrived

Dec 16, 2024

parquet, data-engineering, spark
Building Real-Time ETL Pipelines with Flink? Here's How You Can Nail It!

Dec 12, 2024

Data Engineering, Apache Flink, Kafka
Building Real-Time Recommendations with Spark, ALS, and Kafka

Nov 30, 2024

Data Engineering, Apache Spark, Kafka
Customer 360 in E-commerce: Real-Life Use Case with Delta Lake on Databricks

Nov 24, 2024

Data Engineering, Delta Lake, Databricks
Real-Time Use-case: Fraud Detection in Financial Transactions with Kafka and Spark Streaming

Nov 18, 2024

Data Engineering, Kafka, Spark Streaming
Preventing Data Mix-ups: Understanding Database Isolation and Concurrency Management

Nov 12, 2024

Data Engineering, Database, Concurrency
Data Engineering for ML: Building a Customer Churn Prediction Pipeline with Airflow

Nov 9, 2024

Data Engineering, Machine Learning, Apache Airflow
Building End-to-End Customer Insights Pipeline by Integrating Multiple Data Sources in Spark with Airflow

Nov 3, 2024

Data Engineering, Apache Spark, Apache Airflow
Showing 30 of 35 articles