YennJ12 Engineering Blog

Engineering insights, architecture deep dives, and technical solutions

streaming

Articles in streaming

Sep 27, 2025 22 min

Building NYC Taxi Data Pipeline with Spark and Kafka

A complete guide to building a production-ready data engineering pipeline that processes NYC taxi trip records with Apache Spark, Kafka streaming, the Hadoop ecosystem, and AWS cloud infrastructure.

AI apache-spark kafka

© 2025 YennJ12 Engineering Team. All rights reserved.
