Data Processing at Scale: Queues, Streams & Async Architecture
Message queues, event streaming with Kafka, async job processing, and the architectural patterns that decouple services and absorb traffic spikes.
What you'll learn
- Move all non-critical work out of the request path — queue it
- Message queues for point-to-point; event streams for multi-consumer fanout
- Build idempotent consumers — cheaper than exactly-once delivery
- Outbox Pattern guarantees eventual event publication even if the message broker is temporarily down
- Dead Letter Queues are mandatory — always have a recovery path for failed jobs
Why synchronous processing is the enemy of scale
Every time your API does more than read or write data — sends an email, processes an image, calls a slow third-party API — it forces the user to wait. Move non-critical work out of the request path: accept the request, queue the work, return immediately.
The rule: do only what is critical in the request path
Validate input, write to database, return response. Everything else (send email, generate report, notify webhooks) → queue it. This keeps your API fast and resilient to third-party slowness.
Message queues vs event streams
Message queues (SQS, RabbitMQ, BullMQ): Point-to-point. One consumer receives each message. Deleted after acknowledgment. Use for: task dispatch, job queues (send email, resize image, generate report).
Event streams (Kafka, Kinesis): Append-only log. Multiple independent consumer groups each read all events. Retained for days/weeks. Use for: event sourcing, audit logs, real-time analytics, feeding multiple downstream systems.
| Feature | Message Queue (SQS) | Event Stream (Kafka) |
|---|---|---|
| Delivery | Point-to-point (one consumer) | Pub/sub (multiple consumer groups) |
| Retention | Deleted after ACK | Configurable (days to forever) |
| Replay | ❌ Cannot replay | ✅ Replay from any offset |
| Best for | Job queues, task dispatch | Event sourcing, multi-consumer, audit log |
```typescript
// BullMQ: job queue with retries and a DLQ
import { Queue, Worker } from 'bullmq';

const emailQueue = new Queue('emails', { connection: redis });

// Producer: fire and forget in the request handler
export async function POST(req: Request) {
  const order = await db.insert(orders).values(await req.json()).returning();

  // Queue the email BEFORE returning the response. The await here is
  // just a fast Redis write — the slow send happens in the worker.
  await emailQueue.add('order-confirmation', {
    orderId: order[0].id,
    email: order[0].userEmail,
  }, {
    attempts: 3,
    backoff: { type: 'exponential', delay: 1000 },
    removeOnComplete: 100,
  });

  return Response.json({ orderId: order[0].id }, { status: 201 });
  // Response in ~10ms; email sent asynchronously
}

// Consumer: separate process
const emailWorker = new Worker('emails', async (job) => {
  await sendTransactionalEmail(job.data.email, job.data.orderId);
}, { connection: redis, concurrency: 10 });
```

```typescript
// Kafka: fanout to multiple consumer groups
import { Kafka } from 'kafkajs';

const producer = kafka.producer();

// Partition key guarantees ordering of events for the same order
await producer.send({
  topic: 'orders.placed',
  messages: [{
    key: orderId, // Same key → same partition → ordered per-order
    value: JSON.stringify({ orderId, userId, total }),
  }],
});
// Analytics, Notifications, and Inventory services each have
// their own consumer group — fully independent
```
Idempotent consumers: handling duplicate messages
Distributed systems fail partially: a message is sent but the ACK is lost, so the producer retries, and the consumer processes it twice. Build idempotent consumers: processing the same message twice produces the same result as once.
Deduplication key: Before processing, check if you have seen this messageId before. If yes, skip and ACK. Store processed IDs in Redis with a TTL matching your expected retry window.
The Outbox Pattern: Atomically write to DB + publish to Kafka by writing to an outbox table in the same DB transaction. A separate relay process reads from outbox and publishes. If Kafka publish fails, retry from outbox — no data loss.
```typescript
// Idempotent consumer: safe to process the same message twice
async function processOrderPlaced(message: OrderPlacedEvent) {
  const dedupKey = `processed:order:${message.orderId}`;

  // Check if already processed
  if (await redis.exists(dedupKey)) {
    console.log(`Skipping duplicate: ${message.orderId}`);
    return; // ACK without processing
  }

  await db.transaction(async (tx) => {
    await tx.insert(orderAnalytics)
      .values({ orderId: message.orderId, total: message.total })
      // onConflictDoNothing makes the insert idempotent at the DB level
      .onConflictDoNothing(); // Safe to run twice — second run is a no-op
  });

  // Mark processed AFTER successful processing — never before
  await redis.setex(dedupKey, 86400, '1'); // 24-hour dedup window
}

// Outbox Pattern: atomically publish events with DB writes
async function createOrderWithOutbox(orderData: OrderInput) {
  await db.transaction(async (tx) => {
    const order = await tx.insert(orders).values(orderData).returning();

    // Write the event to the outbox in the SAME transaction as the order
    await tx.insert(outbox).values({
      topic: 'orders.placed',
      key: order[0].id,
      payload: JSON.stringify({ orderId: order[0].id, ...orderData }),
    });
  });
  // An outbox relay publishes to Kafka and marks rows as published;
  // if the Kafka publish fails, it retries from the outbox — no event loss
}
```
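The code above shows only the write side of the outbox. The relay side can be sketched as a loop that polls unpublished rows, publishes each, and marks it published only after the broker accepts it. This is a minimal sketch: the in-memory `OutboxRow[]` array stands in for the outbox table and the `publish` callback stands in for the Kafka producer, both names illustrative.

```typescript
// Hypothetical outbox relay loop: poll unpublished rows, publish,
// and mark published only after a successful send.
type OutboxRow = { id: number; topic: string; payload: string; published: boolean };

async function runRelayOnce(
  outbox: OutboxRow[],
  publish: (topic: string, payload: string) => Promise<void>,
): Promise<void> {
  // Oldest-first preserves the order events were written in
  const pending = outbox.filter((r) => !r.published).sort((a, b) => a.id - b.id);
  for (const row of pending) {
    try {
      await publish(row.topic, row.payload);
      row.published = true; // mark AFTER the broker accepts it
    } catch {
      break; // broker down: stop and retry this row on the next tick
    }
  }
}
```

In production this runs on a timer or as a change-data-capture consumer. Note that a crash between publish and mark-published re-sends the row, so the relay gives at-least-once delivery — another reason consumers must be idempotent.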
Dead letter queues and job reliability
Dead letter queues (DLQ): Jobs that fail all retries go to the DLQ instead of being discarded. Critical for debugging and recovery — inspect failed jobs and replay after fixing the bug.
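As a broker-agnostic sketch, the retry-then-dead-letter flow looks like this. The `Job` shape, `maxAttempts`, and the `dlq` array are illustrative; SQS implements this via redrive policies and BullMQ via its failed-job set.

```typescript
// Sketch: retry a job up to maxAttempts, then park it in the DLQ
// instead of discarding it.
type Job<T> = { id: string; data: T; attempts: number };

async function runWithDlq<T>(
  job: Job<T>,
  handler: (data: T) => Promise<void>,
  maxAttempts: number,
  dlq: Job<T>[],
): Promise<void> {
  while (job.attempts < maxAttempts) {
    job.attempts++;
    try {
      await handler(job.data);
      return; // success: ACK, job is done
    } catch {
      // fall through and retry (real systems add exponential backoff here)
    }
  }
  dlq.push(job); // retries exhausted: keep the job for inspection and replay
}
```

Replay after a bug fix is then just resetting `attempts` and re-enqueueing the parked job.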
Priority queues: Separate queues for urgent vs background work. "Reset password" email is more urgent than "weekly digest." Separate worker pools per priority.
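The dispatch order itself can be sketched as "most urgent first"; lower number means higher priority, matching BullMQ's `priority` job option. The job shape here is illustrative.

```typescript
// Sketch: drain pending jobs most-urgent-first (lower number = higher priority).
type PrioritizedJob = { name: string; priority: number };

function drainByPriority(pending: PrioritizedJob[]): string[] {
  return [...pending]
    .sort((a, b) => a.priority - b.priority)
    .map((j) => j.name);
}
```

In practice you would still run separate queues and worker pools per tier, so a flood of weekly digests can never starve password resets behind it.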
Rate-limited queues: Some APIs (Stripe, SendGrid) have rate limits. Use a queue with a rate limiter to smooth dispatch: "max 100 jobs/second for this Stripe queue."
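The smoothing can be sketched as a fixed-window limiter; BullMQ exposes the same idea through the worker's `limiter: { max, duration }` option. This is an illustrative sketch, with the clock and sleep functions injected so it stays testable.

```typescript
// Sketch: allow at most `max` dispatches per `windowMs`, sleeping
// until the next window once the budget is used up.
class WindowLimiter {
  private windowStart = 0;
  private used = 0;

  constructor(private max: number, private windowMs: number) {}

  async acquire(now: () => number, sleep: (ms: number) => Promise<void>): Promise<void> {
    const t = now();
    if (t - this.windowStart >= this.windowMs) {
      // A fresh window has started: reset the budget
      this.windowStart = t;
      this.used = 0;
    }
    if (this.used >= this.max) {
      // Budget exhausted: wait out the rest of the window
      await sleep(this.windowStart + this.windowMs - t);
      this.windowStart = now();
      this.used = 0;
    }
    this.used++;
  }
}
```

A worker would call `acquire` before each Stripe or SendGrid request; bursts of queued jobs then drain at a steady rate instead of tripping the provider's limit.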
Always configure a DLQ
Without a DLQ, failed jobs are silently discarded. A missing order confirmation or webhook is a support ticket. DLQs give you a paper trail and a recovery path.
How this might come up in interviews
Async architecture questions probe your grasp of decoupling, reliability, and consistency trade-offs.
Common questions:
- How would you design an email notification system sending 1M emails/day?
- What is the difference between a message queue and an event stream?
- How do you handle duplicate message processing?
- Explain the Outbox Pattern.
A strong candidate:
- Immediately moves email sending to a queue in any design
- Knows the difference between Kafka and SQS
- Understands idempotency and can design idempotent consumers
- Mentions outbox pattern for guaranteed delivery
Red flags:
- Sends emails synchronously in the request handler
- Does not know what a DLQ is
- Relies on exactly-once delivery without designing for idempotency
Quick check · Data Processing at Scale: Queues, Streams & Async Architecture
Three services (analytics, notifications, inventory) need to react to "order placed." Best pattern?
Key takeaways
- Move all non-critical work out of the request path — queue it
- Message queues for point-to-point; event streams for multi-consumer fanout
- Build idempotent consumers — cheaper than exactly-once delivery
- Outbox Pattern guarantees eventual event publication even if the message broker is temporarily down
- Dead Letter Queues are mandatory — always have a recovery path for failed jobs
From the books
Designing Data-Intensive Applications — Martin Kleppmann (2017)
Chapter 11: Stream Processing
Event logs are the foundational data structure of distributed systems — they provide total ordering of events, enable replay, and allow multiple consumers to independently process the same stream.