Kunal Dubey

Evidence over noise. Enter the terrain.

Projects. Blogs. Research.

Mountain 01

Projects That Compound

The strongest peak. Flagship systems, growth arcs, execution quality, and measurable outcomes.

Atlas Checkout Engine

Resilient commerce infra that protects conversion under peak load.

Rebuilt checkout from a synchronous, bottleneck-prone flow into an event-first workflow with deterministic retry semantics and observable failure handling.

Monthly GMV: $42M+
Success Rate: +11.8%
P95 Latency: -47%

v1

Lifted checkout out of a monolith into isolated services with a zero-downtime migration.

v2

Introduced event choreography and recovery paths across payment providers.

scale

Automated surge controls and back-pressure for predictable throughput.

TypeScript, Kafka, Postgres, Redis, Kubernetes

Pulse Reliability Cloud

Operational confidence through deep reliability automation.

Built a reliability platform that unifies telemetry, runbooks, and anomaly response for product and platform teams.

MTTR: -58%
Incidents: -31%
Confidence: +23 pts

v1

Centralized logs and metrics with clear service ownership mapping.

v2

Shipped SLO burn-rate alerting with guided remediation paths.

scale

Added fleet-wide policy checks and auto-triage for recurring incidents.

Go, ClickHouse, OpenTelemetry, Grafana, Terraform

Northstar Growth OS

Product experimentation that turns insights into repeatable growth loops.

Designed a growth analytics and experimentation backbone to move roadmap choices from intuition to evidence.

Activation: +18.2%
Velocity: 3.4x
Revenue Lift: +14%

v1

Unified event taxonomy and experiment tracking across web and mobile funnels.

v2

Introduced progressive rollout orchestration with guardrail metrics.

scale

Operationalized experiment governance with automated result summaries.

Next.js, Node.js, BigQuery, Python, Airflow

Mountain 02

Writing The Build Logic

The editorial ridge. Public engineering thinking that explains why the systems work.
Architecture · 8 min

Designing APIs For Change, Not For Day One

How to structure service boundaries so product velocity remains high as domain complexity grows.

Reliability · 11 min

SLOs That Actually Change Team Behavior

A practical path to turn reliability targets into engineering decisions that reduce firefighting.

Delivery · 9 min

Feature Flags As Product Infrastructure

Patterns for rollout safety, experiment quality, and fast rollback without breaking trust.

Data · 7 min

From Dashboards To Decisions

Converting analytics noise into high-leverage product and platform choices.

Mountain 03

Research Frontier

The summit lab. Active investigation, paper queue, and forward-looking systems work.

Adaptive Runtime Guardrails

Active

Policy-driven systems can self-adjust service behavior during anomalies while preserving user-critical flows.

Memory-Efficient Event Replay

In Exploration

Stream snapshots plus deterministic replay can reduce infra cost without sacrificing recovery guarantees.

Human-AI Incident Collaboration

Drafting

Structured AI copilots can accelerate diagnosis while keeping the final decision loop human-owned.

Paper Queue

Paper 01: Runtime Guardrails Under Burst Load

Status: Drafting experiments

Paper 02: Low-Cost Replay Pipelines For Event Stores

Status: Dataset design

Paper 03: Human-AI Incident Command Interfaces

Status: Literature mapping