Writing
Notes on building software
Honest takes on shipping products, indie hacking, and the realities of the tech industry. No fluff.
Vercel Zero: The Programming Language Built So AI Agents Can Read, Repair, And Ship Native Code
Vercel Labs just dropped a new systems language called Zero whose compiler speaks JSON, whose effects live in your function signatures, and whose binaries weigh less than ten kilobytes. The pitch is simple: a language where an AI agent can read a compiler error, ask for a typed fix, and ship a native program without a human in the loop. Here is what Zero actually is, what it is not, and whether the agent-first compiler is a clever bet or a Vercel side project you can safely ignore.
Background Jobs For Indie Developers in 2026: When You Need A Queue, When You Do Not, And What I Actually Use
Every job queue tutorial is written for companies running ten thousand jobs a second. As a solo developer you do not need Sidekiq Pro and a Kubernetes cluster to send a welcome email. Here is the actual background job setup that earns its place for indie projects in 2026, and the day a Stripe webhook taught me why setTimeout was never going to be enough.
Rate Limiting Your SaaS API in 2026: The AI Scraper Problem, Token Buckets, and the Layered Defense That Actually Works
A single AI agent scraped one of my endpoints twenty-three thousand times in a night and turned a $40 OpenAI budget into a $312 invoice before I woke up. Most rate limiting tutorials are written for traffic that pretends to be polite. Here is what actually defending a SaaS API looks like in 2026, with the AI bot wave already through the door.
Claude's June 15 Pricing Split: What Indie Devs Actually Need to Do Before the Meter Starts
On June 15, 2026, Anthropic splits Claude subscriptions into two pools. Interactive chat stays the same. Anything programmatic (Agent SDK, claude -p, Claude Code GitHub Actions) gets metered in dollars at full API rates. Here is what that actually costs, who wins, who loses, and exactly what to change in your setup before the meter flips on.
Feature Flags For Solo Developers in 2026: When You Need Them, When You Do Not, And What I Actually Use
Every feature flag tool is pitched at companies with a hundred engineers. As a solo developer you do not need a $200-a-month LaunchDarkly seat to ship safely. Here is the actual feature flag setup that earns its place for small teams and indie projects in 2026, and the moment you finally outgrow a config file.
Zero-Downtime Postgres Migrations: The Mistakes That Locked My Production Database
A single ALTER TABLE on a 40-million-row table can freeze your app for forty minutes. Most migration tutorials skip the part where the database is also serving live traffic. Here is what shipping schema changes to a real production Postgres in 2026 actually looks like, including the operations I now refuse to run during business hours.
Server-Sent Events vs WebSockets in 2026: When Each One Actually Wins
WebSockets get reached for by reflex. Half the time the right answer is the boring one nobody talks about: Server-Sent Events. Here is the actual decision framework for real-time features in 2026, and the cost both choices hide from you.
Stripe Webhooks in Production: Idempotency, Retries, and the Mistakes That Cost Me Real Money
Stripe webhooks look like a five-minute integration in the docs. Then a customer is double-charged, a subscription event arrives out of order, your handler 500s for an hour, and Stripe quietly retries the same event 47 times. Here is what shipping webhooks to real billing flows actually looks like in 2026.
Passkeys in Production: What I Wish I Knew Before Replacing Passwords
Passkeys look simple in the WebAuthn demo. They get strange the moment you handle a user with two laptops, a stolen phone, a Bitwarden subscription, and a corporate device that blocks iCloud Keychain. Here is what shipping passkeys to real users actually looks like in 2026.
TypeScript at Scale: Why Your tsc Takes 90 Seconds and How to Fix It
Your build is slow. Your editor lags when you hover a type. CI spends more time type-checking than running tests. None of this is unavoidable. Most of the cost is a small number of patterns that are easy to write and expensive to compile. Here is how to find them and what to do.
Anthropic and SpaceX: What the Colossus Deal Actually Means for Developers
Claude Code rate limits doubled overnight. The reason is a 220,000-GPU data center in Memphis that SpaceX built and Anthropic just rented, from the same Elon Musk who was calling Anthropic evil three months ago. Here is what this deal actually means for developers building with Claude in 2026.
JavaScript Async Lifetimes: The Leak You Have and Probably Do Not Know About
Promise.all does not cancel sibling tasks when one fails. Your async code is likely leaking database connections, keeping fetches alive after unmount, and holding ports open through process exits. ES2026 finally gives you the primitives to fix this without a library. Here is how.
Embedding Models And Reranking In Production 2026: Picking The Pair That Actually Lifts Retrieval Quality
The embedding model decides what your retriever can find. The reranker decides what makes it to the LLM. By 2026 the production patterns for picking and pairing these two have stabilized, and most teams are still leaving real recall on the table because they treat embeddings as a commodity and skip reranking entirely. Here is what actually works, and what to stop doing.
RAG Chunking Strategies In Production 2026: What Actually Survives Real Documents And Real Queries
Most RAG systems do not fail at the LLM. They fail at the chunker. By 2026 the patterns for splitting documents into retrievable units have matured into a small set of choices that consistently outperform the default 512-token slicer everybody starts with. Here is what those choices are, where each one breaks, and how to pick the right one without rebuilding the index every Friday.
AI Guardrails And Output Validation In Production 2026: What Actually Catches Bad Outputs Before Users Do
Most teams discover their guardrails are missing the moment a screenshot of their AI saying something stupid hits the timeline. By 2026 the patterns for catching bad LLM outputs before they ship to users have settled into something concrete: layered validators, fast cheap checks first, expensive ones only when needed, and a clear policy for what to do when validation fails. Here is what that looks like in real systems.
Small Language Models In Production 2026: Where SLMs Beat Frontier Models, And Where They Quietly Fail
The 8B-parameter model that runs on a single GPU is good enough for more of your pipeline than you think, and worse than you think for the parts you keep wanting to give it. By 2026 the production patterns for using small language models alongside frontier ones have settled into a clear shape: route by task, not by vibe, and stop paying for capabilities you are not using. Here is how that actually plays out.
Designing Tools For AI Agents In 2026: Schemas, Descriptions, And The Pitfalls That Make LLMs Fail Silently
The bug in most agents is not the model. It is the tools you handed it. Vague descriptions, overlapping responsibilities, and schemas that look fine on paper produce agents that confidently call the wrong function with the wrong arguments. Here is how to design tools the model can actually use, drawn from the production patterns that have stabilized by 2026.
Multi-Modal AI Agents In Production: Vision, Audio, And The Glue That Actually Works In 2026
Shipping a multi-modal agent is not the same as adding an image input to your chat. The teams running real vision and audio agents in production by 2026 have discovered the same set of sharp edges: tokenization surprises, latency that explodes on the second modality, evaluation that needs new shapes, and cost curves that look nothing like text. Here is what that actually looks like once it is in front of users.
AI Agent Reliability Engineering in 2026: SLOs, Error Budgets, And Failure Modes That Actually Matter
Treating an AI agent like a normal service is how you get a 95 percent uptime number that hides a 60 percent task success rate. The teams running real agent products in 2026 measure reliability differently, set SLOs on outcomes instead of HTTP codes, and have rehearsed every failure mode the agent introduces. Here is what that looks like.
Pricing AI Features in 2026: How To Charge For LLM-Backed Products Without Bleeding Margins
Flat subscriptions on AI features are how indie products go bankrupt in 2026. The teams shipping profitable AI products price for variance, charge close to the unit of value, and pass usage volatility through to the customer in a way that does not feel hostile. Here is how to actually do that.