1. The Landscape
~60 YC-funded developer tools startups, across the batches listed below. The overwhelming trend: AI agents everywhere. Nearly every W2026 and F2025 company is building some form of AI agent for developers. They cluster into 10 categories:
| Category | Count | Key Companies |
|---|---|---|
| AI Coding Agents / IDEs | 11 | Emdash, 1code, Compyle, Inspector, Clad Labs, Scott AI, Sourcebot, Lotas, Omnara, Sparkles, JSX Tool |
| AI Agent Infrastructure | 9 | Terminal Use, Salus, Lemma, Hyperspell, Kernel, Dedalus Labs, Compresr, Captain, Specific |
| AI Agent Monitoring / Observability | 7 | Sentrial, Lucent, The Context Company, Ashr, ZeroEval, Sonarly, Deeptrace |
| AI DevOps / SRE Agents | 5 | Mendral, IncidentFox, Corelayer, Rovr, Canary |
| No-Code / Low-Code Platforms | 4 | Jinba, Fastshot, Dari, Fastgen |
| Data Infrastructure | 4 | s2.dev, ParadeDB, Moss, Vellum |
| Testing / QA | 4 | Lark, Canary, Mobot, Parea |
| Cloud / Deployment | 5 | Porter, Beam, Ploomber, Chamber, Cedana |
| API Tools | 4 | Parse, Algolia, Rutter, Axle |
| Niche / Vertical | 7+ | RunAnywhere, VOYGR, Maven, Synthetic Sciences, Golpo, Fluidize, Brainboard |
Key observation: The W2026 batch is overwhelmingly AI agent companies. Of the ~23 W2026 companies listed as "developer tools," nearly all are building AI agents that do something developers used to do manually. The space is absurdly crowded, and most of these will die.
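The counts in the category table can be tallied to check the "~60" headline and to see how much of the list sits in agent-related categories. This is a minimal sketch, not part of the original analysis: the numbers are copied from the table, "7+" is treated as 7, and the `"Agent" in name` test is only an illustrative way to pick out the agent rows. Canary appears in both the DevOps and Testing rows, so the sum counts it twice.

```python
# Category counts copied from the table in this section.
category_counts = {
    "AI Coding Agents / IDEs": 11,
    "AI Agent Infrastructure": 9,
    "AI Agent Monitoring / Observability": 7,
    "AI DevOps / SRE Agents": 5,
    "No-Code / Low-Code Platforms": 4,
    "Data Infrastructure": 4,
    "Testing / QA": 4,
    "Cloud / Deployment": 5,
    "API Tools": 4,
    "Niche / Vertical": 7,  # the table says "7+", so this is a lower bound
}

# Total listings across all categories (double-counts companies in two rows).
total = sum(category_counts.values())

# Assumption: a category is "agent-related" if its name contains "Agent".
agent_total = sum(
    count for name, count in category_counts.items() if "Agent" in name
)
agent_share = agent_total / total

print(f"total listings: {total}")
print(f"agent-category listings: {agent_total} ({agent_share:.0%})")
```

With the table's numbers, the four agent-named categories account for 32 of the 60 listings, before counting agent companies that sit in other rows.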
2. Full Company List
W2026 Batch (23 companies)
- Ashr
  - AI agent testing platform generating user journeys to ensure accuracy and quality.
- IncidentFox
  - AI SRE agents that learn customer systems and operate like in-house engineers.
- Canary
  - AI QA engineer reading source code to catch broken user flows before production.
- Terminal Use
  - Orchestration platform for background agents with CLI-first design.
- Emdash
  - Agent-first development environment running multiple coding agents in parallel. Open source.
- Synthetic Sciences
  - Infrastructure for AI co-scientists executing full research loops with agent swarms.
- Maven
  - API enabling AI agents to collect phone payments with PCI compliance.
- Salus
  - Runtime API inspecting and blocking incorrect agent actions with correction feedback.
- Lucent
  - Monitors user sessions automatically, alerting teams to bugs and UX issues.
- Sentrial
  - Monitoring platform for AI products detecting loops, hallucinations, and user frustration.
- VOYGR
  - Place intelligence platform combining mapping data with web context for AI agents.
- Mendral
  - AI DevOps engineer handling CI failures, flaky tests, and code reviews autonomously.
- Compresr
  - API compressing LLM context without losing critical information for agents and RAG.
- Jinba
  - Enterprise teams build AI workflows through plain language without coding.
- Sparkles
  - Non-technical team members modify front-end styling via GitHub PRs. Like Lovable for existing projects.
- Chamber
  - Agentic platform orchestrating and optimizing AI infrastructure on GPUs.
- Corelayer
  - On-call AI agents for data-heavy industries with secure production data access.
- RunAnywhere
  - SDK and control plane for on-device model deployment across diverse devices.
- Sonarly
  - Triages production alerts: deduping, prioritizing, attaching missing context.
- 1code (21st.dev)
  - Control panel for running Claude Code instances in parallel with task automation.
- Captain
  - Retrieval engine for unstructured data achieving 95% accuracy with citations.
- Specific (F2025)
  - Lets coding agents write code and build infrastructure to deploy it.
F2025 Batch (18 companies)
- JSX Tool
  - Browser extension locating React components for LLM-assisted visual development.
- Jarmin
  - 24/7 AI ML engineer handling full initiatives and tasks via conversation.
- Inspector
  - AI IDE for front-end connecting browser and codebase for visual element editing.
- Scott AI
  - Agentic workspace: teams align on spec before any codegen runs.
- Lemma
  - AI agents continuously improve from user feedback and production outcomes.
- Rovr
  - Engineering operations automation: intake, triage, routing via AI front desk.
- Parse
  - Build APIs for any website by reverse-engineering network requests. 10-100x faster.
- Moss
  - Real-time semantic search runtime, sub-10ms retrieval for voice agents and copilots.
- Clad Labs
  - Free codegen from workflow patterns directly within IDE.
- Hyperspell
  - Memory layer for AI agents connecting Slack, Gmail, Notion for recall and reasoning.
- Compyle
  - Coding agent that asks before it acts. Engineer stays in the driver's seat.
- Fastshot
  - No-code mobile development: idea to App Store in minutes via AI chat.
- The Context Company
  - Analyzes AI agent conversations to surface patterns and silent failures.
- Sourcebot
  - Open-source code understanding platform for massive codebases.
- Dari
  - Automates web workflows with natural language + deterministic playback fallback.
- s2.dev
  - Serverless datastore for real-time streaming data. "Kafka + S3 had a baby."
- Deeptrace
  - AI agents investigating and resolving production alerts end-to-end.
- Specific
  - Coding agents write code and build infrastructure for deployment.
S2025 Batch (8 companies)
- Dedalus Labs
  - Vercel for AI Agents: hosting MCP servers with autoscaling and orchestration.
- Lark
  - Write end-to-end tests in plain English. No test code to maintain.
- ZeroEval
  - Self-improving AI agents using calibrated judges for 10x faster optimization.
- Lotas
  - AI coding assistant (Rao) for RStudio, targeting 5M data scientists using R.
- Fluidize
  - Accelerates R&D through AI-driven simulation and experiment automation.
- Omnara
  - Mobile extension for coding agents: hand off from desktop to phone seamlessly.
- Kernel
  - Open-source infra for AI agents to access the internet. Sub-150ms cold starts.
- Golpo
  - AI video generator converting documents into whiteboard explainers and podcast audio.
X2025 – W2025 Batch (notable companies)
- Better Auth (X2025)
  - Comprehensive authentication framework for TypeScript. Open source.
- StarSling (X2025)
  - Agentic developer homepage automating deployments, performance monitoring, incidents, and bugs.
- Onlook (W2025)
  - Open-source visual editor for code merging design, development, and AI.
- TensorPool (W2025)
  - CLI making ML model training effortless with GPU orchestration.
- Truffle AI (W2025)
  - Integrates AI agents into applications by exposing agents as simple APIs.
S2024 – W2024 Batch (notable companies)
- Haystack (S2024)
  - Code review platform: breaks PRs into digestible chunks on an infinite canvas.
- Random Labs (Slate) (S2024)
  - Coding agent designed to work with you for long hours on hard problems.
- Lytix (W2024)
  - Datadog for LLMs: custom evaluations, cost optimization, monitoring.
- OneGrep (W2024)
  - DevOps agent automating manual tasks like runbook execution during incidents.
- atopile (W2024)
  - Language for designing electronic circuit boards with code. Hardware meets software.
Older Batches (established companies)
- GitLab (W2015)
  - Single application for the entire DevOps lifecycle. Public company.
- Algolia (W2014)
  - Search & Discovery API. 50B+ queries/month.
- Amplitude (W2012)
  - Digital analytics platform: product analytics, experimentation, session replay. Public company.
- Porter (S2020)
  - Deploy and scale apps on AWS, Azure, GCP. Heroku alternative.
- Beam (W2022)
  - Cloud platform for AI inference, sandboxes, and agent deployment.
- ParadeDB (S2023)
  - Open-source search and analytics on Postgres. ACID-compliant.
- Vellum (W2023)
  - Prompt management, AI workflow orchestration, and LLM observability.
- Parea (S2023)
  - Developer platform for debugging and monitoring LLM applications.
- authzed (W2021)
  - Scalable authorization infrastructure. Open source (SpiceDB).
- Rutter (S2019)
  - Unified API connecting B2B fintech to accounting and e-commerce systems.
- Mobot (W2019)
  - QA-as-a-service using mechanical robots for physical mobile device testing.
3. Category Clusters
AI Coding Agents / IDEs (11 companies — most crowded)
Emdash, 1code, Compyle, Inspector, Clad Labs, Scott AI, Sourcebot, Lotas, Omnara, Sparkles, JSX Tool
Everyone is building "AI that writes code for you." These companies compete directly with Cursor, Windsurf, GitHub Copilot, and Claude Code itself, products backed by billions in funding. The differentiation attempts are thin: "asks before acting" (Compyle), "for R" (Lotas), "on mobile" (Omnara), "for React" (JSX Tool). This is the most dangerous category to enter as a bootstrapper.
AI Agent Infrastructure (9 companies — the picks-and-shovels play)
Terminal Use, Salus, Lemma, Hyperspell, Kernel, Dedalus Labs, Compresr, Captain, Specific
This is the "infrastructure for the AI agent wave": orchestration, memory, runtime safety, hosting (MCP servers), context compression, retrieval. It is the classic picks-and-shovels play: if agents win, infrastructure wins regardless of which agent. But it requires deep technical moats and enterprise sales.
AI Agent Monitoring / Observability (7 companies)
Sentrial, Lucent, The Context Company, Ashr, ZeroEval, Sonarly, Deeptrace
"Datadog for AI agents." These tools monitor loops, hallucinations, user frustration, and production alerts. The problem is real: AI products are unreliable and companies need visibility. But with 7 companies the category is already crowded, and incumbents like Datadog, Sentry, and New Relic will add AI monitoring features.
AI DevOps / SRE Agents (5 companies)
Mendral, IncidentFox, Corelayer, Rovr, Canary
AI that handles CI failures, incident response, code reviews, and QA. The value proposition is strong because DevOps is painful and expensive. But the category is enterprise-heavy, requires deep system integrations, and trust is hard to build for anything that touches production.
No-Code / Low-Code Platforms (4 companies)
Jinba, Fastshot, Dari, Fastgen
"Build apps/workflows without code," now with AI. Fastshot targets mobile apps, Jinba targets enterprise AI workflows, Dari automates web workflows, and Fastgen is a low-code API builder. This is a proven category (Bubble, Retool, Zapier), and the AI layer makes the old players vulnerable.
Data Infrastructure (4 companies)
s2.dev, ParadeDB, Moss, Vellum
Databases, streaming, search, prompt management. ParadeDB (Postgres-based search) and s2.dev (serverless streaming) are genuinely interesting. Data infra is a proven market but requires deep technical execution.
Testing / QA (4 companies)
Lark, Canary, Mobot, Parea
From "tests in plain English" (Lark) to physical robot testing (Mobot) to LLM evaluation (Parea). Testing is a proven, boring, always-needed category. Companies pay for it. Every team hates writing tests. This is interesting.
Cloud / Deployment (5 companies)
Porter, Beam, Ploomber, Chamber, Cedana
Heroku alternatives, AI inference platforms, GPU optimization. Porter is the most established (Heroku for AWS/Azure/GCP). Chamber and Cedana focus on GPU workload optimization — timely but infrastructure-heavy.
API Tools (4 companies)
Parse, Algolia, Rutter, Axle
Parse is the interesting newcomer: reverse-engineer any website into an API. Algolia is the giant (50B+ queries/month). Rutter and Axle are unified APIs for specific verticals. API tools are a proven bootstrappable category: RapidAPI and Postman both started small.
Niche / Vertical (7+ companies)
RunAnywhere (edge AI), VOYGR (location data), Maven (payments for agents), Synthetic Sciences (AI research), Golpo (video generation), Fluidize (R&D simulation), Brainboard (IaC), atopile (PCB design)
Highly specialized. Some of these are genuinely unique: atopile (circuit board design with code) stands out, as does Mobot (robot QA, listed under Testing / QA), for having physical-world moats that pure software companies cannot replicate.
4. Applying the DHH / Jason Fried Filter
The rules:
5. Categories to Skip
| Category | Why Skip |
|---|---|
| AI Coding Agents / IDEs | 11 YC companies + Cursor + Windsurf + GitHub Copilot + Claude Code. You are competing with tens of billions in combined funding. Skip. |
| AI Agent Infrastructure | Requires deep technical moats, enterprise sales, and the bet that "agents" as a category will stabilize (not guaranteed). Too early and too complex. Skip. |
| AI Agent Monitoring | 7 startups + Datadog/Sentry/New Relic will add features. Enterprise-heavy sales cycles. Skip. |
| AI DevOps / SRE | Touching production requires extreme trust. Long enterprise sales cycles. Companies won't let an unknown bootstrapped tool auto-fix their CI. Skip. |
| Cloud / Deployment / GPUs | Infrastructure-heavy, razor-thin margins, 24/7 on-call. You are competing with AWS, GCP, and well-funded startups like Railway. Skip. |
| Data Infrastructure | Requires deep database engineering and long adoption cycles. Not a solo-founder play. Skip. |
6. What Survives the Filter
Option 1: Testing / QA Tools — the #1 pick
Why: Everyone hates writing tests. Every company needs them. Lark (plain English tests) and Canary (AI QA) prove YC sees demand. Incumbents like Cypress and Playwright are powerful but complex. QA Wolf charges $4k+/month for QA-as-a-service.
The DHH play:
- A simple, hosted testing tool: write tests in plain English, run against your staging URL.
- $100–300/month per team. Self-serve signup.
- No AI agent hype. Just "your tests, written and maintained for you."
- Alternatively: QA-as-a-service (productized). You run the tests, they get a report. $500–2k/month.
Option 2: API Tools (scraping / reverse-engineering)
Why: Parse (reverse-engineer websites into APIs) is a fascinating concept. ScrapingBee, ScraperAPI, and Bright Data prove people pay real money for web data extraction. The market is proven and fragmented.
The DHH play:
- A dead simple scraping/data extraction API. Pay per request.
- $50–200/month tiers. Self-serve, no sales team.
- Focus on one niche: e-commerce data, job listings, real estate, etc.
- Use AI to make extraction smarter than regex-based competitors.
Option 3: No-Code Mobile Apps (productized)
Why: Fastshot is trying to do "idea to App Store in minutes." Glide and Adalo proved the no-code mobile market. Now AI makes it actually feasible to generate real apps, not just toy prototypes.
The DHH play:
- Not a platform. A productized service: "I'll build your mobile app in 1 week for $2–5k."
- Use AI (Claude Code, Cursor) to deliver in a fraction of the time clients expect.
- Target small businesses who need an app but can't afford agencies ($50k+).
- Sell on LinkedIn. Deliver fast. Repeat.
Option 4: Plain English Workflow Automation
Why: Dari (natural language web workflows) and Jinba (AI workflows for enterprise) are tapping into the Zapier/Make.com market. Zapier does $200M+ ARR. The AI layer could make automations 10x easier to set up.
The DHH play:
- "Zapier but you just describe what you want in English."
- $30–100/month. Self-serve.
- Start with a few high-value integrations (Slack + Google Sheets + email).
- Don't try to be a platform. Be the simple option.
Option 5: Auth-as-a-Service (open source)
Why: Better Auth (X2025) and authzed (W2021) prove auth is a perennial need. Clerk is growing fast. Auth0 was acquired by Okta for $6.5B. Every app needs auth. Most developers hate implementing it.
The DHH play:
- A managed auth service simpler and cheaper than Clerk/Auth0.
- Free tier + $25–100/month for teams.
- Open-source core for community adoption, hosted version for revenue.
- Target indie hackers and small teams priced out of Clerk.
7. The Street-Smart Verdict
| Option | SNOLOC | TTFP | Recurring? | Bootstrap Score |
|---|---|---|---|---|
| Testing / QA tool | Medium | Weeks | Monthly | |
| Mobile app service | ~0 (service) | Days | Per gig | |
| Auth-as-a-Service | Medium | Weeks | Monthly | |
| API / scraping tools | Medium | Weeks | Monthly | |
| Workflow automation | High | Months | Monthly | |