1. The Framework: What Makes a COSS Idea Worth Building
Not every open source project becomes a company. Most die as GitHub repos with 400 stars and no revenue. The ones that work share a specific structure. Here's what to screen for:
| Criterion | What to Look For | Why It Matters |
|---|---|---|
| Incumbent overreach | The closed-source leader is overpriced, over-engineered, or VC-bloated | Price arbitrage is the fastest wedge. "Self-host for free" lands when the alternative is $2K/month. |
| Data sensitivity | Users have legal, contractual, or paranoia reasons to not send data to a third party | Privacy is not a feature, it's a forcing function. Healthcare, legal, finance, government all have this. |
| Services entry point | The thing is complex enough that someone will pay for implementation, integration, or training | Services buys you 6-18 months of runway while cloud matures. It's not glamorous but it works. |
| Data flywheel | Usage generates proprietary data that makes the product smarter over time | This is how COSS becomes an AI company. The OSS version captures data; the cloud version uses it to train models that the OSS version can't replicate alone. |
| Single-developer buildable | V1 can ship in 60-90 days by one person | Bootstrapped means no runway for a 6-person team. The MVP must be achievable alone. |
| Community pull | The category already has angry Reddit threads, GitHub issues, or Hacker News complaints | You don't want to create demand. You want to absorb existing frustration. |
Every idea below passes all six criteria. Let's go through them.
2. Idea 1: Open Source LLM Evaluation Platform
The Gap
Every company building on top of LLMs faces the same problem: they can't tell when output quality is regressing. A deployment goes out, a prompt gets tweaked, a model version shifts, and nobody notices until a customer complains. The tooling to systematically evaluate LLM output quality, track regressions, and run automated test suites is immature, closed-source, or expensive.
Existing players: Braintrust (VC-backed, $2K+/month at scale), Humanloop (VC-backed, enterprise pricing), LangSmith (tied to LangChain, vendor lock-in concern), Promptfoo (open source, CLI-only, no UI, no team features, no cloud). The gap is a proper OSS platform with a UI, team features, and a cloud option.
| Attribute | Details |
|---|---|
| Name idea | Evalforge, Promptwatch, Evals.run, Critera |
| License | Apache 2.0 core; cloud features proprietary |
| V1 scope | Dataset management, prompt versioning, automated test runs against LLM outputs, pass/fail scoring with human and LLM-as-judge, diff view between versions, Slack alerts on regression |
| Stack | Go or Python backend, React UI, SQLite for self-host, Postgres for cloud, Docker one-liner install |
| Time to V1 | 60-75 days for a senior developer working solo |
| Target user | AI engineer or ML engineer at a 10-200 person startup building on GPT-4 / Claude / Gemini |
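The pass/fail scoring loop from the V1 scope is small enough to sketch. A minimal Python sketch, assuming a pluggable model call and judge; the token-overlap judge below is a stand-in for a real LLM-as-judge API call, and names like `EvalCase` and `run_suite` are illustrative, not from any existing tool:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    prompt: str
    expected: str  # reference answer or rubric text

@dataclass
class EvalResult:
    prompt: str
    output: str
    score: float
    passed: bool

def run_suite(cases: List[EvalCase],
              generate: Callable[[str], str],
              judge: Callable[[str, str, str], float],
              threshold: float = 0.7) -> List[EvalResult]:
    """Run each case through the model and score the output with a judge."""
    results = []
    for case in cases:
        output = generate(case.prompt)
        score = judge(case.prompt, output, case.expected)
        results.append(EvalResult(case.prompt, output, score, score >= threshold))
    return results

def overlap_judge(prompt: str, output: str, expected: str) -> float:
    # Token-overlap stand-in; a real deployment would call an
    # LLM-as-judge with a rubric prompt here.
    out, exp = set(output.lower().split()), set(expected.lower().split())
    return len(out & exp) / max(len(exp), 1)

# Demo with a stubbed model.
results = run_suite([EvalCase("What is 2+2?", "the answer is 4")],
                    lambda p: "the answer is 4", overlap_judge)
print(all(r.passed for r in results))  # True
```

The regression-diff and Slack-alert features layer on top of exactly this result structure: persist `EvalResult` rows per prompt version, then compare pass rates between versions.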
Why It Doesn't Exist Yet
Promptfoo is close but intentionally CLI-first. LangSmith is tied to LangChain. Every VC-backed alternative has pricing that prices out the indie developer and small team. Nobody has shipped the "Plausible Analytics of LLM evaluation" -- clean, self-hosted, $19/month cloud, MIT/Apache core.
Services Layer
Offer evaluation consulting: come in, audit a company's prompt pipeline, build their eval dataset, set up the tool, configure CI integration. $5K-$15K per engagement. This pays immediately. Enterprises will pay $25K for a week of "LLM quality audit + setup." This isn't hypothetical -- the demand is visible on Upwork, LinkedIn, and Slack communities already.
AI Company Evolution
Here's where it gets interesting. Every eval run generates labeled data: prompt, output, score, human feedback. After 6 months of cloud users running evals, you have a dataset of hundreds of thousands of prompt/output/quality pairs across dozens of LLMs and use cases. You can train a proprietary quality scoring model that's faster and cheaper than GPT-4-as-judge. Then you sell that model as an API. That's a product nobody else has -- because they don't have the data flywheel.
| Timeline | Milestone | Revenue |
|---|---|---|
| Month 1-3 | Build V1, Show HN, first GitHub stars | $0 |
| Month 4-6 | First consulting engagement, 5-10 cloud customers | $2K-$8K MRR |
| Month 7-12 | Cloud at $29/$99/$299/month, 50-100 paying cloud teams | $5K-$15K MRR |
| Year 2 | Proprietary scoring model, API product, enterprise tier | $30K-$80K MRR |
| Year 3 | AI company story is complete: platform + model + data | $100K+ MRR |
Bootstrappability score: 9/10. AI ceiling: 9/10. Defensibility: 8/10.
3. Idea 2: Open Source Contract Intelligence
The Gap
Lawyers and operations teams spend enormous time on contract review: reading NDAs, MSAs, vendor agreements, employment contracts. The closed-source tools -- Ironclad, Juro, Lexion, Luminance -- are expensive and aimed at General Counsel at large enterprises. The SMB and startup market is completely unserved. Nobody is self-hosting contract intelligence because nobody has built it.
What this tool does: upload a contract PDF, extract key clauses (payment terms, termination rights, liability caps, IP ownership, non-compete scope), flag risky language, compare against templates, track expiry dates. Self-hosted so legal data never leaves the network. This is not AI as a gimmick -- AI is the core function.
| Attribute | Details |
|---|---|
| Name idea | Clausekit, Contractforge, Lexibase, Paperwork |
| License | AGPL core (anyone offering it as a hosted service must open source their changes, which channels cloud demand back to you); proprietary add-ons |
| V1 scope | PDF/DOCX upload, clause extraction via LLM, key date tracking, red flag detection, simple dashboard, contract repository with search |
| Stack | Python backend (FastAPI), React UI, PostgreSQL, pgvector for semantic search, Docker, optional Ollama integration for fully air-gapped self-hosting |
| Time to V1 | 75-90 days |
| Target user | Operations manager or in-house paralegal at a 20-200 person company without a full legal team |
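The clause extraction and red-flag detection from the V1 scope can be prototyped before any LLM is wired in. A hedged sketch: the regex rules below are keyword stand-ins for the LLM extraction step, and the schema fields are illustrative, not from any shipping product:

```python
import re
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContractFindings:
    # Hypothetical finding schema for three of the V1 clause types.
    payment_terms: List[str] = field(default_factory=list)
    liability_caps: List[str] = field(default_factory=list)
    red_flags: List[str] = field(default_factory=list)

# Keyword heuristics standing in for LLM extraction.
PATTERNS = {
    "payment_terms": re.compile(r"net\s+\d+\s+days", re.I),
    "liability_caps": re.compile(r"liab\w*\s+.*?\$[\d,]+", re.I),
}
RED_FLAGS = re.compile(r"(auto-?renew|perpetual|unlimited liability)", re.I)

def extract(text: str) -> ContractFindings:
    findings = ContractFindings()
    # Rough sentence split; production would parse structure from the PDF.
    for sentence in re.split(r"(?<=[.;])\s+", text):
        for key, pattern in PATTERNS.items():
            if pattern.search(sentence):
                getattr(findings, key).append(sentence.strip())
        if RED_FLAGS.search(sentence):
            findings.red_flags.append(sentence.strip())
    return findings
```

The production version swaps each regex for an LLM call that returns structured JSON per clause type; the schema and repository layer stay the same, which is what makes the LLM piece replaceable with a local Ollama model for air-gapped installs.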
Why It Doesn't Exist Yet
The OSS world has document management (Paperless-ngx, Documenso) but nothing specifically for contract intelligence. The legal tech market is dominated by VC-backed players targeting large law firms and Fortune 500 legal teams. The startup/SMB market is served by exactly nobody with a credible self-hosted option. AGPL licensing is a smart choice here: it means anyone running it as a service must open source their modifications, which keeps the ecosystem honest and channels cloud demand back to you.
Services Layer
Two service plays. First: contract migration -- help a company move from scattered Drive folders into the platform ($3K-$10K per engagement). Second: custom clause detection -- train custom extraction models for a specific industry or contract type. A private equity firm might pay $20K for a tuned model that catches specific covenants in their deal flow. This is genuinely not available anywhere else.
AI Company Evolution
The data flywheel here is slow but rich. Every extracted clause, every flagged risk, every human correction on a false positive becomes training data for a proprietary legal language model fine-tuned on real contract language across industries. After 12 months of cloud usage, you have something no academic dataset provides: real-world contract intelligence signals with human validation. Fine-tune a model on this, offer it via API, and you have a legal AI infrastructure product that law firms will pay serious money for. Think "the Stripe of legal AI" -- infrastructure, not application.
| Timeline | Milestone | Revenue |
|---|---|---|
| Month 1-3 | V1 ship, Hacker News, ProductHunt, legal tech subreddits | $0 |
| Month 4-6 | First consulting engagements, 5-15 cloud customers | $3K-$12K MRR |
| Month 7-12 | Cloud tiers at $49/$149/$499, industry-specific clause packs | $10K-$25K MRR |
| Year 2 | Enterprise self-hosted license ($5K/year), law firm partnerships | $30K-$70K MRR |
| Year 3 | Proprietary legal LLM fine-tune, API product, VC-ready if desired | $80K-$200K MRR |
Bootstrappability score: 7/10. AI ceiling: 10/10. Defensibility: 9/10.
4. Idea 3: Open Source Competitive Intelligence Engine
The Gap
Competitive intelligence is a multi-billion dollar market served by Crayon, Klue, Kompyte, and Semrush. All closed-source. All expensive ($500-$2,000+/month for anything useful). All aimed at enterprise marketing teams. No serious open source alternative exists. The self-hosted category is completely empty.
What this tool does: monitor competitor websites for changes (pricing pages, features, job postings, blog posts), track their social media activity, monitor review sites (G2, Capterra, Trustpilot), aggregate news mentions, and synthesize everything into weekly intelligence digests. The AI layer turns raw change detection into meaning: "Competitor X just added a feature you don't have" or "Their pricing changed -- here's what it means for your positioning."
| Attribute | Details |
|---|---|
| Name idea | Watchpost, Rivalkit, Kompas, Trackeye |
| License | Apache 2.0 core; AI analysis features proprietary cloud-only |
| V1 scope | URL change monitoring, diff rendering, Slack/email alerts, basic scraping pipeline, competitor profile pages, manual notes, search across history |
| Stack | Go backend (good for concurrent scraping), React UI, PostgreSQL, Redis for job queue, Playwright for JS-heavy pages, Docker |
| Time to V1 | 45-60 days (change detection is technically simpler than LLM eval or contract parsing) |
| Target user | Product manager or head of marketing at a B2B SaaS company with 3+ direct competitors |
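Change detection plus diff-based alerting is the technically simple core the table alludes to. A standard-library sketch in Python (the table suggests Go for the production scraper; the logic is identical), with a normalized fingerprint so whitespace churn doesn't fire false alerts:

```python
import difflib
import hashlib

def fingerprint(text: str) -> str:
    # Hash whitespace-normalized text so reformatting alone
    # doesn't register as a change.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def diff_lines(old: str, new: str) -> list:
    """Return only added/removed lines -- the payload for a Slack alert."""
    return [line for line in difflib.unified_diff(
                old.splitlines(), new.splitlines(), lineterm="")
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]

old = "Pro plan: $49/month\nTeam plan: $99/month"
new = "Pro plan: $59/month\nTeam plan: $99/month"
if fingerprint(old) != fingerprint(new):
    for change in diff_lines(old, new):
        print(change)
```

The AI layer then takes these raw diff lines as input and produces the "here's what it means" synthesis; the detection layer stays dumb and reliable.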
Why It Doesn't Exist Yet
The technical components exist separately: Changedetection.io handles URL monitoring (self-hosted, open source), but it has no competitive intelligence framing, no AI synthesis, no team features, no battlecard generation. Nobody has assembled these pieces into a product positioned specifically for competitive teams. The closest thing is a DIY stack of Changedetection.io + Zapier + ChatGPT, which is what sophisticated users actually do today. That's a strong signal that the category is real.
Services Layer
Competitive intelligence setup as a service: come in, set up the platform, define the competitor list, configure monitoring rules, build the first round of battlecards. $5K-$20K per engagement. There's also a retainer play: ongoing monthly competitive briefings as a managed service, $2K-$5K/month. This is the "agency mode" version of the product -- sell the output (insights), not just the tool.
AI Company Evolution
The flywheel: every competitor profile, every change event, every battlecard that a human validates becomes training data for competitive intelligence reasoning. After enough data, you can fine-tune a model that reads a competitor website and generates positioning recommendations without human prompting. That model, wrapped in an API, becomes a product for sales teams: "paste a competitor URL, get a battlecard." Gong sells conversation intelligence. This is web intelligence. Same category, different data source.
| Timeline | Milestone | Revenue |
|---|---|---|
| Month 1-2 | V1 ship (fastest of all six ideas to build) | $0 |
| Month 3-5 | First consulting clients, ProductHunt launch | $2K-$8K MRR |
| Month 6-12 | Cloud at $39/$99/$299/month, 30-80 paying teams | $5K-$20K MRR |
| Year 2 | AI battlecard generation, enterprise tier, agency partnerships | $25K-$60K MRR |
| Year 3 | Web intelligence API, sales enablement integrations | $60K-$150K MRR |
Bootstrappability score: 10/10. AI ceiling: 8/10. Defensibility: 7/10.
5. Idea 4: Open Source Developer Documentation AI
The Gap
Every developer tool company struggles with documentation. Writing docs is slow. Keeping docs in sync with code is slower. Search on most documentation sites is terrible. AI-powered documentation platforms like Mintlify (VC-backed), GitBook (VC-backed), and ReadMe (VC-backed) all charge $150-$400+/month at team scale. Nobody has built a serious self-hosted, open source alternative with AI-native features.
What this tool does: git-connected documentation platform, Markdown/MDX support, AI-powered search over docs (semantic, not keyword), AI-generated first drafts from code comments and function signatures, dead link detection, version-aware docs, custom domains, and a feedback widget that surfaces which pages have the most confused readers. The AI keeps docs fresh by detecting when code changes and prompting the maintainer to update the relevant doc page.
| Attribute | Details |
|---|---|
| Name idea | Docforge, Foliobase, Leafdocs, Vellum |
| License | MIT core; AI features and team management proprietary cloud |
| V1 scope | Git sync, Markdown rendering, custom domains, semantic search, basic analytics (page views, search queries), feedback widget |
| Stack | Go or Node.js backend, React UI, PostgreSQL, pgvector, S3-compatible storage, Docker |
| Time to V1 | 60-80 days |
| Target user | Developer relations engineer or technical writer at an API-first company or open source project maintainer who can't afford Mintlify |
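Semantic search is the V1 feature most worth de-risking early. A toy sketch using term-frequency cosine similarity as a stand-in; the real stack would embed pages, store vectors in pgvector, and rank by vector distance, but the query-ranking shape is the same:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query: str, pages: dict, top_k: int = 3) -> list:
    """Rank doc pages (slug -> body text) against a query."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(body.lower().split())), slug)
              for slug, body in pages.items()]
    return [slug for score, slug in sorted(scored, reverse=True)[:top_k]
            if score > 0]
```

Queries that return an empty list here are exactly the "failed search" signal the feedback flywheel later feeds on.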
Why It Doesn't Exist Yet
The documentation tools that exist are old (GitBook pre-2020 was great, now it's enterprise-focused), too simple (plain Docusaurus/MkDocs with no AI), or VC-funded and expensive (Mintlify). The open source world has Docusaurus and MkDocs, both solid but completely static -- no AI, no team features, no analytics. Nobody has layered AI on top of an open source docs platform. The Docusaurus team won't do it because it's out of scope for a Meta-backed project. Mintlify won't open source because that's their moat. The gap is real.
Services Layer
Documentation audit and setup as a service: come in, migrate existing docs from Confluence/Notion/GitBook, set up the platform, structure the information architecture, write the first round of missing docs. This is a $10K-$40K consulting engagement for a mid-size developer tool company. Developer relations as a service is a growing category -- bolt this onto it. Write the docs, own the tooling.
AI Company Evolution
The flywheel: every search query that fails to return a good result, every piece of user feedback on a doc page, every "this page helped" or "this page confused me" signal becomes training data. You build a model that understands developer documentation quality. Then you sell a Documentation Quality API: paste a doc page, get a quality score, readability score, completeness score, and suggested improvements. Enterprise DevRel teams will pay for this as part of their developer experience stack. Stripe, Twilio, Vercel -- every major API company cares deeply about documentation quality and has no tool to measure it.
| Timeline | Milestone | Revenue |
|---|---|---|
| Month 1-3 | V1 with git sync + AI search, launch on HN and DevRel communities | $0 |
| Month 4-6 | Open source project migrations as proof points, first cloud customers | $1K-$5K MRR |
| Month 7-12 | Cloud at $29/$79/$249/month, code-to-doc AI draft feature | $5K-$15K MRR |
| Year 2 | Enterprise tier, documentation quality scoring product | $20K-$50K MRR |
| Year 3 | Documentation Quality API, DevRel platform category ownership | $50K-$120K MRR |
Bootstrappability score: 8/10. AI ceiling: 7/10. Defensibility: 7/10.
6. Idea 5: Open Source Support Intelligence
The Gap
Support is the most data-rich function in any company and the least analyzed. Every support ticket is a signal about product quality, documentation gaps, onboarding failures, and feature demand. Nobody synthesizes this signal systematically. The tools that exist -- Intercom, Zendesk, Freshdesk -- are expensive, closed-source, and data-extractive. The AI layer on top (Intercom Fin, Zendesk AI) is opt-in, expensive, and completely opaque.
What this tool does: self-hosted help desk with email, chat, and in-app widget support, plus an intelligence layer that categorizes tickets automatically, detects trends (what's the most common complaint this week vs. last week), links tickets to product features, and generates weekly digests for product teams. Not just a help desk -- a support intelligence platform.
| Attribute | Details |
|---|---|
| Name idea | Helpforge, Signaldesk, Ticketml, Helix |
| License | AGPL core; AI intelligence features and integrations proprietary |
| V1 scope | Email-based ticketing, basic chat widget, auto-categorization via LLM, weekly trend digest, Slack integration, knowledge base editor |
| Stack | Go backend (excellent for email/SMTP handling), React UI, PostgreSQL, Redis, Docker, IMAP/SMTP integration |
| Time to V1 | 75-100 days (email handling adds complexity) |
| Target user | Founder or head of support at a 5-50 person SaaS company currently managing support via Gmail labels or a $500+/month Intercom plan |
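Auto-categorization plus the weekly trend digest can be mocked end to end before touching an LLM. A sketch where keyword rules stand in for the LLM categorizer; the categories and keywords are illustrative:

```python
from collections import Counter
from typing import List

# Keyword routing as a stand-in for the LLM categorization step.
RULES = {
    "billing": ("invoice", "charge", "refund", "card"),
    "auth": ("login", "password", "2fa", "sso"),
    "bug": ("error", "crash", "broken", "500"),
}

def categorize(subject: str) -> str:
    text = subject.lower()
    for category, keywords in RULES.items():
        if any(k in text for k in keywords):
            return category
    return "other"

def trend_digest(this_week: List[str], last_week: List[str]) -> dict:
    """Week-over-week ticket delta per category -- the weekly digest core."""
    now = Counter(categorize(t) for t in this_week)
    then = Counter(categorize(t) for t in last_week)
    return {cat: now[cat] - then[cat] for cat in now | then}
```

Swap `categorize` for an LLM call and the digest, Slack integration, and dashboards on top of it don't change. That's the "intelligence layer over a ticketing system" positioning in miniature.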
Why It Doesn't Exist Yet
Help desk OSS exists (Chatwoot, FreeScout, Zammad) but the intelligence layer is absent from all of them. They are ticketing systems, not insight systems. The shift from "support tool" to "support intelligence" is the positioning that nobody has claimed. Chatwoot has 22K GitHub stars and almost no revenue -- because it's positioned as a cheaper Intercom, not as something new. The opportunity is to build what Chatwoot should have been.
Services Layer
Support audit consulting: come in, analyze a company's support backlog (import from Intercom/Zendesk), identify the top 10 recurring issues, write knowledge base articles for each, set up the platform. $5K-$15K per engagement. There's also ongoing managed support for companies too small to hire a support team: you handle their tickets using the tool. This is a retainer business: $2K-$4K/month per client.
AI Company Evolution
The data flywheel: every categorized ticket, every resolved conversation, every knowledge base article that deflected a ticket becomes training data for a support-specific model. After 12 months of cloud users, you have a multi-tenant support intelligence dataset across dozens of product categories. Fine-tune a model on this. The output is a Support Intelligence API: send a ticket, get a category, suggested resolution, relevant knowledge base articles, and predicted resolution time. Sell this to Zendesk add-on marketplace, HubSpot ecosystem, Shopify app store. Infrastructure play, not application play.
| Timeline | Milestone | Revenue |
|---|---|---|
| Month 1-3 | V1, ProductHunt, Hacker News, subreddits (r/CustomerSuccess, r/startups) | $0 |
| Month 4-6 | First migrations from Intercom refugees, consulting engagements | $3K-$10K MRR |
| Month 7-12 | Cloud at $39/$99/$299/month, knowledge base deflection metrics feature | $8K-$20K MRR |
| Year 2 | AI auto-resolution product, enterprise tier, marketplace integrations | $25K-$60K MRR |
| Year 3 | Support Intelligence API, Zendesk/HubSpot plugin ecosystem | $60K-$150K MRR |
Bootstrappability score: 8/10. AI ceiling: 9/10. Defensibility: 8/10.
7. Idea 6: Open Source Financial Planning and Analysis
The Gap
FP&A (Financial Planning and Analysis) is the category where finance teams model revenue, expenses, headcount, and scenarios. The tools: Anaplan ($50K+/year enterprise), Pigment (VC-backed, $2K+/month), Mosaic (VC-backed, $2K+/month), Causal (acquired by Lucanet). For startups and SMBs, the alternative is a complex Google Sheet or Excel model maintained by the CFO. Nobody has a self-hosted, open source FP&A tool.
What this tool does: visual financial model builder (no-code, spreadsheet-like but structured), scenario planning (best/base/worst case), actuals vs. plan comparison once connected to accounting (QuickBooks, Xero, Stripe), headcount planning, and SaaS metrics dashboard (MRR, churn, LTV, CAC). The AI layer generates narrative explanations of what the numbers mean and flags anomalies before the CFO notices them.
| Attribute | Details |
|---|---|
| Name idea | Foreplan, Scenariq, Planbase, Mosswort |
| License | Apache 2.0 core; AI narrative, integrations, and team collaboration proprietary |
| V1 scope | Spreadsheet-like model editor with formula support, scenario branching, basic SaaS metric templates (ARR bridge, headcount plan, burn rate), CSV import/export, basic charting |
| Stack | TypeScript backend, React UI with spreadsheet-like grid (AG Grid or similar), PostgreSQL, Docker |
| Time to V1 | 90-120 days (most complex UI of all six ideas) |
| Target user | Seed-to-Series A startup CFO or finance lead who lives in Google Sheets and can't justify $2K/month for Mosaic |
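Scenario branching is mostly bookkeeping once the model structure exists. A minimal sketch of best/base/worst MRR projection; the growth and churn figures are illustrative assumptions, not benchmarks:

```python
# Per-scenario monthly assumptions (illustrative numbers only).
SCENARIOS = {
    "best":  {"growth": 0.12, "churn": 0.02},
    "base":  {"growth": 0.08, "churn": 0.03},
    "worst": {"growth": 0.04, "churn": 0.05},
}

def project_mrr(starting_mrr: float, months: int,
                growth: float, churn: float) -> list:
    """Project MRR month by month under net growth = growth - churn."""
    series = [starting_mrr]
    for _ in range(months):
        series.append(series[-1] * (1 + growth - churn))
    return series

def run_scenarios(starting_mrr: float, months: int) -> dict:
    return {name: round(project_mrr(starting_mrr, months, **params)[-1])
            for name, params in SCENARIOS.items()}

print(run_scenarios(10_000, 12))
```

The hard part of the product isn't this arithmetic; it's the spreadsheet-like grid UI that lets a finance lead edit these assumptions cell by cell, which is exactly the engineering barrier described below.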
Why It Doesn't Exist Yet
FP&A is genuinely hard to build. The spreadsheet-like grid UI is a known engineering challenge. This is why nobody has done it open source -- it requires a competent frontend developer who also understands financial modeling. That's a rare combination. But that rarity is the moat. Once built, the defensibility is high: financial models are sticky (nobody migrates their 3-year forecast mid-year), and the enterprise compliance requirements (SOC 2, data residency) push companies toward self-hosted options.
Services Layer
Financial model building as a service. Startups desperately need a 3-year model for their Series A deck. Hire a fractional CFO or finance consultant who builds the model in the tool. $5K-$20K per engagement. Ongoing fractional CFO service using the platform: $3K-$8K/month retainer. This is the cleanest services play of all six ideas because the market for fractional CFO services already exists and is large -- the tool just makes it more scalable.
AI Company Evolution
The flywheel: every financial model structure, every scenario assumption, every actual-vs-plan variance, every SaaS metric pattern becomes training data. After 12 months of cloud usage across 50+ startups, you have a dataset of how startups actually model their business -- what assumptions they use for churn, growth, hiring. A model trained on this can generate first-draft financial models from a simple description ("Series A SaaS, $500K ARR, B2B, 10 employees"). That's a product CFOs will pay for. A financial model generator that produces a realistic, industry-benchmarked model in 5 minutes instead of 5 days. Nobody has built this. The data to build it doesn't exist as a public dataset.
| Timeline | Milestone | Revenue |
|---|---|---|
| Month 1-4 | V1 with core model editor and SaaS templates | $0 |
| Month 5-8 | First consulting engagements (model building), HN launch | $5K-$15K MRR |
| Month 9-14 | Cloud at $49/$149/$499/month, accounting integrations | $10K-$25K MRR |
| Year 2 | AI narrative generation, anomaly detection, enterprise tier | $30K-$70K MRR |
| Year 3 | AI model generator product, VC ecosystem distribution | $80K-$200K MRR |
Bootstrappability score: 6/10. AI ceiling: 10/10. Defensibility: 10/10.
8. Side-by-Side Comparison
| Idea | Build Time | Services Revenue Speed | Cloud MRR at Year 1 | AI Ceiling | Defensibility | Bootstrap Score |
|---|---|---|---|---|---|---|
| LLM Eval Platform | 60-75 days | Fast (4-6 months) | $5K-$15K | 9/10 | 8/10 | 9/10 |
| Contract Intelligence | 75-90 days | Fast (3-5 months) | $10K-$25K | 10/10 | 9/10 | 7/10 |
| Competitive Intel Engine | 45-60 days | Medium (5-7 months) | $5K-$20K | 8/10 | 7/10 | 10/10 |
| Dev Docs AI | 60-80 days | Medium (6-8 months) | $5K-$15K | 7/10 | 7/10 | 8/10 |
| Support Intelligence | 75-100 days | Fast (4-6 months) | $8K-$20K | 9/10 | 8/10 | 8/10 |
| OSS FP&A | 90-120 days | Very Fast (2-4 months) | $10K-$25K | 10/10 | 10/10 | 6/10 |
9. Universal Bootstrap Path (Works for All Six)
The playbook is the same regardless of which idea you pick. It's a five-phase sequence:
- Phase 1: Build in public (Day 1-30). Tweet/post weekly updates. "Building an open source X." This builds an audience before you ship. It also validates demand: if nobody engages, reconsider. The OSS community rewards transparency.
- Phase 2: Ship the OSS core (Day 31-90). Get a working V1 on GitHub. MIT or Apache 2.0. No sign-up required, no waiting list, no fake landing page. Just working software with a clear README and a Docker one-liner.
- Phase 3: Show HN (Day 75-90). Time the Show HN for when V1 is genuinely functional. A bad Show HN (buggy software, missing docs) is worse than no Show HN. A good Show HN can get 500-2,000 GitHub stars in 48 hours and 10-50 real users.
- Phase 4: Services before cloud (Month 3-6). Before building the managed cloud product, do 2-3 paid consulting engagements. This generates $10K-$30K cash, validates real use cases, and shows you what the cloud product must prioritize.
- Phase 5: Cloud launch (Month 6-9). Launch the managed cloud version at simple, transparent pricing. Your cloud customers at this point are either consulting clients who want managed (conversion rate: 30-50%) or Show HN users who want convenience.
10. The Services Layer: Revenue Before Cloud Is Ready
Every COSS founder underestimates services. It feels like a distraction from "building the product." It isn't. A services layer does three things that a pure cloud launch can't:
| Benefit | Why It Matters | Example |
|---|---|---|
| Immediate cash | $10K in consulting buys 3-4 months of runway. Cloud might take 12 months to reach the same MRR. | Two contract intelligence implementations at $8K each = $16K in month 4. Cloud at month 4 might be $1K MRR. |
| Real use case validation | You sit inside a real organization's workflow. You see every integration point, every edge case, every missing feature. You can't simulate this. | Your first LLM eval client tells you they need GitLab CI integration, not just GitHub Actions. Now you build the right thing. |
| Reference customers | A paying consulting client is a reference. "Company X uses this" is worth more than 1,000 GitHub stars for enterprise conversion. | Your first FP&A consulting client is a Series A SaaS. Naming them in your cloud launch page converts other Series A founders. |
Price services at $150-$300/hour or $5K-$20K per engagement. Don't undersell. Cheap services signal cheap software. If you can charge $10K for a contract intelligence setup, the product is worth $500/month to that client. The price anchors the cloud pricing in their mind.
11. The AI Flywheel: How Each Idea Compounds
The difference between "a good open source tool with a cloud product" and "an AI company" is the data flywheel. Here's the general mechanism and how it applies:
| Stage | What's Happening | Timeline |
|---|---|---|
| OSS users generate data | Self-hosted users run the software. Their usage patterns, error rates, and feature adoption signal what matters. Some send telemetry (opt-in). | Month 1-6 |
| Cloud users generate richer data | Cloud users generate labeled, structured data with human feedback loops. Every evaluation score, every ticket categorization, every contract clause correction is signal. | Month 6-18 |
| Dataset becomes proprietary | After 12-18 months, the multi-tenant dataset is an asset no competitor can replicate quickly. It took real-world usage to produce it. No public dataset captures it. | Month 12-24 |
| Fine-tune a domain-specific model | Use the dataset to fine-tune a foundation model (Llama 3, Mistral, Qwen) on your specific task. The resulting model outperforms GPT-4 on your narrow domain at a fraction of the inference cost. | Month 18-30 |
| The model becomes a product | Offer the model via API. Enterprises pay for a model that's accurate in their domain, runs on their infrastructure, and doesn't send data to OpenAI. This is the AI company story. | Month 24-36 |
The key insight: the OSS version is not a loss leader. It's the data acquisition mechanism. Every self-hosted install is a potential cloud customer, and every cloud customer is a data contributor to the model. The more accurate the model gets, the more valuable the cloud product is compared to the OSS version. That's the moat.
12. Verdict: Which One to Build
If you're a developer-founder who wants to bootstrap solo, the ranking is:
1. LLM Evaluation Platform. The market is exploding right now. Every company building on LLMs needs this. The build is achievable in 60 days. The data flywheel (prompt/output/quality pairs) is uniquely valuable. Timing is the best of all six ideas -- in 18 months, this category will be crowded. Today it isn't.
2. Contract Intelligence. Highest AI ceiling and best defensibility. Legal data is the most sensitive data that exists -- the privacy wedge is strong. Services revenue is fast. The only reason it's second is slightly longer build time and slower initial community formation (legal tech is less active on GitHub than developer tools).
3. Competitive Intelligence Engine. Fastest to build (45-60 days), most immediate market fit, easiest to talk about. The risk: a lower technical barrier means more copycats once you show it works. Build this if you want to move fast and establish first-mover advantage.
4. Support Intelligence. Existing OSS (Chatwoot, FreeScout) means you can fork and add the intelligence layer instead of building from scratch. That's a legitimate shortcut. Position as "Chatwoot with a brain."
5. Developer Documentation AI. Real market, real gap, but slower revenue. The buyer (DevRel engineer) often doesn't control budget. Great for an open source maintainer who wants to scratch their own itch.
6. OSS FP&A. Highest long-term ceiling but hardest to build and hardest to sell without a finance background. Only build this if you have financial modeling expertise or a co-founder who does.
One more thing. None of these require VC. The services layer funds the build. The cloud converts the OSS user base. The AI flywheel builds the moat. You could take any of these from $0 to $50K MRR without a single investor meeting. At $50K MRR with a proprietary dataset and a fine-tuned model, you have a fundable AI company if you want one. Or you keep it bootstrapped. Both are valid outcomes.
The window for COSS-to-AI-company plays like these is probably 24-36 months before the category gets crowded or a VC-backed player ships OSS to block you. Build now.