Vale is an open-source, command-line prose linter written in Go. It applies code-like linting to written content: enforcing style guides, catching grammar issues, and ensuring terminological consistency. It runs entirely offline, processes Markdown, HTML, AsciiDoc, reStructuredText, and more, and ships as a single binary. With 5,300+ GitHub stars and 3M+ downloads, it is widely used by documentation teams, including those at Datadog, GitLab, Grafana, and Elastic.

Core thesis: Vale and LLMs are complementary, not competing. Vale provides fast, deterministic, auditable rule enforcement (no em dashes, no passive voice, consistent terminology). LLMs provide contextual understanding, nuanced rewriting, and tone assessment. Combined in an actor/critic pipeline, they produce writing that is both stylistically consistent and genuinely good. This report details how to set up that pipeline for documentation and blog articles.

1. What Vale Is and Why It Matters

Vale is a standalone Go binary that lints prose the way ESLint lints JavaScript. You define rules in YAML files, point Vale at your content, and it returns structured feedback — warnings, errors, and suggestions — scoped to headings, paragraphs, sentences, or any other markup element.

Key properties

Markup-aware: Vale understands Markdown, HTML, AsciiDoc, reStructuredText, Org mode, DITA, and MDX. It skips code blocks, avoids false positives from markup syntax, and can scope rules to specific elements (e.g., only check headings for title case).
Deterministic: Every rule produces the same output for the same input, every time. No API calls, no model drift, no hallucinations. This makes Vale auditable and suitable for CI/CD enforcement.
Offline and private: Content never leaves the local machine. Critical for proprietary documentation, internal playbooks, and anything you wouldn't paste into ChatGPT.
Fast: Benchmarks show Vale outperforms textlint, RedPen, write-good, and proselint. It can lint hundreds of files in seconds.
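No runtime dependencies: Single binary. No Node.js, Python, or Java needed for Markdown and HTML. Download, configure, run. (Some formats rely on an external converter, such as Asciidoctor for AsciiDoc or docutils for reStructuredText.)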

Who uses Vale

Organization	Use case
Datadog	Oxford comma enforcement, jargon flagging, temporal word detection, abbreviation substitution across all documentation
GitLab	Extensive documentation testing integrated into CI pipeline
Grafana	Centralized style (Writers' Toolkit) applied across multiple repositories
Elastic	Product/feature name capitalization and spelling enforcement
Linode/Akamai	Documentation style consistency

2. The Rule System: 11 Check Types

Every Vale rule is a YAML file that extends one of 11 built-in check types. Each rule requires an extends field and a message field. Optional fields include level (suggestion, warning, error), scope (heading, paragraph, sentence, etc.), and link (URL to documentation).

Vale's 11 check types
Check type	Purpose	Example use
`existence`	Flag tokens that appear in content	Ban em dashes, flag weasel words like "arguably" or "basically"
`substitution`	Map observed terms to preferred replacements	Replace "utilize" with "use", "leverage" with "use"
`occurrence`	Enforce min/max counts of a token within a scope	No more than 3 commas per sentence
`repetition`	Detect repeated tokens	Catch "the the" even across markup boundaries
`consistency`	Ensure only one of two competing terms appears per document	"Color" vs. "colour", "advisor" vs. "adviser"
`conditional`	If pattern A exists, pattern B must also exist	Every acronym (e.g., "API") must be defined on first use
`capitalization`	Enforce case styles	Headings must use title case (AP or Chicago style)
`metric`	Evaluate mathematical formulas on document-level variables	Flesch-Kincaid readability score must be below 8.0
`sequence`	Grammar rules using POS tagging	Flag split infinitives, dangling modifiers
`spelling`	Hunspell-compatible dictionary spell checking	American English with custom technical dictionary
`script`	Arbitrary logic via Tengo scripting language	Count paragraphs per section, enforce custom metrics

Vale's regex engine supports lookahead and lookbehind ((?=re), (?!re), (?<=re), (?<!re)), making complex pattern matching possible without scripting.
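To make the lookaround syntax concrete, here is a small sketch that uses Python's `re` module as a stand-in. That is an assumption for illustration only: Vale uses its own Go regex engine, and edge cases may differ. The two patterns and the sample sentences are hypothetical.

```python
import re

# Negative lookbehind: flag "utiliz..." unless it directly follows "CPU ".
UTILIZE = re.compile(r"(?<!CPU )\butiliz\w+", re.IGNORECASE)

# Positive lookahead: flag "click" only when " here" comes next.
CLICK_HERE = re.compile(r"\bclick(?= here\b)", re.IGNORECASE)


def matches(pattern, text):
    """Return every substring of `text` that `pattern` would flag."""
    return [m.group(0) for m in pattern.finditer(text)]


print(matches(UTILIZE, "We utilize caching."))       # flagged
print(matches(UTILIZE, "CPU utilization spiked."))   # allowed by the lookbehind
print(matches(CLICK_HERE, "Click here to start."))   # flagged
print(matches(CLICK_HERE, "Click the button."))      # allowed by the lookahead
```

The lookaround decides whether a match counts without becoming part of the matched text, which is why only the word itself is reported.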

3. Writing Custom Rules (With Examples)

Rules are YAML files organized into style directories under your StylesPath. Here are practical examples for common writing constraints.

Ban em dashes

NoEmDashes.yml

extends: existence
message: "Don't use em dashes. Use a comma, semicolon, colon, or period instead."
level: error
nonword: true   # em dashes are punctuation, so drop the default word boundaries
tokens:
  - '—'
  - '&mdash;'
  - '\u2014'

Ban passive voice

PassiveVoice.yml

extends: existence
message: "'%s' looks like passive voice. Rewrite in active voice."
level: warning
ignorecase: true
tokens:
  - 'is being \w+ed'
  - 'was \w+ed'
  - 'were \w+ed'
  - 'has been \w+ed'
  - 'have been \w+ed'
  - 'had been \w+ed'
  - 'will be \w+ed'
  - 'being \w+ed'
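This rule is a regex heuristic, not a grammatical analysis, so it has both misses and false positives. The sketch below runs the same token patterns through Python's `re` module to show both. Treat it as an approximation: joining the tokens into one word-bounded, case-insensitive alternation is an assumption about how the rule behaves, and the sample sentences are made up.

```python
import re

TOKENS = [
    r"is being \w+ed", r"was \w+ed", r"were \w+ed", r"has been \w+ed",
    r"have been \w+ed", r"had been \w+ed", r"will be \w+ed", r"being \w+ed",
]

# Assumption: approximate the rule as one case-insensitive alternation
# wrapped in word boundaries.
PASSIVE = re.compile(r"\b(?:" + "|".join(TOKENS) + r")\b", re.IGNORECASE)


def flagged(sentence):
    """Return the first phrase the heuristic would flag, or None."""
    m = PASSIVE.search(sentence)
    return m.group(0) if m else None


print(flagged("The file was deleted by the script."))  # true positive
print(flagged("The report was written yesterday."))    # missed: irregular participle
print(flagged("The build was indeed slow."))           # false positive: "indeed" ends in "ed"
```

For grammar-aware detection, the `sequence` check type with part-of-speech tagging is the more robust route; the regex version is a cheap first approximation.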

Enforce "we" over "I" in documentation

NoFirstPersonSingular.yml

extends: existence
message: "Use 'we' instead of '%s' in documentation."
level: warning
scope: paragraph
tokens:
  - '\bI\b'
  - '\bmy\b'
  - '\bmine\b'
  - '\bmyself\b'

Enforce short sentences

SentenceLength.yml

extends: occurrence
message: "Sentence too long (%s words). Keep sentences under 25 words."
level: warning
scope: sentence
max: 25
token: '\b\w+\b'
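The rule counts matches of the `token` pattern inside each sentence and flags the sentence when the count exceeds `max`. A rough Python equivalent of that counting step, using the same regex, looks like this (a sketch only; Vale's own sentence splitting and tokenization may differ):

```python
import re

WORD = re.compile(r"\b\w+\b")


def word_count(sentence):
    """Count matches of the rule's token pattern in one sentence."""
    return len(WORD.findall(sentence))


def too_long(sentence, max_words=25):
    """True when the sentence has more than `max_words` token matches."""
    return word_count(sentence) > max_words


print(word_count("Vale lints prose the way ESLint lints JavaScript."))
print(word_count("Don't self-censor."))  # apostrophes and hyphens split words
```

Note the second call: under this pattern, "Don't" and "self-censor" each count as two words, so contraction-heavy sentences reach the limit sooner than a human count would suggest.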

No weasel words

Weasel.yml

extends: existence
message: "Remove the weasel word '%s'. Be specific."
level: warning
ignorecase: true
tokens:
  - arguably
  - basically
  - clearly
  - essentially
  - extremely
  - generally
  - in order to
  - it should be noted
  - literally
  - obviously
  - quite
  - simply
  - somewhat
  - very
  - virtually

Consistent terminology

Terminology.yml

extends: substitution
message: "Use '%s' instead of '%s'."
level: error
ignorecase: true
swap:
  e-mail: email
  e mail: email
  repo: repository
  config: configuration
  'open source': open-source
  web site: website
  data base: database
  end point: endpoint

Readability gate

FleschKincaid.yml

extends: metric
message: "Flesch-Kincaid grade level (%s) is too high. Aim for 8.0 or below."
level: warning
formula: |
  (0.39 * (words / sentences)) + (11.8 * (syllables / words)) - 15.59
condition: '> 8.0'
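To get a feel for the numbers, here is the same formula as a small Python function with two worked inputs. The word, sentence, and syllable counts are invented for illustration; Vale computes its own counts from the document.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level, using the same formula as the rule above."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# 100 words in 5 sentences with 150 syllables:
# 0.39 * 20 + 11.8 * 1.5 - 15.59 = 7.8 + 17.7 - 15.59 = 9.91 (fails the gate)
dense = flesch_kincaid_grade(words=100, sentences=5, syllables=150)

# 100 words in 10 sentences with 130 syllables:
# 0.39 * 10 + 11.8 * 1.3 - 15.59 = 3.9 + 15.34 - 15.59 = 3.65 (passes)
plain = flesch_kincaid_grade(words=100, sentences=10, syllables=130)

print(round(dense, 2), dense > 8.0)
print(round(plain, 2), plain > 8.0)
```

Shorter sentences and shorter words both pull the score down, which is why this gate pairs naturally with the sentence-length rule.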

No exclamation or question marks in headings

HeadingPunctuation.yml

extends: existence
message: "Don't use '%s' in headings."
level: error
scope: heading
nonword: true   # punctuation tokens need the default word boundaries removed
tokens:
  - '!'
  - '\?'

4. Configuration and Project Setup

A .vale.ini file at the project root controls everything.

Minimal setup

StylesPath = .vale/styles
MinAlertLevel = suggestion
Packages = Google, write-good
[*.md]
BasedOnStyles = Vale, Google, write-good
[*.html]
BasedOnStyles = Vale, Google

Directory structure

project/
  .vale.ini
  .vale/
    styles/
      config/
        vocabularies/
          MyProject/
            accept.txt    # Approved terms (one per line)
            reject.txt    # Banned terms (one per line)
      MyStyle/
        NoEmDashes.yml
        Weasel.yml
        SentenceLength.yml
      Google/              # Downloaded via `vale sync`
      write-good/          # Downloaded via `vale sync`

Key configuration options

Setting	Purpose
`StylesPath`	Path to all styles, configs, and scripts
`Packages`	Styles to download via `vale sync`
`Vocab`	Vocabulary directories to load
`MinAlertLevel`	Minimum severity to display: `suggestion`, `warning`, `error`
`BasedOnStyles`	Which styles to apply (per file glob)
`TokenIgnores`	Regex patterns for inline content to skip (e.g., LaTeX formulas)
`BlockIgnores`	Regex patterns for block-level content to skip

Vocabulary system

Vocabularies are project-specific term lists. accept.txt contains approved terms (auto-added to exception lists across all active styles and fed into Vale.Terms for casing enforcement). reject.txt contains banned terms (auto-populates Vale.Avoid as errors). Case-sensitive by default; prefix with (?i) for case-insensitive entries.

Running Vale

# Install
brew install vale                                          # macOS
go install github.com/errata-ai/vale/v3/cmd/vale@latest    # Go

# Download packages
vale sync

# Lint files
vale docs/
vale --output=JSON docs/    # Machine-readable output
vale --glob='*.md' .        # Filter by pattern
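The JSON output is what makes Vale scriptable. The sketch below shows one way to call Vale from Python and flatten the report. It assumes the report is an object keyed by file path whose values are lists of alerts with `Line`, `Severity`, and `Check` fields; check that shape against the output of your Vale version. The helper names `run_vale` and `flatten` are hypothetical.

```python
import json
import subprocess


def run_vale(path):
    """Run Vale on `path` and return the parsed JSON report as a dict."""
    # Vale exits non-zero when it reports errors, so check=True is not used.
    proc = subprocess.run(
        ["vale", "--output=JSON", path],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout or "{}")


def flatten(report):
    """Turn {file: [alert, ...]} into sorted (file, line, severity, check) rows."""
    rows = [
        (file, alert["Line"], alert["Severity"], alert["Check"])
        for file, alerts in report.items()
        for alert in alerts
    ]
    return sorted(rows)


# Example with a hand-written report, so it runs without Vale installed.
sample = {
    "docs/intro.md": [
        {"Line": 7, "Severity": "warning", "Check": "MyStyle.Weasel"},
        {"Line": 3, "Severity": "error", "Check": "MyStyle.NoEmDashes"},
    ]
}
for row in flatten(sample):
    print(row)
```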

5. Example Style Guide: Rules for This Site

Here is a concrete set of rules tailored for a personal website with blog articles and documentation. These encode specific editorial preferences as deterministic checks.

Proposed rules for alexisbouchez.com
Rule	Type	Level	Rationale
No em dashes	existence	error	Use commas, semicolons, or periods instead
No exclamation marks	existence	warning	Maintain a calm, measured tone
No weasel words	existence	warning	Be specific rather than hedging
Sentences under 25 words	occurrence	warning	Short sentences are easier to read
Readability below grade 8	metric	warning	Accessible to a broad audience
American English spelling	consistency	error	Pick one and stick with it
No "click here" links	existence	error	Link text should describe the destination
Consistent terminology	substitution	error	"email" not "e-mail", "website" not "web site"
Title case headings	capitalization	warning	Consistent heading style
No repeated words	repetition	error	Catch "the the" typos
Acronyms defined on first use	conditional	warning	Don't assume the reader knows every abbreviation
No passive voice	existence	warning	Active voice is more direct and engaging

Combined with reject.txt:

synergy
synergize
leverage
learnings
deep dive
circle back
move the needle
low-hanging fruit
paradigm shift
disrupt

6. Vale + LLMs: Architecture Patterns

Vale and LLMs have complementary strengths. Vale is deterministic, fast, offline, and auditable. LLMs understand context, nuance, tone, and can rewrite prose. Neither alone is sufficient for great writing. Together, they form a powerful feedback loop.

Pattern 1: Actor/Critic loop

The LLM writes (actor). Vale lints the output (critic). The LLM rewrites based on Vale's structured feedback. Repeat until clean.

1. LLM generates draft
2. Vale lints draft (vale --output=JSON draft.md)
3. If errors exist:
   a. Feed Vale's JSON output back to LLM
   b. LLM rewrites flagged passages
   c. Go to step 2
4. Output clean draft

This is the most practical pattern. Vale's JSON output is machine-readable, so the LLM can parse exactly which lines have issues, what the rule says, and what severity it is. The LLM can then make targeted fixes rather than rewriting the entire document.
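The loop can be sketched in a few lines of Python. Both callables are placeholders, not real APIs: `lint` stands for a wrapper around `vale --output=JSON` that returns a list of alert dicts, and `rewrite` stands for whatever LLM call you use. The round cap is an added assumption so the loop terminates even if the model keeps reintroducing violations.

```python
def refine(draft, lint, rewrite, max_rounds=3):
    """Alternate critic (lint) and actor (rewrite) until no errors remain.

    lint(text)            -> list of alert dicts with a "Severity" key
    rewrite(text, alerts) -> revised text
    Returns (final_text, is_clean).
    """
    for _ in range(max_rounds):
        errors = [a for a in lint(draft) if a["Severity"] == "error"]
        if not errors:
            return draft, True
        # Pass only the flagged issues so the actor can make targeted fixes.
        draft = rewrite(draft, errors)
    # Out of rounds: report whether the last rewrite happened to be clean.
    clean = not any(a["Severity"] == "error" for a in lint(draft))
    return draft, clean


# Stand-ins for Vale and the LLM, so the sketch runs on its own.
def fake_lint(text):
    if "\u2014" in text:
        return [{"Severity": "error", "Check": "MyStyle.NoEmDashes"}]
    return []


def fake_rewrite(text, alerts):
    return text.replace("\u2014", ", ", 1)  # fixes one em dash per round


print(refine("a\u2014b\u2014c", fake_lint, fake_rewrite))
```

Returning the `is_clean` flag alongside the text lets the caller decide what to do when the cap is hit, for example handing the draft to a human instead of looping forever.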

Pattern 2: LLM as Vale interpreter

Vale finds issues. The LLM explains them in context and suggests specific rewrites. This is useful for writers who want to learn from the feedback rather than just accepting automated fixes.

1. Writer drafts content
2. Vale lints and produces structured output
3. LLM receives: original text + Vale output
4. LLM produces: explanation of each issue + suggested rewrite + reasoning
5. Writer reviews and decides
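Step 3 amounts to building a prompt that pairs each alert with the line it refers to. A minimal sketch, assuming alerts carry 1-based `Line`, `Check`, and `Message` fields as in Vale's JSON output; the function name and prompt wording are hypothetical:

```python
def build_explainer_prompt(text, alerts):
    """Pair each Vale alert with its source line for an LLM to explain."""
    lines = text.splitlines()
    parts = ["Explain each style issue below and suggest a rewrite.", ""]
    for alert in alerts:
        number = alert["Line"]
        # Guard against line numbers that fall outside the text.
        source = lines[number - 1].strip() if 0 < number <= len(lines) else ""
        parts.append(f'- Line {number} [{alert["Check"]}]: {alert["Message"]}')
        parts.append(f"  Source: {source}")
    return "\n".join(parts)


text = "First line.\nWe utilize caching."
alerts = [{
    "Line": 2,
    "Check": "MyStyle.Terminology",
    "Message": "Use 'use' instead of 'utilize'.",
}]
print(build_explainer_prompt(text, alerts))
```

Quoting the source line next to the rule message gives the model enough context to explain the issue without having to locate it in the full document.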

Pattern 3: LLM as pre-filter, Vale as enforcer

The LLM does a first pass for tone, structure, and coherence (things Vale cannot check). Vale then enforces the hard rules that the LLM might miss or ignore.

1. Writer drafts content
2. LLM reviews for: logical flow, argument structure, tone, audience fit
3. Writer incorporates LLM feedback
4. Vale enforces: terminology, sentence length, readability, banned patterns
5. Writer fixes Vale errors
6. Ship

Pattern 4: Dual enforcement in CI

Vale runs as a fast, cheap first pass in CI. Only documents with zero Vale errors get sent to an LLM for deeper analysis. This minimizes API costs.

1. PR opened with documentation changes
2. GitHub Action runs Vale (seconds, free)
3. If Vale passes:
   a. Send changed files to LLM API
   b. LLM checks: coherence, accuracy, tone
   c. LLM posts review comments on PR
4. If Vale fails:
   a. Block merge
   b. Author fixes deterministic issues first
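The gating decision in steps 3 and 4 is simple enough to express directly. This sketch assumes a Vale report shaped as `{file: [alert, ...]}` and a list of changed files; the function name and the returned keys are hypothetical.

```python
def plan_review(report, changed_files):
    """Decide whether a PR goes to LLM review or is blocked on Vale errors."""
    blocked = sorted(
        f for f in changed_files
        if any(a["Severity"] == "error" for a in report.get(f, []))
    )
    if blocked:
        # Deterministic issues come first: no LLM call, so no API cost.
        return {"merge": "blocked", "fix_first": blocked, "send_to_llm": []}
    return {
        "merge": "pending-llm-review",
        "fix_first": [],
        "send_to_llm": sorted(changed_files),
    }


report = {"docs/a.md": [{"Severity": "error"}], "docs/b.md": [{"Severity": "warning"}]}
print(plan_review(report, ["docs/a.md", "docs/b.md"]))
print(plan_review({"docs/b.md": [{"Severity": "warning"}]}, ["docs/b.md"]))
```

Warnings do not block here; only alerts at the `error` level do, which matches the `fail_on_error` behavior used in the CI examples.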

7. MCP Integration: Vale as an LLM Tool

The Vale-MCP server exposes Vale to AI coding assistants via the Model Context Protocol. This means an LLM like Claude can call Vale as a tool, lint a document, receive structured results, and provide contextual writing advice informed by Vale's rule-based findings.

How it works

Vale-MCP is a TypeScript/Node.js server that provides three MCP tools:

Tool	Purpose
`vale_status`	Check that Vale is installed and configured
`vale_sync`	Download and install style packages
`check_file`	Lint a file and return structured results

Compatible clients

Claude Desktop
Claude Code (via MCP server config)
Cursor
VS Code with GitHub Copilot
Any MCP-compatible client

This is the most natural integration point for using Vale with LLMs today. The LLM doesn't need to understand Vale's rule syntax; it just calls the tool and interprets the results. The LLM can then explain issues to the user, suggest fixes, or automatically rewrite flagged passages.

Setup

# Requirements: Node.js 22+, Vale 3.0+
# Add to your MCP client configuration:
{
  "vale": {
    "command": "npx",
    "args": ["-y", "@christianchiama/vale-mcp"]
  }
}

8. Valegen: Generating Rules from Natural Language

Valegen is a web application that generates Vale YAML rules from plain English descriptions using RAG (Retrieval-Augmented Generation).

How it works

User describes a desired rule in natural language: "Flag sentences that use passive voice"
Valegen searches a vector database of Vale documentation and real-world rules from Vale core, Google, and Microsoft styles
Retrieved context + user prompt are sent to an LLM (supports Gemini, GPT, and Claude)
The LLM generates three candidate YAML rules with confidence ratings
User picks the best one and saves it to their style directory

Why this matters

Writing Vale rules requires understanding YAML syntax, regex patterns, and Vale's check type system. Valegen lowers that barrier considerably. A technical writer with no programming experience can describe what they want in English and get a candidate rule to review and test. This makes it practical to encode an entire editorial style guide as Vale rules, even if the guide has dozens of preferences.

Example

Natural language input	Generated rule (simplified)
"Don't allow sentences longer than 20 words"	`extends: occurrence`, `scope: sentence`, `max: 20`, `token: '\b\w+\b'`
"Replace 'utilize' with 'use'"	`extends: substitution`, `swap: { utilize: use }`
"No em dashes anywhere"	`extends: existence`, `tokens: ['---', '—']`
"Headings should use sentence case"	`extends: capitalization`, `scope: heading`, `match: $sentence`

9. CI/CD Pipeline Integration

Vale has a first-party GitHub Action (errata-ai/vale-action) used by ~3,700 projects. It runs Vale on pull requests and surfaces results as inline annotations, PR reviews, or GitHub Checks.

Basic GitHub Actions setup

name: Lint Prose
on: pull_request

jobs:
  vale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: errata-ai/vale-action@v2
        with:
          files: docs/
          reporter: github-pr-review
          fail_on_error: true

Vale + LLM in CI (advanced)

name: Writing Quality
on: pull_request

jobs:
  vale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: errata-ai/vale-action@v2
        with:
          files: docs/
          reporter: github-pr-check
          fail_on_error: true

  llm-review:
    needs: vale   # Only runs if Vale passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Get changed docs
        id: changed
        env:
          GH_TOKEN: ${{ github.token }}   # gh needs a token to read the PR
        run: |
          # Escape the dot so only real .md/.html extensions match, and join
          # the names with spaces so the output value stays on one line.
          FILES=$(gh pr diff ${{ github.event.pull_request.number }} --name-only | grep -E '\.(md|html)$' | tr '\n' ' ' || true)
          echo "files=$FILES" >> "$GITHUB_OUTPUT"
      - name: LLM review
        run: |
          # Send changed files to LLM API for tone/coherence review
          # Post results as PR comment

Pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/errata-ai/vale
    rev: v3.12.0
    hooks:
      - id: vale
        args: [--glob, '*.md']

Editor integrations

Editor	Integration
VS Code	"Vale CLI" extension (real-time linting, quick fixes)
Neovim	ALE plugin
JetBrains IDEs	Official Vale CLI plugin
Sublime Text	LSP package
Emacs	flymake-vale
Obsidian	Community plugin
Zed	LSP integration

10. Practical Workflow: End to End

Here is a concrete workflow for writing a blog article or documentation page using Vale and an LLM together.

Step 1: Set up Vale for the project

Install Vale: brew install vale
Create .vale.ini at project root
Create custom style directory with your rules
Run vale sync to download community packages
Test with vale docs/

Step 2: Write the first draft

Write freely. Don't self-censor. Get the ideas down. The tooling will catch style issues later.

Step 3: Run Vale

vale --output=JSON article.md > vale-report.json

Fix all errors (hard rules like terminology, banned patterns). Consider warnings (soft rules like sentence length, readability).

Step 4: LLM review

Send the article to an LLM with a prompt like:

Review this article for:
- Logical flow and argument structure
- Missing context that a reader would need
- Tone consistency
- Places where examples would help

Do NOT change terminology, formatting, or style conventions. Those are handled by our linter.

[article content]

The key instruction is telling the LLM not to touch what Vale already handles. This prevents the LLM from re-introducing banned patterns.

Step 5: Final Vale pass

After incorporating LLM feedback, run Vale again. The LLM may have introduced style violations in its suggestions. Vale catches them deterministically.
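One way to see what the LLM round introduced is to compare the Vale alerts from before and after it. The sketch below counts alerts per rule rather than per line, since line numbers shift when text is rewritten. It assumes each alert has a `Check` field; the function names are hypothetical.

```python
from collections import Counter


def count_by_check(alerts):
    """Count alerts per rule name."""
    return Counter(a["Check"] for a in alerts)


def introduced(before, after):
    """Rules that fire more often after the edit, with the increase."""
    # Counter subtraction keeps only positive differences.
    return dict(count_by_check(after) - count_by_check(before))


before = [{"Check": "MyStyle.Weasel"}, {"Check": "MyStyle.SentenceLength"}]
after = [
    {"Check": "MyStyle.SentenceLength"},
    {"Check": "MyStyle.NoEmDashes"},
    {"Check": "MyStyle.NoEmDashes"},
]
print(introduced(before, after))
```

An empty result means the LLM's suggestions added no new violations, though existing ones may still need fixing.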

Step 6: Publish

If using CI, the PR will be automatically linted. Zero Vale errors required to merge.

11. What Vale Catches vs. What LLMs Catch

Complementary strengths
Concern	Vale	LLM
Banned words/phrases (em dashes, jargon)	Yes (deterministic)	Unreliable (may ignore instructions)
Consistent terminology	Yes (substitution rules)	Unreliable (may use synonyms)
Spelling	Yes (Hunspell dictionaries)	Unreliable
Sentence length	Yes (occurrence check)	Can suggest but not enforce
Readability metrics	Yes (Flesch-Kincaid, etc.)	No (cannot compute reliably)
Heading capitalization	Yes (AP/Chicago style)	Unreliable
Acronym definitions	Yes (conditional check)	Sometimes
Repeated words	Yes (across markup boundaries)	Sometimes
Logical coherence	No	Yes
Argument structure	No	Yes
Tone and voice assessment	No	Yes
Audience appropriateness	No	Yes
Missing context	No	Yes
Factual accuracy	No	Partially (with caveats)
Creative rewriting	No	Yes
Cultural sensitivity	Partial (alex style)	Yes

The takeaway: Vale handles everything that can be expressed as a pattern. LLMs handle everything that requires understanding meaning. Using both means your writing is both mechanically correct and genuinely good.

As Datadog's engineering team put it: Vale's "crisp, computer-understandable rules" are foundational infrastructure that should exist before integrating LLMs. LLMs lack awareness of organization-specific style choices, but Vale encodes those choices precisely.

12. Resources and Further Reading

Vale on GitHub — source code, benchmarks, releases
vale.sh — official documentation
Vale-MCP — MCP server for AI assistant integration
Valegen — generate Vale rules from natural language using LLMs
vale-action — GitHub Actions integration
Write Better with Vale — book by Brian P. Hogan (Pragmatic Programmers, 2025)
Community packages — Google, Microsoft, write-good, proselint, alex styles

Vale + LLMs: Deterministic Prose Linting Meets AI for Better Writing
