Tutorial: CI readiness gate¶

Wire DataLex's red / yellow / green readiness review into your CI so every PR gets a sticky comment with the score, SARIF results in the GitHub Security tab, and a build that fails when the project regresses.

You'll end with:

A .github/workflows/datalex-readiness.yml workflow that posts a PR comment + uploads SARIF
A datalex readiness-gate command you can run locally to reproduce the same scoring
Confidence that the rules in CI match the rules editors run in the Validation drawer (single source of truth, shared packages/readiness_engine Python package)

Time: 5 minutes. Prerequisites:

Python 3.9+ (3.11 recommended in CI for dbt compatibility)
A GitHub repository with Actions enabled
A dbt project with at least a few .yml files in it

Shipped in DataLex 1.4.0. The CLI and the Action share scoring with the api-server's /api/dbt/review endpoint, so the badges in Explorer match what CI sees.

What gets scored¶

Each YAML file gets a per-file score (0-100) and a status (red <60 or any error · yellow <85 or any warning · green ≥85). The project score is the average. Findings come from these categories:

Category	Examples
`metadata`	Missing model description, owner, domain, column descriptions
`dbt_quality`	Missing identity tests, FK tests, contract review nudges
`governance`	Sensitive (PII/PHI/PCI) columns without classification
`import_health`	YAML parse errors, no columns, unknown column types
`modeling`	Fact model without grain, staging materialization mismatch
`enterprise_modeling`	Unclassified YAML, no semantic_models opportunity

Custom rule packs in <project>/.datalex/policies/*.yaml add their own findings on top — see Tutorial: Custom policy packs.

Step 1 — Install the CLI locally and reproduce the score¶

pip install -U 'datalex-cli'
cd ~/path/to/your-dbt-project
datalex readiness-gate --project . --output-json | jq '.summary'

You should see something like:

{
  "total_files": 29,
  "red": 4,
  "yellow": 11,
  "green": 14,
  "findings": 158,
  "errors": 4,
  "warnings": 41,
  "infos": 113,
  "score": 78
}

Inspect the worst-offending files:

datalex readiness-gate --project . --output-json \
  | jq '.files | sort_by(.score) | .[:5] | .[] | {status, score, path}'

Pick a min-score value that's at or just below today's score so the first PR doesn't regress. You can ratchet it up over time as the project improves.

Step 2 — Drop in the GitHub Action¶

Create .github/workflows/datalex-readiness.yml:

name: DataLex readiness
on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read
  issues: write           # sticky PR comment
  pull-requests: write
  security-events: write  # SARIF upload

jobs:
  readiness:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0          # required for --changed-only

      - uses: duckcode-ai/DataLex/actions/datalex-gate@main
        with:
          project-path: .
          min-score: 78           # today's baseline; ratchet up later
          changed-only: true
          base-ref: origin/${{ github.base_ref || 'main' }}

Commit, push, open a PR. Within ~30 seconds the workflow:

Checks out the repo with full history.
Installs datalex-cli from PyPI.
Runs datalex readiness-gate --project . --sarif … --pr-comment ….
Posts a sticky PR comment with the red/yellow/green table.
Uploads SARIF so the GitHub Security tab and PR file annotations light up the offending lines.
Fails the job if any threshold is breached.

Step 3 — Inspect the sticky PR comment¶

The Action posts (or updates) a comment that looks like:

## DataLex readiness 🟢

**Score:** 92 · red 0 · yellow 1 · green 4
**Findings:** 0 error(s), 1 warning(s), 3 info

| Status | File                           | Score | Errors | Warnings |
|--------|--------------------------------|-------|--------|----------|
| 🟡     | `models/marts/fct_revenue.yml` | 85    | 0      | 1        |
| 🟢     | `models/staging/stg_orders.yml`| 100   | 0      | 0        |
| …

The comment is sticky on the marker  so subsequent pushes to the PR update the same comment instead of piling up. Reviewers can ignore the workflow re-runs and just read the latest comment.

Step 4 — Ratchet the threshold¶

Once the team has a few PRs at score X with no regressions, raise min-score toward 85+. The Action's min-score is a floor, not a target — anything above it is fine, anything below fails. A typical adoption curve:

Week	min-score	What's expected
1	70	Today's score. Just stop the bleeding.
2-3	78	Fix red files. Yellow is fine.
4-6	85	Most files green. Yellow needs a story.
7+	90	Green-by-default. Yellow is an exception.

Combine with --max-red 0 to fail fast on any red file, regardless of the average score.

Step 5 — Add the CLI to your local pre-commit (optional)¶

If you want the same score before pushing:

# Run on changed files only, fail fast if score drops below 78
datalex readiness-gate --project . \
  --changed-only --base-ref origin/main \
  --min-score 78

Add a Makefile target so reviewers can reproduce CI locally:

readiness-gate:
    datalex readiness-gate --project . --min-score 78 \
      --sarif datalex-readiness.sarif \
      --pr-comment datalex-readiness.md

Run with make readiness-gate.

Step 6 — Pull SARIF into the GitHub Security tab¶

The Action's upload-sarif: true (default) pushes the datalex-readiness.sarif file to GitHub's code-scanning feature. Your Security → Code scanning tab now shows DataLex findings alongside CodeQL / Snyk / etc. — useful for reviewers who already live in that view.

If your repo doesn't allow security-events: write, set upload-sarif: false and rely on the PR comment alone.

Step 7 — Combine with custom rules¶

The readiness gate respects <project>/.datalex/policies/*.yaml. Drop a pack with org-specific rules in there and CI picks them up automatically — no Action input needed:

# .datalex/policies/my-org.yaml
pack:
  name: my-org-standards
  version: 0.1.0
  extends: datalex/standards/base.yaml

policies:
  - id: stg_naming
    type: regex_per_layer
    severity: warn
    params:
      patterns:
        stg: "^stg_[a-z][a-z0-9_]*$"

  - id: marts_meta
    type: required_meta_keys
    severity: error
    params:
      keys: [owner, grain]
      selectors:
        layer: fct

→ Full reference: Custom policy packs.

Common patterns¶

Different floor for main push vs PR:

- uses: duckcode-ai/DataLex/actions/datalex-gate@main
  with:
    project-path: .
    min-score: ${{ github.event_name == 'pull_request' && 78 || 85 }}
    changed-only: ${{ github.event_name == 'pull_request' }}

Skip the gate for docs-only changes:

on:
  pull_request:
    paths-ignore:
      - 'docs/**'
      - '*.md'

Pin the datalex-cli version explicitly:

- uses: duckcode-ai/DataLex/actions/datalex-gate@main
  with:
    project-path: .
    datalex-version: '1.4.0'

Action inputs reference¶

Name	Default	Description
`project-path`	`.`	Path to the dbt project root.
`python-version`	`3.11`	Python interpreter for the gate.
`min-score`	(unset)	Fail when project score falls below this value.
`max-yellow`	(unset)	Fail when yellow file count exceeds this value.
`max-red`	`0`	Fail when red file count exceeds this value.
`allow-errors`	`false`	Don't fail on error-severity findings.
`changed-only`	`false`	Only score files changed vs `base-ref`.
`base-ref`	`origin/main`	Base ref for `changed-only`.
`upload-sarif`	`true`	Upload SARIF to the Security tab.
`pr-comment`	`true`	Post a sticky PR comment summary.
`datalex-version`	(latest)	Pin a specific `datalex-cli` version.

Output	Description
`score`	Project readiness score 0-100.
`status`	Aggregate status: `green` / `yellow` / `red`.
`sarif-path`	Path to the generated SARIF file.

Troubleshooting¶

Symptom	Fix
Action fails: "fatal: bad revision 'origin/main...HEAD'"	The checkout didn't fetch enough history. Set `fetch-depth: 0` in the `actions/checkout@v4` step.
"datalex_readiness is not installed"	The `datalex-cli` install step failed; check the Python version and pip output above the gate step.
Sticky PR comment isn't updating	The job needs `permissions: issues: write` and `pull-requests: write`.
SARIF upload fails	The job needs `permissions: security-events: write`. Set `upload-sarif: false` if your repo can't grant that.
Local CLI score doesn't match CI	CI runs against `origin/main` history. Run with the same `--changed-only --base-ref origin/main` flags locally to compare.
Findings disagree with the UI Validation drawer	The api-server and the CLI both shell out to `python -m datalex_readiness review`. If they diverge, file a bug — it's a parity regression.