Skip to content

datalex CLI cheat sheet

Every subcommand on one page. Global flags: --help on any command prints its full option list. The CLI installs as both datalex and dm — they're the same script. Examples below use datalex.

DataLex 1.4 added five new commands: readiness-gate, dbt docs reindex, emit catalog, plus richer policy-check (now accepts multiple --policy flags). They're called out below.

CI / readiness — new in 1.4

Command What it does
datalex readiness-gate --project <dir> Run the dbt-readiness engine on a project and fail on threshold breaches. Same scoring as /api/dbt/review and the actions/datalex-gate GitHub Action.

Common flags: - --min-score <n> — fail when project score < n - --max-yellow <n> / --max-red <n> — fail when file counts exceed - --allow-errors — don't fail on error-severity findings - --changed-only --base-ref <ref> — only score files changed vs base ref (uses git diff) - --sarif <path> — write SARIF 2.1.0 for GitHub code-scanning upload - --pr-comment <path> — write a sticky PR-comment markdown summary - --output-json — print the full review JSON

→ Full walkthrough: Tutorial: CI readiness gate.

dbt integration

Command What it does
datalex dbt sync <project> --out-root <dir> Reads target/manifest.json + profiles.yml, introspects live warehouse columns, merges into DataLex YAML. Idempotent: user edits survive re-sync. Doc-block references are preserved.
datalex dbt emit <root> --out-dir <dir> Emits dbt sources.yml + models/_schema.yml with contract.enforced: true and data_type: per column. Re-emits {{ doc("name") }} whenever a column carries description_ref.
datalex dbt import <manifest.json> --out-root <dir> One-shot import without warehouse introspection (manifest data_type only). Captures snapshots / seeds / exposures / unit tests / semantic models in 1.4.
datalex dbt docs reindex --project-dir <dir> (1.4) Rebuild the {% docs %} block index used by AI retrieval, the inspector, and the round-trip emitter. Prints resolved names + body previews.

Common flags for sync: - --profile <name> — pick a non-default target from the profile - --profiles-dir <dir> — override profiles.yml search - --manifest <path> — explicit manifest.json location - --skip-warehouse — use manifest data_type only (offline / no creds) - --output-json — machine-readable report

→ Full walkthrough: Tutorial: dbt sync in 5 minutes.

Validation, policies, and project summary

Command What it does
datalex validate <root> Load the project, run schema + policy validation, exit non-zero on error.
datalex lint <root> Semantic / dimensional-modeling rules (warnings) — companion to validate.
datalex policy-check <model> --policy <pack> [--policy <pack>...] Evaluate one or more model files against one or more policy packs. (1.4) Pass --policy multiple times to merge packs. Use --inherit to resolve pack.extends.
datalex info <root> Entity-by-layer counts, physical-by-dialect breakdown, term/domain/policy counts.
datalex expand <root> Preview a project with snippets inlined (doesn't mutate files).

Flags: --output-json on all of them. validate also supports --non-strict (don't exit non-zero; print errors and continue).

→ Custom rule packs: Tutorial: Policy packs (1.4).

Built-in standards pack

datalex policy-check models/marts/fct_orders.yml \
  --policy datalex/standards/base.yaml \
  --policy .datalex/policies/my-org.yaml \
  --inherit

Policy packs live under <project>/.datalex/policies/ and inherit from the bundled datalex/standards/base.yaml. New 1.4 rule types: regex_per_layer, required_meta_keys, layer_constraint, require_contract, require_data_type_when_contracted. All support a selectors: { layer, tag, path_glob } block to scope the rule.

Catalog export — new in 1.4

Command What it does
datalex emit catalog --target <atlan\|datahub\|openmetadata> --model <model.yaml> --out <dir> Emit the glossary + column-binding payload for an external catalog. Output is one JSON file per model named <target>-<model_name>.json.

Bindings come from columns that declare binding: { glossary_term, status } (or the legacy terms: [...] / meta.glossary_term shapes). Operators import the JSON via the catalog's bulk-loader API.

DDL emission

Command What it does
datalex emit ddl <root> --dialect <name> Emit per-dialect DDL for every physical entity. --out <file> writes to disk; stdout otherwise.

Currently registered dialects: postgres, snowflake. The registry is pluggable — see packages/core_engine/src/datalex_core/dialects/.

Semantic diff

Command What it does
datalex diff <old-root> <new-root> Detect added / removed / renamed / changed entities and columns.
datalex gate old.yml new.yml PR breaking-change gate: validate both files and fail on breaking diffs. Different from readiness-gate (which scores quality of one project).

Flags: - --output-json — structured diff (good for CI pipes) - --exit-on-breaking — exit non-zero if any change is breaking

Renames are tracked via previous_name:; the diff prefers explicit renames to heuristic match.

Cross-repo packages

Command What it does
datalex packages resolve <root> Resolve imports: in datalex.yaml, fetch packages, write .datalex/lock.yaml.
datalex packages list <root> Show resolved packages and their cached paths.

resolve flags: --update forces re-fetch, --cache-root <dir> overrides ~/.datalex/packages/, --output-json for JSON output.

Layout migration

Command What it does
datalex migrate to-datalex-layout <legacy.model.yaml> Explode a v1/v2 *.model.yaml into the per-kind: DataLex tree.

Flags: --output-root <dir>, --dialect <name> (default postgres), --dry-run to preview writes, --output-json for machine report.

Common patterns

CI readiness gate — fail on red/score regressions (1.4):

datalex readiness-gate --project . \
  --min-score 80 --max-red 0 \
  --changed-only --base-ref origin/main \
  --sarif datalex-readiness.sarif \
  --pr-comment datalex-readiness.md

CI breaking-change gate — fail on schema regressions:

git checkout main -- datalex/
datalex diff datalex/ ../new/datalex/ --exit-on-breaking

Round-trip sanity check — dbt sync, emit, parse:

datalex dbt sync my-dbt-project --out-root datalex/
datalex dbt emit datalex/ --out-dir dbt-emitted/
cd my-dbt-project && cp -r ../dbt-emitted/models ../dbt-emitted/sources . && dbt parse

Preview sync against a new warehouse without touching it:

datalex dbt sync my-dbt-project --out-root datalex/ --skip-warehouse

Inspect a project without opening every file:

datalex info datalex/ --output-json | jq .

Reindex doc-blocks after editing .md files (1.4):

datalex dbt docs reindex --project-dir .

Run an org policy pack on top of the built-in baseline (1.4):

datalex policy-check models/marts/fct_orders.yml \
  --policy datalex/standards/base.yaml \
  --policy .datalex/policies/my-org.yaml \
  --inherit

Export the glossary to Atlan, DataHub, and OpenMetadata (1.4):

datalex emit catalog --target atlan        --model DataLex/glossary.model.yaml --out out/
datalex emit catalog --target datahub      --model DataLex/glossary.model.yaml --out out/
datalex emit catalog --target openmetadata --model DataLex/glossary.model.yaml --out out/

See also