DataLex
YAML-first data modeling that makes contracts machine-checkable for the AI era.
In 2026, AI agents answer the same business question different ways and return different numbers. CFOs and data leaders are starting to ask "can we trust the AI numbers?" — and right now the answer is "no."
DataLex is the layer that turns that "no" into "yes" — by giving your dbt project a governed conceptual model and machine-enforceable contracts that AI tools can't bypass.
Why DataLex
- Sits above dbt, never replaces it. Reads `target/manifest.json`, never writes back without a reviewable diff. Your dbt project stays the source of truth for transformations.
- Conceptual, logical, and physical layers stay connected. Business meaning, data structure, and dbt implementation share one YAML graph instead of drifting across SQL, tickets, and tribal knowledge.
- Reviewable AI authoring. `datalex draft` proposes contracts from a dbt project; you accept, edit, and commit. No silent rewrites of project files.
- Compile-time contract checks. When a DQL block references a contract by id, the DQL compiler resolves it against your DataLex manifest and refuses to ship if the binding breaks.
- Open source forever. Apache 2.0. No closed-source language features.
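The compile-time contract check described in the list above can be pictured with a small sketch. This is not the DQL compiler's implementation: the manifest shape (a top-level `contracts` mapping keyed by id), the `Block` type, `check_bindings`, and the contract ids are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


class BindingError(Exception):
    """Raised when a block references a contract id the manifest lacks."""


@dataclass(frozen=True)
class Block:
    # Hypothetical stand-in for a DQL block: a name plus the contract ids it cites.
    name: str
    contract_ids: tuple


def check_bindings(blocks, manifest):
    """Raise BindingError if any referenced contract id is absent from the manifest."""
    # Assumed manifest shape: {"contracts": {<contract id>: {...}}}.
    known = set(manifest.get("contracts", {}))
    missing = [
        (block.name, cid)
        for block in blocks
        for cid in block.contract_ids
        if cid not in known
    ]
    if missing:
        details = ", ".join(f"{name} -> {cid}" for name, cid in missing)
        raise BindingError(f"unresolved contract bindings: {details}")


manifest = {"contracts": {"commerce.orders.revenue": {"version": 1}}}
ok = [Block("revenue_by_month", ("commerce.orders.revenue",))]
broken = [Block("churn_rate", ("commerce.customers.churn",))]

check_bindings(ok, manifest)  # every id resolves, so nothing is raised
try:
    check_bindings(broken, manifest)
except BindingError as exc:
    print(exc)  # unresolved contract bindings: churn_rate -> commerce.customers.churn
```

Failing the whole check on the first unresolved id, rather than warning, mirrors the "refuses to ship" behavior: a block either binds to a known contract or does not compile.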
Install
# Core CLI
pip install datalex-cli

# AI drafting with Anthropic (quotes keep shells such as zsh from globbing the brackets)
pip install "datalex-cli[draft]"
export ANTHROPIC_API_KEY=sk-ant-...
datalex draft --dbt /path/to/dbt-project --domain commerce

# AI drafting with OpenAI
pip install "datalex-cli[draft-openai]"
export OPENAI_API_KEY=sk-...
datalex draft --dbt /path/to/dbt-project --domain commerce --provider openai

# AI drafting with Gemini
pip install "datalex-cli[draft-gemini]"
export GOOGLE_API_KEY=...
datalex draft --dbt /path/to/dbt-project --domain commerce --provider gemini

# AI drafting with a local Ollama model
pip install "datalex-cli[draft-ollama]" # no SDK needed; uses HTTP
ollama serve
datalex draft --dbt /path/to/dbt-project --domain commerce \
  --provider ollama --model llama3.1:8b
The CLI auto-detects the provider from env vars when --provider is
omitted: ANTHROPIC_API_KEY > OPENAI_API_KEY > GOOGLE_API_KEY >
Ollama fallback. Pin a specific provider explicitly with the flag.
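The precedence rule above can be sketched as a small function. This is a minimal illustration of the documented ordering, not the CLI's actual code: `detect_provider` is a hypothetical name, and the `"anthropic"` return value is an assumption (the `openai`, `gemini`, and `ollama` names come from the `--provider` examples above).

```python
import os

# Environment variables in the documented order of precedence.
# The provider name "anthropic" is an assumption; the others match --provider values.
_ENV_PRECEDENCE = (
    ("ANTHROPIC_API_KEY", "anthropic"),
    ("OPENAI_API_KEY", "openai"),
    ("GOOGLE_API_KEY", "gemini"),
)


def detect_provider(explicit=None, env=None):
    """Return the explicit provider if given, else the first one whose key is set."""
    env = os.environ if env is None else env
    if explicit:
        # An explicit --provider flag always wins over auto-detection.
        return explicit
    for var, provider in _ENV_PRECEDENCE:
        if env.get(var):  # unset or empty values are skipped
            return provider
    return "ollama"  # documented fallback when no API key is present


print(detect_provider(env={"OPENAI_API_KEY": "sk-...", "GOOGLE_API_KEY": "..."}))  # openai
print(detect_provider(env={}))  # ollama
```

Because several keys are often exported in the same shell, passing the flag explicitly is the safer choice in CI or shared environments.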
pip install "datalex-cli[serve]"
datalex serve
Five-minute path
- End-to-end DataLex + DQL tutorial: see the core workflow in 5 minutes using both example repos. The fastest way to understand the product.
- Get started: install, scaffold a project, compile your first model.
- Walk through Jaffle Shop: full dbt + DuckDB + DataLex example.
- Layered modeling: when to use conceptual vs. logical vs. physical.
- The DataLex + DQL stack: how the two languages combine for certified AI analytics.
Architecture in one diagram
flowchart LR
Source[(Source data)] --> dbt
dbt --> Manifest[dbt manifest.json]
Manifest --> DataLex[DataLex compiler]
DataLex --> ContractsManifest[DataLex manifest]
ContractsManifest --> DQL[DQL compiler]
DQL --> Blocks[Certified blocks]
Blocks --> MCP[DQL MCP for AI agents]
Blocks --> Apps[Apps + dashboards]
ContractsManifest --> Catalog[Atlan / Marquez / Monte Carlo]
DataLex is the modeling layer above dbt: it reads the dbt manifest and compiles a DataLex manifest. DQL consumes that manifest downstream. Both speak the public manifest spec.
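To make the first hop in the diagram concrete, here is a minimal read-only sketch of pulling model and column names out of a dbt `manifest.json`. The `nodes`, `resource_type`, `name`, and `columns` keys are standard in dbt's manifest artifact; the function names and the sample data are illustrative, and how the DataLex compiler actually parses the file is not shown here.

```python
import json
from pathlib import Path


def load_manifest(path):
    """Parse a dbt manifest.json file into a dict (read-only)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def model_columns(manifest):
    """Map each dbt model name to the sorted names of its documented columns."""
    result = {}
    for node in manifest.get("nodes", {}).values():
        # "nodes" also holds tests, seeds, and snapshots; keep only models.
        if node.get("resource_type") == "model":
            result[node["name"]] = sorted(node.get("columns", {}))
    return result


# A tiny stand-in for target/manifest.json, trimmed to the keys used above.
sample = {
    "nodes": {
        "model.jaffle_shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "columns": {"order_id": {}, "amount": {}},
        },
        "test.jaffle_shop.not_null_orders_order_id": {
            "resource_type": "test",
            "name": "not_null_orders_order_id",
        },
    }
}

print(model_columns(sample))  # {'orders': ['amount', 'order_id']}
```

Against a real project, `model_columns(load_manifest("target/manifest.json"))` would return the same mapping for every model that dbt has compiled.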
Open source
DataLex is Apache 2.0 and built in the open at duckcode-ai/DataLex. File issues, send PRs, drop into Discord.
For the broader plan — manifest-spec versioning, the DQL companion, the launch checklist — see ROADMAP.md.