Natural-language analytics feels harmless right up until an agent decides a dashboard question deserves a five-table join against production. The failure mode is not usually dramatic. It is a quiet full scan, a missed tenant filter, or a query that looks fine in review but melts a replica at 10:07 a.m.
The fix is not, “just make the prompt stricter.” If you want AI-to-SQL to survive contact with a real warehouse or OLTP replica, you need rails under the model: query-shape constraints, plan inspection, resource budgets, and a hard rule that generated SQL never goes straight to your primary.
This is the workflow I would use for an analytics agent that answers product and ops questions safely.
Why this matters
AI-to-SQL systems fail in three predictable ways: they generate operationally rude SQL, they miss business constraints like tenant filters, and they look brilliant in demos before collapsing under broad real-world questions.
Architecture and workflow overview
flowchart LR
User question -> Intent and dataset router
Intent and dataset router -> Template and policy planner
Template and policy planner -> Parameterized SQL draft
Parameterized SQL draft -> Static checks
Static checks -> EXPLAIN gate
EXPLAIN gate -> Read replica execution
Read replica execution -> Result reducer and citations
Result reducer and citations -> Answer plus audit logThe model does not get a direct connection plus free text. It gets a constrained planning job and has to survive policy plus plan checks before execution.
Implementation details
1. Treat SQL generation as template selection, not open-ended synthesis
query_families:
signup_funnel_by_week:
dataset: analytics_replica
sql: |
SELECT date_trunc('week', created_at) AS week,
plan_tier,
count(*) AS signups
FROM accounts
WHERE created_at >= :start_date
AND created_at < :end_date
AND tenant_id = :tenant_id
GROUP BY 1, 2
ORDER BY 1 DESC;
required_params: [start_date, end_date, tenant_id]
max_rows: 5002. Gate every query with static policy and EXPLAIN budgets
FORBIDDEN_PATTERNS = [r"\bUPDATE\b", r"\bDELETE\b", r"\bINSERT\b"]
MAX_PLAN_ROWS = 2_000_000
MAX_COST = 150_000
async def approve_query(conn, sql, params, context):
for pattern in FORBIDDEN_PATTERNS:
if re.search(pattern, sql, flags=re.IGNORECASE):
return {"ok": False, "reason": f"forbidden pattern: {pattern}"}
if context["tenant_id"] and "tenant_id = :tenant_id" not in sql:
return {"ok": False, "reason": "missing tenant scope"}
plan = await conn.fetchval("EXPLAIN (FORMAT JSON) " + sql, *bind_params(sql, params))
summary = extract_plan_summary(plan)
if summary["total_cost"] > MAX_COST or summary["plan_rows"] > MAX_PLAN_ROWS:
return {"ok": False, "reason": "query budget exceeded", "plan": summary}
return {"ok": True, "plan": summary}3. Use a read-only execution lane with timeouts and row caps
async def run_guarded_query(pool, sql, params, audit):
async with pool.acquire() as conn:
await conn.execute("SET statement_timeout = '2500ms'")
await conn.execute("SET default_transaction_read_only = on")
approved = await approve_query(conn, sql, params, audit)
if not approved["ok"]:
return {"status": "rejected", **approved, "audit": audit}
rows = await conn.fetch(sql, *bind_params(sql, params))
return {
"status": "ok",
"row_count": min(len(rows), 500),
"rows": [dict(row) for row in rows[:500]],
"plan": approved["plan"],
"audit": audit,
}$ agent ask "Show top accounts by events this quarter"
planner: selected query_family=events_by_account
policy: tenant scope present
explain: cost=412881 rows=9876543
reject: plan cost too high
rewrite hint: require narrower date range or add event_type filterWhat went wrong and the tradeoffs
| Approach | Upside | Downside | Where it fits |
|---|---|---|---|
| Free-form SQL generation | Maximum flexibility | Easy to create expensive or unsafe queries | Almost nowhere without heavy isolation |
| Template families with params | Predictable and reviewable | Needs ongoing template maintenance | Best default for internal analytics agents |
| Semantic layer plus generated filters | Better abstraction for business users | More metadata work up front | Mature teams with stable metrics definitions |
Practical checklist
- Route agent queries to a read replica or warehouse lane only.
- Require parameterized query families for common analytics tasks.
- Enforce tenant and environment filters in static policy.
- Run EXPLAIN before execution and reject high-cost plans.
- Apply statement timeout, row cap, and retry budget.
- Log question, selected template, plan summary, and final result shape.
- Redact sensitive columns before results re-enter model context.
- Measure estimate-versus-actual drift for repeated query families.
Conclusion
Put the model inside a constrained planning lane, make every query survive policy plus EXPLAIN, and run the result on infrastructure that can afford a mistake.