0.0
The project is in a healthy, maintained state
Ask for data in plain English; get validated, parameterized SQL with guardrails.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
 Dependencies

Development

>= 1.1
>= 0
~> 3.0
~> 1.4

Runtime

 Project Readme

CodeToQuery

A gem that converts natural language questions into SQL queries for Rails apps. It's built for teams who need to give non-developers access to data without compromising security or performance.

What it does

Instead of writing SQL, your team can ask questions like "Show me top customers by revenue this month" and get back safe, parameterized queries that respect your database policies and security rules. See business question examples for prompt patterns and review checks against a typical B2B SaaS schema.

Key features

  • Multiple AI providers: Works with OpenAI or local models
  • Built-in safety: SQL linting, table allowlists, EXPLAIN plan checks
  • Schema awareness: Understands your models, associations, and scopes
  • Policy enforcement: Automatically injects tenant filters and access rules
  • Performance monitoring: Optional query analysis and optimization

Getting started

Add to your Gemfile:

gem 'code_to_query'

Run bundle install and create a config file:

# config/initializers/code_to_query.rb
CodeToQuery.configure do |config|
  config.openai_api_key = ENV['OPENAI_API_KEY']
  config.openai_model = 'gpt-4.1-mini'
  
  # Security settings
  config.enable_explain_gate = true
  config.allow_seq_scans = false
  config.max_query_cost = 10000
  config.require_limit_by_default = true
  config.explain_fail_open = false # default; set true only for availability-first deployments
end

Generate your schema context:

rails code_to_query:bootstrap

Basic usage

# Ask a question
query = CodeToQuery.ask(
  prompt: "Top 10 invoices by amount in July",
  allow_tables: %w[invoices vendors],
  current_user: current_user
)

# Check if it's safe to run
if query.safe?
  results = query.run
  puts "Found #{results.rows.length} results"

  # Shape already-materialized results for export without executing another query
  hashes = CodeToQuery::ResultExport.to_a(results)
  json = CodeToQuery::ResultExport.to_json(results)
  csv = CodeToQuery::ResultExport.to_csv(results)
end

# Or get the SQL for review
puts query.sql
puts query.params

Configuration options

Database settings

config.adapter = :postgres           # :postgres, :mysql, :sqlite
config.readonly_role = :reporting    # Database role for queries
config.default_limit = 100           # Default row limit
config.max_limit = 10000             # Max allowed limit

Security settings

config.enable_explain_gate = true    # Block expensive queries
config.allow_seq_scans = false       # Prevent table scans
config.max_query_cost = 10000        # Cost threshold
config.max_joins = 3                 # Join limit
config.explain_fail_open = false     # Keep EXPLAIN failures fail-closed
config.sensitive_column_patterns |= [ # Omitted from context packs
  /internal[_-]?credential/i
]

When enable_explain_gate is true, EXPLAIN connection, adapter, and parsing errors fail closed by default so a degraded reporting database does not turn into an unbounded query bypass. Leave explain_fail_open unset or false for strict internal analytics and production dashboards. Set it to true only for a deliberate availability-first deployment where users may run otherwise safe queries when EXPLAIN is temporarily unavailable; pair that choice with readonly database credentials, conservative table allowlists, and query timeouts.

Example profiles:

# Strict internal analytics / production dashboard
config.readonly_role = :reporting
config.enable_explain_gate = true
config.explain_fail_open = false
config.allow_seq_scans = false
config.max_query_cost = 10_000
config.force_readonly_session = true

# Developer or test environment with fixture-sized data
config.enable_explain_gate = false
config.stub_llm = true

# Availability-first readonly reporting database
config.readonly_role = :reporting
config.enable_explain_gate = true
config.explain_fail_open = true
config.force_readonly_session = true
config.query_timeout = 10

OpenAI settings

config.openai_api_key = ENV['OPENAI_API_KEY']
config.openai_model = 'gpt-4'
config.stub_llm = false              # Set true for testing

Rake tasks

rails code_to_query:bootstrap # Generate full context pack
rails code_to_query:schema    # Extract schema info
rails code_to_query:scan_app  # Scan models and associations
rails code_to_query:verify    # Check context pack integrity

Run the offline provider evaluation task from the gem checkout or host app bundle:

bundle exec rake code_to_query:provider:evaluate

Context generation omits columns whose names match sensitive_column_patterns by default, including password, token, secret, credential, digest, salt, OTP, and API key columns. Because the defaults intentionally favor omission over exposure, remove or replace patterns in your initializer if unrelated business columns contain those substrings. Add to the default pattern list before running bootstrap or schema tasks if your application uses additional sensitive naming conventions. Set config.sensitive_column_patterns = [] if you need to generate an unfiltered local context pack for an internal-only environment. The same filter also removes index, foreign-key, and check-constraint metadata that references sensitive names or definitions.

The provider evaluation task runs deterministic offline fixtures against the planner/provider boundary and reports a pass rate for checks such as table allowlists, bounded limits, and params represented as hashes rather than raw strings. Treat the score as a regression signal for known prompts, not a guarantee of production natural-language quality. Keep real provider smoke tests opt-in and run them only with non-sensitive schema and prompts.

Security features

  • SQL injection prevention: All queries are parameterized
  • Access control: Table allowlists and row-level policies
  • Performance limits: EXPLAIN plan analysis and cost thresholds
  • Readonly execution: Uses dedicated readonly database connections

For deployment guidance, threat boundaries, and vulnerability reporting, see SECURITY.md.

Advanced usage

Custom policies

config.policy_adapter = ->(user) do
  raise "current user is required" unless user

  predicates = { company_id: user.company_id }
  predicates[:user_id] = user.id unless user.admin?

  { enforced_predicates: predicates }
end

Policy adapter failures fail closed by default: adapter exceptions, nil returns, and malformed predicate payloads raise CodeToQuery::PolicyAdapterError rather than running an unscoped query. If you intentionally prefer availability over row-level enforcement for a deployment, opt in explicitly:

config.policy_adapter_fail_open = true

Leave policy_adapter_fail_open unset or false for the safe default.

For Rails applications that use Pundit-style policies and scopes, see the Pundit policy adapter examples for conservative table allowlists, tenant/account/user predicates, and logging boundaries.

Upgrading EXPLAIN gate behavior

EXPLAIN gate errors fail closed by default. Existing applications that already set enable_explain_gate = true and intentionally relied on the older availability-first behavior should opt back in explicitly:

config.explain_fail_open = true

Use that setting only when table allowlists, readonly credentials, row limits, and database timeouts are strong enough for queries to proceed if EXPLAIN is temporarily unavailable.

You can subscribe to sanitized EXPLAIN gate audit events for operational review. The payload intentionally omits raw SQL, prompts, bind values, row data, and raw EXPLAIN plans; it includes only decision metadata and configured thresholds.

ActiveSupport::Notifications.subscribe('code_to_query.explain_gate') do |_name, _started, _finished, _id, payload|
  metadata = {
    adapter: payload[:adapter],
    allowed: payload[:allowed],
    reason: payload[:reason],
    fail_open: payload[:fail_open],
    max_query_cost: payload[:max_query_cost],
    max_query_rows: payload[:max_query_rows],
    allow_seq_scans: payload[:allow_seq_scans]
  }

  CodeToQuery.config.logger.info("[code_to_query] explain gate decision #{metadata.inspect}")
end

Custom schema

schema = {
  tables: [
    {
      name: "users",
      columns: [
        { name: "id", sql_type: "integer" },
        { name: "email", sql_type: "varchar" }
      ]
    }
  ]
}

query = CodeToQuery.ask(
  prompt: "Recent users",
  schema: schema,
  allow_tables: ["users"]
)

Error handling

begin
  query = CodeToQuery.ask(prompt: "Complex query")
  results = query.run if query.safe?
rescue CodeToQuery::ExecutionError => e
  Rails.logger.error "Query failed: #{e.message}"
rescue CodeToQuery::ConnectionError => e
  Rails.logger.error "Database issue: #{e.message}"
end

Contributing

  1. Fork the repo
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - see LICENSE file for details.