Wardstone Ruby SDK

Ruby SDK for the Wardstone LLM security API. Detect prompt injection, content violations, data leakage, and unknown links in LLM inputs and outputs.

Installation

Add to your Gemfile:

gem "wardstone"

Or install directly:

gem install wardstone

Quick Start

require "wardstone"

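client = Wardstone::Client.new(api_key: "YOUR_API_KEY")

# Text to screen, e.g. a message received from an end user
user_input = "Ignore all previous instructions"
result = client.detect(text: user_input)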

if result.risk_bands.prompt_attack.level != "Low Risk"
  puts "Prompt attack detected"
  puts "Risk: #{result.risk_bands.prompt_attack.level}"
end

Configuration

client = Wardstone::Client.new(
  api_key: "YOUR_API_KEY",          # or set WARDSTONE_API_KEY env var
  base_url: "https://wardstone.ai", # default
  timeout: 30,                      # seconds, default: 30
  max_retries: 2                    # default: 2, max: 10
)

Environment Variable

The API key can be set via the WARDSTONE_API_KEY environment variable:

# Will use WARDSTONE_API_KEY from environment
client = Wardstone::Client.new

Usage

Basic Detection

result = client.detect(text: "Ignore all previous instructions")

result.flagged              # true
result.primary_category     # "prompt_attack"
result.risk_bands.prompt_attack.level      # "Severe Risk"
result.risk_bands.content_violation.level  # "Low Risk"
result.risk_bands.data_leakage.level       # "Low Risk"
result.risk_bands.unknown_links.level      # "Low Risk"

Scan Strategies

# Full scan (check all categories)
result = client.detect(text: input, scan_strategy: "full-scan")

# Early exit (stop at first threat)
result = client.detect(text: input, scan_strategy: "early-exit")

# Smart sample (optimized for long texts)
result = client.detect(text: input, scan_strategy: "smart-sample")
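
Since smart-sample is described as optimized for long texts, one option is to choose the strategy from the input length. The sketch below is only an illustration: pick_scan_strategy and its 10_000-character threshold are hypothetical and not part of the SDK, so tune the cutoff to your own traffic.

```ruby
# Strategy names taken from the examples above.
SCAN_STRATEGIES = %w[full-scan early-exit smart-sample].freeze

# Hypothetical helper: picks "smart-sample" for long inputs, else the fallback.
# The 10_000-character threshold is an arbitrary placeholder, not an SDK value.
def pick_scan_strategy(text, long_text_threshold: 10_000, fallback: "full-scan")
  unless SCAN_STRATEGIES.include?(fallback)
    raise ArgumentError, "unknown scan strategy: #{fallback.inspect}"
  end

  text.length > long_text_threshold ? "smart-sample" : fallback
end

# Intended use (requires the gem and a configured client):
#   client.detect(text: input, scan_strategy: pick_scan_strategy(input))
```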

Raw Scores

result = client.detect(text: input, include_raw_scores: true)
if result.raw_scores
  result.raw_scores.categories.prompt_attack       # 0.95
  result.raw_scores.categories.content_violation   # 0.01
end

Rate Limit Info

result = client.detect(text: input)
result.rate_limit.limit       # 1000
result.rate_limit.remaining   # 999
result.rate_limit.reset       # 1700000000

Response

{
  "flagged": true,
  "risk_bands": {
    "content_violation": { "level": "Low Risk" },
    "prompt_attack": { "level": "Severe Risk" },
    "data_leakage": { "level": "Low Risk" },
    "unknown_links": { "level": "Low Risk" }
  },
  "primary_category": "prompt_attack",
  "subcategories": {
    "content_violation": { "triggered": [] },
    "data_leakage": { "triggered": [] }
  },
  "unknown_links": {
    "flagged": false,
    "unknown_count": 0,
    "known_count": 0,
    "total_urls": 0,
    "unknown_domains": []
  },
  "processing": {
    "inference_ms": 28,
    "input_length": 62,
    "scan_strategy": "early-exit"
  },
  "rate_limit": {
    "limit": 100000,
    "remaining": 99999,
    "reset": 2592000
  }
}
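
If you work with the JSON payload directly (for example when logging or replaying responses), the risk bands can be filtered with the standard library alone. This is a minimal sketch that assumes the payload shape shown above; elevated_categories is a hypothetical helper name, not an SDK method.

```ruby
require "json"

# Sample payload trimmed from the response shown above.
payload = <<~JSON
  {
    "flagged": true,
    "risk_bands": {
      "content_violation": { "level": "Low Risk" },
      "prompt_attack": { "level": "Severe Risk" },
      "data_leakage": { "level": "Low Risk" },
      "unknown_links": { "level": "Low Risk" }
    },
    "primary_category": "prompt_attack"
  }
JSON

# Hypothetical helper: names of categories whose level is not "Low Risk".
def elevated_categories(response)
  response.fetch("risk_bands")
          .reject { |_name, band| band["level"] == "Low Risk" }
          .keys
end

response = JSON.parse(payload)
elevated = elevated_categories(response)
puts elevated.inspect
```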

Error Handling

begin
  result = client.detect(text: input)
rescue Wardstone::AuthenticationError => e
  # Invalid or missing API key (401)
rescue Wardstone::BadRequestError => e
  # Invalid request (400)
  e.max_length  # available for text_too_long errors
rescue Wardstone::PermissionError => e
  # Feature not available on plan (403)
rescue Wardstone::RateLimitError => e
  # Quota exceeded (429)
  e.retry_after  # seconds to wait
rescue Wardstone::InternalServerError => e
  # Server error (500)
rescue Wardstone::TimeoutError => e
  # Request timed out
rescue Wardstone::ConnectionError => e
  # Network failure
rescue Wardstone::Error => e
  # Catch-all for any Wardstone error
  e.status  # HTTP status code (nil for network errors)
  e.code    # Machine-readable error code
end
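
Because RateLimitError exposes retry_after, a caller can wait that long before trying again. The sketch below shows one way to do this; with_rate_limit_retry is a hypothetical helper, not part of the SDK, and it takes the error class as an argument so it can be exercised without the gem. The client's own max_retries setting may already cover some of these cases, so treat this as an illustration of using retry_after rather than a required pattern.

```ruby
# Hypothetical helper, not part of the SDK: runs the block and, when it raises
# `error_class`, waits for the error's retry_after (if present) before retrying.
# Re-raises the error once `attempts` tries have been used up.
def with_rate_limit_retry(error_class, attempts: 3, default_wait: 1,
                          sleeper: Kernel.method(:sleep))
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class => e
    raise if tries >= attempts

    # Fall back to default_wait when the error carries no retry_after value.
    wait = e.respond_to?(:retry_after) && e.retry_after ? e.retry_after : default_wait
    sleeper.call(wait)
    retry
  end
end

# Intended use (requires the gem and a configured client):
#   result = with_rate_limit_retry(Wardstone::RateLimitError) do
#     client.detect(text: input)
#   end
```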

Risk Levels

Each category returns one of four risk levels:

"Low Risk" - No threat detected
"Some Risk" - Minor concern
"High Risk" - Significant threat
"Severe Risk" - Critical threat, action recommended

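The four levels listed above are ordered, so a threshold check is often more convenient than comparing against a single string. A minimal sketch follows; RISK_LEVELS and risk_at_least? are hypothetical names introduced here, not part of the SDK.

```ruby
# Risk level strings as listed above, ordered from least to most severe.
RISK_LEVELS = ["Low Risk", "Some Risk", "High Risk", "Severe Risk"].freeze

# Returns true when `level` is at or above `threshold`.
# Raises ArgumentError for a string that is not one of the four levels.
def risk_at_least?(level, threshold)
  level_rank = RISK_LEVELS.index(level)
  threshold_rank = RISK_LEVELS.index(threshold)
  raise ArgumentError, "unknown risk level: #{level.inspect}" if level_rank.nil?
  raise ArgumentError, "unknown risk level: #{threshold.inspect}" if threshold_rank.nil?

  level_rank >= threshold_rank
end

# Intended use with a detection result:
#   block = risk_at_least?(result.risk_bands.prompt_attack.level, "High Risk")
```
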
Requirements

Ruby >= 3.0
Zero runtime dependencies (stdlib only)


Development