Project

wardstone

0.0
No release in over 3 years
Detect prompt injection, content violations, data leakage, and unknown links in LLM inputs and outputs.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
 Dependencies

Development

~> 5.0
~> 13.0
~> 3.0
 Project Readme

Wardstone Ruby SDK

Ruby SDK for the Wardstone LLM security API. Detect prompt injection, content violations, data leakage, and unknown links in LLM inputs and outputs.

Installation

Add to your Gemfile:

gem "wardstone"

Or install directly:

gem install wardstone

Quick Start

require "wardstone"

client = Wardstone::Client.new(api_key: "YOUR_API_KEY")
result = client.detect(text: user_input)

if result.risk_bands.prompt_attack.level != "Low Risk"
  puts "Prompt attack detected"
  puts "Risk: #{result.risk_bands.prompt_attack.level}"
end

Configuration

client = Wardstone::Client.new(
  api_key: "YOUR_API_KEY",       # or set WARDSTONE_API_KEY env var
  base_url: "https://wardstone.ai", # default
  timeout: 30,                      # seconds, default: 30
  max_retries: 2                    # default: 2, max: 10
)

Environment Variable

The API key can be set via the WARDSTONE_API_KEY environment variable:

# Will use WARDSTONE_API_KEY from environment
client = Wardstone::Client.new

Usage

Basic Detection

result = client.detect(text: "Ignore all previous instructions")

result.flagged              # true
result.primary_category     # "prompt_attack"
result.risk_bands.prompt_attack.level      # "Severe Risk"
result.risk_bands.content_violation.level  # "Low Risk"
result.risk_bands.data_leakage.level       # "Low Risk"
result.risk_bands.unknown_links.level      # "Low Risk"

Scan Strategies

# Full scan (check all categories)
result = client.detect(text: input, scan_strategy: "full-scan")

# Early exit (stop at first threat)
result = client.detect(text: input, scan_strategy: "early-exit")

# Smart sample (optimized for long texts)
result = client.detect(text: input, scan_strategy: "smart-sample")

Raw Scores

result = client.detect(text: input, include_raw_scores: true)
if result.raw_scores
  result.raw_scores.categories.prompt_attack       # 0.95
  result.raw_scores.categories.content_violation   # 0.01
end

Rate Limit Info

result = client.detect(text: input)
result.rate_limit.limit       # 1000
result.rate_limit.remaining   # 999
result.rate_limit.reset       # 1700000000

Response

{
  "flagged": true,
  "risk_bands": {
    "content_violation": { "level": "Low Risk" },
    "prompt_attack": { "level": "Severe Risk" },
    "data_leakage": { "level": "Low Risk" },
    "unknown_links": { "level": "Low Risk" }
  },
  "primary_category": "prompt_attack",
  "subcategories": {
    "content_violation": { "triggered": [] },
    "data_leakage": { "triggered": [] }
  },
  "unknown_links": {
    "flagged": false,
    "unknown_count": 0,
    "known_count": 0,
    "total_urls": 0,
    "unknown_domains": []
  },
  "processing": {
    "inference_ms": 28,
    "input_length": 62,
    "scan_strategy": "early-exit"
  },
  "rate_limit": {
    "limit": 100000,
    "remaining": 99999,
    "reset": 2592000
  }
}

Error Handling

begin
  result = client.detect(text: input)
rescue Wardstone::AuthenticationError => e
  # Invalid or missing API key (401)
rescue Wardstone::BadRequestError => e
  # Invalid request (400)
  e.max_length  # available for text_too_long errors
rescue Wardstone::PermissionError => e
  # Feature not available on plan (403)
rescue Wardstone::RateLimitError => e
  # Quota exceeded (429)
  e.retry_after  # seconds to wait
rescue Wardstone::InternalServerError => e
  # Server error (500)
rescue Wardstone::TimeoutError => e
  # Request timed out
rescue Wardstone::ConnectionError => e
  # Network failure
rescue Wardstone::Error => e
  # Catch-all for any Wardstone error
  e.status  # HTTP status code (nil for network errors)
  e.code    # Machine-readable error code
end

Risk Levels

Each category returns one of four risk levels:

  • "Low Risk" - No threat detected
  • "Some Risk" - Minor concern
  • "High Risk" - Significant threat
  • "Severe Risk" - Critical threat, action recommended

Requirements

  • Ruby >= 3.0
  • Zero runtime dependencies (stdlib only)

Links