Project

mudis-ql

Mudis-QL extends Mudis by providing a fluent SQL-like query interface for data stored in the mudis cache
 Dependencies

Development

~> 3.12
~> 0.22

Runtime

~> 0.9.0
 Project Readme

Mudis-QL


Gem Version License: MIT Documentation

A simple query DSL for mudis cache. Mudis-QL extends mudis by providing a SQL-like query interface for data stored in the cache, enabling you to filter, sort, and paginate cached data without needing a full database.

Why Mudis-QL?

Mudis has been a great in-memory cache for most of my needs, but it was only ever designed to support simple key-value retrieval. From the documentation:

"No SQL or equivalent query interface for cached data. Data is per Key retrieval only."

Mudis-QL solves this limitation by providing a chainable query DSL that allows you to:

  • Filter cached data with where conditions
  • Sort results with order
  • Paginate with limit and offset
  • Count and check existence
  • Pluck specific fields

The goal is to retain Mudis's speed, thread-safety, and in-memory efficiency, while providing an opt-in gem for advanced retrieval options when needed.

Best Practice: Use Namespaces

Important: Mudis-QL is designed to query collections of related data in mudis. For optimal functionality, always use namespaces when storing data you intend to query:

# Recommended - enables full query functionality
Mudis.write('user1', { name: 'Alice' }, namespace: 'users')
Mudis.write('user2', { name: 'Bob' }, namespace: 'users')
MudisQL.from('users').where(name: /^A/).all

# Limited - namespace-less keys can't be listed/queried as a collection
Mudis.write('user:1', { name: 'Alice' })  # individual key access only

Namespaces provide logical separation and enable Mudis-QL to retrieve all keys in a collection for filtering, sorting, and pagination. Without namespaces, Mudis-QL can only perform individual key operations.

Design

flowchart TD
    Start([User Code]) --> Entry{Entry Point}
    
    Entry -->|MudisQL.from| CreateScope[Create Scope Instance]
    Entry -->|MudisQL.metrics| CreateMetrics[Create MetricsScope]
    
    CreateScope --> InitStore[Initialize Store with namespace]
    InitStore --> Scope[Scope Object]
    
    CreateMetrics --> MetricsObj[MetricsScope Object]
    
    Scope --> Chain{Chain Operations?}
    Chain -->|where| WhereOp[Apply Conditions<br/>Hash/Proc/Regex/Range]
    Chain -->|order| OrderOp[Apply Sorting<br/>Handle nil/mixed types]
    Chain -->|limit| LimitOp[Apply Row Limit]
    Chain -->|offset| OffsetOp[Apply Row Offset]
    
    WhereOp --> Chain
    OrderOp --> Chain
    LimitOp --> Chain
    OffsetOp --> Chain
    
    Chain -->|Terminal Method| Execute{Execution Type}
    
    Execute -->|all| FetchAll[Store.all<br/>Get all namespace keys]
    Execute -->|first| FetchFirst[Apply filters & get first]
    Execute -->|last| FetchLast[Apply filters & get last]
    Execute -->|count| FetchCount[Apply filters & count]
    Execute -->|exists?| FetchExists[Apply filters & check any?]
    Execute -->|pluck| FetchPluck[Apply filters & extract fields]
    
    FetchAll --> MudisRead[Mudis.keys + Mudis.read]
    FetchFirst --> MudisRead
    FetchLast --> MudisRead
    FetchCount --> MudisRead
    FetchExists --> MudisRead
    FetchPluck --> MudisRead
    
    MudisRead --> Transform[Transform to Hash<br/>Add _key field]
    Transform --> Filter[Apply where conditions]
    Filter --> Sort[Apply order]
    Sort --> Paginate[Apply limit/offset]
    Paginate --> Result([Return Results])
    
    MetricsObj --> MetricsChain{Metrics Operations}
    MetricsChain -->|summary| GetSummary[Mudis.metrics<br/>Return summary hash]
    MetricsChain -->|hit_rate| CalcHitRate[Calculate hits/total %]
    MetricsChain -->|efficiency| CalcEfficiency[Calculate efficiency score]
    MetricsChain -->|least_touched| ReturnScope1[Return Scope for<br/>least accessed keys]
    MetricsChain -->|buckets| ReturnScope2[Return Scope for<br/>bucket metrics]
    
    GetSummary --> MetricsResult([Return Metrics])
    CalcHitRate --> MetricsResult
    CalcEfficiency --> MetricsResult
    ReturnScope1 --> Scope
    ReturnScope2 --> Scope
    
    style Start fill:#e1f5ff
    style Result fill:#c8e6c9
    style MetricsResult fill:#c8e6c9
    style MudisRead fill:#fff3e0
    style Scope fill:#f3e5f5
    style MetricsObj fill:#f3e5f5

Installation

Add this line to your application's Gemfile:

gem 'mudis-ql'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install mudis-ql

Requirements

  • Ruby >= 3.0.0
  • mudis gem

Usage

Basic Setup

require 'json'      # JSON is used as the serializer below
require 'mudis'
require 'mudis-ql'

# Configure mudis first
Mudis.configure do |c|
  c.serializer = JSON
  c.compress = true
end

# Store some data in mudis
Mudis.write('user1', { name: 'Alice', age: 30, status: 'active' }, namespace: 'users')
Mudis.write('user2', { name: 'Bob', age: 25, status: 'active' }, namespace: 'users')
Mudis.write('user3', { name: 'Charlie', age: 35, status: 'inactive' }, namespace: 'users')

Query Examples

Simple Queries

# Get all users from a namespace
users = MudisQL.from('users').all
# => [{"name"=>"Alice", "age"=>30, "status"=>"active", "_key"=>"user1"}, ...]

# Filter by exact match
active_users = MudisQL.from('users')
  .where(status: 'active')
  .all

# Chain multiple conditions
result = MudisQL.from('users')
  .where(status: 'active')
  .where(age: ->(v) { v >= 25 })
  .all

Advanced Filtering

# Use proc for custom conditions
adults = MudisQL.from('users')
  .where(age: ->(age) { age >= 18 })
  .all

# Use regex for pattern matching
a_names = MudisQL.from('users')
  .where(name: /^A/i)
  .all

# Use ranges
young_adults = MudisQL.from('users')
  .where(age: 18..25)
  .all

Ordering and Pagination

# Order by field (ascending by default)
sorted_users = MudisQL.from('users')
  .order(:age)
  .all

# Order descending
sorted_desc = MudisQL.from('users')
  .order(:age, :desc)
  .all

# Limit results
top_5 = MudisQL.from('users')
  .order(:age, :desc)
  .limit(5)
  .all

# Pagination with offset
page_2 = MudisQL.from('users')
  .order(:name)
  .limit(10)
  .offset(10)
  .all
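
The design diagram notes that ordering has to cope with nil values. The following is a minimal plain-Ruby sketch of one way to do that, together with the page-to-offset arithmetic used for pagination. The helper names `order_records` and `paginate` are hypothetical and are not part of mudis-ql; the sketch only handles nil values, not mixed types.

```ruby
# Hypothetical helpers for illustration; not mudis-ql's actual implementation.

# Sort by one field, always keeping records with a nil value at the end.
def order_records(records, field, direction = :asc)
  present, missing = records.partition { |r| !r[field].nil? }
  sorted = present.sort_by { |r| r[field] }
  sorted.reverse! if direction == :desc
  sorted + missing
end

# Translate a 1-based page number into an offset, then take one page.
def paginate(records, page:, per_page:)
  records.drop((page - 1) * per_page).first(per_page)
end

users = [
  { 'name' => 'Alice', 'age' => 30 },
  { 'name' => 'Dana', 'age' => nil },
  { 'name' => 'Bob', 'age' => 25 },
  { 'name' => 'Charlie', 'age' => 35 }
]

ascending  = order_records(users, 'age')
descending = order_records(users, 'age', :desc)
page_two   = paginate(ascending, page: 2, per_page: 2)
```

Partitioning before sorting avoids comparing nil with an Integer, which would otherwise raise an ArgumentError inside `sort_by`.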

Utility Methods

# Get first matching record
first_active = MudisQL.from('users')
  .where(status: 'active')
  .first

# Get last matching record
last_user = MudisQL.from('users')
  .order(:age)
  .last

# Count matching records
count = MudisQL.from('users')
  .where(status: 'active')
  .count

# Check if any records match
has_inactive = MudisQL.from('users')
  .where(status: 'inactive')
  .exists?

# Pluck specific fields
names = MudisQL.from('users').pluck(:name)
# => ["Alice", "Bob", "Charlie"]

name_age_pairs = MudisQL.from('users').pluck(:name, :age)
# => [["Alice", 30], ["Bob", 25], ["Charlie", 35]]

Complete Example

# Complex query combining multiple operations
result = MudisQL.from('users')
  .where(status: 'active')
  .where(age: ->(age) { age >= 25 })
  .order(:age, :desc)
  .limit(10)
  .offset(0)
  .all

# Or wrap the chain in a helper method for page-based pagination
def get_active_users(page: 1, per_page: 10)
  MudisQL.from('users')
    .where(status: 'active')
    .order(:name)
    .limit(per_page)
    .offset((page - 1) * per_page)
    .all
end

Aggregation

# Sum numeric values
total_salary = MudisQL.from('users')
  .where(status: 'active')
  .sum(:salary)

# Calculate average
avg_age = MudisQL.from('users')
  .average(:age)

# Group by field value
by_department = MudisQL.from('users')
  .where(status: 'active')
  .group_by(:department)
# => { "engineering" => [...], "sales" => [...], "marketing" => [...] }

# Aggregation with complex filtering
avg_high_earners = MudisQL.from('users')
  .where(status: 'active')
  .where(salary: ->(s) { s > 100000 })
  .average(:salary)
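
For readers who want to see what these aggregations amount to, here is a rough plain-Ruby equivalent over an array of hashes. The helper names are hypothetical, and skipping non-numeric values and returning 0.0 for an empty set are assumptions of this sketch, not documented mudis-ql behaviour.

```ruby
# Hypothetical helpers for illustration; not mudis-ql's actual implementation.

# Collect the numeric values of a field, ignoring nil and non-numeric entries.
def numeric_values(records, field)
  records.map { |r| r[field] }.grep(Numeric)
end

def sum_field(records, field)
  numeric_values(records, field).sum
end

def average_field(records, field)
  values = numeric_values(records, field)
  return 0.0 if values.empty? # assumption: an empty set averages to 0.0
  values.sum.to_f / values.size
end

def group_by_field(records, field)
  records.group_by { |r| r[field] }
end

users = [
  { 'name' => 'Alice', 'salary' => 120_000, 'department' => 'engineering' },
  { 'name' => 'Bob',   'salary' => 90_000,  'department' => 'sales' },
  { 'name' => 'Carol', 'salary' => 60_000,  'department' => 'engineering' },
  { 'name' => 'Dan',   'salary' => nil,     'department' => 'marketing' }
]

total   = sum_field(users, 'salary')
average = average_field(users, 'salary')
grouped = group_by_field(users, 'department')
```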

How It Works

mudis-ql works by:

  1. Retrieving all keys from a mudis namespace using Mudis.keys(namespace:)
  2. Loading values for each key using Mudis.read(key, namespace:)
  3. Applying filters in memory using Ruby's enumerable methods
  4. Sorting and paginating the results

This approach is efficient for moderate-sized datasets (thousands of records) that are already cached in memory. For very large datasets, consider using a proper database.
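
The four steps above can be sketched in plain Ruby as follows. `FakeStore` is a hypothetical stand-in for a mudis namespace so the example runs without either gem, and `run_query` is an illustrative name; neither reflects how mudis-ql is actually implemented.

```ruby
# Hypothetical stand-in for one mudis namespace (key => value hash).
class FakeStore
  def initialize(data)
    @data = data
  end

  # Step 1: list every key in the namespace.
  def keys
    @data.keys
  end

  # Step 2: read a single value by key.
  def read(key)
    @data[key]
  end
end

def run_query(store, conditions: {}, order_by: nil, limit: nil, offset: 0)
  # Steps 1-2: load every record and tag it with its key.
  records = store.keys.map { |k| store.read(k).merge('_key' => k) }

  # Step 3: keep records for which every condition matches (case equality).
  records = records.select do |r|
    conditions.all? { |field, matcher| matcher === r[field.to_s] }
  end

  # Step 4: sort, then paginate.
  records = records.sort_by { |r| r[order_by.to_s] } if order_by
  records = records.drop(offset)
  limit ? records.first(limit) : records
end

data = {
  'user1' => { 'name' => 'Alice',   'age' => 30, 'status' => 'active' },
  'user2' => { 'name' => 'Bob',     'age' => 25, 'status' => 'active' },
  'user3' => { 'name' => 'Charlie', 'age' => 35, 'status' => 'inactive' }
}
store = FakeStore.new(data)

result = run_query(store, conditions: { status: 'active' }, order_by: :age, limit: 1)
```

Every query walks the whole namespace before filtering, which is why cost grows linearly with the number of cached records.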

Integration with Rails

# app/services/user_cache_service.rb
class UserCacheService
  NAMESPACE = 'users'

  def self.cache_user(user)
    Mudis.write(
      user.id.to_s,
      user.attributes.slice('name', 'email', 'status', 'created_at'),
      expires_in: 3600,
      namespace: NAMESPACE
    )
  end

  def self.active_users(limit: 50)
    MudisQL.from(NAMESPACE)
      .where(status: 'active')
      .order(:created_at, :desc)
      .limit(limit)
      .all
  end

  def self.search_by_name(pattern)
    MudisQL.from(NAMESPACE)
      .where(name: /#{Regexp.escape(pattern)}/i)
      .all
  end
end

Integration with Hanami

# lib/my_app/repos/user_cache_repo.rb
module MyApp
  module Repos
    class UserCacheRepo
      NAMESPACE = 'users'

      def find_active(limit: 50)
        MudisQL.from(NAMESPACE)
          .where(status: 'active')
          .limit(limit)
          .all
      end

      def find_by_age_range(min:, max:)
        MudisQL.from(NAMESPACE)
          .where(age: min..max)
          .order(:age)
          .all
      end
    end
  end
end

Querying Mudis Metrics

mudis-ql also exposes mudis cache metrics, with per-key and per-bucket data queryable through the same Scope interface:

Basic Metrics

# Get a metrics scope
metrics = MudisQL.metrics

# Top-level metrics summary
summary = metrics.summary
# => { hits: 150, misses: 20, evictions: 5, rejected: 0, total_memory: 45678 }

# Cache hit rate
hit_rate = metrics.hit_rate
# => 88.24 (percentage)

# Overall efficiency
efficiency = metrics.efficiency
# => { hit_rate: 88.24, miss_rate: 11.76, eviction_rate: 2.94, rejection_rate: 0.0 }

# Total keys and memory
metrics.total_keys     # => 1000
metrics.total_memory   # => 2048576 (bytes)
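
As a worked check of the sample figures above, the sketch below derives the efficiency hash from the summary hash. Using hits + misses as the denominator for every rate is an assumption inferred from the sample numbers, and the helper names are hypothetical.

```ruby
# Hypothetical helpers for illustration; the denominator is an assumption
# inferred from the sample output, not confirmed library behaviour.

# Percentage of part in total, rounded to two decimals (0.0 when total is 0).
def rate(part, total)
  return 0.0 if total.zero?
  (part.to_f / total * 100).round(2)
end

def efficiency_from(summary)
  total = summary[:hits] + summary[:misses]
  {
    hit_rate: rate(summary[:hits], total),
    miss_rate: rate(summary[:misses], total),
    eviction_rate: rate(summary[:evictions], total),
    rejection_rate: rate(summary[:rejected], total)
  }
end

summary = { hits: 150, misses: 20, evictions: 5, rejected: 0, total_memory: 45_678 }
efficiency = efficiency_from(summary)
```

With 150 hits out of 170 lookups, the hit rate works out to 88.24, matching the value shown for `metrics.hit_rate`.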

Querying Least Touched Keys

# Get least accessed keys (returns a Scope)
least_touched = metrics.least_touched

# Find never-accessed keys
never_used = least_touched
  .where(access_count: 0)
  .pluck(:key)

# Find keys accessed less than 5 times
rarely_used = least_touched
  .where(access_count: ->(count) { count < 5 })
  .order(:access_count)
  .all

# Identify hotspots (most accessed)
hotspots = least_touched
  .order(:access_count, :desc)
  .limit(10)
  .pluck(:key, :access_count)
# => [["user:123", 450], ["product:456", 380], ...]

# Quick helper for never-accessed keys
cold_keys = metrics.never_accessed_keys
# => ["temp:old_session", "cache:expired_data", ...]

Querying Bucket Metrics

# Get bucket metrics (returns a Scope)
buckets = metrics.buckets

# Find buckets with high memory usage
high_memory = buckets
  .where(memory_bytes: ->(m) { m > 1_000_000 })
  .order(:memory_bytes, :desc)
  .all

# Find imbalanced buckets (many keys)
busy_buckets = buckets
  .where(keys: ->(k) { k > 50 })
  .pluck(:index, :keys, :memory_bytes)

# Analyze specific bucket
bucket_5 = buckets.where(index: 5).first

# Distribution statistics
dist = metrics.bucket_distribution
# => {
#   total_buckets: 32,
#   avg_keys_per_bucket: 31.25,
#   max_keys_per_bucket: 45,
#   min_keys_per_bucket: 18,
#   avg_memory_per_bucket: 65536.5,
#   max_memory_per_bucket: 98304,
#   min_memory_per_bucket: 32768
# }
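
The distribution statistics are simple aggregates over the per-bucket records. Here is a minimal sketch, assuming each bucket is a hash with `:index`, `:keys`, and `:memory_bytes` entries (the field names come from the queries above; the record shape and the helper name are assumptions).

```ruby
# Hypothetical helper for illustration; the bucket record shape is assumed.
def distribution_for(buckets)
  return {} if buckets.empty?

  keys   = buckets.map { |b| b[:keys] }
  memory = buckets.map { |b| b[:memory_bytes] }

  {
    total_buckets: buckets.size,
    avg_keys_per_bucket: (keys.sum.to_f / keys.size).round(2),
    max_keys_per_bucket: keys.max,
    min_keys_per_bucket: keys.min,
    avg_memory_per_bucket: (memory.sum.to_f / memory.size).round(2),
    max_memory_per_bucket: memory.max,
    min_memory_per_bucket: memory.min
  }
end

buckets = [
  { index: 0, keys: 18, memory_bytes: 32_768 },
  { index: 1, keys: 45, memory_bytes: 98_304 },
  { index: 2, keys: 30, memory_bytes: 65_536 }
]

dist = distribution_for(buckets)
```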

Advanced Metrics Queries

# Find buckets needing rebalancing
avg_keys = metrics.bucket_distribution[:avg_keys_per_bucket]
unbalanced = metrics.buckets
  .where(keys: ->(k) { k > avg_keys * 1.5 })
  .order(:keys, :desc)
  .pluck(:index, :keys)

# Monitor memory hotspots
memory_threshold = 5_000_000
hot_buckets = metrics.high_memory_buckets(memory_threshold)

# Find buckets with many keys
key_threshold = 100
busy_buckets = metrics.high_key_buckets(key_threshold)

# Cache health monitoring
health_report = {
  hit_rate: metrics.hit_rate,
  total_keys: metrics.total_keys,
  memory_usage: metrics.total_memory,
  cold_keys_count: metrics.never_accessed_keys.size,
  efficiency: metrics.efficiency,
  distribution: metrics.bucket_distribution
}

Real-time Monitoring

# Refresh metrics to get latest data
current_metrics = metrics.refresh

# Monitor cache performance over time
def cache_health_check
  m = MudisQL.metrics
  
  {
    timestamp: Time.now,
    hit_rate: m.hit_rate,
    total_keys: m.total_keys,
    memory_mb: (m.total_memory / 1024.0 / 1024.0).round(2),
    cold_keys: m.never_accessed_keys.size,
    hottest_keys: m.least_touched.order(:access_count, :desc).limit(5).pluck(:key),
    memory_hotspots: m.high_memory_buckets(1_000_000).size
  }
end

# Create a dashboard endpoint
class MetricsController < ApplicationController
  def show
    render json: {
      summary: MudisQL.metrics.summary,
      efficiency: MudisQL.metrics.efficiency,
      distribution: MudisQL.metrics.bucket_distribution,
      top_keys: MudisQL.metrics.least_touched.order(:access_count, :desc).limit(10).all
    }
  end
end

API Reference

MudisQL.from(namespace)

Creates a new scope for the specified mudis namespace.

Returns: MudisQL::Scope

MudisQL.metrics

Access mudis metrics with a queryable interface.

Returns: MudisQL::MetricsScope

Scope Methods

| Method | Description | Returns |
| --- | --- | --- |
| where(conditions) | Filter by hash of conditions | Scope (chainable) |
| order(field, direction) | Sort by field (:asc or :desc) | Scope (chainable) |
| limit(n) | Limit results to n records | Scope (chainable) |
| offset(n) | Skip first n records | Scope (chainable) |
| all | Execute query, return all results | Array<Hash> |
| first | Return first matching record | Hash or nil |
| last | Return last matching record | Hash or nil |
| count | Count matching records | Integer |
| exists? | Check if any records match | Boolean |
| pluck(*fields) | Extract specific fields | Array |
| sum(field) | Sum numeric values of a field | Integer or Float |
| average(field) | Average numeric values of a field | Float |
| group_by(field) | Group records by field value | Hash |

MetricsScope Methods

| Method | Description | Returns |
| --- | --- | --- |
| summary | Top-level metrics (hits, misses, etc.) | Hash |
| least_touched | Query least accessed keys | Scope |
| buckets | Query bucket metrics | Scope |
| total_keys | Sum of keys across all buckets | Integer |
| total_memory | Total memory usage in bytes | Integer |
| hit_rate | Cache hit rate percentage | Float |
| efficiency | Hit/miss/eviction/rejection rates | Hash |
| high_memory_buckets(threshold) | Buckets exceeding memory threshold | Array<Hash> |
| high_key_buckets(threshold) | Buckets with many keys | Array<Hash> |
| bucket_distribution | Distribution statistics | Hash |
| never_accessed_keys | Keys with 0 access count | Array<String> |
| refresh | Reload metrics data | MetricsScope |

Condition Matchers

Mudis-QL supports multiple types of matchers in where conditions:

# Exact match
.where(status: 'active')

# Proc/Lambda for custom logic
.where(age: ->(v) { v >= 18 })

# Regex for pattern matching
.where(name: /^A/i)

# Range for inclusive matching
.where(age: 18..65)
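
One way to picture how these four matcher types could be told apart is a dispatch on the matcher's class. The `matches?` helper below is a hypothetical sketch, not the library's code, and treating nil as a non-match for Regex and Range matchers is an assumption.

```ruby
# Hypothetical helper for illustration; not mudis-ql's actual implementation.
def matches?(value, matcher)
  case matcher
  when Proc   then matcher.call(value) ? true : false
  when Regexp then value.is_a?(String) && matcher.match?(value)
  when Range  then !value.nil? && matcher.cover?(value)
  else             value == matcher
  end
end

exact_hit = matches?('active', 'active')
proc_hit  = matches?(30, ->(v) { v >= 18 })
regex_hit = matches?('alice', /^A/i)
range_hit = matches?(65, 18..65)   # both ends of 18..65 are included
range_out = matches?(66, 18..65)
```

Checking Proc first matters because a lambda should be called with the value rather than compared to it for equality.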

Performance Considerations

  • Best for: Small to medium datasets (hundreds to thousands of records)
  • Memory: All matching keys are loaded into memory for filtering
  • Speed: Fast for cached data, but every query involves a full scan of the namespace
  • Use Case: Perfect for frequently accessed, relatively static data that benefits from caching

For very large datasets or complex queries, consider using a proper database alongside mudis for caching.

Known Limitations

  1. Full scan required: Unlike databases with indexes, mudis-ql must load all records from a namespace to filter them
  2. In-memory processing: All filtering happens in Ruby memory, not at the storage layer
  3. No joins: Cannot join data across namespaces (each query targets one namespace)
  4. Namespaces required for queries: mudis-ql requires mudis namespaces to list and query collections. Keys stored without namespaces cannot be queried as a collection (individual key access still works via mudis directly).

These limitations are by design to maintain simplicity and compatibility with mudis's key-value architecture.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/kiebor81/mudisql.

See Also