Unicode Namecode
A powerful Ruby gem for Unicode character lookups with support for official names, aliases, emojis, and fuzzy matching.
Features
- Fast Unicode Lookups: Find codepoints by official Unicode names
- Alias Support: Look up characters using common aliases (e.g., "NULL", "TAB")
- Emoji Integration: Bidirectional emoji ↔ codepoint lookups
- Fuzzy Matching: Find characters even with typos or partial names
- Prefix Search: Discover characters starting with specific prefixes
- Reverse Lookups: Get names and emojis from codepoints
- Optimized Performance: Trie-based lookups with caching
- CLI Tool: Command-line interface for quick lookups
Installation
As a Gem
gem install unicode-namecode
From Source
git clone https://github.com/Aeroswift/unicode-namecode.git
cd unicode-namecode
bundle install
Usage
Basic Lookups
require 'unicode_namecode'
# Official Unicode names
UnicodeNamecode.codepoint("SNOWMAN") # => 9731 (0x2603)
UnicodeNamecode.lookup("SNOWMAN") # => "U+2603"
# Aliases work too!
UnicodeNamecode.codepoint("NULL") # => 0 (0x0000)
UnicodeNamecode.codepoint("TAB") # => 9 (0x0009)
# Character-based lookups
UnicodeNamecode.of("☃") # => "SNOWMAN"
UnicodeNamecode.codepoint_of("☃") # => 9731
UnicodeNamecode.unicode_of("☃") # => "U+2603"
Emoji Support
# Emoji → Codepoint
UnicodeNamecode.codepoint_for_emoji("😊") # => 128522 (0x1F60A)
UnicodeNamecode.name_for_emoji("😊") # => "SMILING FACE WITH SMILING EYES"
# Codepoint → Emoji
UnicodeNamecode.emoji_for_codepoint(0x1F60A) # => "😊"
Search Features
# Prefix search
UnicodeNamecode.prefix_search("SNOW", 5)
# => [{name: "SNOWMAN", codepoint: 9731}, {name: "SNOWFLAKE", codepoint: 10052}, ...]
# Fuzzy search (for typos)
UnicodeNamecode.fuzzy_search("SNOWMN", 3)
# => [{name: "SNOWMAN", similarity: 0.85}, ...]
# Reverse lookup
UnicodeNamecode.name_for_codepoint(0x2603) # => "SNOWMAN"
Alias Management
# Check if a name is an alias
UnicodeNamecode.is_alias?("NULL") # => true
UnicodeNamecode.is_alias?("SNOWMAN") # => false
# Get all aliases for a codepoint
UnicodeNamecode.aliases_for_codepoint(0x0000)
# => [{name: "NULL", type: "control"}, {name: "NUL", type: "abbreviation"}]
Command Line Interface
The gem includes a powerful CLI tool for quick lookups:
# Basic search (tries exact, then alias, then fuzzy)
unicode-namecode search SNOWMAN
# ✓ Found: SNOWMAN → U+2603 (9731) [official name]
unicode-namecode search NULL
# ✓ Found: NULL → U+0 (0) [alias]
# Prefix search
unicode-namecode prefix SNOW
# Names starting with 'SNOW':
# SNOWMAN → U+2603
# SNOWFLAKE → U+2744
# SNOWBOARDER → U+1F3C2
# Fuzzy search for typos
unicode-namecode fuzzy SNOWMN
# Fuzzy matches for 'SNOWMN':
# SNOWMAN → U+2603 (85.7% match)
# Emoji lookups
unicode-namecode emoji 😊
# 😊 → U+1F60A (SMILING FACE WITH SMILING EYES)
unicode-namecode emoji 0x1F60A
# U+1F60A → 😊 (SMILING FACE WITH SMILING EYES)
# Reverse lookup
unicode-namecode reverse 0x2603
# U+2603:
# Name: SNOWMAN
# Emoji: None
# Aliases: []
# Alias information
unicode-namecode alias 0x0000
# U+0 ():
# Aliases:
# NULL (control)
# NUL (abbreviation)
Architecture
The gem is built with a modular architecture for maintainability and performance:
- UnicodeNamecode: Main module with public API
- Trie: Efficient prefix tree for fast lookups
- DataLoader: Handles data parsing and caching
- Emoji: Manages emoji-specific functionality
- Aliases: Handles Unicode name aliases
- Fuzzy: Provides typo-tolerant search
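To make the role of the Fuzzy component concrete, here is a minimal sketch of typo-tolerant ranking based on Levenshtein edit distance. The gem's actual scoring method is not described in this README, so both the technique and the helper names (edit_distance, similarity, fuzzy_rank) are assumptions made for illustration, not the gem's API.

```ruby
# Hypothetical helpers, not part of the gem's API.

# Levenshtein edit distance, computed one row at a time.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # Cheapest of: deletion, insertion, substitution (or match).
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Similarity in 0.0..1.0, where 1.0 means the strings are identical.
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?
  1.0 - edit_distance(a, b).to_f / longest
end

# Rank candidate names by similarity to the query and keep the best `limit`.
def fuzzy_rank(query, names, limit = 3)
  names.map { |name| { name: name, similarity: similarity(query, name).round(3) } }
       .sort_by { |hit| -hit[:similarity] }
       .first(limit)
end

names = ["SNOWMAN", "SNOWFLAKE", "SNOWBOARDER", "SUN"]
fuzzy_rank("SNOWMN", names, 2).each do |hit|
  puts "#{hit[:name]} #{hit[:similarity]}"
end
# prints:
# SNOWMAN 0.857
# SNOWFLAKE 0.444
```

Comparing the query against every name costs time proportional to the size of the name list, so a real implementation would likely narrow the candidates first, for example by length or by a shared prefix.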
Development
Running Tests
ruby test/test_unicode_namecode.rb
Building the Gem
gem build unicode-namecode.gemspec
Installing Locally
gem install unicode-namecode-0.1.0.gem
Performance
- First Load: ~2-3 seconds (parses Unicode data)
- Subsequent Loads: ~100ms (uses cached Trie)
- Lookup Speed: O(k) where k is the length of the name
- Memory Usage: ~50MB for full Unicode dataset
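The O(k) lookup and the cached-trie load time can be pictured with a small, self-contained sketch. It is not the gem's implementation: the NameTrie class, its method names, and the use of Marshal for the cache are all assumptions chosen for illustration.

```ruby
# Illustrative trie; class and method names are hypothetical, not the gem's.
class NameTrie
  Node = Struct.new(:children, :codepoint)

  def initialize
    @root = Node.new({}, nil)
  end

  # Walk (creating as needed) one node per character, then store the codepoint.
  def insert(name, codepoint)
    node = name.each_char.inject(@root) do |current, ch|
      current.children[ch] ||= Node.new({}, nil)
    end
    node.codepoint = codepoint
  end

  # Exact lookup visits at most name.length nodes, hence O(k).
  def codepoint(name)
    node = walk(name)
    node && node.codepoint
  end

  # Collect up to `limit` stored names that start with `prefix`.
  def prefix_search(prefix, limit = 10)
    start = walk(prefix)
    return [] unless start
    results = []
    stack = [[start, prefix]]
    until stack.empty? || results.size >= limit
      node, name = stack.pop
      results << { name: name, codepoint: node.codepoint } if node.codepoint
      node.children.each { |ch, child| stack.push([child, name + ch]) }
    end
    results
  end

  private

  # Follow the characters of `string`; nil if the path does not exist.
  def walk(string)
    string.each_char.inject(@root) do |node, ch|
      return nil unless node.children.key?(ch)
      node.children[ch]
    end
  end
end

trie = NameTrie.new
trie.insert("SNOWMAN", 0x2603)
trie.insert("SNOWFLAKE", 0x2744)
trie.insert("NULL", 0x0000)

# Round-trip through Marshal to mimic writing and reloading a cache
# (assumption: the gem's real cache format is not stated in this README).
restored = Marshal.load(Marshal.dump(trie))

puts format("U+%04X", restored.codepoint("SNOWMAN"))       # prints U+2603
p restored.prefix_search("SNOW").map { |h| h[:name] }.sort # prints ["SNOWFLAKE", "SNOWMAN"]
```

Reloading a serialized structure skips re-parsing the source data, which is the general reason a cached load can be much faster than the first one. Marshal.load should only ever be used on files the program wrote itself, since loading untrusted data is unsafe.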
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE.txt file for details.
Acknowledgments
- Unicode Consortium for the Unicode data files
- The Ruby community for inspiration and tools
- Contributors and users of this gem
Developed by Mikkal Mullen