WhatsApp Chat Parser
A Ruby library that parses exported WhatsApp chat .txt files and converts them into structured, machine-readable data. It is designed for downstream processing such as analytics, ETL pipelines, storage, and transformation, not for rendering UI or interacting with the WhatsApp API.
Features
- Platform support: Handles both Android and iOS WhatsApp chat exports
- Structured output: Normalized message records suitable for JSON, databases, or further transformation
- Robust parsing: Detects platform-specific formats, normalizes timestamps, and groups multi-line messages
- Deterministic: No dependencies, explicit platform handling, predictable output structure
- Fail-safe: Skips or handles malformed lines when possible instead of aborting
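To illustrate the "robust parsing" and "fail-safe" points above, here is a minimal sketch of how platform detection and multi-line grouping could work. It is not the library's actual implementation. The names `STAMP`, `ANDROID_HEADER`, `IOS_HEADER`, `detect_platform`, and `group_lines` are hypothetical, and the header patterns assume an Android line like `12/15/25, 10:30:00 AM - Name: text` and an iOS line like `[12/15/25, 10:30:00 AM] Name: text`. Real exports vary by locale and app version.

```ruby
# Hypothetical timestamp pattern shared by both header styles (an assumption).
STAMP = %r{\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?::\d{2})?(?:\s?[AP]M)?}
ANDROID_HEADER = /\A#{STAMP} - /
IOS_HEADER     = /\A\[#{STAMP}\] /

# Returns :android, :ios, or nil when the line does not start a new message.
def detect_platform(line)
  return :android if line.match?(ANDROID_HEADER)
  return :ios if line.match?(IOS_HEADER)

  nil
end

# Groups raw lines into messages: a line without a header is treated as a
# continuation of the previous message. Lines that appear before the first
# header are skipped instead of raising.
def group_lines(lines)
  lines.each_with_object([]) do |raw, groups|
    line = raw.chomp
    if detect_platform(line)
      groups << line.dup
    elsif groups.any?
      groups.last << "\n" << line
    end
  end
end

raw = [
  "12/15/25, 10:30:00 AM - John Doe: Hello World\n",
  "this is a second line\n",
  "12/15/25, 10:31:00 AM - Jane Doe: Hi\n"
]
grouped = group_lines(raw)
puts grouped.length # prints 2
```

Treating any header-less line as a continuation keeps the sketch tolerant of malformed input, at the cost of attaching stray lines to the preceding message.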
Installation
Add to your Gemfile:
gem 'whatsapp-chat-parser'
Then run:
bundle install
Or install directly:
gem install whatsapp-chat-parser
Usage
Parse a single message string (returns a Message or nil if malformed):
require 'whatsapp-chat-parser'
line = '12/15/25, 10:30:00 AM - John Doe: Hello World'
msg = WhatsappChatParser.parse_line(line)
puts "#{msg.timestamp} | #{msg.author}: #{msg.body}" if msg
Parse a file by path or IO (returns an enumerable of Message; malformed lines are skipped):
messages = WhatsappChatParser.parse_file('path/to/chat.txt')
messages.each { |msg| puts "#{msg.timestamp} | #{msg.author}: #{msg.body}" }
File.open('path/to/chat.txt') do |f|
  WhatsappChatParser.parse_file(f).each { |msg| puts "#{msg.timestamp} | #{msg.author}: #{msg.body}" }
end
Each message has timestamp, author, body, platform and type. The result is suitable for JSON, databases, or pipelines.
For a more comprehensive example, see samples/example.rb.
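Since the parsed messages are described as suitable for JSON, here is a hedged sketch of writing them out as JSON Lines. So that it runs without the gem installed, it uses a stand-in `Message` struct with the five documented fields. The real class, the `to_json_line` helper, and the assumption that `timestamp` is a `Time` and that `platform` and `type` are symbols are all illustrative, not confirmed by the library.

```ruby
require 'json'
require 'time'

# Stand-in for the library's Message (an assumption; the real class may differ).
Message = Struct.new(:timestamp, :author, :body, :platform, :type, keyword_init: true)

# Serializes one message as a single JSON line with an ISO 8601 timestamp.
def to_json_line(msg)
  JSON.generate(
    timestamp: msg.timestamp.iso8601,
    author: msg.author,
    body: msg.body,
    platform: msg.platform,
    type: msg.type
  )
end

messages = [
  Message.new(
    timestamp: Time.utc(2025, 12, 15, 10, 30, 0),
    author: 'John Doe',
    body: 'Hello World',
    platform: :android,
    type: :user
  )
]

messages.each { |msg| puts to_json_line(msg) }
```

With the real library, the `messages` array would presumably be replaced by the result of `WhatsappChatParser.parse_file`, provided its records respond to the same five readers.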
Output format
Each parsed record includes:
| Field | Description |
|---|---|
| timestamp | Normalized date/time (consistent across platforms) |
| author | Sender name or identifier (when present) |
| body | Full message content (multi-line messages grouped) |
| platform | Platform the chat was exported from (Android/iOS) |
| type | Message kind, e.g. user message or system message |
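As an illustration of what a "normalized" timestamp could mean, the sketch below converts a raw export timestamp into an ISO 8601-style string. It is an assumption-laden example, not the library's method. `normalize_timestamp` is a hypothetical helper, and it assumes month/day/two-digit-year ordering with a 12-hour clock, as in the usage example `12/15/25, 10:30:00 AM`. Exports from other locales use other orderings, and exports carry no time zone, so none is emitted.

```ruby
require 'time'

# Hypothetical helper: parses "12/15/25, 10:30:00 AM" (optionally wrapped in
# the square brackets assumed for iOS) and returns "YYYY-MM-DDTHH:MM:SS".
# Returns nil when the input does not match the assumed format.
def normalize_timestamp(raw)
  cleaned = raw.delete('[]').strip
  Time.strptime(cleaned, '%m/%d/%y, %I:%M:%S %p').strftime('%Y-%m-%dT%H:%M:%S')
rescue ArgumentError
  nil
end

puts normalize_timestamp('12/15/25, 10:30:00 AM')   # prints 2025-12-15T10:30:00
puts normalize_timestamp('[12/15/25, 10:30:00 PM]') # prints 2025-12-15T22:30:00
```

Returning `nil` on a parse failure mirrors the documented behavior of skipping malformed input rather than aborting.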
Design principles
- Deterministic parsing - Same input yields same output
- No dependencies - Self-contained Ruby
- Explicit platform handling - Android vs iOS format differences are handled explicitly
- Predictable structure - Stable, documented output schema
Use cases
- Chat analytics and reporting
- Data migration or archival
- ETL pipelines into databases or spreadsheets
- Automated processing of exported WhatsApp conversations
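As a small example of the analytics use case, the following sketch counts user messages per author. It runs on stand-in records, not on the library's own objects. The `Record` struct, the `messages_per_author` helper, and the `:user` / `:system` type values are assumptions made for illustration.

```ruby
# Stand-in record with only the fields this example needs (an assumption).
Record = Struct.new(:author, :body, :type, keyword_init: true)

# Counts user messages per author, most active first (ties sorted by name).
# Requires Ruby 2.7+ for Enumerable#tally.
def messages_per_author(messages)
  messages
    .select { |msg| msg.type == :user }
    .map(&:author)
    .tally
    .sort_by { |author, count| [-count, author] }
    .to_h
end

sample = [
  Record.new(author: 'John Doe', body: 'Hello World', type: :user),
  Record.new(author: 'Jane Doe', body: 'Hi', type: :user),
  Record.new(author: nil, body: 'John Doe created the group', type: :system),
  Record.new(author: 'John Doe', body: 'How are you?', type: :user)
]

messages_per_author(sample).each { |author, count| puts "#{author}: #{count}" }
```

System messages are filtered out first because they may have no author, which would otherwise show up as a `nil` key in the tally.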
Non-goals
This library does not:
- Interact with WhatsApp APIs
- Require network access
- Perform message interpretation, sentiment analysis, or NLP
- Handle encrypted or proprietary WhatsApp data formats
Input must be unmodified exports from WhatsApp’s “Export Chat” feature.
How to export WhatsApp chats
To use this library you need a plain-text export of a WhatsApp conversation. Use WhatsApp’s built-in Export Chat and choose Without media so you get a single .txt file.
Use the exported .txt file as-is; do not edit the format. This library supports both Android and iOS export formats.
Development
Setup
Clone the repository and install dependencies:
git clone https://github.com/emmaakachukwu/whatsapp-chat-parser-rb
cd whatsapp-chat-parser-rb
bundle install
Running Tests
We use RSpec for testing. Ensure all tests pass before submitting changes:
bundle exec rspec
Linting
We use RuboCop to maintain code quality:
bundle exec rubocop
Contributing
Contributions are welcome. Please open an issue or pull request on the project repository.
- Fork the repository and create a feature branch.
- Ensure your code follows the Development steps above (tests and linting pass).
- Submit a Pull Request with a detailed description of your work.
License
This project is licensed under the MIT License. See the LICENSE file for details.