Kore — Killer Optimized Record Exchange

A high-performance, columnar file format for analytics with cloud storage connectors.

Kore is a Rust-based columnar file format designed for efficient storage and analysis of structured data. It provides zero external dependencies in the base library with optional cloud connectors for AWS S3, Azure Blob Storage, and Google Cloud Storage.

🚀 Quick Start (5 Minutes)

Step 1: Install

pip install kore-fileformat==1.1.4

Step 2: Import

from kore_fileformat import compress_csv

# Compress a CSV file (that's it!)
original, compressed, ratio = compress_csv("data.csv", "data.kore")
print(f"✅ Compressed {ratio:.1%}")

Step 3: Use It

from kore_fileformat import get_kore_info

# Get file information
info = get_kore_info("data.kore")
print(f"Records: {info['total_records']:,}")

📚 Documentation

Getting Started

Document	For Whom	Time
docs/INSTALLATION.md	Everyone	5 min
docs/USER_GUIDE.md	Python users	15 min
docs/EXAMPLES.md	Developers	20 min

Reference

Document	Coverage
docs/API_REFERENCE.md	Complete API documentation with examples
docs/TROUBLESHOOTING.md	FAQ, common issues, solutions

Advanced Documentation

Guide	Purpose
PYTHON_USER_GUIDE.md	Cloud integration, advanced features
DOCKER_EMULATORS_GUIDE.md	Docker setup, LocalStack, Azurite testing
DOCUMENTATION_INDEX.md	Complete documentation index
CI_CD_SECRETS_SETUP.md	GitHub Actions, publishing setup

👉 New users: Start with docs/INSTALLATION.md → docs/USER_GUIDE.md

✨ Features

Core Library

✅ Zero External Dependencies: Lightweight base crate
✅ Columnar Format: Optimized for analytical queries
✅ Compression: Built-in data compression
✅ Multi-Platform: Windows, macOS, Linux, web

Cloud Connectors

✅ AWS S3: Full implementation in v1.0.0
⏳ Azure Blob Storage: Full implementation coming in v1.1.0
⏳ Google Cloud Storage: Full implementation coming in v1.1.0

Language Bindings

✅ Python: PyO3 wheel for Python 3.9-3.12
✅ Java: JNI bindings and Maven package
✅ JavaScript: NAPI module for Node.js
⏳ Go: Coming in v1.2.0

DevOps & CI/CD

✅ GitHub Actions: 10 automated jobs for testing and publishing
✅ Docker Support: Integration tests with emulators
✅ Multi-Registry Publishing: crates.io, PyPI, Maven Central, npm

📋 What's Included in v1.0.0

Component	Status	Details
Base Library	✅ Production	Columnar format, compression, serialization
S3 Connector	✅ Production	Read/write to AWS S3 with LocalStack testing
Azure Connector	⏳ Prepared	Stub implementations, full SDK in v1.1.0
GCS Connector	⏳ Prepared	Stub implementations, full SDK in v1.1.0
Python Bindings	✅ Production	Wheel installation, PyPI distribution
Java Bindings	✅ Production	JNI library, Maven Central distribution
JavaScript Bindings	✅ Production	NAPI addon, npm distribution
Integration Tests	✅ Complete	4 comprehensive tests with emulators
Documentation	✅ Complete	8 guides, 2000+ lines, 50+ examples

🎯 Use Cases

Data Analytics

Process large datasets efficiently with columnar storage:

import kore_fileformat
# Store analytics data in columnar format for fast queries

Cloud Data Lakes

Store data directly in S3, Azure, or GCS:

from kore_fileformat import S3Reader
reader = S3Reader(region='us-east-1')
data = reader.read_file('my-bucket', 'path/to/data.kore')

Multi-Language Projects

Use Kore from Python, Java, or JavaScript in the same project:

# Python: import kore_fileformat
# Java: import com.kore.cloud.S3Reader;
# JS: const kore = require('kore-fileformat');

🏗️ Architecture

Modular Design

kore_fileformat/
├── core/           # Base library (zero dependencies)
├── cloud/          # Cloud connectors (optional)
│   ├── s3/        # AWS S3 (working)
│   ├── azure/     # Azure Blob (v1.1+)
│   └── gcs/       # Google Cloud (v1.1+)
└── bindings/       # Language bindings
    ├── python/    # PyO3 wheel
    ├── java/      # JNI library
    └── napi/      # Node.js addon

Feature Gates

# Base: zero external dependencies
kore_fileformat = "1.0.0"

# With S3
kore_fileformat = { version = "1.0.0", features = ["s3"] }

# With all cloud (v1.1.0+)
kore_fileformat = { version = "1.0.0", features = ["s3", "azure", "gcs"] }

# With Python bindings
# Use: pip install kore-fileformat

📊 Performance

Kore is designed for analytics workloads:

Compression: 5-10x reduction on typical datasets
Query Speed: Columnar format enables fast aggregations
Storage: 10-50 MB files with millions of rows
Cloud: Direct S3/Azure/GCS integration (no intermediate files)

🛠️ Installation

Requirements

Rust: 1.70+ (for building from source)
Python: 3.9-3.12 (for Python wheel)
Java: 17+ (for Java bindings)
Node.js: 14+ (for JavaScript bindings)
Docker: 20.10+ (for testing with emulators)

From PyPI (Recommended for Python)

pip install kore-fileformat

From crates.io (Rust)

cargo add kore_fileformat --features s3

Build from Source

git clone https://github.com/arunkatherashala/Kore.git
cd Kore
cargo build --release --features s3

🧪 Testing

Run Unit Tests

cargo test

Run Integration Tests (requires Docker)

# Start emulators (LocalStack, Azurite, GCS)
docker-compose up -d

# Run tests
cargo test --features s3,azure,gcs --test integration_tests -- --nocapture

# Stop emulators
docker-compose down

See DOCKER_EMULATORS_GUIDE.md for detailed setup.

🚀 Cloud Integration

AWS S3 (v1.0.0 - Working)

from kore_fileformat import S3Reader

reader = S3Reader(region='us-east-1')
data = reader.read_file('bucket', 'object.kore')
reader.write_file('bucket', 'object.kore', data)

Azure Blob Storage (v1.1.0 - Coming Soon)

from kore_fileformat import AzureBlobReader

reader = AzureBlobReader('account', 'key')
data = reader.read_file('container', 'blob.kore')

Google Cloud Storage (v1.1.0 - Coming Soon)

from kore_fileformat import GcsReader

reader = GcsReader('project-id')
data = reader.read_file('bucket', 'object.kore')

🤝 Contributing

We welcome contributions! Here's how:

Report Issues: GitHub Issues
Discuss Ideas: GitHub Discussions
Submit PRs: Fork, branch, code, and create a pull request

See V1_1_ROADMAP.md for planned features and how to help.

📅 Roadmap

v1.0.0 (Current) ✅

S3 connector with full API
Python, Java, JavaScript bindings
Integration tests with emulators
Complete documentation

v1.1.0 (Q2 2026)

Azure Blob Storage full implementation
Google Cloud Storage full implementation
Performance optimizations
Streaming support

v2.0.0 (Q4 2026)

Go language bindings
Multi-region support
Caching layer
Advanced compression

See V1_1_ROADMAP.md for detailed phases and milestones.

📦 Distribution Channels

Latest Versions

Platform	Package	Version	Link
PyPI	kore-fileformat	1.0.0	PyPI
Crates.io	kore_fileformat	1.0.0	Crates.io
Maven	com.arun.kore:kore-cloud-java	1.0.0	Coming v1.1
npm	kore-fileformat	1.0.0	Coming v1.1

🔒 Security

Features

Zero external dependencies in base library
Optional SDKs are version-pinned and updated regularly
Integration tests verify cloud connectivity
GitHub Actions security scanning

Reporting Security Issues

Please email: arunkatherashala@gmail.com

📞 Support & Community

Getting Help

Documentation: DOCUMENTATION_INDEX.md
Issues: GitHub Issues
Discussions: GitHub Discussions
Email: arunkatherashala@gmail.com

Stay Updated

GitHub: Star the repository
Releases: Watch for v1.1.0 announcement
Email: Subscribe to release notifications

📄 License

Kore is licensed under the Apache License 2.0.

Copyright 2024-2026 Sai Arun Kumar Ktherashala

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

👤 Author

Sai Arun Kumar Ktherashala

🎯 What's Next?

For Users

Read: PYTHON_USER_GUIDE.md or DOCKER_EMULATORS_GUIDE.md
Install: pip install kore-fileformat
Explore: Check DOCUMENTATION_INDEX.md for your role

For Contributors

Review: V1_1_ROADMAP.md for v1.1.0 features
Clone: git clone https://github.com/arunkatherashala/Kore.git
Setup: Follow DOCKER_EMULATORS_GUIDE.md
Code: Create feature branch and submit PR

For DevOps

Setup: CI_CD_SECRETS_SETUP.md for automated publishing
Monitor: GitHub Actions workflows on each push
Release: Tag v1.0.1 or v1.1.0 to trigger publishing

✅ Project Status

Phase	Status	Delivered
Phase 1: Core Library	✅ Complete	Base Kore format, compression, serialization
Phase 2: Cloud SDKs	✅ Partial	S3 working, Azure/GCS coming v1.1
Phase 3: Language Bindings	✅ Complete	Python, Java, JavaScript production-ready
Phase 4: Integration Tests	✅ Complete	4 comprehensive tests with emulators
Phase 5: CI/CD & Publishing	✅ Complete	10 automated jobs, multi-registry support
Documentation	✅ Complete	8 guides, 2000+ lines, 50+ examples

🎉 Thank You!

Thank you for choosing Kore! We're excited to see what you build.

Questions? Open an issue or discussion on GitHub.
Want to help? Check V1_1_ROADMAP.md for features to implement.
Found a bug? Report it on GitHub Issues.

Latest Release: v1.0.0
Last Updated: May 14, 2026
Status: Production Ready ✅

🚀 Let's build amazing data infrastructure together!

kore-fileformat

Development

Runtime