Ruby bindings for lzws library
See lzws library.
Other bindings: brotli, zstd, bzip2.
Installation
Operating systems: GNU/Linux, FreeBSD, OSX.
Dependencies: lzws version 1.4.0 or later.
gem install ruby-lzws
You can build it from source.
rake gem
gem install pkg/ruby-lzws-*.gem
You can also use the overlay for Gentoo.
Usage
There are two simple APIs: String and File. You can also use the generic streaming API: Stream::Writer and Stream::Reader.
require "lzws"
data = LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT"
puts LZWS::String.decompress(data)
LZWS::File.compress "file.txt", "file.txt.Z"
LZWS::File.decompress "file.txt.Z", "file.txt"
LZWS::Stream::Writer.open("file.txt.Z") { |writer| writer << "TOBEORNOTTOBEORTOBEORNOT" }
puts LZWS::Stream::Reader.open("file.txt.Z") { |reader| reader.read }
writer = LZWS::Stream::Writer.new output_socket
begin
  bytes_written = writer.write_nonblock "TOBEORNOTTOBEORTOBEORNOT"
  # handle "bytes_written"
rescue IO::WaitWritable
  # handle wait
ensure
  writer.close
end
reader = LZWS::Stream::Reader.new input_socket
begin
  puts reader.read_nonblock(512)
rescue IO::WaitReadable
  # handle wait
rescue ::EOFError
  # handle eof
ensure
  reader.close
end
You can create and read tar.Z archives with minitar.
LZWS is compatible with UNIX compress (with default options).
require "lzws"
require "minitar"
LZWS::Stream::Writer.open "file.tar.Z" do |writer|
  Minitar::Writer.open writer do |tar|
    tar.add_file_simple "file", :data => "TOBEORNOTTOBEORTOBEORNOT"
  end
end
LZWS::Stream::Reader.open "file.tar.Z" do |reader|
  Minitar::Reader.open reader do |tar|
    tar.each_entry do |entry|
      puts entry.name
      puts entry.read
    end
  end
end
You can also use Content-Encoding: compress with sinatra:
require "lzws"
require "sinatra"
get "/" do
  headers["Content-Encoding"] = "compress"
  LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT"
end
All functionality (including streaming) can be used inside multiple threads with parallel. This code will put a heavy load on your CPU.
require "lzws"
require "parallel"
Parallel.each large_datas do |large_data|
  LZWS::String.compress large_data
end
Docs
Please review rdoc generated docs.
Options
| Option | Values | Default | Description |
|---|---|---|---|
| source_buffer_length | 0, 2 - inf | 0 (auto) | internal buffer length for source data |
| destination_buffer_length | 0, 2 - inf | 0 (auto) | internal buffer length for destination data |
| gvl | true/false | false | enables global VM lock where possible |
| max_code_bit_length | 9 - 16 | 16 | max code bit length |
| block_mode | true/false | true | enables block mode |
| without_magic_header | true/false | false | disables magic header |
| msb | true/false | false | enables most significant bit mode |
| unaligned_bit_groups | true/false | false | enables unaligned bit groups |
| quiet | true/false | false | disables lzws library logging |
There are internal buffers for compressed and decompressed data.
For example, if you want to use 1 KB as source_buffer_length for a compressor, please use 256 B as destination_buffer_length.
If you want to use 256 B as source_buffer_length for a decompressor, please use 1 KB as destination_buffer_length.
gvl is disabled by default; this mode allows running multiple compressors/decompressors in different threads simultaneously.
Please consider enabling gvl if you don't want to launch processors in separate threads.
If gvl is enabled, Ruby won't waste time on acquiring/releasing the VM lock.
You can also read lzws docs for more info about options.
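The buffer sizing advice above can be sketched as a small helper. `buffer_options` is a hypothetical name that is not part of the bindings, and the 4:1 ratio is only an assumption read off the 1 KB / 256 B example; the lzws docs remain the authority on sizing.

```ruby
# Hypothetical helper (not part of ruby-lzws): derives a destination buffer
# length from a source buffer length. The default ratio of 4 is an assumption
# taken from the 1 KB / 256 B example; the minimum of 2 comes from the
# options table (valid lengths are 0 for auto, or 2 and above).
def buffer_options(role, source_buffer_length, ratio: 4, minimum: 2)
  destination_buffer_length =
    case role
    when :compressor   then source_buffer_length / ratio
    when :decompressor then source_buffer_length * ratio
    else raise ArgumentError, "unknown role: #{role.inspect}"
    end

  {
    :source_buffer_length      => source_buffer_length,
    :destination_buffer_length => [destination_buffer_length, minimum].max
  }
end

# Intended use with the File API described in this document:
#   LZWS::File.compress "file.txt", "file.txt.Z", buffer_options(:compressor, 1024)
#   LZWS::File.decompress "file.txt.Z", "file.txt", buffer_options(:decompressor, 256)
```

Leaving both lengths at 0 (auto) is the simpler choice when there is no specific memory constraint.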
| Option | Related constants |
|---|---|
| max_code_bit_length | LZWS::Option::LOWEST_MAX_CODE_BIT_LENGTH = 9, LZWS::Option::BIGGEST_MAX_CODE_BIT_LENGTH = 16 |
Possible compressor options:
:source_buffer_length
:destination_buffer_length
:gvl
:max_code_bit_length
:block_mode
:without_magic_header
:msb
:unaligned_bit_groups
:quiet
Possible decompressor options:
:source_buffer_length
:destination_buffer_length
:gvl
:without_magic_header
:msb
:unaligned_bit_groups
:quiet
Example:
require "lzws"
data = LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT", :msb => true
puts LZWS::String.decompress(data, :msb => true)
Please read more about compatibility in lzws docs.
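Because the decompressor list is a subset of the compressor list, one options hash can serve both directions if it is filtered first. The helper below is a sketch: `decompressor_options` is a hypothetical name, and how the bindings treat keys outside the decompressor list is not stated here, so filtering is simply the cautious choice.

```ruby
# Hypothetical helper (not part of ruby-lzws): keeps only the keys listed
# under "Possible decompressor options". Hash#slice needs Ruby 2.5 or later.
def decompressor_options(compressor_options)
  compressor_options.slice(
    :source_buffer_length,
    :destination_buffer_length,
    :gvl,
    :without_magic_header,
    :msb,
    :unaligned_bit_groups,
    :quiet
  )
end

compressor_options = { :msb => true, :max_code_bit_length => 12, :block_mode => false }
shared_options     = decompressor_options(compressor_options)

# Intended use with the String API described in this document:
#   data = LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT", compressor_options
#   puts LZWS::String.decompress(data, shared_options)
```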
String
String maintains a destination buffer only, so it accepts only the destination_buffer_length option.
::compress(source, options = {})
::decompress(source, options = {})
source is a source string.
File
File maintains both source and destination buffers, so it accepts both the source_buffer_length and destination_buffer_length options.
::compress(source, destination, options = {})
::decompress(source, destination, options = {})
source and destination are file paths.
Stream::Writer
Its behaviour is similar to builtin Zlib::GzipWriter.
Writer maintains a destination buffer only, so it accepts only the destination_buffer_length option.
::open(file_path, options = {}, :external_encoding => nil, :transcode_options => {}, &block)
Open file path and create stream writer associated with opened file.
Data will be transcoded to :external_encoding using :transcode_options before compressing.
::new(destination_io, options = {}, :external_encoding => nil, :transcode_options => {})
Create stream writer associated with destination io.
Data will be transcoded to :external_encoding using :transcode_options before compressing.
#set_encoding(external_encoding, nil, transcode_options)
Sets new encodings; the nil argument exists only for compatibility with IO.
#io
#to_io
#stat
#external_encoding
#transcode_options
#pos
#tell
See IO docs.
#write(*objects)
#flush
#rewind
#close
#closed?
See Zlib::GzipWriter docs.
#write_nonblock(object, *options)
#flush_nonblock(*options)
#rewind_nonblock(*options)
#close_nonblock(*options)
Special asynchronous methods missing in Zlib::GzipWriter.
rewind needs to close, close needs to write something and flush, and flush needs to write something.
So asynchronous variants of these synchronous methods are possible.
Their behaviour is the same as that of IO#write_nonblock.
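A common way to drive write_nonblock is a retry loop that waits until the underlying IO is writable again. The sketch below relies only on what is stated above (the method behaves like IO#write_nonblock and returns the number of bytes it accepted). `write_fully` is a hypothetical helper, written against any object with that interface so it can be tried with a plain pipe.

```ruby
# Hypothetical helper (not part of ruby-lzws): pushes all of `data` through
# any object whose write_nonblock behaves like IO#write_nonblock.
# `io` is the underlying IO to wait on (for a stream writer, its #io).
def write_fully(writer, io, data)
  remaining = data
  until remaining.empty?
    begin
      bytes_written = writer.write_nonblock(remaining)
      # Drop the part that was accepted and retry with the rest.
      remaining = remaining.byteslice(bytes_written..-1)
    rescue IO::WaitWritable
      # Destination is not ready yet: block until it becomes writable.
      IO.select(nil, [io])
    end
  end
  data.bytesize
end

# Intended use with the stream writer described in this document:
#   writer = LZWS::Stream::Writer.new output_socket
#   write_fully writer, writer.io, "TOBEORNOTTOBEORTOBEORNOT"
#   writer.close
```

The same loop shape should apply to flush_nonblock, rewind_nonblock and close_nonblock, with their return values checked as described in the rdoc.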
#<<(object)
#print(*objects)
#printf(*args)
#putc(object, :encoding => 'ASCII-8BIT')
#puts(*objects)
Typical helpers, see Zlib::GzipWriter docs.
Stream::Reader
Its behaviour is similar to builtin Zlib::GzipReader.
Reader maintains both source and destination buffers, so it accepts both the source_buffer_length and destination_buffer_length options.
::open(file_path, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {}, &block)
Open file path and create stream reader associated with opened file.
Data will be force encoded to :external_encoding and transcoded to :internal_encoding using :transcode_options after decompressing.
::new(source_io, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {})
Create stream reader associated with source io.
Data will be force encoded to :external_encoding and transcoded to :internal_encoding using :transcode_options after decompressing.
#set_encoding(external_encoding, internal_encoding, transcode_options)
Sets new encodings.
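The two encoding steps ("force encoded", then "transcoded") can be illustrated with core String methods. This is only a sketch of the idea, not the reader's implementation: `decode_like_reader` is a hypothetical helper, and it assumes that :transcode_options are options of the kind String#encode accepts.

```ruby
# Hypothetical helper (not part of ruby-lzws): mimics the two documented
# steps using core String methods.
def decode_like_reader(bytes, external_encoding, internal_encoding, transcode_options = {})
  # Step 1, "force encoded": relabel the bytes without changing them.
  text = bytes.dup.force_encoding(external_encoding)
  # Step 2, "transcoded": convert the bytes to the internal encoding.
  text.encode(internal_encoding, **transcode_options)
end

# "caf" followed by byte 0xE9 is "café" in ISO-8859-1 (4 bytes).
raw  = "caf\xE9".b
text = decode_like_reader(raw, Encoding::ISO_8859_1, Encoding::UTF_8, :undef => :replace)

puts text.encoding # the internal encoding
puts text.bytesize # one byte more than raw, since "é" takes 2 bytes in UTF-8
```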
#io
#to_io
#stat
#external_encoding
#internal_encoding
#transcode_options
#pos
#tell
See IO docs.
#read(bytes_to_read = nil, out_buffer = nil)
#eof?
#rewind
#close
#closed?
See Zlib::GzipReader docs.
#readpartial(bytes_to_read = nil, out_buffer = nil)
#read_nonblock(bytes_to_read, out_buffer = nil, *options)
See IO docs.
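Since read_nonblock follows the IO contract, it can be driven by a loop that waits on IO::WaitReadable and stops on EOFError. `each_chunk_nonblock` below is a hypothetical helper written against that contract only; it accepts any reader-like object, so it can be tried with a plain pipe.

```ruby
# Hypothetical helper (not part of ruby-lzws): yields chunks from any object
# whose read_nonblock behaves like IO#read_nonblock, until end of file.
# `io` is the underlying IO to wait on (for a stream reader, its #io).
def each_chunk_nonblock(reader, io, chunk_size = 512)
  loop do
    chunk =
      begin
        reader.read_nonblock(chunk_size)
      rescue IO::WaitReadable
        # Nothing available yet: block until the source is readable, then retry.
        IO.select([io])
        next
      rescue EOFError
        break
      end
    yield chunk
  end
end

# Intended use with the stream reader described in this document:
#   reader = LZWS::Stream::Reader.new input_socket
#   each_chunk_nonblock(reader, reader.io) { |chunk| print chunk }
#   reader.close
```

Yielding chunks instead of concatenating them leaves encoding decisions to the caller.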
#getbyte
#each_byte(&block)
#readbyte
#ungetbyte(byte)
#getc
#readchar
#each_char(&block)
#ungetc(char)
#lineno
#lineno=
#gets(separator = $OUTPUT_RECORD_SEPARATOR, limit = nil)
#readline
#readlines
#each(&block)
#each_line(&block)
#ungetline(line)
Typical helpers, see Zlib::GzipReader docs.
Thread safety
The :gvl option is disabled by default, so you can use the bindings effectively in multiple threads.
Please be careful: the bindings are not thread safe.
You should lock all data shared between threads.
For example, you should not use the same compressor/decompressor inside multiple threads. Please verify that each processor is used by only one thread at a time.
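If a single writer really has to be shared, one way to follow the locking advice is to route every call through a mutex. The sketch below uses a hypothetical helper, `write_from_threads`, and works with any object that responds to write.

```ruby
# Hypothetical helper (not part of ruby-lzws): every write to the shared
# writer goes through one mutex, so only one thread touches it at a time.
# The order in which chunks are written is not deterministic.
def write_from_threads(writer, chunks)
  mutex = Mutex.new
  threads = chunks.map do |chunk|
    Thread.new do
      mutex.synchronize { writer.write(chunk) }
    end
  end
  threads.each(&:join)
  writer
end

# Intended use with the stream writer described in this document:
#   LZWS::Stream::Writer.open("file.txt.Z") do |writer|
#     write_from_threads writer, ["TOBEOR", "NOTTOBEOR", "TOBEORNOT"]
#   end
```

Giving each thread its own compressor or decompressor, as in the parallel example earlier, avoids the lock entirely and is usually the simpler design.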
CI
Please visit scripts/test-images. See the universal test script scripts/ci_test.sh for CI.