No commit activity in last 3 years
No release in over 3 years
CouchFoo provides an ActiveRecord API interface to CouchDB

CouchFoo

Introduction

CouchDB (couchdb.apache.org/) works slightly differently to relational databases. First and foremost, it is a document-oriented database: data is stored in documents, each of which has a unique id that is used to access and modify it. The contents of the documents are schema free and bear no relation to one another (unless you encode that within the documents themselves). So in many ways documents are like records within a relational database, except there are no tables to keep documents of the same type in.

CouchDB interfaces with the external world via a RESTful interface, which allows document creation, updating, deletion etc. The contents of a document are specified in JSON, so it’s possible to serialise objects within the database record efficiently as well as store all the normal types natively.
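As a rough illustration of what "JSON over REST" means here, the sketch below builds (but does not send) the HTTP request that would store one document. The database name, document id and document fields are made up for the example, and only Ruby's standard library is used rather than CouchFoo itself.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical document contents, for illustration only.
address = { "number" => 3, "street" => "My Street", "tags" => ["home", "billing"] }

# Hypothetical database ("mydatabase") and document id ("address-1").
uri = URI("http://localhost:5984/mydatabase/address-1")

# Build the PUT request; the document travels as a JSON body.
request = Net::HTTP::Put.new(uri)
request["Content-Type"] = "application/json"
request.body = JSON.generate(address)

puts "#{request.method} #{request.path}"
puts request.body

# To send it against a running CouchDB instance you could use:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Nested values such as the array above are stored as part of the document, which is what makes serialising whole objects straightforward.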

As a consequence of its free-form structure there is no SQL to query the database. Instead you define (table-oriented) views that emit certain bits of data from each document, and apply conditions, sorting etc. to those views. For example, if you were to emit the colour attribute you could find all documents with a certain colour. This is similar to indexed lookups on a relational table (both in terms of concept and performance).
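The colour example can be sketched in plain Ruby. This is only an in-memory simulation of the idea of a view, not CouchDB's or CouchFoo's API; the documents and the helper names build_colour_view and ids_with_colour are invented for the illustration.

```ruby
# In-memory stand-in for a database: document id => document contents.
docs = {
  "a1" => { "type" => "car",  "colour" => "red" },
  "b2" => { "type" => "bike", "colour" => "blue" },
  "c3" => { "type" => "van",  "colour" => "red" },
  "d4" => { "type" => "note" } # no colour attribute, so it emits nothing
}

# The "map" step: emit a [key, id] row for every document that has a colour.
def build_colour_view(docs)
  rows = []
  docs.each do |id, doc|
    rows << [doc["colour"], id] if doc.key?("colour")
  end
  rows.sort # rows are kept ordered by key, like an index
end

# A lookup only inspects the rows of the view, not the documents themselves.
def ids_with_colour(view, colour)
  view.select { |key, _id| key == colour }.map { |_key, id| id }
end

view = build_colour_view(docs)
p ids_with_colour(view, "red")
```

Documents that lack the emitted attribute simply do not appear in the view, which is how a single schema-free database can hold many kinds of document side by side.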

CouchDB has been designed from the ground up to operate in a distributed way. It provides robust, incremental replication with bi-directional conflict detection and resolution. It’s an excellent choice for unstructured data, large datasets that need to be sharded efficiently, and situations where you wish to run local copies of the database.

Overview

CouchFoo provides an ActiveRecord-style interface to CouchDB. The external API is nearly identical to ActiveRecord, so it should be possible to migrate your applications quite easily. That said, there are a few minor differences owing to the way CouchDB works. In particular:

  • CouchDB is schema free so property definitions for the document are defined in the model (like DataMapper)

  • :select, :joins, :having, :group, :from and :lock are not available on find or associations as they don’t apply (locking is handled as conflict resolution at insertion time)

  • :conditions can only accept a hash and not an array or SQL. For example :conditions => {:user_name => "Georgio_1999"}

  • :offset is less efficient in CouchDB - there’s more on this in the rdoc

  • :order is applied after results are retrieved from the database. Therefore :order cannot be used with :limit without a new option :use_key. This is explained fully in the quick start guide and CouchFoo#find documentation

  • :include isn’t implemented yet but the finders and associations still accept the option so you won’t need to make any code changes

  • By default results are ordered by document key. The key uses a UUID scheme so these don’t auto-increment and are likely to come out in a different order to insertion. default_sort can be used on a model to sort by create date by default and overcome this

  • validates_uniqueness_of has had the :case_sensitive option removed

  • Because there’s no SQL, there are no SQL finder methods

  • Timezones, aggregations and fixtures are not yet implemented

  • The price of index updating is paid when next accessing the index rather than the point of insertion. This can be more efficient or less depending on your application. It may make sense to use an external process to do the updating for you - see CouchFoo#find for more on this

  • On that note, compacting of CouchDB is required to recover space from old versions of documents and keep performance fast. This can be kicked off in several ways (see quick start guide)

It is recommended that you read the quick start and performance sections in the rdoc for a full overview of the differences and points to be aware of when developing. The Changelog file shows the differences between gem versions and should be checked when upgrading.
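The interaction between :order and :limit noted in the list above can be illustrated with a small simulation. It assumes, purely for illustration, that the limit is applied while reading rows in key order and that sorting happens afterwards on whatever was fetched; the Row struct and its values are invented, and no CouchFoo code is involved.

```ruby
# Hypothetical rows as they might come back ordered by document key.
Row = Struct.new(:id, :created_at)

rows_by_key = [
  Row.new("0a9f", 3),
  Row.new("3c21", 1),
  Row.new("7be4", 4),
  Row.new("f001", 2)
]

# The limit is applied while reading the index (key order) ...
limited = rows_by_key.first(2)
# ... and the ordering is applied afterwards, to what was fetched.
limit_then_order = limited.sort_by(&:created_at).map(&:id)

# What was probably intended: the two oldest documents overall.
order_then_limit = rows_by_key.sort_by(&:created_at).first(2).map(&:id)

p limit_then_order
p order_then_limit
```

The two results differ, which is the situation an index keyed on the sort attribute (the :use_key option) is meant to avoid.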

Getting started

To install CouchFoo simply type the following:

sudo gem install couch_foo --source http://gemcutter.org

You will then need to setup the base class connection and logger:

CouchFoo::Base.set_database(:host => "http://localhost:5984", :database => "mydatabase")
CouchFoo::Base.logger = Rails.logger

If using CouchFoo with Rails you will need to create an initializer to do this (until proper integration is added). Also note that the version of the CouchREST gem you require depends on your version of CouchDB: CouchDB 0.9 requires CouchREST greater than 0.2, and CouchDB 0.8 requires CouchREST between 0.16 and 0.2. You will be warned on CouchFoo initialization if this is wrong.

Examples of usage

Basic operations are the same as ActiveRecord:

class Address < CouchFoo::Base
  property :number, Integer
  property :street, String
  property :postcode # Any generic type is fine as long as .to_json and class.from_json(json) can be called on it
end 

address1 = Address.create(:number => 3, :street => "My Street", :postcode => "secret") # Create address
address2 = Address.create(:number => 27, :street => "Another Street", :postcode => "secret")
Address.all # = [address1, address2] or maybe [address2, address1] depending on key generation
Address.first    # = address1 or address2 depending on keys so probably isn't as expected
Address.find_by_street("My Street") # = address1

As key generation is through a UUID scheme, the order can’t be predicted. However you can order the results by default:

class Address < CouchFoo::Base
  property :number, Integer
  property :street, String
  property :postcode # Any generic type is fine as long as .to_json can be called on it
  property :created_at, DateTime

  default_sort :created_at
end

Address.all # = [address1, address2]
Address.first    # = address1 or address2, sorting is applied after results
Address.first(:use_key => :created_at) # = address1 but at the price of creating a new index

Conditions work slightly differently:

Address.find(:all, :conditions => {:street => "My Street"}) # = address1, creates index on :street
Address.find(:all, :conditions => {:created_at => "sometime"}) # Uses same index as :use_key => :created_at
Address.find(:all, :use_key => :street, :startkey => 'p') # All streets from p in alphabet, reuses the index created 2 lines up
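To show what a :startkey lookup on an existing index amounts to, here is a plain-Ruby simulation of a range scan over sorted rows. The index contents and the rows_from helper are hypothetical. Ruby compares strings byte-wise, so the sketch uses an upper-case "P"; CouchDB applies its own collation rules to keys.

```ruby
# Hypothetical index on :street, held as sorted [key, id] rows.
street_index = [
  ["Another Street", "id-2"],
  ["My Street", "id-1"],
  ["Park Lane", "id-3"],
  ["Quay Road", "id-4"]
]

# Return the ids of all rows whose key sorts at or after start_key.
def rows_from(index, start_key)
  # Binary search for the first row at or after start_key.
  first = index.bsearch_index { |key, _id| key >= start_key }
  first.nil? ? [] : index[first..-1].map { |_key, id| id }
end

p rows_from(street_index, "P")
```

Because the rows are already sorted by key, the scan can jump straight to the starting position and read onwards, so no second index is needed for this kind of query.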

As well as providing support for people using relational databases, CouchFoo attempts to provide a library for those wanting to use CouchDB as a document-oriented database:

class Document < CouchFoo::Base
  property :number, Integer
  property :street, String

  view :number_ordered, "function(doc) {emit([doc.number , doc.street], doc); }", nil, :descending => true
end

Document.number_ordered(:limit => 75) # Will get the last 75 documents in the database ordered by number, street attributes

Associations work as expected but you must remember to add the properties required for an association (we’ll make this automatic soon):

class House < CouchFoo::Base
  has_many :windows
end

class Window < CouchFoo::Base
  property :house_id, String
  belongs_to :house
end

Credits

This gem was inspired by some excellent work on the CouchPotato, CouchREST, ActiveCouch and RelaxDB gems. Each offered its own benefits and its own challenges. After hacking with each I couldn’t get a library I was happy with, so I started with ActiveRecord and modified it to work with CouchDB. Some areas required more work than others, but a lot of features came for free once the base level of functionality had been achieved. Credit to DHH, the Rails core guys and the CouchDB gems that inspired this work.

What’s left to do?

Please feel free to fork and hit me with a request to merge back in. At the moment, the following areas need addressing:

  • Proper integration with rails using a couchdb.yml config file

  • Example ruby script for updating indexes as an external process, and the same for database compacting

  • :include for database queries

  • Remove the dependency on CouchRest - keep support for both CouchDB 0.8 and 0.9 for now. Add support for attachments, and log calls to the server so that performance times are known. General tidy up of database.rb. Add support for etags?

  • inline and HABTM inline associations. Also x_id and x_type properties should be automatically defined

  • compatibility with will_paginate and friends

  • cleanup of associations, reflection, validations classes (were modified in a rush)

  • has_many :through

  • Rest of calculations, timezones, aggregations and fixtures

  • Diff with AR to check that everything has been carried over

  • DataMapper interface

  • Tool to migrate DB from MySQL to CouchDB

  • some kind of generic admin interface where you can just browse around the data structure?

  • grep of code and doc for records and columns, should be documents and attributes