A Sitemap generator specifically designed for large sites (although it works equally well with small sites)
 Dependencies

Runtime

>= 2.1.2
>= 0.9.9
 Project Readme

BigSitemap

BigSitemap is a Sitemap generator suitable for applications with more than 50,000 URLs. It splits large Sitemaps into multiple files, gzips the files to minimize bandwidth usage, supports incremental updates, can be set up with just a few lines of code, and is compatible with just about any framework.

BigSitemap is best run periodically through a Rake/Thor task.

require 'big_sitemap'

include Rails.application.routes.url_helpers # Allows access to Rails routes

BigSitemap.generate(:url_options => {:host => 'example.com'}, :document_root => "#{APP_ROOT}/public") do
  # Add a static page
  add '/about'

  # Add some URLs from your Rails application
  Post.find_each do |post|
    add post_path(post)
  end

  # Add some URLs with additional options
  Product.find_each do |product|
    add product_path(product), :change_frequency => 'daily', :priority => 0.5
  end
end

The code above will create a minimum of two files:

  1. public/sitemap_index.xml.gz

  2. public/sitemap.xml.gz

Before version 1.0.0, the files were created in public/sitemaps by default. If you want to keep storing them in the sitemaps directory, you have to specify the :document_path option:

BigSitemap.generate(  :url_options =>   {:host => 'example.com'}, 
                      :document_root => "#{APP_ROOT}/public/", 
                      :document_path => "sitemaps"
                    ) do
  ...
end

If your sitemaps grow beyond 50,000 URLs (this limit can be overridden with the :max_per_sitemap option), the sitemap files will be partitioned into multiple files (sitemap_1.xml.gz, sitemap_2.xml.gz, …).
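
The partitioning rule can be illustrated with a small standalone sketch. It does not call BigSitemap: partition_urls is a hypothetical helper, and the file names simply follow the pattern described above, so the gem's actual naming may differ in detail.

```ruby
# Hypothetical helper (not part of BigSitemap): split a list of URLs into
# groups of at most max_per_sitemap and give each group a file name.
def partition_urls(urls, max_per_sitemap = 50_000)
  slices = urls.each_slice(max_per_sitemap).to_a
  # A single group keeps the plain name used in the first example.
  return { 'sitemap.xml.gz' => slices.first || [] } if slices.size <= 1

  # Several groups get numbered names: sitemap_1.xml.gz, sitemap_2.xml.gz, ...
  slices.each_with_index.map do |slice, index|
    ["sitemap_#{index + 1}.xml.gz", slice]
  end.to_h
end

urls  = (1..120_000).map { |i| "/posts/#{i}" }
files = partition_urls(urls)
files.each { |name, group| puts "#{name}: #{group.size} URLs" }
```

Passing a smaller second argument plays the role of the :max_per_sitemap option in this sketch.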

Framework-specific Classes

Use the framework-specific classes to take advantage of built-in shortcuts.

Rails

BigSitemapRails deals with setting the :document_root and :url_options initialization options.

Merb

BigSitemapMerb deals with setting the :document_root initialization option.

Install

Via gem:

sudo gem install big_sitemap

Advanced

Initialization Options

  • :url_options – hash with :host, optionally :port and :scheme

  • :base_url – string alternative to :url_options, e.g. 'https://example.com:8080/'

  • :url_path – string path_name to sitemaps folder, defaults to :document_path

  • :document_root – string

  • :document_path – string document path for sitemaps, relative to :document_root, defaults to empty string (putting sitemap files in the document root directory)

  • :document_full – string absolute document path to generation folder - defaults to :document_root/:document_path

  • :max_per_sitemap – 50000, which is the limit dictated by Google but can be less

  • :gzip – true

  • :ping_google – true

  • :ping_yahoo – false, needs :yahoo_app_id

  • :ping_bing – false

  • :ping_ask – false

  • :ping_yandex – false

  • :partial_update – false
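
How the path and URL options relate to each other can be sketched in plain Ruby. The resolve_options helper below is hypothetical and written for illustration only. It mirrors the defaults documented in the list above and is not BigSitemap's actual implementation. The 'http' scheme fallback in particular is an assumption.

```ruby
# Hypothetical illustration of the documented defaults; not BigSitemap code.
def resolve_options(options)
  opts = { :document_path => '' }.merge(options)

  # :document_full defaults to :document_root/:document_path
  opts[:document_full] ||= File.join(opts[:document_root], opts[:document_path])
  # :url_path defaults to :document_path
  opts[:url_path] ||= opts[:document_path]

  # :base_url is a string alternative to :url_options
  unless opts[:base_url]
    url    = opts.fetch(:url_options)
    scheme = url[:scheme] || 'http' # assumed fallback, not confirmed by the README
    port   = url[:port] ? ":#{url[:port]}" : ''
    opts[:base_url] = "#{scheme}://#{url[:host]}#{port}"
  end

  opts
end

resolved = resolve_options(
  :url_options   => { :host => 'example.com', :port => 8080, :scheme => 'https' },
  :document_root => '/var/www/public',
  :document_path => 'sitemaps'
)
puts resolved[:document_full]
puts resolved[:base_url]
```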

Change Frequency, Priority and Last Modified

You can control “changefreq”, “priority” and “lastmod” values for each record individually by passing them as optional arguments when adding URLs:

add(product_path(product), {
  :change_frequency => 'daily',
  :priority         => 0.5,
  :last_modified    => product.updated_at
})
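
For orientation, these options correspond to the child elements of a url entry in the Sitemap protocol. The sketch below is standalone: url_entry is a hypothetical helper, not a BigSitemap method, and the exact formatting the gem emits may differ.

```ruby
require 'time' # provides Time#iso8601

# Hypothetical helper: render one <url> entry as defined by the Sitemap protocol.
def url_entry(loc, options = {})
  # The protocol requires these characters to be entity-escaped in URLs.
  escapes = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&apos;' }
  lines = ["  <loc>#{loc.gsub(/[&<>"']/, escapes)}</loc>"]

  if options[:last_modified]
    lines << "  <lastmod>#{options[:last_modified].utc.iso8601}</lastmod>"
  end
  if options[:change_frequency]
    lines << "  <changefreq>#{options[:change_frequency]}</changefreq>"
  end
  lines << "  <priority>#{options[:priority]}</priority>" if options[:priority]

  "<url>\n#{lines.join("\n")}\n</url>"
end

entry = url_entry('https://example.com/products/1',
                  :change_frequency => 'daily',
                  :priority         => 0.5,
                  :last_modified    => Time.utc(2010, 1, 15, 12, 0, 0))
puts entry
```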

Partial Update

If you enable :partial_update, each filename will include the id of its first entry. This makes it possible to update only the last file with new entries, without regenerating the files that already exist. You must pass the entry's id when adding the URL. For example:

BigSitemap.generate(:base_url => 'example.com', :partial_update => true) do
  # get_last_id is assumed to be a helper you define yourself; it returns
  # the highest id already present in the sitemap
  Widget.where("id > ?", get_last_id).find_each do |widget|
    add widget_path(widget), :id => widget.id
  end
end
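
The example relies on a get_last_id helper that this README does not define. One possible approach, sketched here as an assumption rather than as the library's own method, is to persist the highest id written so far in a small state file. The file location and both helper bodies are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical location for the state file; any writable path would do.
def last_id_file
  File.join(Dir.tmpdir, 'big_sitemap_last_id')
end

# Returns the id stored by the previous run, or 0 if there was none.
def get_last_id
  File.exist?(last_id_file) ? File.read(last_id_file).to_i : 0
end

# Records the highest id added during this run.
def store_last_id(id)
  File.write(last_id_file, id.to_s)
end

File.delete(last_id_file) if File.exist?(last_id_file) # start clean for the demo
first_run_id = get_last_id
store_last_id(1234)
puts first_run_id
puts get_last_id
```

In a real task, store_last_id would be called after generation finishes, with the largest id passed to add during that run.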

TODO

Tests for framework-specific components.

Credits

Thanks to Alastair Brunton and Harry Love, whose work provided a starting point for this library.

Thanks also to those who have contributed patches:

  • Mislav Marohnić

  • Jeff Schoolcraft

  • Dalibor Nasevic

  • Tobias Bielohlawek (www.rngtng.com)

Copyright © 2010 Stateless Systems (statelesssystems.com). See LICENSE for details.