Project

uri-meta

No commit activity in last 3 years
No release in over 3 years
Retrieves meta information for a URI from the metauri.com service.
 Dependencies

Development

>= 0.1.5
>= 2.10.2

Runtime

>= 0.5.4
>= 0.6.0
 Project Readme

uri-meta: Get meta information about your URI

uri-meta is a Ruby interface to the metauri.com service.

metauri.com provides two things:

  • follows your URI through any redirects to the end point that serves actual content
  • obtains meta information (title, status, content type, etc.) about that end URI
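
To make the first point concrete, here is a rough local sketch of what "following a URI to its end point" involves. It is not how metauri.com or uri-meta works internally; `resolve_end_uri`, the `fetch` callable and the `short.example` URLs are hypothetical names made up for illustration. The fetcher is injected so the sketch runs without network access.

```ruby
require 'uri'

# Hypothetical helper (not part of uri-meta): follow redirects until a
# non-redirect response is seen. `fetch` is any callable that takes a URI
# string and returns [status, location].
def resolve_end_uri(start, fetch, max_redirects = 5)
  current = start
  hops = 0
  loop do
    status, location = fetch.call(current)
    # Anything other than a 3xx with a Location is treated as the end point.
    unless (300..399).cover?(status) && location
      return { uri: current, status: status, redirects: hops }
    end
    raise ArgumentError, "more than #{max_redirects} redirects" if hops >= max_redirects
    # Location may be relative, so resolve it against the current URI.
    current = URI.join(current, location).to_s
    hops += 1
  end
end

# Canned responses standing in for real HTTP requests (assumed data).
responses = {
  'http://short.example/a'         => [301, 'http://short.example/b'],
  'http://short.example/b'         => [302, '/final.gif'],
  'http://short.example/final.gif' => [200, nil]
}
fake_fetch = ->(uri) { responses.fetch(uri) }

result = resolve_end_uri('http://short.example/a', fake_fetch)
puts result[:uri]
# http://short.example/final.gif
puts result[:redirects]
# 2
```

The `max_redirects` cap plays the same role as the `:max_redirects` option shown in the examples below: it stops a redirect loop from running forever.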

Examples

require 'uri'
require 'uri/meta'
uri = URI.parse('http://www.google.com/')
puts uri.meta.title
# Google
puts uri.meta.status
# 200
puts uri.meta(:headers => 1).headers
# HTTP/1.1 .... etc

uri = URI.parse('http://bit.ly/PBzu')
puts uri.meta.content_type
# image/gif

meta = URI.parse('http://bit.ly/PBzu').meta(:max_redirects => 2)
puts(meta.last_effective_uri) unless meta.errors?
# http://clipart.tiu.edu/offcampus/animated/bd13644_.gif


URI::Meta.multi(['http://www.google.com/', 'http://bit.ly/PBzu'], :max_redirects => 10) do |meta|
  # Don't rely on these being processed in the same order they were listed!
  if meta.redirect?
    puts "## #{meta.uri} -> #{meta.last_effective_uri}"
  else
    puts "## #{meta.uri} did not redirect and its title was #{meta.title}"
  end
end

Caching

uri-meta uses in-memory caching via wycats-moneta by default, so it should be relatively straightforward to swap in whatever other caching mechanism you want, provided it's supported by moneta.

require 'uri'
require 'uri/meta'

# Memcached
require 'moneta/memcache'
URI::Meta::Cache.cache      = Moneta::Memcache.new(:server => 'localhost', :namespace => 'uri_meta')
URI::Meta::Cache.expires_in = (60 * 60 * 24 * 7) # 1 week

# No caching (for testing I guess)
URI::Meta::Cache.cache = nil
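
If it helps to picture what `expires_in` means for a store, the following is a minimal sketch of a cache whose entries lapse after a fixed number of seconds. `ExpiringCache` is a hypothetical class written for this illustration; it is neither moneta's API nor uri-meta's internal cache, and the clock is injected only so that expiry can be shown without waiting.

```ruby
# Hypothetical stand-in for a cache store with time-based expiry.
class ExpiringCache
  def initialize(expires_in, clock = -> { Time.now })
    @expires_in = expires_in   # lifetime of an entry, in seconds
    @clock      = clock        # callable returning the current Time
    @entries    = {}
  end

  # Remember the value together with the moment it stops being valid.
  def store(key, value)
    @entries[key] = [value, @clock.call + @expires_in]
  end

  # Return the cached value, or nil if it is missing or has expired.
  def fetch(key)
    value, expires_at = @entries[key]
    return nil if expires_at.nil?
    if @clock.call >= expires_at
      @entries.delete(key)
      return nil
    end
    value
  end
end

now   = Time.at(0)
cache = ExpiringCache.new(60 * 60 * 24 * 7, -> { now }) # 1 week
cache.store('http://www.google.com/', 'Google')
puts cache.fetch('http://www.google.com/').inspect
# "Google"
now += 60 * 60 * 24 * 8 # pretend 8 days have passed
puts cache.fetch('http://www.google.com/').inspect
# nil
```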

Known Issues

  • Redirects that aren't handled by the web server (e.g. via a 302 response), such as JavaScript or <meta> tag redirects, are not supported yet.
  • Framed redirects, such as StumbleUpon's, are not resolved yet. Because these are technically full pages, it can be difficult to know that the page is not really the end URI.
  • No RDoc as yet.
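
Until <meta> tag redirects are supported, a caller who already has the page body could check for one themselves. The sketch below is only a rough heuristic and not part of uri-meta: `meta_refresh_target` is a hypothetical helper, and the regex assumes the `http-equiv` attribute comes before `content` and that the URL inside `content` is unquoted.

```ruby
# Hypothetical helper: return the target URL of a
# <meta http-equiv="refresh" content="N; url=..."> tag, or nil if none is found.
def meta_refresh_target(html)
  pattern = /<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*content\s*=\s*["']\s*\d+\s*;\s*url\s*=\s*([^"'>\s]+)/i
  match = html.match(pattern)
  match && match[1]
end

page = '<html><head><meta http-equiv="refresh" content="0; url=http://example.com/next"></head></html>'
puts meta_refresh_target(page)
# http://example.com/next
puts meta_refresh_target('<html><head><title>Plain</title></head></html>').inspect
# nil
```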

Copyright

Copyright (c) 2009 Stateless Systems. See LICENSE for details.