0.07
No commit activity in last 3 years
No release in over 3 years
Mechanize wrapper to work via Tor/Privoxy with endpoint switching ability

Tor/privoxy wrapped Mechanize

tor-privoxy is a Ruby Mechanize wrapper for accessing the web through Tor/Privoxy. It supports multiple Privoxy instances, switching between endpoints, and switching the proxy automatically when a request returns an HTTP 4xx error code. It is useful for web robots, scanners, and scrapers that access sites which may ban or block you unexpectedly.

Using

The first step is to install the gem:

gem install tor-privoxy

To use in your application:

require 'tor-privoxy'

To get a Mechanize instance that is routed through Tor and switches to another endpoint when it encounters an HTTP 4xx code:

agent = TorPrivoxy::Agent.new '127.0.0.1', '', {8118 => 9051} do |agent|
  sleep 1
  puts "New IP is #{agent.ip}"
end

Passing a block is optional:

agent = TorPrivoxy::Agent.new '127.0.0.1', '', 8118 => 9051

The first parameter is the proxy host; the second is the Tor control port password:

agent = TorPrivoxy::Agent.new '127.0.0.1', '', 8118 => 9051

The hash has the format proxyport => torcontrolport. Yes, you may provide as many pairs as you want, though I have no idea why I originally designed it this way.
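To make the proxyport => torcontrolport mapping more concrete, here is a minimal sketch of how several such pairs could be cycled through. The `EndpointPool` class and its methods are hypothetical names used only for illustration; they are not part of tor-privoxy, and the gem's internal rotation logic may differ.

```ruby
# Hypothetical helper, not part of tor-privoxy: cycles through a
# proxy_port => control_port hash in insertion order.
class EndpointPool
  Endpoint = Struct.new(:proxy_port, :control_port)

  def initialize(ports)
    raise ArgumentError, 'at least one endpoint is required' if ports.empty?

    @endpoints = ports.map { |proxy, control| Endpoint.new(proxy, control) }
    @index = 0
  end

  # The endpoint currently in use.
  def current
    @endpoints[@index]
  end

  # Advance to the next endpoint, wrapping around after the last one.
  def switch!
    @index = (@index + 1) % @endpoints.size
    current
  end
end

pool = EndpointPool.new({ 8118 => 9051, 8129 => 9052 })
puts pool.current.proxy_port   # 8118
puts pool.switch!.proxy_port   # 8129
puts pool.switch!.proxy_port   # 8118 (wrapped around)
```

With a single pair, as in the examples above, `switch!` would simply return the same endpoint again.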

Then use the agent like any regular Mechanize agent instance:

agent.get "http://example.com"
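The switch-on-4xx behaviour described earlier can be pictured as a retry loop. The following is only a sketch under stated assumptions: `HttpStatusError` and `with_endpoint_switching` are hypothetical names standing in for whatever the HTTP library raises and for the gem's internal logic, which is not documented here.

```ruby
# Hypothetical error type standing in for whatever the HTTP library raises.
class HttpStatusError < StandardError
  attr_reader :code

  def initialize(code)
    @code = code
    super("HTTP #{code}")
  end
end

# Runs the block. On a 4xx error it calls on_switch and retries,
# at most max_switches times. All other errors propagate unchanged.
def with_endpoint_switching(max_switches:, on_switch:)
  switches = 0
  begin
    yield
  rescue HttpStatusError => e
    raise unless (400..499).cover?(e.code) && switches < max_switches

    switches += 1
    on_switch.call
    retry
  end
end

attempts = 0
body = with_endpoint_switching(max_switches: 3, on_switch: -> { puts 'switching endpoint' }) do
  attempts += 1
  raise HttpStatusError.new(403) if attempts < 3 # simulate two blocked requests

  'page body'
end
puts body
```

Capping the number of switches keeps a permanently failing URL (for example a genuine 404) from looping forever.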

Configuration options

Configuration options are passed when creating an agent and consist of:

  • IP/host of the machine where Tor/Privoxy runs
  • password for the Tor control port
  • a hash of Privoxy port => Tor control port
  • an optional block that is called when the agent switches to a new endpoint
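For context on why a control port and password are needed: a common way to ask Tor for a fresh circuit is to send `AUTHENTICATE` followed by `SIGNAL NEWNYM` over the control connection. The sketch below shows that exchange with Ruby's standard `socket` library. The helper names `newnym_commands` and `request_new_identity` are hypothetical, and this README does not confirm that tor-privoxy does exactly this.

```ruby
require 'socket'

# Hypothetical helper: builds the Tor control protocol lines that
# authenticate and then request a new circuit.
# Assumes the password contains no double quotes or backslashes.
def newnym_commands(password)
  [%(AUTHENTICATE "#{password}"), 'SIGNAL NEWNYM', 'QUIT'].map { |cmd| "#{cmd}\r\n" }
end

# Hypothetical helper: sends the commands to a running Tor control port
# and returns one reply line per command.
def request_new_identity(host, control_port, password)
  TCPSocket.open(host, control_port) do |socket|
    newnym_commands(password).map do |cmd|
      socket.write(cmd)
      socket.gets
    end
  end
end

# Printing the commands needs no running Tor instance:
newnym_commands('').each { |cmd| print cmd }

# With Tor listening on the control port, one might call:
#   request_new_identity('127.0.0.1', 9051, '')
```

An empty password, as in the examples above, produces `AUTHENTICATE ""`.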

Author

Created by Phil Pirozhkov

Future

  • No Mechanize dependency, ability to work with any HTTP library
  • Extend configuration options, allowing for fine proxy setting control
  • Better "ban" detection, e.g. Captcha, etc.