Project

kitcrawler

0.0

No commit activity in last 3 years

No release in over 3 years

parttimenerd/kitcrawler

Crawl lecture websites and fetch the PDFs automatically. It currently supports studium.kit.edu and other HTTP password-protected sites.
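
To illustrate what fetching from an HTTP password-protected site can involve, here is a minimal sketch using Ruby's standard library and HTTP Basic authentication. It is not kitcrawler's actual code: the README lists curl as a requirement, so the gem likely works differently. The helper names `authenticated_get` and `fetch_body`, and the example URL and credentials, are hypothetical.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of kitcrawler): build a GET request that
# carries HTTP Basic credentials in its Authorization header.
def authenticated_get(url, user, password)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request.basic_auth(user, password)
  request
end

# Hypothetical helper: send the authenticated request and return the
# response body (for a PDF, the raw bytes of the file).
def fetch_body(url, user, password)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(authenticated_get(url, user, password)).body
  end
end
```

Calling `fetch_body("https://example.org/protected/slides.pdf", "student", "secret")` would perform the download. Sites that use a login form instead of Basic authentication need a different approach.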

Popularity

3,549

0

0

3

Releases

Current version

0.1.0

1

2014-07-12

2014-07-12

Issues

1

0

1

Issue Closure Rate

0%

Development

Primary Language

Ruby

Licenses

GPL v3

Average date of last 50 commits

2014-07-12

Reverse Dependencies

0

Dependencies

Runtime

damerau-levenshtein

>= 1.0.0

>= 1.6.0

>= 1.6.1

thor

>= 0.19.0

Project Readme

KITCrawler

Fetch lecture PDFs with ease.

It currently supports crawling PDFs for lectures from the studium.kit.edu page, but can be easily extended to fetch PDFs from other services.
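
As a rough illustration of the core step of such a crawler, the sketch below collects the PDF links found in a page's HTML. It is not taken from kitcrawler's source; the function name `pdf_links` and the example URLs are hypothetical, and a regular expression stands in for a real HTML parser to keep the example dependency-free.

```ruby
require "uri"

# Hypothetical helper (not part of kitcrawler): return the absolute URLs of
# all links in `html` whose path ends in ".pdf" (case-insensitive).
def pdf_links(html, base_url)
  hrefs = html.scan(/href\s*=\s*["']([^"']+)["']/i).flatten

  absolute = hrefs.map do |href|
    begin
      URI.join(base_url, href) # resolve relative links against the page URL
    rescue URI::Error
      nil                      # skip hrefs that are not valid URIs
    end
  end

  absolute.compact
          .select { |uri| uri.path.to_s.downcase.end_with?(".pdf") }
          .map(&:to_s)
          .uniq
end

html = '<a href="slides/01.pdf">1</a> <a href="/files/ex.PDF">2</a> <a href="notes.html">3</a>'
puts pdf_links(html, "https://example.org/lecture/index.html")
```

A crawler built on this would download each returned URL and could follow the remaining HTML links to repeat the step on further pages.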

Requirements

Ruby (>= 1.9; 1.8 might also work)
Bundler (or install the required gems listed in the Gemfile manually)
Linux with curl (might also work on other Unixes)

Install

Simply run

	gem install kitcrawler

to install the gem (the released gem is often a bit behind the repository).

Or run it from source.

	git clone https://github.com/parttimenerd/KITCrawler
	cd KITCrawler
	bundle install

Usage

Run

	kitcrawler add NAME

to add a new fetch job named NAME. This prompts you for the entry URL of the site and other settings.

Finally, to run your jobs, use

	kitcrawler fetch

It also supports some command-line parameters; run kitcrawler to see an explanation.

License

The code is licensed under the GNU GPL v3.