0.0
No commit activity in last 3 years
No release in over 3 years
Arachnidish is a web crawler that relies on Bloom filters to efficiently store visited URLs and on Typhoeus to avoid the overhead of Mechanize when crawling every page on a domain.
0.0
No commit activity in last 3 years
No release in over 3 years
Simple web crawler that crawls a domain and generates a sitemap.
0.0
Repository is gone
No release in over 3 years
Email crawler: crawls the top ten Google search results looking for email addresses and exports them to CSV.
0.0
No commit activity in last 3 years
No release in over 3 years
A library that fetches stock price information and does various things with it.
0.0
Repository is gone
No release in over 3 years
A crawler and parser for the Currículo Lattes.
0.0
No commit activity in last 3 years
No release in over 3 years
Web crawler that generates a sitemap and stores it in a Neo4j database. It also stores broken links and the total number of pages on the site.
0.0
Repository is gone
No release in over 3 years
A crawler for a single-domain web application.
0.08
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Asynchronous web crawler, scraper and file harvester
0.0
No release in over 3 years
Checks how many links are available inside a website.
0.0
No commit activity in last 3 years
No release in over 3 years
Dead simple yet powerful Ruby crawler for easy parallel crawling, with support for anonymity.
0.0
No commit activity in last 3 years
No release in over 3 years
Website crawler harvesting e-mails. Uses Sidekiq and Typhoeus.
0.03
No release in over 3 years
Low commit activity in last 3 years
Post URLs to the Wayback Machine (Internet Archive), using a crawler, from sitemap(s) or a list of URLs.
0.0
No commit activity in last 3 years
No release in over 3 years
rails_angular_seo makes single-page apps (Backbone, Angular, etc.) built on Rails SEO-friendly. It works by injecting a small Rack middleware that renders pages as plain HTML when the requester is one of the most common crawlers/bots (Google, Yahoo, Baidu, and Bing).
0.0
No release in over 3 years
A simple news crawler. You can specify the structure of your XML or RSS feeds.