0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
This crawler will use my personnal scraper named 'RecipeScraper' to dowload recipes data from Marmiton, 750g or cuisineaz
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Development

~> 1.17
~> 10.0
~> 3.0
>= 0

Runtime

 Project Readme

RecipeCrawler

A web crawler to save recipes from marmiton.org, 750g.com or cuisineaz.com into an SQlite3 database.

For the moment, it works only with cuisineaz.com

This Rubygems use my other powerfull recipe_scraper gem to scrape data on these websites.

To experiment with that code, run bin/console for an interactive prompt.

Installation

Add this line to your application's Gemfile:

gem 'recipe_crawler'

And then execute:

$ bundle

Or install it yourself as:

$ gem install recipe_crawler

Usage

Command line

Install this gem and run

$ recipe_crawler -h
Usage: recipe_crawler [options]
    -n, --number x                   Number of recipes to fetch
    -u, --url url                    URL of a recipe to fetch

So you just need to specify the first recipe url and the number of recipes wanted. Simple like this :

$ recipe_crawler -u http://www.cuisineaz.com/recettes/pate-a-pizza-legere-55004.aspx -n 8
[x] Pâte à pizza légère fetched!
[x] Pâtes au jambon cuit fetched!
[x] Oeufs au plat sucrés fetched!
[x] Pizza légère fetched!
[x] Vin cuit fetched!
[x] Tiramisu aux framboises simplissime et rapide fetched!
[x] Risotto aux champignons et au parmesan fetched!
[x] Gnocchi fetched!

API

Install & import the library:

require 'recipe_scraper'

Then you just need to instanciate a RecipeCrawler::Crawler with url of a CuisineAZ's recipe.

url = 'http://www.cuisineaz.com/recettes/pate-a-pizza-legere-55004.aspx'
r = RecipeCrawler::Crawler.new url

Then you just need to run the crawl with a limit number of recipe to fetch. All recipes will be saved in a export.sqlite3 file. You can pass a block to play with RecipeScraper::Recipe objects.

r.crawl!(limit: 10) do |recipe|
    puts recipe.to_hash
    # will return
    # --------------
    # { :cooktime => 7,
    #        :image => "http://images.marmitoncdn.org/recipephotos/multiphoto/7b/7b4e95f5-37e0-4294-bebe-cde86c30817f_normal.jpg",
    #        :ingredients => ["2 beaux avocat", "2 steaks hachés de boeuf", "2 tranches de cheddar", "quelques feuilles de salade", "1/2 oignon rouge", "1 tomate", "graines de sésame", "1 filet d'huile d'olive", "1 pincée de sel", "1 pincée de poivre"],
    #        :preptime => 20,
    #        :steps => ["Laver et couper la tomate en rondelles", "Cuire les steaks à la poêle avec un filet d'huile d'olive", "Saler et poivrer", "Toaster les graines de sésames", "Ouvrir les avocats en 2, retirer le noyau et les éplucher", "Monter les burger en plaçant un demi-avocat face noyau vers le haut, déposer un steak, une tranche de cheddar sur le steak bien chaud pour qu'elle fonde, une rondelle de tomate, une rondelle d'oignon, quelques feuilles de salade et terminer par la seconde moitié d'avocat", "Parsemer quelques graines de sésames."],
    #        :title => "Burger d'avocat",
    #  }
end

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment. You can also run yard to generate the documentation.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/recipe_crawler. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Author

Rousseau Alexandre