Almost the briefest possible wrapper around the NekoHTML parser, providing XPath functionality.

nekohtml

A thin wrapper around NekoHTML as provided by Celerity.

At the moment this gem depends on Celerity to provide the NekoHTML jar. Once I figure out how to make that dependency optional, I’ll bundle the jar here for cases where the celerity gem isn’t present at install time.

Usage

jruby-1.4.0 > require 'nekohtml'
 => true 
jruby-1.4.0 > html= "<html><head><title>Title of Majesty</title></head></html>" 
 => "<html><head><title>Title of Majesty</title></head></html>" 
jruby-1.4.0 > doc= Nekohtml.parse(html)
 => #<Nekohtml::HtmlDocument:0x3f70119f ... >
jruby-1.4.0 > doc.search("//TITLE")
 => #<Nekohtml::HtmlNodeList:0x1a7b5617 ... >
jruby-1.4.0 > _.first.text
 => "Title of Majesty"

Note that the XPath must use all-caps tag names (//TITLE, not //title). This is a limitation of NekoHTML; I may plunder Celerity’s source to see how they/HtmlUnit handle it, but for now that’s what you’ve got.
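
Until that is sorted out, one possible workaround is to upcase the tag names before searching. The following is only a sketch and is not part of this gem: upcase_xpath_tags is a hypothetical helper, and it only handles element names that directly follow / or //. Names inside predicates or after an explicit axis are left untouched and still have to be written in caps by hand.

```ruby
# Hypothetical helper, NOT provided by the nekohtml gem.
# Upcases element names that directly follow "/" or "//" in a simple XPath.
# Attribute tests (@href), function calls (text()) and axis prefixes
# (child::) are left as they are.
def upcase_xpath_tags(xpath)
  xpath.gsub(%r{(?<=/)([A-Za-z][\w-]*)(?![\w-]*(?:\(|::))}) { $1.upcase }
end

puts upcase_xpath_tags("//title")          # prints //TITLE
puts upcase_xpath_tags("//title/text()")   # prints //TITLE/text()

# With the gem loaded, the result could be passed straight to search:
#   doc.search(upcase_xpath_tags("//title"))
```

The helper is plain string rewriting rather than a real XPath parser, so it is best kept to short, simple paths like the one in the session above.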

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, but do not mess with the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a commit by itself so I can ignore it when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 Alex Young. See LICENSE for details.