0.0
No commit activity in last 3 years
No release in over 3 years
Convenient parser that provides a clean way to interact with wordpress's XML dump file in your ruby scripts, to make it easy to migrate away from WordPress anywhere else.
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
 Dependencies

Runtime

 Project Readme

wpxml_parser

WordPress (version 3.x and I think there's a plugin for 2.x) can export xml with all your blog posts, comments, etc. This tool can parse the xml file and give you a convenient access to all your posts, comments, categories, etc in a ruby script that you have to write.

The cool thing is — you don't even need to have access to your own WordPress database, all you need is to download the xml file from your admin panel.

Install

gem install wpxml_parser

Usage

require 'wpxml_parser'
include WpxmlParser

# Blog
blog = Blog.new('dump_of_your_blog.xml')
blog.posts.size # => 51
post = blog.posts.first # => <Post>

# Post
post.title # => "Title of your post"
post.categories # => ["ruby", "rails"]
post.date # => <Time>
post.slug # => "title-of-your-post"
post.body # => "...shitload of text..."
post.post_id # => 27

# Comments
post.comments # => [<Comment>, <Comment>, ...]
comment = post.comments.first

comment.author # => 'dude'
comment.date # => <Time>

# Comments have parent and children
comment.parent_id # => 32
comment.parent # => <Comment>
comment.children # => [<Comment>, <Comment>]

# Conveniences
# You can quickly find a post or a comment by post_id or comment_id respectively
blog.find_post(27) # => <Post>
post.find_comment(32) # => <Comment>

# For more info and more accessible properties check out the source code, it's pretty straightforward.

A note on posts

Currently only published posts are included. Drafts, private, etc are skipped. This was done on purpose, since I didn't really use those wordpress features.

A note on comments

Currently only approved comments are included. This was done on purpose, since I didn't need spam.