Given a piece of XML:
    <Export>
      <Product>
        <SKU>403276</SKU>
        <ItemName>Trivet</ItemName>
        <CollectionNo>0</CollectionNo>
        <Pages>0</Pages>
      </Product>
    </Export>
One might assume that REXML is the way to parse it, but we all know how slow it is.
Enter _why’s HTML parser, Hpricot. It’s written in C, and since XHTML is just XML underneath, there’s no reason an HTML parser shouldn’t be able to parse my file.
Turns out it can. It’s really fast, and the code is dead simple.
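    require "hpricot"

    FIELDS = %w[SKU ItemName CollectionNo Pages]

    doc = Hpricot.parse(File.read("my.xml"))

    # One Product record per product element in the export
    (doc/:product).each do |xml_product|
      product = Product.new
      # Copy the contents of each field's element onto the record
      FIELDS.each do |field|
        product[field] = (xml_product/field.intern).first.innerHTML
      end
      product.save
    end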
Update: I slightly refactored the code above. Chris figured out last night that you can use innerHTML, which eliminated the only ugly part of the code.
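For comparison, here is a rough sketch of what the same loop might look like with REXML, the parser I was trying to avoid. This is an illustration, not code I benchmarked: `parse_products` is a made-up helper name, and a plain Hash stands in for the Product model so the snippet runs on its own.

```ruby
require "rexml/document"

# Hypothetical helper: returns one Hash per <Product> element.
# The Hash is a stand-in for the Product model used above.
def parse_products(xml, fields = %w[SKU ItemName CollectionNo Pages])
  doc = REXML::Document.new(xml)
  products = []
  doc.elements.each("Export/Product") do |xml_product|
    product = {}
    fields.each do |field|
      # Look up the child element by name and take its text
      product[field] = xml_product.elements[field].text
    end
    products << product
  end
  products
end

sample_xml = <<~DOC
  <Export>
    <Product>
      <SKU>403276</SKU>
      <ItemName>Trivet</ItemName>
      <CollectionNo>0</CollectionNo>
      <Pages>0</Pages>
    </Product>
  </Export>
DOC

puts parse_products(sample_xml).first["ItemName"] # prints Trivet
```

The shape is about the same as the Hpricot version, so switching parsers is mostly a question of speed rather than of how much code you have to write.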