Friday, January 18, 2008

Parsing HTML code

I recently went through some old bookmarks and came across a .NET component (Html Agility Pack) for parsing malformed HTML as if it were XML. This HTML parser builds a read/write DOM document and supports XPath and XSLT. Use this assembly if you want to:

  1. Fix or generate pages

  2. Build a web scanner. You can easily get a list of, for example, all img tags in the document.

  3. Build a web scraper.
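
To give a feel for the "web scanner" idea, here is a minimal sketch that lists the src of every img tag in a sloppy document. Note that this is not the Html Agility Pack API: it is plain Python using the standard library's html.parser, which also tolerates malformed markup. The names ImgCollector, list_image_sources and the sample string are made up for this illustration.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, so <IMG SRC=...> matches too.
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def list_image_sources(html):
    """Return the src values of all img tags, in document order."""
    collector = ImgCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


# Deliberately malformed input (hypothetical): unclosed <p> and <b>,
# an unquoted attribute value, and an uppercase tag name.
sample = '<html><body><p>Hello <b>world<img src="a.png"><IMG SRC=b.gif></body>'
print(list_image_sources(sample))
```

With Html Agility Pack the same job would typically be a single XPath query against the DOM it builds, something along the lines of //img, but check the library's own documentation for the exact calls.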


The assembly is very easy to use and works great! You can find it here...
