I recently went through some old bookmarks and came across a .NET component, Html Agility Pack, that parses malformed HTML as if it were XML. This HTML parser builds a read/write DOM document and supports XPath and XSLT. Use this assembly if you want to:
- Fix or generate pages
- Build a web scanner. You can easily get a list of, for example, all img tags in the document.
- Build a web scraper.
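To give an idea of the scanner case, here is a minimal sketch that lists the src of every img tag in a deliberately broken HTML string. It assumes the HtmlAgilityPack assembly is referenced in the project; the class name ImgLister and the sample markup are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

class ImgLister
{
    static void Main()
    {
        // Hypothetical sample input, malformed on purpose:
        // unclosed <p>, an unquoted attribute, missing </html>.
        string html = "<html><body><p>Hello<img src=a.png><img src=\"b.jpg\" alt=b></body>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes takes an XPath expression and returns null
        // (not an empty collection) when nothing matches.
        var images = doc.DocumentNode.SelectNodes("//img");
        if (images == null)
        {
            Console.WriteLine("No img tags found.");
            return;
        }

        foreach (HtmlNode img in images)
        {
            // The second argument is the fallback when the attribute is absent.
            Console.WriteLine(img.GetAttributeValue("src", "(no src)"));
        }
    }
}
```

The null check matters: looping directly over the result of SelectNodes would throw on a page without any images.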
The assembly is very easy to use and works great! You can find it
here...