Monday, December 19, 2011

Parsing HTML

Parsing HTML is not easy.

What did I learn today?

Do not try to parse HTML with regular expressions (PCRE, in PHP's case).
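To see why, here is a small sketch of a regex tripping over nested tags. The sample markup and variable names are made up for illustration; they are not from the page I was working on.

```php
<?php
// Hypothetical sample markup with one <div> nested inside another.
$html = '<div>outer <div>inner</div> tail</div>';

// A naive pattern: capture everything between <div> and the next </div>.
preg_match('/<div>(.*?)<\/div>/s', $html, $m);

// The lazy match stops at the FIRST closing tag, which belongs to the
// inner <div>, so the captured content is cut off mid-structure.
echo $m[1], "\n";
```

The pattern has no way of knowing which closing tag belongs to which opening tag, and making it greedy only moves the problem somewhere else. A real parser keeps track of the nesting for you.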


I decided to use PHP, so I started here: PHP Manual DOMDocument

Then I found out about DOMXPath, which runs XPath queries against a DOMDocument.

I should mention that I have little experience with PHP; my background is mostly C++. Because of that, it was very difficult for me to come up with a solution. I was trying to parse a specific web page and pull specific topics out of its source code.
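For reference, the general shape of a DOMDocument plus DOMXPath approach might look something like the sketch below. This is not the exact code I ended up with: the sample HTML, the "topic" class name, and the XPath expression are all assumptions standing in for the real page.

```php
<?php
// Hypothetical markup standing in for the downloaded page source.
$html = <<<HTML
<html><body>
  <h2 class="topic">First topic</h2>
  <p>Some text</p>
  <h2 class="topic">Second topic</h2>
</body></html>
HTML;

// Real-world HTML is rarely valid, so collect libxml warnings
// internally instead of letting them be printed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Query the parsed tree; the class name "topic" is an assumption.
$xpath = new DOMXPath($doc);
$topics = [];
foreach ($xpath->query('//h2[@class="topic"]') as $node) {
    $topics[] = trim($node->textContent);
}

print_r($topics);
```

For a live page, the string passed to loadHTML() would come from wherever the page source was fetched, and the XPath expression would need to match that page's actual markup.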

PHP Tutorial
