Example:
<?php
include("phpHTMLParser.php");
$content = file_get_contents("http://www.onderstekop.nl/");
$parser = new phpHTMLParser($content);
// Only keep track of the tags we are interested in.
$HTMLObject = $parser->parse_tags(array("a", "title"));
$aTags = $HTMLObject->getTagsByName("a");
foreach ($aTags as $a) {
    // Skip anchors that have no link target.
    if ($a->href != "") {
        echo $a->href . "<br/>";
        echo $a->innerHTML . "<br/><br/>";
    }
}
?>
In this example the parser only keeps track of the 'a' and 'title' tags, and only the 'a' tag objects are requested afterwards. Running this code parses the HTML page obtained from http://www.onderstekop.nl/, returns an object containing the parsed tags, and outputs a list of links with their descriptions. This makes dealing with web pages fairly simple, because you can work with a page in an object-oriented way instead of going through it character by character or relying on complicated and error-prone regular expressions.
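For comparison, the same link listing can be sketched with PHP's bundled DOM extension (DOMDocument), which a commenter also suggests further down. This is not part of phpHTMLParser. The helper name extractLinks and the sample HTML string are assumptions made up for this illustration, and textContent returns the link text without markup, which may differ from what innerHTML gives you in the example above.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension, not phpHTMLParser.
// extractLinks() is a hypothetical helper name chosen for this example.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        // getAttribute() returns an empty string when the attribute is missing.
        $href = $a->getAttribute('href');
        if ($href !== '') {
            $links[] = ['href' => $href, 'text' => trim($a->textContent)];
        }
    }
    return $links;
}

// Made-up sample markup so the sketch runs without network access.
$html = '<html><head><title>Demo</title></head><body>'
      . '<a href="/first">First <b>link</b></a><a name="anchor">no href</a>'
      . '<a href="/second">Second</a></body></html>';

foreach (extractLinks($html) as $link) {
    echo $link['href'] . ' : ' . $link['text'] . PHP_EOL;
}
```

To run it against a live page, the sample string could be replaced by the result of file_get_contents(), as in the phpHTMLParser example.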
Some other features
Each tag object returned by a getTagsByName call currently supports href and innerHTML (as shown), as well as id, src and innerTag (which returns all of the tag's attributes as a string).
Another feature, most useful for dumping results and debugging, is the output() function, available on the object returned by parse() or parse_tags() ($HTMLObject in our example). For even more debugging output, you can set $debug=True in the PHP file itself.
Download phpHTMLParser
16 Comments
1
Written by: Dave
2007-12-12 10:38:30
2
Written by: Rajasekaran
2009-10-05 07:11:28
3
Written by: Dell Charger
2009-10-31 13:55:05
Did you write this class?
4
Written by: Victor
2009-11-06 13:12:11
5
Written by: Paco
2009-11-20 14:02:36
The intention may be good, but the result is disastrous.
6
Written by: prashant nalawade
2009-12-15 18:30:20
There for other people asking is it yours???????
7
Written by: Tayfun Demirbilek
2010-01-25 10:41:42
It was very useful for me for parsing some HTML forms in a daily routine task.
8
Written by: Mariuss
2010-02-07 12:51:56
I used it to parse TD tags inside an HTML page.
There are some notice warnings to patch, but it works.
9
Written by: Israr
2010-02-24 10:39:31
10
Written by: Salim
2010-03-23 10:26:27
11
Written by: amph
2010-03-28 19:34:07
$dom = new domDocument;
$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$tables = $dom->getElementsByTagName('table');
$rows = $tables->item(0)->getElementsByTagName('tr');
It's 10 times as easy. Hope it helps someone.
12
Written by: iguess
2010-05-25 22:27:43
Unfortunately it doesn't have getElemById... :)
Can you add that, please?
13
Written by: Gustavo Bellino
2010-05-27 20:09:33
Regards.
14
Written by: StuR
2010-06-03 09:45:18
15
Written by: Leo
2010-06-04 15:55:18
16
Written by: Sumit
2010-06-09 16:45:20