Scraper scripts often need to extract all of the links on a given page. This can be done in a number of ways, such as regular expressions or DOMDocument. Here is a simple code snippet that does it with DOMDocument.
/*
Function to get all links on a given URL using DOMDocument.
Returns an array keyed by href, with the link text as the value.
*/
function get_links($link)
{
    //return array
    $ret = array();

    /*** a new dom object ***/
    $dom = new DOMDocument;

    /*** remove silly white space (must be set before loading) ***/
    $dom->preserveWhiteSpace = false;

    /*** get the HTML (suppress parse warnings from real-world markup) ***/
    libxml_use_internal_errors(true);
    $dom->loadHTML(file_get_contents($link));
    libxml_clear_errors();

    /*** get the links from the HTML ***/
    $links = $dom->getElementsByTagName('a');

    /*** loop over the links ***/
    foreach ($links as $tag)
    {
        // textContent is safe even when the anchor has no text child node
        $ret[$tag->getAttribute('href')] = trim($tag->textContent);
    }

    return $ret;
}
//Link to open and search for links
$link = "http://www.php.net";

/*** get the links ***/
$urls = get_links($link);

/*** check for results ***/
if (count($urls) > 0)
{
    foreach ($urls as $key => $value)
    {
        echo $key . ' - ' . $value . '<br />';
    }
}
else
{
    echo "No links found at $link";
}
Hello, I am facing a problem extracting links from a web site. I understand how to pull links out of the HTML, but I do not understand how to pull links that are loaded dynamically. Please tell me how to pull the links on a page that redirect through Google advertising.