Slicing Strings with PHP: Be Mindful of Output that Contains HTML Tags

When experimenting with strings that contain HTML code, be mindful of what you're getting for output, especially if there is something unexpected about the results. That's what I learned the hard way when extracting an open anchor tag from the source code of a web page. The variables used to locate the anchor tag appeared to be working, but for some reason the extracted code wouldn't display on the screen. Let's take a look at where I went wrong.

The Problem

Let's say we're tasked with getting the open anchor tag for the "About Us" link from the code below. Note that there is more code, but we'll keep things simple.

<?php
// Nowdoc syntax keeps the single and double quotes in the HTML intact without escaping.
$code_from_website = <<<'HTML'
...ul id="Menu1"><li><a href="/about/"  onmouseover="ShowMenu('Menu1')" onfocus="ShowMenu('Menu1')" onmouseout="HideMenu()" onblur="HideMenu()">About Us</a></li><li><a href="/about/history.php"  onmouseover="ShowMenu('Menu1')" onfocus="ShowMenu('Menu1')" onmouseout="HideMenu()" onblur="HideMenu()">History</a></li><li><a href=...
HTML;
?>

To extract the anchor tag, we'll need to figure out where the link text (About Us) starts and where the corresponding open anchor tag is. The print statements are there for us to see that we're getting seemingly valid values.

<?php
$position_linkText = strpos($code_from_website, 'About Us');
$position_anchorStart = strrpos( substr($code_from_website, 0, $position_linkText), '<a');
 
print "<div>Link Text Position: $position_linkText</div>";
print "<div>Open Anchor Position: $position_anchorStart</div>";
?>
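One caveat with these lookups: strpos() and strrpos() return false when the search text isn't found, and false quietly behaves like 0 in arithmetic and in substr(). Here is a minimal sketch of a guard. The findAnchorStart() helper and the shortened sample string are made up for illustration and are not part of the original page code.

```php
<?php
// Hypothetical helper: returns the offset of the last "<a" before $linkText,
// or null when either the link text or a preceding anchor tag is missing.
function findAnchorStart(string $html, string $linkText): ?int
{
    $linkTextPos = strpos($html, $linkText);
    if ($linkTextPos === false) {   // strict check: 0 is a valid position
        return null;
    }

    $anchorPos = strrpos(substr($html, 0, $linkTextPos), '<a');
    return $anchorPos === false ? null : $anchorPos;
}

// Shortened, made-up stand-in for the page source.
$sample = '<li><a href="/about/">About Us</a></li>';

var_dump(findAnchorStart($sample, 'About Us'));   // int(4)
var_dump(findAnchorStart($sample, 'Contact Us')); // NULL
```

The strict === comparison matters here because a match at the very start of the string returns 0, which a loose check would treat as "not found."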

With the new variables, we know where the open anchor tag is in the overall string. So let's grab the code and display the result.

<?php
$length_openAnchor = $position_linkText-$position_anchorStart;
$openAnchorCode    = substr($code_from_website, $position_anchorStart, $length_openAnchor);
 
print "<div>Open Anchor Tag Length: $length_openAnchor</div>";
print "<div>Anchor Code: $openAnchorCode</div>";
?>

The first print statement works as expected, but the second one doesn't seem to display anything. So what went wrong? Everything else displays fine…

Well, if we think about it, the answer is obvious. The $openAnchorCode variable contains the HTML code for an open anchor tag. What happens if we display an anchor tag without a text label? The browser treats it as markup, so we get a hidden anchor that is only visible by looking at the source code. With that in mind, we just need to make one small change to our code.

<?php
print "<div>Anchor Code: " . htmlentities($openAnchorCode) . "</div>";
?>
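To see the difference without a browser in the way, the whole exercise can be run from the command line. The sketch below uses a shortened, made-up stand-in for the page source. It also swaps in htmlspecialchars(), which escapes only the characters that are special in HTML; htmlentities(), as used above, works just as well here.

```php
<?php
// Shortened, made-up stand-in for the page source.
$code_from_website = '<li><a href="/about/">About Us</a></li>';

$position_linkText    = strpos($code_from_website, 'About Us');
$position_anchorStart = strrpos(substr($code_from_website, 0, $position_linkText), '<a');
$length_openAnchor    = $position_linkText - $position_anchorStart;
$openAnchorCode       = substr($code_from_website, $position_anchorStart, $length_openAnchor);

// Escaped copy: a browser would show this as text instead of parsing it as a tag.
$escaped = htmlspecialchars($openAnchorCode, ENT_QUOTES, 'UTF-8');

print "Length:  $length_openAnchor\n"; // Length:  18
print "Raw:     $openAnchorCode\n";    // Raw:     <a href="/about/">
print "Escaped: $escaped\n";           // Escaped: &lt;a href=&quot;/about/&quot;&gt;
```

The raw value was never empty. The browser simply consumed it as markup, which is why the length looked right while the code itself seemed to be missing.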

Conclusion

The moral of the story is that we need to be mindful of what we're trying to display. If the output contains HTML tags, or anything else the browser can interpret, the browser will act on it rather than show it, and the output may seem to vanish. We'll end up wasting time trying to figure out where the program went wrong…and most likely looking in the wrong place for the fix.
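One way to make this habit stick while debugging is to route test output through a small helper that always escapes the value and reports its length. The debugLine() function below is a hypothetical sketch, not something from the original code.

```php
<?php
// Hypothetical helper: formats a labeled value for on-screen debugging.
// Escaping makes any markup visible; the length shows whether the value is empty.
function debugLine(string $label, string $value): string
{
    $safe = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    return "<div>$label (" . strlen($value) . " chars): $safe</div>";
}

print debugLine('Anchor Code', '<a href="/about/">');
// <div>Anchor Code (18 chars): &lt;a href=&quot;/about/&quot;&gt;</div>
```

With the length printed next to the escaped value, an empty variable and an invisible tag no longer look the same on screen.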
