Snipping HTML using PHP

InstilledBee

Guest
So I'm making a website for a friend who wants a custom blog to go along with it. I'm cool with writing the code for the blog and all, but I ran into a little issue with snipping the text (to display it in a sort of home portal).

Basically I need something that can snip the blog text to a certain length, like a summary of the blog post, that would fit on the portal home page. But since posts have HTML enabled and the HTML is stored in the DB, a plain cut may land in the middle of the HTML and the result won't display properly. So the text needs to be split into tokens, and then only a certain number of tokens output.

I kind of get the logic and idea, but I don't know how to put it into code, basically. Er, programmer's block, perhaps? :p Any help would be appreciated. :)
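
The token idea described above could look roughly like this. It is only a sketch: `snipOnTokens()` is a made-up name, not something from the post, and it assumes reasonably well-formed HTML where a literal `<` in the text is written as `&lt;`. It splits the HTML into tag tokens and text tokens with `preg_split()`, copies tags through whole, and counts only the text towards the limit, so a cut never lands inside a tag. It counts bytes (`strlen()`/`substr()`), so multibyte text would need the `mb_*` equivalents, and any tags still open at the cut are left unclosed.

```php
<?php
// Hypothetical helper: cut $html after $limit bytes of visible text
// without ever splitting a tag.
function snipOnTokens(string $html, int $limit): string
{
    // Split into tag tokens and text tokens; the capture group keeps the tags.
    $tokens = preg_split(
        '/(<[^>]*>)/',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $out = '';
    $remaining = $limit;
    foreach ($tokens as $token) {
        if ($token[0] === '<') {
            // Tags are copied whole and do not count towards the limit.
            $out .= $token;
            continue;
        }
        if (strlen($token) >= $remaining) {
            // Last text token: keep only what still fits, then stop.
            $out .= substr($token, 0, $remaining);
            break;
        }
        $out .= $token;
        $remaining -= strlen($token);
    }
    return $out;
}

echo snipOnTokens('<p>Hello <b>big wide</b> world</p>', 9);
```

With the sample string and a limit of 9 this prints `<p>Hello <b>big`: no tag is cut in half, but `<b>` and `<p>` still need closing afterwards.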
 

UndeadDragon

Super Moderator
Reaction score
447
When displaying the text you could use PHP's substring:

PHP:
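<body>
<?php
// Assumes $db is an open mysqli connection; `text` and `table` are placeholder names.
// A query returns a result set, not a string, so fetch the row before cutting it.
$result = mysqli_query($db, "SELECT `text` FROM `table` LIMIT 1");
$row = mysqli_fetch_assoc($result);
$shortText = substr($row['text'], 0, 30);
echo($shortText);
?>
</body>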
 

Ghan

Administrator - Servers are fun
Staff member
Reaction score
888
I know you can use regular expressions to parse out the correct data while making sure that HTML tags are closed properly. Unfortunately I'm not versed enough in regexes to actually help with any code. :p
 
InstilledBee

Guest
Hmm. Yeah I already got that part and am using substr(), but I still need code for making sure the HTML tags are properly parsed and do not get snipped halfway through the tag or something. But thanks! :D

Well, I will try reading a thing or two on regexps. Thanks for the suggestion! :D
 

UndeadDragon

Super Moderator
Reaction score
447
Sorry, I didn't think of that part of it :p

Are there tags inside the actual passage of text, or do they just surround it?
 
InstilledBee

Guest
Hmm. The original plan is that the author can insert his own HTML tags as he writes the blog post. For the post listing, each message is output in a <p> and trimmed if its length is greater than n, like so:

PHP:
	// $posts is assumed to be an array of rows; empty() also covers null, false and 0.
	if(empty($posts)) {echo '<h2>No posts yet.</h2>';}
	else {
		for($i = 0; $i < count($posts); $i++) {
			if(!$summary) {echo '<div class="pbody">';}
			echo '<h1><a href="index.php?page=blog&amp;pid=', $i + 1, '">', $posts[$i]['title'], '</a></h1>';
			// Short posts are printed whole; longer ones are cut at $lim bytes.
			// substr() knows nothing about HTML, so the cut can still land inside a tag.
			if(strlen($posts[$i]['message']) <= $lim) {echo '<p>', $posts[$i]['message'], '</p>';}
			else {echo '<p>', substr($posts[$i]['message'], 0, $lim), '... <a href="index.php?page=blog&amp;pid=', $i + 1, '">(Read more)</a></p>';}
			echo '<p><strong><em>Posted by ', $posts[$i]['author'], ' on ', $posts[$i]['time'], '</em></strong></p>';
			if(!$summary) {echo '</div>';}
		}
	}

(It's not the best code, but it gets the job done. :D Sorry if it looks, er, unorganized :eek:)
 

UndeadDragon

Super Moderator
Reaction score
447
With some playing around, I can work out whether a set of tags was detected, so you can compare the counts before and after the cut. However, I can't work out how to close an already opened tag... yet :p

PHP:
<?php
function numberOfTags( $html ) {
  preg_match_all("/(<([\w]+)[^>]*>)(.*?)(<\/\\2>)/", $html, $matches, PREG_SET_ORDER);
  
  return count($matches);
}

$string = "<b>Testing whether the tags are still here</b>";
$tagsBefore = numberOfTags($string);
$short = substr($string, 0, 10);
$tagsAfter = numberOfTags($short);

echo($string . ": " . $tagsBefore . " sets of tags<br />");
echo($short . ": " . $tagsAfter . " sets of tags<br />");
echo("<br />");

if($tagsBefore == $tagsAfter) echo("Success");
else echo("Tags do not match");
?>

http://labs.omega-designs.com/shortened.php
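
One possible way to close tags that a cut left open, as a sketch rather than a finished solution: scan the tags in order, push each opening tag's name on a stack, pop it when the matching closing tag turns up, and append closing tags for whatever remains. `closeOpenTags()` and the `$void` list are illustrative names of my own; the sketch assumes the cut did not land in the middle of a tag and that the tags are properly nested.

```php
<?php
// Hypothetical helper: append closing tags for whatever a cut left open.
function closeOpenTags(string $html): string
{
    // Elements that never take a closing tag (not an exhaustive list).
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link'];

    // Find every opening or closing tag, in document order.
    // $tag[1] is "/" for a closing tag, $tag[2] is the tag name.
    preg_match_all('#<(/?)([a-z][a-z0-9]*)\b[^>]*>#i', $html, $tags, PREG_SET_ORDER);

    $open = [];
    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if (in_array($name, $void, true)) {
            continue;
        }
        if ($tag[1] === '/') {
            // Closing tag: pop it only if it matches the innermost open tag.
            if (end($open) === $name) {
                array_pop($open);
            }
        } else {
            $open[] = $name;
        }
    }

    // Close the remaining tags, innermost first.
    $closers = '';
    while ($open) {
        $closers .= '</' . array_pop($open) . '>';
    }
    return $html . $closers;
}

echo closeOpenTags('<p>Some <strong>bold</strong> and <em>it');
```

For the sample input this prints `<p>Some <strong>bold</strong> and <em>it</em></p>`. Badly nested input (say `<b><i>x</b>`) is not repaired by this sketch; a real HTML parser handles that kind of thing better.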
 

celerisk

When Zerg floweth, life is good
Reaction score
62
PHP:
# some random "html" content
$html = '<div>Testing... 1 2 3, <strong>BOLD HERE</strong>, <i>italic</i>. Done!<br />
<dl>
<dt>List test:</dt>
<dd>List item 1
<dd>List item 2</dd>
<dd><a href="http://www.thehelper.net/">Oh, a link. Click</a></dd>
<dd>Last item</dd>
</dl>';

# ask someone who knows what he is doing
$document = new DOMDocument();
$document->loadHTML($html);

# if you feel like seeing what it loaded as
# print_r($document->saveHTML());

# "extract" text version
$text = $document->getElementsByTagName('body')->item(0)->nodeValue;

# do whatever
echo "<pre>$text</pre>\n";
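
For the "do whatever" step, a plain-text summary is probably the simplest route, since text with no tags in it cannot be cut mid-tag. A minimal sketch, assuming the text was extracted as above; `summarize()` is a hypothetical helper name, and it counts bytes, so multibyte text would want `mb_substr()` instead:

```php
<?php
// Hypothetical helper: turn extracted plain text into a short, safe summary.
function summarize(string $text, int $limit): string
{
    // Collapse the runs of whitespace left behind by the removed markup.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    if (strlen($text) <= $limit) {
        // Short enough: escape it for output and return it whole.
        return htmlspecialchars($text);
    }
    // Too long: keep the first $limit bytes and mark the cut.
    return htmlspecialchars(substr($text, 0, $limit)) . '...';
}

echo '<p>', summarize("Hello   big\n wide world", 9), '</p>';
```

The trade-off is that all formatting is lost in the summary; the full post page can still show the original HTML.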
 