Ruby versus PHP or There and Back Again

Well, I imagine that this opinion piece by Derek Sivers will cause some conversations: "7 reasons I switched back to PHP after 2 years on Rails". The gist is that a big-bang rewrite of an existing code base is always a risk and that Rails is opti...

Using PHP and Regular Expressions to Tidy Up Variables
I recently had to develop a very simple email manager for a client. It was necessary to extract some text from a database and then to insert that text into an email message, which was then fired off to a mailing list.

The problem I had was that the text contained HTML tags as well as HTML character entities (for example, nbsp;), and I only wanted plain text in the email.

(Please note that for display reasons in this article, I've omitted the leading ampersand from the HTML entity.)

I was able to use the PHP strip_tags function to remove the HTML tags (see below), but this still left me with several HTML entities in the text.

The use of a regular expression solved the problem.

Here is the bit of code I used to clean up the contents of the variable:

// Get rid of HTML tags
$contents = strip_tags($contents);

// Get rid of non-breaking spaces
// (in real code the pattern begins with an ampersand; see the note above)
$pattern = '/nbsp;/';
$replacement = ' ';
$contents = preg_replace($pattern, $replacement, $contents);

When I extracted the piece of text from the database I placed it in a variable called $contents. I then ran the PHP strip_tags function on the variable to get rid of the HTML tags.
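To illustrate why strip_tags alone was not enough, here is a minimal sketch. The sample string is made up for illustration and stands in for the database text. The entity is written in full here, including the leading ampersand that is omitted elsewhere in this article.

```php
<?php
// Hypothetical sample text standing in for the content read from the database.
$contents = '<p>This&nbsp;week\'s <b>special</b>&nbsp;offer</p>';

// strip_tags removes the <p> and <b> tags but leaves character entities alone.
$stripped = strip_tags($contents);

echo $stripped, PHP_EOL; // prints: This&nbsp;week's special&nbsp;offer
```

The tags are gone, but the non-breaking-space entities are still in the text, which is the part the regular expression deals with.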

Next we have the bit of code that includes the regular expression.

$pattern contains the regular expression that matches the HTML entity we want to search for, enclosed in forward-slash delimiters. Here, $pattern contains nbsp;, which is the HTML entity for a non-breaking space. I needed to replace it with a normal space because the raw entity looked a bit strange in the email message. For example, I needed to change:

'thisnbsp;week'snbsp;specialnbsp;offernbsp;is...'

to:

'this week's special offer is...'

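$replacement contains a single space, which is what I want to replace nbsp; with.

The last line in the bit of code calls the PHP preg_replace function, which applies the regular expression: it replaces every match of $pattern in $contents with $replacement and stores the result back in $contents.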
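Putting the two steps together, the clean-up could be wrapped in a small function along these lines. This is only a sketch: the function name tidy_for_email and the sample input are assumptions for illustration and are not part of the original code. The pattern is written with its leading ampersand.

```php
<?php
/**
 * Hypothetical helper (the name is an assumption, not from the article):
 * removes HTML tags and turns non-breaking-space entities into plain spaces.
 */
function tidy_for_email(string $contents): string
{
    // Get rid of HTML tags
    $contents = strip_tags($contents);

    // Get rid of non-breaking spaces
    $pattern = '/&nbsp;/';
    $replacement = ' ';
    $result = preg_replace($pattern, $replacement, $contents);

    // preg_replace returns null on error; fall back to the tag-stripped text.
    return $result ?? $contents;
}

// Usage with a made-up sample string:
echo tidy_for_email('<p>this&nbsp;week\'s&nbsp;special&nbsp;offer&nbsp;is...</p>'), PHP_EOL;
// prints: this week's special offer is...
```

If the text contains other entities as well, PHP's built-in html_entity_decode function may be worth a look. Be aware that it decodes the non-breaking-space entity to a non-breaking-space character rather than an ordinary space, so a replacement step like the one above could still be needed.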

About the Author: John Dixon is a web developer and technical author. These days, John spends most of his time developing dynamic database-driven websites using PHP and MySQL.
Go to http://www.computernostalgia.net to view one of John's sites. This site contains articles and photos relating to the history of the computer.
To find out more about John's work, go to http://www.dixondevelopment.co.uk

 