Ruby versus PHP or There and Back Again

Well, I imagine that this opinion piece by Derick Silvers will cause some conversations: 7 reasons I switched back to PHP after 2 years on Rails. The gist being that a big bang rewrite of an existing code base is always a risk and that Rails is opti...

Using PHP and Regular Expressions to Tidy Up Variables
I recently had to develop a very simple email manager for a client. It was necessary to extract some text from a database and then to insert that text into an email message, which was then fired off to a mailing list.

The problem I had was that the text contained HTML tags as well as other HTML characters (for example, nbsp;), and I only wanted plain text in the email.

(Please note that for display reasons in this article, I've omitted the leading ampersand from the HTML character.)

I was able to use the PHP strip_tags function to remove the HTML tags (see below), but this still left me with several HTML characters in the text.

The use of a regular expression solved the problem.

Here is the bit of code I used to clean up the contents of the variable:

// Get rid of HTML tags
$contents = strip_tags($contents);

// Get rid of non-breaking spaces
// (as noted above, the leading ampersand is omitted from the pattern here)
$pattern = '/nbsp;/';
$replacement = ' ';
$contents = preg_replace($pattern, $replacement, $contents);

When I extracted the piece of text from the database I placed it in a variable called $contents. I then ran the PHP strip_tags function on the variable to get rid of the HTML tags.
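To see why strip_tags alone is not enough, here is a minimal sketch. The sample string is made up for illustration, since the real database text is not shown, and the entity is written in full with its leading ampersand.

```php
<?php
// Hypothetical sample text standing in for the database content.
$contents = '<p>Special&nbsp;<b>offer</b></p>';

// strip_tags removes the <p> and <b> tags but leaves the entity untouched.
$stripped = strip_tags($contents);

echo $stripped, "\n"; // prints: Special&nbsp;offer
```

The tags are gone, but the entity is still sitting in the text, which is the leftover the regular expression has to deal with.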

Next we have the bit of code that includes the regular expression.

$pattern contains the HTML character we want to search for. Here, $pattern contains nbsp;, which is the HTML character for a non-breaking space. I needed to replace it with a normal space because it looked a bit strange in the email message. For example, wherever nbsp; appeared between words, I needed the text to end up reading:

'this week's special offer is...'

$replacement contains a blank space, which is what I want to replace nbsp; with.

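The last line in the bit of code is the call to preg_replace. It applies the regular expression held in $pattern to $contents, swaps every match for $replacement, and stores the result back in $contents.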
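The same steps can be wrapped up in one function. This is only a sketch under a few assumptions: the function name tidy_for_email is made up, the entities are written in full with their leading ampersands, and the second entity (for the ampersand itself) is an extra not covered in the article, added to show that preg_replace also accepts arrays of patterns and replacements.

```php
<?php
// Hypothetical helper; the article runs these steps inline on $contents.
function tidy_for_email($contents)
{
    // Get rid of HTML tags
    $contents = strip_tags($contents);

    // Entities written in full, ampersand included. Only the non-breaking
    // space comes from the article; the ampersand entity is an assumed extra.
    $patterns = array('/&nbsp;/', '/&amp;/');
    $replacements = array(' ', '&');

    // Each pattern is applied in turn, paired with its replacement.
    return preg_replace($patterns, $replacements, $contents);
}

echo tidy_for_email('<p>Fish&nbsp;&amp;&nbsp;<b>chips</b></p>'), "\n"; // prints: Fish & chips
```

Because the patterns are applied in order, the non-breaking spaces are replaced first and the ampersand entity second. Further entities could be handled by adding more pairs to the two arrays.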

About the Author: John Dixon is a web developer and technical author. These days, John spends most of his time developing dynamic database-driven websites using PHP and MySQL.
One of John's sites contains articles and photos relating to the history of the computer.
