Using PHP and Regular Expressions to Tidy Up Variables
I recently had to develop a very simple email manager for a client. It was necessary to extract some text from a database and then to insert that text into an email message, which was then fired off to a mailing list.

The problem I had was that the text contained HTML tags as well as HTML character entities (for example, nbsp;), and I only wanted plain text in the email.

(Please note that for display reasons in this article, I've omitted the leading ampersand from the HTML character.)

I was able to use the PHP strip_tags function to remove the HTML tags (see below), but this still left me with several HTML character entities in the text, because strip_tags only removes tags.
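To make that limitation concrete, here is a minimal sketch. The sample string is made up for illustration (the real text came from the database), and the entity is written in full, with its leading ampersand, so the snippet can be run as is.

```php
<?php
// Hypothetical sample text, not taken from the original project.
$contents = '<p>this&nbsp;week\'s <b>special</b>&nbsp;offer</p>';

// strip_tags() removes the markup but leaves character entities untouched.
$stripped = strip_tags($contents);

echo $stripped, PHP_EOL; // this&nbsp;week's special&nbsp;offer
```

The tags are gone, but each non-breaking-space entity is still there as literal text.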

The use of a regular expression solved the problem.

Here is the bit of code I used to clean up the contents of the variable:

// Get rid of HTML tags
$contents = strip_tags($contents);

// Get rid of non-breaking spaces
// (the entity's leading ampersand is omitted here for display, as noted above)
$pattern = '/nbsp;/';
$replacement = ' ';
$contents = preg_replace($pattern, $replacement, $contents);

When I extracted the piece of text from the database I placed it in a variable called $contents. I then ran the PHP strip_tags function on the variable to get rid of the HTML tags.

Next we have the bit of code that includes the regular expression.

$pattern contains the HTML character entity we want to search for, wrapped in the forward-slash delimiters that preg_replace expects. Here, $pattern contains nbsp;, which is the entity for a non-breaking space. I needed to get rid of this and replace it with a normal space because it looked a bit strange in the email message. For example, I needed to change:

'thisnbsp;week'snbsp;specialnbsp;offernbsp;is...'

to:

'this week's special offer is...'

$replacement contains a blank space, which is what I want to replace nbsp; with.

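The last line in the bit of code is where the regular expression is actually applied. The preg_replace function searches $contents for every match of $pattern, replaces each match with $replacement, and returns the result, which I store back in $contents.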
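For anyone who wants to try this out, the two steps can be wrapped in a small function. This is only a sketch: the function name tidy_for_email and the null fallback are my own additions for illustration, not part of the original email manager. The pattern here is written in full, with the leading ampersand restored.

```php
<?php
// Hypothetical helper; the name tidy_for_email is an assumption for this sketch.
function tidy_for_email(string $contents): string
{
    // Get rid of HTML tags
    $contents = strip_tags($contents);

    // Get rid of non-breaking spaces (full entity, ampersand included)
    $pattern = '/&nbsp;/';
    $replacement = ' ';
    $result = preg_replace($pattern, $replacement, $contents);

    // preg_replace() returns null on failure; fall back to the stripped text
    return $result ?? $contents;
}

echo tidy_for_email("<p>this&nbsp;week's&nbsp;special&nbsp;offer&nbsp;is...</p>"), PHP_EOL;
// this week's special offer is...
```

Because the pattern is a fixed string with no special characters, str_replace('&nbsp;', ' ', $contents) would give the same result here. preg_replace becomes more useful when there are several entities to handle, since it also accepts an array of patterns and an array of replacements.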

About the Author: John Dixon is a web developer and technical author. These days, John spends most of his time developing dynamic database-driven websites using PHP and MySQL.
Go to http://www.computernostalgia.net to view one of John's sites. This site contains articles and photos relating to the history of the computer.
To find out more about John's work, go to http://www.dixondevelopment.co.uk

 