Adding and Filtering Raw HTML in WordPress Posts

I write a lot of tutorials on this site that cover topics ranging from WordPress and PHP to XHTML and CSS. In these articles I often include sample code to show how to implement a technique on your own site. Until now I had been manually converting certain characters in the code examples to their character entities so they would display properly on screen. Primarily it’s been the left and right angle brackets (< and >) that needed converting, since these are the HTML delimiters and will cause display issues if left as-is.

A Brief Background

HTML is a structured markup language where the text is delimited by tags in angle brackets. These tags (among other things) instruct your browser how to structure/display the information between them. Common tags are the opening and closing paragraph tags (<p>, </p>); this informs your browser that the content in between these tags is a paragraph in the overall structure of the document (web page).

But what happens when you’re trying to show the actual tag to the viewer?

You can’t just type the tag out, because your browser will treat it as actual markup and not display it. What we need to do is instruct the browser to display the actual angle brackets. We do this by using the angle bracket’s character entity, the code that represents the actual character. This code can take the form of either an entity name or an entity number. An entity name is an easy-to-remember name for an HTML character (ex: &quot; displays a straight quotation mark "). An entity number is the numeric code for an HTML character (ex: &#34; displays a straight quotation mark ").

To display the left-angle bracket we could use either &lt; or &#60;. I prefer to use entity numbers over names because not all browsers support all names, whereas support for numbers is generally strong.
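As a small illustration of the conversion described above, here is a minimal PHP sketch that swaps the two angle brackets for their entity numbers. The function name mish_encode_angle_brackets is made up for this example. PHP's built-in htmlspecialchars() does a similar job, but it produces entity names such as &lt; and also converts ampersands and quotation marks.

```php
<?php
// Hypothetical helper for illustration: encode only the angle brackets
// as numeric entities and leave every other character untouched.
function mish_encode_angle_brackets($text) {
    return str_replace(array('<', '>'), array('&#60;', '&#62;'), $text);
}

echo mish_encode_angle_brackets('<p>Hello</p>');
// prints: &#60;p&#62;Hello&#60;/p&#62;
```

A browser receiving that output shows the tag itself, <p>Hello</p>, instead of treating it as a paragraph.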

Adding HTML Code To Your WordPress Post

Now that we’ve covered how to display HTML code on-screen, let’s discuss how we can implement this in a WordPress post. On the surface it’s easy enough: just convert the left and right angle brackets to their character entities. This is fine if you only add a few lines of code, but it’s pretty time-consuming if you have a lot of code to display. There are a few plugins you can install that could work, but most of them require additional markup in your post. What I was looking for was a way to just type out a post as normal and add HTML code without having to do anything special.

What I decided upon was creating a custom WordPress filter that would look for certain sections of the post and auto-magically convert HTML characters to their numeric entities. More specifically I wanted it to only convert angle brackets.

By employing a custom WordPress filter, I could just type my sample code as normal, wrap it in <pre> tags, and be done with it. But that meant coming up with a custom filter.

WordPress Filters

From the Codex:

Filters are functions that WordPress passes data through, at certain points in execution, just before taking some action with the data (such as adding it to the database or sending it to the browser screen).

What we’re going to do is create a custom filter that filters the Post content before WordPress displays it on the viewer’s screen.

I decided the best way to do this was to have the filter look for anything between <pre> tags and convert the code within them before displaying the post on-screen. Why <pre> tags? Adding code between <pre> tags is a semantically-correct way to designate programming code within your markup, so it made sense to use this as the filter trigger.

Using <code> tags as the filter trigger would work too, since they are another semantically correct way to mark up code and function very similarly to <pre> tags. I was mainly concentrating on large blocks of code when this filter was written though, and <code> tags are more commonly used for in-line code examples.
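For illustration only, a filter that treats both tags as triggers might look like the sketch below. It is not the filter this article builds: the function name mish_pre_or_code_filter, the regular expression, and the use of PHP's preg_replace_callback() are all assumptions made for the example.

```php
<?php
// Hypothetical variant, not the article's filter: treat both <pre> and
// <code> as triggers and encode the angle brackets found between them.
function mish_pre_or_code_filter($content_text) {
    return preg_replace_callback(
        // \2 makes the closing tag match whichever tag name was opened.
        '!(<(pre|code)\b[^>]*>)(.*?)(</\2>)!is',
        function ($m) {
            // $m[1] = opening tag, $m[3] = contents, $m[4] = closing tag.
            $encoded = str_replace(array('<', '>'), array('&#60;', '&#62;'), $m[3]);
            return $m[1] . $encoded . $m[4];
        },
        $content_text
    );
}
```

One consequence of this design: if a <code> element is nested inside a <pre> element, the outer match wins and the inner <code> tags are encoded too, so they would be displayed as text.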

The Custom WordPress HTML Post Filter Explained

First, the code:

<?php
function mish_code_filter($content_text) {
    $content_text = preg_replace('!(<pre.*?>)(.*?)</pre>!ise', " '$1' .  stripslashes( str_replace(array('<','>'),array('&#60;','&#62;'),'$2') )  . '</pre>' ", $content_text);
    return $content_text;
    }

add_filter('the_content','mish_code_filter', 1, 1);
?>

Now let’s break the code down. First we’re creating a new function named “mish_code_filter”:

function mish_code_filter($content_text) {
    // function actions
    }

Note: When writing your own custom functions for WordPress it’s a good idea to use a unique naming convention to help prevent conflicts with WordPress core functions or with functions written by plugin authors. I prefix all my custom functions with “mish_”.

Next we’re defining what the function does. In this case we’re using the core PHP function, preg_replace(), and telling it to look for the text between <pre> tags…

$content_text = preg_replace('!(<pre.*?>)(.*?)</pre>!ise',

…and replace it with encoded text:

" '$1' .  stripslashes( str_replace(array('<','>'),array('&#60;','&#62;'),'$2') )  . '</pre>' ",

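The “e” modifier at the end of the pattern tells preg_replace() to evaluate this second part as PHP code for every match (the “i” and “s” modifiers make the match case-insensitive and let the dot match line breaks). Note that the “e” modifier was deprecated in PHP 5.5 and removed in PHP 7, so the filter as written here only runs on older PHP versions. The evaluated code uses 2 more core PHP functions: stripslashes() and str_replace().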

stripslashes() removes the backslashes (\) that preg_replace() itself adds when the “e” modifier is in use: in that mode PHP escapes quotation marks and backslashes in the matched text before evaluating the replacement. Without stripslashes(), double quotation marks in your sample code would be displayed with a stray backslash in front of them.

str_replace() looks for the angle brackets and replaces them with their character entities.

The $1 and $2 are the values of the match, where $1 = (<pre.*?>) and $2 = (.*?).

The final piece of the function returns the value (the result) of the function so we can use it:

return $content_text;

Now we have to hook the filter into our WordPress install so it will run at a specific time. In this instance, we’re hooking the filter onto the “the_content” filter hook:

add_filter('the_content','mish_code_filter', 1, 1);

The “the_content” filter hook is used to filter the content of the post after it is retrieved from the database and before it is printed to the screen. So in this instance we’re telling WordPress to associate our custom filter (mish_code_filter) with the “the_content” filter hook so that our filter will kick in before WordPress displays the post on the screen.

The first “1” in the code is the priority. Filters with lower numbers run earlier (the default is 10), so a priority of 1 makes this filter run ahead of most other filters attached to “the_content”. The second “1” tells WordPress that our custom filter accepts only 1 argument.
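For readers on PHP 7 or later, where the “e” modifier no longer exists, the same idea can be sketched with PHP's preg_replace_callback(). This is an adaptation rather than the filter as originally written: the function name mish_code_filter_callback is made up for the example, and the function_exists() guard is only there so the function can be tried outside WordPress.

```php
<?php
// Sketch of the same filter without the "e" modifier (assumes PHP 7+).
function mish_code_filter_callback($content_text) {
    return preg_replace_callback(
        '!(<pre.*?>)(.*?)</pre>!is',
        function ($matches) {
            // $matches[1] is the opening <pre ...> tag,
            // $matches[2] is everything between the tags.
            $encoded = str_replace(array('<', '>'), array('&#60;', '&#62;'), $matches[2]);
            return $matches[1] . $encoded . '</pre>';
        },
        $content_text
    );
}

// Register the filter only when WordPress is loaded, so the function
// can also be tested on its own from the command line.
if (function_exists('add_filter')) {
    add_filter('the_content', 'mish_code_filter_callback', 1, 1);
}
```

Because the callback receives the matched text unmodified, no call to stripslashes() is needed in this version.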

Installing The Custom WordPress HTML Post Filter

Open your “functions.php” file (located in your theme directory), or create one if your theme doesn’t have one, and add the following:

function mish_code_filter($content_text) {
    $content_text = preg_replace('!(<pre.*?>)(.*?)</pre>!ise', " '$1' .  stripslashes( str_replace(array('<','>'),array('&#60;','&#62;'),'$2') )  . '</pre>' ", $content_text);
    return $content_text;
    }

add_filter('the_content','mish_code_filter', 1, 1);

Then, just save your “functions.php” file and upload it back to your theme directory. Or if you’re using the built-in Editor in your WordPress Admin Section (Appearance → Editor → Theme Functions) add the function, then click the “Update File” button.

And that’s it!

Notes and Resources

  • One caveat: If you use <pre> tags in your sample code, the filter will stop at the first closing </pre> tag it finds and end the match early. To fix this, simply encode the closing tag in your sample like so: &#60;/pre&#62;.
  • W3 Schools provides a comprehensive list of HTML Character Entity Codes here.
  • For more on WordPress filters see this page in the Codex.
