Earlier this month, I wrote an introduction to validation and sanitization in WordPress, which emphasized the importance of making sure that data coming into your site via HTTP requests or into PHP functions is authorized, properly formed, and not malicious.
That article was all about inputs. This one is all about outputs. We are going to discuss the precautions to take when retrieving data from the database, whether you display it or pass it to other classes or functions.
I hope by this point you are learning to trust no one and nothing. The same goes for the database.
Repeat after me: “the database is not a trusted data source.”
The Human Problem
Even if you’ve implemented my advice on input validation and sanitization, all of the data in your database is still not 100 percent safe.
Every database, every application, every website, and every WordPress plugin and theme has a major security flaw…humans.
Humans make mistakes writing and using software. This is known. Whether it’s you, your clients, someone who installs your plugins and themes, or the person who wrote the code sample that gets pasted into functions.php while desperately solving a problem, one of you will inevitably do something wrong and create malformed data in the database or, worse, a security vulnerability.
By implementing the best practices in this and the last article, you can spot those mistakes much more easily. We’re also about to talk about escaping, which generally speaking is the process of preparing content to be printed to the screen, and about the principle of doing it late.
Escape All The Things
Every type of element we output as HTML needs to be escaped properly based on context. For example, we often see something like this to create a form element:
echo '<label for="' . $name . '">' . $label . '</label><input type="text" name="' . $name . '" value="' . $value . '" />';
Besides being almost unreadable, if $name or $value are not safe to use as HTML attributes, we could have broken HTML or worse. Also, we want to make sure that $label is actually safe HTML.
This is especially important if $label is coming from the database or some other untrusted source. Consider the difference between these two bits of code:
echo '<script>window.location = "http://shadyonlinepharmacy.com";</script>';
Or this code:
echo esc_html( '<script>window.location = "http://shadyonlinepharmacy.com";</script>' );
In the first example, if that was the string that came out of your database (and it’s a perfect example of the kind of thing hackers will put in your database), your users will be redirected to http://shadyonlinepharmacy.com. The latter example, however, just looks bad: the markup is printed as harmless text instead of being executed.
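To make "just looks bad" concrete, here is a minimal sketch of what the escaped string contains. So that it runs outside WordPress, it defines a stand-in esc_html() built on PHP's htmlspecialchars(). That stand-in is my assumption and only approximates the real function, which also checks for invalid UTF-8 and runs filters. The $from_database variable is hypothetical.

```php
<?php
// Stand-in so the sketch runs outside WordPress (an approximation, not the
// real implementation). Inside WordPress, esc_html() already exists and this
// block is skipped.
if ( ! function_exists( 'esc_html' ) ) {
    function esc_html( $text ) {
        return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8', false );
    }
}

// Hypothetical value pulled from the database.
$from_database = '<script>window.location = "http://shadyonlinepharmacy.com";</script>';

// The angle brackets and quotes become entities, so the browser shows the
// text instead of running it.
$escaped = esc_html( $from_database );
echo $escaped, PHP_EOL;
// Prints: &lt;script&gt;window.location = &quot;http://shadyonlinepharmacy.com&quot;;&lt;/script&gt;
```

The visitor sees the literal text of the script tag on the page, which is ugly but harmless.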
To refactor our example for an HTML element, we want to escape all HTML attributes using esc_attr() and HTML content using esc_html(). Here is what it looks like:
echo '<label for="' . esc_attr( $name ) . '">' . esc_html( $label ) . '</label><input type="text" name="' . esc_attr( $name ) . '" value="'. esc_attr( $value ) .'" />';
That’s even less readable than our original example. Closing quotes, using a dot, a function, and then another dot, and then re-opening quotes is a mess and can be easy to screw up. If you throw in mixing single and double quotes, it becomes even more complicated.
Instead, sprintf() or printf() is much cleaner. Here is the same thing, refactored to use printf() so I have my markup on one side and my values on the other:
printf( '<label for="%s">%s</label><input type="text" name="%s" value="%s" />', esc_attr( $name ), esc_html( $label ), esc_attr( $name ), esc_attr( $value ) );
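One possible variation, if repeating esc_attr( $name ) bothers you, is to use numbered placeholders so each value is passed once. This is only a sketch: slug_text_field() is a hypothetical helper, and the esc_attr() and esc_html() stand-ins exist only so it runs outside WordPress. They approximate the real functions with htmlspecialchars().

```php
<?php
// Stand-ins so the sketch runs outside WordPress (approximations only).
// Inside WordPress the real functions are already defined and these are skipped.
if ( ! function_exists( 'esc_attr' ) ) {
    function esc_attr( $text ) {
        return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8', false );
    }
}
if ( ! function_exists( 'esc_html' ) ) {
    function esc_html( $text ) {
        return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8', false );
    }
}

// Hypothetical helper: returns the markup for a labelled text input.
function slug_text_field( $name, $label, $value ) {
    // %1$s is used twice, for the "for" and "name" attributes.
    return sprintf(
        '<label for="%1$s">%2$s</label><input type="text" name="%1$s" value="%3$s" />',
        esc_attr( $name ),
        esc_html( $label ),
        esc_attr( $value )
    );
}

// A hostile value that tries to break out of the value attribute.
echo slug_text_field( 'city', 'City', '"><script>alert(1)</script>' ), PHP_EOL;
```

Because the format string is in single quotes, PHP does not try to interpolate $s as a variable. The hostile value ends up inside the value attribute as entities rather than closing it.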
Late Escape All The Things
Escaping untrusted input is important. But, if you’re not careful, you can end up over-escaping, which can cause a lot of issues.
Not escaping is bad. Over-escaping is bad. Following the practice of late-escaping consistently avoids both a lack of escaping and over-escaping.
For example, earlier this year there were several security issues in a lot of popular WordPress plugins related to failing to escape LIKE queries in SQL or URL query strings built using add_query_arg(). The rush to fix these issues and avoid falling victim to similar issues led to a lot of over-escaping, which caused its own issues.
For example, a lot of people escaped every usage of add_query_arg(). This can cause a problem if a URL string that has query arguments and has already been escaped is then passed through add_query_arg() again.
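To illustrate why an already-escaped URL is a poor input for further URL handling, here is a rough sketch in plain PHP. When escaping for display, esc_url() turns each ampersand into the entity &#038;. The slug_demo_escape_for_display() function below is a hypothetical stand-in that mimics only that one step, and the example.com URL is made up. It shows the general failure, not WordPress's exact code path.

```php
<?php
// Hypothetical, heavily simplified stand-in for the display-oriented part of
// esc_url(): it only encodes ampersands. The real function does much more.
function slug_demo_escape_for_display( $url ) {
    return str_replace( '&', '&#038;', $url );
}

$raw     = 'https://example.com/custom-api?action=read&id=5';
$escaped = slug_demo_escape_for_display( $raw );

// Parse the query string of each version, as URL-handling code would.
parse_str( (string) parse_url( $raw, PHP_URL_QUERY ), $raw_args );
parse_str( (string) parse_url( $escaped, PHP_URL_QUERY ), $escaped_args );

var_dump( array_keys( $raw_args ) );     // "action" and "id" both survive
var_dump( array_keys( $escaped_args ) ); // only "action" is left

// The "#" inside the entity is read as the start of a URL fragment, so the
// rest of the query string is swallowed.
var_dump( parse_url( $escaped, PHP_URL_FRAGMENT ) );
```

Once the entity is in the string, anything that treats the value as a URL rather than as HTML text can misread it, which is why the raw URL should be kept around until the moment of output.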
For example, I often write a function like this:
function slug_get_api_url() {
	return rest_url( 'my-namespace' );
}
You might look at that and think, “you must escape all URLs.” This is generally true — but not here because I might use that function later as the base for a query string. And I may or may not pass that result through add_query_arg() based on some conditional.
This is the reason why we practice late escaping. Escaping should only be done when a variable is about to be echoed or printed — no earlier.
Here is an example of preparing URLs to be output for use in JavaScript using wp_localize_script(), where we escape the URLs as late as possible.
// Build the base API URL; returned raw so callers can keep adding query args.
function slug_my_custom_api( $action = false ) {
	$url = home_url( 'custom-api' );
	if ( is_string( $action ) ) {
		$url = add_query_arg( 'action', $action, $url );
	}
	return $url;
}

// Still unescaped: this only adds another query arg to the raw URL.
function slug_submit_action( $id ) {
	return add_query_arg( 'id', (int) $id, slug_my_custom_api( 'submit' ) );
}

add_action( 'wp_enqueue_scripts', function() {
	if ( ! is_singular() ) {
		return;
	}
	$post = get_post();
	wp_enqueue_script( 'slug-script', ... );
	// Escape only here, at the point of output. The handle must match the one enqueued above.
	wp_localize_script( 'slug-script', 'SLUG', array(
		'api'    => esc_url( slug_my_custom_api() ),
		'read'   => esc_url( slug_my_custom_api( 'read' ) ),
		'submit' => esc_url( slug_submit_action( $post->ID ) ),
	) );
} );
Learn This
I hope you have learned the basics from this post about the importance of escaping all of your outputs. The WordPress VIP developer best practices guide is an excellent resource that every WordPress developer should read. The Importance of Escaping All The Things is a must-read.
Now that you have a basic understanding of escaping, and where to find escape functions (read the source, Luke), I hope you will start scrutinizing the code you write, or cut and paste from helpful tutorials, for failures to follow this best practice.