Test-driven development (TDD) is a philosophy of software development based on writing tests before writing a feature or bug fix. That is a big shift in how you think about development. I found moving to TDD difficult, both because it’s a different mental model and because it’s hard to change how you approach an existing code base.
Before TDD, I wrote code that should work, tested it manually in my browser to see why it didn’t work, fixed it, and then wrote tests. With TDD, I write function declarations and tests that describe how the code should work and identify what criteria would make me believe the code works. Only then do I make it work. The definition of “works” is no longer ambiguous; I have a standard to judge myself on that can be enforced programmatically. By writing out my test criteria in code, the person doing the code review can assess whether the testing criteria are correct. That last part avoids the ambiguous “works for me” exchange between team members and the follow-up to figure out what is different: test environment or testing criteria. With manual QA, unless you document every testing step, you just don’t have that.
In this article, I’m going to walk through examples, from my plugin Caldera Forms, of using phpunit to write TDD pull requests.
This is not an article on how to do testing in WordPress development. I have written about PHP unit testing, PHP integration testing, JavaScript unit testing, and JavaScript integration testing for Torque before. This post is based on my experience adopting TDD for our plugin Caldera Forms. I prefer TDD and think it is faster, especially in JavaScript development, than manually testing in the browser. When done right, it makes code more maintainable.
A Quick Introduction To TDD
If you’re new to TDD, you might find it easier to think of it as test-first. Here are the steps in an easy-to-follow list:
- Write just enough of the functions you need to be able to describe them in tests.
- Write failing tests.
- Make tests pass.
- Code review.
When using TDD, you have to make and push commits that do not work. Pushing to master or your develop branch before the tests pass would mean every change breaks the main branch, which is a problem. That’s why git flow makes sense. You open a new branch, make incremental changes, and only merge the changes when they pass automated checks such as tests, lints, code quality scanning, and a human code review.
Let’s start with a contrived example, to keep things simple, and then look at a real-world example.
Let’s say your requirement was to create an object for adding two numbers. First, I would write a function with no body:
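A sketch of what that could look like, assuming an ES module:

```js
// add.js
// Just enough to describe the API: two numbers in, their sum out. No body yet.
export const add = (a, b) => {};
```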
Then I would add two tests asserting that the rules of mathematics were being followed:
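Using Jest, the test runner I use for JavaScript, those tests might look something like this:

```js
// add.test.js
import { add } from './add';

describe('add', () => {
	it('adds two numbers', () => {
		expect(add(2, 2)).toBe(4);
	});
	it('is commutative', () => {
		expect(add(2, 3)).toBe(5);
		expect(add(3, 2)).toBe(5);
	});
});
```

Both tests fail at this point, because add() has no body and returns undefined.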
Then I would change my original function — in a separate commit — to actually work:
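The implementation, in this case, is a one-liner:

```js
// add.js
export const add = (a, b) => a + b;
```

Now both tests pass.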
The failing tests get committed before the code is made to work for two reasons. First, commits should do one thing only when possible. Second, it’s possible that after you make the tests pass, you or the person reviewing the change may decide the tests were correct but the implementation was not. If that happens, you can revert the commit with the implementation, without losing the tests or doing fancier git surgery, and start over.
TDD Pays Off Later
Right now this code is just testing that JavaScript can do math. But having a well-tested function in place allows you to safely iterate on it. Suppose you needed to add a third argument to optionally round the result to a specified number of decimals. First, change the function signature:
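Something like this; the new argument simply goes unused for now:

```js
// add.js
// roundTo is optional: existing two-argument callers are unaffected.
export const add = (a, b, roundTo) => a + b;
```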
I made this an optional argument to maintain backward-compatibility. To ensure that my assumption is correct, I’m not going to change the existing tests. They prove that backward-compatibility was maintained. Changing existing tests smells bad and is a sign that your tests are too rigid.
Instead, I would add two more tests:
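Again assuming Jest, something like:

```js
it('rounds the result when roundTo is passed', () => {
	expect(add(0.1, 0.2, 1)).toBe(0.3);
});

it('does not round when roundTo is omitted', () => {
	// Floating point math makes this the actual, unrounded result.
	expect(add(0.1, 0.2)).toBe(0.30000000000000004);
});
```

Once those tests fail for the right reasons, the implementation can change in its own commit. A sketch:

```js
// add.js
export const add = (a, b, roundTo) =>
	undefined === roundTo ? a + b : Number((a + b).toFixed(roundTo));
```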
By implementing tests at each stage, I have insurance that my improvements are actual improvements, and not causing new defects. When you have to modify existing code in the future, the time you invested in tests pays off.
How Much Testing Coverage Do You Need?
It depends on what you are building. The most orthodox rules of TDD I can find come from Uncle Bob:
- You are not allowed to write any production code unless it is to make a failing unit test pass.
- You are not allowed to write any more of a unit test than is sufficient to fail, and compilation failures are failures.
- You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
That standard is very rigid and can easily lead to tests that make changing the codebase harder. Remember, the goal is to increase, not decrease, development velocity.
Kent C. Dodds — an engineer at PayPal who is also the author of a course on testing JavaScript applications — has a great post on this topic. He argues in that post that “you get diminishing returns on your tests as the coverage increases much beyond 70%”. I don’t love that statement, but he is more experienced than me, by a lot. He does note that his open-source projects have 100% coverage because they are “tools that are reusable in many different situations (a breakage could lead to a serious problem in a lot of consuming projects)”, which sounds like a rule that would apply to a WordPress plugin. He also writes that his OSS projects are “relatively easy to get 100% code coverage on anyway”, which doesn’t sound like a lot of WordPress plugins.
Personally, my rule is more coverage than we currently have. Lack of tests is technical debt that comes due later. Even if writing tests takes more time now, it’s worth it. For brand-new code, it forces you to write code using testable patterns. Having to refactor code so it’s testable first is sometimes a pain.
Isolated unit testing in WordPress is not simple. Often, automated UI tests using Cypress.io or Ghost Inspector have served me better. I can cover a lot of functionality quickly without having to worry about the fact that the code isn’t really testable.
Adding A Feature With TDD
I’d like to walk through an example of a TDD pull request I made to Caldera Forms. In this case it was a new feature — adding a setting for maximum file upload size. One part of TDD that I like is that it forces you to figure out what new functions you need BEFORE you write them. I don’t know about you, but I’ve written a lot of code that took a lot of time to get working, only to realize I didn’t need it. TDD forces me to think through my plans before moving forward.
Here is the pull request on GitHub if you want to read it: https://github.com/CalderaWP/Caldera-Forms/pull/2823
I should note that this PR is weird because we had to merge multiple in-progress branches from related changes together. Adopting TDD is messy and I claim no perfect adherence to its laws.
Writing Failing Tests
Sometimes it’s hard to write all of the failing tests at once. For example, in this case, I needed to develop some utility methods to read field settings and do the file size check, and I needed to integrate those utility methods into the existing code. I chose to do that in two steps: I got the utility methods working and tested, and then I moved on to using those new methods.
Here is my first commit: https://github.com/CalderaWP/Caldera-Forms/pull/2823/commits/d878e9d501af6aae92d72e45340a185dea1e9c69
If you look at the code, I added two utility methods to a class, and gave them no function body:
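The exact names are in the commit; roughly, the shape was like this (class, method, and setting names are approximated here):

```php
class Caldera_Forms_Files {

	/**
	 * Get the maximum allowed file size for a field
	 *
	 * @param array $field Field config.
	 *
	 * @return int Maximum size in bytes, or 0 for no limit.
	 */
	public static function get_max_upload_size( array $field ) {
	}

	/**
	 * Check if a file is larger than the field's maximum allowed size
	 *
	 * @param array $field Field config.
	 * @param string $file Path to the uploaded file.
	 *
	 * @return bool
	 */
	public static function is_file_too_large( array $field, $file ) {
	}
}
```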
That’s it in this commit for the code I’m developing. I also committed tests that demonstrate how those functions should work:
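Simplified here, and with the setting key approximated, the first test looked something like this:

```php
/**
 * Test reading the max upload size setting from a field config
 */
public function testGetMaxUploadSize() {
	// No setting at all should mean no limit.
	$field = [ 'ID' => 'fld_1', 'config' => [] ];
	$this->assertSame( 0, Caldera_Forms_Files::get_max_upload_size( $field ) );

	// When the setting is present, it should come back as an integer number of bytes,
	// even if it was saved as a string.
	$field['config']['max_upload_size'] = '1000';
	$this->assertSame( 1000, Caldera_Forms_Files::get_max_upload_size( $field ) );
}
```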
These tests show how the new methods are supposed to work. The inline comments explain why each assertion is being made. That process forced me to think about how the settings, which at this point had no UI, should be structured and how I would later use them.
My general rule is that a commit should do one thing only. This commit adds the new methods and the failing tests. The word “and” in the previous sentence shows that I violated that rule; I probably should have made two commits. More importantly, I want to note that I spent a lot of time working through the logic of what to test, and there are a lot of pre-commit revisions there.
Making Tests Pass
In the past, when I was not using TDD, I would have had to test this by adding the UI option, creating a form with that option, then submitting the form with a file to see if my code worked as expected. Using Xdebug to step through the code and examine the results of the functions helps that process a lot, but it’s SO slow. Also, once it does work, there is no way to know if anything else breaks that feature later.
This is why I find TDD faster — when it’s possible — and why it saves time and worry in the future. Running the whole test suite between each commit would make the process very slow, though. For JavaScript testing, I use Jest as my test runner, and it can easily be configured to run only the tests for code that has changed. That leads to a strategy where I write failing tests, get all of the tests related to my changes to pass, and then tell it to run all of my tests to make sure nothing unexpected happened.
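With Jest, that selective running is built in:

```sh
# Run only tests related to files changed since the last commit
jest --onlyChanged

# Or watch files and re-run the tests related to whatever changed
jest --watch
```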
Here is an article on how to do something similar with phpunit. My personal solution is to use a “now” group annotation in my docblocks. In phpunit, you can use @group to group your tests by feature. Then you can run only the tests in that group with the --group flag in the phpunit CLI. In Caldera Forms, we have a composer script to run the “now” group tests. For example:
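The annotation goes in the test’s docblock:

```php
/**
 * Tests I am actively working on right now.
 *
 * @group now
 */
public function testIsFileTooLarge() {
	// ...
}
```

Then `phpunit --group now` runs only those tests.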
With the tests in place, I was able to start working on my two new methods, running just those two tests each time to see why each test was failing. In the process, I saw PHP errors, warnings, and notices, as well as test failures. At one point I got pretty annoyed at myself over code that should have worked, with no idea why it didn’t, but at least I had proof I wasn’t insane.
I should also note that the reason I wrote two tests with multiple assertions is that they fail faster that way. If I had one assertion per test — as is often recommended and is generally a good idea — I would have seen 10 or so failing tests at once. That’s a lot harder to make sense of. Organizing one test with one assertion after another helps me solve one problem at a time.
The actual functions I wrote look pretty simple:
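Approximately — again, the setting and method names are my reconstruction, and the real ones are in the PR:

```php
public static function get_max_upload_size( array $field ) {
	if ( ! empty( $field['config']['max_upload_size'] ) ) {
		return absint( $field['config']['max_upload_size'] );
	}
	return 0;
}

public static function is_file_too_large( array $field, $file ) {
	$max = self::get_max_upload_size( $field );
	if ( 0 === $max ) {
		// No limit is set, so no file is too large.
		return false;
	}
	return filesize( $file ) > $max;
}
```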
My tests cover a few different situations that are easy to overlook. For example, what happens if the setting doesn’t exist? Conditional logic based on the contents of an associative array is tricky in PHP because indexes may be missing, and values may be represented with different scalar types — what if the integer 1 or the strings ‘1’ or ‘true’ is used instead of the boolean true?
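A quick illustration of why that deserves tests:

```php
$config = [ 'allowed' => '1' ];

// Strict comparison fails if the stored type isn't what you assumed.
var_dump( true === $config['allowed'] ); // false

// A boolean cast is forgiving about types, but '0' and '' are falsy too.
var_dump( (bool) $config['allowed'] ); // true

// And reading a missing index raises a notice, so check it exists first.
$allowed = isset( $config['allowed'] )
	&& filter_var( $config['allowed'], FILTER_VALIDATE_BOOLEAN );
```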
While working on these methods, I definitely thought: this is simple, I don’t need tests. Yet my first attempt didn’t work, and I only know that because my tests failed, and failed in a way that helped me see why.
Moving Beyond Isolation
The tests so far were technically integration tests, because the environment requires WordPress to work and I did not think using mocks to get true unit tests was worth the pain. The rest of the tests were actual integration tests by design.
For Caldera Forms file fields, we have a separate endpoint that files are uploaded to. In this case, I didn’t need to touch that endpoint’s handler, because I didn’t mix this logic into the API handler. The API handler’s responsibility is the interaction between the WordPress REST API and a separate class, “UploadHandler”, which does the upload using data passed from the REST API.
That meant I only had to make changes in “UploadHandler”. The change was that the class needed to enforce the file size limit; the business logic lived elsewhere. I needed to make sure that with a file of the right size, the handler worked the same way as before, and that it threw an exception when the file was too large. Here are the three new tests:
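In outline, they looked like this — the constructor arguments, helper methods, and exception class are placeholders for what is in the PR:

```php
public function testKnowsFileIsTooLarge() {
	// fileFieldWithLimit() and tooLargeFile() are hypothetical helpers that
	// return a field config with a size limit and a path to an oversized file.
	$handler = new UploadHandler( $this->fileFieldWithLimit() );
	$this->assertTrue( $handler->isFileTooLarge( $this->tooLargeFile() ) );
}

public function testExceptionWhenFileIsTooLarge() {
	// \Exception::class stands in here for the specific class thrown in the PR.
	$this->expectException( \Exception::class );
	$handler = new UploadHandler( $this->fileFieldWithLimit() );
	$handler->processFile( $this->tooLargeFile() );
}

public function testAllowsFileSize() {
	// A file under the limit should be processed the same way as before.
	$handler = new UploadHandler( $this->fileFieldWithLimit() );
	$this->assertNotEmpty( $handler->processFile( $this->rightSizedFile() ) );
}
```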
The first test — testKnowsFileIsTooLarge() — does not run all of the permutations I had for the utility method I created earlier, because I already know that method works. I was just checking that it works in this context.
The second test — testExceptionWhenFileIsTooLarge() — ensures that when the file is too large, an exception is thrown. Notice that I didn’t use a try/catch pattern; I used phpunit’s expectException(). That’s the right way to do it according to the phpunit manual, and it’s simpler to write than running assertions inside of a try/catch, but it means the test code looks less like the way someone would actually use the code, which smells a little bad to me.
The third test — testAllowsFileSize() — makes sure that when a file of the right size is passed, everything works as expected. This test doesn’t check anything super specific. I mainly added it because the existing tests didn’t account for these settings. It’s an integration test that will fail if any one of many things goes wrong. Which tests fail alongside it will indicate more clearly what the issue is.
It’s Worth It
Adopting a test-driven approach to development can help a lot, especially as your team grows. Even for a solo developer, being forced to think about what changes you need to make, before making those changes, has a ton of benefit. In addition, having the tests in place before the implementation means you’re not spending time or brain power devising ways to test after the fact.
Think about all of the different informal tests you’ve set up while working on a feature. How many times have you called a function and var_dump()ed the result until you got the right result, then deleted that code and moved on? That’s the same basic approach as writing a test that asserts the result of the function is what you expect. Don’t you wish you could have kept those informal tests in your code base forever?