I Hate It When My Tests Pass

I’ve been writing a bunch of unit tests recently, and I came across an odd behavior. I was writing the tests for existing code and whenever I would write a test, as soon as it passed, I wouldn’t trust it. Nothing ever works the first time around. It simply can’t be. So I found myself repeating quite an odd ritual:

This is a companion discussion topic for the original entry at http://amir.rachum.com/blog/2012/04/23/i-hate-it-when-my-tests-pass/

If it were possible to trust that software does what we expect it to, I'm not sure if tests would be necessary in the first place.

You got a point there :)

You're supposed to write the test first so that it fails, and then write the code that makes it pass, for exactly that reason.

Well, it always does what we tell it to, but not always what we intend it to.

Instead of writing tests that just cover the happy path and then adding more tests because you don't trust the first ones, write tests for the failure cases the first time around.
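A minimal sketch of what "testing the failure cases" can look like with the standard library's unittest; `parse_age` is a made-up function just for illustration:

```python
import unittest

def parse_age(value):
    """Parse a non-negative integer age from a string (hypothetical example)."""
    age = int(value)  # raises ValueError on non-numeric input
    if age < 0:
        raise ValueError("age must be non-negative")
    return age

class TestParseAge(unittest.TestCase):
    def test_happy_path(self):
        self.assertEqual(parse_age("42"), 42)

    def test_non_numeric_input_fails(self):
        # The failure path is asserted explicitly, not just assumed.
        with self.assertRaises(ValueError):
            parse_age("forty-two")

    def test_negative_age_fails(self):
        with self.assertRaises(ValueError):
            parse_age("-1")
```

The failure-path tests are the ones that would have caught a `parse_age` that silently returned garbage, which is exactly the scenario the "I don't trust my passing test" ritual is guarding against.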

It's a pretty old tool and it doesn't seem to be actively maintained anymore, but you might be interested in a tool called Jester. Basically it screws around with your production code and checks how many of your tests start failing. If all your tests still pass even after "x + 10" has been rewritten to "x - 10", it's quite likely that they are not good tests.
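The idea behind Jester (mutation testing) can be sketched by hand in a few lines. Everything below is an illustrative toy, not Jester's actual mechanics: we "mutate" the code exactly as described, `x + 10` into `x - 10`, and see which test notices.

```python
def add_ten(x):
    return x + 10

def mutant(x):
    # the "x + 10" rewritten to "x - 10" by the mutation tool
    return x - 10

def weak_test(impl):
    # A test that only checks that *some* integer comes back.
    # It keeps passing no matter what the code does.
    return isinstance(impl(5), int)

def strong_test(impl):
    # A test that pins down the actual behaviour.
    return impl(5) == 15

# The weak test lets the mutant "survive"; the strong test kills it.
assert weak_test(add_ten) and weak_test(mutant)
assert strong_test(add_ten) and not strong_test(mutant)
```

A surviving mutant is the tool's way of telling you the same thing your gut was: this passing test isn't actually testing anything.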

This. More formally

I doubt most programmers do this, but it's not completely my call (see my earlier post).

I've used a pattern like this:

@expect(result=5, exception=None)
def test_foo_count(self):
    return 5
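There's no standard `@expect` decorator; here is one hypothetical implementation of that pattern, written as a standalone function (no `self`) to keep the sketch self-contained:

```python
import functools

def expect(result=None, exception=None):
    """Hypothetical decorator: run the wrapped test and check its return
    value, or the exception it raises, against the declared expectation."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                actual = func(*args, **kwargs)
            except Exception as exc:
                if exception is None or not isinstance(exc, exception):
                    raise AssertionError(
                        f"{func.__name__} raised unexpected {exc!r}")
                return  # the expected exception occurred: test passes
            if exception is not None:
                raise AssertionError(
                    f"{func.__name__} did not raise {exception.__name__}")
            if actual != result:
                raise AssertionError(
                    f"{func.__name__} returned {actual!r}, "
                    f"expected {result!r}")
        return wrapper
    return decorator

@expect(result=5, exception=None)
def test_foo_count():
    return 5

test_foo_count()  # a wrong return value would raise AssertionError
```

The appeal of the pattern is that the expectation sits right above the test body, so a test that can never fail is easier to spot at a glance.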

This reminds me of how, when a lightbulb has blown, people will flip the switch several times to make sure it really doesn't work.

It's kinda like we assume that the lightbulb/compiler didn't hear us the first time we asked it to turn on/compile with new tests.

Tools such as pitest (www.pitest.org) help me get rid of those nagging feelings :-)

You need to write tests for your tests.

The obvious solution: http://cdn.memegenerator.net/i...

In step 3, you should change the code being tested in a way that should break the tests. This will have the same effect as writing tests with TDD in the first place, causing you to trust them:

def factorial(x):  # written a year ago
    return int(x == 1) or x * factorial(x - 1)

def test_factorial():  # written just now by me
    expected = 6
    assert expected = factorial(3)

Secretly, there are bugs in the factorial function AND in the test function. But the test passes. I'm worried that the test doesn't work, so I break the factorial function:

def factorial(x):
    return x - 10

Whoops! The test still passes. What's up? I rewrite the test until it breaks :) Then I rewrite factorial until it passes.
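For completeness, here is one way the end state of that rewrite loop might look, with both bugs fixed; this is a sketch of where the story lands, not the commenter's actual final code:

```python
def factorial(x):
    # Fixed: explicit base case instead of the int(x == 1) trick,
    # which sent factorial(0) into infinite recursion.
    if x <= 1:
        return 1
    return x * factorial(x - 1)

def test_factorial():
    expected = 6
    assert factorial(3) == expected  # '==', not the '=' typo

test_factorial()

# Sanity check that the test really can fail: the broken
# "return x - 10" version would trip the assertion.
broken = lambda x: x - 10
assert broken(3) != 6
```

Having watched the test fail against the deliberately broken `factorial`, you now have the same evidence a test-first workflow would have given you.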

You might consider stochastic testing (QuickCheck for Haskell and Erlang is one example). That way, you don't have to define the search space by hand, and parameter sets you wouldn't have considered get tried quite frequently.
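A QuickCheck-style check can be sketched with nothing but the standard library's random module; a real property-testing tool adds input shrinking and smarter generation, but the core idea is just this (`my_sort` is a stand-in for whatever is under test):

```python
import random
from collections import Counter

def my_sort(xs):
    # stand-in implementation under test
    return sorted(xs)

def test_sort_properties(trials=200, seed=0):
    """Generate random inputs and assert properties that must hold
    for every input, instead of hand-picking a few cases."""
    rng = random.Random(seed)  # seeded, so failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        out = my_sort(xs)
        # Property 1: the result is ordered.
        assert all(a <= b for a, b in zip(out, out[1:]))
        # Property 2: the result is a permutation of the input.
        assert Counter(out) == Counter(xs)
        # Property 3: sorting is idempotent.
        assert my_sort(out) == out

test_sort_properties()
```

Because the assertions are properties rather than hard-coded expected values, a test that "can't fail" is much harder to write by accident.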