Thursday, May 31, 2012

Wonderful Uses of Unit Testing in PostgreSQL DB's

I am sure many readers will read the headline of this post and immediately think of pgTAP, which has become the de facto way to write unit tests in PostgreSQL db's.  The sad part is that we on the LedgerSMB team have not yet gotten around to porting our unit tests to pgTAP.  We have a slightly different approach which works in a similar way.  But this post isn't really about frameworks, but rather applications of the concept of unit testing.

Just to give folks some background, we started adding database-level unit tests to LedgerSMB back in 2007, at a time when pgTAP was still an alpha, and at the time we thought that we should wait for pgTAP to prove itself (pun intended) before relying on it as a testing infrastructure.  It has certainly matured since then.  Our approach instead is to write unit tests in SQL which write to a temporary table called test_result.  At the end of the test run, we select from test_result, provide a report of successes and failures, and roll back the transaction.  Of course if you encounter an exception part way through, your test cases will die and you won't have access to the test results until it is able to run through to completion. 

The test results are then parsed in Perl and the results fed back through Perl's test harness.  This approach has a few advantages and a few disadvantages compared to other fraeworks.  I like the fact that all the results are visible at the end.  It makes debugging easy when there is a failure because you can look at the full pattern of successes and failures and then scroll back to find any debugging information that the developer felt like outputting between test cases.

However what I really want to talk about is how these are or can be used in a mission critical environment.  I think most people would agree that software that tracks money is mission critical at least in terms of its data integrity.  If the data integrity suffers, everything in the business can fall apart quite quickly.  So ensuring data integrity for accounting software that may be customized by third parties is both a hard job and one that is absolutely necessary.

One key aspect of unit tests in this case is that they don't provide relevant lasting impacts on the database they are run against.  While there are some exceptions that are outside transactional control (sequences and the like) usually we don't care so much as to whether those are in the state they would have been for but for the test run.,  The ability to run unit tests on a live system without the possibility of screwing something up makes customization and rapid development possible in environments that it would not otherwise be.  (In actuality we have conventions to offer multiple layers of protection against data corruption.  For example serial values are typically assigned in the test case from a range of negative numbers, thus unlikely to conflict with existing real values.    However, this is nothing compared to the ability to simply roll back the transactions.)

Consequently you can use unit tests on a live system as a measure of ensuring that everything is ok.  A very useful corollary here is that you can use unit tests as a troubleshooting mechanism.  Something not working as expected?  Maybe the first thing that should be done is running the unit tests and see if anything fails.  You can thus add all sorts of checks that are useful for support and deployment purposes.

One key example here is the fact that behavior of LedgerSMB's query mapper is undefined where the target function is overloaded and no arguments defined.    Consequently it is very important to know which functions are overloaded that shouldn't be.   We actually have a test for overloaded functions, and we run this typically against misbehaving databases to see if there are issues there.

Occasionally I have recommended to customers that they wait to install add-ons until unit tests have been written.  This has never yet been the wrong call.

I am a firm believer in unit testing in the db.  They are just useful and not only in a QA environment.  These tests are just as useful as general diagnostic tools frameworks.  Therefore I think every database application should be using them.

No comments:

Post a Comment