As we are all aware, this is an area fraught with problems.
Problem 1 is which test to use. If we use the same test, with the same words, every year, teachers will inevitably teach to the test. This is why I like Dennis Young's Parallel Spelling Test, which allows teachers to use different sets of words so that the same test is never given twice. However, as Jenny has pointed out, the test hasn't been re-normed for a long time and should perhaps be brought up to date.
Problem 2 is the question of what is being tested. Are we to test Dolch words, or should the criterion be what has actually been taught? If we were to follow the latter, then we'd probably agree to test CVC words, followed by CVCC, CCVC, CCVCC, and so on. Nevertheless, not everyone would be happy with this because, as we know, L&S makes the explicit teaching of words of greater complexity (CVCC, CCVC, etc.) optional. Some teachers would also want to throw in double consonants and some consonant digraphs, such as <sh>, <ch>, and so on, at an earlier stage than others.
Whatever we think about problems 1 and 2, problem 3 is this: how do we go about testing the different ways of spelling sounds? Do we test just one way of spelling each sound? Two? Three? Clearly teachers will jump up and down if we test code knowledge which hasn't (yet?) been taught.
When we decided, strategically, to lay more emphasis on testing spelling, we did so for a principled reason: we believed (and still do) that spelling gives a far more accurate picture of children's literacy than reading does. Fortuitously, it is also much easier to test, in that a spelling test can be delivered to a whole class in one fell swoop and is easy to mark. That isn't to say that we don't think it's a good idea to test reading.
Many heads of schools involved in helping us collect data were, after a number of years, able to see at a glance how their cohorts compared. Although cohorts differ slightly from year to year, heads realised that there shouldn't really be major shifts in results, which then also gave an indication of how well teachers were teaching. The tests were also an encouragement for teachers to 'keep up to the mark'. [Of course, the problem with this last point is that once results between schools are compared, testing starts to become high stakes and results are then more likely to become skewed.] Another advantage of using a test, any test, is that it keeps in mind that the teaching of reading and spelling is a big deal and the foundation on which all else that takes place within the primary school is based.
What really surprised us when we started collecting data was how few schools were using any kind of normed and standardised tests. Every judgement they made about pupils' reading and spelling abilities seemed to be based on opinion. Unfortunately, this is still so often the case. What also surprised us was that, when marking the spelling tests we sent out, teachers seemed to be stricter (over anomalies) than we were, and there was hardly any evidence at all of cheating!
We have begun to think about this again recently, and I have asked a number of schools to start testing using the Parallel Spelling Test. For us, though, it is enormously expensive to conduct on the same scale as we did in the beginning. This is because one of our team [David Philpot] spent a lot of his working time sending out tests, lists of words to use that year, and clear instructions on how to conduct the tests. Then there was all the data on the chronological ages of each individual child, the double marking, and so on. It was truly a labour of love, and it cost a lot of money to do. David's reputation as an educational psychologist and his integrity were also widely respected - which was certainly a factor in Wigan.
As I've said before, I'd also love to see what would happen if the Schonell were used again and compared with Jenny's results. I wish now I'd kept the vast numbers of test results we collected on boys arriving in secondary school at the beginning of what we now call Year 7. I remember very well how the results got worse year after year, and that this made decisions about who needed interventions and who didn't very tricky, particularly as more boys were dropping below the 9:0 to 9:6 levels. This was generally considered to be the level at which children could at least function within the first year of their secondary schooling; yet there were so many dropping below the nine-year level that we didn't have the resources to offer extra help. Only the very neediest were given support. And that was pretty much whole language in those days!