Formative vs Summative Assessment

Over the last term I have become more interested in assessment, particularly in centre assessed grades and how reliable and accurate teacher assessment can be. Now, just to clarify, I am not saying this is a revolutionary idea, nor am I saying assessment is wrong. On the contrary, I think assessment is vital in teaching, but I think what we perceive to be assessment could be adapted: the purpose (function) should drive the format (form), not the format the purpose, as happens when a mock exam or end of topic test is used as the default for predicted grades. What I don’t think is vital, however, are the terms formative and summative assessment, since they can be conflated and ambiguous depending on who you ask.

Ask any two teachers and their definitions of formative and summative will either be confused or blended, or they will resort to examples. This isn’t necessarily wrong, but it is not surprising that, as a trainee, I often got assessment for learning and assessment of learning confused. For example, would a mock exam be considered formative or summative if it was used both to predict a target grade and to assess gaps in knowledge?

From doing my Masters in Expert Teaching, I am aware there exists a wide range of studies on assessment (I have a bibliography of 37+). What I concluded from the reading, however, is that “summative” assessment is definitive: a grade or mark that has a wider impact. “Formative” is everything else. Formative assessment should therefore be responsive, dynamic and not worth anything to anyone except the teacher, who is using it to inform a wide range of decisions.

So, through the reading I did in the assessment module of my Masters (37 papers and counting), I decided to change my terminology. Not a radical change, but a change that I feel is vital: rather than summative and formative, we have low-stakes and high-stakes. Although this might initially seem like re-branding, I personally find identifying effective assessment easier when thinking about it in these terms.

High-stakes assessment is any assessment with real-world impact, be it GCSE exams, SATs and so on. It needs to be reliable and valid/accurate. Low-stakes assessment, by contrast, is about driving student improvement, or to quote Black & Wiliam (1998), “all those activities undertaken by teachers and/or by their students which provide information to be used as feedback to modify the teaching and learning activities”, or as Harry Fletcher-Wood puts it, “Responsive Teaching” (a good book and worth a read).

As I said, nothing new, nothing that is not already floating around, but it is surprising how few people realise the impact of this semantic change. By using high-stakes and low-stakes, suddenly the function drives the form. A mock exam used to assess the knowledge taught by the end of Year 10 and address gaps is low-stakes, whereas the same exam used for predicted grades (or, more recently, as evidence towards centre assessed grades) is high-stakes.

This also means that if an assessment is high-stakes, it needs to be reliable and valid. There is a range of methods to improve reliability and validity, for example comparative judgement, blind marking, or more simply moderation and standardisation. What I think is important, however, is that for each high-stakes assessment you clearly show how you have ensured the reliability and validity of the marks. This means knowing the purpose of the assessment and why it is being used.

Low-stakes assessment, on the other hand, drives student improvement, and as such the reliability of the result is not as important as the precision of the inferences teachers make. You could use end of topic tests, homework or mock exams to check this. But why create more marking? I would rather use something like a hinge question (Dylan Wiliam has a great article on this, worth a read) or even an exit ticket to gauge gaps and therefore decide where to take the lesson next. It’s quick, immediate and no extra work. The issue here, however, is the precision of the inferences: a poorly designed hinge question can provide inaccurate assessment, so the planning of these is important.

What is clear to me is that I now finally understand that the purpose of the assessment is the most important part. This has already impacted my teaching; I am no longer using the same mock exam both to look at what misconceptions or gaps exist and as evidence for a grade. Likewise, I now regularly use hinge questions to respond to my students rather than end of topic tests. Yes, end of topic tests are useful for assessing a broader range of concepts, but if I am responding to the learning to drive improvement, checking a wider domain means I lose some of that precision.

So those are my thoughts; feel free to comment, and if this is positively received I might write more about methods to ensure reliability and validity, as well as about making precise inferences when it comes to responsive teaching. If not, I will write it anyway.

Published by EducationRambler

Welcome to my blog. Follow me on twitter @gavint321. I am a passionate teacher and leader looking to ensure good practice is shared. All thoughts are my own. I have a passion for research-based education, craft skills and reflections on my own practice.

