Tuesday, April 9, 2019

Mastery (specification) grading, and why you should try it

Students come up during our Friday class, ask me for a completely optional voluntary quiz, I hand it to them, then they say “thank you!” An unbelievable occurrence that happens all the time with Mastery grading. -- (@katemath)


This semester I tried a new approach to grading, called “Specification grading” (also known as Mastery-Based grading), and honestly, I think it is the best thing I’ve tried in my teaching career, ever. It is so good, it pains me to think that I didn’t discover it earlier, and honestly I don’t understand why all faculty in the world haven’t switched to it yet. It’s not even “revolutionary”, it is just how grading should be. How it was always meant to be.

So now that I have hopefully grabbed your attention with these empty superlatives, let me explain how specification grading works in practice. It may sound a bit confusing and underwhelming at first, so I’ll start with the definitions, then provide a rationale for them, and then show how I adjusted my courses to this grading scheme, as an example.

There are three main principles:
  1. Instead of assigning grade-points, and then calculating letter-grades based on these points, you define a set of criteria students need to meet to get an A. You also describe what share of these criteria students would need to meet to get a B, or a C. Some people call these criteria “bundles”, some call them “standards”, or “benchmarks”. 
  2. You give students several opportunities to meet each of these criteria. It can mean that they are allowed to take a test several times, until they pass. Or maybe they can rework and improve a piece of writing several times, until it is good enough. But that’s the key point: you don’t grade HOW and WHEN they arrived at a given level of mastery; you only care WHETHER they demonstrated this level of mastery before the course ends. 
  3. As your time is limited, and students can now try every assignment several times, you cannot afford to grade individual assignments on a point-based system: you either accept an assignment as “good enough” (if it passes your benchmark), or you say “try again”, and provide targeted feedback. Crucially, this feedback is completely uncoupled from the final grade.

This list may, at first, seem rather flat for the grandiose claim with which I started this letter. So, why does it work exactly? Here are the benefits, compared to a point-based system:
  • Students can try to pass a benchmark several times. It means that they keep practicing the task, which is of course the best way to learn! 
  • It also means that they have less anxiety (and as you have surely heard by now, anxiety is the biggest issue modern students face, for systemic / societal reasons). Imagine how liberating it is for a student to know that, at least in principle, in your course there are no irreversible mistakes. Given time, they can fix everything. If they are sick on the day of an exam, if they have test anxiety, they know that there’s a safety net. They can try again. They still need to plan their work (as you won’t accept it after the course is over), and they still need to put the effort in, but this system is infinitely easier on the nerves. 
  • There may be more than one way to pass a benchmark, which is great for inclusion. Say, you can decide to use a timed test for the first attempt, a short essay on a matching topic for the 2nd attempt, and a choice between a longer take-home essay and a short verbal exam for all subsequent attempts. It’s up to you how you define the benchmarks! But this way if a student has test anxiety, or an attention deficit, even if undocumented (which is common for students from traditionally underrepresented groups), they can always use this extra flexibility to meet your criteria. 
  • This improvement in students’ well-being is good for you, as now you don’t have to meet with sad anxious people, or entitled grade-grubbers, justifying each point on each quiz. If they think that you misunderstood them, they are always welcome to try again, and make sure that this time they are easy to understand. (Of course, some extreme students may still try to extort grades from you, but even then, this type of grading is really not conducive to grade-grubbing.) 
  • It is also good for you, as it makes your grading easier, and almost painless. You no longer have to obsess about “marginal cases”. As any assignment can be retaken, it is much easier, psychologically, to say “Try again! This little thing is missing, but I’m sure you can get it right next time!” In fact, you don’t even have to say “try again”: you may choose to go with “fix this one thing, and it will be a pass”. You no longer have to be obsessively specific, documenting and justifying every little point that you give or take away. The only thing that matters is that you give some good, pointed, clear feedback about those few things, or maybe even that one thing, that matter. 
  • For writing assignments that go through several revisions, the process of meeting the criteria becomes more of an interaction, similar to how we work with reviewers when submitting a research paper to a journal. You tell students what they need to improve, if they want you to accept their work. You can also require that they make it easy for you to see and assess these changes (you can ask them to bring the previous copy with them, or “track changes”, or submit a list of responses - whatever works for you). And then you only recheck this one part that they improved; you don’t have to re-read the entire thing. 
  • As a side-effect of this “benchmark” approach, you may actually make some bundles optional, and required for an “A” only (but say, not for a “B”). Now it’s up to the student whether they want to take the challenge. This moves the initiative in learning to the student; makes them “buy” into the assignment, and commit to it. It improves their learning, as now they own it! 
  • And incidentally, it makes life a bit easier for you (which is important, as some of the points I described above could make your life a bit harder). If a student doesn’t want to attempt an assignment, as they are fine with a “B”, they just don’t. It means that you have less work to grade. And it is critical, as you have more to grade for those students who want to give another assignment another try!


How to convert an existing course to a specification grading system? Here’s my process:
  1. Identify key criteria your students need to meet. Some of them may come from skills you teach (writing, presentation); some may represent different conceptual topics of your course (cell division, species diversity). Enumerate them. 
  2. For each criterion, define how you can tell whether a student “got it”. Try to find a test that may be repeated several times. Say, when teaching math, a professor may create 5 variants of an assignment, with 4 problems in each of them, and define “success” as solving 3 problems out of 4 (in which case “retake attempts” would not be infinite, as one could only try each test 5 times, but it is still a lot!). Or you can make students work on a piece of writing, with a rubric, until they meet 9/10 of the criteria of the rubric. It’s up to you how you define it; the only two things that matter are that students understand what they need to do, and that they can have several tries at it. 
  3. Once you have defined an “A”, define what a “B” and a “C” would look like. You can define pluses and minuses as well, if you want to, or you can just state on your syllabus that pluses and minuses are reserved for intermediate cases. 
  4. Plan your assignments in a way that would give students time to attempt them. Say, you can no longer have a final exam on the completion week. But you can use the completion week to give students an extra opportunity to get passes on some of the prior topics.
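
To make the math example from step 2 concrete, here is a minimal Python sketch of a “pass / try again” check. It assumes the numbers from that example (5 variants, 4 problems each, 3 solved problems to pass); the function name and the status strings are made up for illustration, and your own benchmark may well look different.

```python
# Hypothetical numbers taken from the math example: 5 variants of a quiz,
# 4 problems in each, and a "pass" means solving at least 3 of the 4.
NUM_VARIANTS = 5
PROBLEMS_PER_VARIANT = 4
PASS_THRESHOLD = 3


def benchmark_status(attempts):
    """Return the status of one benchmark for one student.

    `attempts` lists how many problems the student solved on each
    attempt so far, in order (one entry per quiz variant tried).
    """
    if len(attempts) > NUM_VARIANTS:
        raise ValueError("more attempts recorded than variants exist")
    if any(not 0 <= solved <= PROBLEMS_PER_VARIANT for solved in attempts):
        raise ValueError("solved count must be between 0 and 4")

    # Only WHETHER the threshold was ever reached matters, not when.
    if any(solved >= PASS_THRESHOLD for solved in attempts):
        return "pass"
    if len(attempts) < NUM_VARIANTS:
        return "try again"
    return "no attempts left"


print(benchmark_status([1, 2]))     # two unsuccessful tries so far
print(benchmark_status([1, 2, 3]))  # third try reached the threshold
```

Note that there is no score in the output: earlier failed attempts leave no trace once the benchmark is passed, which is the whole point of the scheme.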


Now, some examples. I am teaching my “Introduction to Neurobiology” on a specification-based grading system this semester, and it works marvelously. Here’s an excerpt from the syllabus:


Course material consists of four "Bundles":
  1. Core bundle (weekly reading reflections and practical homework). 
  2. Knowledge bundle (tested via quizzes). 
  3. Lab bundle (lab work and lab reports) 
  4. Depth bundle (paper reviews)
Depending on what end-course grade you want to get, you need to complete different bundles to different levels. Your grade is in your hands!
 
| End-course grade | Core | Knowledge | Lab | Depth |
| A | not more than 1 reflection missed | All quizzes completed | Good lab work, all lab reports | 4 paper reviews |
| B | up to 2 reflections missed | 90% of quizzes completed | Good lab work, 80% of lab reports | - |
| C | up to 4 reflections missed | 70% of quizzes completed | 50% of lab reports | - |
| D | up to 5 reflections missed | 50% of quizzes completed | - | - |

All individual assignments are graded on a pass/fail (or rather, “pass / try again”) basis. There are no points, and no partial credits; you just need to pass. Simple!

The core bundle is easy, but it is the only one that is unforgiving: you need to submit your short responses (typically, 2-3 sentences) on time, before 9 am the day the assignment is due (usually Monday). Make sure to set aside time for it, and put it on your schedule. That is the only one that cannot be redone.

For all other bundles, you can have several attempts to pass them. If you miss or fail a quiz, you can try again during allotted times in class (there will be one opportunity before the spring break, and one more during the completion week). Better yet, you can come to drop-in hours, or schedule a visit (send me an email), and pass this topic as a short verbal exam (conversation). You can have as many tries as you need; the only limitation is that you can only give it a try once a week, so if you tried but haven't passed, you need to have a week-long cool-down.

For lab work, you need to be present, and meaningfully engaged (e.g. not texting in a corner, but organizing your team, building things, taking notes, collaborating, logging and analyzing data etc.)

For lab reports and paper reviews, it is expected that you may have to go through several iterations. I will provide feedback, and request revisions, until your work passes.

The description of the paper review assignment will be shared separately.

No work will be accepted after the last day of Completion week.

Pluses and minuses will be used for intermediate cases, at the discretion of the instructor. The easiest way to get a plus is to go above the minimal set of requirements for a grade.
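
For readers who like to see rules spelled out, the grade table from this excerpt can be sketched as a small lookup in Python. This is only an illustration under several assumptions that are not in the syllabus: empty table cells are read as “no requirement”, anything below the “D” row is labeled “F”, “completed” is read as “passed”, and pluses and minuses are ignored because they are left to the instructor's discretion. All names (`Progress`, `REQUIREMENTS`, `letter_grade`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    reflections_missed: int   # Core bundle
    quiz_share: float         # Knowledge bundle: share of quizzes passed, 0.0 to 1.0
    good_lab_work: bool       # Lab bundle: present and meaningfully engaged
    lab_report_share: float   # Lab bundle: share of lab reports passed, 0.0 to 1.0
    paper_reviews: int        # Depth bundle


# One row per grade, best grade first:
# (grade, max reflections missed, min quiz share,
#  good lab work required, min lab report share, min paper reviews)
# Empty table cells are ASSUMED to mean "no requirement".
REQUIREMENTS = [
    ("A", 1, 1.0, True, 1.0, 4),
    ("B", 2, 0.9, True, 0.8, 0),
    ("C", 4, 0.7, False, 0.5, 0),
    ("D", 5, 0.5, False, 0.0, 0),
]


def letter_grade(p):
    """Return the best grade whose requirements are all met."""
    for grade, max_missed, min_quiz, need_lab, min_reports, min_reviews in REQUIREMENTS:
        if (
            p.reflections_missed <= max_missed
            and p.quiz_share >= min_quiz
            and (p.good_lab_work or not need_lab)
            and p.lab_report_share >= min_reports
            and p.paper_reviews >= min_reviews
        ):
            return grade
    return "F"  # assumption: the table itself stops at "D"


# A student who did everything for an "A" except the paper reviews:
print(letter_grade(Progress(0, 1.0, True, 1.0, 0)))
```

The example student lands on a “B”: the Depth bundle is the only thing separating the two grades, so skipping it is a deliberate choice the student can make.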




In practice, for the “Knowledge” bundle, which corresponds to different conceptual topics we study (such as action potential, synaptic transmission, retinal circuitry etc.), I give students several attempts to pass them, and all of these attempts are slightly different. First they have a 10-min “pop quiz” in class. If they aren’t successful with it right away, they can either come to my office hours and have a verbal micro-exam, or they can wait till the special class time (15 min set aside during the lab time) when I make them provide a long answer in class. The verbal exam idea worked particularly well for me: for the first time ever, I actually have students come to my office hours, and these office hours are uniquely productive! I ask them to explain a topic to me. If they are faltering, I help them. If they manage to explain the topic with only 1-2 prods, they earn a pass. If not, (and that’s the surprisingly great part!) our discussion just naturally evolves into a productive tutoring session. We transition right from their answer into a discussion about what they are missing, or misunderstand, or need to express better, and then I tell them to come again in a week, and give it another try.

Just imagine how pleasant it is to meet with a student who “failed” a topic two months ago, but who can now explain this topic to you reasonably well, even though you haven’t recently referred to it in class! It means that they went back to the material, studied it, and figured it out. They are happy, you are happy, it’s a win-win!

Does this approach necessarily make the course too easy? No, because now you can reasonably expect a somewhat better level of understanding! My mid-term results were 30% “A”, 40% “B”, 10% “C”, and 20% “D”, which is pretty similar to my grade distributions in prior years, when I used point-based grading. But who got As and Bs this time was slightly different: people who would typically be B students, but who persevere, got an A, while some smart but scattered students got a B, or even a “C”. At this point, all students can still get an A if they want to, but they need to put in some effort if they want to earn it. (And in fact I had a bout of activity immediately after the spring break, as students received mid-term grades, didn’t like them, and closed some of the gaps, so current running grades are actually a bit higher than the mid-term grades).

Does this approach increase the work load on the faculty? I want to tentatively say “no”; at least, not if you assume that office hours were already spent meeting with students, before you switched to this method, and also assuming that you don’t overcommit. It was an increase for me personally, as I almost never had students attend my office hours before, and now they are packed, but it is still contained within about 2-3 hours a week, so it seems doable. It also makes planning office hours more important, as this approach essentially makes office hours attendance semi-required for a student to succeed. Which means that if a student has a time conflict, you may have to meet with them outside of your normal office hours, which may be hard sometimes. But that said, grading got so much easier, and (crucially) it is no longer painful, or emotionally draining, that it is still a net win. You know that you don’t close the door of opportunity, which gives you moral permission to be a bit more demanding when working with “gray zone” answers, which is a huge relief, and makes the process almost pleasant.

The only word of caution here: there may be students who, despite all your statements and encouragement, both in class and in the syllabus, would be hesitant to come to office hours, especially if your standard drop-in hours don’t work with their schedule. They would be ashamed to be a burden, so they’ll just sit there, waiting for in-class opportunities to retake the quiz, but they won’t be proactive about it. It may be a personality issue, or an issue of culture, or both (think socioeconomic status, gender, introversion), but either way, if one needs to come to your office to retake the quiz, make sure that you require each student to show up to office hours at least once. And then talk to them. Make sure they believe you when you say that retaking a quiz is part of the deal, and not some extra favor they have to ask for.


Finally, as another example, here’s what I plan to do with my Statistics class next fall. For topical goals, I’ll still use problem quizzes, but will prepare several versions of them. While verbal exams are great, because stats are closer to math, I think problems could work better. For skills (data wrangling, R coding, visualization, figure captions), I’ll introduce several in-class data workshops that would each count towards “closing” a topic. This would be a change to my current process: in the past, I offered all workshops as take-home assignments; graded some of them, and provided formative feedback on the rest. All workshops were submitted openly but anonymously (every student could see every other student’s submission), and the final exam was optional (if you liked your running grade, you could skip it). Now I will have to split all workshops into two types: those that are take-home, and those submitted by the end of class. (I still plan for both types to be submitted openly and anonymously). Take-home assignments will count towards “participation” (every student needs to submit something vaguely reasonable), while in-class assignments will serve as mid-terms. Except that these mid-terms will be “anticumulative”: if you proved that you know topic A, you don’t have to polish this part of an assignment anymore, you can concentrate on the topic B part of it. (I still need to put some thought into making it transparent to students, but it seems doable).

To sum up, I think specification grading (aka Mastery grading) is obviously better than point-based grading, both for the students and for the faculty. It seems to improve learning; it is fairer and more inclusive, and it is unexpectedly pleasant to use! So I definitely encourage you to give it a try!


References

Kate Owens. A Beginner’s Guide to Standards Based Grading (A practical blog post) https://blogs.ams.org/matheducation/2015/11/20/a-beginners-guide-to-standards-based-grading/

Sadler, D. R. (2005). Interpretations of criteria‐based assessment and grading in higher education. Assessment & evaluation in higher education, 30(2), 175-194. https://uncw.edu/cas/documents/sadler2005.pdf

Carberry, A., Siniawski, M., Atwood, S. A., & Diefes-Dux, H. A. (2016, June). Best practices for using standards-based grading in engineering courses. In Proceedings of the 123rd ASEE Annual Conference and Exposition, New Orleans, LA.
https://www.asee.org/file_server/papers/attachment/file/0006/9169/SBG_ASEE_final_submitted.pdf

A giant open repository of materials on Mastery-based grading (curated by Dr. Rachel Weir, Allegheny College):
https://drive.google.com/drive/folders/1GNSqfOb0LZS6BeAuc1tqPDZWKkPk11KT

Nilson, L. (2015). Specifications grading: Restoring rigor, motivating students, and saving faculty time. Stylus Publishing, LLC. (Book)