Barbara Kerbel, born in 1978, is a journalist specializing in education, social issues and society. She studied psychology in Giessen and then completed a journalism traineeship at the "Süddeutsche Zeitung" newspaper. Since 2010, she has worked as an author and freelance editor in Berlin, including for the Berlin-based "Tagesspiegel", the business magazine "brand eins" and the language learning magazine "Deutsch perfekt".
Numerical grades from 1 to 6 are as much a part of school as lessons, class trips and snacks. But grades have been criticized for decades. They are considered unfair, arbitrary, not comparable. Why then are they still given almost everywhere?
A boy holds his midterm report card in a classroom of the St. Stephan middle school in Straubing (Lower Bavaria). (© picture-alliance/dpa)
If you get a 1 in math, you can be happy about your top performance; if you get a 3, you know that you are average; and if you get a 6, you know that your math performance this school year was "insufficient". Grades summarize information in numbers that can be understood at a glance.
But this reduction to numbers has been criticized for many years. Grades are considered unfair, prone to bias, and poorly comparable. Primary school teachers in particular are campaigning to replace numerical grades with other forms of assessment. Waldorf schools and reformed model schools do without grades until the upper school years. Several German states are trying out alternatives to traditional grades. Are grades as bad as their critics claim? And if so: why are they still awarded everywhere? Who needs grades? An overview of key issues in the debate over numerical grades.
How can measurement methods be assessed – and what does this mean for school grades?
Many questions that researchers are interested in can only be answered by counting and measuring phenomena. Because measurement plays such a central role in science, a number of quality criteria have been agreed upon that can be used to determine how good a measurement method really is. School grades are also measurements: they claim to measure the performance of students, whether in a subject, a class assignment, or an oral examination. Therefore, the quality criteria for measurement established in science can also be applied to school grades. What conditions would grades have to fulfill in order to be considered a good measurement method?
- Objectivity: Is a measurement procedure independent of the person who uses it? This is the question of objectivity. A measurement is objective if different observers arrive at the same results. School grades would therefore be objective if different teachers gave the same grade for a student's performance.
- Reliability: Does a measurement method measure dependably? This is the question of reliability. Among other things, a measurement is reliable if a person gets the same result when the measurement is repeated. For school grades, this means that they would be reliable if a student received the same grade on two papers with comparable tasks written in succession.
- Validity: How well does a measurement method really measure what it is supposed to measure? This is the question of validity. To check it, observations and measurement results from different sources are compared with each other. Consequently, school grades would be considered valid if students who achieved a good grade in one paper also did well in other examinations related to the same area of knowledge.
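The three criteria can be made concrete with a small calculation. The following is only an illustrative sketch: all grades and test scores are invented, and the choice of statistics (a simple agreement rate for objectivity, the Pearson correlation for reliability and validity) is an assumption made for this example, not a method named in the article.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def agreement_rate(grades_a, grades_b):
    """Share of papers to which two teachers gave exactly the same grade."""
    same = sum(1 for a, b in zip(grades_a, grades_b) if a == b)
    return same / len(grades_a)

# Invented data for six students (grades: 1 = best, 6 = worst).
teacher_a = [1, 2, 3, 3, 4, 5]           # teacher A grades six papers
teacher_b = [2, 2, 3, 4, 4, 5]           # teacher B grades the same papers
first_test = [1, 2, 3, 3, 4, 5]          # grades on a first paper
second_test = [1, 3, 3, 2, 4, 5]         # grades on a comparable second paper
test_scores = [95, 80, 70, 72, 55, 30]   # standardized test, higher = better

# Objectivity: do different teachers arrive at the same grade?
objectivity = agreement_rate(teacher_a, teacher_b)

# Reliability: does a comparable second paper lead to a similar grade?
reliability = pearson(first_test, second_test)

# Validity: do grades agree with an independent measure of the same skill?
# The value is negative here because a low grade number means a good result.
validity = pearson(first_test, test_scores)

print(f"objectivity (agreement rate): {objectivity:.2f}")
print(f"reliability (correlation):    {reliability:.2f}")
print(f"validity (correlation):       {validity:.2f}")
```

In this toy data, the two teachers agree on four of six papers, and the two papers correlate strongly but not perfectly. Real studies use larger samples and more refined statistics.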
Question 1: What do grades measure?
What exactly does a 2 in German mean? Behind the number is a wealth of individual achievements. Critics of numerical grades say that a grade does not tell us what a child is really capable of. The German grade includes reading comprehension, written expression, spelling and oral expression. Perhaps the student is excellent at phrasing, but has weaknesses in spelling? The overall grade, for which the average of several partial performances is calculated, compensates for such differences – and thus makes them invisible. In addition, there is the fundamental question of whether grades can capture the actual level of knowledge in a subject. To answer this question, educational researchers compare school grades with assessments from other sources, for example, with children’s performance on standardized tests. This was also done in the 2006 PISA study, which examined the scientific literacy of 15-year-old students.
A correlation was found between school grades and scientific competence: those who had good grades in biology, physics and chemistry also tended to achieve a higher score in the PISA test. However, this relationship was relatively weak. The authors of the 2006 German PISA study explain this by the fact that the PISA test and school grades capture different facets of performance. Report card grades, which are made up of classwork, tests and oral quizzes during the school year, tend to reflect short-term learning effects, often related to specific tests. The PISA test, on the other hand, primarily tests how lasting the learning is and how flexibly it can be applied.
Caution with comparisons
Grades make it possible to get an idea of a person’s performance and to compare people with each other without much effort. In practice, however, such comparisons are problematic. This is because the respective learning group is the reference value for assigning grades. Thus, grades do not reflect the objective level of achievement, but rather the ranking within a class. This is not altered by the fact that there is some correlation between grades and performance measured in standardized tests such as PISA.
The reference point is the class
The classic six-point grading scale is based on the assumption that aptitude and achievement follow a normal distribution: most of the class is in the average range, plus some very good students and some particularly poor students. This pattern should be reflected in the distribution of grades. This means that teachers have to include some particularly difficult tasks in tests, which only the best students can solve. School administrators and school authorities push teachers, to varying degrees, to take this scheme into account when grading.
Here a 2, there a 4
Because grading is oriented toward the normal distribution, a mediocre performance can lead to different grades in different classes: in a weak class, it might already get a 2, in a strong class only a 4. A comparison of grades is therefore only possible to a very limited extent. This is true for different classes in the same year group of a school, as well as for comparisons between schools – and even more so for the comparison of grades from different federal states, where teaching is also based on different curricula.
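The effect of using the class as the reference point can be shown with a short calculation. This is a minimal sketch under stated assumptions: the function `relative_grade`, the cut-off shares and both class score lists are hypothetical and chosen only for illustration. Real grading does not follow a fixed formula of this kind.

```python
def relative_grade(score, class_scores):
    """Grade (1 = best, 6 = worst) based only on rank within the class."""
    better = sum(1 for s in class_scores if s > score)
    share_better = better / len(class_scores)
    # Hypothetical cut-offs: upper bounds on the share of classmates
    # who scored better than the student being graded.
    cutoffs = [(0.10, 1), (0.30, 2), (0.60, 3), (0.80, 4), (0.95, 5)]
    for limit, grade in cutoffs:
        if share_better < limit:
            return grade
    return 6

# Invented test scores (points out of 100) for two classes of ten students.
weak_class = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
strong_class = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]

# The same performance of 60 points, graded relative to each class.
grade_in_weak_class = relative_grade(60, weak_class)
grade_in_strong_class = relative_grade(60, strong_class)

print(grade_in_weak_class)    # 2: only two classmates scored better
print(grade_in_strong_class)  # 4: seven classmates scored better
```

The identical score of 60 points earns a 2 in one class and a 4 in the other, purely because the comparison group differs.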
Question 2: Are grades objective?
Different teachers sometimes assess the same work differently. Studies have repeatedly shown this. In the case of German essays, this is perhaps unsurprising, and many scholars do consider their evaluation very subjective and difficult. However, studies have also found sometimes large differences in scores for supposedly objective material such as math problems and spelling (for an example, see Brügelmann and Backhaus, 2006). One explanation for this is that educators have a lot of leeway when it comes to grading. For example, most schools leave it up to individual teachers to decide how many points to award for a correct answer in an exam and how much to deduct from the total score for each incorrect answer. Although more and more schools are recognizing the problem and setting uniform grading standards, these only apply to written work.
The same problem arises, of course, in the evaluation of oral performance, where teachers usually have even greater leeway. This is because they can decide not only how to grade in a particular case, but also how many oral grades to collect. Thus, a student who gets several more chances to improve orally after a failed exam will probably end up with a better report card grade than a student who did not get that opportunity.
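How much the leeway in point schemes for written work can matter is easy to illustrate. In the following sketch, two teachers grade the same paper with 16 of 20 tasks answered correctly. Both point schemes and the grade thresholds in `to_grade` are hypothetical assumptions made for this example; they are not taken from any real school regulation.

```python
def percentage(correct, wrong, points_per_correct, deduction_per_wrong):
    """Share of the maximum score reached, floored at zero."""
    max_points = (correct + wrong) * points_per_correct
    points = correct * points_per_correct - wrong * deduction_per_wrong
    return max(points, 0) / max_points

def to_grade(share):
    """Map a share of the maximum score to a grade (hypothetical thresholds)."""
    thresholds = [(0.90, 1), (0.75, 2), (0.60, 3), (0.45, 4), (0.20, 5)]
    for minimum, grade in thresholds:
        if share >= minimum:
            return grade
    return 6

# Teacher 1 awards 2 points per correct answer and deducts nothing.
lenient = to_grade(percentage(16, 4, points_per_correct=2, deduction_per_wrong=0))

# Teacher 2 also awards 2 points, but deducts 1 point per wrong answer.
strict = to_grade(percentage(16, 4, points_per_correct=2, deduction_per_wrong=1))

print(lenient)  # 2: 32 of 40 points
print(strict)   # 3: 28 of 40 points
```

The paper is identical in both cases. Only the scoring rule differs, and with it the grade.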
What distorts judgments
In addition, human judgments are often influenced by unconscious psychological processes. For example, a teacher is very likely to give an average paper a better grade if he or she has just corrected several poor papers. The previous impression of a student can also influence the evaluation: if a child has so far written only great essays, the teacher may read a German exam with a mental bonus in mind, which can ultimately lead to a better grade. Such bias mechanisms have been well documented in psychological studies, and they are not limited to teachers correcting papers (for an overview, see, for example, Brügelmann & Backhaus, 2006, and Oelkers, 2001).
This text is published under the Creative Commons license "CC BY-NC-ND 3.0 DE – Attribution – Non-Commercial – No Derivative Works 3.0 Germany". Author: Barbara Kerbel for bpb.de
You may share this text provided you name the license CC BY-NC-ND 3.0 DE and the author(s).
Copyright information about pictures / graphics / videos can be found directly at the pictures.