Wednesday, May 20, 2015

What Makes a Test Good?

There are three essential factors that contribute to how good or bad a test is. These factors apply to all tests regardless of whether they are large-scale standardized tests or small custom tests created for a course. The three factors of good quality are validity, reliability, and fairness.

How do we know that these are the factors that are important? Well, three professional organizations came together to define formal criteria for developing tests and testing practices (among other things). The three organizations were the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME). These sponsoring organizations published a book entitled Standards for Educational and Psychological Testing (2014), referred to as the Standards. In the Standards, the three concepts that are defined as foundational for testing are validity, reliability, and fairness.

The authors fully acknowledge that many of the practices described in the Standards are not feasible in the classroom because, after all, instructors are not publishing their tests for public use. However, they do state that “the core expectations of validity, reliability/precision, and fairness should be considered in the development of such tests” (p. 183).

In academic settings, the main purpose of giving an exam is usually to determine a student’s level of mastery of the material and skills covered in a course. Although tests can be used for many other reasons (licensure, placement, progress monitoring), the focus here will be on measuring content mastery, with the ultimate goal of helping to assign grades.

Therefore, with the goal of content mastery in mind, the requirements of a test are quite simple. First, the test needs to include relevant material and skills from a course (validity). Second, it needs to be able to discriminate between those students who have acquired knowledge, skills, and abilities and those students who have not (validity). Third, it needs to be consistent and not award different grades to two people with the same knowledge (reliability). Finally, it needs to be equally accessible to all students enrolled in the course and free from bias (fairness). If a test meets these requirements, then an instructor has evidence that the test is good and defensible.
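For instructors who want to put rough numbers on the second and third requirements, the sketch below shows one possible way to do it. It is only an illustration, not a method taken from the Standards: it assumes each question is scored 0 or 1, uses Cronbach's alpha as a stand-in for consistency, and uses a simple upper-group-minus-lower-group index for discrimination. The function names, the sample scores, and the 50% group split are all hypothetical choices made for this example.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Estimate internal consistency from a students x items score table.

    scores: list of rows, one row per student, one score per item.
    """
    k = len(scores[0])  # number of items on the test
    # Variance of each item's scores across students
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of students' total scores
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def discrimination_index(scores, item, fraction=0.5):
    """Proportion correct on one item in the top-scoring group
    minus the proportion correct in the bottom-scoring group."""
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, int(len(ranked) * fraction))  # size of each comparison group
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


# Hypothetical data: 6 students, 4 items, 1 = correct, 0 = incorrect
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(round(cronbach_alpha(scores), 3))              # 0.741
print(round(discrimination_index(scores, 2), 3))     # 0.667
```

Under these assumptions, an alpha closer to 1 suggests the items rank students consistently, and a clearly positive discrimination value suggests that stronger students tend to answer that item correctly more often than weaker students. Neither number says anything about fairness or about whether the content matches the course, so both still need the instructor's judgment.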
