A common question that we receive from principals is, “I have my benchmarking test scores; how do I maximize insights from these tests to boost student performance?” When thinking about academic leadership in December (the transition month between the Formative Period and the Calibrating Period of the Learner Year), the focus should not just be on getting everyone safely into winter break, but also on taking stock of your current situation and making strategic plans to re-open school in January. This plan needs to account for both where your students are (your benchmarking scores) and where they need to be (end-of-year standardized testing).
We’ve created a three-stage plan template for second-semester academic leadership.
Stage 1 involves a self-assessment of the school’s progress in building proficient learners and preparing students for rigorous calibrating work in the second semester. We start this with an analysis of data points related to student progress beginning with a “red flag analysis” of state and district scores. We use the chart below to provide a graphic display of the various “proficiency” data points represented by the score reports.
| Test | Reading | Writing | Math | Science | Social Studies | Total Score |
|---|---|---|---|---|---|---|
| Best state report | | | | | | |
| Last year’s state report | | | | | | |
| First district report | | | | | | |
| Latest district report | | | | | | |
| Goals for this year’s test | | | | | | |
Analysis of this chart begins with the “red flag” analysis of benchmarking scores.
Use a red highlighter to identify scores that were below the state average or inconsistent with state expectations (on the district tests). Multiple red flags in any tested area indicate a “priority concern” for academic planners.
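For leadership teams that keep their score reports in a spreadsheet, the same red-flag pass can also be done programmatically. The sketch below is a minimal, hypothetical illustration in Python: the subjects and scores are invented, and the comparison rule (school score below the state average) is the one described above.

```python
# Hypothetical "red flag" pass: flag any subject where the school score
# fell below the state average. All numbers are illustrative.

state_average = {"Reading": 52, "Writing": 48, "Math": 55}
school_score  = {"Reading": 47, "Writing": 50, "Math": 51}

# A subject is a red flag when the school scored below the state average.
red_flags = [subject for subject in school_score
             if school_score[subject] < state_average[subject]]

print("Red flags:", red_flags)  # subjects that need a closer look
```

The same comparison can be repeated against the district reports, and any subject flagged more than once becomes a “priority concern.”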
Stage 1 concludes with an analysis of the academic performance goals (if any) set at the beginning of the year, and a review/revision of those goals in terms of current data.
Stage 2 involves “making meaning” from the different analyses included in Stage 1. This begins with a “data dive” into the state and district reports. We are still gathering descriptive data that will help move us toward an intentional plan for the second semester. The chart below is one of the charts we use to make meaning, and we use it twice: once for the latest state report and once for the latest district report (if available).
| Question | Our Analysis Shows | Priority |
|---|---|---|
| Is there a discrepancy between grade levels? | | |
| Is there a discrepancy among teachers? | | |
| Is there a discrepancy among programs? | | |
| Are any of the scores a big surprise? | | |
| Were any of these scores addressed in our school SIP? | | |
This chart moves one step closer to understanding what the test scores we’re dealing with actually mean for academic planners. It identifies possible “hotspots” that academic leaders need to review before the end of December.
Another “making meaning” activity involves quantifying assessment data. Use the chart below as a quantifying activity for school leadership teams. It is designed to turn general scores into the numbers they describe. This begins with a quantification of the state report:
| 2022 Test Report | Reading | Writing | Math | Science | Social Studies | Total Score |
|---|---|---|---|---|---|---|
| Total number tested | | | | | | |
| Number needed to reach state goal | | | | | | |
This is, of course, data that is a year old. If we use it as our only driver for planning, it is like driving down the interstate by only looking into the rearview mirror. We need to take this one step farther and look at any district or school data that describes student proficiencies in terms of state expectations.
When district or school data is available, we can continue the analysis with the most current benchmark assessment scores. This chart is designed to help us quantify where we really are in terms of readiness for state assessments.
| Report Used – | Reading | Writing | Math | Science | Social Studies | Total Score |
|---|---|---|---|---|---|---|
| Total number tested | | | | | | |
| State percent proficient expectation | | | | | | |
| Status – percent above (+) or below (-) | | | | | | |
| Number of students needed to be moved to meet or exceed state expectations | | | | | | |
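The arithmetic behind the last two rows of this chart can be sketched in a few lines. The example below is hypothetical: the counts and the 55% state expectation are invented for illustration, not drawn from any real report.

```python
import math

# Hypothetical example: quantifying the gap between current proficiency
# and the state expectation for one subject. All numbers are illustrative.

total_tested = 120        # total number of students tested
proficient_now = 48       # students currently scoring proficient
state_expectation = 0.55  # state expects 55% proficient

percent_now = proficient_now / total_tested       # current proficiency rate
status = (percent_now - state_expectation) * 100  # percent above (+) or below (-)

# Students who must move to proficient to meet the state expectation.
# math.ceil because you cannot move a fraction of a student.
students_needed = max(0, math.ceil(state_expectation * total_tested) - proficient_now)

print(f"Current proficiency: {percent_now:.0%}")
print(f"Status vs. state expectation: {status:+.0f} points")
print(f"Students needed to move: {students_needed}")
```

Framing the gap as a count of real students (here, 18 of 120) rather than a percentage is what makes the second-semester task concrete for planners.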
We can continue our data dive by providing a disaggregation of the numbers into target areas. The chart included below is one example of the tools Ed Directions uses for this step.
| Test | Percent/number proficient plus | Percent/number barely proficient | Percent/number barely below | Percent/number well below |
|---|---|---|---|---|
| | | | | |

Barely proficient = one or two questions above the cut score; barely below = one or two questions below the cut score.
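Under the band definitions above (within one or two questions of the cut score on either side), the disaggregation is a simple sorting pass. Everything in the sketch below, including the cut score, the student scores, and the choice to count a student exactly at the cut as “barely proficient,” is an invented illustration.

```python
# Hypothetical disaggregation into the four bands used in the chart above,
# based on how far each student's score sits from the proficiency cut score.
# The cut score and student scores are made up for illustration.

cut_score = 30  # questions correct needed to be "proficient"
student_scores = [34, 31, 29, 29, 28, 25, 32, 30, 29, 21]

bands = {"proficient plus": 0, "barely proficient": 0,
         "barely below": 0, "well below": 0}

for score in student_scores:
    diff = score - cut_score
    if diff > 2:
        bands["proficient plus"] += 1    # comfortably above the cut
    elif diff >= 0:
        bands["barely proficient"] += 1  # at the cut, or 1-2 questions above
    elif diff >= -2:
        bands["barely below"] += 1       # 1-2 questions below the cut
    else:
        bands["well below"] += 1

for band, count in bands.items():
    print(f"{band}: {count} ({count / len(student_scores):.0%})")
```

A large “barely below” band, as in the school described next, signals a very different planning problem than a large “well below” band.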
This chart has several uses. It can further quantify our task by assessing the strength of our proficient scores, and it can help us assess the difficulty of reaching our goal. In one school, the principal was concerned that he needed immediate help in reading (his reading score was in the low teens) and was ready to ditch his reading program. When he used this probe, he found that he did, in fact, have few proficient students, and many of them were barely proficient. But, interestingly, he also found that 80% of his students were just barely below the proficient cut score. When we moved on to the next stage and probed this data, we found that they were below the cut score because none of the students had provided adequate answers on the highly weighted open-response questions.
This “next stage” differs from state to state and depends on the nature of the data given by the state in their specifications for assessment and in the school/district reports that describe the state analysis of performance.
Stage 3 addresses the question of “why are we here?” It begins the transition to planning that is student-based rather than numbers-based. It involves speculation into why we are where we are and begins the process of prioritizing our interventions for the second semester. In this stage we look at different ways we can define the “why” behind the scores. The first step of this analysis involves looking at the areas of concern (red flags) that we found in our analysis of the state and district reports. Here is an example of how to lay out this form.
| Red flag | Our “why” thoughts | Priority |
|---|---|---|
| Low reading scores | | |
| Test item difficulty | | |
Low reading scores – significant numbers of students 1 or 2 questions below the cut score

Test item difficulty – difficulty with one of the types of test questions:
- multiple choice
- open response
- real-world application
- technology enhanced questions
Venue difficulty – problems with the venue
- paper and pencil test
- computer-based test
- hybrid test
Genre difficulty – genre is spelled out differently in each state, but all states include a number of:
- fiction selections
- informational selections
- real-world selections
Duration difficulty – difficulty with the length and duration of the test in terms of student experience
- length and complexity of the questions
- length of the reading selections
- the length and complexity of the sentence/paragraph instructions
- the amount of time allocated for the test
- the number of types of test items/genres/venues included
At this point, the school probably has enough data in hand to begin developing an SIP for the second semester that is student-focused rather than score-focused. The school can define where it is in multiple ways and form initial thoughts on the “whys” behind the scores.