My most recent post on this blog describes how the State Department of Education has lowered the standard for passing LEAP and EOC to appallingly low levels. This was done primarily to show false progress in improving student achievement in ELA and math. My post showed that this so-called progress is contradicted by the NAEP tests. But it was also done because the tests are much too difficult for our students. Now a study just released by NEPC (the National Education Policy Center) provides more than adequate cause to trash these tests before they do more damage.
The NEPC study, titled "A Consumer’s Guide to Testing under the Every Student Succeeds Act (ESSA): What Can the Common Core and Other ESSA Assessments Tell Us?", is a long technical report that would be extremely difficult for any layperson to comprehend. In addition, it was obviously written in such a way as to avoid offending the powers that keep our school systems chained to the Common Core standards. But buried in the report are some shocking findings, if you look hard enough to find them.
The most shocking finding of the NEPC report does not appear in the summary of findings, even though it should have been included there as one of the most significant concerns. This finding is about the validity of the PARCC tests, upon which at least 90% of our Louisiana tests are based, and it is just casually mentioned on page 44 of the 60-page report. It reads as follows:
Item analysis results from the first operational testing of PARCC show that test items across grades, subject areas, and modes of testing are extremely difficult for targeted students by grade. On ELA/L items the median proportions of students who were able to answer items correctly ranged from 37%-47% only in 2015-16. On math items those median proportions ranged from 22%-55% only (PARCC-Pearson, 2017, pp 64-65). Users and test-makers should examine the possible causes for this finding, as that report uses data collected five years after reform implementation began. Discussions on how best to align CCSS reform implementation in schools with the testing schedule under ESSA need to occur immediately, so as to optimize results with higher levels of content-based validity.
Notice that this analysis recommends immediate discussion of changes if the PARCC tests are to be brought to some valid level. So why are so many states continuing to use a group of invalid tests? Remember, in Louisiana, these tests are used to grade schools and count for 35% of each teacher's evaluation.
What this finding means is that nationwide, the typical question on these tests was answered correctly by well under half of the students, and on some math tests by fewer than a quarter. This is generally the same conclusion I reached when I reviewed the raw scores of our Louisiana students. As I pointed out in the previous post, such low performance and such low cut scores make outright guessing a major factor in the test results. If students can come close to passing the tests just by making random guesses, then the test is measuring very little of the student's real knowledge.
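For readers who want to see the arithmetic behind the guessing problem, here is a rough sketch in Python. Every number in it is made up for illustration (a 40-question test, four answer choices per question, a passing mark of 14 correct); none of it comes from LEAP, PARCC, or the report, and real tests also include questions that are not multiple choice. The function names are my own.

```python
from math import comb


def expected_guess_fraction(num_choices: int) -> float:
    """Average share of questions a pure guesser gets right."""
    return 1.0 / num_choices


def guess_pass_probability(num_items: int, num_choices: int, cut_correct: int) -> float:
    """Chance that pure random guessing yields at least cut_correct right answers.

    Treats every question as an independent multiple-choice item with
    num_choices equally likely options (a simplifying assumption).
    """
    p = 1.0 / num_choices
    return sum(
        comb(num_items, k) * p**k * (1.0 - p) ** (num_items - k)
        for k in range(cut_correct, num_items + 1)
    )


if __name__ == "__main__":
    # Hypothetical test: 40 questions, 4 choices each, 14 correct to pass.
    print(f"Expected share correct by guessing: {expected_guess_fraction(4):.0%}")
    print(f"Chance of passing by guessing alone: {guess_pass_probability(40, 4, 14):.1%}")
```

The lower the passing mark sits relative to what guessing alone produces, the larger the second number becomes, and the less the score tells us about what the student actually knows.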
There is another important factor that some teachers have brought to my attention that may be invalidating our state tests. Teachers are observing that a significant number of students are so discouraged by the difficulty of the tests that they often turn in their test papers with most of the questions left unanswered. Some kids just quit trying after attempting only a few questions. These test results are totally invalid, in my opinion. Yet in Louisiana, a student who answers none of the questions on a LEAP test gets a score of 625 out of 850. Such a score is meaningless and extremely misleading.
On page 10 of the report, James Harvey, the Executive Director of the National Superintendents Roundtable, states the following:
The entire education community has been bamboozled by the testing companies and our fake state and national education reformers into adopting a set of standards that are totally inappropriate for our students and that allow these "priests" to tell us what it all means since it is assumed we are not qualified to make such interpretations.