Tuesday, November 3, 2015

Louisiana's PARCC-like Tests Not Compatible


The following is a guest post by math teacher and band director, Herb Bassett. Herb asked me to include this special note to our readers:

I have concerns that Louisiana's PARCC ELA and Math test results are suspiciously high. In this blog entry I compare five states' PARCC results with their performance on the NAEP, the gold standard of cross-state education comparisons, to build the case that our PARCC scores are not compatible with those of other states and are internally inconsistent as an indicator of being on track for college or careers. The main selling points of the PARCC tests were that the results could be compared across states and that there would be a universal measure of readiness for college or careers.

The PARCC is a new test. All of the other PARCC states administered a version of the test developed by the test company Pearson. Louisiana, however, created its own version of the test through the test company DRC, and Louisiana was ultimately responsible for guaranteeing that the grading would be consistent with the other states.

The discrepancies with scores in other states and the internal inconsistency of our own results raise important questions about whether or not the results are believable and meaningful.

Herb Bassett 

State Superintendent John White's October 27, 2015 Superintendent's message linked to this quote:

"NAEP’s definition of readiness for the next level of education is “Proficient,” the second highest level on the test. Louisiana has aligned its definition of readiness with NAEP’s by designating “Mastery,” also the second highest level as indicative of readiness. "

White's rhetoric does not bear up under scrutiny: Louisiana's "PARCC" ELA Mastery (level four proficiency) rate is actually not at all in line with its NAEP Reading proficiency rate:

This year, fourth and eighth grade students took both the PARCC tests and the NAEP. The National Assessment of Educational Progress (NAEP) ELA and Math tests have been administered nationally to a sampling of fourth, eighth and twelfth grade students every other year since at least the early 1970s. The NAEP is used to track long-term trends in the states and the nation.

The comparison between Louisiana's NAEP and "PARCC" results, when referenced to the performance of other states, raises questions about the accuracy of Louisiana's "PARCC" tests.


Louisiana's "PARCC" ELA level four proficiency rates are shown by the solid red line above. The endpoints of the dotted red line below it are Louisiana's NAEP Reading proficiency rates. Louisiana's "PARCC"-to-NAEP gaps are the widest, eleven and seventeen points. New Jersey has the next widest gaps, but its higher scores are not directly comparable to Louisiana's.

Ohio's, Illinois', and New Mexico's PARCC ELA and NAEP Reading proficiency rates all align within four points. Louisiana's fourth and eighth grade PARCC proficiency rates equal or exceed those of Ohio and Illinois, so why is Louisiana so far below those states on the NAEP? 

In other states, a PARCC ELA score of proficient means that a child is very likely to score proficient on the NAEP Reading test. In Louisiana, a "PARCC" ELA score of proficient means that a child has at best a two-out-of-three chance of scoring proficient on the NAEP Reading test.
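The "at best" in that estimate is an upper bound, and the arithmetic behind it can be sketched as follows. This is only an illustration: the function name is made up for this example, the rates in the usage lines are hypothetical placeholders rather than the figures from the chart, and it treats the NAEP sample rate as if it applied to the same group of students who took the PARCC.

```python
def max_chance_naep_proficient(parcc_rate, naep_rate):
    """Upper bound on the chance that a PARCC-proficient child is NAEP-proficient.

    Best case: every NAEP-proficient student is also PARCC-proficient,
    so the chance cannot exceed naep_rate / parcc_rate (capped at 1).
    Both rates are percentages of the same student population (an assumption).
    """
    if parcc_rate <= 0:
        raise ValueError("parcc_rate must be positive")
    return min(1.0, naep_rate / parcc_rate)


# Hypothetical rates in percent, NOT the actual chart values.
print(round(max_chance_naep_proficient(36, 24), 2))  # 0.67, a wide gap
print(round(max_chance_naep_proficient(36, 33), 2))  # 0.92, a narrow gap
```

The wider the gap between the two proficiency rates, the lower this ceiling falls, no matter how the individual students overlap.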

So, proficient on Louisiana's "PARCC" does not mean the same thing as proficient on the PARCC in Ohio, Illinois, and New Mexico, and it does not align with proficient on the NAEP.

This misalignment renders the first two promises for our new "PARCC" tests empty:

1. The "PARCC" test will let us compare our results with other states. 
2. We will raise the bar to "level four" instead of "level three" for the students. 
3. The "PARCC" test will show if students are on track for college or careers. 

Louisiana's "PARCC" fails a common-sense test on the last promise as well.

If PARCC reliably indicates being on track for college and careers, grade-to-grade movement would be small; if thirty-six percent of fourth graders are indicated to be on track for college or careers, it makes sense that about the same percentage of fifth graders would be on track and so on. 

Ohio, Illinois, New Mexico, and New Jersey (grade four and after) show gradual movement across grades on the PARCC. Proficiency rates are stable within four points per grade. 

Louisiana's grade-to-grade up-and-down swings are wider and mostly go against the grain compared to Ohio, Illinois, New Mexico and New Jersey. Louisiana's instability raises yet more questions about its "PARCC" tests and how its scores were computed.
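The stability check described above (proficiency rates moving no more than four points from one grade to the next) can be written as a short sketch. The function names and both sets of rates below are hypothetical and chosen only to show the calculation; they are not the charted values for any state.

```python
def grade_to_grade_swings(rates):
    """Change in proficiency rate (percentage points) between adjacent grades."""
    return [later - earlier for earlier, later in zip(rates, rates[1:])]


def is_stable(rates, tolerance=4):
    """True if no grade-to-grade change exceeds the tolerance in either direction."""
    return all(abs(swing) <= tolerance for swing in grade_to_grade_swings(rates))


# Hypothetical proficiency rates for grades 3 through 8, NOT real data.
steady = [36, 38, 35, 37, 39, 38]
jumpy = [36, 30, 41, 33, 22, 32]

print(grade_to_grade_swings(steady), is_stable(steady))  # [2, -3, 2, 2, -1] True
print(grade_to_grade_swings(jumpy), is_stable(jumpy))    # [-6, 11, -8, -11, 10] False
```

The four-point tolerance is the figure observed for the other states in this post; a different threshold can be passed in to see how sensitive the conclusion is.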

The other PARCC states administered tests prepared by the testing company Pearson. Louisiana's test was acquired through a different testing company, DRC.

According to LDOE, Louisiana's tests included items developed through the PARCC process, but the test form was developed by the Louisiana Educator Leader Cadre and Department staff. This also left either LDOE or DRC to equate Louisiana's form to others administered in other PARCC states. (This is why I have distinguished Louisiana's test with the use of quotation marks.)

The PARCC Consortium set the cut scores for the five achievement levels and Louisiana adopted those cut scores; they are points in the scaled scores (which range from 650 to 850) that divide the scores into the five performance categories. However, there is another important step in shaping the final results.

Raw scores - the number of items a student answered correctly - must be converted to scaled scores through test equating. Even with the same cut scores, the "man behind the curtain" is the test equating process whereby the difficulty of the questions is taken into account and the numbers of correct answers required to achieve the different cut scores are set into a transformation table. Different transformation tables are made for different forms of the test.

The transformation tables could significantly affect the final results without altering the cut scores.
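A toy example may make the mechanism concrete. Everything in it is hypothetical: a ten-item test, two invented transformation tables, an invented set of raw scores, and an illustrative level four cut of 750 on the 650 to 850 scale. It does not represent the actual tables used by LDOE, DRC, or Pearson; it only shows that the same raw scores and the same cut score can produce different proficiency rates under different tables.

```python
LEVEL_4_CUT = 750  # illustrative cut score on the 650-850 scale (assumption)

# Hypothetical raw-to-scaled transformation tables for two forms of a 10-item test.
FORM_A = {0: 650, 1: 670, 2: 690, 3: 705, 4: 720, 5: 735,
          6: 750, 7: 770, 8: 790, 9: 820, 10: 850}
FORM_B = {0: 650, 1: 675, 2: 700, 3: 720, 4: 735, 5: 750,
          6: 765, 7: 780, 8: 800, 9: 825, 10: 850}


def percent_at_or_above(raw_scores, table, cut=LEVEL_4_CUT):
    """Percent of students whose scaled score reaches the cut under a given table."""
    hits = sum(1 for raw in raw_scores if table[raw] >= cut)
    return 100 * hits / len(raw_scores)


# Ten hypothetical students' raw scores (number of items answered correctly).
raw_scores = [3, 4, 5, 5, 5, 6, 6, 7, 8, 9]

print(percent_at_or_above(raw_scores, FORM_A))  # 50.0 (needs 6 correct)
print(percent_at_or_above(raw_scores, FORM_B))  # 80.0 (needs 5 correct)
```

A table like the second one would be appropriate if that form's questions really were harder. The point is that the judgment of difficulty is made inside the equating process, where a one-item difference can move the proficiency rate substantially.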

LDOE and/or DRC had considerable control over the test equating process. Louisiana likely had greater control over the final results than the states that administered the Pearson-created PARCC test. 

Louisiana's test making and equating process has yielded "PARCC" results higher than would be expected when referenced to the NAEP and compared to other PARCC states.

In Math, Louisiana's seventh-to-eighth-grade "PARCC" proficiency rate spike is puzzling. It is inconsistent with its own NAEP trend and the PARCC results of Ohio, New Mexico and New Jersey. (New Jersey DOE says the low eighth grade PARCC proficiency is due to high performing students being allowed to take the Algebra PARCC test instead of the eighth grade test.)


All of the other states' proficiency rates in Math are lower on the PARCC than on the NAEP; Louisiana's rates are the opposite. We could conclude that the PARCC Math test was harder than the NAEP in the other states, but the "PARCC" was easier than the NAEP in Louisiana. Why would this be?

Are we to believe that only 22 percent of Louisiana's seventh graders are on track for college or careers, and then suddenly 32 percent of its eighth graders are? Do we really believe "PARCC" results that say that Louisiana sends more eighth graders to high school on track for college and careers than Ohio and Illinois, even though Louisiana's NAEP proficiency rate is barely half theirs? Does Louisiana really have three times as many eighth graders on track for college or careers as New Mexico even though New Mexico has a higher eighth grade NAEP Math proficiency rate?

In the above charts, the other states very credibly compare to each other in relative performance on the PARCC and NAEP. Louisiana's comparative results are completely inconsistent with theirs. 

Louisiana's "PARCC" results simply are not comparable to those of other states. Louisiana's "PARCC" scores are unbelievably high when referenced to the NAEP in both the ELA/Reading and Math comparisons.

How then does Louisiana's "PARCC" compare to its previous LEAP and iLEAP?

Louisiana's "PARCC" level four is a lower bar than the previous LEAP/iLEAP level four (Mastery).

Below are the 2015 "PARCC" ELA results by grade compared to the 2014 LEAP/iLEAP. The charts show the percent of students attaining each achievement level. The bottom of green (M) indicates the minimum percentile rank of students achieving the new definition of proficiency, level four. 


At every grade level in ELA, more students achieved Mastery or above on the "PARCC" (2015) than previously did on the LEAP/iLEAP (2014) (examine the bottom of green (M)). The shorter blue bands (A), however, show that the "PARCC" is more selective than the LEAP/iLEAP for level 5. 

We should assume that actual student performance was stable between the two years and that the dramatic shift is a result of the tests not aligning rather than a sudden jolt of improvement in student performance. No attempt was made to statistically equate the "PARCC" to the LEAP/iLEAP.

John White stated that the bar would be raised from level three (Basic) to level four (Mastery) as the definition of proficiency by 2025. He did not mention lowering the bar for level four.

In eighth grade Math, the requirement for level 4 (Mastery) has also been lowered, greatly increasing the percentage of students scoring Mastery or higher. Meanwhile the requirement for level 3 (Basic) has been raised in each grade: 



To summarize, Louisiana created its own form of the PARCC test and was responsible for the test equating. The promises made for the new tests have all been broken somehow in that process:

1. The results are out of line with other states when using the NAEP as a reference point. 

2. Level 4 on the "PARCC" has been set to be less selective than Mastery on the former LEAP and iLEAP tests, and it is less selective than NAEP proficiency, resulting in a lower-than-implied bar for the new 2025 proficiency standard.

3. "PARCC" grade-to-grade fluctuations yield an inconsistent measure of college or career readiness. 

The comparisons raise serious questions about how Louisiana's tests were made and how Louisiana's raw scores were converted into scaled scores. Results that are unreasonable and not corroborated by other measures do not inspire confidence.

The process of test equating needs close scrutiny; due to the legislative requirement that the 2015-16 test be less than fifty percent PARCC items, we can expect LDOE to go through the equating process again and it may once more create results that bear little relation to either our testing history or the results in other states. 


Herb Bassett, Grayson, LA
hbassett71435@gmail.com

Note:

Data for the charts came from the Ohio, Illinois, New Mexico and Louisiana Departments of Education websites. New Jersey data is from nj.com. Ohio and Illinois results are preliminary and include only the online version which was taken by about two-thirds of their students. However, Ohio DOE has stated that it expects the pencil and paper versions to yield comparable results.