Wednesday, September 3, 2014

New Revelations About LEAP and NAEP

The LA State Dept. of Education yesterday attempted to explain the setting of cut scores on LEAP. There is a section of the September 2 LDOE newsletter addressing the testing issue. Readers may remember that this blog demonstrated that the percentage of correct answers needed for a rating of basic in 2014 dropped drastically in 3 out of 4 categories in ELA and Math compared to 2013.

According to the LDOE, the scale scores for a rating of Basic have not changed since they were created. It still takes a scale score of 301 to get a rating of Basic in 4th grade ELA, and it still takes a score of 321 to get a rating of Basic in 8th grade Math in 2014, just like it did in 2013. Even though the tests were made more difficult by going to the more rigorous Common Core aligned questions, the scale cut scores remained the same, according to LDOE. But they also admit that the percentage of correct answers equating to those cut scores can change from year to year because of a process used by the testing company called “test equating.” That's an adjustment made mostly to ensure that students are neither penalized nor rewarded when a new test form turns out to be harder or easier.

In my blog post of August 17, I suggested that the LDOE's statement that the percentage of Louisiana students scoring Basic had remained steady even though the tests had gotten harder was misleading. My point was that the “test equating” process had adjusted the minimum raw percentage scores to ensure that the percentage of students scoring Basic remained steady. It was a rigged result, and no one really knows whether our students' learning remained steady from 2013 to 2014.

But the LDOE description of the LEAP design process includes another statement that, in my opinion, further destroys their credibility in approving LEAP cut scores. Here is the statement that concerns me, from the LDOE website:

“The scaled scores and cut points for LEAP – what it takes to earn Basic, Mastery, Advanced – were set in 1999 when Louisiana first created the LEAP assessments; the scaled score ranges for iLEAP were set in 2006. To ensure rigorous achievement levels, Louisiana set these cut scores using the National Assessment of Educational Progress (NAEP) as guidance. Thus, Basic on LEAP roughly equates to Basic on NAEP and Mastery on LEAP roughly equates to Proficient on NAEP.”

I decided to check whether the LEAP scores really do equate to the NAEP, and also to study the trend in NAEP scores for Louisiana compared to LEAP over a period of years. There were a few problems to overcome, however. Since John White came in as State Superintendent, much of the data from previous years has disappeared from the Louisiana Believes website. Also, the NAEP provides only reading and math scores instead of ELA and math, and its data only goes to 2013. So I got LEAP data as far back as I could and used the reading score on NAEP to compare to the ELA score on the LEAP. Here are the results of my comparison of the percentage of students attaining Basic or above on LEAP and NAEP for 2005 and 2013:

2005 LEAP Results compared to NAEP (percentage of students at Basic or above)

4th grade ELA – LEAP – 66.9%     4th grade reading – NAEP – 53%

4th grade math – LEAP – 63.4%    4th grade math – NAEP – 74%

8th grade ELA – LEAP – 53.2%     8th grade reading – NAEP – 64%

8th grade math – LEAP – 54.9%    8th grade math – NAEP – 59%

2013 LEAP Results compared to NAEP (percentage of students at Basic or above)

4th grade ELA – LEAP – 77%       4th grade reading – NAEP – 56%

4th grade math – LEAP – 71%      4th grade math – NAEP – 75%

8th grade ELA – LEAP – 69%       8th grade reading – NAEP – 68%

8th grade math – LEAP – 66%      8th grade math – NAEP – 64%

It does not look to me like LEAP and NAEP scores are very comparable. If we compare LEAP scores to NAEP in 2005, we find that the 4th grade ELA LEAP percentage is a lot higher than the 4th grade reading NAEP percentage, but the 8th grade ELA LEAP percentage is significantly lower than the 8th grade reading NAEP percentage.

Then looking at math in 2005, we find that the 4th grade LEAP math percentage attaining basic or above is a lot lower than the NAEP percentage and the 8th grade LEAP math is pretty close to the NAEP math.
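For readers who want to see the size of these gaps, here is a minimal sketch in Python. It uses only the percentages listed above; the names (categories, leap, naep, gaps) are my own illustration and not anything published by LDOE or NAEP. A positive number means LEAP shows more students at Basic or above than NAEP does.

```python
# Percent of students at Basic or above, copied from the lists above.
# Order: 4th ELA/reading, 4th math, 8th ELA/reading, 8th math.
categories = ["4th ELA/reading", "4th math", "8th ELA/reading", "8th math"]
leap = {2005: [66.9, 63.4, 53.2, 54.9], 2013: [77.0, 71.0, 69.0, 66.0]}
naep = {2005: [53.0, 74.0, 64.0, 59.0], 2013: [56.0, 75.0, 68.0, 64.0]}


def gaps(year):
    """LEAP minus NAEP, in percentage points, for each category."""
    return [round(l - n, 1) for l, n in zip(leap[year], naep[year])]


for year in (2005, 2013):
    print(year)
    for name, gap in zip(categories, gaps(year)):
        print(f"  {name}: {gap:+.1f} points")
```

If Basic on LEAP really did equate to Basic on NAEP, I would expect these gaps to sit near zero and to point in the same direction across categories.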

But look at the change in LEAP and NAEP results for Basic from 2005 to 2013. The LEAP percentages in both ELA and math went up pretty dramatically during that 8-year period, but the NAEP percentages went up only a little. The average increase in LEAP in the 4 categories reviewed above was about 11 percentage points, but the average increase in NAEP was only about 3.25 percentage points. Since both tests are supposed to measure the same thing, and the LDOE has told us that Basic on LEAP equates to Basic on NAEP, both sets of scores should have gone up by roughly the same amount.
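Anyone who wants to check my arithmetic can do so with a few lines of Python. This is only a sketch using the eight percentages listed above; the variable and function names are my own and are just for illustration.

```python
# Percent of students at Basic or above, copied from the 2005 and 2013 lists above.
# Order: 4th ELA/reading, 4th math, 8th ELA/reading, 8th math.
leap_2005 = [66.9, 63.4, 53.2, 54.9]
leap_2013 = [77.0, 71.0, 69.0, 66.0]
naep_2005 = [53.0, 74.0, 64.0, 59.0]
naep_2013 = [56.0, 75.0, 68.0, 64.0]


def average_gain(start, end):
    """Mean change in percentage points across the four categories."""
    gains = [e - s for s, e in zip(start, end)]
    return sum(gains) / len(gains)


leap_gain = round(average_gain(leap_2005, leap_2013), 2)
naep_gain = round(average_gain(naep_2005, naep_2013), 2)
print(f"LEAP average gain: {leap_gain} percentage points")
print(f"NAEP average gain: {naep_gain} percentage points")
```

The individual LEAP gains work out to 10.1, 7.6, 15.8 and 11.1 points, and the NAEP gains to 3, 1, 4 and 5 points, which is where the averages of roughly 11 and 3.25 come from.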

So now in addition to obvious test score manipulation, we have test score inflation on LEAP!

Our LDOE (or the testing company) has artificially inflated the percentage of students passing the LEAP to make it look like the gut-wrenching deforms to our public education system and the obsessive teaching to the tests have all been worth it. The truth is that if we use the NAEP as a more objective measure of our real progress, Louisiana has gained an average of only 3.25 percentage points for students reaching Basic. Not only that, but by NAEP comparisons the gap between Louisiana and the national average has widened slightly.

So in the last 10 years, Louisiana has spent millions upon millions of dollars on a testing system that manipulated and inflated scores while our students have lost ground in comparison to the rest of the states. In addition, we have lied to the rest of the nation, telling everyone that our students in the Louisiana Recovery School District have made tremendous progress in attaining grade level performance, when that measurement of grade level (Basic on LEAP) was inflated to the point of making a mockery of the Louisiana accountability system.

It pains me as a public education advocate to point out these false claims of test-measured progress in our Louisiana educational system, but the truth always eventually comes out. We should always base our education policies on the truth.

I believe that all this emphasis on testing is wrong because scores are manipulated and because these so-called standardized tests are not a fair and accurate measure of our students and our schools in the first place. In my opinion, all that time our teachers were forced to spend drilling our students for the LEAP could have been much better spent teaching kids how to enjoy and appreciate reading, math, science, history and music. We are killing all the joy of learning for both students and teachers, and we have almost nothing to show for it.


lbarrios said...

These distorted test scores are used to fail teachers and schools, distribute state and federal tax dollars and to promote charters and vouchers. That should mean that misrepresenting them and misdirecting dollars is a criminal offense. John White will continue to spin his story. When will John White be indicted?

Jeff Sadow said...

Unfortunately for your purposes, what you present here doesn't come close to demonstrating your hyperventilated claims. Essentially what you are arguing is that by reviewing a pair of data points for each kind of exam you can extrapolate changes in each, where significantly different changes between the two categories of tests indicate a lack of reliability across time for one of the two, which you then attribute to allegedly malign forces.

There are several problems with this approach, both at an instrumental and theoretical level. To start, you use only two data points and assume all change is uniform across the several data points you exclude. This is an extremely insensitive and crude method from which to draw inferences, and always raises the question of the decision rule used to choose 2005 (2013 as the latest seems logical) and whether it came randomly, or for convenience, or as a result of trying to manipulate the results. The point is that we have no way of knowing whether this is a representative data point.

Much better would be to get all data points between 2005-2013 (even better, go back to 1999) and then perform an analysis of variance. There are a couple of ways to do it; I would recommend taking the difference for each year for each kind of test between the LEAP and NAEP and then seeing whether the variance fluctuates more than randomly (in fact, I think theoretically you could justify running all four comparison sets together to get a very healthy number of data points). If non-randomness is discovered, your thesis is verified only if, after you do further analysis of the residuals, there is an increase in these values (LEAP minus NAEP) over time, which can be checked by computing a statistic for autocorrelation.

There also is an unsustainable assumption that you make here, if there is non-randomness in differences and residuals over time -- that it is a reliability problem with LEAP. What if it's actually caused by a problem with NAEP, or even both? The only way to test for this is to compute a reliability statistic for each of the exams (meaning test items over time) and even then you can only say one is more or less reliable than the other, unless you assign a cutoff value to designate either reliable or unreliable. The problem with your a priori assumption is that it depends solely on subjective ideological considerations -- being as we have zero evidence of manipulation for both LEAP and NAEP -- so even if we found non-randomness, with no certainty could we buy your explanation for it.

In short, what you have done here gets about 1 percent of the way to not disconfirming your overwrought thesis, and in no way provides any objective support for it.