Thursday, January 7, 2016

Bassett Examines Progress Point Influence on School Performance Scores

Why question School Performance Scores?

The School and District Performance Scores, along with their associated letter grades, were released without fanfare just as schools dismissed for the Christmas holiday. Potentially controversial news is often held until a holiday.

Despite the letter grades being curved to maintain a constant grade distribution, well over one-third of schools and almost one-fourth of districts moved up or down a letter grade.

I would like to explain to your readers how the LDOE designed this churn into the accountability system. The promulgated rules for Progress Points magnify small changes in performance into large point jumps. A computer model that produces highly erratic year-to-year outcomes determines the winners and losers of a Hunger-Games style competition for the Progress Points.

This churn-by-design perpetuates an unjustified sense of crisis about school performance. When your local school drops a letter grade, you are concerned. The churn means that, if not this year, sometime within the next few years you will almost certainly experience the concern. This turmoil provides a ready environment into which LDOE can plant its next untested reform. Progress Points fertilize the turmoil.

After careful study, my opinion is that they are complete natural fertilizer.

Herb Bassett

Progress Points introduce unnecessary volatility to school performance measures 

The 2015 School and District Performance Scores were released later than usual this year, just before the Christmas holiday. An important feature of those scores in the John White era is Progress Points, first known as Bonus Points when introduced in 2013. Progress Points are proving to be a major source of year-to-year volatility in the performance scores.

In theory, Progress Points identify and reward schools and districts that excel at the job of improving the performance of struggling students. In reality, they fail at that task and randomly shift the letter grades assigned to schools and districts, causing turmoil and giving a false impression that we have found ways to break the relationship between scores and poverty.

The LDOE curved the School Letter Grades this year to hold the overall distribution of letter grades constant. This presented a calm surface, but underneath there was tremendous churn. The high school letter grades rose overall through eased Progress Points requirements. Apparently to balance this, there was a slight downward trend in the Elementary/Middle school letter grades. Over one-third of those schools changed letter grades from 2014 to 2015, with 145 going up and 238 going down.

Of the elementary/middle schools changing letter grades, I conservatively counted 122 that changed due to an increase or decrease in Progress Points. These figures could be slightly off, but they show the magnitude of the issue. (Some K-2 elementary schools in the count receive the same scores as the schools into which they feed, so some schools count twice.)

When I examined district level data, serious questions arose about both the quality of the computer model (VAM) and the rules which potentially magnify small variances in the VAM outcomes into large point changes.

In the first part I will show that Progress Points destabilized the District Performance Scores because there is no year-to-year consistency in the VAM used to award them. Districts that the VAM identified as outstanding at raising the scores of struggling students one year could not be counted on to give a strong performance the next year. This instability may be even greater in the School Performance Scores. Parents making school choices need to understand that the percentage of struggling students exceeding expectations varies wildly from year to year at the school level, not necessarily because of real improvements in their education.

In the second part I will explain implications of the destabilized scores and how the rules exaggerate small performance differences into large Performance Score changes.

Part 1: Proficiency rates are stable, VAM outcomes are not.

Other than RSD LA and RSD EBR dropping from D to F, the number of A's, B's, C's, and D's for the 72 districts was the same in 2014 and 2015. However, there was considerable churn beneath. Seven districts rose and nine dropped one letter grade. The loss or gain of Progress Points caused eight of those changes. Without Progress Points applied in either year, six districts would have risen and only two would have dropped one letter grade.

District Performance Scores primarily reflect state test scores of grade 3-8 students. While they also include high school graduation data, End-Of-Course test data, and ACT data, the grade 3-8 state test scores count more because there are more students in those grades than in high school. The grade 3-8 Assessment Index is based on proficiency rates - the percentages of students scoring Basic, Mastery, or Advanced on state tests. The grade 3-8 Assessment Indices for 2013 and 2014 are shown below. (Click on the image to enlarge it.)

Chart 1:

The data points fall very nearly on a straight line. This indicates that the scores change little from year to year. The R² value shown above indicates a very strong correlation. (R² can range from 0 to 1, with 0 meaning no correlation and 1 meaning perfect correlation.)
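For readers who want to check this kind of correlation themselves, here is a minimal sketch of computing R² for paired year-to-year index values in plain Python. The numbers below are invented for illustration only - they are not actual district data.

```python
# Compute R^2 (squared Pearson correlation) for paired year-to-year scores.
# The index values below are hypothetical, for illustration only.

def r_squared(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)

# Hypothetical grade 3-8 Assessment Index values for five districts, two years:
year1 = [78.0, 85.2, 91.4, 69.8, 88.0]
year2 = [79.1, 84.0, 92.3, 71.2, 87.5]

print(round(r_squared(year1, year2), 3))  # close to 1: strong year-to-year stability
```

When the paired values barely move between years, R² sits near 1; the scattered Progress Points charts later in this piece correspond to R² values near 0.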

With the Progress Points removed from the District Performance Scores, the DPSs reflect the historic stability (year-to-year consistency) of the grade 3-8 Assessment Index.

Chart 2:

When the Progress Points are included, the scores are spread out. Through Progress Points, a small change in performance can lead to a ten point difference across years.

Chart 3:

In Chart 3, the data points are spread out more than in Chart 2. That spread is caused by some districts receiving - and others not receiving - Progress Points. The overall shift right and up is the general effect of adding Progress Points into the scores. The lower correlation results from some districts receiving Progress Points one year but not the other.

Unstable VAM outcomes give faulty information to parents choosing schools.

Suppose my child, Joe, is a struggling (scoring below Basic on the state tests) elementary student.

I might want to choose a new school or district where more students pass the tests. My child might be influenced by new peers and achieve at a higher level. If I choose a school based on its past test performance, I can be confident that its overall test scores will be similar the next year. Adding the Progress Points, however, loosens the relationship between proficiency rates and Performance Scores.

Still, I might not choose a school for Joe based strictly on test scores. He could end up in a class full of high performing students, and the instruction might be over his head. What I really want to know is how well the school handles struggling students like him.

Progress Points are sold as the answer to that question. The School Report Cards issued by LDOE ask and answer the question, "Did this school make progress with students who struggled academically?" The answer is given as the percentage of non-proficient students "exceeding expectations".

After the state tests are graded, a computer model (VAM) calculates whether or not each struggling student scored better than a statistical average for that group. The VAM balances the results so that about half of struggling students "exceed expectations".
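The LDOE's actual value-added model is far more complex than anything shown here, but the "about half exceed expectations" property can be illustrated with a toy stand-in in which each student's expectation is simply the average score of comparable students. Every number below is hypothetical; this is not the LDOE's VAM.

```python
# Toy illustration of the "about half exceed expectations" property.
# This is NOT the LDOE's VAM; the expectation here is just the group mean.

scores = [640, 655, 662, 670, 671, 680, 688, 690, 701, 715]  # hypothetical struggling students

expectation = sum(scores) / len(scores)  # stand-in for a VAM-predicted score
exceeding = [s for s in scores if s > expectation]

pct = 100 * len(exceeding) / len(scores)
print(f"{pct:.0f}% exceeded the (toy) expectation")
```

Because the benchmark is recentered on the group itself after the scores are in, roughly half the group lands above it by construction - which is why "percent exceeding expectations" says little about absolute improvement.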

Joe struggled in Math. East Carroll scored highest in the state in 2014; 71% of its struggling students exceeded expectations. Let's say I enrolled Joe in East Carroll Parish for the 2015 school year. Imagine my dismay when East Carroll had the lowest score in the state in 2015, with only 37% exceeding expectations.

This example is not isolated. In fact, there was no consistency from year-to-year in the percentages of students exceeding expectations at the district level.

The charts below show 70 of the 72 districts in Louisiana. (I could not locate data on Jackson Parish because of its high number of "opt-outs" from testing in 2015, and Cameron Parish because LDOE did not list details about students exceeding expectations for it in the source I used.)
Chart 4:

The very spread-out pattern shows virtually no correlation between years. How a district rated in 2014 gave no indication of how it would rate in 2015. The underlying "percent of students exceeding expectations" could be very misleading to parents making school choices. It does not predict how a school or district will perform next year.

The stability between years for ELA was only marginally better, and still very, very weak. Here, too, there is no predictive value for parents making school choices for their children.

Chart 5:

I have examined other measures of the year-to-year stability of the Progress Points, and all point to the same conclusion. How a district measures one year gives no indication of how it will perform the next. The computer model simply does not identify a consistent pattern of excellent achievement.

Part 2: What then, is the purpose of issuing Progress Points?

The obvious reason is to create a churn of higher and lower letter grades to maintain a local sense of crisis in education. If my school or district drops a letter grade, cries for reform will ring out, even if the actual proficiency rates rose.

LDOE benefits when the VAM outcomes are unstable, because more letter grades go up and down each year.  Half of the letter grade changes for the districts between 2014 and 2015 were directly caused by the instability of the VAM outcomes. If the VAM identified the same districts every year, the churn would not exist. The churn maintains a local sense of crisis, making parents and the public more receptive to the next untested panacea LDOE proffers.

Churn was apparent in the School Performance Scores as well, and a considerable portion of the churn came from the instability of the Progress Points.

And there is another, more subtle reason to issue Progress Points.

Proficiency rates have a moderate-to-strong correlation to the level of poverty in the district or school. The following shows the relationship between the percentage of "at risk" students (those receiving free or reduced lunch, a measure of poverty) and the test-based grade 3-8 Assessment Index.

Chart 6:

So, the outside-of-school effects of poverty may be more of an influence on test scores than the in-school processes of teaching and learning.  Thus, proficiency rates alone are not necessarily a good measure of what a great school truly is.

It is tempting, then, to explore "growth" measures such as VAM to evaluate schools and districts.

If the purpose of the assessment system - from state tests to formulas and computer models - is to improve the schools, then it must identify the same schools or districts from year-to-year that can serve as models.

The problem here is that the VAM results are completely erratic.

The arbitrary fluctuations caused by VAM through the Progress Points create an apparent weakening of the connection between the District Performance Scores and poverty.  Breaking the effect of poverty is the long-sought-after Holy Grail of education.

However, it is an illusion.
Chart 7:

Chart 8:

So to summarize, Progress Points are based on a spectacularly unstable VAM system which - I say artificially - creates churn. The churn (falsely) gives the appearance that the system identifies influences on the scores other than poverty.  Making letter grades go up and (especially) down creates a sense of crisis which maintains support for untested corrective measures offered by LDOE.

Apparently LDOE designed the Progress Points rules to create churn. (In this section I will show how the promulgated rules destabilize the Performance Scores and create an illusion of a weakened connection to poverty.)

The formulas used to calculate the Progress Points can turn districts into winners or losers over small margins of change. LDOE may have some control of how many winners and losers there are from year-to-year.

The basic rules for Progress Points (a.k.a. Bonus Points when introduced in 2013):

1. There are two subjects in which grades 3-8 can independently earn points - ELA and Math.

2. There must be at least 10 eligible students in the subject (students who scored below Basic on the prior year's test).

3. A Value-Added-Model sets the expectations (after the test data is in) so that about half of students exceed and about half do not exceed their expectations.

4. Threshold rule - Over 50% of eligible students must exceed their expectations as set by the VAM. If fewer than 50% exceed expectations, zero points are awarded.

5. Number or percent rule - If fewer than 100 students are eligible, the district earns 0.1 points per percent of students exceeding expectations who started from Unsatisfactory, plus 0.05 points per percent exceeding expectations who started from Approaching Basic (rule 4 still applies). If 100 or more students are eligible, the same 0.1 and 0.05 weights are instead multiplied by the number of students exceeding expectations. Points are capped at a maximum of ten.

6. High school students are eligible in ELA and Math, with similar rules except:
For high school, in 2015, 30% of eligible students had to meet their targets on the ACT series. In 2014 and 2013 they had to exceed their targets on the ACT series. (Very few schools and no districts to my knowledge earned Progress Points for high school performance prior to 2015).
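Sketched in code, my reading of the grade 3-8 rules above looks like the following. The function, its parameter names, and its treatment of the ten-point cap as per-subject are my own reading of the rules as summarized here, not an official LDOE formula.

```python
# Sketch of the grade 3-8 Progress Points rules for one subject, as I read them.
# Not an official formula - weights and thresholds are as summarized above.

def progress_points(unsat_exceed, appbasic_exceed, eligible):
    """Points for one subject.

    unsat_exceed    - eligible students starting from Unsatisfactory who exceeded expectations
    appbasic_exceed - eligible students starting from Approaching Basic who exceeded expectations
    eligible        - total eligible students (scored below Basic the prior year)
    """
    if eligible < 10:                      # rule 2: minimum group size
        return 0.0
    exceeding = unsat_exceed + appbasic_exceed
    if exceeding / eligible <= 0.5:        # rule 4: over 50% must exceed, or zero points
        return 0.0
    if eligible < 100:                     # rule 5: small districts use percents...
        pts = 0.1 * (100 * unsat_exceed / eligible) + 0.05 * (100 * appbasic_exceed / eligible)
    else:                                  # ...large districts use raw counts
        pts = 0.1 * unsat_exceed + 0.05 * appbasic_exceed
    return min(pts, 10.0)                  # capped at ten points

# A large district: 201 of 400 exceeding is worth the full ten points...
print(progress_points(0, 201, 400))   # 10.0 (0.05 * 201 = 10.05, capped)
# ...while 199 of 400 earns nothing, thanks to the threshold.
print(progress_points(0, 199, 400))   # 0.0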

In the explanation that follows, I am considering only grades 3-8. LDOE has made available for comparison only grade 3-8 data for students in both 2014 and 2015.

How VAM and the Threshold Rule work together "Hunger-Games" style:

The VAM assigns half of students to exceed expectations and half to not exceed them. Now, if one district has 55% of students exceed expectations, then some other districts must have fewer than 50% exceed expectations to compensate. Since there are two subjects in which Progress Points can be earned, over half of the districts earned at least some Progress Points. Still, some districts earned none.

Imagine awarding bonus points based on whether or not a high school's football team won one or both of the first two games of the season. Half of the schools each week would be denied bonus points. For every team that won both games, somewhere there would be a team that won neither. While simplified for illustration, this is essentially the way winners and losers are made through the Threshold Rule.

The Threshold Rule by design creates winners and losers, and makes their scores move in jumps.

The lowest possible award would go to a small district with exactly 51% of its eligible students exceeding expectations in one subject only. If all of those students started from Approaching Basic, it would earn 51 x 0.05 = 2.55 points. In large districts, the minimum jump is ten points, which is two-thirds of a letter grade.

How the Number or Percent Rule favors large districts over small districts:

Smaller districts receive Progress Points based on the percent of eligible students exceeding expectations. In 2014, tiny Cameron Parish (1,254 students) had 55% of its eligible students in ELA and 53% in Math exceed expectations; it received 7.4 Progress Points combined.

Larger districts received Progress Points based on the number of eligible students exceeding expectations.

In the same year, Caddo (40,658 students) had lower percentages exceed expectations - 51% in ELA and 48% in Math - but it received 10 Progress Points.

Although Caddo did not receive points for Math, it had a high enough number of eligible students in ELA alone that by reaching only 51% the formula awarded the full ten points to it, even though its percent performance was below that of Cameron Parish in both ELA and Math.
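The size advantage is easy to see with round numbers. The sketch below uses only the Approaching Basic weight (0.05) and invented student counts - it is not the Cameron or Caddo calculation, just an illustration of the percent-versus-number switch.

```python
# Why the Number-or-Percent rule favors large districts (illustrative only).
# Same 55% exceeding expectations, all from Approaching Basic (0.05 weight).

def subject_points(exceed, eligible):
    if exceed / eligible <= 0.5:                       # threshold rule
        return 0.0
    if eligible < 100:                                 # small: scored on the percent
        return min(0.05 * (100 * exceed / eligible), 10.0)
    return min(0.05 * exceed, 10.0)                    # large: scored on the count

print(subject_points(44, 80))    # small district, 55%: 2.75 points
print(subject_points(440, 800))  # large district, 55%: 10.0 points (capped)
```

At an identical 55% success rate, the small district earns 2.75 points while the large district maxes out at ten.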

Districts over the median size (5,000 students) earned either 10 points or 0 points in 2014. By the rules, any district with over 400 eligible students received either the maximum of ten Progress Points or zero Progress Points.

Chart 9:

The down-side for large districts is that very small changes in student performance are magnified into large point changes by the scoring rules. This is important to the churn.

Say a district has 400 eligible students two years in a row. The first year, 201 exceed expectations and the district receives ten Progress Points because over 50 percent exceeded expectations. The next year, two fewer students (199) exceed expectations, so the district receives no Progress Points. A change of two students - representing 0.5% of the eligible students - results in a ten-point loss.

One large district, Calcasieu Parish, fell by ten Progress Points in 2015. Calcasieu had a large number of students opt out of testing. Individual schools were not penalized for the opt-outs. However, it would appear that, in the calculation for the district, opt-out students who were eligible to earn Progress Points counted as not exceeding expectations. If so, thanks to the Threshold Rule, a small change (from opt-outs) led to a large point drop.

Chart 10:

You may notice two things in Chart 10 compared to Chart 9.

1. Intermediate values appear for the mid-sized districts that earned Progress Points for high school performance (but not grade 3-8 performance). There are typically fewer eligible high school students, so the automatic ten points does not kick in for those mid-sized districts in that case. A 2015 rule change allowed districts to receive Progress Points based on the performance of struggling students in the high schools for the first time.

2. Fewer districts earned points from grades 3-8 in 2015 than in 2014. In 2014, 61 of 72 districts earned Progress Points in (grade 3-8) ELA, Math, or both. In 2015, only 44 of 70 districts (missing data on Jackson and Cameron) earned Progress Points. Why?

In 2014, the VAM system allowed 56% in ELA and 53% in Math to exceed expectations statewide. In 2015, it allowed only 50% in ELA and 49% in Math to exceed expectations statewide. The tighter (closer to 50%) percentages set in the VAM system tightened the "Hunger-Games" style competition in 2015.

It is not clear to me exactly what latitude the LDOE has, if any, in setting the VAM results. If the LDOE has the ability to tweak the expectations set by the VAM system, then at its whim it can create more or fewer winners year to year. Conceivably, it could focus on the final outcomes for a particular school or district and tweak the system accordingly. (I do not see any evidence that LDOE favored the Recovery School Districts in 2015 through the DPS VAM system.)

In conclusion, I offer these hopes:

I hope that those in districts and schools that rose or dropped letter grades this year will not over-react to the grade changes. In the underlying data, there often was little change.

I hope that the district Superintendents will demand an end to the "Hunger-Games" style competition. There should not be a threshold for earning points. There should not be purposely-induced churn through the artificial creation of winners and losers.

I hope that the Superintendents of smaller districts will demand an end to the patently unfair Number or Percent rule.

The VAM system behind the Progress Points is far too erratic to be of use for parents trying to match student and school. I hope they understand that it has no predictive value.

As I understand it, the Progress Point system (or something like it) was required under Race to the Top; with the overhaul of ESEA now complete, I hope LDOE/BESE will explore removing Progress Points from the School and District Performance Scores.

Herb Bassett,
Grayson, LA