Saturday, September 24, 2016

School Disciplinary Authority is Being Seriously Eroded

This article in Education Week magazine describes a growing national trend to reduce educator authority to suspend students for disciplinary infractions. A major substitute for out-of-school suspension is a program called restorative justice. In this scheme, teachers are expected to refrain from removing disruptive students from the classroom and instead implement various interventions. As the article explains, denying teachers the authority to remove extremely disruptive students often infringes on the rights of other students to receive productive instruction.

This is how the writer explains the issue:

Why can't Johnny read? Or, assuming he can, why isn't Johnny closing the achievement gap?
It's politically fashionable to blame his tenure-protected teacher. But might it have more to do with the pathologically disruptive classmate who, given infinite "second chances" by detached policymakers and feckless administrators, never gets removed from Johnny's classroom?

Thanks, in part, to an increasingly popular behavior-management approach known as "restorative justice," soft discipline is on the rise in public schools at the same time that education reformers are demanding higher standards and teacher accountability.
Restorative justice emphasizes correction and counseling over punishment, and seeks to replace strict zero-tolerance discipline policies with collaborative opportunities for restitution. Its primary goal is to keep students in school rather than suspending or expelling them.
Generally, the proponents of alternatives to suspension are not the professionals who must deal with disruptive behavior in the classroom on a daily basis. It is relatively easy to play "Monday morning quarterback" and insist that there must be a way to correct student misbehavior without removing a student from the learning environment. Often the "experts" on such alternatives are persons who never have to actually implement these "miracle cures" in a real classroom setting. It is much easier to advocate for these alternatives to suspension than to actually implement them!

In Louisiana, our State Department of Education has for several years recommended a program called Positive Behavior Interventions and Supports, known as PBIS. Many Louisiana school systems require schools to implement PBIS as a way of reducing suspensions. But many teachers complain that PBIS greatly reduces the teacher's ability to take immediate action to stop disruptive, dangerous, or disrespectful behavior by removing a student from the classroom. Such behavior often interferes with orderly and effective instruction of the great majority of students, whose instruction is put on hold while the teacher fills out paperwork and then attempts to accommodate disruptive or disrespectful students using this alternative strategy.

Right now, state law in Louisiana gives each teacher the right to remove extremely disruptive, disrespectful, or dangerous students from the classroom by simply filling out a discipline referral form and sending the student to the appropriate disciplinary administrator. That is currently the law, and the teacher should have the right to use it to ensure that she/he can effectively conduct class without interruption. But some school systems and some administrators have instructed teachers that they may not remove a student unless the teacher has implemented various steps of the PBIS procedure, such as documenting several disruptive incidents and sometimes even telephoning or conferencing with the parent. But such alternatives for the one disruptive student can take away from time the teacher could be instructing the class. Is it right to deny or delay instruction of cooperative students to deal with one student who refuses to comply with the teacher's directives? I believe that school systems that deny a teacher the right to implement immediate removal of extremely disruptive or disrespectful students are in violation of state law. But there are current attempts to change state law to take away the teacher's right to remove such students.

During the 2016 legislative session, the legislature debated a bill (HB 833) by Representative Leger that in its original form would have forced schools with 150 percent of the average number of student suspensions to implement a plan to curtail suspensions through alternatives such as restorative justice or PBIS. Many administrators and teachers contacted their legislators and explained that such mandatory restrictions would tie the hands of principals and teachers in schools that face greater-than-average challenges in maintaining a productive classroom environment. Does it surprise you to learn that some schools face greater challenges to maintain discipline than others because of the communities they serve? It turns out those are the same communities with high crime rates and a high incidence of juvenile delinquency. Fortunately, this bill was defeated, but another bill, by Senator Claitor, passed and became Act 522, which mandates the establishment of an Advisory Council on Student Behavior and Discipline that will study various discipline statistics and make recommendations to BESE and the Legislature.

The discipline advisory council set up by Act 522 is heavily stacked with advocates for alternative disciplinary methods and very lean on actual practitioners who deal with discipline problems on a daily basis. Their recommendations to BESE could end up being the same harebrained schemes rejected by the legislature, only this time presented as "evidence based" programs. We have already seen what happens when public school policies are dictated by persons who are not education professionals and whose policy mandates are derived more from ideological assumptions than from practical considerations. That's why we have been saddled with test-based merit pay and TFA corps members instead of real teachers.

The author of the article in Education Week, who has 27 years of teaching experience, goes on to explain the following:

Alas, in a profession where ideologically motivated reforms abound, restorative justice in many districts has recklessly morphed into de facto "no student removal" policies that are every bit as flawed as the inflexible zero-tolerance policies they were designed to replace.
"Just how many times should the student who spews obscenities be sent back to class with no reprisals?"
The process by which this happens is all too familiar to teachers.
Far removed from the pedagogical trenches, federal and state education departments craft behavior-management guidelines designed to vastly reduce suspensions and expulsions, and keep even the most dangerous and defiant students in the least-restrictive educational environment possible.
In Louisiana, as part of the debate on HB 833, a recent report to the legislature was cited that classifies "disrespect of the teacher" as a minor offense that should be dealt with without removing the student from the classroom. Tell that to a young teacher who has just been viciously cursed in the most profane and foul language by a student twice her size who, by his very demeanor, appears to be on the verge of attacking her physically! This is the type of detached diagnosis of really serious disciplinary issues that we often get from folks who never have to manage a classroom.

For years now, our legislature and our LDOE have been mischaracterizing "failing schools" as schools where test scores are lower than average. Those who really understand schooling know that a failing school is not identified by student test scores, because low scores are much more influenced by the poverty of the student body than by ineffective teaching. Educators know that the real problems are in schools that fail to deal decisively with classroom disruption and disrespect of teachers and other students. The parents who are appropriately dissatisfied are those responsible parents whose children attend a school where classroom instruction is routinely disrupted and who fear physical harm to their children from individuals who often cannot be effectively disciplined because of the restrictions placed on school professionals. Those are the real "failing schools".

Tuesday, September 20, 2016

Louisiana Accountability System Changes to Reflect Political Goals

Note to Readers: The following post by Herb Bassett traces the history of test-based accountability in Louisiana from 1999 to the present. Mr Bassett's analysis demonstrates that the grading and ranking of our schools and teachers based on student test scores is purely arbitrary and is constantly being manipulated by our State Superintendent to produce whatever results suit his objectives at the time.

Several years ago, reformers chose to portray a large number of our schools as failures so that there would be an excuse for "reforms" such as school privatization and reductions in teacher job protections. Now that the reformer policies have been in effect for a number of years, the testing data is being manipulated to demonstrate "improvement". But such alleged "improvement" is not supported by the results of the National Assessment of Educational Progress (NAEP). Those results show that since 2005, Louisiana students have achieved only small improvements on the national tests and have actually lost ground compared to all other states in three of the four categories tested.

Test-based accountability in Louisiana has failed to make any significant improvement in student performance. At the same time, our students' futures have become more limited because of the increasing emphasis on college prep for all, at the expense of vocational prep opportunities.

The scapegoating of teachers for low student performance that has been an inherent part of the reforms has demoralized and de-professionalized teaching in Louisiana. Many highly respected teachers have retired early. Meanwhile, the natural joy of teaching and learning has been replaced by a constant, dull, nonproductive test-prep exercise with no end in sight.

Mr Bassett's analysis

The Every Student Succeeds Act (ESSA) requires us to revamp our school accountability system. As we decide what changes to make, we should examine our current status, take lessons from the history of our statewide tests, and recognize how politicized our accountability system has become. It is imperative that we bring nuanced understanding to the decision making process.

History warns that the results of a single summative measure can be shaped by political considerations.

My hope here is to restore history that John White has removed from public view and reveal the current strategy to produce an illusion of ever-improving student performance by changing the metrics. This link is to the GEE 21, LEAP/iLEAP, LAPARCC and EOC data I compiled for this study.

I. The politicization of the accountability system:

Louisiana set up its accountability system even before No Child Left Behind. The first school performance scores were given in 1999. In 2003, the Louisiana Legislature created the Recovery School District to take over failing schools. The definition of failure was a school performance score below 60. Only five schools had been taken over for their failing status before Hurricane Katrina.

Shortly after Katrina, the Louisiana legislature arranged for the takeover of the majority of the New Orleans schools. Act 35 allowed the Recovery School District to take over schools with school performance scores up to the state average of 87.4. Only New Orleans schools were targeted at the time. Afterward, school performance scores became increasingly politicized.

A star rating system of schools was in use from 2003 through 2010. A push for reforms began in the fall of 2011, when school letter grades were instituted to create a sense of urgency. Forty-four percent of schools were suddenly declared to be failing because they were graded D or F. This provided momentum for the passage of education reforms in Acts 1 and 2 of the 2012 legislative session.

Acts 1 and 2 tied tenure and teacher evaluations to student test scores, and expanded charter schools and vouchers. Would the changes spur increased student achievement? The reformers would soon need evidence of improvement.

While the reform legislation was being passed, John White, the then-new state superintendent, pushed new school accountability formulas for 2013 through BESE. The new formulas were virtually guaranteed to yield higher overall school letter grades.

Meanwhile, the requirement for a D was raised by 10 points in 2012 in order to produce more Fs. This would help to maintain the sense of urgency created the year before. But instead, good news came at an embarrassing time for the reformers.

When the 2012 letter grades were released, over one-third of our high schools suddenly were "A" schools. This was due to the changeover from using the old GEE 21 to the new End-Of-Course tests in the accountability formulas. The astounding illusion of a turnaround appeared before the new legislation had gone into effect.

The 2012 high school letter grade inflation was rectified by the new accountability formulas in 2013. The new formulas put the high school letter grades back in line with the 2011 results, but the K-8 schools got a huge boost. The new formulas were completely different from the old, and the grading scale was changed from a 200-point system to a 150-point system. Bonus points were added.

At the time, I ran the 2011 test data for each school through the old and new formulas and found that - based on exactly the same test data - the new formulas alone would yield much higher K-8 letter grades. This LDOE file confirms the inflationary shifts in the 2013 results.

The 2013 LDOE press release, however, attributed the improved letter grades to a slight improvement in student test scores that year, despite the reality that most of the letter grade gains came directly from the new formulas:

"Letter grade outcomes improved overall for the 2012-2013 school year because of record setting student achievement: 71 percent of students tested at Basic or above this year in ELA and math, graduation rates are at an all-time high, a record number of students earned college-entry ACT scores this year, and Louisiana students increased their AP participation more than any other state."

After 2013, to avoid political fallout during a time of test transition, LDOE promised to curve the school letter grades as needed to maintain the letter grade distribution.

The bigger picture is this. The accountability formula changes were just one part of a planned series of changes to guarantee rising measures that would shield the reforms from any real scrutiny. Changes to the tests themselves were the next step.

Our test history shows that we can expect test scores to rise simply because the tests are new. There also are tweaks that can be made to raise or lower the scores as the political winds change.

By instituting the reforms hand-in-hand with radical changes to the accountability formulas, standards, and the statewide tests, the reformers ensured there would be no consistent measure of actual student improvement.

The tests that existed before 2012 show slowed gains or reversals after 2012 until they were phased out. Growth on our grade 3-8 tests slowed after 2012, and the proficiency rates on three of our four oldest high school EOCs are currently in decline.

The ACT is another longstanding test on which we are in a slight decline. White spins that slightly more students earned an 18 this year than ever before. True for the raw number, as he states, but the percentage of students scoring 18 fell by half a percent. According to the official October 2014 and 2015 multi-stats reports, this year's senior class was sufficiently larger than last year's to account for the difference.
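The arithmetic behind that spin is easy to sketch. The cohort sizes below are invented for illustration (they are not the figures from the multi-stats reports); the point is only that a larger class can set a record count while the rate declines:

```python
# Hypothetical cohort sizes (NOT the actual multi-stats figures):
# a larger senior class can raise the COUNT of students scoring 18+
# even while the PERCENTAGE scoring 18+ falls.

def summarize(cohort, scoring_18_plus):
    """Return (count, percentage) of students at 18 or above."""
    return scoring_18_plus, 100.0 * scoring_18_plus / cohort

last_count, last_pct = summarize(cohort=40_000, scoring_18_plus=24_000)  # 60.0%
this_count, this_pct = summarize(cohort=41_000, scoring_18_plus=24_400)  # about 59.5%

assert this_count > last_count  # the "record number" claim is true
assert this_pct < last_pct      # yet the rate actually declined
```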

And this decline comes after schools have increasingly pushed students to repeat the test until they score an 18 or higher. Some schools recently have even required students to take an ACT prep elective until the score of 18 is reached.

The consistent data paint a picture of decline, and that explains why the reformers would coordinate a series of test changes and accountability formula changes with the reform legislation. With enough changes to the metrics, there would be no way left to tell if students really benefited from the reforms.

Soon we will have only the reformers' spin on our statewide tests.

II. Our current status no longer can be compared with the past:

The August 4 press release, "Louisiana Students Show Improvement as Schools Adjust To Higher Expectations" told us that "raised expectations" put more students at the Mastery or above (Mastery+) achievement level. It vaguely reviewed changes to the tests between 2012 and 2016, but did not clarify that there are simply no statistical connections between those tests.

The "improvement" is much better explained as a by-product of setting different curves on the totally different tests given in those years.

At the same time state superintendent John White released that spin, he cleansed years of test data from LDOE's current LouisianaBelieves website. Gone are the LEAP and iLEAP results from 2008-2013. Gone are the files showing the year-to-year progress at the Basic+ level that we tracked for a decade.

The files were already up; White removed them from the website.

This was nothing new. Superintendent White began purging test and demographic data when he replaced LDOE's former website with LouisianaBelieves in January, 2013. At that time, fourteen years of records were removed from public view. The limited amount of data that was re-posted on LouisianaBelieves had much of the important information redacted.

Taking down data from public view protects his spin from scrutiny. The latest data he removed - and then referenced in the press release - would remind us that the test score shifts of the last two years are totally out of line with the rest of our test history.

From 2006 to 2014, statistical test equating ensured the year-to-year comparability of the LEAP/iLEAP tests.  But in 2015, the new LAPARCC test could not be equated to the old tests. The ELA and Math LAPARCC tests were based on new questions developed by the PARCC consortium; a committee met to determine how many students scored Basic and Mastery.

Still, White's press release prominently featured a comparison of 2012 to 2016 at the Mastery+ level. The current percentage of students at that level in ELA and Math is up 14 points statewide.

On the other hand, my saved files show that Basic+ dropped by three points overall from 2012 to 2016.

He does not want us to see that Basic+ fell at the same time Mastery+ rose. It would highlight the incompatibility of the 2012 and 2016 measurements.

Since there is no statistical connection between the 2012 and 2016 tests, it is not my point here to claim that our students are actually doing worse. Rather, I am putting forth an example of one way John White ensures that we get only his spin. He makes inconvenient data disappear.

The 2015 LAPARCC was more like a second opinion than a follow-up examination. If you are diagnosed with a terrible disease and then get a second opinion that says you only have a head cold, the second opinion is not evidence that you were miraculously cured.

And our 2015 LAPARCC results were questionable at best. One of the selling points for the switch to the PARCC test was that our scores would be compatible with other states'. It turns out they were not. PARCC tacitly admitted that our students had an advantage.

The official report showed us doing better than Arkansas, New Mexico, and Mississippi, running slightly ahead of Maryland, and, on several individual tests, exceeding Illinois. But we were the only state with a footnote clarifying that only the pencil-and-paper version was administered. Most students in other states took the computer-based test. Comparisons showed that, overall, students who took the pencil-and-paper version scored higher than expected when compared to similar students taking the computer-based version. White remained silent about this.

Then the legislature required a new test for 2016 on which fewer than half of the questions would be PARCC questions. Despite our 2015 results being unjustifiably high in comparison to other states, scores rose even higher this year on our new LEAP tests. Whether the gains came from level setting or merely because the tests were shortened and scheduled later, the results were too erratic to take year-to-year comparisons seriously as a sign of actual student improvement.

When a new standardized test is introduced, a committee meets after the test is given. It has raw score data in hand as it makes the judgment calls that ultimately set the curve. Existing tests then have their curves adjusted each year through a statistical process of test equating. The 2015 and 2016 tests introduced new curves and the changes were extreme.

Consider the eighth grade tests. On the 2016 tests, ELA Mastery+ was 27 points above the 2014 level. Math was 18 points higher at Mastery+ but eight points lower at the Basic+ level. These shifts are totally out of line with our testing history.

There are six grade levels of ELA and Math tests - twelve in total. In 2015 and 2016, record year-to-year Mastery+ gains were set on nine of those tests and two tied for record gains when compared with the years 2006 to 2014. At the Basic+ level record losses were set on nine tests and one tied the record.

White's spotlighting of Mastery+ while removing Basic+ data sustains an illusion of improvement.

This goes to show how much changing tests allows the results to be reshaped.

I should note that reshaping the results also invalidates the year-to-year tracking of achievement gaps between subgroups.

Suppose teacher Smith and teacher Jones gave the same test. In teacher Smith's class, several students scored just below an A, but only one actually made an A. In teacher Jones' class, three students made an A, but the next highest grade was a C. By the standard of making A's, Jones' class was better.

Smith and Jones decide to curve the results. The curve raises four more students in Smith's class to an A, but none in Jones' class. The grade distribution now is different; Smith's class has more A's. This change, however, does not show improvement in Smith's class - the raw results did not change. The curving of the grades, not anything the students did, produced the changes.

Now say that they decided not to curve this test, but agreed in advance to curve the next test. On the next test, the highest scorer was still the highest, the second highest remained the second highest, and so on. Since teachers Smith and Jones already agreed to a curve, now Smith's class has more A's than Jones'. But again there was no real change in performance; the results were changed by a curve which affected Smith's and Jones' classes differently.

This would not be evidence Smith's class closed the achievement gap, since the new curve, not a change in student performance, made the difference.
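The Smith-and-Jones thought experiment can be checked with a few lines of code. The scores below are invented purely to match the story, one A in Smith's class with several near-misses, and three A's in Jones's class with a big drop after them:

```python
# Invented scores matching the thought experiment above: curving changes
# the grade distribution without changing what any student actually did.
smith_raw = [93, 88, 88, 87, 86, 70, 65]  # one A (>= 90), several near-misses
jones_raw = [95, 94, 92, 75, 60, 55, 50]  # three A's, then a big drop

def count_As(scores, cutoff=90):
    return sum(s >= cutoff for s in scores)

# Uncurved: by the standard of making A's, Jones's class is better.
assert count_As(smith_raw) == 1 and count_As(jones_raw) == 3

# Apply a +4 curve to everyone. Raw performance is unchanged, but
# Smith's near-misses cross the cutoff while Jones's class gains nothing.
curved = lambda scores: [s + 4 for s in scores]
assert count_As(curved(smith_raw)) == 5
assert count_As(curved(jones_raw)) == 3
```

The curve, not anything the students did, produced the new distribution.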

White erred when he compared improvement of districts at Mastery+ from 2012 to 2016 in his press release. While I intend no disrespect to the districts cited, the data better justifies the conclusion that the changes came from the different curves on the different tests rather than real changes in student performance.

John White showed a similar misunderstanding about the reduced number of failing schools when he provided this quote for the release of the 2013 school letter grades (where the results had just been inflated through totally new formulas, a new grading scale, and bonus points):

"Changes made to the formula have led to real increases in student achievement," said Superintendent John White.

I documented in part one that the formula change by itself yielded higher letter grades.

So these programmed changes to the tests and formulas are part of a strategy that yields an illusion of improved student achievement. (Let me note here that BESE approved these changes.)

Our grade 3-8 ELA and Math tests have now been completely changed since 2014. A new Social Studies test was field tested this year. Science will follow in 2018-19.

White has now proposed to change the scoring system for our high school EOC tests in the near future. This, too, can result in a new curve that gives whatever illusion he desires.

White needs history to disappear because it shows how perpetual proficiency rate gains can be created through systematic changes to the tests. He wants us to forget these lessons so that future gains will not be questioned.

The ruse is to change the tests and put students at maximum disadvantage at first, then provide tweaks as necessary to progressively increase student advantage on subsequent administrations.

III. What does our test history tell us to expect in the future?

Tweaks to the tests and outside influences can increase proficiency rates. To understand this, consider the many ways the volume I hear from my stereo can be raised.

On my home system I can turn the volume knob, choose to play it through different speakers, or adjust the output level of my iPod playing into the system. I can even move closer to the speakers. An adjustment to any one affects what I hear. The original recording is unchanged, but what I perceive is louder.

Likewise, a test has a series of "knobs" that can be tweaked to affect the results. As long as the "knobs" are turned down on the first administration, one or more can be turned up on subsequent administrations to continually increase proficiency rates. Arguably, test scores can be made to rise without students truly knowing more at the end of the year.

The new-test score-rise effect: proficiency rates on new tests rise initially, then level off:

Our test history shows that proficiency rates tend to rise for the first three to six years before leveling off. After that, a tweak of some sort or a new test is required to spur increased rates.

Redesigned in 2001 and 2002, the GEE 21 saw proficiency rates (Basic+) in each subject rise initially but peak within three to six years (aside from a third-year stumble in ELA). The gains between the initial administration and the initial peak for the four subjects ranged from 8 to 15 points.

Afterward, although there were ups and downs for individual subject tests, overall proficiency peaked in 2004 and slowly declined until 2009.

There are some common-sense reasons for the new-test score-rise effect:
- Teachers become familiar with which material is emphasized on the test and stress it more in class, at the expense of covering other material.
- Teachers and students develop test-taking strategies specific to the new tests over the first few years.
- Schools may institute test-prep strategies, such as trading class time for special test-prep periods in the weeks before the test, or requiring struggling students to take remediation in place of other elective classes.
- The state or district may tie test performance to school letter grades or individual course grades, or the school might provide trips or other rewards for desirable test scores to motivate students to simply try harder on test days.

The question is, do these things make students smarter overall, or just better prepared and/or more motivated test-takers?

The scheduling of the tests in the school year affects the proficiency rates:

After GEE 21 scores had been slowly declining since 2004, the 2009 test was scheduled about three weeks later in the school year. Teachers and students had more time to prepare. Proficiency rates rose on three subject tests and held steady on one. On average, the increase was four points.

Likewise, from 2006 to 2008, the LEAP/iLEAP Basic+ rates were stagnant; the 2009 tests were moved later in the year, and the test schedule was altered to start mid-week so that students no longer had five consecutive testing days without a weekend break. Basic+ performance rose four points, the largest single-year gain I could find.

Now, were the students really smarter at the end of that year than in previous years, or did taking the tests later, with a weekend break, account for the higher scores?

If there is any comparison to be made between the 2016 LEAP and the 2015 LAPARCC tests, the later testing dates and condensed schedule in 2016 would have to be taken into account. This is another reason why any comparison between them is not valid.

Our End-Of-Course (EOC) test history confirms the new-test score-rise effect:

After the GEE 21 proficiency rates stagnated, they were replaced by the EOCs, which were phased in one test per year from 2008 to 2013. The four oldest tests, Algebra I, English II, Geometry, and Biology have peaked and declined since their introduction. English III and US History are still in the period of initial gains.

Scores awarded on the EOCs are Needs Improvement, Fair, Good, and Excellent. Fair+ is the graduation requirement. Good+ is considered proficiency.

Thanks to the new-test score-rise effect, phasing in one test per year ensured long-term overall average proficiency gains. As the first tests hit peaks and began to decline, later tests made gains to offset those losses. (However, overall proficiency has been unmoved since 2014.) Part of White's spin is averaging the results of several different subject tests to show sustained overall growth.

This explains how the staggered schedule of the redesigns of the LEAP ELA and Math, Science, and Social Studies tests will capitalize on the new-test score-rise effect to produce sustained overall proficiency rate gains.

Test outcomes have been affected by test equating in a curious relationship:

The EOCs have evolved, but the development of the test questions has remained under the control of Louisiana and the testing company, Pacific Metrics. This has allowed test equating to be performed, but some of the outcomes are interesting.

Algebra I hit a relative peak in 2012 at 56 then declined in 2013 by a point. It hit a new peak at 58 this year, but its new peak came along with a lowered raw-score requirement.

While the Good+ rate was up, fewer correct answers were required for Good. (25 out of 50 in 2012; 23 out of 50 were required in 2016).

Similarly, Good+ on the Geometry EOC rose every year from 2010 to 2015. Each year from 2011 to 2015, the number of correct answers required for Good was lowered. In 2016, the required number of correct answers was finally left unchanged; Good+ dropped by a point.

Now, there were content changes to the tests at the same time, so the lowered requirements may have been justified. However, it is fair to ask - are the students getting smarter, or is it simply getting easier to make the required scores? Can we be sure that test equating makes the year-to-year comparisons valid?
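The effect of a lowered cut is simple to verify. The raw-score distribution below is invented; only the cut scores (25 of 50 in 2012 versus 23 of 50 in 2016) come from the figures above:

```python
# Invented distribution of correct answers out of 50 for ten students.
# The cut scores (25/50 in 2012 vs. 23/50 in 2016) are the ones cited above.
raw_scores = [30, 28, 26, 25, 24, 24, 23, 22, 20, 18]

def good_plus_rate(scores, cut):
    """Percent of students at or above the raw-score cut for Good."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)

# With the SAME answers, the lower cut alone raises the "Good+" rate.
assert good_plus_rate(raw_scores, cut=25) == 40.0
assert good_plus_rate(raw_scores, cut=23) == 70.0
```

Whether the real content changes justified the lower cuts is exactly the question raised above; the sketch only shows how much the cut alone can move the rate.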

Student and school-level motivation affects scores:

In fact, the EOC's executive summary does caution against comparing results across years due to policy changes connecting stakes to student performance (see page 6, paragraph 3). The testing company warned that outside factors can indeed influence proficiency rates.

Yet White touted year-to-year overall gains while the scores rose; he remained silent this year as overall Good+ showed no improvement for the third consecutive year.

State policy linked EOC results to students' grades and graduation eligibility beginning in 2010-2011. Needs Improvement rates dropped by nine points that year on each of the three existing tests and have remained close to those levels ever since. Good+ rose by seven or more points for each test that year.

Did these gains actually reflect increased student knowledge or did the students merely take the tests more seriously? What gains should we expect to see if schools develop better ways to motivate students to put forth more serious effort on test days?

Initial proficiency rates may be set for expedience rather than actual student achievement:

When a new test is introduced, the cut scores are set after the test is administered and the raw scores are in. But how do we know what percentage of students should rate Basic or Mastery on a new test?

Committees meet and set the achievement levels using human judgment. So, how do we know that outside influence does not creep in to the score setting process?

The first three EOCs entered with very different initial proficiency rates (Good+). Algebra I - 36 percent; English II - 50 percent; Geometry - 34 percent.

The initial proficiency rates of subsequent tests, however, closely tracked the previous years' average proficiency rates. This would be an excellent strategy to prevent a new test from pulling the average significantly up or down.

The 2010 average proficiency rate of the three tests then in existence - Algebra I, English II, and Geometry - was 43 percent. In 2011, the Biology EOC was introduced - at 43 percent proficiency.

The 2011 average proficiency rate rose to 49 percent, in part due to the student motivation discussed above. In 2012, the English III EOC was introduced at 50 percent. Because of the limited number of questions on the test, 50 percent was as close as the new proficiency rate could have come to matching the previous year's average.

The 2012 average proficiency rate was 55 percent. In 2013, the U. S. History EOC was introduced at 53 percent. Again, it was as close as possible to the previous year's average.
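Tabulating the three debuts makes the pattern easy to see. A minimal sketch, using only the percentages stated above:

```python
# Previous-year average Good+ rate versus the initial Good+ rate at which
# each new EOC debuted; all figures are the percentages stated in the text.
debuts = [
    # (test, prior-year average Good+ %, debut Good+ %)
    ("Biology (2011)",      43, 43),
    ("English III (2012)",  49, 50),
    ("U.S. History (2013)", 55, 53),
]

for name, prior_avg, initial in debuts:
    gap = abs(initial - prior_avg)
    print(f"{name}: prior-year average {prior_avg}%, debut {initial}%, gap {gap} point(s)")
```

Each new test entered within two points of the prior-year average - close enough that none of them could pull the combined average noticeably up or down.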

I have no "smoking gun" to prove that this was intentional, but we have been focused on ever-rising proficiency rates since the inception of No Child Left Behind. It also is consistent with White's method of combining results from different subjects into one average. I find the close correlation to be curious.

This level setting could have been done without penalty to students because the graduation requirement is only Fair, not the Good+ required to earn points toward the school performance scores.

The question it raises is - how much outside influence can be exerted on the setting of the initial proficiency rates when new tests are implemented? To what extent can the initial proficiency rates be shaped to produce a desired political result?

Redefining achievement levels opens the door for outside influence:

The EOC results will be redistributed in the future. LDOE has announced plans to switch from the current four-level system to a five-level system (Unsatisfactory, Approaching Basic, Basic, Mastery, Advanced). The EOCs are the last tests in the standardized testing sequence; students do not need a five-level system to predict performance on a future test.

This unnecessary change could raise otherwise stubbornly immovable proficiency rates. Will the raw score required for the new Mastery level match that required for the old Good level? The cut points will have to be redefined, again raising the question of what political influence might be exerted.

Furthermore, the English III EOC will be replaced by an English I EOC, introducing yet another new test and level-setting.

How will these changes affect overall proficiency rates?

Switching from pencil-and-paper tests to computer-based tests will require adjustment for our younger students:

Different formats pose different test-taking challenges, even if the questions are the same. One pencil-and-paper specific strategy is underlining key words or phrases in the test booklet to refer back to. Computer-based math equation editors can be awkward for students to use. I invite teachers to point out more examples in the comments.

Nationwide, the pencil-and-paper version of PARCC was shown to yield higher scores than the computer-based version. However, the current disadvantage of the computer-based test may dissipate as students and teachers become familiar with the format.

It likely will take several computer-based administrations for teachers to fully form new computer-specific strategies and find the best ways to teach them to their students. As students and teachers become more familiar with the format, scores should rise beyond any actual improvement in student learning.

We are scheduled to move to computer-based tests exclusively for grades 5-8 this year. What shifts should we expect in this transition? Should we make different allowances for schools that did not administer the computer-based tests in 2016 than for those that did? What will initially rising proficiency rates really reflect - improved test-taking skills or greater knowledge?

Final questions:

History shows that scores will rise on new tests for reasons other than actual improvements in student learning. How much score rise should be required to show that the reforms are working and that students have actually become smarter and are able to do more things?

How can we ensure that level setting and adjusting is untainted by political considerations?

What safeguards should we put in place to ensure that adjustments to the accountability formulas do not falsely promote improved student achievement?

Should we require LDOE to maintain historical records in easy public view to ensure transparency?

And most importantly, given that there are so many ways the measures can be tweaked to produce continued averaged gains, should we continue with a single summative measure of a school at all?

Herb Bassett, Grayson, LA