Tuesday, August 28, 2012
Dissident educational voices? Testing?
In How Children Fail, John Holt argues that end-of-year achievement tests do not show real learning: the material is forgotten shortly after the tests because learning it was not motivated by interest, nor does it have practical use.
In Deschooling Society, Ivan Illich proposed self-directed education supported by intentional social relations in fluid, informal arrangements. What testing?
In Against School, John Taylor Gatto argues that compulsory schooling takes its conformist bent from the worst aspects of Prussian culture. Testing as Prussian militarism?
In Teaching as a Subversive Activity, Neil Postman holds that inquiry and open-ended questions are best for learning. Testing for what? Data.
Sounds very different from Race to the Top language.
Tuesday, February 28, 2012
Teacher Evaluation?
A simple question teachers should now ask about their profession
Sunday, March 13, 2011
How Bill Gates misinterprets ed facts
By Richard Rothstein
Microsoft Chairman Bill Gates authored an op-ed published in The Washington Post late last month, “How Teacher Development could Revolutionize our Schools,” proposing that American public schools should do a better job of evaluating the effectiveness of teachers, a goal with which none can disagree. But his specific prescriptions, and the urgency he attaches to them, are based on the misrepresentation of one fact, the misinterpretation of another and the demagogic presentation of a third. It is remarkable that someone associated with technology and progress should have such a careless disregard for accuracy when it comes to the education policy in which he is now so deeply involved.
Gates’ most important factual claim is that “over the past four decades, the per-student cost of running our K-12 schools has more than doubled, while our student achievement has remained virtually flat.” And, he adds, “spending has climbed, but our percentage of college graduates has dropped compared with other countries.” Let’s examine these factual claims:
Bill Gates says: “Our student achievement has remained virtually flat.”
The only longitudinal measure of student achievement that is available to Bill Gates or anyone else is the National Assessment of Educational Progress (NAEP). NAEP provides trends for 4th, 8th, and 12th graders, disaggregated by race, ethnicity, and poverty, since about 1980 in basic skills in math and reading (called the “Long Term Trend NAEP”) and since about 1990 for 4th and 8th graders in slightly more sophisticated math and reading skills (called the “Main NAEP”).[*]
On these exams, American students have improved substantially, in some cases phenomenally. In general, the improvements have been greatest for African-American students, and among these, for the most disadvantaged. The improvements have been greatest for both black and white 4th and 8th graders in math. Improvements have been less great but still substantial for black 4th and 8th graders in reading and for black 12th graders in both math and reading. Improvements have been modest for whites in 12th grade math and at all three grade levels in reading.
The following table, not reproduced here (see the original article), summarizes these results for the earliest and most recent years for which disaggregated data were collected.
Bill Gates may think that these improvements are insufficient, and perhaps he is correct. But, as Daniel Patrick Moynihan reportedly quipped, “everyone is entitled to their own opinions, but not to their own facts.” No rational reading of these NAEP data can support Bill Gates’ claim that “student achievement has remained virtually flat” over the last four decades.[†] And, to repeat, no other longitudinal data are available that describe student achievement over time.
These facts also don’t support the story that the typical teacher of disadvantaged children is ineffective. Certainly, some teachers are ineffective, and schools should do a better job of removing them. But that should not, if facts are to be believed, be the main story.
Yet it seems to be. Secretary of Education Arne Duncan recently asserted that “many, if not most, teacher-training programs are mediocre.” This may be true, but how does he know? What is his evidence? It wouldn’t seem that mediocre teacher training programs could consistently be turning out teachers who have posted the kinds of gains we’ve seen on NAEP in the last generation and more.
It is important to investigate why, in the most recent period, typical teachers have been more effective with elementary school children than with high-schoolers, but curiously, the reforms Bill Gates and like-thinking policymakers are pursuing concern elementary school teachers almost exclusively – because the student value-added scores on NCLB-required standardized tests by which they propose to evaluate these teachers are available only for elementary, not secondary school students. It is also important to investigate why teachers have apparently been more effective during most (though not all) of the last few decades in teaching math than reading, but it is difficult to motivate anyone to investigate this if our vision is clouded by the myth that all student achievement has been flat.
Bill Gates says: “The per-student cost of running our K-12 schools has more than doubled.”
Here, Bill Gates is nominally correct, but misleading. When properly adjusted for inflation, K-12 per pupil spending has about doubled over the last four decades, but less than half of this new money has gone to regular education (including compensatory education for disadvantaged children, programs for English-language learners, integration programs like magnet schools, and special schools for dropout recovery and prevention). The biggest single recipient of new money has been special education for children with disabilities. Four decades ago, special education consumed less than 4% of all K-12 spending. It now consumes 21%.[‡]
Detailed tables documenting these trends are available at http://epi.3cdn.net/1726cc68ca1a71563a_o3m6bhrub.pdf.
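As a stylized illustration of that arithmetic (round, made-up numbers; the actual figures are in the linked tables), consider what happens when real per-pupil spending doubles while special education's share of the total rises from 4% to 21%:

then_total, now_total = 5_000.0, 10_000.0       # illustrative real dollars per pupil
sped_share_then, sped_share_now = 0.04, 0.21    # special education's share of spending

sped_then = then_total * sped_share_then        # $200 per pupil then
sped_now = now_total * sped_share_now           # $2,100 per pupil now

new_money = now_total - then_total              # $5,000 per pupil in new spending
sped_new_money = sped_now - sped_then           # $1,900 of it went to special education

print(f"Share of new money absorbed by special education alone: "
      f"{sped_new_money / new_money:.0%}")      # -> 38%

On these illustrative numbers, special education alone absorbs nearly two-fifths of the new money before any other non-regular category is counted, which is why a doubling of total spending overstates what regular education received.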
American public education can boast of remarkable accomplishments in special education over this period. Many young people can now function in society whereas, in the past, children with similar disabilities were institutionalized and discarded. But it is not reasonable to complain about the increase in spending on such children by insisting that it should have produced greater improvement in the achievement of regular children.
The increase in regular education spending has still been substantial, even if not nearly as great as Bill Gates implies. Should this spending increase have produced even greater improvement in achievement than has in fact occurred? This is a more difficult judgment to make. But in light of the actual achievement improvements documented by NAEP, it is not reasonable to jump to the facile conclusion of a productivity collapse in K-12 education. A more reasonable story is that spending has increased and achievement has increased as well. Perhaps we have gotten what we paid for.
Bill Gates says: “Spending has climbed, but our percentage of college graduates has dropped compared with other countries.”
This is the Bill Gates claim that can properly be called demagogic. It attempts to agitate readers by presenting a positive development in a negative light. A climb in spending should produce an increase in the percentage of college graduates. And it has. (http://nces.ed.gov/programs/digest/d09/tables/dt09_008.asp?referrer=list) In the last four decades, the percentage of college graduates in the United States has nearly doubled. In 1970, 16% of young adults (ages 25 to 29) were college graduates. Today, it is 31%. The improvement has been across the board: the share of African-American young adults who are college graduates has gone from 10% to 19%; for whites it has gone from 17% to 37%. Somehow, Bill Gates saw fit to present this as an indictment.
Should our college graduation rate be rising faster? Of course, that would be a good thing. Should the spending increases we have experienced have generated a faster increase in college graduation than, in fact, they have? That would be worth exploring, but Bill Gates’ phrasing suggests to the less-than-careful reader that spending increases haven’t been productive at all, because our college graduation rate has “dropped…” Would a faster increase require even greater increases in spending? That is also likely, but it is not a conclusion that Bill Gates intends to suggest.
It is commonplace to imply, as Bill Gates does in his Washington Post op-ed, that our failure to increase our college graduation rate “compared with other countries” will prevent us from “build[ing] a dynamic 21st-century economy.” Certainly, we need a sufficient number of well-trained college graduates for such an economy, but there is no reason to believe that a graduate rate in excess of 30% is too small for this purpose, or that economic dynamism can, after reaching sufficiency, increase linearly with increases in the share of young people who graduate from college. The threats to a dynamic 21st century economy are likely to come from a failure of macroeconomic policy, regulation of speculation, and investment in education, not from inefficiency in the investment we already make.
We only need to examine the list of international college graduation rates to see the absurdity of efforts to make a direct link between college graduation rates and economic success. The Organization for Economic Cooperation and Development (OECD) publishes comparative data.
One country that outranks the United States in college graduation rates is Ireland, whose economy has now collapsed because its regulation of the real estate bubble was even more careless and corrupt than ours. Another is Portugal, whose economic health is also worse than that of the United States.
Of course there are also nations on the list that are not on the verge of bankruptcy, but the chief lesson of the list is this: provided a nation has a sufficient number of college graduates for a dynamic economy, rankings above that point are irrelevant. Of course we should increase our college graduation rate, and there are many civic and cultural reasons to do so, even if we may already produce (as some analyses suggest) an apparent surplus, for economic purposes, of science, technology, engineering, and math graduates.
Education is complex, and the relationship between education and the economy even more so. Our ability to grapple with the challenges these present is not enhanced by factually inaccurate and hyperventilated appeals from those who should know better.
[*] In theory, the Long Term Trend (LTT) is distinguished from the Main Assessment because the LTT assesses the same skills, whereas the Main Assessment changes over time, as the curriculum changes. But in fact, the LTT also changes somewhat over time, and the Main Assessment is sufficiently stable to make longitudinal comparisons.
[†] If the data are further disaggregated by decade, there have been some interim periods of flatness within the overall growth. For example, gains were strongest for black elementary students in the LTT in the 1980s and 2000s, and flat in the 1990s, but on the Main Assessment they showed strong gains in the 1990s as well. Twelfth grade LTT reading scores have been mostly flat since 1990, after a dramatic leap of 24 scale points for blacks in the 1980s. Fourth grade LTT reading scores fell for blacks in the 1980s, but rebounded in the 1990s and jumped even more strongly in the 2000s.
Sunday, March 6, 2011
Regents responsibility?
Educational responsibility
The Buffalo News article, "Several city schools face 'radical intervention'" by Mary B. Pasciak (January 30, 2011), and the online comments raise the question of who is responsible for education. I am amazed that Dr. Steiner, the Board of Regents, and the State Education Department can distance themselves from the present state of education in urban areas. It is ironic that there seems to be amnesia when it comes to the last decade of “No Child Left Behind”. Wasn’t it the State that both backed and promoted the idea that the strongman of accountability would usher in a gilded age for education? Wasn’t it the State that set the parameters and standards and doled out the money for what should happen in education? Why is the State hierarchy so quick to blame unions, superintendents, administrators, schools, teachers, and even parents? Why does there seem to be such a lack of self-reflection and humility on their part? Could we not hear at least an attempt to answer why the State initiatives of the last ten years have evidently failed so miserably? Again, why are we so gullible as to believe the story that leaves the State leaders without responsibility? Most of all, why in heaven’s name do we believe that these present “radical interventions” will impact education positively?
Robert Tyrrell
New York State Department of Ed Responsibility
Lies, more lies and statistics
According to a memo from NYSED dated July 28, 2010 (http://www.oms.nysed.gov/press/Grade3-8_Results07282010.html), “cut scores for the state’s 2010 Grade 3-8 assessments in Math and English tests were set according to new Proficiency standards redefined to align them with college-ready performance”. My first thoughts went to the many speakers I heard over the last decade on educational testing and standards. How many times did we hear that the “new standards” were set so that students would be prepared for college (http://www.eagleforum.org/educate/2001/june01/standards.shtml)? How odd, then, this statement from the memo: “As a result of raising the bar for what it means to be proficient, many fewer students met or exceeded the new Mathematics and English Proficiency standards in 2010 than in previous years”. Wasn’t the bar supposed to be set there all along?
There are at least two logical inferences from this last statement. First, the bar can be moved. Doesn’t that empty the word “standard” of any meaning? Don’t the standards set the bar? Isn’t moving the bar a deceitful act, a lie? Second, many students in past years were not really as proficient as advertised. Let me quote the memo again: “We are doing a great disservice when we say that a child is proficient when that child is not. Nowhere is this more true than among our students who are most in need. There, the failure to drill down and develop accurate assessments creates a burden that falls disproportionately on English Language Learners, students with disabilities, African-American and Hispanic young people and students in economically disadvantaged districts.” But doesn’t the accuracy of the assessments in this instance rest on where the bar is placed? Doesn’t “failure to develop accurate assessments” really mean that NYSED failed to put the bar in the right place? Is this educational language an attempt to deflect that responsibility?
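To make the bar-moving point concrete, here is a minimal sketch with hypothetical scale scores and cut points (not NYSED's actual values): the same students, with the same answers, become less "proficient" solely because the cut score moved.

# Hypothetical scale scores for ten students (not actual NYSED data).
scale_scores = [612, 628, 634, 641, 650, 655, 662, 670, 684, 701]

def proficiency_rate(scores, cut_score):
    """Fraction of students scoring at or above the proficiency cut score."""
    return sum(s >= cut_score for s in scores) / len(scores)

old_cut, new_cut = 630, 655   # hypothetical old cut and new "college-ready" cut

print(f"Proficient under old cut: {proficiency_rate(scale_scores, old_cut):.0%}")  # 80%
print(f"Proficient under new cut: {proficiency_rate(scale_scores, new_cut):.0%}")  # 50%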
More lies are implicit in Chancellor Merryl H. Tisch’s statements: “The Regents and I believe these results can be a powerful tool for change. They clearly identify where we need to do more and provide real accountability to bring about the focused attention needed to implement the necessary reforms to help all of our children catch up and succeed”. If the movement of the bar created lower proficiency rates, then how could knowing the results be a powerful tool for change? How does knowing that we are not doing as well as we thought help us “clearly identify” anything, let alone “implement necessary reforms”? The results of these tests simply raise the question: what are the necessary reforms? A question I believe Outcome-based Education has been discussing for the better part of a decade.
The writing of these thoughts brings to mind three questions. First, what has prompted all this activity at NYSED? Second, what have been the responses of the educational community? Third, where does all this leave a parent, a school, a teacher, a school district? In this last case, I would submit, it leaves them in search of statistics that really mean something. I appeal to the reader to use this blog to discuss these questions and more.
http://www.emsc.nysed.gov/irts/ela-math/steps-to-determine-rawscale-score.html
Tuesday, September 7, 2010
Teacher Evaluation
Formula to Grade Teachers’ Skill Gains Acceptance, and Critics
By SAM DILLON
How good is one teacher compared with another?
A growing number of school districts have adopted a system called value-added modeling to answer that question, provoking battles from Washington to Los Angeles — with some saying it is an effective method for increasing teacher accountability, and others arguing that it can give an inaccurate picture of teachers’ work.
The system calculates the value teachers add to their students’ achievement, based on changes in test scores from year to year and how the students perform compared with others in their grade.

People who analyze the data, making a few statistical assumptions, can produce a list ranking teachers from best to worst.
Use of value-added modeling is exploding nationwide. Hundreds of school systems, including those in Chicago, New York and Washington, are already using it to measure the performance of schools or teachers. Many more are expected to join them, partly because the Obama administration has prodded states and districts to develop more effective teacher-evaluation systems than traditional classroom observation by administrators.
Though the value-added method is often used to help educators improve their classroom teaching, it has also been a factor in deciding who receives bonuses, how much they are and even who gets fired.

Michelle A. Rhee, the schools chancellor in Washington, fired about 25 teachers this summer after they rated poorly in evaluations based in part on a value-added analysis of scores.
And 6,000 elementary school teachers in Los Angeles have found themselves under scrutiny this summer after The Los Angeles Times published a series of articles about their performance, including a searchable database on its Web site that rates them from least effective to most effective. The teachers’ union has protested, urging a boycott of the paper.

Education Secretary Arne Duncan weighed in to support the newspaper’s work, calling it an exercise in healthy transparency. In a speech last week, though, he qualified that support, noting that he had never released to news media similar information on teachers when he was the Chicago schools superintendent.
“There are real issues and competing priorities and values that we must work through together — balancing transparency, privacy, fairness and respect for teachers,” Mr. Duncan said. On The Los Angeles Times’s publication of the teacher data, he added, “I don’t advocate that approach for other districts.”

A report released this month by several education researchers warned that the value-added methodology can be unreliable.
“If these teachers were measured in a different year, or a different model were used, the rankings might bounce around quite a bit,” said Edward Haertel, a Stanford professor who was a co-author of the report. “People are going to treat these scores as if they were reflections on the effectiveness of the teachers without any appreciation of how unstable they are.”
Other experts disagree.
William L. Sanders, a senior research manager for a North Carolina company, SAS, that does value-added estimates for districts in North Carolina, Tennessee and other states, said that “if you use rigorous, robust methods and surround them with safeguards, you can reliably distinguish highly effective teachers from average teachers and from ineffective teachers.”

Dr. Sanders helped develop value-added methods to evaluate teachers in Tennessee in the 1990s. Their use spread after the 2002 No Child Left Behind law required states to test in third to eighth grades every year, giving school districts mountains of test data that are the raw material for value-added analysis.
In value-added modeling, researchers use students’ scores on state tests administered at the end of third grade, for instance, to predict how they are likely to score on state tests at the end of fourth grade.

A student whose third-grade scores were higher than 60 percent of peers statewide is predicted to score higher than 60 percent of fourth graders a year later.
If, when actually taking the state tests at the end of fourth grade, the student scores higher than 70 percent of fourth graders, the leap in achievement represents the value the fourth-grade teacher added.
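As a minimal sketch of that percentile logic (made-up score distributions; real value-added models are regression-based, with the complications discussed below), the calculation might look like this in Python:

from bisect import bisect_left
import random

# Made-up statewide score distributions, for illustration only.
random.seed(0)
statewide_3rd = sorted(random.gauss(650, 40) for _ in range(10_000))
statewide_4th = sorted(random.gauss(660, 40) for _ in range(10_000))

def percentile(score, ranked_scores):
    """Percent of statewide scores falling below this score."""
    return 100.0 * bisect_left(ranked_scores, score) / len(ranked_scores)

# The student's third-grade percentile is the naive prediction for fourth grade.
third_score, fourth_score = 660, 685
predicted = percentile(third_score, statewide_3rd)   # roughly the 60th percentile
actual = percentile(fourth_score, statewide_4th)     # roughly the 70th percentile

# The gap between actual and predicted percentile is the "value added" credited
# to the fourth-grade teacher; in practice it is averaged over a whole class.
print(f"predicted {predicted:.0f}th, actual {actual:.0f}th, "
      f"value added {actual - predicted:+.0f} percentile points")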
Even critics acknowledge that the method can be more accurate for rating schools than the system now required by federal law, which compares test scores of succeeding classes, for instance this year’s fifth graders with last year’s fifth graders.

But when the method is used to evaluate individual teachers, many factors can lead to inaccuracies. Different people crunching the numbers can get different results, said Douglas N. Harris, an education professor at the University of Wisconsin, Madison. For example, two analysts might rank teachers in a district differently if one analyst took into account certain student characteristics, like which students were eligible for free lunch, and the other did not.
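A toy calculation makes the point (every number here is fabricated, including the size of the poverty adjustment): two defensible specifications can order the same teachers differently.

# (observed mean test-score gain, share of students eligible for free lunch)
teachers = {
    "Alvarez": (4.0, 0.85),   # modest raw gains, high-poverty classroom
    "Brown":   (5.0, 0.20),   # higher raw gains, affluent classroom
    "Chen":    (4.5, 0.50),
}

POVERTY_DRAG = 3.0  # assumed gain lost per unit of free-lunch share (fabricated)

by_raw = sorted(teachers, key=lambda t: teachers[t][0], reverse=True)
by_adjusted = sorted(teachers,
                     key=lambda t: teachers[t][0] + POVERTY_DRAG * teachers[t][1],
                     reverse=True)

print("Analyst 1, no adjustment:      ", by_raw)        # Brown, Chen, Alvarez
print("Analyst 2, free-lunch adjusted:", by_adjusted)   # Alvarez, Chen, Brown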
Millions of students change classes or schools each year, so teachers can be evaluated on the performance of students they have taught only briefly, after students’ records were linked to them in the fall.

In many schools, students receive instruction from multiple teachers, or from after-school tutors, making it difficult to attribute learning gains to a specific instructor. Another problem is known as the ceiling effect. Advanced students can score so highly one year that standardized state tests are not sensitive enough to measure their learning gains a year later.
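The ceiling effect is easy to see in miniature (hypothetical numbers): once a test tops out, a strong student's real growth cannot register as a measured gain.

TEST_MAX = 800                        # hypothetical top of the test's scale

def measured(true_ability):
    """A test cannot report a score above its ceiling."""
    return min(true_ability, TEST_MAX)

year1_true, year2_true = 790, 840     # the student really grew by 50 points
gain = measured(year2_true) - measured(year1_true)

print(f"True growth: {year2_true - year1_true}, measured gain: {gain}")  # 50 vs. 10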
In Houston, a district that uses value-added methods to allocate teacher bonuses, Darilyn Krieger said she had seen the ceiling effect as a physics teacher at Carnegie Vanguard High School.
“My kids come in at a very high level of competence,” Ms. Krieger said.
After she teaches them for a year, most score highly on a state science test but show little gain, so her bonus is often small compared with those of other teachers, she said.
The Houston Chronicle reports teacher bonuses each year in a database, and readers view the size of the bonus as an indicator of teacher effectiveness, Ms. Krieger said.
“I have students in class ask me why I didn’t earn a higher bonus,” Ms. Krieger said. “I say: ‘Because the system decided I wasn’t doing a good enough job. But the system is flawed.’ ”
This year, the federal Department of Education’s own research arm warned in a study that value-added estimates “are subject to a considerable degree of random error.”
And last October, the Board on Testing and Assessments of the National Academies, a panel of 13 researchers led by Dr. Haertel, wrote to Mr. Duncan warning of “significant concerns” that the Race to the Top grant competition was placing “too much emphasis on measures of growth in student achievement that have not yet been adequately studied for the purposes of evaluating teachers and principals.”

“Value-added methodologies should be used only after careful consideration of their appropriateness for the data that are available, and if used, should be subjected to rigorous evaluation,” the panel wrote. “At present, the best use of VAM techniques is in closely studied pilot projects.”
Despite those warnings, the Department of Education made states with laws prohibiting linkages between student data and teachers ineligible to compete in Race to the Top, and it designed its scoring system to reward states that use value-added calculations in teacher evaluations.

“I’m uncomfortable with how fast a number of states are moving to develop teacher-evaluation systems that will make important decisions about teachers based on value-added results,” said Robert L. Linn, a testing expert who is an emeritus professor at the University of Colorado, Boulder.
“They haven’t taken caution into account as much as they need to,” Professor Linn said.