What the research says on teacher evaluations

Editor’s note: In 2010, New York state passed a research-based law establishing a new process for teacher evaluations. New York State United Teachers supported the new law then and supports it now.

After the law was passed, the State Education Department decreed that it would be OK to use a single state standardized test to measure student progress. We objected – that wasn’t what the law said – and the State Supreme Court agreed.

We are now in talks with state education officials to resolve the details on measuring student progress in a way that honors the law.

Despite this delay, numerous districts around the state have already put evaluation plans in place.


Summary of the new evaluation requirements

New York state’s comprehensive, rigorous new process for teacher evaluations, established in a law enacted in 2010, is dedicated to improving student learning by advancing teacher effectiveness. It requires teachers to be evaluated annually according to the state’s seven research-based teaching standards and the highly specific performance rubrics approved by the State Education Department. To meet these demanding criteria, evidence of teacher effectiveness must include multiple sources, not just student test scores.

This process of teacher evaluation calls for continual improvement for all teachers; requires a detailed Teacher Improvement Plan for those who are ineffective or developing; and expedites removal of those who, despite support and training, do not improve.

A key strength of New York state’s evaluation process is that it results from collaboration among policymakers, teacher leaders and other stakeholders. That collaboration was credited with New York’s success at garnering $700 million in federal Race to the Top funds.

Research says evaluations should be standards-based to spur gains in student achievement.

The law requires New York state’s teacher evaluations to be based 20 percent on student growth on state tests; 20 percent on student achievement on other locally selected measures; and 60 percent on principal observations and other evidence of teacher effectiveness. Rigor is ensured through a process that requires adherence to New York State Teaching Standards; to highly detailed rubrics of teacher performance; and to exacting state standards for measuring student learning. These uniform standards will serve as the basis of teacher evaluations, with the State Education Department responsible for reviewing and approving specific rubrics for district use. These job performance criteria and new, detailed standards for teacher evaluators are designed to eliminate the subjectivity that too often in the past resulted in evaluations that were cursory, haphazard and unrelated to improving teacher effectiveness (Weisberg, 2009). Research shows a significant relationship between the use of standards-based evaluations and gains in student achievement (Rockoff, 2010).
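As a rough illustration of the 20/20/60 weighting described above, the sketch below combines three component scores into one composite. Only the percentages come from the text. The component names, the 0-100 scale for each subscore, and the function itself are assumptions made for illustration; they are not the State Education Department's scoring method.

```python
# Illustrative sketch only. The 20/20/60 weights come from the article;
# the component names and the 0-100 subscore scale are assumptions.
WEIGHTS_PERCENT = {
    "state_test_growth": 20,        # student growth on state tests
    "local_achievement": 20,        # other locally selected measures
    "observations_and_other": 60,   # principal observations and other evidence
}


def composite_score(subscores):
    """Return the weighted composite of three subscores, each on a 0-100 scale."""
    missing = set(WEIGHTS_PERCENT) - set(subscores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS_PERCENT:
        if not 0 <= subscores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    # Integer weights are summed first, then divided by 100.
    weighted_total = sum(WEIGHTS_PERCENT[n] * subscores[n] for n in WEIGHTS_PERCENT)
    return weighted_total / 100


# Hypothetical teacher: 70 on state growth, 80 on local measures, 90 on observations.
example = {
    "state_test_growth": 70,
    "local_achievement": 80,
    "observations_and_other": 90,
}
print(composite_score(example))  # 84.0
```

In this hypothetical case the two student-learning components together contribute 30 of the 84 points, which shows why no single test score can determine the overall rating under this weighting.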

The comprehensive nature of New York state’s new evaluation process is a fundamental component of both its rigor and objectivity. Each of New York state’s seven research-based teaching standards addresses an element of professional practice; taken together, they comprise the knowledge and expertise required of effective teachers. Just as no single test or homework assignment is reliable as a comprehensive measure of student learning, so too is it necessary to require multiple measures of teacher performance – to evaluate teachers on the full range of professional job requirements, to document district needs and conditions, and to ensure objectivity and reliability in evaluations (Almy, 2011; Goe, 2010).

Research says that teacher evaluations should incorporate information on student progress – including, but not limited to, standardized tests. 

As the American Federation of Teachers notes, “Student learning is at the heart of the teaching profession and must be included in any credible teacher evaluation” (AFT, 2010). New York state’s evaluation system appropriately incorporates student learning as a component in a composite score of teacher effectiveness. New York state now requires fully 40 percent of a teacher’s evaluation to be based on evidence of student growth and achievement. In establishing this requirement, New York state law calls for the use of multiple measures of student learning, reflecting the substantial body of research showing that a single snapshot of student learning, such as how students score on a standardized state test, is neither a reliable indicator of teacher effectiveness nor a reliable tool to improve instruction. Researchers at the Economic Policy Institute (EPI) and the Rand Corporation have independently documented the inherent problems in over-reliance on student test scores to gauge teacher effectiveness (McCaffrey, 2003; Baker, 2010). There is “broad agreement among statisticians, psychometricians and economists that student test scores alone are not sufficiently reliable and valid indicators of teacher effectiveness” (Baker, 2010). Misuse of test scores would not only jeopardize some of the state’s most effective teachers; it could also fail to identify teachers who fall short. The law appropriately recognizes this in requiring multiple measures of student progress.

Research says effective teaching is best evaluated in a variety of ways.

Too often in the past, teacher evaluations were given short shrift, amounting to no more than cursory “drive-by” evaluations by an untrained administrator using vague or subjective standards for assessment. By requiring multiple measures of teacher effectiveness – principal classroom observations conducted with detailed job performance rubrics; evidence of teacher growth in content knowledge and teaching expertise; student growth in learning; evidence of teacher engagement with parents and professional colleagues – New York state can measure the full range of knowledge and expertise required of educators. Evaluating expertise through multiple measures yields more accurate results (Patton, 1999) and is long-established practice in both the public and private sectors.

A significant body of research documents the range of strengths and degree of mastery required for effective and master teachers (Bond et al., 2000).

Research correlates gains in student achievement and proficiency to teacher evaluation systems that are comprehensive; focused on teacher effectiveness and student learning; and linked to continual professional development (Little, 2009). These are the underpinnings of New York state’s evaluation process.

Research stresses that evaluations should be part of a system that supports all teachers in continually working to improve.

Classroom observations, based on state teaching standards and specific job performance rubrics, constitute a significant required component of teacher evaluations. In addition, teachers are required to document numerous elements of their professional practice, because not every state teaching standard can be measured through in-class observation. An effective teacher is expected to demonstrate quality planning (Standard 2) that tailors instruction to individual student needs. Similarly, the state’s “Professional Growth Standard 7” requires teachers to provide evidence of their continued professional growth in areas such as content knowledge and pedagogy (New York State Board of Regents, 2011).

New York state’s seven research-based teaching standards, and the highly detailed job performance rubrics that derive from them, reflect the depth and breadth of a profession that requires content knowledge, pedagogical expertise and continual learning. Research reviewing more than 1,300 studies (Yoon, 2007) found that teachers who received substantial professional development increased student achievement by 21 percentile points. Integrating teacher evaluations with ongoing professional development improves instruction substantially (Donaldson, 2010).

Classroom observation alone cannot document a teacher’s progress in professional development and post-graduate education. The state’s new evaluation process appropriately requires this to be an evidence-based measure tied not only to state requirements, but also to district requirements (Koppich, 2008). State standards and job performance rubrics establish a foundation for teachers’ post-graduate education, requiring all teachers to succeed in and document post-graduate education on an ongoing basis throughout their careers (New York State Board of Regents, 2011). Many districts go significantly beyond that foundation in setting their own requirements (Heneman, 2006). These expectations can now be tailored to both state and district requirements and incorporated into teacher evaluations.

Because New York state took the time to do this right – basing its new evaluation process on research, phasing in implementation, and involving all stakeholders in its development to ensure buy-in – it is positioned to avoid the implementation disasters experienced by states such as Tennessee that have rushed to implement and now see their evaluations imploding (Winerip, 2011).

BIBLIOGRAPHY

Almy, Sarah. Fair to Everyone: Building the Balanced Teacher Evaluations that Educators and Students Deserve. 2011. The Education Trust. Retrieved from http://www.edtrust.org/dc/publication/fair-to-everyone-building-the-balanced-teacher-evaluations-that-educators-and-student November 2011.

AFT, A Continuous Improvement Model for Teacher Development and Evaluation. Working paper, January, 2010. American Federation of Teachers. Retrieved from http://www.aft.org/pdfs/teachers/improvemodelwhitepaper011210.pdf November 2011.

Baker, Eva, et al. Problems With the Use of Student Test Scores to Evaluate Teachers. Economic Policy Institute Briefing Paper #278. 2010. Economic Policy Institute. Retrieved from http://www.epi.org/publication/bp278/ November 2011.

Bond, Lloyd et al. (Richard Jaeger, Tracy Smith, & John Hattie). The Certification of the National Board for Professional Teaching Standards: A Construct and Consequential Validity Study. 2001. Education Matters. Retrieved from http://educationnext.org/defrocking-the-national-board/ November 2011.

Donaldson, M. L. “No More Valentines. The Key to Changing the Teaching Profession.” Educational Leadership, Volume 67, Number 8, May 2010, pp. 54-58. ASCD.

Dillon, Sam. “Eastern States Dominate in Winning School Grants.” New York Times. 24 Aug. 2010: Print

Goe, Laura. Multiple Measures of Teacher Effectiveness. December 3, 2010. National Comprehensive Center for Teacher Quality. Retrieved from http://www.tqsource.org/publications/practicalGuide.pdf. November 2011.

Heneman, H. G., III, Milanowski, A., Kimball, S. M., & Odden, A. Standards-based teacher evaluation as a foundation for knowledge- and skill-based pay (CPRE Policy Brief No. RB-45). 2006. Philadelphia: Consortium for Policy Research in Education. Retrieved from http://www.cpre.org/images/stories/cpre_pdfs/RB45.pdf November 2011.

Herman, J., Heritage, M. and Goldschmidt, P. Guidance for Developing and Selecting Assessments of Student Growth for Use in Teacher Evaluation Systems. Assessment and Accountability Comprehensive Center. CRESST. 2011.

Jackson, C. K. & Bruegmann, E. Teaching students and teaching each other: The importance of peer learning for teachers. 2009. Cornell University, School of Industrial and Labor Relations. Retrieved from http://digitalcommons.ilr.cornell.edu/workingpapers/77/ November 2011.

Johnson, S.M., Papay, J.P. Is PAR a Good Investment? Understanding the Costs and Benefits of Teacher Peer Assistance and Review Programs. Harvard Graduate School of Education Project on the Next Generation of Teachers. 2011. (NGT Working Paper). Retrieved from http://www.gse.harvard.edu/~ngt/PAR%20Costs%20and%20Benefits%20-%20January%202011.pdf November 2011.

Kane, T., Taylor, E., Tyler, J., Wooten, A. Identifying Effective Classroom Practices Using Student Achievement Data Working Paper 15803. 2010. National Bureau of Economic Research Cambridge, MA. Retrieved from http://www.nber.org/papers/w15803 November 2011.

Koppich, J and C Showalter. Strategic Management of Human Capital – Cross-Case Analysis. Philadelphia: Consortium for Policy Research in Education. 2008.

Leana, C. The Missing Link in School Reform. Stanford Social Innovation Review Fall 2011. Retrieved from http://www.ssireview.org/articles/entry/the_missing_link_in_school_reform November 2011.

Linman, T. 360-degree Feedback: Weighing the Pros and Cons. 2004. College of Education, San Diego State University. Retrieved from http://edweb.sdsu.edu/people/ARossett/pie/Interventions/360_1.htm November 2011.

Little, O. Teacher Evaluation Systems: The Window for Opportunity and Reform. National Education Association. 2009. Retrieved from http://forum.mdischools.net/sites/default/files/forum.mdischools.net/2009_NEA_teacherevaluationsystems.pdf November 2011.

Little, O., Goe, L., and Bell, C. A practical guide to evaluating teacher effectiveness. 2009. National Comprehensive Center for Teacher Quality. Retrieved from http://www.tqsource.org/publications/practicalGuide.pdf November 2011.

McCaffrey, Daniel F. et al. Evaluating Value-Added Models for Teacher Accountability. 2004. Rand Corporation. Retrieved from http://www.rand.org/pubs/monographs/2004/RAND_MG158.pdf November 2011.

Newmann, F. and Wehlage, Gary G. Successful School Restructuring: A Report to the Public and Educators. 1995. The Center on Organization and Restructuring of Schools. Wisconsin Center for Education Research, Madison, WI. Retrieved from http://www.wcer.wisc.edu/archive/cors/Successful_School_Restruct.html November 2011.

NYS Board of Regents, NYS Teaching Standards: September 12, 2011. NYS Education Dept. http://www.highered.nysed.gov/tcert/pdf/teachingstandards9122011.pdf Retrieved November 2011.

NYS Board of Regents, Teaching Standards. NYS Education Dept. January 2011. http://www.regents.nysed.gov/meetings/2011Meetings/January2011/111hed3.pdf Retrieved November 2011.

Odden, A., Kelley, C., Heneman, H., III, & Milanowski, A. Enhancing teacher quality through knowledge- and skills-based pay (CPRE Policy Brief No. RB-34). Philadelphia: University of Pennsylvania, Consortium for Policy Research in Education. 2001.

Patton, M.Q. Enhancing the quality and credibility of qualitative analysis. December 1999. Health Services Research. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1089059/ November 2011.

Rockoff, J and Cecilia Speroni. Subjective and Objective Evaluations of Teacher Effectiveness. New York: Columbia University. 2010.

Weisberg, Daniel, Sexton, Susan, Mulhern, Jennifer, Keeling, David. The Widget Effect: Our National Failure to Acknowledge and Act on Differences in Teacher Effectiveness. 2009. The New Teacher Project. Retrieved from http://widgeteffect.org/downloads/TheWidgetEffect.pdf November 2011.

Winerip, Michael. “In Tennessee, Following Rules for Evaluations Off a Cliff.” New York Times. 6 Nov. 2011: Print

Yoon, K. S., Duncan, T., Lee, S. W.-Y., Scarloss, B., & Shapley, K. Reviewing the evidence on how teacher professional development affects student achievement. 2007. Issues & Answers Report. REL 2007–No. 033. Regional Educational Laboratory at Edvance Research, Inc. Retrieved from http://ies.ed.gov/ncee/edlabs/regions/southwest/pdf/rel_2007033.pdf November 2011.