Evaluating Teacher Effectiveness -- Research Summary

Terry Doyle: doylet@ferris.edu

Center for Teaching & Learning, Ferris State University

The intent of this paper is to inform the Ferris State academic community about the research findings on the use and value of teacher effectiveness evaluation tools and to suggest, based on that research, what an effective evaluation process might look like at Ferris. This includes attempting to define the areas of instruction in which students are qualified to give meaningful feedback to faculty and those in which they are not.

This paper is the result of a request made by former Vice President for Academic Affairs Barbara Chapman. In her cover letter (dated October 30, 2002) addressing the revised Post-Tenure Review Policies and Procedures, Dr. Chapman charged the Center for Teaching, Learning and Faculty Development (the Center) with offering training for faculty on how to meaningfully use and interpret the quantitative results of student evaluations of instruction and asked that the training program be instructive to both faculty and administration.

Introduction

The collection of student ratings is not the only way or the best way but rather one way to evaluate instruction. Professionals in the field of teacher evaluation advocate a multiple-source and multiple-method approach to evaluating teaching effectiveness. The collection of student ratings should be combined with data collected from different sources using various methods such as peer review, teaching portfolios, classroom observations, or self-evaluation (Ory, 2001).

The use of students’ ratings for evaluating teacher effectiveness is the single most researched issue in all of higher education. Over 2000 articles and books have been written on this topic over the past 70 years (Ory, 2001). Because this issue is especially important at a teaching and learning institution like Ferris, the Center has engaged in an extensive review of this topic and is distributing its findings to all faculty and academic administrators. It is the first time the Center has chosen to address an issue campus-wide.
Key Finding

The one issue that is absolutely clear from the research is that the effective use of any teacher rating system is directly tied to universities taking four actions (Centra, 1993; Marsh, 1987; Murray, 1994):

· Educating those who will interpret the results on how to do so effectively and fairly
· Educating students on how to give precise and meaningful feedback to faculty and defining the key vocabulary words of a ratings instrument
· Assisting faculty in their understanding of the benefits and limitations of teacher effectiveness ratings
· Clarifying for faculty and administrators the purposes for which the ratings will be used by the university

How to Look at the Research

Perhaps the best way to benefit from the information in this paper is, first, to accept that there is not now, nor is there likely to be at any time in the future, complete agreement on the effectiveness of using student ratings as a means of evaluating teaching effectiveness. The second is to accept the overall findings of the 70 years of research, which indicate the value student ratings can have for teachers. These overall findings are summarized by William Cashin in his meta-analysis of the research in 1995: “In general, student ratings tend to be statistically reliable, valid and relatively free from bias or need for control; probably more so than any other data used for evaluation” (IDEA Paper No. 32).

Researchers are not likely ever to offer a set of clear and indisputable conclusions as to the best ways to evaluate teaching. In addition, they are unlikely to guarantee that the variables that may influence students’ assessment of teaching effectiveness can be controlled completely. The rating forms and the feedback they give, however, are like many other aspects of teaching that Derek Bok, former president of Harvard, eloquently addressed in his book Beyond the Ivory Tower when he asked the question “Can we always prove that what we do is effective?
In candor, we cannot answer with certainty. But certainty has never been the criterion for educational decisions. Every professor knows that much of the material conveyed in the classroom will soon be forgotten. The willingness to continue teaching must always rest upon an act of faith that students will retain a useful conceptual framework, a helpful approach to the subject, a valuable method of analysis or some other intangible residue of lasting intellectual value.”

Definition of Teaching Effectiveness as an “Act of Faith”

The most accepted criterion for measuring good teaching is the amount of student learning that occurs. There are consistently high correlations between students’ ratings of the “amount learned” in the course and their overall ratings of the teacher and the course. Those who learned more gave their teachers higher ratings (Cohen, 1981; Theall and Franklin, 2001). This same criterion was also put forth by Thomas Angelo, co-author of Classroom Assessment Techniques, when he said “teaching in the absence of learning is just talking.” A teacher’s effectiveness is again about student learning.

However, all teachers realize that what a student learns is not always within the teacher’s control. The literature on teaching is crammed full of well-researched ways that teachers can present content and skills that will enhance the opportunities for students to learn. It is equally filled with suggestions of what not to do in the classroom. However, there is no rule book on which teaching methods match up best to which skills or content being taught. Students often have little expertise in knowing whether the method selected by an individual instructor was the best teaching method, just “a method,” or simply the method with which the teacher was most comfortable.
Teachers also have limited control over many of the most important factors that affect students’ learning, including students’ attitudes, background knowledge of the course content, study and learning skills, the time students will spend on their learning, their emotional readiness to learn, and so on. Since there is clearly a shared responsibility between the teacher and the student as to what that student learns, and because many students are able to learn in spite of the teacher, while others fail despite all of the best efforts of a skilled practitioner, the definition of “teacher effectiveness” appears to be, as Derek Bok put it, “an act of faith” on the part of students and teachers to do their best.

Research Findings on Student Ratings of Teacher Effectiveness

Research indicates that instructors benefit most from formative evaluation (evaluation meant to improve teaching) when they have helped to shape the questions posed, when they understand the feedback that is provided, and when assistance and resources for making improvements are available (Gaubatz, 2000).

If student ratings are one of several methods used to evaluate a faculty member’s teaching (others include peer review, alumni ratings, self-rating, teaching portfolio and informal classroom assessment), the research strongly indicates that few faculty object to their use (Braskamp and Ory, 1994; Centra, 1993a; Doyle, 1983; Seldin, 1999).

Research indicates that students are the most qualified sources to report on the extent to which the learning experience was productive, informative, satisfying, or worthwhile. While opinions on these matters are not direct measures of instructor or course effectiveness, they are legitimate indicators of student satisfaction, and there is substantial research linking student satisfaction to effective teaching (Theall and Franklin, 2001).
A meta-analysis of 41 research studies provides the strongest evidence for the validity of student ratings since these studies investigated the relationship between student ratings and student learning. There are consistently high correlations between students’ ratings of the “amount learned” in the course and their overall ratings of the teacher and course (Marsh, 1982; Gaubatz, 2000).

Research on student evaluation of teaching generally concludes that student ratings tend to be reliable, valid, relatively unbiased and useful (Murray, 1994):

· Evaluations are generally consistent across raters, rating forms, courses and time periods for a given semester
· They correlate moderately to highly with evaluations made of the same instructor by independent observers
· They correlate significantly with various objective indicators of student performance, such as performance on standardized exams
· There are low correlations with extraneous factors such as class size, severity of grading, etc.

In short, the research shows that student evaluations of an instructor provide a reliable, valid assessment of that instructor’s teaching effectiveness, especially if they reflect the views of many students in several different course offerings (Felder, 2001).

Do Instructor/Instruction Factors Affect Student Ratings?

Students want teachers who have “hardness of head but softness of heart” (Goldsmid, Gruber, and Wilson, 1977). Students want teachers who know what they are talking about but also care about them.

Instructors’ Personality Traits - Research has shown that if personality traits affect student ratings, the effect may be caused more by what instructors do in their teaching than by who they are as people (Erdle, Murray, and Rushton, 1985). The personality traits of a teacher are important but have not been seen to invalidate or bias student ratings (Erdle, Murray, and Rushton, 1985).
Ratings in Elective Courses - Ratings in elective courses are higher than those in required courses. Ory (2001) suggests that separate norms be established for elective courses, required courses, and courses that mix students taking the course as a requirement with those taking it as an elective. The differences found between these kinds of courses were not overwhelming, but they are consistent enough to justify separate norms (Costin, Greenough, and Menges, 1971; Feldman, 1978; McKeachie, 1979; Marsh, 1984).

Level of the Course (100-400) - The level of the course has a marginal impact on ratings, with higher-level courses tending to have better course ratings than lower-level courses (Aleamoni and Graham, 1974; Bausell and Bausell, 1979; Feldman, 1978; Kulik and McKeachie, 1975).

Innovation/New Course or Revised Courses - New courses usually get lower than expected ratings the first time they are taught (Franklin, 2001).

Time of Day - The time of day that a course meets appears to have no influence on ratings (Franklin, 2001).

Class Size - Based on 52 studies of student ratings, Feldman (1978) found class size was no serious source of bias, and Centra (1993b) concluded class size had little practical significance.

Different Disciplines - It is suggested that evaluators develop separate norms for different disciplines, as instructors teaching in certain disciplines receive higher student ratings than instructors in other disciplines; however, Chiu’s 1999 study showed that discipline accounted for only 1.19 percent of rating variance. In descending order the disciplines were arts and humanities, biological and social sciences, business, computer science, math, engineering, and physical science. These differences are not large, but they are consistent over many studies.
Gender - There are no signs of any significant relationship between the gender of the instructor and course ratings (Bennett, 1982; Ferber and Huber, 1975; Lombardo and Tocci, 1979; Strathan, Richardson and Cook, 1991). There is, however, some evidence that ratings are slightly higher in classes where the majority of the students are the same gender as the instructor (Feldman, 1993).

Rank, Age, and Years of Experience - All of these factors have been shown to have minimal impact on student ratings. However, first-year instructors usually receive lower ratings than do experienced instructors (Feldman, 1983), and professors of various ranks receive higher ratings than teaching assistants (Brandenburg, Slinde and Batista, 1977; Centra and Creech, 1976).

Race - Race has not been shown to have any biasing impact on ratings (Ory, 2001).

Do Student Factors Affect Student Ratings?

There are some consistent trends among student groups, but none has been shown to be of significant consequence. However, anyone who is reviewing the ratings needs to be aware that collectively these factors can influence rating outcomes.

Prior Interest - Students with prior interest in a course give somewhat higher ratings to the instructor (Marsh and Cooper, 1981; Ory, 1980; Perry, Abrami, Leventhal and Check, 1979).

Majors - Students tend to rate instructors teaching courses in their major more positively than non-majors do (Feldman, 1978).

Gender of the Students - Gender is not related to the ratings students give professors; however, when the majority of the students are the same sex as the professor, they tend to give higher ratings to the professor (Feldman, 1993).

Students Who Expect to Earn High Grades - Students who enter a course expecting to do well give higher course ratings than students who expect to earn a lower grade (Abrami, Perry and Leventhal, 1982; Feldman, 1976; Howard and Maxwell, 1980).
The impact, however, is minimal (Abrami, Perry and Leventhal, 1980; Centra, 1993).

Students’ Academic Ability - Academic ability as measured by GPA has shown little relationship to the ratings students give (Theall and Franklin, 1990).

Academic Rigor of the Course - Academic rigor is often associated with low ratings or is offered as a reason for low ratings; however, there is no evidence to support this claim, and academic rigor by itself is not a sign of good teaching (Franklin, 2001).

Conclusion - All of the above factors and variables (among teachers, students and courses) account for some of the ratings variance teachers see, but the truth is “that professors cannot manipulate the ratings as much as they think they can” (Ory, 2001).

Administration of Ratings Questionnaires

The manner in which ratings are administered has only a marginal impact on the results. A standardized procedure should be developed and followed by all to keep the playing field level.

Final Exam Week - Ratings given during final exam week are generally lower than those given during a regular class period (Frey, 1976).

Signed Ratings - Signed ratings are more positive than anonymous ratings. This may be due to fear of retribution (Argulewiz and O’Keefe, 1978; Feldman, 1979; Hartnett and Seligsohn, 1967).

Used for Promotion - If students are told the ratings are for promotion, ratings are more positive (Centra, 1976; Feldman, 1979).

Staying in the Room - An instructor’s remaining in the room can slightly raise ratings (Feldman, 1979).

Talking about Importance of Ratings - A short speech about the importance of the ratings, given by the person handing out the rating forms, can slightly raise ratings (Frey, 1976).

All of the above situations should be avoided when ratings are administered.
Benefits Student Ratings Can Have for an Institution

Intended Benefits (Ory, 2001)

· Instructors value the input and make improvements in their teaching
· Instructors are rewarded for having excellent ratings
· Instructors with very low ratings are encouraged to seek help
· Students perceive and use ratings as a way to suggest improvements in teaching
· Students have more information on which to make their course selections
· Ratings motivate instructors to improve teaching
· Students see ratings as a vehicle for change

Unintended Consequences Ratings Can Have on an Institution (Ory, 2001)

· Instructors alter their teaching to get higher ratings, including reducing the difficulty of the course or giving higher grades
· Poor teaching is accepted and overall standards are lowered
· Campus uses ratings as the only measure of effectiveness out of convenience
· The content of the student rating form may drive what is taught
· Students reward poor teaching by giving high ratings in exchange for high grades
· Ratings are used to make discriminations between instructors that are not supported by other data
· Instructors alter administration of the evaluation to get higher ratings
· The data become meaningless because of the lack of use and control

Optimal Conditions for Students to Give Instructors Feedback (Svinicki, 2001)

1. Students need adequate notice of when they will be asked to give feedback. This will allow time to think about the questions that will be asked. Ideally students would be informed a day ahead of time that an evaluation will be done so they can take some time to think about the learning experience and be prepared to give precise and meaningful feedback to the instructor.

2. Students need adequate instruction on how to give the feedback. Students need instruction in how to be precise in their comments and in the definitions of the terms being used in the evaluation. Students should also be informed of how instructors plan to respond to the feedback that the students give.
· One way to assist students in becoming more precise is to share a sample of student responses from previous evaluations that were helpful in improving the learning experience.
· Another way is to ask for informal feedback on a few important questions about the learning experience at various times throughout the semester (every four weeks is a good timetable). Share the responses anonymously with the class, asking for clarification of responses that were vague or too general and demonstrating that the more precise the students are, the more valuable the feedback becomes.

3. Let students give feedback on a regular basis throughout the semester. Assign a few students in the class to be administrators and summarizers of this feedback process. This can improve rapport with the students and increase trust between the students and the instructor, leading to students’ willingness to be more thoughtful, honest and precise with their feedback on the final evaluation.

4. Students need adequate time to give the feedback. Instructors need to be willing to take class time to get meaningful feedback. Ratings forms should not be given out at the end of the class period, as students may tend to hurry so they can leave.

Areas in which Students are Knowledgeable to Share Feedback with Faculty

These recommendations are based on a synthesis of the many studies that were reviewed and not on the findings of any one researcher. These are not specific recommendations for what might be used on a ratings form but rather a list of the areas in which students have the experience and knowledge to give feedback.
Students can determine:

· If the learning objectives set out in the syllabus for the class have been covered by the instructor
· If they are getting regular and timely feedback from the instructor on their learning progress
· If the instructor let the class go early and how often this action occurred
· If the instructor cancelled class and how often it occurred
· If the instructor made it clear as to the time period in which students would receive their assignments and tests back, and kept to it
· If the material that was questioned on the tests was identified by the instructor as being the responsibility of the students to know for the test
· If the professor was on time for the class each day and how often he/she was late
· If the professor was available for help outside of class time (if he/she was not available, the students should indicate how many times this occurred)
· If the professor kept to the timeframe announced to students that would be used to return students’ phone calls or emails
· If the teacher provided a clear explanation for the grades that were assigned to all work and tests
· If the instructor spoke clearly and could be easily understood
· If the professor was willing to answer students’ questions during class or provided other opportunities for the questions to be answered
· If the teacher offered regular encouragement to the students to do well
· If the teacher sought students’ input on issues that directly impacted their learning (discussion guidelines, assessment methods, paper or project topics as examples)
· If the professor made it clear why (or gave the learning purpose) students were to do the assignments given both in and outside of class
· If the teacher kept the classroom environment positive for learning (didn’t allow sleeping, talking, doing other work, phone calls, etc.)
· If the professor knew the names of the students
· If the textbook or other supplementary material was helpful in their learning of the course material
· If the professor provided a clear set of learning objectives, goals, or purpose statements for each class around which students could organize the information they received in the class
· If the pace of the class was reasonable for them individually
· If the professor kept to the rules, policies and guidelines outlined in the syllabus

Areas in which Students Have Limited Qualifications to Give Faculty Feedback

These questions are based on a synthesis of the research and not on the findings of any one researcher. Most students lack the expertise needed to comment on:

· If the teaching methods used were appropriate for the course
· If the content covered was appropriate for the course
· If the content covered was up-to-date
· If the motivational methods used were appropriate to the level and content of the course
· If the assignments were appropriate for aiding student learning
· If what they learned has real world application
· If what they learned will help them in future classes
· If the type of assistance, help or support given to students was appropriate to the learning goals of the class
· If the difficulty level of the course material was at an appropriate level
· If the course or the instructor was excellent, average or poor*

* Unless given a rubric to use in making this judgment

Research Conclusions about Student Ratings

· Training is necessary for anyone who will use the ratings information to make decisions about a teacher’s performance. Understanding the mitigating circumstances that can affect ratings is necessary to interpret the ratings fairly and accurately (Centra, 1993; Marsh, 1987; Murray, 1994).
· Students’ ratings should be only one of several forms of evaluation used to shed light on teaching effectiveness.
Peer review, self-evaluation, teaching portfolios, and student achievement, as examples, should also be used (Seldin, 1999; Doyle, 1983; Centra, 1993).
· Administration of ratings forms must be uniform and standardized to keep the playing field level (Cashin, 1999; Theall and Franklin, 1990).
· Students need to receive training in how to give precise, meaningful feedback and will need opportunities to practice giving feedback for their ratings to become more helpful to faculty (Bandura, 1986).
· Students must be assured that the information they are giving is welcomed by the faculty and will be used to improve the teaching and learning in the course; otherwise they are unlikely to take the rating process seriously (Peterson, Maier, and Seligman, 1993).
· A minimum percentage of students, depending upon the size of the class, must be present to do the ratings for the information to be considered representative and reliable (Franklin and Theall, 1991):

Class Size       Recommended Response
5-20             80%
20-30            75%
30-50            66%-75%
50 or more       60%-75%
100 or more      50%-75%

· Students need definitions of terms used in the ratings questions, especially what the institution means by teaching effectiveness. Research has shown wide interpretations of the meanings of even common terms like timeliness, dependable, effective, caring and organized (Slagle and Icenogle, 2001).
· Institutions must carefully define those areas in which students are capable of giving feedback to faculty and those that are beyond their expertise (Ory, 2001).
· The imprecision of any ratings instrument needs to be considered in the interpretation of any results. A rating average likely represents a range of two to three tenths of a point in either direction; an average of 4.2, for example, may represent a range from 3.9 to 4.5 (Doyle, 1983).
· Students must not fear retribution based on their feedback, or it will significantly inhibit their willingness to be honest in their feedback (Gordon and Stuecher, 1992).
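The response-rate thresholds and the margin around a rating average described in the conclusions above come down to simple arithmetic. The Python sketch below is illustrative only and is not part of the cited research. The function names are hypothetical, and the lower bound of each recommended range is treated as the minimum. Because the class-size brackets share boundary values, a class exactly on a boundary (20, 30, 50 or 100 students) is assumed to fall in the larger-class bracket.

```python
# Illustrative sketch; names and boundary handling are assumptions,
# not part of Franklin and Theall (1991) or Doyle (1983).

# (exclusive upper bound on class size, minimum response rate in percent)
MIN_RESPONSE_PERCENT = [
    (20, 80),
    (30, 75),
    (50, 66),
    (100, 60),
]
LARGE_CLASS_PERCENT = 50  # classes of 100 or more


def minimum_response_percent(class_size):
    """Return the lowest recommended response rate (percent) for a class."""
    if class_size < 5:
        raise ValueError("the table does not cover classes smaller than 5")
    for upper_bound, percent in MIN_RESPONSE_PERCENT:
        if class_size < upper_bound:
            return percent
    return LARGE_CLASS_PERCENT


def is_representative(class_size, responses):
    """True if enough students responded to meet the minimum rate."""
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return responses * 100 >= minimum_response_percent(class_size) * class_size


def rating_range(average, margin=0.3):
    """Return the (low, high) range a reported rating average may represent."""
    return (round(average - margin, 1), round(average + margin, 1))


# 19 of 25 students is 76%, against an assumed minimum of 75%.
print(is_representative(25, 19))
# Range around a reported average of 4.2 with a three-tenths margin.
print(rating_range(4.2))
```

The default margin of 0.3 is the wider end of the two-to-three-tenths range; passing margin=0.2 gives the narrower one.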
Suggestions for Improving the Effectiveness of Using a Student Ratings Form

Based on the research findings, the Center makes the following recommendations for improving the use of teacher effectiveness evaluations at Ferris State:

· Faculty need to continually assure students throughout the semester that the ratings will be used by the faculty for productive change and that there will be no chance of retribution to the students.
· Faculty need to help educate students in effective ways of giving precise feedback that addresses specific aspects of their learning experience.
· Faculty need to give students multiple informal opportunities to give feedback throughout the semester, thus practicing their feedback skills. This is also an effective way to improve teaching practice.
· The university community needs to define for students the key vocabulary words that are used in the formal ratings questionnaires and that students may use in written comments. Such words include effectiveness, dependable, organized, reasonable, interesting, excellent, and caring, among others.
· The university community needs to make certain rating questionnaires are administered in standardized ways, including the time of semester and the time within the class period (the beginning of the class), and never during final exam week.
· Ratings questions need to be limited to those areas in which students have adequate expertise to give meaningful feedback.
· Those persons interpreting the results of student ratings should be given assistance on how to use the data, including its reliability and validity and the factors that may affect the results, such as the number of students present on the day of the rating, whether the course is elective or required, the type of course and the experience of the faculty member, among other issues.
· The university needs to assure faculty that ratings data will be collected over several classes (a minimum of 5) before any conclusions about results are made.
· Those persons interpreting the results need to compare the results with other measures of teaching effectiveness, including peer ratings, self-ratings, teaching portfolios, student learning and alumni ratings, before any conclusions are drawn from the ratings information.
· Faculty need to be assured that ratings are a formative method of evaluation and that assistance to improve their teaching will be made available to them. If ratings are to be used summatively, the results should be drawn from multiple courses over several semesters, and the intended use of the findings should be made clear to all faculty.

References

Abrami, P. C., Perry, R. P., and Leventhal, L. “The Relationship Between Student Personality Characteristics, Teacher Ratings, and Student Achievement.” Journal of Educational Psychology, 1982, 74, 111-125.
Aleamoni, L. M., and Graham, N. H. “The Relationship Between CEQ Ratings and Instructor’s Rank, Class Size, and Course Level.” Journal of Educational Measurement, 1974, 11, 189-201.
Angelo, T. A., and Cross, K. P. Classroom Assessment Techniques. San Francisco: Jossey-Bass, 1993.
Argulewiz, E., and O’Keefe, T. “An Investigation of Signed Versus Anonymously Completed Ratings of High School Student Teachers.” Educational Research Journal, 1973, 3, 39-44.
Bausell, R. B., and Bausell, C. R. “Student Ratings and Various Instructional Variables from a Within-Instructor Perspective.” Research in Higher Education, 1979, 11, 167-177.
Bennett, S. K. “Student Perceptions of and Expectations for Male and Female Instructors: Evidence Relating to the Question of Gender Bias in Teaching Evaluation.” Journal of Educational Psychology, 1982, 74, 170-179.
Bok, D. C. Beyond the Ivory Tower. Cambridge, Mass.: Harvard University Press, 1982.
Brandenburg, D. C., Slinde, J. A., and Batista, E. E. “Student Ratings of Instruction: Validity and Normative Interpretations.” Journal of Research in Higher Education, 1977, 7, 67-68.
Braskamp, L. A., Brandenburg, D. C., and Ory, J. C.
Evaluating Teaching Effectiveness: A Practical Guide. Thousand Oaks, Calif.: Sage, 1984.
Braskamp, L. A., and Ory, J. C. Assessing Faculty Work. San Francisco: Jossey-Bass, 1994.
Cashin, W. E. “Student Ratings of Teaching: Uses and Misuses.” In Peter Seldin (ed.), Changing Practices in Evaluating Teaching. Bolton, Mass.: Anker, 1999.
Centra, J. A. “Use of the Teaching Portfolio and Student Evaluations for Summative Evaluation.” Paper presented at the annual meeting of the American Educational Research Association, Atlanta, Apr. 1993a.
Centra, J. A. Reflective Faculty Evaluation. San Francisco: Jossey-Bass, 1993b.
Centra, J. A., and Creech, F. R. The Relationship Between Students, Teachers, and Course Characteristics and Student Ratings of Teacher Effectiveness. Princeton, N.J.: Educational Testing Service, 1976.
Centra, J. A., and Gaubatz, N. “Is There a Gender Bias in Student Evaluations of Teaching?” Journal of Higher Education, 2000, 71(1), 17.
Chiu, S. “Use of the Unbalanced Nested ANOVA to Exam Factors Influencing Student Ratings of Instructional Quality.” Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign, 1999.
Cohen, P. A. “Student Ratings of Instruction and Student Achievement: A Meta-Analysis of Multisection Validity Studies.” Review of Educational Research, 1981, 51, 281-309.
Costin, F., Greenough, W. T., and Menges, R. J. “Student Ratings of College Teaching: Reliability, Validity, and Usefulness.” Review of Educational Research, 1971, 41, 511-535.
Doyle, K. O. Evaluating Teaching. San Francisco: New Lexington Press, 1983.
Erdle, S., Murray, H. G., and Rushton, J. P. “Personality, Classroom Behavior, and College Teaching Effectiveness: A Path Analysis.” Journal of Educational Psychology, 1985, 77, 394-407.
Feldman, K. A. “Grades and College Students’ Evaluations of Their Courses and Teachers.” Research in Higher Education, 1976, 4, 69-111.
Feldman, K. A.
“Course Characteristics and College Students’ Ratings of Their Teachers and Courses: What We Know and What We Don’t.” Research in Higher Education, 1978, 9, 199-242.
Feldman, K. A. “The Significance of Circumstances for College Students’ Ratings of Their Teachers and Courses: A Review and Analysis.” Research in Higher Education, 1979, 10, 149-172.
Feldman, K. A. “Seniority and Experience of College Teachers as Related to Evaluations They Receive from Their Students.” Research in Higher Education, 1983, 18, 3-124.
Feldman, K. A. “College Students’ Views of Male and Female College Teachers: Part II - Evidence from Students’ Evaluations of Their Classroom Teachers.” Research in Higher Education, 1993, 34, 151-211.
Ferber, M. A., and Huber, J. A. “Sex of Student and Instructor: A Study of Student Bias.” American Journal of Sociology, 1975, 80, 949-963.
Frey, P. W. “Validity of Student Instructional Ratings as a Function of Their Timing.” Journal of Higher Education, 1976, 47, 327-336.
Gaubatz, N. “What’s the Use of Student Ratings of Teaching Effectiveness?” http://csti
