In the face of a protracted pandemic, with reports of exhaustion streaming in from all quarters, many educators may feel they were not able to teach as effectively as they would have liked. This is the perfect time to revisit (or ask anew) a critical question: What is the best way for a faculty member to demonstrate teaching effectiveness?
Most discussions in higher education seem to revolve around the use of student evaluations of teaching (SETs). Yes, for promotion and tenure, external letter writers receive dossiers of information that internal committees also review, but too often SETs are the key. This is how higher education has operated for years, but the practice need not persist. To fully capture the hard work that is teaching, we need to change how we evaluate and reward teaching.
Teaching is primarily discussed (when discussed) in the context of determining whether the teaching is good enough for promotion or tenure or both. By hewing to requirements in faculty handbooks, evaluators review SET scores, peer observation reports, self-reflections, and often a range of course materials. The result of the process is a simple yes (promote) or no (do not promote). Quick? Yes. Painless? Yes (for most). Best? Not by a long shot, as standard operating procedures show little validity or reliability.
A good evaluation should serve as a vehicle for improvement. The process should help the instructor improve, help the instructor help their students improve, and help the department and university better fulfill their charge. When one considers this major purpose of evaluation, one sees that overreliance on SETs, like overreliance on grades in the measurement of learning, is misguided. Alternatives to grading are receiving much-needed attention, as the “ungrading” revolution indicates (see Blum, 2021). It is time for the same revolution in teaching evaluation.
Most faculty handbooks describe what teachers should be doing. The language and aspirations are commendable, but they belie the fact that most faculty do not receive training in how to be effective teachers or how to document effectiveness.
Thankfully, and not surprisingly, faculty handbooks mostly reflect what research shows constitutes effective teaching (Richmond et al., 2022). Studies of exceptional teachers (Bain, 2004) and detailed examinations of the evidence of model teaching show that the fundamental hallmarks of effective teaching are clear: strong course design (assessments and course activities that map onto explicit student learning outcomes); clear, student-centered syllabi; instructor knowledge of content; the use of effective instructional methods (e.g., fostering active learning); and inclusive teaching practices. Some of these hallmarks of effective teaching can be demonstrated by a collection of course materials showing evidence of the practices used. Missing are adequate pedagogical training for these areas and effective ways to document them.
One feature insufficiently documented is student learning. While some handbooks may note that where obtainable, evidence of student learning enhances evaluation, this prescription rarely makes its way into the documentation process. The majority of portfolios I review internally and externally lack any discussion of the extent to which student learning outcomes have been achieved in a given class.
There is no one gold standard to measure effective teaching! While this may seem like bad news, it provides both faculty and administrators with the opportunity to focus first on what they find most important and then on how to assess it. Unfortunately, because there is no set standard, it is easy to overly rely on what is most commonly used to measure teaching (SETs).
SET scores are exceedingly easy to generate, perhaps one key reason they are so ubiquitous in higher education. They are also fraught with problems. While some SETs suffer from significant scale construction, validity, reliability, and response rate issues, it is also clear that a number of factors (such as course difficulty, the instructor’s race and gender, the instructor’s presentation style, and even chocolate) can influence them (Boysen, 2016; Carpenter et al., 2020).
There are numerous ways to measure teaching, each varying in development history (e.g., inductively vs. deductively), extent of empirical validation, necessity of accompanying supporting evidence, and popularity and efficiency of usage. Meta-analyses, which combine the results of a large number of studies, show a significant relationship between evaluations and student achievement (effect size = .44; Hattie, 2012), so measuring teaching is clearly a worthwhile endeavor.
Though most universities still rely heavily on quantitative data from SETs, there are many ways to capture effective teaching (Bernstein et al., 2006). This said, it is rare to see universities have consistent (i.e., across schools and departments) multi-faceted measures of teaching. It is easy to understand why: Holistic pictures of teaching take time to put together and take time to evaluate. Often the knowledge of how best to do both is lacking.
Reality: Learning is complex, and students’ perceptions of their learning are biased. Learning is difficult to measure because it is affected by a wide range of factors related to the student, the instructor, and the course. Furthermore, as SETs demonstrate, instructor demographics and teaching behaviors and practices can easily influence perceptions of learning and instruction (Carpenter et al., 2020). This said, “teaching occurs only when learning takes place” (Bain, 2004, p. 173), so including measures of learning when evaluating teaching is critical.
Solution: Provide faculty with assessment know-how, and support reporting student learning outcome achievement, changes, and levels (Suskie, 2018).
Reality: Teaching excellence is contextual. What works at one university, in one discipline, for one level (first year, senior year), and for one group of students may not work elsewhere. This makes “best practices” a misnomer as practices may not be “best” for every context.
Solution: Provide faculty with course design know-how, and support modifying assignments and using different instructional methods (see Richmond et al., 2022).
Reality: Teaching excellence is not a fixed entity. Effective teachers need to be ready to change their practices and evolve to address different pedagogical challenges and uncontrollable external events (e.g., pandemics). This means it is unreasonable to set numerical benchmarks to assess teaching.
Solution: View teaching effectiveness holistically, providing faculty with ways to document their efforts and track and reflect on changes in student learning over time (see Bernstein et al., 2006).
Reality: Capturing effective teaching is challenging. It would be nice to have a quick, effective, cheap measure of teaching, but it is difficult to get all three at once. Good measurement takes time and is not always easy. Because effective evaluation benefits from training, faculty need to be given the time, resources, and incentives to engage in it.
Solution: Give faculty funding to participate in workshops on good evaluation, and support them with well-staffed centers for teaching and learning.
Effective instructors partake in the hard work of teaching, and the first step is to find ways to best capture this hard work, a process called the scholarship of teaching (Boyer, 1990)—related to but not to be confused with the scholarship of teaching and learning (Gurung, in press).
A first step in the better evaluation of teaching is to reorganize our priorities for measuring teaching or, alternatively, be clear on all the benefits of measurement. If the goal of higher education is to help students be lifelong learners and gain the skills and knowledge to be happy, healthy, and responsible citizens (albeit only one set of aspirations), we need to help teachers help students learn. Beyond training in pedagogical fundamentals, we need to recognize how teachers can inspire learning, and it is time to foster inspiring teaching (Gurung, 2021).
Measures need to capture the fundamentals of effective teaching while providing easy ways to scale up the level of detail and complexity for those who opt for it. Most measures are self-reports in which a faculty member reflects on their own knowledge, skills, and abilities; some can also be completed by students. Providing faculty with checklists of the fundamentals gives them a clear set of goals and benchmarks with which they can track their own progress and development.
A list of options is provided in the pragmatic tools section below. You need only pick which of the scales or subscales most fit what you want to measure. You then decide on a response format (qualitative, quantitative, mixed, checklist, categorical or continuous) and how much evidence to collect (i.e., syllabi, assignments, or a narrative). A university can set different levels of measurement or amounts of evidence for different stages of promotion or, better still, have a basic checklist for model teaching that all faculty can aspire to.
Faculty can curate classroom artifacts (syllabi, assignments, and assessments) using an online dossier (Google Sites) that facilitates self-reflection and peer observation, and they can use one of the following organizational structures, either by itself as a self-report measure or supplemented with evidence:
Abello, D., Alonso-Tapia, J., & Panadero, E. (2020). Development and validation of the Teaching Styles Inventory for Higher Education (TSIHE). Anales de Psicología, 36(1), 143–154. https://doi.org/10.6018/analesps.370661
Bain, K. (2004). What the best college teachers do. Belknap.
Blum, S. (2021). Ungrading: Why rating students undermines learning (and what to do instead). West Virginia University Press.
Boysen, G. A. (2016). Using student evaluations to improve teaching: Evidence-based recommendations. Scholarship of Teaching and Learning in Psychology, 2(4), 273–284. https://doi.org/10.1037/stl0000069
Carpenter, S. K., Witherby, A. E., & Tauber, S. K. (2020). On students’ (mis)judgments of learning and teaching effectiveness: Where we stand and how to move forward. Journal of Applied Research in Memory and Cognition, 9(2), 181–185. https://doi.org/10.1016/j.jarmac.2020.04.003
Gurung, R. A. R. (2021). Inspire to learn and be CCOMFE doing it. Canadian Psychology/Psychologie canadienne, 62(4), 348–351. https://doi.org/10.1037/cap0000277
Gurung, R. A. R. (in press). The scholarship of teaching and learning: Scaling new heights, but it may not mean what you think it means. In C. E. Overson, C. M. Hakala, L. L. Kordonowy, & V. A. Benassi (Eds.), In their own words: What scholars want you to know about why and how to apply the science of learning in your academic setting. Society for the Teaching of Psychology.
Hattie, J. (2012). Visible learning for teachers: Maximizing impact on learning. Routledge.
Keeley, J., Smith, D., & Buskist, W. (2006). The teacher behaviors checklist: Factor analysis of its utility for evaluating teaching. Teaching of Psychology, 33(2), 84–91. https://doi.org/10.1207/s15328023top3302_1
Richmond, A. S., Boysen, G. A., & Gurung, R. A. R. (2022). An evidence-based guide to college and university teaching: Developing the model teaching criteria. Routledge.
Suskie, L. (2018). Assessing student learning: A common sense guide (3rd ed.). Jossey-Bass.
Regan A. R. Gurung, PhD, is associate vice provost and executive director for the Center for Teaching and Learning and professor of psychological science at Oregon State University. Follow him on Twitter @ReganARGurung.