Education  |   November 2016
Redesign of the System for Evaluation of Teaching Qualities in Anesthesiology Residency Training (SETQ Smart)
Author Notes
  • From the Professional Performance Research Group, Center for Evidence-Based Education, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands (K.M.J.M.H.L.); Department of Anaesthesia and Intensive Care Medicine, Craigavon Area Hospital, Portadown, United Kingdom (A.F.); Department of Anesthesiology, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands (M.W.H.); Centre of Health Sciences Education, Faculty of Health, Aarhus University, Aarhus, Denmark (B.M.); Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles (UCLA), Los Angeles, California (O.A.A.); and UCLA Center for Health Policy Research, Los Angeles, California (O.A.A.).
  • Submitted for publication September 18, 2015. Accepted for publication July 27, 2016.
  • *Members of SMART Collaborators are listed in appendix 1.
  • Address correspondence to Dr. Lombarts: Professional Performance Research Group, Academic Medical Center, University of Amsterdam, J1a-119, Meibergdreef 9, 1100 DD Amsterdam, The Netherlands. m.j.lombarts@amc.uva.nl. Information on purchasing reprints may be found at www.anesthesiology.org or on the masthead page at the beginning of this issue. Anesthesiology’s articles are made freely accessible to all readers, for personal use only, 6 months from the cover date of the issue.
Anesthesiology, November 2016, Vol. 125, 1056-1065. doi:10.1097/ALN.0000000000001341
Abstract

Background: Given the increasing international recognition of clinical teaching as a competency and regulation of residency training, evaluation of anesthesiology faculty teaching is needed. The System for Evaluating Teaching Qualities (SETQ) Smart questionnaires were developed for assessing teaching performance of faculty in residency training programs in different countries. This study investigated (1) the structure, (2) the psychometric qualities of the new tools, and (3) the number of residents’ evaluations needed per anesthesiology faculty to use the instruments reliably.

Methods: Two SETQ Smart questionnaires—for faculty self-evaluation and for resident evaluation of faculty—were developed. A multicenter survey was conducted among 399 anesthesiology faculty and 430 residents in six countries. Statistical analyses included exploratory factor analysis, reliability analysis using Cronbach α, item-total scale correlations, interscale correlations, comparison of composite scales to global ratings, and generalizability analysis to assess residents’ evaluations needed per faculty.

Results: In total, 240 residents completed 1,622 evaluations of 247 faculty. The SETQ Smart questionnaires revealed six teaching qualities consisting of 25 items. Cronbach α’s were very high (greater than 0.95) for the overall SETQ Smart questionnaires and high (greater than 0.80) for the separate teaching qualities. Interscale correlations were all within the acceptable range of moderate correlation. Overall, questionnaire and scale scores correlated moderately to highly with the global ratings. For reliable feedback to individual faculty, three to five resident evaluations are needed.

Conclusions: The first internationally piloted questionnaires for evaluating individual anesthesiology faculty teaching performance can be reliably, validly, and feasibly used for formative purposes in residency training.

What We Already Know about This Topic
  • Healthcare systems are beginning to adopt tools for measuring and enhancing teaching performance and training

  • System for Evaluation of Teaching Qualities is a well-documented system for evaluating faculty teaching performance, originally developed for use in The Netherlands

What This Article Tells Us That Is New
  • System for Evaluation of Teaching Qualities (SETQ) Smart questionnaires were developed and validated for assessing teaching performance of faculty in residency training programs in different countries through a multicenter survey

  • SETQ Smart scores correlated with global ratings and yielded reliable feedback from three to five resident evaluations, thus providing a validated tool for international resident teaching assessment

The quality of the medical workforce training ultimately determines the quality of patient care provided. Given the changing individual and population health needs, and the widespread belief that twentieth-century educational strategies are unsuitable for tackling twenty-first-century challenges, organizations involved in training medical specialists are pressured to reform graduate medical training.1,2  Professional bodies and governmental institutions alike have published new directives and recommendations for medical training reform,1–4  all reflecting the need to train medical professionals who have a broad range of skills relevant to the complex challenges facing modern healthcare systems.
For example, the European Union of Medical Specialists (Brussels, Belgium), in collaboration with the European Boards, has worked to develop European standards in medical training. A unified approach is important, given the European legal framework that enables medical specialists and trainees to move across Europe.5  Anesthesiology was one of the first specialties to agree to new European-wide regulations for residency training. Since 2013, the new directives also include training requirements for anesthesiology teachers. Recognizing that the levels of professional development, educational training support, promotion of skill development, and encouragement of educational innovations among faculty vary significantly across European countries, the Standing Committee on Education and Professional Development of the European Board of Anaesthesiology (Brussels, Belgium) recommends training of anesthesiology faculty in core teaching competencies.5  Many healthcare systems around the world have needs and challenges not unlike these and will need or are beginning to adopt tools and systems for measuring and enhancing teaching performance and training. In Europe, for example, the Standing Committee of the European Board of Anaesthesiology chose to rely on the System for Evaluation of Teaching Qualities (SETQ), the well-documented system for evaluating faculty’s teaching performance.6–18  Originally developed for use in The Netherlands,6,7  and reported in Anesthesiology in 2009,7  the system gained the attention of anesthesiology faculty in other countries who were looking for valid, reliable, and feasible systems for assessing teaching performance. This global interest led to an international project by the Standardizing Measurements in Anesthesiology Residency Training (SMART) Collaborators whose aim was to create Standardized Measurements in Anesthesiology Residency Training (Smart), by adapting, updating, and validating the original SETQ for use in different countries.
Since its introduction in 2008, the SETQ system has quickly expanded to include other specialties, residency training programs, and teaching institutions. Approximately 10,000 teaching faculty and residents representing some 260 residency training programs in 60 Dutch hospitals now participate in the SETQ system. Most programs use the system annually. The system typically comprises three phases: (1) the Web-based (self-)evaluations completed by faculty and residents; (2) reporting back individualized feedback to faculty; and (3) individualized faculty follow-up for professional development purposes.
The current study reports on the development and validation of the international adaptation of the SETQ, named the SETQ Smart. Our research aimed to investigate (1) the structure of the SETQ Smart questionnaires (one used for resident-completed evaluation of faculty and one for faculty self-evaluation); (2) the reliability and (construct) validity of the SETQ Smart tools; and (3) the number of residents’ evaluations per anesthesiology faculty needed for reliable feedback to faculty.
Materials and Methods
Developing the SETQ Smart Questionnaires
The development of the SETQ Smart questionnaires, one for residents to evaluate faculty and one for faculty self-evaluation, was based on the original specialty-specific SETQ questionnaires, which have been found to provide valid and reliable evaluations.6–11  The Dutch SETQ questionnaire for anesthesiology consists of five teaching aspects and two global questions, or 24 items in total, and was taken as a starting point for updating and validating the tool for international use. This involved new languages and perceived need for new areas of teaching qualities not previously covered in the Dutch SETQ. The project team consisted of practicing anesthesiologists from the participating countries and the lead researchers (K.L. and O.A.A.) who led the development of all SETQ tools and related follow-up research.
The SETQ Smart questionnaires were developed iteratively using literature review, multiple (telephone and face-to-face) discussions in the research project group, and consultation rounds with faculty and residents at various teaching sites. The SMART project team designed the SETQ Smart to (1) be applicable in the participating countries; (2) incorporate modern ideas about graduate medical education; (3) reflect the most pivotal discussions in professionalism; (4) incorporate previous SETQ research findings; and (5) include both quantitative and narrative feedback. It was important to keep the resulting tools as lean and as comprehensive as needed to make the performance feedback valuable to faculty and the online deployment efficient and effective for anonymous data collection.
In 2011, Srinivasan et al.19  presented a “Teaching as a Competency” framework for medical educators, after a rigorous incremental developmental process involving literature review, conferences with medical and nonmedical educators, and international expert consultation. Critical skills for medical educators were defined by using both the Accreditation Council for Graduate Medical Education20  (Chicago, Illinois, USA) and the CanMEDS21  (Ottawa, Ontario, Canada) frameworks for physicians’ competencies and roles. The “Teaching as a Competency” framework19  promotes a culture of effective teaching and learning and was, therefore, used by the project team to reflect on the content validity of the original SETQ. The team decided to incorporate four competencies from the framework into the SETQ Smart questionnaires: learner centeredness, professionalism and role modeling, interpersonal and communication skills, and medical (or content) knowledge. When the project team reached consensus on the first theory-based adjusted version of the SETQ Smart questionnaires, the team members discussed the questionnaires in group sessions with anesthesiology staff and residents at their local sites to check for relevance, clarity, context specificity, and completeness of the questionnaire. All individual items and the integrated questionnaires were addressed in these sessions; the SMART research team members led and documented these group discussions. All detailed comments and suggestions for changes were then discussed in the research team until consensus was reached on the final version of the SETQ Smart questionnaires. Also, it was decided that the five-point Likert response scale in the original SETQ questionnaires should be changed to a seven-point Likert scale to allow for more nuanced assessments.22–24  Given the international nature of the project, all communications took place in English. 
The resulting English questionnaires were then translated into Danish, Dutch, and German following appropriate forward–backward translation procedures.25,26 
SETQ Smart Questionnaires
The SETQ Smart questionnaires used for the pilot consisted of 27 items across six scales on the teaching qualities of anesthesiology faculty: (1) creating a positive learning climate; (2) displaying a professional attitude toward residents; (3) evaluation of residents’ knowledge and skills; (4) feedback to residents; (5) learner centeredness; and (6) professionalism. In addition, three previously researched role modeling items were included in the questionnaires as global performance ratings. The first four scales were mostly copied from the original SETQ tool. Learner centeredness contained two items from the original SETQ scale “communication about learning goals,” and two new items. The professionalism scale comprised five new items. Last, the role modeling items inquired how faculty were performing as role model teachers, physicians, and persons, a split that was found relevant in the literature on role modeling,27,28  as reported by previous SETQ research.16,17  Each of the items had a seven-point Likert-type response scale: totally disagree, disagree, somewhat disagree, neutral, somewhat agree, agree, and totally agree. The questionnaires concluded with one global question rating the faculty’s overall teaching qualities on a numeric 10-point scale. It has been reported that respondents are more comfortable rating global questions on a 10-point scale.22–24  In addition, the residents’ questionnaire contained two free-text questions requesting them to list the teaching strengths of anesthesiology faculty and provide them with suggestions for improvement. We collected data on residents’ sex, year of training, and the number of years of experience in practice, research, or otherwise before entering anesthesiology training. 
For faculty, we collected data on their sex, age, year of qualifying as an anesthesiologist, current position, number of years at current hospital, number of years in previous teaching institutions, and whether they had previously completed a “teach-the-teacher” course.
Study Settings and Participants
Members of the SMART project team used their professional connections to approach anesthesiology program directors in Austria, Denmark, Germany, The Netherlands, Switzerland, and the United Kingdom to participate in the pilot project. Program directors were informed about the background, purpose, and specific contents of the SETQ Smart in person or by telephone and by providing materials such as PowerPoint® (Microsoft Corporation, USA) presentations. From November 2013 to April 2014, faculty and residents were invited to participate in the SETQ Smart (self-)evaluations. Data collection was Web based through the Professional Performance Online platform (website.professionalperformanceonline.nl), which was developed specifically for facilitating physicians’ performance evaluations. Invitations were emailed through the Professional Performance Online platform on the first day of the data collection period, stressing confidential and anonymous participation. The emails contained personal passwords for protected and safe personal login. Data collection lasted 4 to 6 weeks, and up to three reminders were sent to nonresponders during that time. In addition, the program directors received two-weekly emails from the project team reporting actual response rates. Immediately after closure of the data collection period, all teaching faculty could download their automatically generated, personalized feedback reports. These anonymously reported the residents’ evaluations of faculty teaching performance, their performance ranking among peers, and the residents’ written comments on faculty’s teaching strengths and areas for improvement. In addition, each program director received a report summarizing, anonymously and at aggregated (group) level, the teaching performance of participating faculty.
Statistical Analyses
After using appropriate statistics to describe the study participants, we used several analytical strategies to address our research objectives. To address our research question on the structure of the SETQ Smart questionnaires, we conducted exploratory analysis using principal components to extract meaningful conceptual scales of teaching performance for the faculty and resident questionnaires separately. Items were assigned to the scale on which they loaded highest, with a factor loading of at least 0.4. The resulting scale structure was carried forward to reliability analysis. We used Cronbach α (which should be at least 0.70 to be acceptable) to assess the internal consistency reliability of each scale.7,9,26  Next, item-total scale correlations adjusted for item overlap were used to assess the homogeneity of each scale. We averaged items to compute composite scores for each scale and the overall instrument for the residents and faculty evaluations separately. We then estimated interscale correlations using nonparametric correlation coefficients to check for the interpretability of the scales as distinct but interrelated (sub)constructs.7,9,26  We further checked for construct validity using purposeful hypothesis testing; we estimated nonparametric correlation coefficients for the separate and combined correlations of the scales with the global ratings, namely (1) faculty being seen as a role model teacher; and (2) faculty’s overall teaching qualities. In line with previous works,7–10,15–17  we hypothesized that the scores from the separate and combined scales for residents and faculty separately should be positively correlated with both global ratings, with correlations in the range from 0.40 to 0.80.7–10,26  Finally, for the residents’ instrument, we conducted generalizability analysis to quantify the number of resident evaluations needed for reliable estimation of the scale and total scores. 
We treated faculty as the unit of analysis in an unbalanced hierarchical design, with residents nested within faculty, this nesting crossed with the items, and the number of items treated as fixed. Therefore, for the scale and total scores, we decomposed the total variance into components associated with faculty (f) and residents (r) nested (:) within faculty and crossed (×) with the items (i). We used this (r:f) × i design when we estimated the minimum number of resident evaluations per faculty for varying reproducibility or dependability coefficients of 0.60 to 0.90 (the higher the coefficient, the more desirable). We used the reproducibility coefficient as a measure of the dependability of the scales for making absolute decisions such as the minimum number of resident evaluations needed per faculty. All statistical analyses were conducted using IBM® SPSS® Statistics for Mac version 23 (2015, IBM Corp., USA), Stata® 14 (2015, Stata Corp., USA), and Microsoft® Excel® for Mac 2011 version 14.5.2 (Microsoft Corporation, USA).
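The logic of estimating a minimum number of evaluations can be illustrated with a short decision-study projection. The sketch below is not the authors' analysis (which used SPSS, Stata, and Excel). It is a minimal Python illustration that assumes all resident-related error can be lumped into a single variance term, so that the dependability coefficient for a faculty mean over n resident evaluations is var_faculty / (var_faculty + var_error / n). The function names and the variance components are hypothetical and were chosen only for illustration; they are not estimates from this study.

```python
def dependability(var_faculty, var_error, n_evaluations):
    """Dependability of a faculty mean score averaged over n evaluations.

    var_faculty: variance between faculty (the signal of interest).
    var_error:   error variance of a single resident evaluation
                 (simplifying assumption: one lumped error term).
    """
    return var_faculty / (var_faculty + var_error / n_evaluations)


def min_evaluations(var_faculty, var_error, target, max_n=1000):
    """Smallest n whose dependability reaches the target coefficient."""
    for n in range(1, max_n + 1):
        # Small tolerance guards against floating-point rounding at the boundary.
        if dependability(var_faculty, var_error, n) >= target - 1e-12:
            return n
    raise ValueError("target not reached within max_n evaluations")


# Hypothetical variance components, for illustration only.
VAR_FACULTY = 1.0
VAR_ERROR = 1.2

for target in (0.60, 0.70, 0.80, 0.90):
    n = min_evaluations(VAR_FACULTY, VAR_ERROR, target)
    print(f"target {target:.2f}: at least {n} evaluations per faculty")
```

Because the error term shrinks in proportion to 1/n, each stricter target coefficient requires disproportionately more evaluations, which is why feasibility depends heavily on the coefficient chosen.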
Results
Study Setting and Participants
The study was conducted on samples of 247 anesthesiology faculty and 242 residents, with response rates of 62 and 56%, respectively. Participation varied per country (table 1). Residents completed a total of 1,622 faculty evaluations, with an average of seven resident evaluations per faculty.
Table 1.
Characteristics of Residents and Faculty Who Participated in the Study
Structure, Reliability, and Validity of the SETQ Smart Questionnaires
Table 2 shows the results of the exploratory factor analysis that yielded six teaching scales consisting of 25 items for both the resident and the faculty self-evaluation questionnaire: (1) creating an open and stimulating learning climate (six items); (2) displaying a professional attitude toward residents (four items); (3) learner centeredness (four items); (4) giving feedback (four items); (5) evaluating residents’ knowledge and skills (four items); and (6) professional practice management (three items). The two items “adheres to professional practice standards in the field of anesthesiology” and “demonstrates compassion and integrity toward patients and their families” did not fit with any of the teaching scales and were dropped from the scales. The two items will still remain available to training programs that deem the information they provide important. Appendix 2 lists the new SETQ Smart questionnaire. Internal consistency reliability expressed as Cronbach α was very high for the total SETQ Smart residents’ instrument (0.98) and for the faculty’s instrument (0.96), indicating that both instruments provided reliable data (table 2). Cronbach α for the residents’ instrument ranged from 0.87 for the teaching performance scale “professional practice management” to 0.97 for the scale “evaluation of knowledge and skills.” When the reliability of the residents’ instrument was calculated at the faculty (aggregate) level, Cronbach α remained stable. For the faculty self-evaluation instrument, the Cronbach α results for internal consistency reliability were well above 0.80 for all but one scale, namely “professional practice management” with a Cronbach α of 0.74. The item-total correlations were high for all items within their scales in the resident questionnaire and moderate to high for the faculty self-evaluation questionnaire.
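For readers less familiar with these statistics, the following sketch shows how Cronbach α and an overlap-corrected item-total correlation are conventionally computed. It is a minimal Python illustration, not the code used in the study. The function names and the seven-point scores of six hypothetical respondents are assumptions made for the example.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(rows, j):
    """Correlate item j with the sum of the remaining items,
    so the item does not inflate its own total (overlap correction)."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson(item, rest)


# Hypothetical scores: six respondents, four items on a 1-7 scale.
scores = [
    [6, 6, 5, 6],
    [5, 5, 5, 4],
    [7, 6, 7, 7],
    [3, 4, 3, 4],
    [6, 7, 6, 6],
    [4, 4, 5, 4],
]

alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f}")
for j in range(len(scores[0])):
    print(f"item {j + 1}: corrected item-total r = {corrected_item_total(scores, j):.2f}")
```

An α at or above the 0.70 threshold named in the Methods would be read as acceptable internal consistency for the scale.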
Table 2.
Item and Scale Characteristics, Internal Consistency Reliability, and Item-Total Correlations
Table 3 displays the descriptive statistics and the interscale correlations. The means ranged from 5.17 to 5.89 with variances of 1.50 to 1.70 for the resident evaluations and 5.10 to 5.88 with variances of 0.69 to 0.94 for self-evaluation. The correlation coefficients ranged from 0.52 between “professional attitude” and “evaluation” to 0.78 between “learning climate” and “learner centeredness” on the residents’ instrument (P < 0.01), all within the acceptable range. The faculty self-evaluation questionnaire displayed somewhat lower interscale correlations, varying from 0.47 (between “professional attitude” and “evaluation”) to 0.74 (between “learning climate” and “feedback”), although they were within the acceptable range of moderate correlation. Next, the construct validity analysis found that the overall instruments and their scales correlated moderately to highly with the two global ratings “role model teacher” and “overall teaching performance” for both instruments (table 4), although slightly more so for the “role model teacher” global rating. Overall, the correlations were somewhat stronger for the resident instrument than for the faculty instrument, but all correlations fell within the literature-based expected range of 0.40 to 0.80.19  Finally, assuming a reproducibility coefficient of 0.70 for the entire SETQ Smart instrument, at least three resident-completed evaluations per anesthesiology faculty would be required (table 5). When applying stricter reproducibility coefficients of 0.80 or even 0.90, 5 to 11 faculty evaluations by residents are needed. Considering the reproducibility of the six scales separately, the number of residents’ evaluations needed varies from two (for “learning climate” and “professional attitude”) to five (for “professional practice management”) when assuming a reproducibility coefficient of 0.70.
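The correlation of a composite scale score with a global rating can be sketched as follows. The study reports nonparametric correlation coefficients without naming the specific coefficient; this example assumes Spearman's rank correlation with average ranks for ties. All function names and scores are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: five evaluations of one scale (four items, 1-7 scale)
# and the matching global rating of overall teaching qualities (1-10 scale).
scale_items = [[6, 6, 5, 7], [4, 5, 4, 4], [7, 7, 6, 7], [5, 4, 5, 5], [6, 5, 6, 6]]
global_rating = [8, 6, 9, 7, 7]

composite = [mean(items) for items in scale_items]  # items averaged per evaluation
rho = spearman(composite, global_rating)
in_expected_range = 0.40 <= rho <= 0.80  # range hypothesized in the Methods
print(f"rho = {rho:.2f}, within 0.40-0.80: {in_expected_range}")
```

Averaging items into a composite before correlating mirrors the composite-score step described in the Methods; with only five made-up evaluations the resulting coefficient carries no substantive meaning.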
Table 3.
Descriptive Statistics and Interscale Correlations for Residents’ and Faculty Evaluations Separately
Table 4.
Correlations Between Scales and Global Ratings of (1) Faculty Being Seen as a Role Model Teacher and (2) Faculty’s Overall Teaching Performance, Estimated Separately for Residents’ and Faculty’s Evaluations
Table 5.
Number of Residents’ Evaluations Needed Per Faculty
Discussion
Main Findings
The SETQ Smart questionnaires for faculty and resident (self-)evaluation of anesthesiology faculty teaching performance are valid, reliable, and feasible for use in various anesthesiology residency training programs in various European countries with very different languages, cultures, and healthcare systems. We found that teaching performance of faculty can be captured under six distinct teaching qualities. The newly developed SETQ Smart questionnaires capture general pedagogical skills and specific teaching skills needed in modern anesthesiology residency training. Overall, the new tools can be feasibly deployed since only reasonable numbers of resident evaluations per faculty are needed.
Strengths and Limitations
To our knowledge, this is the first attempt to design international measurement instruments for teaching performance in anesthesiology. Nonetheless, we note several strengths and limitations of our study. Strengths include the well-researched original SETQ questionnaires underlying the newly developed SETQ Smart, the diverse composition of the project team including clinicians and researchers from various health systems and cultures, the active participation of both practicing anesthesiology staff and residents in the development of the questionnaires, the single specialty focus of the pilot, and the multilingual availability of the Web-based questionnaires.
The limited sample size per country can be considered a limitation since we could not perform separate comparative country analyses. Single-country studies based on multicenter participation may be performed in the future to investigate the psychometric properties of the SETQ Smart further. In addition, this pilot study could have been stronger with somewhat higher response rates.7  The overall response rates of 56 and 62% for residents and faculty, respectively, are, however, in line with, if not on the “high” side of, the current trend in study participation rates seen in the literature.29–31  Although we did not have the demographic data on the nonrespondents to check for possible sources of nonresponse or selection bias,32  we reanalyzed the data for the participating sites with 50% response or higher versus those with less than 50% response and found similar results. We speculate that if we had the extreme scenario of nonrespondents being those who would have given extreme scores, then the teaching performance means and variances could favor the respondents. Yet, the performance score variation reported here is in line with previous works that had higher response rates.7–10  Nonetheless, it will be worthwhile in future work to use the flexible but complex nonresponse bias analysis method32  that we helped develop for this source of uncertainty.33  Future SETQ Smart users should closely monitor response progress over the evaluation period and consider multiple implementation strategies to maximize participation.
Explanation of Findings
The mounting evidence on the impact of residency training on the quality of patient care,34  worldwide reforms in residency training, new quality standards for postgraduate medical education, and residents demanding the best possible training35  have all fueled faculty development including measuring teaching performance. This study provides empirical evidence that the SETQ Smart questionnaires can be validly, reliably, and feasibly used in different anesthesiology training settings.
As expected, the six teaching scales we found included the four scales from the original SETQ questionnaires, the fifth new scale (“learner centeredness”) based on the literature and discussions with faculty and residents, and the sixth new scale named “professional practice management.” However, this “professional practice management” scale retained only three of the five new professionalism items introduced during development. The two remaining items (“adhering to professional practice standards in the field of anesthesiology” and “demonstrating compassion and integrity toward patients and their families”) did not fit on any particular teaching scale. They indeed stand out in that they do not address teaching in particular but clinical practice in general. Nevertheless, we decided to keep them as separate additional items in the questionnaires, given their feedback value. Overall, the reported reliability and validity results support the use of the SETQ Smart in quantifying and stimulating excellence in faculty teaching performance internationally.
Implications for Practice and Research
The SETQ Smart was developed for the purpose of formative performance evaluation, meaning that teaching faculty can use performance feedback to develop their professional performance improvement plans. For this reason, the collected feedback, including the ratings and narratives, can be summarized and fed back to teaching faculty in personalized reports. In our conversations with teaching faculty using the SETQ, the combination of quantitative and qualitative feedback often proves crucial. The written comments tend to parallel and elaborate on the reports and ratings provided by residents.13  However, the use of feedback to enhance performance is not self-evident36 ; it is, therefore, recommended to discuss, plan, and monitor follow-up actively. Engaging (peer) coaches or mentors may effectively facilitate the needed reflection process.37  Previous research on the original SETQ indeed showed that repeated participation in a system of measuring performance and following up on its results might lead to improved faculty teaching performance in the eyes of residents.12,18  These studies need to be replicated to document whether the SETQ Smart would also have positive effects on residents’ evaluations of faculty teaching performance and in what context.
This pilot study needs wider international adaptation and adoption to generate and share evidence on what works and what does not. Accumulating more data in future work would support detailed country-specific confirmatory analyses, not done in this study that pooled data across countries. In the countries that piloted the SETQ Smart, repeated use in anesthesiology training programs will be encouraged. In addition, training programs in new countries can collaborate with or contact the SMART project to explore how to adapt and use the questionnaires in their own settings (as was recently done by a Swedish residency program).
For several reasons, we expect that SETQ Smart can be used successfully by various countries. First, there are enough similarities in anesthesiology training and practice between many countries38,39  to allow for structured teaching performance measurement and monitoring using tools like SETQ Smart. Given that the new tools appear to work in the very diverse countries in this study, it is reasonable to expect that these tools will work well in other countries.39  Second, the original SETQ questionnaires6,7  developed in The Netherlands were informed by the 26-item Stanford Faculty Development Program instrument from Stanford University (Stanford, California).40–42  Third, just as the work on effective clinical teaching by Stanford University researchers informed the original SETQ design, their latest research-based views on teaching informed the redesign of the SETQ into the SETQ Smart.19  To close the loop, the widely published SETQ is now included in Stanford’s faculty development curriculum, demonstrating the global relevance of the SETQ Smart. Fourth, the first SETQ publication in Anesthesiology7  elicited multiple reactions from American anesthesiologists and residency program directors from across the United States and elsewhere, including requests to use the questionnaires in local faculty assessment programs. We interpret the international interest in SETQ as face validation and evidence of the international transportability of these tools. Finally, our finding that the new questionnaires work well for the very different language, cultural, and healthcare system settings supports the view that the new questionnaires can be deployed in many countries. SETQ Smart may present new research and cross-national learning opportunities on teaching performance in anesthesiology.
Conclusions
High-quality training is essential for excellence in anesthesiology. The SETQ Smart questionnaires represent the first international measurement tools for evaluating, monitoring, and improving anesthesiology faculty teaching performance in residency training programs. The SETQ Smart may also be useful to policymakers in their continuous efforts to investigate and minimize variations in anesthesiology training within and across countries.
Acknowledgments
The authors acknowledge the input of all faculty and residents in the participating anesthesiology training programs. They thank the European Society of Anesthesiology (Brussels, Belgium) for facilitating the project presentation at the annual conference in Stockholm. The authors also thank D. Benders and R. van der Sanden from Medox.nl, ’s-Hertogenbosch, The Netherlands, for providing Professional Performance Online as a platform for SETQ Smart, and the SMART Collaborators.
Research Support
Support was provided solely from institutional and/or departmental sources. Support was provided by Center for Evidence Based Education, The Academic Medical Center, Amsterdam, The Netherlands (to Dr. Lombarts) and Department of Anesthesiology, The Academic Medical Center, Amsterdam, The Netherlands (to Dr. Hollmann). Funds were received from the European Society of Anesthesiology for presenting the project at the annual meeting in Stockholm (to Dr. Lombarts). Supported by the Department of Anaesthesia and Intensive Care Medicine, Craigavon Area Hospital, Portadown, United Kingdom (to Dr. Ferguson). Supported by the Central Denmark Region (Aarhus, Denmark) foundation for improvement of the quality in specialist training (to Dr. Malling). Supported by Department of Epidemiology, Fielding School of Public Health, The University of California, Los Angeles, Los Angeles, California (to Dr. Arah). Dr. Arah received additional funding for his contributions to the larger project “Quality of clinical teachers and residency training programs,” which is co-financed by the Dutch Ministry of Health, the Academic Medical Center, Amsterdam, The Netherlands, and the Faculty of Health and Life Sciences of the University of Maastricht, Maastricht, The Netherlands.
Competing Interests
The authors declare no competing interests.
References
Frenk, J, Chen, L, Bhutta, ZA, Cohen, J, Crisp, N, Evans, T, Fineberg, H, Garcia, P, Ke, Y, Kelley, P, Kistnasamy, B, Meleis, A, Naylor, D, Pablos-Mendez, A, Reddy, S, Scrimshaw, S, Sepulveda, J, Serwadda, D, Zurayk, H Health professionals for a new century: Transforming education to strengthen health systems in an interdependent world.. Lancet. (2010). 376 1923–58 [Article] [PubMed]
Cook, M, Irby, D, O’Brien, DC Educating Physicians: A Call for Reform of Medical School and Residency. (2010). San Francisco Jossey-Bass Carnegie Foundation for the Advancement of Teaching
General Medical Council, Quality Framework for Specialty including GP training 2010.. Available at: http://www.gmc-uk.org. Accessed August 27, 2015
Royal Dutch Medical Association, The Competency profile of program directors and supervisors.. Available at: http://www.knmg.artsennet.nl. Accessed: August 27, 2015
European Union of Medical Specialists, Training Requirements for the Specialty of Anaesthesiology, Pain and Intensive Care Medicine. (2013). Brussels European Standards of Postgraduate Medical Specialist Training, EUMS
Lombarts, MJ, Bucx, MJ, Rupp, I, Keijzers, PJ, Kokke, SI, Schlack, W An instrument for the assessment of the training qualities of clinician-educators [in Dutch].. Ned Tijdschr Geneeskd. (2007). 151 2004–8 [PubMed]
Lombarts, KM, Bucx, MJ, Arah, OA Development of a system for the evaluation of the teaching qualities of anesthesiology faculty.. Anesthesiology. (2009). 111 709–16 [Article] [PubMed]
van der Leeuw, R, Lombarts, K, Heineman, MJ, Arah, O Systematic evaluation of the teaching qualities of obstetrics and gynecology faculty: Reliability and validity of the SETQ tools.. PLoS One. (2011). 6 e19142 [Article] [PubMed]
Arah, OA, Hoekstra, JB, Bos, AP, Lombarts, KM New tools for systematic evaluation of teaching qualities of medical faculty: Results of an ongoing multi-center survey.. PLoS One. (2011). 6 e25983 [Article] [PubMed]
Boerebach, BC, Arah, OA, Busch, OR, Lombarts, KM Reliable and valid tools for measuring surgeons’ teaching performance: Residents’ vs. self evaluation.. J Surg Educ. (2012). 69 511–20 [Article] [PubMed]
Boerebach, BC, Lombarts, KM, Arah, OA Confirmatory factor analysis of the System for Evaluation of Teaching Qualities (SETQ) in graduate medical training.. Eval Health Prof. (2016). 39 21–32 [Article] [PubMed]
Leeuw van der, RM, Boerebach, CM, Lombarts, KMJMH, Heineman, MJ, Arah, OA Clinical teaching performance improvement of faculty in residency training: A prospective cohort study.. Med Teach. (2016). 38 464–70 [PubMed]
van der Leeuw, RM, Overeem, K, Arah, OA, Heineman, MJ, Lombarts, KM Frequency and determinants of residents’ narrative feedback on the teaching performance of faculty: Narratives in numbers.. Acad Med. (2013). 88 1324–31 [Article] [PubMed]
Arah, OA, Heineman, MJ, Lombarts, KM Factors influencing residents’ evaluations of clinical faculty member teaching qualities and role model status.. Med Educ. (2012). 46 381–9 [Article] [PubMed]
Lombarts, KM, Heineman, MJ, Arah, OA Good clinical teachers likely to be specialist role models: Results from a multicenter cross-sectional survey.. PLoS One. (2010). 5 e15202 [Article] [PubMed]
Boerebach, BC, Lombarts, KM, Keijzer, C, Heineman, MJ, Arah, OA The teacher, the physician and the person: How faculty’s teaching performance influences their role modelling.. PLoS One. (2012). 7 e32089 [Article] [PubMed]
Boerebach, BC, Lombarts, KM, Scherpbier, AJ, Arah, OA The teacher, the physician and the person: Exploring causal connections between teaching performance and role model types using directed acyclic graphs.. PLoS One. (2013). 8 e69449 [Article] [PubMed]
Boerebach, BC, Arah, OA, Heineman, MJ, Busch, OR, Lombarts, KM The impact of resident- and self-evaluations on surgeon’s subsequent teaching performance.. World J Surg. (2014). 38 2761–9 [Article] [PubMed]
Srinivasan, M, Li, ST, Meyers, FJ, Pratt, DD, Collins, JB, Braddock, C, Skeff, KM, West, DC, Henderson, M, Hales, RE, Hilty, DM “Teaching as a Competency”: Competencies for medical educators.. Acad Med. (2011). 86 1211–20 [Article] [PubMed]
Swing, SR The ACGME outcome project: Retrospective and prospective.. Med Teach. (2007). 29 648–54 [Article] [PubMed]
Frank, JR The CanMEDS 2005 Physician Competency Framework. Better Standards. Better Physicians. Better Care. (2005). Ottawa The Royal College of Physicians and Surgeons of Canada
Finstad, K Response interpolation and scale sensitivity: Evidence against 5-point scales.. J Usability Stud. (2010). 5 104–110
Russell, CJ, Bobko, P Moderated regression analysis and Likert scales: Too coarse for comfort.. J Appl Psychol. (1992). 77 336–42 [Article] [PubMed]
Diefenbach, MA, Weinstein, ND, O’Reilly, J Scales for assessing perceptions of health hazard susceptibility.. Health Educ Res. (1993). 8 181–92 [Article] [PubMed]
Guillemin, F, Bombardier, C, Beaton, D Cross-cultural adaptation of health-related quality of life measures: Literature review and proposed guidelines.. J Clin Epidemiol. (1993). 46 1417–32 [Article] [PubMed]
Streiner, DL, Norman, GR Health Measurement Scales: A Practical Guide to Their Development and Use. (2008). 4th edition Oxford Oxford University Press
Ullian, JA, Bland, CJ, Simpson, DE An alternative approach to defining the role of the clinical teacher.. Acad Med. (1994). 69 832–8 [Article] [PubMed]
Boor, K, Teunissen, PW, Scherpbier, AJ, van der Vleuten, CP, van de Lande, J, Scheele, F Residents’ perceptions of the ideal clinical teacher—A qualitative study.. Eur J Obstet Gynecol Reprod Biol. (2008). 140 152–7 [Article] [PubMed]
Galea, S, Tracy, M Participation rates in epidemiologic studies.. Ann Epidemiol. (2007). 17 643–53 [Article] [PubMed]
Ehrenfeld, JM, McEvoy, MD, Furman, WR, Snyder, D, Sandberg, WS Automated near-real-time clinical performance feedback for anesthesiology residents: One piece of the milestones puzzle.. Anesthesiology. (2014). 120 172–84 [Article] [PubMed]
Baird, M, Daugherty, L, Kumar, KB, Arifkhanova, A Regional and gender differences and trends in the anesthesiologist workforce.. Anesthesiology. (2015). 123 997–1012 [Article]
Thompson, CA, Arah, OA Selection bias modeling using observed data augmented with imputed record-level probabilities.. Ann Epidemiol. (2014). 24 747–53 [Article] [PubMed]
Helmich, E, Boerebach, BC, Arah, OA, Lingard, L Beyond limitations: Improving how we handle uncertainty in health professions education research.. Med Teach. (2015). 37 1043–50 [Article] [PubMed]
van der Leeuw, RM, Lombarts, KM, Arah, OA, Heineman, MJ A systematic review of the effects of residency training on patient outcomes.. BMC Med. (2012). 10 65 [Article] [PubMed]
Ortwein, H, Blaum, WE, Spies, CD Anesthesiology residents’ perspective about good teaching–a qualitative needs assessment.. Ger Med Sci. (2014). 12 Doc05 [PubMed]
van der Leeuw, RM, Slootweg, IA, Heineman, MJ, Lombarts, KM Explaining how faculty members act upon residents’ feedback to improve their teaching performance.. Med Educ. (2013). 47 1089–98 [Article] [PubMed]
Overeem, K, Wollersheimh, HC, Arah, OA, Cruijsberg, JK, Grol, RP, Lombarts, KM Factors predicting doctors’ reporting of performance change in response to multisource feedback.. BMC Med Educ. (2012). 12 52 [Article] [PubMed]
Steurer, MP, Ganter, MT Comparison and contrast of anesthesia practice in Europe and the US.. ASA Monitor. (2015). 79 18–20
Egger Halbeis, CB, Schubert, A Staffing the operating room suite: Perspectives from Europe and North America on the role of different anesthesia personnel.. Anesthesiol Clin. (2008). 26 637–63, vi [Article] [PubMed]
Skeff, KM, Stratos, GA, Berman, J, Bergen, MR Improving clinical teaching. Evaluation of a national dissemination program.. Arch Intern Med. (1992). 152 1156–61 [Article] [PubMed]
Litzelman, DK, Stratos, GA, Marriott, DJ, Skeff, KM Factorial validation of a widely disseminated educational framework for evaluating clinical teachers.. Acad Med. (1998). 73 688–95 [Article] [PubMed]
Litzelman, DK, Westmoreland, GR, Skeff, KM, Stratos, GA Factorial validation of an educational framework using residents’ evaluations of clinician-educators.. Acad Med. (1999). 74 (10 suppl) S25–7 [Article] [PubMed]
Appendix 1: Overview of All SETQ SMART Collaborators
The SETQ SMART collaborators in alphabetical order are as follows: O. A. Arah, M.D., M.Sc., D.Sc., M.P.H., Ph.D. (UCLA, Los Angeles, California); M. M. Berger, M.D. (Paracelsus Medical University, Salzburg, Austria); A. Ferguson, M.B., F.R.C.A., F.F.I.C.M., M.Acad.M.Ed. (Craigavon Area Hospital, Portadown, United Kingdom); E. van Gessel, M.D., Ph.D. (European Union of Medical Specialists, Brussels, Switzerland); R. Hoff, M.D., Ph.D. (University Medical Center, Utrecht, The Netherlands); M. W. Hollmann, M.D., Ph.D., D.E.A.A. (Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands); P. Houweling, M.D., Ph.D. (Diakonessen Hospital, Utrecht, The Netherlands); S. Loer, M.D., Ph.D. (Free University, Amsterdam, The Netherlands); Kiki M. J. M. H. Lombarts, M.Sc., Ph.D. (Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands); B. Malling, M.D., Ph.D., M.H.P.E. (Aarhus University, Aarhus, Denmark); S. A. Padosch, M.D., M.B.A. (Universitätsklinikum Köln, Köln, Germany); M. J. Schramm, M.D. (Universitätsklinikum Köln, Köln, Germany); W. S. Schlack, M.D., Ph.D. (Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands); L. A. Steiner, M.D., Ph.D. (University Hospital of Basel, Basel, Switzerland); R. J. Stolker, M.D., Ph.D. (Erasmus Medical Center, Rotterdam, The Netherlands).
Appendix 2: SETQ Smart Questionnaire
Table 1.
Characteristics of Residents and Faculty Who Participated in the Study
Table 2.
Item and Scale Characteristics, Internal Consistency Reliability, and Item-Total Correlations
Table 3.
Descriptive Statistics and Interscale Correlations for Residents’ and Faculty Evaluations Separately
Table 4.
Correlations Between Scales and Global Ratings of (1) Faculty Being Seen as a Role Model Teacher and (2) Faculty’s Overall Teaching Performance, Estimated Separately for Residents’ and Faculty’s Evaluations
Table 5.
Number of Residents’ Evaluations Needed Per Faculty