Abstract
Introduction
No standardized evaluation tool exists for fellowship applicant assessment. Assessment
tools are subject to biases and scoring tendencies that can skew scores and distort
rankings. We aimed to develop and evaluate an objective assessment tool for fellowship
applicants.
Methods
We detected rater effects in our numerically scaled assessment tool (NST), which consisted
of 10 domains rated from 0 to 9. We evaluated each domain, consolidated redundant
categories, and removed subjective categories. For the 7 remaining domains, we described
each quality and developed a question with a behaviorally anchored rating scale (BARS).
Applicants were rated by 6 attendings. Ratings from the NST in 2018 were compared
with ratings from the BARS in 2020 for distribution of data, skewness, and inter-rater reliability.
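The comparison rests on two per-domain statistics: the skewness of the pooled ratings and agreement among the 6 raters. The sketch below is a minimal illustration, assuming each domain's ratings form an applicants-by-raters matrix and that reliability was quantified with an intraclass correlation (ICC(2,1) here); the abstract does not name the exact reliability coefficient, and all function names are hypothetical.

```python
# Illustrative sketch only; assumes complete (applicants x raters) data.
import numpy as np
from scipy.stats import skew

def domain_skewness(ratings: np.ndarray) -> float:
    """Sample skewness of all ratings in one domain, pooled across raters."""
    return skew(ratings.ravel())

def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: shape (n_subjects, n_raters), no missing values.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    ms_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # applicants
    ms_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # raters
    resid = (ratings - ratings.mean(axis=1, keepdims=True)
             - ratings.mean(axis=0, keepdims=True) + grand)
    ms_err = (resid ** 2).sum() / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Example with simulated (not study) data: 34 applicants, 6 raters.
rng = np.random.default_rng(0)
nst_domain = rng.integers(5, 10, size=(34, 6)).astype(float)
print(domain_skewness(nst_domain), icc_2_1(nst_domain))
```

The ICC(2,1) here follows the Shrout and Fleiss formulation; per-domain values can then be compared between the 2018 NST and 2020 BARS cohorts.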
Results
Thirty-four applicants were evaluated with the NST and 38 with the BARS. Demographics
were similar between groups. The median score on the NST was 8 out of 9; scores <5
were used in less than 1% of all evaluations. The distribution of data was improved
with the BARS tool. In the NST, scores from 6 of 10 domains demonstrated moderate skewness
and 3 demonstrated high skewness. Three of the 7 domains in the BARS showed moderate skewness and
none showed high skewness. Two of 10 domains in the NST vs 5 of 7 domains in the BARS
achieved good inter-rater reliability.
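The abstract does not define the cutoffs behind "moderate," "high," or "good." The sketch below encodes common rules of thumb as an assumption, not the authors' definitions: |skewness| between 0.5 and 1 as moderate and above 1 as high (Bulmer), and ICC of 0.75 or above as good (Koo and Li).

```python
# Assumed thresholds only; not taken from the study.
def skewness_band(s: float) -> str:
    """Bulmer's rule of thumb: |skew| > 1 high, 0.5-1 moderate, else low."""
    a = abs(s)
    return "high" if a > 1.0 else "moderate" if a > 0.5 else "low"

def reliability_band(icc: float) -> str:
    """Koo and Li: ICC >= 0.75 counts as at least 'good'."""
    return "good" if icc >= 0.75 else "below good"

# Tally bands across a tool's domains (values illustrative, not study data).
bars_skews = [0.7, 0.4, 0.9, 0.2, 0.6, 0.3, 0.1]
print(sum(skewness_band(s) == "moderate" for s in bars_skews))  # -> 3
```

Under assumed bands like these, counting domains per tool yields summaries of the form reported above.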
Conclusion
Replacing a standard numeric scale with a BARS normalized the distribution of data,
reduced skewness, and enhanced inter-rater reliability in our evaluation tool. This
provides some validity evidence for improved applicant assessment and ranking.
Article info
Publication history
Published online: December 02, 2021
Accepted: November 29, 2021
Received: September 17, 2021
Footnotes
The authors have no conflicts of interest to disclose.
Copyright
Copyright © 2021 by Academic Pediatric Association