Let me summarize a few points of crystallization for this thread and provide an argument that I think many of us who have been through the process can buy.
Points in favor of making STEP1 pass/fail
1. Minorities and low SES people do worse on STEP1
2. STEP1 is not a good measure of clinical ability
3. Making STEP1 pass/fail would incentivize using better markers for clinical ability
Points against making STEP1 pass/fail
1. It would remove the major meritocratic criterion present
2. This would in turn place even heavier emphasis on school prestige
3. It would force more research years, elongating med school in an era of more debt. This would in fact hurt low SES students, who happen to disproportionately be minorities, even more
4. There are ways the exam can be removed or the scoring modified such that less emphasis is placed on the exam, instead of going pass/fail. For example, a quintile based approach (you are reported as being in the top 20%, the next 20%, and so on). This would screw some people on the borderline of the quintiles but on balance mess with the fewest people while deemphasizing the current trend of OCPD style studying that tries to squeeze out every single last point. The top 20% cutoff would be roughly a mid to high 240s on STEP1, which is the median for even the most competitive specialties, barring a couple that have reached the 250 mark.
5. DO students will get screwed. USMLE is the only objective way for many to "prove" they can be as "good" as their MD counterparts.
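To make the quintile idea in point 4 concrete, here is a minimal sketch of how band reporting could work. Everything in it is an assumption for illustration: the function names are made up, the cohort is placeholder numbers rather than real STEP1 scores, and the cutoffs are computed from the cohort itself, which is not necessarily how a testing body would set them.

```python
from bisect import bisect_right
from statistics import quantiles


def quintile_cutoffs(scores):
    """Return the four cutoffs that split a cohort into five equal bands."""
    return quantiles(scores, n=5, method="inclusive")


def reported_band(score, cutoffs):
    """Map a raw score to a band: 1 = top 20%, 5 = bottom 20%.

    A score exactly on a cutoff is counted in the higher band
    (an arbitrary choice made for this sketch).
    """
    return 5 - bisect_right(cutoffs, score)


# Placeholder cohort (the values 1..100), NOT real STEP1 scores.
cohort = list(range(1, 101))
cutoffs = quintile_cutoffs(cohort)

# Two examinees one point apart can land in different bands,
# which is the borderline problem mentioned in point 4.
band_low = reported_band(80, cutoffs)
band_high = reported_band(81, cutoffs)
```

With this placeholder cohort, 80 and 81 fall on opposite sides of the top cutoff, so a one point difference changes the reported band, while everyone from 81 to 100 is reported identically.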
Underrepresented minorities have lower entrance requirements to begin with. The admit rates for a 3.60-3.79 GPA with a 30-32 on the MCAT, by race:
Blacks: 93.7%
Hispanics: 83.4%
Whites: 63%
Asians: 57.7%
All tables publicly available
Data on U.S. medical school applicants, matriculants, enrollments, and graduates; as well as data on MD-PhD students, residency, and residency applicants.
www.aamc.org
The MCAT correlates with USMLE weakly to moderately (ranges from 0.4 for the whole thing in some studies to as high as 0.6, when looking at biological reasoning only). Admission to med school means that one has gone through undergraduate education. Undergraduate education completion means that one should be at a college reading level. STEP1, in terms of non medical terminology, can be argued to be written at below college reading level. Some of my classmates would have more trouble understanding a tougher article in the Atlantic or WSJ than they would a STEP1 question. The exam is probably calibrated, with regard to non medical terminology, to be at something like a 10th grade reading level, which is why so many foreign grads, including those with poor English backgrounds, can still score in the upper echelons.
The STEP1 is a knowledge exam. It is a licensing test, not an aptitude test. The MCAT is an aptitude test but still with a heavy knowledge component, unlike the LSAT. This is why MENSA only accepted the latter in the past (it no longer accepts either). The rate limiting step for success on the STEP1 is how well you have memorized the information and the relevant connections. If one can meritocratically get into med school, one can do well on the STEP1.
The MCAT is based on years of accumulated knowledge with a big high level reading component. It, like the SAT, has some dependency on access to good prep materials and social class. But even that can be argued, given our internet age and access to a lot of good cheap resources. Also, unlike the SAT, people in tough situations can delay taking the MCAT for years, until they are prepared and "caught up" on the basic skills they need to brush up on. Even then, it can be shown that the STEP1 deviates from the MCAT and that good STEP1 scores can be achieved even by very average MCAT scorers. The median MCATs of 1st year residents for the following elite specialties:
1. ENT: 33
2. Plastics: 32
3. Neurosurg: 32
4. Ortho: 32
Test Scores and Experiences of First-Year Residents, by Specialty
www.aamc.org
These specialties have around 85th percentile average STEP1 scores. Essentially, average level standardized test scorers in the past suddenly became around 1 SD above the mean. Again, this reflects the difference in the exams. STEP1 is largely a knowledge test. Knowledge tests rely more on crystallized intelligence than fluid intelligence. Crystallized intelligence is correlated with fluid intelligence, but it is mostly a function of, barring the tiny tiny minority of med students with near photographic memories, consistent hours spent doing quality studying with quality resources. Most school Google Drives have all of the resources. UFAP is like $500. People sign contracts saying they aren't supposed to work full time jobs in med school, when they sign up for med school. There is enough time for people to use these resources to do really well.
In the end, the exam is really about hard work. What is this nonsense about no clinical correlation? If you don't understand the basic pathophys of how an action potential works, how can you truly understand an EKG? Do you want doctors to be pattern monkeys like many PAs and NPs? The essence of STEP1 is to test the building blocks that allow one to apply foundation knowledge to novel situations that don't fit neatly into algorithms. That being said, STEP1 is getting more and more clinically oriented. I had questions on first line treatments for various things. Scores are partially a function of that. Are you saying that isn't clinically relevant?
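For anyone who wants to check the "85th percentile is around 1 SD above the mean" step, here is a quick sketch. It assumes scores are roughly normally distributed, which is my assumption and not something the AAMC tables state, and the function name is made up for illustration.

```python
from statistics import NormalDist


def percentile_to_z(percentile):
    """Convert a percentile (0-100, exclusive) to SDs above the mean,
    assuming a normal distribution (an assumption, not a given)."""
    return NormalDist().inv_cdf(percentile / 100)


# Roughly 1 SD above the mean under the normality assumption.
z_85 = percentile_to_z(85)
```

Under that assumption the 85th percentile works out to a little over 1 SD above the mean, and the 50th percentile to 0 SD.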
Here are some possible reforms
1. Make the exam more clinically relevant by writing the stem in the lingo that patients in inner city and rural areas actually use. This will show the broad variety of patients someone can work with.
2. Put some ultra basic Spanish in the question stem. It is essential to know a little bit in any ED nowadays, especially with the shoddy interpreter services that are often present at many non large academic centers.
3. Maybe some of nitty gritty biochem can really go away, but it is already trending that way. The questions mostly focus on clinical conditions.
4. Test underlying principles more. For example, contrast induced nephropathy can cause a prerenal azotemia type of BUN/creatinine. Instead of testing this by giving that type of ratio and then having most people who haven't come across that fact in a book get it wrong because they eliminate the choice, given the ratio is like 32 instead of like 12, as would be expected for an intrarenal phenomenon like contrast induced nephropathy, test the underlying pathophys. The reason this occurs is that contrast not only poisons the tubules directly, resulting in ATN, but also causes spasm of the prerenal vasculature, thus causing the prerenal azotemia type BUN/creatinine in some cases. Put in the stem the part about the spasm and ask what type of ratio would be expected. Force testing of foundational principles rather than specifically memorized cases for more esoteric things.
5. Stop testing two intuitively correct choices, where one was elucidated as the answer just by a recent trial. This makes these questions a 50/50 shot based on who guesses right or who happens to read the latest trial data. The latter is more what residency is for. The purpose of STEP1 should be testing the foundation concepts. It is lazy question writing to just put in a nitpicky question whose answer was only elucidated by a trial. Some of my STEP1 and STEP2 questions I could only find in journals. That is going too far.
My final recommendation would be a quintile based approach. That way people will still care about scores but not obsess over them. It will hurt borderline people, but that can on balance be argued to be better than the current arms race of turning everything into a one score nightmare that encourages depression, OCPD, and anxiety. Finally, shelf exams should be a thing for preclinical. That will allow better comparison of preclinical grades. NBME should release system based or subject based ones and standardize them well. They already sort of do this. But mandate it. That way the first two years will have a standardized component.
I can see why some elite schools are salivating at P/F. It provides excellent plausible deniability of, at least on the surface, commitment to the social justice vision of fairness. Yet the actual effects will be greater emphasis on school prestige, thus further entrenching and enhancing the advantage of high tier status for residency selection. This will favor their grads and then in turn favor them down the line because even more of their grads will dominate medical leadership. Some elite places have not only gotten rid of preclinical grades but also clinical ones. They want their students to be able to ride the prestige wave all the way.
All in all, I think a quintile based approach and mandating shelf exams for preclinical and clinical years would be a good approach along with reforming the exam to have more relevant questions, not only clinically but also more oriented towards foundation knowledge rather than memorization of esoteric information.
Potential biases and my background
Asian male, upper middle class NE/Cali childhood and adolescence
2330 SAT (99.5%+)
State school undergrad (free) instead of the expensive, much better ranked T20 private schools I got into. 3.7 GPA
36 MCAT (97th percentile)
Mid Tier Med School preclinical rank (about 50th percentile but thank god for P/F)
STEP1 247 (82nd percentile)
Honored all rotations and shelf exams were 80th-99th percentile: medicine, peds, neuro, psych were 99th, with surgery my lowest
STEP2CK 261 (86th percentile- based on averaging means and SDs over last 3 years, since 2019 mean and SD unavailable)