ASTRO Congress News

Palex80

Hey!

Was anything interesting presented at ASTRO this year?

Wallner got the boot. That was the most interesting thing I learned at ASTRO. It wasn't widely announced, however. Seems they wanted to keep it a secret from you SDN troublemakers.
 
He presented at the ADROP meeting. Got the boot?
 
Didn't get to go to much b/c of interviews, but the 9601 secondary analysis showing a non-oncologic survival detriment to using ADT in patients with PSA < 0.7 will likely make me eliminate the use of ADT in those patients, taken in combination with the initial 9601 analysis.

There was a proton vs photon study that was being presented but I didn't catch it and forget the details.
 
I think the data with the greatest chance to change practice came from RAVES, assuming you were even seeing patients eligible for adjuvant therapy in the first place.
 
Didn't get to go to much b/c of interviews, but the 9601 secondary analysis showing a non-oncologic survival detriment to using ADT in patients with PSA < 0.7 will likely make me eliminate the use of ADT in those patients, taken in combination with the initial 9601 analysis.

There was a proton vs photon study that was being presented but I didn't catch it and forget the details.
It's a post hoc study. Use with caution. Paul Nguyen is right
 
Didn't get to go to much b/c of interviews, but the 9601 secondary analysis showing a non-oncologic survival detriment to using ADT in patients with PSA < 0.7 will likely make me eliminate the use of ADT in those patients, taken in combination with the initial 9601 analysis.

There was a proton vs photon study that was being presented but I didn't catch it and forget the details.
I was surprised that this got into the plenary, and think that Dr. Spratt really oversold the data. I don't think a post-hoc analysis from a study using long-term anti-androgen therapy can really be used to argue against short-term ADT given that we have level 1 data supporting its use. The long-term toxicity profile from 2 years of AA is vastly different than 6 months GnRH. Dr. Nguyen made the very fair point that it is premature to conclude that the GETUG study was negative for a survival benefit, as more follow-up is needed. Agree there is controversy here for low PSA patients, but I honestly don't think this analysis really moves the needle.

Aside from RAVES, the HN-002 results (while not practice changing) were pretty much the only other significant study that I saw. Shame that the winning arm from this study (60 Gy + weekly cis) is not one of the arms of the current HN005 phase II-III study.
 
Wallner got the boot. That was the most interesting thing I learned at ASTRO. It wasn't widely announced, however. Seems they wanted to keep it a secret from you SDN troublemakers.

Is that true? I’d like to take some credit if so
 
RAVES results? Early salvage okay?

Yup. At median FU of 6 years no difference in rates of BCR, local or distant failure. Reduced GU toxicity in early salvage group. Early salvage definitely the way to go.
 

Pretty much as I expected. The question I have now is: what the hell are we to do with the adjuvant guys who have high risk Decipher scores?? I have been perfectly comfortable offering early salvage previously but now my urologists are ordering this damn test on everyone. It seems that every patient is high risk... how do you guys and girls incorporate Decipher into these decisions?
 
Yup. At median FU of 6 years no difference in rates of BCR, local or distant failure. Reduced GU toxicity in early salvage group. Early salvage definitely the way to go.
May take longer to tease out more info.

The original update of the SWOG study didn't show an OS benefit, but one emerged with longer follow-up, on top of reduced ADT use, improved MFS, etc.
 
The "problem" with RAVES is the study population: lots of early disease and low-GS tumors were randomized. Yet the message is clear.
 
It's a post hoc study. Use with caution. Paul Nguyen is right

Post hoc significant results to me are much more valuable than post hoc results deemed 'insignificant' or 'no difference'. We do this for many post-hoc or unplanned subset analyses. If they're significant they're implemented clinically (see lobectomy vs pneumonectomy after pre-op chemoRT per Albain), if they're non-significant then I personally feel that they won't be powered to identify a difference.

I'm happy to wait for a peer-reviewed paper before finalizing I suppose, but maybe this just reinforces my practice that PSA < 0.7 may not need ADT in salvage setting.
 
Biostatistical ruminations...
The rate of any post hoc and/or unplanned subset analyses in trials in modern oncology literature: ~100%.
The incidence of positive findings on post hoc analyses: ~5%.*
The incidence of subsequent separate paper reportage on positive post hoc analyses: really high.**
The incidence of subsequent separate paper reportage on negative post hoc analyses: really low.
Post hoc significant results get the glory; but are they really "much more valuable?"

* Swat enough hornet nests, eventually you'll get stung. Run enough post hoc analyses, eventually you'll get a p<0.05 result.
** For positive treatment results. For positive toxicity results, not as high.
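
To put a rough number on the hornet-nest footnote, here is a minimal sketch of how fast the odds of at least one chance "hit" climb with the number of looks. It assumes the analyses are independent and that every null hypothesis is true; real subgroup analyses overlap and are correlated, so the inflation is usually somewhat smaller than this. The function name is made up for illustration.

```python
# Chance of at least one "significant" result among k independent
# post hoc tests when every null hypothesis is actually true.
def prob_any_false_positive(k: int, alpha: float = 0.05) -> float:
    return 1.0 - (1.0 - alpha) ** k


for k in (1, 5, 10, 20, 50):
    chance = prob_any_false_positive(k)
    print(f"{k:>2} analyses -> {chance:.0%} chance of at least one p < 0.05")
```

Under those assumptions, 10 looks already give about a 40% chance of at least one p < 0.05, and 20 looks about 64%.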
 
Didn't hear about RAVES. I think that's great honestly. I'd rather early salvage a fraction of that population who end up needing treatment than adjuvanting them all when their risk of permanent incontinence is highest.

Of course we'll need to confirm that IRL, the patients are being sent for EARLY salvage, rather than delayed, but that's not something the trialists can control.
 
Did I hear this right - there was significant mortality from taking Casodex for 2 years? How?
 
There is evidence that bicalutamide may lead to excess mortality from the EPC studies from long ago.

 
Casodex isn't a benign drug in my experience. I really try not to use it.
 
I have a 71-year-old gentleman for whom this is a very relevant discussion.

Preop PSA 5.4, biopsy 3/12 cores positive for GS 4+5=9 disease (x2) and 4+4=8 in the third core. Underwent RP, with pathology showing only 3+3=6 disease (?!). Entire prostate was step-sectioned and tested to ensure it was from the correct patient, which it was. No other disease found. No ECE or SVI. No SM or LN involvement.

6-week postop PSA 0.41.

I'm obviously worried for the presence of undetectable mets in his case, given his findings, but metastatic workup was negative.

What would everyone do, considering the new data we have? HT alone? HT+RT? RT alone? He's in very good health and working full-time as an architect.

Edit: By HT I mean Lupron. I also stay away from bicalutamide, given the well-documented SEs.
 
Was a pre-op MR done?

The guy as described almost certainly has occult mets. I think it's still below the typical detectable level of Axumin so that's probably low yield. Traditional imaging as well. I'd probably have a very frank conversation with him, let him know my honest thoughts, and offer him ADT +/- XRT vs (more strongly) med onc referral.
 
I would check another post-op PSA in 6 weeks. If still 0.41 or higher:

Salvage RT to LNs per SPPORT + ADT with 6 months of Lupron would be my preference @OTN

Axumin will be negative with this PSA.

Was the original biopsy done at an outside facility? If so, I would have those slides reviewed at your institution to confirm that the initial cores were actually GS 9 or 8 disease.
 
Did I hear this right - there was significant mortality from taking Casodex for 2 years? How?
Probably need a normal level of testosterone to be a healthy functioning human male. And not just in the sex dept. I recall that there has been a lot of data associating adverse health w/ testosterone suppression in males, even males with prostate cancer. E.g., physical activity is one of the main determinants of survival in breast cancer. So theoretically if you were to suppress physical activity in breast cancer it would lead to decreased survival. Anti-testosterone therapy really suppresses men's physical activity levels. Heck, even if you take Propecia it makes life kind of sucky, suppresses exercise, etc.
 
Casodex isn't a benign drug in my experience. I really try not to use it.
I will actually give a loading dose of degarelix (Firmagon) when possible before switching to Lupron, so I can avoid giving Casodex at all at the start to blunt the testosterone flare from Lupron.
 
Post hoc significant results to me are much more valuable than post hoc results deemed 'insignificant' or 'no difference'. We do this for many post-hoc or unplanned subset analyses. If they're significant they're implemented clinically (see lobectomy vs pneumonectomy after pre-op chemoRT per Albain), if they're non-significant then I personally feel that they won't be powered to identify a difference.

I'm happy to wait for a peer-reviewed paper before finalizing I suppose, but maybe this just reinforces my practice that PSA < 0.7 may not need ADT in salvage setting.
I wouldn't believe Spratt over Nguyen. I have no affiliation with either of their centers, but one of those two talks with his emotions on his sleeve, guns ablaze; and the other is a kind, mild-mannered person. I know which one of them is prone to distort his viewpoint.

BTW, your post-hoc analysis argument would also mean that we may have had to stop giving TNBC hypofrac after results of the Whelan trial. And yet...

Also, regarding Albain...surgeons' blind faith in that post hoc analysis led them to cut on way too many people when the clearly better answer is to spare the pt any post-op M&M and give the powerful agent known as durva thereafter. Can't do all that if you cut because you put all your faith in a post hoc analysis.
 

Spratt? Emotions on his sleeves? He doesn't usually wear sleeves...
[Attached image: _14038-spratt-d-39.jpg]
 
I have a 71-year-old gentleman for whom this is a very relevant discussion.

Preop PSA 5.4, biopsy 3/12 cores positive for GS 4+5=9 disease (x2) and 4+4=8 in the third core. Underwent RP, with pathology showing only 3+3=6 disease (?!). Entire prostate was step-sectioned and tested to ensure it was from the correct patient, which it was. No other disease found. No ECE or SVI. No SM or LN involvement.

6-week postop PSA 0.41.

I'm obviously worried for the presence of undetectable mets in his case, given his findings, but metastatic workup was negative.

What would everyone do, considering the new data we have? HT alone? HT+RT? RT alone? He's in very good health and working full-time as an architect.

Edit: By HT I mean Lupron. I also stay away from bicalutamide, given the well-documented SEs.

PSMA-PET-CT if available. If not, re-image with whole body MRI.
 
I wouldn't believe Spratt over Nguyen. I have no affiliation with either of their centers, but one of those two talks with his emotions on his sleeve, guns ablaze; and the other is a kind, mild-mannered person. I know which one of them is prone to distort his viewpoint.

BTW, your post-hoc analysis argument would also mean that we may have had to stop giving TNBC hypofrac after results of the Whelan trial. And yet...

Also, regarding Albain...surgeons' blind faith in that post hoc analysis led them to cut on way too many people when the clearly better answer is to spare the pt any post-op M&M and give the powerful agent known as durva thereafter. Can't do all that if you cut because you put all your faith in a post hoc analysis.
0.7 is such an obviously arbitrary level derived from letting a data set make a hypothesis for you.

I'm sure .2, .3, .4, .5, .6, .8, .9, 1.0, 1.1..... were similarly tested. 0.7 happened to win the race to <0.05. Random data sets have random break points.

I mean, would you believe it at all if the number was 0.78 rather than 0.7? If not, believe neither.
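
For anyone who wants to see the "race to <0.05" in action, below is a rough simulation sketch. Everything in it is hypothetical: the cohort size, the 30% event rate, the uniform PSA distribution, and the function names are all made up, and it simplifies the real question (a treatment-by-PSA interaction) down to "does outcome differ above vs below the cutpoint." The outcome is generated independently of PSA, so every "significant" split is a false positive by construction.

```python
import math
import random


def two_proportion_p(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (events_a + events_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        return 1.0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (events_a / n_a - events_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def min_p_over_cutpoints(psa, died, cutpoints):
    """Smallest p-value found by splitting the cohort at each cutpoint."""
    best = 1.0
    for cut in cutpoints:
        low = [d for x, d in zip(psa, died) if x < cut]
        high = [d for x, d in zip(psa, died) if x >= cut]
        p = two_proportion_p(sum(low), len(low), sum(high), len(high))
        best = min(best, p)
    return best


def false_positive_rates(n_sims=500, n_patients=200, seed=0):
    """Compare one pre-specified cutpoint with the best of ten cutpoints."""
    rng = random.Random(seed)
    cutpoints = [round(0.2 + 0.1 * i, 1) for i in range(10)]  # 0.2 ... 1.1
    single = scanned = 0
    for _ in range(n_sims):
        # Hypothetical null cohort: outcome does not depend on PSA at all.
        psa = [rng.uniform(0.0, 1.5) for _ in range(n_patients)]
        died = [1 if rng.random() < 0.3 else 0 for _ in range(n_patients)]
        if min_p_over_cutpoints(psa, died, [0.7]) < 0.05:
            single += 1
        if min_p_over_cutpoints(psa, died, cutpoints) < 0.05:
            scanned += 1
    return single / n_sims, scanned / n_sims


single_rate, scanned_rate = false_positive_rates()
print(f"pre-specified 0.7 cutpoint: {single_rate:.1%} false positives")
print(f"best of 10 cutpoints:       {scanned_rate:.1%} false positives")
```

The pre-specified cutpoint should land near the nominal 5%, while picking the winner among ten cutpoints should come out well above that. It says nothing about whether the 9601 authors actually scanned cutpoints; it only illustrates why a data-derived threshold deserves extra skepticism.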
 
Probably need a normal level of testosterone to be a healthy functioning human male. And not just in the sex dept. I recall that there has been a lot of data associating adverse health w/ testosterone suppression in males, even males with prostate cancer. E.g., physical activity is one of the main determinants of survival in breast cancer. So theoretically if you were to suppress physical activity in breast cancer it would lead to decreased survival. Anti-testosterone therapy really suppresses men's physical activity levels. Heck, even if you take Propecia it makes life kind of sucky, suppresses exercise, etc.
I always thought as a general rule castrated animals lived longer?
 
A 2-second Google search returned: 2 of 3 known studies of castration in humans showed significant longevity gains... as well as a ton of results for castration + longevity.

The notion of antagonistic pleiotropy is pretty well accepted: factors associated with growth at young/reproductive age are bad in old age (mTOR, growth hormone, etc.).
 
The typical eunuch, correct me if I'm wrong, gets castrated very early in life. I'm talking prepubertally. So maybe there's a difference there versus losing your testosterone in the CaP-relevant age group. And maybe there's a difference with chemical castration versus Errol Flynn-type castration. Don't get me wrong, I give the ADT of course when clearly indicated. But for sure it makes men feel not that great, makes them lose muscle mass, etc. So is it all benefit/no risk, or a Faustian bargain?
 
Outside of bona fide high-risk patients, I rarely recommend ADT.
 
I wouldn't believe Spratt over Nguyen. I have no affiliation with either of their centers, but one of those two talks with his emotions on his sleeve, guns ablaze; and the other is a kind, mild-mannered person. I know which one of them is prone to distort his viewpoint.

BTW, your post-hoc analysis argument would also mean that we may have had to stop giving TNBC hypofrac after results of the Whelan trial. And yet...

Also, regarding Albain...surgeons' blind faith in that post hoc analysis led them to cut on way too many people when the clearly better answer is to spare the pt any post-op M&M and give the powerful agent known as durva thereafter. Can't do all that if you cut because you put all your faith in a post hoc analysis.

I'm not believing Spratt over Nguyen. Even if it wasn't Dan Spratt up there I'd still have the same thought process. It's OK if you disagree and I wouldn't fault anybody for not having this change their practice, but given that PSA < 0.7 was already shaky ground (IMO) in regards to a survival benefit of long-term Casodex, this study reinforces my potential practice as an attending.

Can you link the Whelan TNBC hypofrac post-hoc analysis? I was only aware of the G3 data that has been since disproven with a change from a rare grading system to the more commonly used one.

I agree with you that, since Durva, those patients are probably best served by definitive chemoRT (mainly because of the Durva addition rather than the magic of chemoRT), but I'm just saying that post-hoc analyses frequently drive practice.

Outside of bona fide high-risk patients, I rarely recommend ADT.

For definitive patients, Unfavorable intermediate risk patients get short term ADT from me. Favorable intermediate risk I'm not doing ADT. High risk get the discussion that 18 months is the minimum that has been studied that is not visibly inferior. If they're younger I will generally say "let's try 6 months and see what happens" then "let's continue for a year and see what happens".
 
My bad, I meant G3 not TNBC.
I agree with you that post hoc analyses often drive practice, but I don't believe they should. Every trial has forest plots of post hoc analyses, and that doesn't mean one should conjure up a specific subgroup(s) of pts to give that trial therapy to. Guess for durva that means withholding it if the pt is >14d from CRT...
 

Comes back to my original point - Post-hoc analyses that are negative don't drive my practice because they are usually not sufficiently powered for a certain conclusion.

That being said, I do think durva starting closer to RT is likely the better play to catch as much of the antigen presentation increase from RT (even fractionated). Of course the counter argument is that the confounder is the ability of the patient to tolerate chemoRT, which is a reasonable conclusion.
 

This doesn't make sense. "Post-hoc analyses that are negative don't drive my practice because they are usually not sufficiently powered for a certain conclusion."
- So only the positive post-hoc ones drive your practice? And they're powered differently than negative ones?

There are always a few items on a forest plot that are statistically positive but don't mean much clinically. All the more reason to take it with caution.

Just remember that post hoc analyses are usually univariate comparisons, which is a main reason they're not reliable. For instance, say you don't want to give ADT to a patient with a PSA of 0.5, but what if that same patient had T3b Gleason 10 disease with multiple positive margins? The statement that "only the PSA value matters" ignores other variables since it's just a univariate comparison. All the more reason to proceed with caution.
 

My general philosophy is that a post-hoc analysis that is statistically significant would continue to be statistically significant if the sample size increased (directly related to power), as the standard deviation would (generally) shrink, and thus confidence intervals would (generally) shrink. With negative post-hoc analyses, the issue is that if the sample size was increased, would the non-significant difference become significant? Think of it like this: in trials positive for their main endpoint, we don't evaluate whether the study was powered for the conclusion made. In trials negative for their main endpoint, we frequently evaluate whether the study was adequately powered to detect a statistical difference given the frequent differences seen in event rates between the folks planning the study and the patients actually being treated by it.

I do agree with you that it is sometimes a univariate analysis. I'm all for interpreting positive post-hoc analyses with caution, but I'm not of the opinion that they are all worthless as a blanket statement, and I disagree that the solution to 9601 is re-running the trial with patients only with PSA < 0.7, for example.

PSA is just one factor in deciding use of ADT. Some people give ADT to every single salvage case they ever treat, which is fine and an individual's practice. Some of us would like to try to select patients a bit better.

If the Gleason 10 patient in the example has micrometastatic disease already, ADT given concurrently with salvage radiation is not going to cure them, and you have an explanation of the residual PSA with localized disease given multiple positive margins. I would be more wary of a T3b G10 that had negative margins and a positive post-op PSA.
 
My general philosophy is that a post-hoc analysis that is statistically significant would continue to be statistically significant if the sample size increased (directly related to power), as the standard deviation would (generally) shrink, and thus confidence intervals would (generally) shrink. With negative post-hoc analyses, the issue is that if the sample size was increased, would the non-significant difference become significant? Think of it like this: in trials positive for their main endpoint, we don't evaluate whether the study was powered for the conclusion made.
("Power" becomes a moot point once a difference is found; you're out of type II territory in that circumstance and into type I error territory)
In trials negative for their main endpoint, we frequently evaluate whether the study was adequately powered to detect a statistical difference given the frequent differences seen in event rates between the folks planning the study and the patients actually being treated by it.
I love talking 'bout stats.

When you have a positive result in any significance test, your idea about the p-value changing from significant to insignificant (based on sample size changes) is dependent on MANY factors. But perhaps the main one is this: it depends on what the initial p-value was. If the initial p-value was 0.04 in a post hoc analysis based on ~100 samples where one group has a measured mean of 1.05 and the other group's mean is 1.00, there is actually a reasonable probability that more samples COULD change the p-value to >0.05. But if the p-value were <0.00000001 in a sample of 100, and one group's mean were 15,000 and the other group's mean were 1.00, it's VERY unlikely more samples could change the p-value. This is partially why there's a crisis of reproducibility in science and also why there are calls to lower the significance threshold to p<0.005. Many of these so-called significant results in the "pre hoc" and post hoc settings have what's called extreme fragility: in several studies that changed the standard of care, just taking a few "positive" patients OUT of the sample would have made a p<0.05 analysis insignificant. Thus, the reason medical practice often has to be "reversed." So it isn't just "what if we added more patients--it wouldn't matter for the p-value," it's also "what if we hadn't added the 2 or 3 or 4 more patients that we added? ... the study would've been negative."
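
A quick way to play with the fragility idea is sketched below. It uses the commonly described flip-based fragility index (turn non-events into events in the arm with fewer events until Fisher's exact p reaches 0.05), which is related to but not identical to pulling patients out of the sample. The trial numbers are hypothetical and the function names are made up for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as unlikely as, or less likely than, the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Outcome flips needed in the lower-event arm before p >= alpha."""
    if events_a > events_b:
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a
    flips = 0
    while events_a < n_a:
        p = fisher_exact_two_sided(events_a, n_a - events_a,
                                   events_b, n_b - events_b)
        if p >= alpha:
            return flips
        events_a += 1  # flip one non-event to an event
        flips += 1
    return flips


# Hypothetical trial: 1/20 events in one arm vs 9/20 in the other.
p_value = fisher_exact_two_sided(1, 19, 9, 11)
index = fragility_index(1, 20, 9, 20)
print(f"p = {p_value:.3f}, fragility index = {index}")
```

In this made-up example the starting p-value is about 0.008, yet the fragility index is only 2: two patients with a different outcome and the result is no longer "significant."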

The knife cuts both ways. Sometimes we "know" results are different but can't make a claim on statistical significance. This is a different discussion. But I don't think you can have a discussion about "I only change my practice based on positive results" (ie making a few type I errors) without also exploring "I wonder how many positive results I miss out on by 100% ignoring all negative results" (ie making many type II errors) and the repercussions of accepting all "positive" results. In medicine, like in the legal system, we have aimed toward making far fewer type I errors (finding the innocent guilty) than type II errors (letting guilty people walk free). However, it's a trade-off, and we are probably making a ton of type I errors at p-values of 0.05 or less instead of a more stringent level. I'm just a jaded skeptic re: trial results nowadays and keep to heart the Newtonian ideal that when it comes to practice-changing decisions based on single-trial post hoc results, "every action has an equal and opposite reaction."
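
To see what a stricter threshold buys you, here is a back-of-the-envelope sketch of the standard positive predictive value arithmetic. The 10% prior (share of tested hypotheses that are actually true) and the 80% power are assumptions picked purely for illustration, not numbers from any trial discussed here.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of 'significant' findings that reflect a real effect."""
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Assumed: 1 in 10 tested hypotheses is true, studies have 80% power.
ppv_05 = positive_predictive_value(prior=0.10, power=0.80, alpha=0.05)
ppv_005 = positive_predictive_value(prior=0.10, power=0.80, alpha=0.005)
print(f"alpha = 0.05:  {ppv_05:.0%} of positive findings are real")
print(f"alpha = 0.005: {ppv_005:.0%} of positive findings are real")
```

With those assumed inputs, roughly a third of p < 0.05 "positives" are false, versus about 1 in 20 at p < 0.005. The trade-off is that holding power at 80% under the stricter threshold takes a larger trial, which is the type II side of the same coin.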
 

I can't agree more with Scarbtj here. Post-hoc analyses are riddled with problems; namely, you can look at a large number of variables and by sheer chance something will show a p-value of < 0.05, i.e. if you start calculating for v5, v6, v7, v8, v9, v10..., LC among M, LC among those < 20 yo, < 30, < 40, etc., you will eventually get something (I think something like this happened in the RTOG 0617 analysis...). That's not wrong in itself, and such analyses are good for generating hypotheses or for use when you know the question will never have a clinical trial to answer it definitively (i.e. what is the best Vx for the heart). Rarely (ever?) does anyone publish all the negative post-hoc findings (a suggestion that has been made by statisticians).

After doing too many retrospective reviews, you learn to hunt / p-hack / data dredge so you can publish something, hence the utmost importance of pre-specified endpoints (i.e. no cheating!).

See this paper on interim analyses: "Multiplicity in randomised trials II: subgroup and interim analyses" (PubMed). It truly is one of the best papers on stats for clinicians I have come across. Not only does it explain what the O'Brien-Fleming and Peto group sequential stopping methods are, it also echoes what Scarbtj is saying about his p < 0.005, courtesy of the man himself, Dr. John Ioannidis.

Graph below shows why the interim sequential tests use a much smaller alpha (Peto p = 0.001 and O'Brien-Fleming p = 0.005 for the first interim analysis, if doing 2 analyses, in order to stop the treatment). You need to "spread out the alpha" b/c just from sheer chance you can get a p-value < 0.05. In the graph below, the p-value happened to reach its goal at 18 mo by chance, but if you had waited just a little longer it would've gone back up very high. TL;DR post-hoc analyses are prone to lots of statistical errors and I would hold them tentatively until verified further.

1569348731406.png
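
For anyone who wants to see the "spread out the alpha" point without the graph, here is a small simulation sketch in Python. It assumes a trial with no true effect, one interim look at exactly half the data and one final look, and compares testing at |z| > 1.96 both times against the commonly tabulated two-look O'Brien-Fleming cutoffs (about 2.797 and 1.977). Those cutoffs, the number of simulations, and the function name are my assumptions for illustration, not numbers from the paper above.

```python
import random
from math import sqrt


def simulate_two_looks(first_cut, final_cut, n_sims=20000, seed=0):
    """Fraction of null trials (no true effect) that cross the |z| boundary
    at either the halfway look or the final look."""
    rng = random.Random(seed)
    crossed = 0
    for _ in range(n_sims):
        z_half = rng.gauss(0, 1)                       # z-statistic on the first half
        z_full = (z_half + rng.gauss(0, 1)) / sqrt(2)  # z-statistic on all the data
        if abs(z_half) > first_cut or abs(z_full) > final_cut:
            crossed += 1
    return crossed / n_sims


naive = simulate_two_looks(1.96, 1.96)    # p < 0.05 at both looks
obf = simulate_two_looks(2.797, 1.977)    # O'Brien-Fleming style cutoffs
print("false positive rate, naive:", naive)
print("false positive rate, O'Brien-Fleming:", obf)
```

The naive version lets the overall false positive rate drift well above 5%, while the stricter interim cutoff pulls it back to roughly the nominal level.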


Edit: Don't forget the infamous ISIS-2 cardiology study, which noted a benefit to aspirin after MI in patients born under all astrological signs except Gemini and Libra. The legend is that the author was forced to do a post-hoc analysis and so decided to have a little fun at the editor's expense. Point being, if we take other post-hoc analyses to be true, why not this one?
 
You brought out the O'Brien-Fleming, impressive. (These discussions get a little boring, wonky, and even philosophical. Throwing astrology into the mix helps/always a classic.) You probably know this but Peto is the guy who is (partly) responsible for the Peto-Peto-Wilcoxon and named the logrank test. Where would we be without the logrank? Peto is also an EBCTCG guy. And the guy who asked the question "Why don't whales or elephants have higher rates of cancer than humans or mice?"
 

I guess you and I are the weirdos that don’t find this boring (def weird and wonky). I recognize that the statistical force is strong with you :bow:
 
My general philosophy is that a post-hoc analysis that is statistically significant would continue to be statistically significant if the sample size increased (directly related to power), as the standard deviation would (generally) shrink, and thus confidence intervals would (generally) shrink. With negative post-hoc analyses, the issue is that if the sample size was increased, would the non-significant difference become significant? Think of it like this: in trials positive for their main endpoint, we don't evaluate whether the study was powered for the conclusion made. In trials negative for their main endpoint, we frequently evaluate whether the study was adequately powered to detect a statistical difference given the frequent differences seen in event rates between the folks planning the study and the patients actually being treated by it.

I do agree with you that it is sometimes a univariate analysis. I'm all for interpreting positive post-hoc analyses with caution, but I'm not of the opinion that they are all worthless as a blanket statement, and I disagree that the solution to 9601 is re-running the trial with patients only with PSA < 0.7, for example.

PSA is just one factor in deciding use of ADT. Some people give ADT to every single salvage case they ever treat, which is fine and an individual's practice. Some of us would like to try to select patients a bit better.

If the Gleason 10 patient in the example has micrometastatic disease already, ADT given concurrently with salvage radiation is not going to cure them, and you have an explanation of the residual PSA with localized disease given multiple positive margins. I would be more wary of a T3b G10 that had negative margins and a positive post-op PSA.
I hear what you're saying, but as eloquently stated by the posts above mine, it's just simply unreliable. They said it better than I could've ever done.
 
Late to the game. I have little additional to offer except to say that subgroup analyses are not all alike. Rarely, subgroups will be prespecified and used as stratification criteria for randomization. In this case (especially if powered appropriately) these SGAs are more valid than noodling around for the sacred p-value. A recent example is the STAMPEDE trial which found evidence of interaction according to prespecified subgroups.

The initial report of RTOG 96-01 did similarly report an interaction (differential effect according to baseline value); beneficial effect of 2 years of bicalutamide only observed if PSA >0.7 ng/mL. The problem is that the stratification used for preRT PSA was above or below 1.5ng/mL. Thus the beneficial effect of bicalutamide in the 0.7-1.5 group was hypothesis generating and less valid than the >1.5 group.

The worst example of SGA from the NRG comes from RTOG 94-08. This trial observed a beneficial effect of STADT (in the setting of 66.6 Gy) for the entire population. The analysis then included an unplanned SGA based on NCCN risk groups, and despite the global interaction term being NS (p>0.05) they nevertheless concluded that the benefit was only observed in the IR group.

You can't have it both ways. Either the global interaction test is "significant" in which case you are justified to report by subgroup or it isn't.
 
I hear what you're saying, but as eloquently stated by the posts above mine, it's just simply unreliable. They said it better than I could've ever done.

Fair enough. I am cognizant of the concept of fragility, but perhaps I need to consider that a bit more during statistical review of trials than I currently do now.

To get back to the original point, though, I suppose I'm one of those people who didn't believe in ADT for those with low PSA before (although that was not a pre-specified subset analysis either) and this post-hoc analysis reinforces my practice. Maybe I'm just grasping at straws for any justification to identify patients that I do not put through the toxicity of ADT if I don't absolutely have to.
 
I share the same bias.

Two additional RCTs are relevant to the question: RTOG 0534 (plenary at ASTRO 2018) and GETUG (update presented at ASCO 2019).

Both of these studies were more contemporary than 9601 and the range of preRT PSA was much lower: GETUG median 0.3 (IQR 0.2-0.5); RTOG 0534 median 0.34, with 95% below 1.0.

I won't change my practice until a hard endpoint like DM, PCSM or OM is changed.

0534 is too premature
GETUG update reported improvement in distant metastases at 10 years. (verbatim below from abstract)
"Metastatic free survival (MFS) is significantly improved in the combined arm (HR = 0.73 [CI95% = 0.54-0.98] ; p = 0.034) with 69% [CI95% = 63-74] versus 75% [CI95% = 70-80] of MFS at 10 years for RT alone and RT+HT, respectively."

So there are hints that STADT will move the needle on a hard endpoint with lower PSAs but we need to wait for more events.
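
As a sanity check on how close to the line that GETUG result sits, here is a rough sketch that back-calculates an approximate p-value from the reported HR and 95% CI. It assumes the CI is a symmetric Wald interval on the log-HR scale, which may not be how the abstract's p-value was computed (a log-rank test would differ slightly), so expect it to land near, not exactly on, the reported 0.034. The function name is made up for illustration.

```python
from math import log
from statistics import NormalDist


def p_from_hr_ci(hr, ci_low, ci_high, level=0.95):
    """Approximate two-sided p-value from a hazard ratio and its confidence
    interval, assuming a normal approximation on the log scale."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for a 95% CI
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)   # standard error of log(HR)
    z = log(hr) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Numbers quoted from the GETUG abstract above.
print(round(p_from_hr_ci(0.73, 0.54, 0.98), 3))
```

With an upper CI bound of 0.98, the approximate p-value comes out a bit under 0.04, which is exactly the kind of result the fragility discussion earlier in the thread says to hold loosely until more events accrue.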
 