It's kind of hard to explain without spending a lot of time talking about statistics, but in essence:
Obviously you have to choose between answers B and D. D is incorrect because of the inherent error that comes with multiple comparisons: the more comparisons you run, the greater the chance that you'll find a difference attributable solely to chance rather than to an actual effect (a type I error). The question stem implies that the researchers designed a study to assess the impact of X, failed to get the results they wanted, and then ran a post-hoc subgroup analysis that turned up a difference in a certain population. What they did isn't entirely wrong per se, but they can really only justify using those results to design a future study, not to change current therapeutic guidelines (especially in Step 1 world, where everything needs an RCT to be valid).
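To put a number on that intuition: if each test has a 5% false-positive rate, the chance of at least one false positive grows fast as you add comparisons. A minimal sketch (the alpha level and comparison counts here are just illustrative assumptions, not from the question):

```python
# Family-wise error rate (FWER): probability of at least one false
# positive across m independent tests, each run at significance alpha.
def fwer(m, alpha=0.05):
    # P(no false positive in one test) = 1 - alpha, so across m
    # independent tests: 1 - (1 - alpha)^m
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    print(m, round(fwer(m), 3))
```

With 20 subgroup comparisons at alpha = 0.05, you'd expect to "find" something by chance alone well over half the time, even if the drug does nothing.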
Granted, I think this question is sort of a pain because this happens all the freaking time in published research, but it's a very important point that people sometimes don't realize -- especially with something like the big HTN / lipid drug trials, where they run dozens of comparisons and may or may not actually correct for them.
Anyway, Wiki if you want to read more:
http://en.wikipedia.org/wiki/Multiple_comparisons . I'm also definitely not a stats person, so someone might be able to do a better job explaining it, but that's at least a rudimentary explanation.