EK emphasizes the "Schachter two-factor" as the main difference between the two. Physiological arousal is the first step, as in James-Lange. But then there are also situational cues (the second factor) that help us label the emotion, instead of classifying it strictly from the physio response. EK also says that S-S "recognizes higher level thinking."
So this would be my best attempt at summarizing the three theories of emotion:
J-L: stimulus ----> physio response ----> recognition and interpretation of physio response ----> emotion
C-B: stimulus ----> physio response and emotion occur together
S-S: stimulus ----> physio response ----> cognitive appraisal of situational cues ----> emotion
To your comment, I think you are on track. EK says the C-B theory critiqued J-L for treating each physio response as if it pointed to a single emotion, when the same response can be interpreted in various ways. A rapid heartbeat could be fear, excitement, or anger.
In your example, answer A depicts stimulus ----> physio response ----> cognitive appraisal of the reason for the physio response ----> emotion.
Answer B depicts stimulus ----> physio response ----> emotion.
The language to key on is "his body cues and behavior lead him to understand he is in a scary situation and he feels afraid." This is happening without cognitive appraisal of situational cues; the physiological response is dictating the emotion felt. That would be a J-L response, and in fact it plays into the critique of J-L because really, the dog running out of the house is not a life-threatening situation. It isn't even particularly "scary," so the physio response is almost tricking the dog owner.
These are kind of tough, but I hope this helps.