Well, I promised you all I would make this, and it's finally done!
GTLO's Guide to Visual Image Processing
Disclaimer: The AAMC Biological Sciences Content Outline is not any more specific than "visual image processing" when denoting this aspect of the visual system as testable content. Therefore, I have had to make my own determination of what I think the AAMC would expect a prepared test taker to know. I have made this determination based on the AAMC's assertion that the requisite science knowledge is not beyond the scope of introductory-course-level material. I have presented here more information than I believe we need to know, and emphasized what I feel the AAMC would consider requisite knowledge vs. background knowledge. If anyone would like further explanation of particular topics, let me know and I will be happy to oblige. In compiling this information I have consulted my notes and lecture materials from two courses that covered the visual system (and for the record I got high A's in both courses): one course on sensation and perception, and another on neuroscience. Both instructors are PhD faculty, and both conduct research either directly on or related to the visual system, so I assume their information to be accurate. Further, I have consulted external sources on any issue I was not comfortable simplifying while maintaining accuracy. I do not anticipate any problems with the validity of this information, especially given the level of mastery needed for the MCAT.
Preface: Herein I have presented significantly more information than I believe would be considered testable knowledge, some for context and some because I find it interesting and hope intrigue may solidify memory in others as it does for me. I have italicized information I explicitly present as non-requisite background knowledge, and included a TL;DR at the end of the post stating exactly what I think you should know.
Visual Image Processing!
This is a wide area to consider, but several key aspects of sight lend themselves to effective understanding at an MCAT level, while others exclude themselves by virtue of excessive complexity and scope. I will consider mainly
1) the visual encoding of color and
2) the representation of the visual image in the brain at a cortical level.
Vision Fundamentals
To begin, let's address basic knowledge you should have about the eye, both for the MCAT and to provide a foundation for understanding this post. You should know that the retina is the sensory region of the eye, and that it contains a number of different types of specialized neurons, most importantly to us the photoreceptors and ganglion cells.* Photoreceptors are sensory neurons containing photopigment molecules, which react to light to effect transduction of a neural signal. Photoreceptors are classified as either rods or cones: rods do not differentiate based on the wavelength of incident light, while cones do. You should know that the distribution of rods and cones across the retina is not even; the fovea (the center point of the retina, where the image of something you're looking directly at is focused) is composed entirely of cones in high density, while the remainder of the retina contains more rods than cones (and rods greatly outnumber cones overall). Ganglion cells are interneurons that ultimately receive the signals initiated by the photoreceptors and conduct them down their long axons, which group together to form the optic nerve and exit the orbit to project into the brain.
As a note about the fovea in particular, the high density of cones, coupled with the particular arrangement of neural circuits in the foveal retina, gives us our highest visual acuity there, as opposed to the peripheral retina. This is partly why it's hard to read using your peripheral vision.
*There are actually five types of neurons in the retina, but I seriously can't imagine differentiating between horizontal and amacrine cells would be expected, even though EK Bio does minimally present them labeled in a figure. Such a discrete would be borderline obscene. If you really want to know, either ask me or Google it.
Color
Rod photoreceptors can be left out of the discussion of color, because they respond equally to all wavelengths of light. Rods can all be considered identical. Cones, on the other hand, come in three types. All three types are responsive to all wavelengths of light, but each type has a characteristic response curve, with maximal response centered around a particular wavelength. These wavelengths of maximum response correspond to red, green, and blue light.
The cones are thus named S, M, and L cones based on the wavelength they respond maximally to (small wavelength = blue, etc.), and they actually contain different photopigment molecules.
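To make the idea concrete, here's a toy sketch of my own (not a standard formula: real cone sensitivity curves aren't Gaussian, and the peak wavelengths are only approximate textbook values) showing that every cone type responds to every wavelength, just to different degrees:

```python
import math

# Toy model (illustrative only): cone spectral sensitivities as Gaussian
# curves. Peak wavelengths (~420 nm S, ~530 nm M, ~560 nm L) are approximate
# textbook values; the Gaussian shape and width are simplifying assumptions.
CONE_PEAKS_NM = {"S": 420.0, "M": 530.0, "L": 560.0}
WIDTH_NM = 50.0  # arbitrary spread, chosen for illustration

def cone_response(cone: str, wavelength_nm: float) -> float:
    """Relative response (0..1) of one cone type to monochromatic light."""
    peak = CONE_PEAKS_NM[cone]
    return math.exp(-((wavelength_nm - peak) ** 2) / (2 * WIDTH_NM ** 2))

# Every cone type responds to 500 nm (blue-green) light, just not equally:
responses = {c: cone_response(c, 500.0) for c in CONE_PEAKS_NM}
```

For that 500 nm light, the M cone responds most strongly, but the S and L cones still respond; the color code is the pattern of activity across all three types, not any one cone's signal.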
Perception of color, and thus the formation of accurate cortical representations of color, is best described by combining the contributions of multiple complementary theories.
It's highly unlikely you will need to know the names and features of these theories, but the concepts are simple, so I've presented them here for context. The important thing to recognize is that color encoding is complex.
- Trichromatic Color Theory describes the encoding of each individual, distinct color in the visual image as the net sum of the differential contributions of the S, M, and L cones.
- Opponent-Process Theory: Though the cones respond differently to the same wavelength of light, the ganglion cells also respond differently to the signals the cones produce. Different ganglion cells respond to different assortments of color information from the cones: a given ganglion cell responds preferentially to a particular pair of colors, and responds oppositely to the two colors in that pair. The result is antagonism, at the level of the ganglion cell signal, between red-green and blue-yellow signal pairs. (See for yourself! Color afterimages correspond to the opponent pairs of colors; stare at a blue image for a while and then look at a white background, and you'll see a yellow afterimage.)
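To make the opponent idea concrete, here's a toy calculation (the channel formulas are a common textbook simplification I'm assuming for illustration, not an actual physiological model): opponent signals are built by subtracting cone responses from one another.

```python
# Toy sketch of opponent-process encoding (a textbook simplification, not a
# physiological model): ganglion-cell-like channels take differences of the
# S/M/L cone signals, so each channel is antagonistic between two colors.
def opponent_channels(s: float, m: float, l: float) -> dict:
    """Map S/M/L cone responses to red-green and blue-yellow opponent signals."""
    return {
        "red_green": l - m,              # positive => reddish, negative => greenish
        "blue_yellow": s - (l + m) / 2,  # positive => bluish,  negative => yellowish
    }

# Strong L, weak M drive the red-green channel positive ("red"), while the
# low S response leaves the blue-yellow channel negative ("yellow"):
ch = opponent_channels(s=0.1, m=0.2, l=0.9)
```

A stimulus that drives the L cones much more than the M cones pushes the red-green channel one way; the complementary stimulus pushes it the other way, which is exactly the antagonism behind color afterimages.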
It turns out that simply encoding information about light wavelength isn't enough to produce the visual image we consciously see. Even in color perception there is a component of so-called "top-down" processing, where information from higher levels of processing influences the encoding of information at lower levels.
- Retinex Theory: Information from regions of the cortex involved in processing visual stimuli is actually involved in the representation of something as presumably simple as color. (Color constancy is a good example of this top-down information flow: we don't perceive colors to change even when the distribution of wavelengths reflected from them does, as a result of changing ambient light sources, e.g. walking from sunlight outside into a room lit by yellow incandescent light.)
Neural Pathways
A fundamental characteristic of vision is the division of sensory information into hemifields, split between the hemispheres of the brain. Put simply, each hemisphere of the brain receives visual information from each eye, but only from half of each retina. The fibers from each retinal hemifield cross (the proper term is decussate) at the optic chiasm, an X-shaped intersection of the optic nerves. Each half of the brain processes one half of the visual field.
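If it helps to see the routing spelled out, here's a toy lookup table. (Note: the nasal-vs-temporal detail below is standard anatomy background I'm adding for context; the post itself only requires knowing that each hemisphere gets half of each retina.)

```python
# Toy routing table for the optic chiasm. Standard anatomical rule (extra,
# non-requisite detail): fibers from the nasal half of each retina decussate
# to the opposite hemisphere, while temporal fibers stay on the same side.
def hemisphere_for(eye: str, retinal_half: str) -> str:
    """Which cortical hemisphere processes fibers from this half-retina."""
    if retinal_half == "nasal":
        # Nasal fibers cross at the chiasm to the contralateral hemisphere.
        return "right" if eye == "left" else "left"
    # Temporal fibers project ipsilaterally (no crossing).
    return "right" if eye == "right" else "left"

# The left eye's nasal fibers cross, ending up in the right hemisphere:
dest = hemisphere_for("left", "nasal")
```

Trace both eyes through this table and the net result is the rule the post states: each hemisphere ends up processing one half of the visual field, built from half of each retina.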
From the eye, there are actually four distinct paths the optic nerve fibers can take, leading to different parts of the brain and serving different functions. (For example, the retinohypothalamic pathway conducts visual sensory information into the suprachiasmatic nucleus of the hypothalamus, and influences the modulation of circadian rhythms; this is how staring at a computer screen for hours in the night affects the ease with which you can go to sleep afterward.) We are only concerned with the major pathway, which most fibers (and thus signals) follow, the one which actually leads to formation of visual perceptions.
Formally called the retinogeniculostriate pathway, this major route carries most optic nerve fibers into the thalamus (specifically, the lateral geniculate nucleus (LGN)); from there the signal continues into the primary visual cortex, also called V1, in the cerebral cortex of the occipital lobe of the brain. Substantial processing of visual features occurs in V1: neurons combine inputs from each eye, and are specialized to respond to particular orientations, movement directions, and colors of visual stimuli.
Primary Visual Cortex (V1)
Characteristics of V1 are exceedingly complex, but there are two very simple, very important concepts that are worth remembering. The first is called retinotopy. The primary visual cortex (and in fact also the LGN in the thalamus, which receives the retinogeniculostriate signals before V1) organizes the processing of visual information in the manner of a retinotopic map. This is to say that the neurons in V1 process information from corresponding parts of the retina, such that the retina could be stretched out over V1 to approximate the areas where the overlaid retinal cells initiate the signals. In particular, the most posterior end of V1 processes foveal information, and the processed signals originate progressively further outward from the fovea into the peripheral retina as you move anteriorly along V1. That may be hard to visualize, so it can be simplified to this: the signals produced by adjacent retinal cells are processed by adjacent V1 neurons.
The second important characteristic is called the cortical magnification factor. Put simply, the brain devotes a much larger portion of the neurons in V1 to processing information from the fovea than to the rest of the retina. This is nice, because things we are looking directly toward get more brain processing power devoted to them, and we can more finely sense specific visual features of things in our direct field of view.
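A toy illustration of the magnification idea (the numbers and the weighting function are invented for illustration, not physiological measurements): give each band of retinal eccentricity a share of a fixed V1 "neuron budget," weighted heavily toward the fovea.

```python
# Toy illustration of cortical magnification. The 1/(1+e) weighting and the
# neuron counts are invented for demonstration, not physiological data; the
# point is only that the foveal band dominates the allocation.
def allocate_v1_neurons(total: int, eccentricities_deg: list) -> list:
    """Split `total` neurons across eccentricity bands, weighting ~1/(1+e)."""
    weights = [1.0 / (1.0 + e) for e in eccentricities_deg]
    norm = sum(weights)
    return [round(total * w / norm) for w in weights]

bands = [0, 5, 20, 60]  # degrees of eccentricity from the fovea (0 = foveal)
alloc = allocate_v1_neurons(10_000, bands)
```

Under this (made-up) weighting, the foveal band alone gets more neurons than all the peripheral bands combined, which is the qualitative picture the cortical magnification factor describes.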
Higher Processing
From V1, visual information flows along two different pathways that mediate more advanced processing of visual features.
In the ventral pathway (or "what" pathway), visual information flows from V1 into regions of the temporal lobe cortex, where visual objects are identified. Highly specialized areas, and even individual neurons, in the inferotemporal cortex respond to particular stimuli, such as faces. In the dorsal pathway (or "where" pathway), visual information flows from V1 into regions of the parietal and, to some extent, temporal lobes, where object location, motion tracking, and visually guided movement are processed.
It is the advanced processing of visual features at these higher cortical levels that enables us to make sense of the visual stimuli we are constantly bombarded with, the countless photons continually creating a stream of chaotic neural signals from our retinas.
Conclusion
I've presented a lot of information here, much more than I would ever expect even a difficult discrete to require you to know, so following is a TL;DR.
TL;DR
Vision Fundamentals: Just know everything in this section.
Color:
- Cones all respond to all colors, but not equally; three types of cones = roughly "RGB" encoding
- That's not enough though, color encoding is very complex, and involves discrimination and integration of different signals at the level of both the photoreceptors themselves, and the ganglion cells. Our perception of color even depends on higher-level input beyond what the eye sees.
Neural Pathways:
- Halves of the brain process different halves of the visual field. This is facilitated by optic nerve fibers crossing over at the optic chiasm.
- The great majority of retinal signals make their way to the primary visual cortex (V1) in the occipital lobe cerebral cortex, and this is where the first real processing occurs.
Primary Visual Cortex (V1):
- V1 is organized into a retinotopic map; signals produced by adjacent retinal neurons are processed by adjacent V1 neurons. You can actually map the retina onto V1 and see where particular signals end up.
- V1 exhibits a cortical magnification factor: a great deal more V1 area and processing power is devoted to information from the fovea than from the peripheral retina. This helps focus the visual processing on what's most important (i.e. what we're directly looking at).
Higher Processing:
- Advanced image processing occurs beyond V1, in the parietal and temporal lobes.
- The two main pathways can be thought of as separately processing what we're seeing (identifying visual objects) and what we're seeing happening (object motion and related features).
As I said, feel free to ask whatever questions you have for clarification. Commentary is also welcome!