What Versus Where: The Dual Processing Theory of Vision

Divyan Bavan

Introduction

The brain contains many separate processing streams, which handle information from many sensory inputs, one of which is vision. The visual pathway starts at the retina, where light is converted to neural signals through phototransduction. These signals leave the eye through the optic nerve, eventually reaching the primary visual cortex (V1). After preliminary processing, two streams, the dorsal and the ventral, begin to emerge in the secondary visual cortex (V2). Once they have diverged, the streams use specialized regions for positioning and recognition, respectively. While specialization is evident, the streams also exchange information through crosstalk, which aids complex integration. Through understanding this pathway, it is evident that central processing of visual primitives and crosstalk between specialized streams are the principles upon which our brains decode visual information.

Central Processing within V1 and V2

Once light hits the retina, bipolar cells respond to signals from photoreceptors with graded potentials. This can lead to the firing of action potentials from retinal ganglion cells, whose axons make up the optic nerve. Signals then reach the lateral geniculate nucleus (LGN) in the thalamus, where they are processed further (KW’s lecture slides). This area has three types of neurons: magnocellular, parvocellular, and koniocellular. Since magnocellular neurons relay information about motion and parvocellular neurons relay information about fine detail, it was hypothesized that these neurons form the split between the dorsal and ventral streams, respectively. This is not the case: both types of neurons contribute to both streams (Goodale and Milner, 1992). This illustrates the interconnectedness of the two streams during preliminary processing.

The signals received by V1 are likewise not yet separated into streams for determining what an object is and where it is. General features (colour, orientation, motion, and edges) have not yet been extracted from the visual information. Therefore, the two streams of visual processing are intertwined at this stage (Kandel et al., 2021).

V1 has several types of cells for feature extraction. An example of this is orientation selectivity in simple and complex cells. These cells, like retinal ganglion cells (RGCs), have receptive fields. An RGC's field consists of a centre and a periphery with opposing responses; a cell with an “ON” centre will have an “OFF” periphery, and vice versa. RGCs use this arrangement to detect edges in objects. With this information, simple cells in V1 can detect orientation. This is because the receptive fields of simple cells are bar-shaped, the product of many circular RGC fields, arranged along a line, converging on a single simple cell. When a bar of light is aligned with the receptive field, the simple cell will have a higher firing rate. To further encode orientation, multiple simple cells synapse onto complex cells. This creates receptive fields which do not have separation between centre and periphery but are still orientation-tuned (KW’s lecture slides).
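The wiring described above can be illustrated with a toy model. The sketch below is not taken from the lecture slides or the cited sources; it assumes that a centre-surround field can be approximated by a difference of two Gaussians, that a simple cell sums RGC-like units whose centres lie along a line, and that a complex cell takes the maximum over several simple cells. The grid size, Gaussian widths, and all function names are illustrative choices.

```python
import math

SIZE = 21  # side length of the toy image grid (illustrative choice)


def dog_weight(dx, dy, sigma_c=1.0, sigma_s=2.0):
    """ON-centre / OFF-surround weight at offset (dx, dy).

    Assumption: the field is a narrow Gaussian minus a wide Gaussian.
    """
    r2 = dx * dx + dy * dy
    centre = math.exp(-r2 / (2 * sigma_c ** 2)) / (2 * math.pi * sigma_c ** 2)
    surround = math.exp(-r2 / (2 * sigma_s ** 2)) / (2 * math.pi * sigma_s ** 2)
    return centre - surround


def rgc_response(image, cx, cy, radius=5):
    """Rectified response of one ON-centre unit centred at (cx, cy)."""
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            if 0 <= x < SIZE and 0 <= y < SIZE:
                total += dog_weight(dx, dy) * image[y][x]
    return max(total, 0.0)  # firing rates cannot be negative


def simple_cell_response(image, centres):
    """Sum the inputs from RGC-like units whose centres lie along a line."""
    return sum(rgc_response(image, cx, cy) for cx, cy in centres)


def complex_cell_response(image, columns):
    """Pool (here: take the maximum over) vertical simple cells at several positions."""
    return max(
        simple_cell_response(image, [(c, y) for y in range(5, 16)])
        for c in columns
    )


def bar_image(vertical, position=10):
    """A one-pixel-wide bright bar on a dark background."""
    img = [[0.0] * SIZE for _ in range(SIZE)]
    for i in range(SIZE):
        if vertical:
            img[i][position] = 1.0
        else:
            img[position][i] = 1.0
    return img


if __name__ == "__main__":
    vertical_field = [(10, y) for y in range(5, 16)]  # centres stacked in a column
    for name, img in [("vertical bar", bar_image(True)),
                      ("horizontal bar", bar_image(False)),
                      ("shifted vertical bar", bar_image(True, position=9))]:
        print(name,
              "simple:", round(simple_cell_response(img, vertical_field), 3),
              "complex:", round(complex_cell_response(img, range(8, 13)), 3))
```

In this sketch the simple cell responds most strongly when the bar lies along its column of RGC centres and much less when the bar is rotated or shifted sideways, whereas the pooled complex cell keeps its preference for vertical bars while tolerating small shifts in position.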

Orientation tuning is an example of a feature which is essential to both the dorsal and ventral streams. It enables the ventral stream to decipher the form of an object, and it enables the dorsal stream to perceive the boundaries of an object when it is moving. However, some features in V1 are more important for one stream than the other. For example, blobs enable the preliminary processing of colour (KW’s lecture slides). This process is more critical for object discrimination and is thus more closely associated with the ventral stream. Conversely, some complex cells can detect direction-specific motion (KW’s lecture slides), a feature that is heavily implicated in the dorsal stream. Thus, it is evident that although all information is processed centrally within V1, the separation of visual information already starts to occur.

Once V1 has extracted the visual primitives, information is passed into V2, the secondary visual cortex. Since all information is passed into this cortex, it can be inferred that it processes information that is important to both streams, and this is indeed the case. One of V2’s primary functions is to interpret binocular disparity: the differences in visual information between the left and right eyes. This enables depth perception. The ventral stream uses this information to understand the 3D shape of an object, while the dorsal stream uses it to judge where the object lies in space. Similar processing occurs in V3, with the split in dorsal and ventral streams becoming more apparent.
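The link between disparity and depth can be made concrete with a simplified geometric sketch. It is not drawn from the cited sources: it assumes two pinhole "eyes" with parallel lines of sight, whereas real eyes converge on a fixation point, and the function name and numerical values are purely illustrative.

```python
def depth_from_disparity(focal_length_mm, baseline_m, disparity_mm):
    """Distance (in metres) to a point under parallel-axis pinhole geometry.

    focal_length_mm: distance from the pinhole to the image plane
    baseline_m:      separation between the two eyes
    disparity_mm:    shift in the point's image position between the eyes
    """
    if disparity_mm <= 0:
        raise ValueError("disparity must be positive for a point at finite distance")
    return focal_length_mm * baseline_m / disparity_mm


if __name__ == "__main__":
    # Illustrative values only: halving the disparity doubles the estimated distance.
    near = depth_from_disparity(17.0, 0.065, 2.0)
    far = depth_from_disparity(17.0, 0.065, 1.0)
    print(near, far)
```

Under these assumptions a larger disparity corresponds to a nearer object, which is why a measure of the difference between the two eyes' images carries information about depth.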

Divergence Between the Two Streams

V4 and V5 are where full specialization takes place. In V4, colour, orientation, and figure lead to specialized perception of form. In V5, cells are motion sensitive, often favouring a particular direction. Once the information has been processed, each cortex relays signals down its respective stream. V4 transmits information to the inferotemporal cortex, whereas V5 transmits information to the posterior parietal cortex (KW’s lecture slides). The separation of these streams was described by Mishkin, Ungerleider, and Macko through a variety of experiments in monkeys. When lesions were made in the inferotemporal cortex, monkeys would lose the ability to recognize objects but remain spatially aware. Conversely, when lesions were made in the posterior parietal cortex, the monkeys would lose spatial awareness but could still recognize objects. This double dissociation demonstrated specialization in the processing of visual information (Mishkin et al., 1983).

This theory would later be modified by Goodale and Milner in 1992. By studying a patient with a lesion in the ventral stream, referred to as DF, they were able to determine that the “what” and “where” streams are better described as perception and action streams, respectively. This was the product of a series of experiments. One of these experiments involved DF posting a letter through a mailbox. When tasked with aligning a letter to the mailbox, DF showed poor matching skill. However, when she had to post the letter, she was unimpaired. This showed that the ventral stream is responsible for perceiving the fine orientation of the letter, whereas the dorsal stream is responsible for action (KW’s lecture slides; Goodale and Milner, 1992).

Connections Between the Dorsal and Ventral Streams

While both streams appear to be heavily separated post-V3, they still receive input from each other. This can be shown behaviourally and physiologically.

When Goodale and Milner did their mailbox studies with Patient DF, they also tried to use a T-shaped object with a complementary posting slot. Unlike with the regular letter, DF struggled to post the T-shaped object. This object has two principal axes, making it more complex than the letter. This demonstrated that some higher-level processing was impaired in DF. Milner hypothesized that the ventral stream provides information about higher-level shapes to the dorsal stream (Milner, 2017). Since the ventral stream is impaired in DF, this explains why she has trouble with the task. A similar conclusion was reached by studying DF’s ability to grasp discs with holes. She was able to grasp discs with two holes quite well, but struggled with three holes. Again, this suggests that the ventral stream is necessary for more complex actions (Milner, 2017). Information also flows in the other direction: Milner suggests that the dorsal stream provides simpler inputs to the ventral stream, and there are experiments showing that the dorsal stream can aid in object recognition (Milner, 2017).

Beyond behavioural experiments, it is evident that there is anatomical crosstalk between regions specific to the dorsal or ventral stream. For example, V4 has synaptic connections with V5. This aids in the sharing of information between both cortices. While the two streams are specialized for different tasks, some information is required by both streams, so it is necessary for information to be shared between them. This helps to create a complete picture of the visual field (KW’s lecture slides).

Conclusion

It is evident that the dorsal and ventral streams exist to increase the efficiency of the visual pathway. While all visual information requires preliminary processing in V1 and V2, once enough information has been extracted, the processing can split into different streams. The existence of these streams can be shown through both anatomical and behavioural evidence, exemplified by the studies of Mishkin, Ungerleider, Macko, Goodale, and Milner. However, this does not mean that these streams act in isolation from each other. The evidence shows that there is crosstalk between them, the purpose of which is to create a clearer image for more complex tasks. This pattern, in which interconnectedness increases with the complexity of the task, goes beyond vision and extends to many aspects of the brain. It is a balance of specialization and crosstalk that enables the brain to perceive the world and perform complex actions every day.

Works Cited

Goodale, Melvyn A., and A. David Milner. “Separate Visual Pathways for Perception and Action.” Trends in Neurosciences, vol. 15, no. 1, Jan. 1992, pp. 20–25, pubmed.ncbi.nlm.nih.gov/1374953/, https://doi.org/10.1016/0166-2236(92)90344-8.

Kandel, Eric R., et al. Principles of Neural Science. 6th ed., McGraw-Hill Education, 2021.

Milner, A. D. “How Do the Two Visual Streams Interact with Each Other?” Experimental Brain Research, vol. 235, no. 5, 2 Mar. 2017, pp. 1297–1308, https://doi.org/10.1007/s00221-017-4917-4.

Mishkin, Mortimer, et al. “Object Vision and Spatial Vision: Two Cortical Pathways.” Trends in Neurosciences, vol. 6, no. 1, Jan. 1983, pp. 414–417, https://doi.org/10.1016/0166-2236(83)90190-x.

Associate Professor Kerry Walker’s Lecture Slides