Tuesday 5 September 2017

What are depth and motion perception?


Introduction

A person looking at the view from a high vantage point can easily perceive that some objects in the landscape below are closer than others. This visual discrimination seems to require no skill; the information is contained in the light that enters the eyes after being reflected off the objects in question. This light is focused onto the back of the eyes by the cornea and the lens of the eye. There it forms an image on the two-dimensional surface of the retina and causes a variety of retinal cells to fire, yet the observer perceives the scene in three dimensions. Furthermore, in spite of the presence of two separate physical retinal images, only one visual object is perceived. These phenomena are so natural that they occasion no surprise; instead, people are surprised only if they have trouble determining depth or if they see double. Yet it is this natural phenomenon, single vision from two distinct retinal images, that is truly remarkable and requires an explanation.










Perception of a Single Fused Image

The process that results in one percept arising from two retinal images is known as fusion. The physiological hypothesis of fusion is founded on the structures and functions of the visual pathways and is supported by increasing amounts of neuroanatomical and neurophysiological evidence. It represents the mainstream of thought in physiological optics today. The phenomenon of fusion, then, is based on a relationship between points on the two retinas. For every point on the left retina, there is a corresponding point on the right retina. Corresponding points are the pairs of anatomical locations, one from each retina, that would overlap if one retina could be slid on top of the other. So, for example, if both eyes fixate the same object, A, so that its image falls on the fovea of each eye, the images in each eye cast by A are said to lie on corresponding points on the two retinas. Similarly, any other object producing images in the two eyes that lie at the same distance and in the same direction from the fovea in each eye is also said to be stimulating corresponding points. All these objects will be at about the same distance as A. The physiological hypothesis states that corresponding points are anatomically connected and that these connections form a final common pathway, or fusion center. For sensory fusion to occur, the images not only must be located on corresponding retinal areas but also must be sufficiently similar in size, brightness, and sharpness to permit sensory unification.




Specific Qualities of Depth Perception

Perception of depth is created by various depth cues that signify the three-dimensionality of a scene from the two-dimensional information on the retina. Many of these cues are monocular depth cues, meaning that they give information about depth in a scene even if the observer is using only one eye. The cues include overlap, shadowing, relative brightness, and aerial perspective. Visual experience in these cases depends on the transfer of light from an object in the real world to the eye of the observer. These cues depend on characteristic ways that light travels to the eye or interacts with the medium through which it passes. In other cases, size and object relations provide information about depth. These cues include relative size, familiar size, perspective, and texture gradients. Monocular depth cues are used by artists to create the impression of depth in their artwork. However, it is rare that a painting is mistaken for an actual scene, because there are some cues that cannot be used in pictures.


Some of these cues, such as motion parallax and optical expansion, involve changes in the pattern of retinal stimulation as a result of motion. Objects closer to a moving observer than the fixation point appear to move much faster than those farther away. In addition, certain cues for distance and depth arise from the physical structure of the visual system. These physiological cues to depth include convergence (the turning in of the eyes as objects approach the face) and binocular disparity.


The problem of how the third dimension is perceived has been a point of debate for many noted philosophers, especially the empiricists. It has been suggested that different angles of inclination of the eyes (convergence) and different degrees of blurring of the image and strain in the muscles of the lens (accommodation) are the primary cues. There is, however, another potentially important perceptual source of information about relative distance for animals with binocularly overlapping fields of vision: binocular or retinal disparity, which is based on the fact that the two eyes see from slightly different positions, creating two different views of the world. If the two eyes are focused on an object, A, the images in each eye cast by A are said to lie on corresponding points on the two retinas as described earlier. The images cast by a nearer or more distant object, B, will fall on noncorresponding or disparate points on the two retinas, and the amount or degree of disparity will depend on the differences in depth between A and B. Thus, if the brain can compute the degree of disparity, this will give precise information about the relative distances of objects in the world.
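The way disparity grows with the depth difference between A and B can be illustrated with simple geometry. The following is a minimal sketch, not a model of how the brain computes disparity. It assumes that both objects lie straight ahead on the midline and that the eyes are about 63 mm apart; the function names `vergence_angle` and `disparity` are hypothetical, and the sign convention (positive for nearer objects) is chosen here for illustration.

```python
import math

EYE_SEPARATION_M = 0.063  # assumed typical distance between the eyes (about 63 mm)


def vergence_angle(distance_m, eye_separation_m=EYE_SEPARATION_M):
    """Angle in radians between the two lines of sight to a point straight ahead."""
    return 2.0 * math.atan(eye_separation_m / (2.0 * distance_m))


def disparity(fixation_m, object_m, eye_separation_m=EYE_SEPARATION_M):
    """Disparity in radians of object B while the eyes fixate object A.

    Zero when B is at the fixation distance, positive when B is nearer,
    negative when B is farther away (a convention assumed for this sketch).
    """
    return (vergence_angle(object_m, eye_separation_m)
            - vergence_angle(fixation_m, eye_separation_m))


if __name__ == "__main__":
    # Fixate A at 1 m and place B at several distances.
    for b_distance in (0.5, 0.9, 1.0, 2.0):
        arcmin = math.degrees(disparity(1.0, b_distance)) * 60.0
        print(f"B at {b_distance} m: {arcmin:+.1f} arcmin")
```

Under these assumptions, an object at the fixation distance produces no disparity, and the magnitude of the disparity increases as B moves away from A in either direction.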



It is possible to manipulate binocular disparity under special viewing conditions to create a strong depth impression from two-dimensional pictures. If pairs of pictures are presented dichoptically to each eye, and each picture depicts what that eye would see if an actual object in depth were presented, a strong depth effect is achieved. The stereoscope, invented by Sir Charles Wheatstone in 1833, operates on this principle. Two photographs or drawings are made from two slightly different positions, the distance being the separation of the two eyes, approximately sixty-three millimeters. The stereoscope presents the left picture to the left eye and the right picture to the right eye so that they combine to produce a convincing three-dimensional image of the scene. Wheatstone was the first to realize that horizontally displaced pictures presented in this fashion produced stereopsis, or binocular depth perception.


Most stereoscopic pictures contain other depth information in addition to disparity. Monocular cues such as linear perspective, overlap, or relative size may contribute to the depth effect. Vision psychologist Béla Julesz, however, created the illusion of depth using only a stereoscope and random dot patterns that contained no depth information other than disparity. This effect can be achieved by first generating two identical random dot patterns with a computer and then shifting a subset of the dots horizontally on one of the patterns to create the disparity. When viewed monocularly, each pattern gives a homogeneously random impression without any global shape or contour; when the two views are combined in a stereoscope, the shifted dot pattern appears as a small square floating above the background. Thus, even with no other depth information present, retinal disparity can cause the perception of depth.
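The construction described above can be sketched in a few lines of code. This is an illustrative sketch rather than Julesz's original procedure: the function name `make_random_dot_stereogram`, the image size, the size of the central square, and the two-dot shift are all assumptions chosen for the example. The square is shifted leftward in the right-eye image, which is the direction assumed here to make it appear nearer than the background.

```python
import random


def make_random_dot_stereogram(size=40, square=12, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]  # start from an identical copy

    top = (size - square) // 2    # first row of the central square
    start = (size - square) // 2  # first column of the central square
    for r in range(top, top + square):
        # Copy the square's dots into the right image, displaced horizontally.
        for c in range(start, start + square):
            right[r][c - shift] = left[r][c]
        # The strip uncovered by the shift gets fresh random dots,
        # so neither image alone reveals where the square is.
        for c in range(start + square - shift, start + square):
            right[r][c] = rng.randint(0, 1)
    return left, right


if __name__ == "__main__":
    left, right = make_random_dot_stereogram()
    for l_row, r_row in zip(left, right):
        # Print the two half-images side by side.
        print("".join(".#"[d] for d in l_row), " ", "".join(".#"[d] for d in r_row))
```

The only difference between the two images is the horizontal displacement of the central region, so any depth seen when they are fused must come from disparity alone.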


How does the brain use the disparity information provided by images that fall on noncorresponding positions on the retinas of the two eyes? There are many binocular neurons in a cat's visual cortex that are sensitive to small differences in retinal disparity. Monkeys also have these neurons, also called disparity-selective cells, of which there are several types. Tuned excitatory cells respond strongly to a single, very small disparity (usually near zero) and weakly to any other disparity. The tuning width of these cells along the dimension of disparity is very narrow. Tuned inhibitory cells respond well at all depths except on or near the fixation plane. Finally, near cells and far cells respond strongly to stimuli in front of or beyond the fixation plane, respectively, but little if at all to the opposite stimuli. There is a “functional architecture” for binocular disparity in the feline visual cortex, with horopter-coding cells located near the boundaries of ocular dominance columns and near and far cells predominating at the interior of the columns. These neurophysiological findings in cats and monkeys can be extrapolated to human depth perception as well.


A further question of interest is whether the neural streams that process stereopsis are associated with the parvocellular or the magnocellular layers of the lateral geniculate nucleus. Evidence in support of a parvocellular component to stereopsis has been reported. Monkeys with lesions in the parvocellular layers of the lateral geniculate nucleus showed disruptions in fixations of random-dot stereogram stimuli, particularly for fine dot arrays, while magnocellular lesions seemed to have little or no effect. This suggests that the parvocellular stream is needed to process stereopsis when the stimuli are presented in fine dot arrays, as in random-dot stereograms.




Motion Perception

One essential quality that distinguishes animals from plants is the capacity for voluntary movement. Animals move to find mates, shelter, and food and to avoid being eaten themselves. However, the ability to move brings with it the requirement to sense movement, whether to guide one’s progress through the world or to detect the movement of other mobile animals, such as approaching predators. For sighted animals, this means sensing movement in the retinal image.


The need to sense retinal motion as quickly as possible places great demands on the visual system. Movement is characterized by subtle but highly structured changes in retinal illumination over space and over time. To sense movement very early in processing, the visual system relies on specialized neural processes that make use of information about localized changes of image intensity over time, effectively isolating the parts of the image that contain movement. However, to code the direction of movement, this temporal change information must be combined with information about spatial change, that is, intensity edges. Increases of intensity over time come from image regions with spatial edges that are, for example, bright on the left and dark on the right. Decreases over time are associated with edges of opposite contrast polarity. These space-time pairings signify motion from left to right. A reversal of polarity in either the temporal signal or the spatial signal would signify motion in the opposite direction.
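The pairing of temporal and spatial change can be made concrete with a one-dimensional example. The sketch below is a simplified illustration under stated assumptions, not a description of the actual neural mechanism: each frame is a list of intensities sampled along a horizontal line, and the function name `motion_direction` is hypothetical. It multiplies the change over time at each position by the local spatial slope and reads the direction from the sign of the summed product.

```python
def motion_direction(frame_before, frame_after):
    """Classify 1-D image motion as 'right', 'left', or 'none'.

    Each frame is a list of intensities sampled from left to right.
    """
    evidence = 0.0
    for x in range(1, len(frame_before) - 1):
        # Change of intensity over time at this position.
        temporal = frame_after[x] - frame_before[x]
        # Change of intensity over space (negative for an edge that is
        # bright on the left and dark on the right).
        spatial = (frame_before[x + 1] - frame_before[x - 1]) / 2.0
        # Brightening at a bright-left edge signals rightward motion,
        # so the product is negated to make rightward evidence positive.
        evidence += -temporal * spatial
    if evidence > 0:
        return "right"
    if evidence < 0:
        return "left"
    return "none"


if __name__ == "__main__":
    edge = [1, 1, 1, 0, 0, 0, 0]           # bright on the left, dark on the right
    moved_right = [1, 1, 1, 1, 0, 0, 0]    # the same edge one sample to the right
    print(motion_direction(edge, moved_right))
```

Reversing the polarity of either the temporal change or the spatial edge flips the sign of the product, which matches the reversal of direction described above; reversing both leaves the direction unchanged.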


The middle temporal area (MT) in the visual cortex is thought to be very important for motion perception because 90 percent of neurons in this area are directionally sensitive and damage to this area impairs one's ability to detect the direction of movement. How can the brain tell the difference between object motion and eye movement? Corollary discharge theory proposes that information about eye movement is provided by signals generated when the observer moves, or tries to move, his or her eyes. When the eyes move, a motor signal travels from the brain to the eye muscles, and the image sweeps across the retina and creates a sensory movement signal. If the sensory movement signal reaches the cortex unopposed, motion of the object is perceived. If only the eyes move, however, the corollary discharge signal, a copy of the motor signal, is transmitted to a hypothetical structure (the comparator) that receives both the corollary discharge signal and the sensory movement signal. The corollary discharge cancels the sensory movement signal, so that object motion that does not really exist is not perceived. There is a growing body of psychophysical and physiological evidence supporting corollary discharge theory, but researchers still do not completely understand the processes involved.
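The logic of the comparator can be illustrated with a toy calculation. This is only a sketch of the idea, not a physiological model: velocities are single numbers in degrees per second with rightward taken as positive, and the function names `retinal_motion` and `perceived_motion` are hypothetical.

```python
def retinal_motion(object_velocity, eye_velocity):
    """Sensory movement signal: velocity of the object's image on the retina.

    Turning the eye to the right sweeps the image of a stationary
    object to the left, hence the subtraction.
    """
    return object_velocity - eye_velocity


def perceived_motion(sensory_signal, eye_velocity):
    """Toy comparator combining the sensory signal with the corollary discharge."""
    corollary_discharge = eye_velocity  # copy of the motor command to the eyes
    return sensory_signal + corollary_discharge


if __name__ == "__main__":
    cases = {
        "object moves, eyes still": (5.0, 0.0),
        "object still, eyes move": (0.0, 5.0),
        "eyes track moving object": (5.0, 5.0),
    }
    for label, (obj, eye) in cases.items():
        seen = perceived_motion(retinal_motion(obj, eye), eye)
        print(f"{label}: perceived motion {seen:+.1f} deg/s")
```

In this sketch, an eye movement across a stationary scene yields no perceived motion because the two signals cancel, while an object that moves, whether or not the eyes follow it, is still perceived as moving.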


Perception of movement can be determined by how things move relative to one another in the environment as well. James J. Gibson coined the term “optic array” to refer to the structure created by the surfaces, textures, and contours of the environment. He believed that what is important about the optic array is the way it changes when either an observer or something in the environment moves. A local disturbance in the optic array indicates that the object causing the disturbance is moving; a global disturbance indicates that the observer is moving through the environment, which is stationary.


Perception of movement can even assist in the perception of three-dimensional forms. Several studies have demonstrated just how much information can be derived from biological movement. In these studies, actors were dressed in black and small lights were attached to several points on their bodies, including their wrists, elbows, shoulders, hips, and feet. Films were then made of the actors in a darkened room while they were performing various actions, such as walking, jumping, dancing, and lifting both a light and a heavy box. Even though observers who watched the films could see only a pattern of moving lights against a dark background, they could readily perceive the pattern as belonging to a human, could identify the behaviors in which the actors were engaged, and could even tell the actors’ genders.


There are also instances in which the perception of movement exists even though no movement is actually occurring. A person who sits in a darkened room and watches two small lights that are alternately turned on and off perceives a single light moving back and forth between two different locations rather than two lights turning on and off at different times. This response, known as the phi phenomenon, is an example of apparent motion. Theater marquees and moving neon signs make use of this phenomenon. Instead of seeing images jumping from place to place, people perceive smooth movement in a particular direction. This ability to perceive movement across “empty space” was the basis for the creation of the first motion pictures in the late 1800s, and it may also explain some unidentified flying object (UFO) sightings related to flashing lights on radio towers. A related phenomenon, called induced motion, occurs when a person sitting in a train or bus feels that the vehicle has begun to move when actually the vehicle next to it has moved. The movement of one object induces the perception of movement in another object.




Bibliography


Blake, Randolph, and Robert Sekuler. Perception. 5th ed. New York: McGraw, 2006. Print.



Bruce, Vicki, Patrick Green, and Mark A. Georgeson. Visual Perception: Physiology, Psychology and Ecology. 4th ed. New York: Psychology, 2003. Print.



Dörschner, Katja. "Image Motion and the Appearance of Objects." Handbook of Experimental Phenomenology: Visual Perception of Shape, Space and Appearance. Ed. Liliana Albertazzi. Malden: Wiley, 2013. 223–42. Print.



Gibson, James J. The Ecological Approach to Visual Perception. Hillsdale: Erlbaum, 1986. Print.



Goldstein, E. Bruce. Sensation and Perception. 9th ed. Belmont: Wadsworth, 2014. Print.



Gregory, R. L. Eye and Brain: The Psychology of Seeing. 5th ed. Princeton: Princeton UP, 1997. Print.



Howard, Ian P., and Brian J. Rogers, eds. Perceiving in Depth. 3 vols. New York: Oxford UP, 2012. Print.



Johansson, Gunnar. “Visual Motion Perception.” Scientific American June 1975: 76–89. Print.



Julesz, Béla. Foundations of Cyclopean Perception. Cambridge: MIT P, 2006. Print.



Vishwanath, Dhanraj. "Visual Information in Surface and Depth Perception: Reconciling Pictures and Reality." Perception beyond Inference: The Information Content of Visual Processes. Ed. Liliana Albertazzi, Gert J. van Tonder, and Vishwanath. Cambridge: MIT P, 2010. 201–40. Print.



Wade, Nicholas J., and Michael T. Swanston. Visual Perception: An Introduction. 3rd ed. New York: Psychology, 2013. Print.
