Wednesday 25 March 2015

What are pattern vision and perception?


Introduction

The human visual system is designed to see patterns. In fact, the system uses light only as a medium for detecting patterns; the actual intensities of light in various parts of the visual field are discarded at the first stage of visual processing, and only relative intensities remain. The fate of these patterns of intensities in the visual brain is the concern of pattern vision.









Vision relies on two kinds of information: patterns of intensity and patterns of color. Intensity patterns are resolved into objects in a number of steps. The steps can be conceived either in terms of the responses of single neurons at various levels of the visual pathways or as a series of rules, or algorithms, governing the transformations that visual information undergoes. The process begins in the retinas of the two eyes, where each receptor cell is exposed to a tiny sample of the visual world, a small spot of sensitivity. The receptor cells pass their messages to other cells at higher levels, and the visual image is reorganized, or recoded, at each step.




Vision Processing

From the point of view of pattern vision, a particularly significant step occurs when the signals enter the visual cortex, after several stages of processing. Each neuron in the cortex is excited by signals from many receptors lying along a straight line, so the neuron responds best to a line in the visual field. That line defines the neuron's receptive field, the area of the visual world to which the neuron responds, and it must have a particular location and orientation. In fact, the neuron responds best to a group of parallel lines at a particular spacing, which improves the reliability of the system. Other neurons respond at the same time to lines at other locations, orientations, and spacings.
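
As a rough computational analogy (a sketch of a standard modeling idea, not a claim about the exact cortical circuitry), such a line-sensitive neuron is often modeled as an oriented filter, for example a Gabor function, whose output is largest where the image contains stripes at the filter's preferred orientation and spacing. The NumPy sketch below builds one such filter and applies it to an image; the particular parameter values (size, wavelength, orientation, sigma) are illustrative assumptions, not measured properties of any neuron.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=21, wavelength=6.0, orientation=np.pi / 4, sigma=4.0):
    """Oriented Gabor filter: a common computational model of a cortical
    simple cell tuned to lines of a given orientation and spacing."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the filter's stripes run along `orientation`.
    x_rot = x * np.cos(orientation) + y * np.sin(orientation)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()   # zero mean: responds to contrast, not overall brightness

# A neuron-like "response map": large values wherever the image contains
# lines at the filter's preferred orientation and spacing.
image = np.random.rand(64, 64)      # stand-in for a retinal intensity image
response = convolve2d(image, gabor_kernel(), mode="same")
```

Because the kernel is adjusted to have zero mean, it signals relative intensity differences rather than absolute brightness, echoing the point made in the introduction.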


From this set of line-shaped receptive fields, the visual system constructs an internal model of the visual world. First, the outlines of objects can be recognized from patterns of lines. Next, particular patterns of lines are assigned particular meanings. For example, sometimes two lines will meet at a T junction. In this case, the visual system generally assumes that the crossbar of the T is in front of the upright part, because the crossbar interrupts a line. From an image that contains many such lines and intersections, the visual system reconstructs the surfaces and objects of the world.




Motion and Shading

Often, however, this is not enough. There may remain ambiguities, unanswered questions about what objects are present in the world. The visual system has several additional methods by which to interpret the image. One of them is motion, which is usually present in the visual field. If there is no motion of objects in the world, there is often motion of the observer, so that objects move past the observer at different rates. The visual system can take any group of lines that is moving in the same way and conclude that they represent a single object moving relative to other parts of the visual world.
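
As a toy illustration of this "common fate" grouping (my own sketch, not a description of the actual neural computation), image features can be clustered simply by how similar their estimated motion vectors are; the tolerance value below is an arbitrary assumption.

```python
import numpy as np

def group_by_common_motion(velocities, tolerance=0.5):
    """Greedy grouping of features whose motion vectors are nearly equal.
    `velocities` is an (N, 2) array of per-feature (dx, dy) motion estimates."""
    groups, assigned = [], np.zeros(len(velocities), dtype=bool)
    for i, v in enumerate(velocities):
        if assigned[i]:
            continue
        # Everything moving (almost) the same way as feature i joins its group.
        members = np.where(np.linalg.norm(velocities - v, axis=1) < tolerance)[0]
        assigned[members] = True
        groups.append(members)
    return groups

# Two features drifting right together and one drifting up yield two groups.
print(group_by_common_motion(np.array([[1.0, 0.0], [1.1, 0.1], [0.0, 2.0]])))
```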


Objects can also reveal their shapes by their motion. A circular image, for example, might be either a disk or a sphere. A rotating sphere, however, keeps the same circular outline, while a flipping disk appears round, then elliptical, then as a thin sliver when seen edge-on as it tumbles. Other objects have characteristic changes in appearance as they rotate, providing the visual system with information about their structure. How the system decodes this information is the shape-from-motion problem.


The shading of objects also reveals something about their structure. A surface that is facing a source of illumination will be brighter than a surface at another angle. A rounded object will have continuous changes in shading. This is the shape-from-shading problem, and solving it gives still more information about the structure of the visual world. In this way, the visual system combines information from many cues, or sources of input from the image, and normally the brain does a remarkably good job of interpreting the visual world quickly and accurately.
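
The brightness rule described here is essentially Lambert's cosine law: a matte surface reflects light in proportion to the cosine of the angle between its surface normal and the direction of the light. The snippet below is a minimal sketch of that forward relationship only (rendering a brightness from a known shape), not of the much harder inverse shape-from-shading computation; the light direction and albedo value are illustrative.

```python
import numpy as np

def lambertian_brightness(normal, light_dir, albedo=1.0):
    """Brightness of a matte surface patch under a distant light source:
    proportional to cos(angle between surface normal and light direction)."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    return albedo * max(0.0, float(np.dot(n, l)))   # clamp: surfaces facing away are dark

light = np.array([0.0, 0.0, 1.0])                                 # light along the z-axis
print(lambertian_brightness(np.array([0.0, 0.0, 1.0]), light))    # facing the light -> 1.0
print(lambertian_brightness(np.array([1.0, 0.0, 1.0]), light))    # tilted 45 degrees -> ~0.71
```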


Color gives still more information about visual patterns, helping the system to distinguish surfaces and textures. It is handled by separate visual mechanisms, which cannot resolve as much detail as the form-processing mechanism.




Machine Vision

One reason it is important to know how pattern vision works is that an understanding of visual processing is necessary to enable machines to interact effectively with the visual environment. As humans interact more with patterns generated in graphics-oriented computers and in art, it becomes important to understand what goes on in the human mind when patterns are processed. One necessity for building robots to do many tasks is to give them the ability to recognize objects in their surroundings. Generally, the robots are computer-based and use television cameras for visual input; interpretation of the image comes next. This has proved to be more difficult than anticipated; even the first step in the process, abstracting lines and edges from the world, remains imperfect in existing systems. One of the efforts in the area of artificial pattern recognition, then, has been to investigate the pattern-recognition mechanisms of humans and animals and to try to build similar mechanisms into the machines. Every three-year-old has far better pattern-recognition capabilities than the most sophisticated machines.
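
Even that first step, extracting lines and edges, usually comes down to convolving the image with small gradient filters. The sketch below uses the standard Sobel operator (a textbook choice for illustration, not necessarily what any particular robot system uses) to produce an edge-strength map from an intensity image.

```python
import numpy as np
from scipy.signal import convolve2d

# Sobel kernels: approximate horizontal and vertical intensity gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def edge_strength(image):
    """Gradient magnitude: large where intensity changes sharply (edges)."""
    gx = convolve2d(image, SOBEL_X, mode="same", boundary="symm")
    gy = convolve2d(image, SOBEL_Y, mode="same", boundary="symm")
    return np.hypot(gx, gy)

# A dark left half and bright right half: the edge map peaks at the boundary.
image = np.zeros((32, 32))
image[:, 16:] = 1.0
print(edge_strength(image).max())
```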


One of the problems facing machine vision is that, although the identifying of lines, edges, and patterns might work well in the laboratory, the process is less successful in the real world. Humans have a remarkable ability to recognize patterns even in “noisy” environments, despite shadows, occlusions, changes in perspective, and other sources of variation. The emerging discipline of artificial intelligence is concerned with such problems.




Virtual Reality

Another application of visual pattern recognition is in the area of virtual reality, the effort to design displays that give the user the illusion of actually being in a different environment. Usually, the observer wears goggles that present images to the two eyes, reproducing everything that the observer would see in another environment. When the user's head turns, the scene presented in the goggles shifts in the opposite direction, so that the virtual world appears to stay put, just as the real world does. The system might present an undersea or space environment, for example, and can even include an artificial image of the observer's own hand, calculated from the position of an electronically instrumented glove worn by the user. Here the difficulty is in deciding what the display should offer to the user; it is impractical to reproduce an entire visual world in all its richness and detail. The designer of a virtual-reality system must select the patterns that will be presented and must therefore know what information is essential to pattern vision and what can be left out. Again, knowledge of what information the human visual system will extract from the scene is essential to guide decisions about what to present.


An example of how research in pattern vision can influence the design of virtual-reality systems is in the amount of detail that must be presented to the observer. The fovea of the eye, at the center of vision, sees much finer detail than the rest of the retina, and the farther from the fovea one goes, the less detail can be resolved. Engineers take advantage of this property of human vision by designing systems that present rich detail near objects of interest and less detail elsewhere. It is easier to update the information in this kind of display than to recalculate a finely detailed image for the whole visual field. Similarly, color information need not be presented in great detail over the entire visual field.
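
A crude way to express this falloff (an illustrative formula of my own, not one taken from any particular headset or from the vision literature) is to let the resolvable detail decline with eccentricity, the angular distance from the point of gaze, and spend rendering effort accordingly. All the constants below are made up for the example.

```python
import numpy as np

def detail_budget(eccentricity_deg, foveal_level=1.0, halving_ecc_deg=2.5):
    """Fraction of full rendering detail to spend at a given eccentricity.
    Detail halves every `halving_ecc_deg` degrees away from the gaze point;
    the constants are illustrative, not measured values."""
    return foveal_level * 0.5 ** (np.asarray(eccentricity_deg) / halving_ecc_deg)

for ecc in (0, 2.5, 5, 10, 20):    # degrees from the center of gaze
    print(f"{ecc:>4} deg -> render at {detail_budget(ecc):.2f} of full detail")
```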


Other economies in design can be used in virtual-reality systems as well as in other computer displays. One such shortcut takes advantage of more subtle properties of visual pattern processing. The shading cues that are used to give an object the appearance of depth need not be accurate ones. The visual system is not sensitive to some kinds of distortions in shading and will accept an object as appropriately shaded even if the mathematics that generate the computer’s shading are simplified and distorted.


Another example of a simplification that engineers can make is in the presentation of motion. Humans are not very sensitive to differences in rates of acceleration, so these differences need not be presented accurately. In summary, the human visual system uses shortcuts in interpreting the visual image, and artificial systems can use similar shortcuts in constructing the image.




Theoretical Developments

The beginnings of research on pattern vision can be traced to the work of René Descartes in the seventeenth century. Descartes dissected a cow’s eye and found that a small upside-down image of the world was projected on the back of the eye. All the information that comes from vision passes through a similar stage in human eyes as well. For more than two centuries, however, little progress was made in deciphering what happened to visual information after it left the retina. Anatomists learned where in the brain the visual fibers led, but they could not find out what was happening there.


One of the advances that has made work on pattern vision possible is the realization that vision must be studied at many levels of analysis. One level is neurophysiology, the understanding of what goes on in the nerve cells and in the fibers that connect them. Another level is the algorithm, the set of internal rules for coding and interpreting visual information. Researchers at this level ask what steps the visual system must take to interpret a pattern. The steps themselves are taken care of at the neurophysiological level. A third level is behavior: Researchers investigate the capabilities of pattern vision in the intact human. At this more global level, one studies visual pattern processing as a whole rather than dissecting its pieces. It is relating one level to another that advances understanding. At the behavior level, it is observed that people are capable of recognizing patterns from lines alone, as in cartoons. At the algorithmic level, it is found that extracting lines from an image is a useful step in interpreting the image. At the neurophysiological level, it is found that some neurons are sensitive to lines in the visual world.


Modern theories of pattern vision all share several ideas. First, information is transformed in small steps, not all at once, from the image to its meaning. The early steps are largely independent of the use to be made of the visual pattern. Later steps involve interactions with memory and with the use to be made of the visual information. At these later stages, even single nerve cells code information from a wide region of the visual field, as these cells have the job of integrating images from large areas. At the algorithmic level, the brain engages a number of assumptions about the structure of the visual world and the objects in it to interpret a scene quickly and reliably. The visual field is represented over and over in the brain as information passes to more specialized regions that emphasize movement, pattern recognition, visual-motor interactions, and other uses that the brain makes of visual inputs.


Another common idea in pattern vision is that the image is analyzed in several different ways at once. Depth, for example, might be sought in stereoscopic vision (small differences in the images arriving at the two eyes), in occlusion (using the T junctions described above, among other methods), in shading, and in other cues. If one method does not come up with a meaningful interpretation, another one will. In this way, a reliable pattern-vision system can be built from unreliable components.
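
For the stereoscopic cue, the underlying geometry can be written down directly: with two horizontally separated eyes (or cameras), an object's distance is inversely proportional to the disparity between its positions in the two images. The following is a minimal sketch of that pinhole-stereo relationship, with made-up numbers standing in for the eye separation and focal length.

```python
def depth_from_disparity(disparity, baseline=0.065, focal_length=0.017):
    """Simple pinhole stereo geometry: depth = focal_length * baseline / disparity.
    baseline ~ interocular distance in meters, focal_length in meters,
    disparity in meters on the image plane (all values illustrative)."""
    if disparity <= 0:
        raise ValueError("zero or negative disparity gives no depth estimate")
    return focal_length * baseline / disparity

# A nearer object produces a larger disparity between the two eyes' images.
print(depth_from_disparity(0.001))    # ~1.1 m away
print(depth_from_disparity(0.0001))   # ~11 m away
```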




