2.10 What is Depth Perception? – Definition, Cues & Examples
We live in a 3-dimensional world, but each of our eyes is only capable of capturing a 2-dimensional image. Get up for a second and move around; notice how you know when to avoid a chair or the edge of a table? That’s because you can tell how far away objects are from you. This is incredibly important in so many ways. We don’t spend our lives bruised from running into furniture. We’re able to drive, judging where other cars are on the road and knowing how fast we can safely go. Think of a time you’ve misjudged the final step on a flight of stairs–you assumed it was higher or lower than it really was. Imagine if this happened all the time; that’s a world without depth perception.
Depth perception is so important that it may be hard-wired into our brains. At the very least, it’s something we pick up very early. In Gibson and Walk’s famous visual cliff experiment, infants as young as six months old perceived a Plexiglas-covered drop-off and approached it nervously. Most refused to cross. Gibson and Walk concluded that these infants could indeed perceive that the drop was there, and knew it could be dangerous for them.
But how do we turn flat images into 3-D? There are two main kinds of depth cues: binocular and monocular. These words really just mean ‘two-eye’ and ‘one-eye’; you can remember it because you look through binoculars with both eyes, but a proper English gentleman holds up a monocle to only one eye. Basically, there are some clues to depth that we can perceive with just one eye and others that we need both eyes for.
Let’s start with a simple example. Take a look at your desk; let’s say you have a stapler on it, as well as a few old mugs. If you can see that the stapler overlaps in front of one of the mugs, you could guess–accurately–that the stapler is closer to you than the mug. This is a monocular depth cue called interposition.
Have you ever learned about perspective in an art class? If you try to draw a road disappearing into the distance, you have the lines converge as they reach the horizon. Next time you’re out on a long stretch of road, take a look: the road really does seem to converge to a point as it gets farther away. This monocular depth cue is called linear perspective; in a flat, 2-D image, things that are farther away seem to get closer together. Farther objects also appear higher up in your visual field; this is the position cue. And a car that’s farther down the road will appear smaller than a same-sized car nearby; this is known as relative size.
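Relative size is really just geometry: the farther away an object is, the smaller the angle it takes up in your view. As a rough back-of-the-envelope sketch (the helper name and the car measurements here are just made up for illustration):

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Visual angle (in degrees) subtended by an object of a given
    width seen from a given distance: theta = 2 * atan(size / (2*d))."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

# Two identical cars, each about 1.8 m wide, one near and one far:
near_car = visual_angle_deg(1.8, 10)    # car 10 m away
far_car = visual_angle_deg(1.8, 100)    # same car 100 m away
print(near_car, far_car)  # the distant car subtends a much smaller angle
```

If you know the two cars are really the same size, the smaller image must be the farther car; that inference is the relative size cue.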
If you started driving down this road, you might notice that things close to your car seem to sweep past much faster than things in the distance. This is called motion parallax, and even with one eye closed, it would clue you in to how far away things are. Some animals rely more heavily on motion parallax than humans do–have you ever seen a bird bobbing its head around as it looks straight ahead? The bird is generating its own motion parallax to tell how far away various objects are.
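The “nearby things sweep past faster” effect can be put in numbers: for an object directly off to your side, its angular speed across your view is roughly your speed divided by its distance. A tiny sketch, with driving figures picked purely for illustration:

```python
import math

def angular_speed_deg_per_s(observer_speed_mps, distance_m):
    """Angular speed of a stationary object directly abeam of a moving
    observer (small-angle approximation): omega = v / d radians per second."""
    return math.degrees(observer_speed_mps / distance_m)

# Driving at 20 m/s (~45 mph):
print(angular_speed_deg_per_s(20, 5))    # fence post 5 m away: a blur
print(angular_speed_deg_per_s(20, 500))  # barn 500 m away: barely creeps
```

The nearby post streams across your retina about a hundred times faster than the distant barn, and that difference in flow is exactly what the brain reads as depth.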
A distant barn also probably looks less detailed than a nearby shed–you can’t make out its individual boards or shingles the way you can on the shed. This is another monocular depth cue used by painters to suggest that some objects are farther away than others, and it’s called the texture gradient. You can see it really clearly in images like a field of grass, where the blades nearest to the viewer are distinct but fade to a wash of green in the distance. A related idea is aerial perspective, in which things in the distance appear hazier and foggier than things nearby.
While monocular depth cues help us learn things about the 3-D world from a flat image, binocular depth cues are involved in helping our brains produce actual 3-D images out of flat sensory input. There’s a reason we have two eyes rather than one gigantic Cyclops eye–two eyes a little bit apart from each other generate two slightly different images to send to the brain. This is known as retinal disparity. The brain merges these two slightly different images to create one that looks 3-dimensional. 3-D movies use this principle to achieve their effect. You know how if you take off your 3-D glasses the movie looks a little fuzzy? That’s because there are actually two slightly different images on the screen at once, and the glasses filter the input so that one eye gets one image and one eye gets the other. Then your brain does the rest of the work, putting it together into full 3-D.
Over shorter distances, your brain can tell how close things are based on signals it gets from your eyes’ muscles. For an object under about 50 feet away, your brain can tell how much your eyes have to converge, or turn inward, to see it. Hold up your finger in front of your face and look at it; now bring it closer. As it gets really close, you’ll definitely feel your eyes cross more and more; it might even get uncomfortable. This is happening in a less extreme way all the time, and your brain can calculate the convergence angle between the eyes and use it to figure out how far away something is. The brain can also take advantage of the ability of your eyes’ lenses to change their curvature when focusing on particular objects, a cue known as accommodation; it knows how much the lenses have flexed and can use this to judge depth over small, nearby distances.
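The finger experiment can be put in numbers too. Assuming the eyes are about 63 mm apart, the convergence angle is large for a finger near your nose and nearly zero for anything far off, which is why convergence only helps at short range. A small sketch (the distances and names are just illustrative):

```python
import math

IPD_M = 0.063  # assumed average distance between the two eyes, ~63 mm

def convergence_angle_deg(distance_m, ipd_m=IPD_M):
    """How far the eyes' lines of sight must converge (turn inward,
    in total) to fixate an object straight ahead at a given distance."""
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))

print(convergence_angle_deg(0.10))  # finger 10 cm from your face: strong cross-eye
print(convergence_angle_deg(15.0))  # object ~50 ft away: eyes almost parallel
```

Around 35 degrees of convergence for the close finger versus a fraction of a degree at 50 feet: beyond that range the signal is too small for the brain to use, which matches the limit mentioned above.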
So to sum things up, our brains have all sorts of cool ways of making 3-D pictures out of 2-D inputs. This ability is important, and perhaps even hard-wired into us, because it helps us function in our 3-D world. The brain can use monocular depth cues like linear perspective or motion parallax, which require input from only one eye. It can also use binocular depth cues, which depend on the slightly different images our two differently-positioned eyes send to the brain, and on how the eyes have to move to focus on objects at different distances.