What Is Computer Vision?

In 1983 satellites of the USSR registered the launch of five intercontinental missiles against Moscow. Stanislav Petrov, the duty officer in charge of launching Soviet missiles in retaliation, refused to press the buttons to launch the counterstrike against the United States. He didn’t believe the “certainty” of the computer readouts, according to the BBC.

Of course, Petrov, who passed away in May 2017, was correct in his assessment. The Computer Vision the satellites used had mistaken sunlight glinting off the tops of clouds for the flames of ballistic missiles.

Today, however, Computer Vision can identify and distinguish faces, emotions, and intentions. It can guide driverless cars through obstacles, help judge whether a cancer has metastasized, and determine the context in which objects of interest in a photograph reside.

What’s Computer Vision, Really?

Computer Vision is the science and technology involved in providing computers “eyes” into the physical world. During Petrov’s time computers saw in numbers. Now, scientists, researchers, and tech companies are teaching the Artificial Intelligence (AI) underpinning Computer Vision to literally see and recognize objects and the contexts in which they reside.

The technology is rapidly developing the ability to filter out objects that are irrelevant in a given context, based on the specialized intent of its users. Computer Vision is being trained to distinguish objects in a wide range of settings, including driverless cars, facial recognition, medical diagnoses, and real estate, to name just a few.

Driverless Cars See Traffic

Elon Musk, more than any other individual, has championed and popularized the vision of driverless cars wending their way through city streets. Indeed, his company Tesla has been at the forefront of introducing driverless cars to the public.

However, most people do not know that the way cars “see” the road and pedestrians and road obstacles is through Computer Vision.

Computer Vision technology has progressed to such a degree that it can now distinguish a pedestrian from a lamp post. It can tell if a large object in the middle of the road is an empty paper bag or a large rock. The AI “brain” of the car directs the car to stop when a pedestrian is crossing or drive on if the object in the road looks like a bag.

At the end of 2017 companies across the world will be testing driverless lorries. The New York Times wrote that in the US, UK and even in China, some of the largest technology developers will be trialing semi-trucks for long-distance hauls.

Recognizing a Face in the Crowd

A police force in the UK has already used facial recognition software to track and find crime suspects, according to The Daily Telegraph. Computer Vision technology attached to Britain’s ubiquitous CCTV network finds and tracks faces across a wide array of outdoor cameras.

Facial recognition applications are moving into corporate and public spheres, as well. A Fortune Magazine article reported that Walmart, the retailer, has tested computer vision to identify potential shoplifters.

The British government is currently testing Computer Vision technology that will permit train customers to pay for their tickets as they walk through the gate, the BBC wrote. The software quickly identifies the face of a potential passenger and deducts the fare from the customer’s account with the train service.

Computer Vision in Medicine

The potential of Computer Vision to aid radiologists in their work cannot be overstated. Radiologists review the condition of body structures and organs through a variety of media. The best-known records include X-rays, sonograms, and MRIs.

While a radiologist can review several dozen charts in a day, Computer Vision promises to review thousands in the same period. Accuracy rates are as high as 98 percent. Computer Vision frees up human radiologists to review and render a professional opinion on the 2 percent of results the technology finds questionable.
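The division of labor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scan identifiers, the confidence scores, and the 0.90 cut-off are all hypothetical, and a real diagnostic system would set its threshold through clinical validation.

```python
# Hypothetical sketch: send scans the model is unsure about to a human.
# The threshold below is an assumed value, not a clinical standard.
REVIEW_THRESHOLD = 0.90


def triage(scans, threshold=REVIEW_THRESHOLD):
    """Split (scan_id, confidence) pairs into two lists.

    Scans at or above the threshold are handled automatically;
    the rest are queued for a radiologist.
    """
    automatic, needs_review = [], []
    for scan_id, confidence in scans:
        if confidence >= threshold:
            automatic.append(scan_id)
        else:
            needs_review.append(scan_id)
    return automatic, needs_review


# Made-up example data: three scans with model confidence scores.
scans = [("scan-001", 0.99), ("scan-002", 0.62), ("scan-003", 0.95)]
automatic, needs_review = triage(scans)
```

Here only "scan-002" falls below the assumed cut-off, so it alone lands in the radiologist's queue.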

What is the Difference Between Computer Vision, Image Recognition, and Image Processing?

Computer Vision has two core technologies: image recognition and image processing. The Computer Vision of Petrov’s era in the early 1980s only used image processing. Image processing involves using programming and mathematics to clarify and delineate objects in an image.
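To give a feel for what “programming and mathematics” means here, the following is a minimal sketch of one classic image-processing step, edge detection, written in plain Python. The function name, the tiny sample image, and the threshold are assumptions made for illustration; production systems use optimized libraries and more robust filters.

```python
# Hypothetical illustration: gradient-based edge detection on a tiny
# grayscale image, in plain Python with no dependencies.

def edge_map(image, threshold):
    """Mark pixels whose brightness differs sharply from a neighbor.

    image: list of rows, each a list of brightness values (0-255).
    Returns a grid of the same size where 1 marks an edge, 0 otherwise.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Brightness change toward the right-hand and lower neighbors
            # (treated as zero at the image border).
            dx = image[y][x + 1] - image[y][x] if x + 1 < width else 0
            dy = image[y + 1][x] - image[y][x] if y + 1 < height else 0
            if abs(dx) + abs(dy) >= threshold:
                edges[y][x] = 1
    return edges


# A dark background with a bright 2x2 square in the middle.
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
edges = edge_map(image, threshold=100)
```

The result outlines where brightness jumps around the square, but nothing in it says what the square is. That gap is what image recognition fills.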

In contrast, the AI underpinning modern image recognition technology uses Deep Learning. The approach improves Computer Vision’s accuracy in distinguishing an object within a context. The degree to which Computer Vision understands the context depends on how “deep” its learning has gone.

How Does Computer Vision Work?

Take, for example, an online photograph of a cat in a living room. The room has large patio doors, a red brick fireplace, and a sunroof. Image processing would distinguish each of the forms in the photo, but not piece it all together into the context of a cat in a living room.

A generic application of AI-driven image recognition would be able to label and categorize each of the objects in the living room. It could rely on a huge, global and publicly accessible repository of cross-referenced and labeled images called ImageNet. It may even be able to determine the context of the photo -- namely, a living room -- depending on how “deep” and specialized the AI’s learning went.

Generic Computer Vision would display text about the photo that would read something like, “Cat, red bricks, fireplace, ceiling, hole [recognized in the ceiling], large windows.”

People searching for property online, however, require much more relevant detail about photos. Not only would they be searching using their own words and ideas, but they would expect a rich, detailed and structured response to their queries.

A specialized application of Computer Vision rooted in Deep Learning would also use image processing to determine the forms and structures in the photo. However, AI that’s been “schooled” in real estate-related terminology, images and contexts would very quickly “recognize” the sum total of relevant images in the photo as a living room. It would ignore the cat in the setting since the cat is irrelevant to what is important to people searching for property.

Computer Vision in which image recognition is based on Deep Learning may display text about the photo as, “Living room with sunroof, patio, and a red-brick fireplace.”
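The difference between the generic and the specialized output can be illustrated with a deliberately simple stand-in. In the sketch below, a hand-written lookup table plays the role of the real-estate “schooling”; the table contents and function name are hypothetical, and an actual Deep Learning system would learn such associations from labeled images rather than from a dictionary.

```python
# Hypothetical domain vocabulary: generic labels mapped to real-estate
# terms. Labels absent from the table (such as "cat") count as irrelevant.
REAL_ESTATE_TERMS = {
    "red bricks": "red brickwork",
    "fireplace": "fireplace",
    "hole": "sunroof",
    "large windows": "patio doors",
}


def relevant_features(labels, vocabulary):
    """Keep only labels the domain cares about, renamed to its terms."""
    return [vocabulary[label] for label in labels if label in vocabulary]


# The generic labels from the example photo described above.
generic_labels = [
    "cat", "red bricks", "fireplace", "ceiling", "hole", "large windows",
]
features = relevant_features(generic_labels, REAL_ESTATE_TERMS)
```

The cat and the ceiling drop out, and the “hole” becomes a sunroof. Recognizing the whole scene as a living room is a separate classification step that this sketch does not attempt.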

Looking Ahead with Computer Vision

By May 2017, Amazon had sold more than 10 million of its voice-activated Echo devices to households around the world, according to GeekWire. Alexa is the AI component of the Echo device that processes voice commands and responds in kind.

Computer Vision will increasingly serve as the “eyes” of Alexa, Siri, Cortana and a host of other voice-activated assistants. People will rely on the services to understand their natural language requests and to serve up information through human-like speech. Computer Vision will provide the sight that links our “digital brains” with our real-world senses.

Computer Vision will permit the blind and sight-impaired to “see” with a richness of description never before possible. Consumers will be able to shop and purchase goods and services more quickly and precisely than in the past. Patients will receive incredibly accurate diagnoses of conditions in minutes, instead of days, from AI in hospitals and from apps on their mobile devices.

And Computer Vision may make the world just a little safer -- by design, not just by luck.
