Machine Vision Plus AI/ML Adds Vast New Opportunities

But to fully realize its potential, MV must boost performance and keep pace with changing security and market needs.


Traditional technology companies and startups are racing to combine machine vision with AI/ML, enabling it to “see” far more than just pixel data from sensors, and opening up new opportunities across a wide swath of applications.

In recent years, startups have been able to raise billions of dollars as new MV ideas come to light in markets ranging from transportation and manufacturing to health care and retail. But to fully realize its potential, the technology needs to address challenges on a number of fronts, including improved performance and security, and design flexibility.

Fundamentally, a machine vision system is a combination of software and hardware that can capture and process information in the form of digital pixels. These systems can analyze an image, and take certain actions based on how it is programmed and trained. A typical vision system consists of an image sensor (camera and lens), image and vision processing components (vision algorithm) and SoCs, and the network/communication components.

Fig. 1: Machine vision systems include hardware, software, and chips to perform image processing and analysis. AI is often part of the solutions and frequently MV is connected to the cloud. Source: Arcturus Networks
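The capture-process-act flow described above can be reduced to a toy sketch. The frame data, region of interest, and pass/fail threshold below are all hypothetical stand-ins for what a real camera driver and vision algorithm would provide:

```python
# Minimal sketch of the capture -> process -> act loop in a machine
# vision system. The frame, ROI, and threshold are hypothetical; a real
# system would read frames from a camera SDK and run a trained model.

def inspect(frame, roi, threshold=128):
    """Average the pixel intensity inside a region of interest (ROI),
    then classify the part as 'pass' or 'fail'."""
    x0, y0, x1, y1 = roi
    pixels = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    mean = sum(pixels) / len(pixels)
    return "pass" if mean >= threshold else "fail"

# A 4x4 "frame" of 8-bit grayscale pixels standing in for sensor data.
frame = [
    [200, 210, 190, 205],
    [ 40,  35,  50,  45],   # a dark band, e.g. a missing coating
    [200, 215, 198, 202],
    [201, 199, 204, 206],
]

print(inspect(frame, roi=(0, 0, 4, 1)))  # bright region -> "pass"
print(inspect(frame, roi=(0, 1, 4, 2)))  # dark region   -> "fail"
```

In practice the "process" stage is a trained model rather than a fixed threshold, but the structure of the loop is the same.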

Both still and video digital cameras contain image sensors. So do automotive sensors such as lidar, radar, and ultrasound, which deliver an image in digital pixel form, although not at the same resolution. While most people are familiar with these types of images, a machine also can “see” heat and audio signal data, and analyze that data to create a multi-dimensional image.

“CMOS image sensors have seen drastic improvement over the last few years,” said Ron Lowman, strategic marketing manager at Synopsys. “Sensor bandwidth is not being optimized for human sight anymore, but rather for the value AI can provide. For instance, MIPI CSI, the dominant vision sensor interface, is not only increasing bandwidths, but also adding AI features such as Smart Region of Interest (SROI) and higher color depth. Although these color depth increases can’t be detected by the human eye, for machine vision they can improve the value of a service dramatically.”

Machine vision is a subset of the broader computer vision. “While both disciplines rely on looking at primarily image data to deduce information, machine vision implies ‘inspection type’ applications in an industry or factory setting,” said Amol Borkar, director of product management, marketing and business development, Tensilica Vision and AI DSPs at Cadence. “Machine vision relies heavily on using cameras for sensing. However, ‘cameras’ is a loaded term because we are typically familiar with an image sensor that produces RGB images and operates in the visible light spectrum. Depending on the application, this sensor could operate in infrared, which could be short wave, medium wave, long wave IR, or thermal imaging, to name a few variants. Event cameras, which are very hyper-sensitive to motion, were recently introduced. On an assembly line, line scan cameras are a slightly different variation from typical shutter-based cameras. Most current applications in automotive, surveillance, and medical rely on one or more of these sensors, which are often combined to do some form of sensor fusion to produce a result better than a single camera or sensor.”

Generally speaking, MV can see better than people. The MV used in manufacturing can improve productivity and quality, lowering production costs. Paired with ADAS for autonomous driving, MV can take over some driving functions. Together with AI, MV can help analyze medical images.

The benefits of using machine vision include higher reliability and consistency, along with greater precision and accuracy (depending on camera resolution). And unlike humans, machines do not get tired, provided they receive routine maintenance. Vision system data can be stored locally or in the cloud, then analyzed in real-time when needed. Additionally, MV reduces production costs by detecting and screening out defective parts, and increases inventory control efficiency with OCR and bar-code reading, resulting in lower overall manufacturing costs.

Today, machine vision usually is deployed in combination with AI, which greatly enhances the power of data analysis. In modern factories, automation equipment, including robots, is combined with machine vision and AI to increase productivity.

How AI/ML and MV interact
With AI/ML, MV can self-learn and improve after capturing digital pixel data from sensors.

“Machine vision (MV) and artificial intelligence (AI) are closely related fields, and they often interact in various ways,” said Andy Nightingale, vice president of product marketing at Arteris IP. “Machine vision involves using cameras, sensors, and other devices to capture images or additional data, which is then processed and analyzed to extract useful information. Conversely, AI involves using algorithms and statistical models to recognize patterns and make predictions based on large amounts of data.”

This also can include deep learning techniques. “Deep learning is a subset of AI that involves training complex neural networks on large datasets to recognize patterns and make predictions,” Nightingale explained. “Machine vision systems can use deep learning algorithms to improve their ability to detect and classify objects in images or videos. Another way that machine vision and AI interact is through the use of computer vision algorithms. Computer vision is a superset of machine vision that uses algorithms and techniques to extract information from images and videos. AI algorithms can analyze this information and predict what is happening in the scene. For example, a computer vision system might use AI algorithms to analyze traffic patterns and predict when a particular intersection will likely become congested. Machine vision and AI can also interact in the context of autonomous systems, such as self-driving cars or drones. In these applications, machine vision systems are used to capture and process data from sensors. In contrast, AI algorithms interpret this data and make decisions about navigating the environment.”

AI/ML, MV in autonomous driving
AI has an increasing number of roles in modern vehicles, but the two major roles are in perception and decision making.

“Perception is the process of understanding one’s surroundings through onboard and external sensor arrays,” said David Fritz, vice president of hybrid and virtual systems at Siemens Digital Industries Software. “Decision-making first takes the understanding of the surrounding state and a goal such as moving toward the destination. Next, the AI decides the safest, most effective way to get there by controlling the onboard actuators for steering, braking, accelerating, etc. These two critical roles address very different problems. From a camera or other sensor, the AI algorithms will use raw data from the sensors to perform object detection. Once an object is detected, the perception stack will classify the object, for example, whether the object is a car, a person, or an animal. The training process is lengthy and requires many training sets presenting objects from many different angles. After training, the AI network can be loaded into the digital twin or physical vehicle. Once objects are detected and classified, decisions can be made by another trained AI network to control steering, braking, and acceleration. Using a high-fidelity digital twin to validate the process virtually has been shown to result in safer, more effective vehicles faster than simply using open road testing.”
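The perception-then-decision split Fritz describes can be illustrated with a deliberately simplified sketch. The class labels, distances, and braking rule below are hypothetical illustrations, not an actual ADAS stack:

```python
# Toy sketch of the perception -> decision split: perception emits
# (label, distance) detections; a separate decision stage picks an
# action. All labels, thresholds, and rules here are hypothetical.

def decide(detections, speed_mps):
    """Given perception output (class label, distance in meters),
    return a driving action."""
    stopping_distance = speed_mps * 2.0  # crude 2-second rule
    for label, distance in detections:
        if label in ("person", "animal") and distance < stopping_distance:
            return "brake"
        if label == "car" and distance < stopping_distance / 2:
            return "slow"
    return "cruise"

print(decide([("car", 80.0)], speed_mps=20.0))     # -> "cruise"
print(decide([("person", 25.0)], speed_mps=20.0))  # -> "brake"
```

In a real vehicle both stages are trained networks validated against a digital twin, but the division of labor is the same: one stage builds a model of the scene, the other acts on it.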

How much AI/ML is needed is a question frequently asked by developers. In the case of modern factories, MV can be used to simply detect and pick out defective parts in an assembly line or employed to assemble automobiles. Doing the latter requires advanced intelligence and a more sophisticated design to ensure timing, precision, and calculation of motion and distance in the assembly process.

“Automation using robotics and machine vision has increased productivity in modern factories,” observed Geoff Tate, CEO of Flex Logix. “Many of these applications use AI. A simple application — for instance, detecting if a label is applied correctly — does not require a great deal of intelligence. On the other hand, a sophisticated, precision robot arm performing 3D motion requires much more GPU power. In the first application, one tile of AI IP will be sufficient, while the second application may need multiple tiles. Having flexible and scalable AI IPs would make designing robotics and machine vision much easier.”

Machine vision applications are limited only by one’s imagination. MV can be used in almost any industrial or commercial segment that requires vision and processing. Here is a partial list:

  • Transportation (autonomous driving, in-cabin monitoring, traffic flow analysis, moving violation and accident detection);
  • Manufacturing and automation (productivity analysis, quality management);
  • Surveillance (motion detection and intrusion monitoring);
  • Health care (imaging, cancer and tumor detection, cell classification);
  • Agriculture (farm automation, plant disease and insect detection);
  • Retail (customer tracking, empty shelf detection, theft detection), and
  • Insurance (accident scene analysis from images).

There are many other applications. Consider the bottling of drinking water or soft drinks. Filling typically is done by highly efficient robots, but robots occasionally make mistakes. A machine vision system can inspect the bottles, ensuring the fill level is consistent and the labels are applied correctly.

Detecting any machine parts that deviate from measurement specification limits is another job for MV. Once the MV is trained on the specification, it can detect the parts that are outside the specification limits.
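Once measurements come out of the vision system, the spec-limit check itself is simple. A minimal sketch, with made-up nominal and tolerance values:

```python
# Sketch of tolerance checking on measurements produced by an MV
# system. The nominal diameter and tolerance are hypothetical numbers.

def out_of_spec(measurements, nominal, tolerance):
    """Return indices of parts whose measured value falls outside
    nominal +/- tolerance."""
    return [i for i, m in enumerate(measurements)
            if abs(m - nominal) > tolerance]

diameters_mm = [10.01, 9.97, 10.08, 10.00, 9.91]
print(out_of_spec(diameters_mm, nominal=10.0, tolerance=0.05))  # -> [2, 4]
```

The hard part in practice is the upstream measurement, extracting a sub-pixel-accurate diameter from an image, not the comparison.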

MV can detect uniform shapes such as squares or circles as well as odd-shaped parts, so it can be used to identify, detect, measure, count, and (with robots), pick and place.

Finally, combined with AI, MV can perform tire assembly with precision and efficiency. Today, OEMs automate vehicle assembly with robots. One of those processes is installing the four wheels on a new vehicle. Using MV, a robotic arm can detect the correct distance and apply just the right amount of pressure to prevent any damage.

Types of MV
MV technologies can be divided into one-dimensional (1D), two-dimensional (2D), and three-dimensional (3D).

1D systems analyze data one line at a time, comparing variations among groups of lines. They are usually used in continuous production of items such as plastics and paper. 2D systems, in contrast, use a camera to scan line by line to form an area, or 2D image. In some cases, the whole area is scanned and the object image can then be unwrapped for detailed inspection.
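The 1D line-scan approach can be sketched in a few lines: compare each scanned line against the overall average and flag deviations. The data and deviation limit below are hypothetical:

```python
# Sketch of 1D line-scan analysis: flag any scanned line whose mean
# intensity deviates from the overall mean by more than a limit.
# The web data and deviation limit are hypothetical.

def flag_lines(lines, max_deviation):
    """Return indices of lines whose mean intensity differs from the
    overall mean by more than max_deviation."""
    means = [sum(line) / len(line) for line in lines]
    overall = sum(means) / len(means)
    return [i for i, m in enumerate(means) if abs(m - overall) > max_deviation]

web = [
    [120, 121, 119, 120],   # normal material
    [120, 122, 118, 120],
    [ 60,  62,  58,  61],   # streak or defect in the material
    [119, 121, 120, 120],
]
print(flag_lines(web, max_deviation=20))  # -> [2]
```

Production systems use a calibrated running reference rather than a batch mean, but the principle of line-by-line comparison is the same.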

3D systems consist of multiple cameras or laser sensors to capture the 3D view of an object. During the training process, the object or the cameras need to be moved to capture the entire product. Recent technology can produce accuracy within micrometers. 3D systems produce higher resolution but are also more expensive.

Emerging MV startups and new innovations
Tech giants, including IBM, Intel, Qualcomm, and NVIDIA, have publicly discussed investments in MV. In addition, many startups are developing new MV solutions, including Airobotics, Arcturus Networks, Deep Vision AI, Hawk-Eye Innovations, Instrumental, Landing AI, Kinara, Mech-Mind, Megvii, NAUTO, SenseTime, Tractable, ViSenze, Viso, and others. Some of these companies have been able to raise funding in excess of $1 billion.

In transportation, insurance companies can use MV to scan photographs and videos of scenes of accidents and disasters for financial damage analysis. Additionally, AI-based MV can power safety platforms to analyze driver behavior.

In software, computer vision platforms can be created without coding knowledge. Other startups have developed MV-based authentication software. And in the field of sports, AI, vision, and data analysis could give coaches insight into how players make decisions during a game. One startup also devised a way to reduce surveillance costs by combining AI and MV in unmanned aerial drones.

Both MV and AI are changing quickly, and will continue to increase in performance, including precision and accuracy, while the cost of GPU and ML compute comes down, propelling new MV applications.

Arteris’ Nightingale noted there will be further improvements in accuracy and speed. “Machine vision systems will likely become more accurate and faster. This will be achieved through advancements in hardware, such as sensors, cameras, and processors, as well as improvements in algorithms and machine learning models,” he said, pointing to an increased use of deep learning, as well. “Deep learning has been a significant driver of progress in machine vision technology in recent years, and it is likely to play an even more substantial role in the future. Deep learning algorithms can automatically learn data features and patterns, leading to better accuracy and performance. There will be an enhanced ability to process and analyze large amounts of data, as machine vision technology can process and analyze large amounts of data quickly and accurately. We may also see advancements in machine vision systems that can process significantly larger datasets, leading to more sophisticated and intelligent applications.”

Further, MV and AI are expected to integrate with other technologies to provide additional high-performance, real-time applications.

“Machine vision technology is already integrated with other technologies, such as robotics and automation,” he said. “This trend will likely continue, and we may see more machine vision applications in health care, transportation, and security. As well, there will be more real-time applications. Machine vision technology is already used for real-time applications, such as facial recognition and object tracking. In the future, we may see more applications that require real-time processing, such as self-driving cars and drones.”

MV design challenges
Still, there are challenges in training an MV system. Its accuracy and performance depend on how well the MV is trained. Inspection can encompass parameters such as orientation, variation of the surfaces, contamination, and accuracy tolerances such as diameter, thickness, and gaps. 3D systems can perform better than 1D or 2D systems when detecting cosmetic and surface variation effects. In other cases, when seeing an unusual situation, human beings can draw on knowledge from a different discipline, while MV and AI may not have that ability.

“Some of today’s key challenges include data flow management and control – especially with real-time latency requirements such as those in automotive applications — while keeping bandwidth to a minimum,” said Alexander Zyazin, senior product manager in Arm‘s Automotive Line of Business. “In camera-based systems, image quality (IQ) remains critical. It requires a hardware design to support ultra-wide dynamic range and local tone mapping. But it also requires IQ tuning, where traditionally subjective evaluation by human experts was necessary, making the development process lengthy and costly. The new challenge for MV is that this expertise might not result in the best system performance, as perception engines might prefer to see images differently to humans and to one another, depending on the task.”

In general, machines can do a better job when doing mundane tasks over and over again, or when recognizing an image with more patterns than humans can typically process. “As an example, a machine may do a better job recognizing an anomaly in a medical scan than a human, simply because the doctor may make a mistake, be distracted or tired,” said Thomas Andersen, vice president for AI and machine learning at Synopsys. “When inspecting high-precision circuits, a machine can do a much better job analyzing millions of patterns and recognizing errors, a task a human could not do, simply due to the size of the problem. On the other hand, machines have not yet reached the human skill of recognizing the complex scenes that can occur while driving a car. It may seem easy for a human to recognize and anticipate certain reactions, while the machine may be better in ‘simple’ situations that a human easily could deal with, but did not due to a distraction, inattention or incapacitation – for example auto stop safety systems to avoid an imminent collision. A machine can always react faster than a human, assuming it interprets the situation correctly.”

Another challenge is making sure MV is secure. With cyberattacks increasing constantly, it will be important to ensure no production disruption or interference from threat actors.

“Security is critical to ensuring the output of MV technology isn’t compromised,” said Arm’s Zyazin. “Automotive applications are a good example of the importance of security in both hardware and software. For instance, the information processed and extracted from the machine is what dictates decisions such as braking or lane-keep assist, which can pose a risk to those inside the vehicle if done incorrectly.”

MV designs include a mixture of chips (processors, memories, security), IPs, modules, firmware, hardware and software. The rollout of chiplets and multi-chip packaging will allow those systems to be combined in novel ways more easily and more quickly, adding new features and functions and improving the overall efficiency and capabilities of these systems.

“Known good die (KGD) solutions can provide cost- and space-efficient alternatives to packaged products with limited bonding pads and wires,” said Tetsu Ho, DRAM manager at Winbond. “That helps improve design efficiency, provides enhanced hardware security performance, and especially speeds time-to-market for product launch. These die go through 100% burn-in and are tested to the same extent as discrete parts. KGD 2.0 is needed to assure end-of-line yield in 2.5D/3D assembly and 2.5D/3D multichip devices to realize improvements in PPA, meaning bandwidth performance, power efficiency, and area, as miniaturization advances, driven by the explosion of technologies such as edge-computing AI.”

This will open new options for MV in new and existing markets. It will be used to support humans in autonomous driving, help robots perform with precision and efficiency in manufacturing, and perform surveillance with unmanned drones. In addition, MV will be able to explore places that are considered dangerous for humans, and provide data input and analysis for many fields, including insurance, sports, transportation, defense, medicine, and more.
