Machine Learning
Autonomous mobile robots often navigate dynamic environments with inherent uncertainties, and it is nearly impossible to program a robot to handle every possible situation. One technique for dealing with these uncertainties is to program robots to learn about their world much as a human child does: through trial and error, testing and comparing, and experiencing success and failure.
For machine learning to work, the computer must be able to perceive the relevant details of its world, have goals or ideals to test against, judge the success or failure of its actions, and remember its previous actions together with their outcomes.
The type of feedback that the machine receives defines the type of learning. Three commonly used types are supervised learning, unsupervised learning, and reinforcement learning.
Supervised Learning
Supervised learning is probably the most intuitive learning method. The machine must infer a function from a set of labeled examples, that is, inputs paired with their correct outputs. One example is cracking a simple substitution cipher, in which each letter has been replaced with another letter. The machine attempts a translation, checks it against the known inputs and results, and adjusts its translation formula until the output matches.
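The cipher example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `decode` and `train` and the shift-by-one training pairs are hypothetical, and the "translation formula" is reduced to a plain lookup table that is corrected whenever a guess disagrees with the known answer.

```python
def decode(table, cipher_text):
    """Translate with the current table; unknown letters become '?'."""
    return "".join(table.get(c, "?") for c in cipher_text)


def train(examples):
    """Adjust a cipher-letter -> plain-letter table from labeled pairs."""
    table = {}
    for cipher_text, plain_text in examples:
        guess = decode(table, cipher_text)      # attempt the translation
        for c, g, p in zip(cipher_text, guess, plain_text):
            if g != p:                          # check against the known answer
                table[c] = p                    # correct the formula
    return table


# Hypothetical labeled examples: (encrypted text, known plain text).
examples = [("ifmmp", "hello"), ("xpsme", "world")]
table = train(examples)

print(decode(table, "ipx"))  # how
print(decode(table, "dbu"))  # ???  (letters never seen in training)
```

The second call shows a limit of the approach: the machine can only translate letters that appeared in its labeled examples.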
In terms of mobile robotics, the robot may use this method to differentiate cars from people. During the learning phase, the vision system would be presented with images of cars and people and, after giving its educated guess, would be shown the right answer. Key features (such as size or shape) are compared, and the function for deciding whether an object is a person or a car is adjusted.
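One simple way to realize this guess-then-correct loop is a perceptron, sketched below. The technique is an assumption chosen for illustration, not something the tutorial prescribes. The features (object width and height in meters), the sample measurements, and the names `predict` and `train` are all hypothetical; a real vision system would have to extract such features from images first.

```python
def predict(weights, bias, features):
    """Return 1 (car) if the weighted score is positive, else 0 (person)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples, learning_rate=0.1, max_epochs=1000):
    """Perceptron rule: adjust the decision function after each wrong guess."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            # Compare the educated guess with the right answer.
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:   # every training example is now classified correctly
            break
    return weights, bias


# Hypothetical labeled data: ((width_m, height_m), label), 1 = car, 0 = person.
examples = [
    ((4.2, 1.4), 1), ((0.5, 1.7), 0),
    ((4.8, 1.5), 1), ((0.6, 1.8), 0),
    ((3.9, 1.6), 1), ((0.4, 1.6), 0),
]
weights, bias = train(examples)

print(predict(weights, bias, (4.5, 1.5)))    # 1 (car)
print(predict(weights, bias, (0.55, 1.75)))  # 0 (person)
```

With this data the loop stops once a full pass produces no wrong guesses. The learned function relies mostly on width, which is the feature that separates the two classes here.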
Unsupervised Learning
Unsupervised learning requires a computer to find patterns and categorize objects on its own. For example, the computer will learn that object A belongs in category 1 while object B belongs in category 2. Although the computer can determine which category each object belongs to, it does not know what the categories represent.
While supervised learning provides the machine with input and output examples to test its behavior against, no labeled examples are given to the computer in unsupervised learning. Instead, patterns are derived directly from the observed data. For example, suppose a robot with a camera sensor is trying to tell apart the fruits in a mixed pile of limes and oranges. The computer could measure the color of each individual fruit and then analyze the distribution of all the fruit along the color spectrum. Two distinct groups should be apparent from the mapped statistics, even though the computer doesn’t know that group A is made up of limes and group B is made up of oranges. Given these findings, the computer should be able to differentiate limes from oranges based on color and on the statistical closeness of each fruit to one of the groups, as illustrated in Figure 1.

Figure 1. Probabilities of an Object Being Either an Orange or a Lime Based on Unsupervised Learning Techniques
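The grouping step in the lime and orange example can be sketched with a two-group k-means clustering on a single color value. This is a simplification assumed for illustration: it assigns each fruit to the nearest group center rather than computing probabilities as in Figure 1. The hue readings (in degrees) and the names `nearest_group` and `two_means` are hypothetical.

```python
def nearest_group(centers, value):
    """Index of the group whose center is closest to the value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def two_means(values, max_iters=100):
    """Split one-dimensional data into two groups without any labels."""
    if min(values) == max(values):
        raise ValueError("need at least two distinct values")
    centers = [min(values), max(values)]       # initial guesses
    for _ in range(max_iters):
        groups = [[], []]
        for v in values:
            groups[nearest_group(centers, v)].append(v)
        new_centers = [sum(g) / len(g) for g in groups]
        if new_centers == centers:             # assignments have stabilized
            break
        centers = new_centers
    return centers


# Hypothetical hue readings for a mixed pile of fruit (no labels attached).
hues = [30, 95, 28, 102, 33, 98, 27, 100]
centers = two_means(hues)

print(centers)                     # [29.5, 98.75]
print(nearest_group(centers, 35))  # 0
print(nearest_group(centers, 90))  # 1
```

The computer ends up with group 0 and group 1 but, as noted above, it has no way of knowing that one group is oranges and the other is limes.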
Reinforcement Learning
This method is the most open-ended. The computer or robot is able to try any action or any combination of actions. The success of the resulting outcome is rated by a metric to provide feedback to the machine. This rating is called the reinforcement.
Suppose you are designing a robot to ferry parts from one side of a plant to the other. Negative reinforcement could come from damaging the part, the robot, or the facilities. Positive reinforcement could come from transporting the part to the other location in the minimum amount of time or over the minimum distance. Over time, the robot will learn to select faster routes while adjusting its speed and cornering velocities to minimize both time and damage.
The trade-off is that the robot must be allowed to fail if it is going to try all outcomes. If the robot is trying to decode some pattern in text, there is little real-world risk. If the robot is a vehicle flying over a city, a decision to try flying straight into the ground is costly and dangerous. Ultimately, the robot will recognize that it has failed the test with that choice of action, but that is not very useful after it has been destroyed.
One possible way for autonomous mobile robots to avoid damage is to let the ‘brain’ of the robot attempt these trials in a well-modeled environmental simulation before deployment in the real world. That way, when it crashes catastrophically, it does so only virtually. Eventually, the trained brain can be deployed in a real-world environment.
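A minimal sketch of training in simulation is shown below, using tabular Q-learning as an assumed technique. The world is a hypothetical five-cell corridor with a hazard at one end and the delivery point at the other, and the reward values, function names, and parameters are all illustrative. To keep the sketch deterministic, every action is tried in every cell on each sweep instead of exploring at random.

```python
GOAL, HAZARD = 4, 0        # hypothetical corridor: cells 0..4
ACTIONS = (-1, +1)         # move left or right by one cell


def simulate(state, action):
    """Simulated world: returns (next_state, reward, done)."""
    nxt = state + action
    if nxt == HAZARD:
        return nxt, -10.0, True   # virtual crash: negative reinforcement
    if nxt == GOAL:
        return nxt, +10.0, True   # part delivered: positive reinforcement
    return nxt, -1.0, False       # every extra step costs time


def train(sweeps=200, alpha=0.5, gamma=0.9):
    """Q-learning update applied to every cell and action in turn."""
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(HAZARD + 1, GOAL)}
    for _ in range(sweeps):
        for s in q:
            for a in ACTIONS:     # try every action, including the bad ones
                nxt, reward, done = simulate(s, a)
                future = 0.0 if done else max(q[nxt].values())
                q[s][a] += alpha * (reward + gamma * future - q[s][a])
    return q


q = train()
policy = {s: max(q[s], key=q[s].get) for s in q}
print(policy)  # {1: 1, 2: 1, 3: 1}
```

The learned policy moves right, toward the delivery point, from every cell. The robot crashed into the hazard many times during training, but only inside the simulation.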
Legal
This tutorial (this "tutorial") was developed by National Instruments ("NI"). Although technical support of this tutorial may be made available by National Instruments, the content in this tutorial may not be completely tested and verified, and NI does not guarantee its quality in any way or that NI will continue to support this content with each new revision of related products and drivers. THIS TUTORIAL IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND AND SUBJECT TO CERTAIN RESTRICTIONS AS MORE SPECIFICALLY SET FORTH IN NI.COM'S TERMS OF USE (http://ni.com/legal/termsofuse/unitedstates/us/).
