AI for analyzing humans in motion: Grids, networks, and meshes
One of the most exciting and challenging applications for deep learning is in analyzing human movement. Security, medicine, and even ecological preservation are just some of the areas that could be impacted by movement recognition advances.
In medicine, for instance, uses may include monitoring and therapy for patients with traumatic brain injuries or spinal injuries, or for patients who are relearning how to move. A machine capable of recognizing and analyzing abnormal human movement could significantly improve treatment for these patients. However, defining and organizing human movement data is a complex, multi-layered problem.
In order to address some more vexing AI problems, such as analyzing objects in motion, more “exotic” data sets are used for analysis. Today, machine learning experts at MissingLink.ai, a product of Samsung NEXT, are extending the reach of machine learning to tackle some of the most challenging issues that come with learning from exotic data sets.
During a recent presentation at the San Francisco Deep Learning Meetup, Or Litany, a post-doctoral fellow at Facebook AI Research and Stanford University, discussed advanced deep learning data analysis techniques. His presentation, “Deep Learning for ‘Exotic’ Data Like 3D Meshes and Point-Clouds,” is available on YouTube.
Innovation is sometimes the product of combining two previously unrelated techniques. Litany said that kind of innovation in deep learning is now being applied to 3D objects and networks.
The combination of representations for exotic data sets and techniques for mapping them to deep neural networks brings deep learning to new domains, including analyzing the deformation of 3D objects, shape matching, and shape completion.
Litany said that moving beyond tabular or grid-like data representations enables AI developers to tackle a wider range of problems, because grid-like representations are poorly suited to a number of important and challenging areas.
For example, he said, graph data such as social networks or protein interaction networks have no common set of coordinates for describing data and instead are modeled as nodes and edges. In other cases, 3D meshes and point clouds enable algorithms to reason about the geometry of an object in ways that are not possible with grid-like structures.
Traditional Grid-like Structures
Grid-like structures are widely used in machine learning because their properties make it fairly easy to categorize objects and measure the similarity between objects. These structures lend themselves to using the distance between points as a measure of similarity. In classification problems, planes and hyperplanes are used to partition data sets.
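The two ideas in this paragraph can be made concrete with a minimal sketch. It assumes hypothetical 2D feature vectors and a hand-picked separating line; the function names are illustrative and do not come from the talk.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def side_of_hyperplane(point, weights, bias):
    """Return +1 or -1 depending on which side of w.x + b = 0 the point lies."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score >= 0 else -1

# Hypothetical 2D feature vectors (made up for illustration).
a = (1.0, 1.0)
b = (1.5, 1.2)
c = (6.0, 5.0)

# Distance as similarity: a is closer to b than to c.
a_is_closer_to_b = euclidean_distance(a, b) < euclidean_distance(a, c)

# A hand-picked hyperplane (here a line): x + y - 6 = 0.
weights, bias = (1.0, 1.0), -6.0
labels = [side_of_hyperplane(p, weights, bias) for p in (a, b, c)]
print(a_is_closer_to_b, labels)
```

Both functions depend on every point sharing the same coordinate axes, which is the property that graphs and unordered point sets lack.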
Litany explained that traditional machine learning algorithms, such as support vector machines (SVM), linear regression, decision trees, and random forests take advantage of the grid-like structure of many data sets.
Deep learning works well with grid-like data sets found in speech, signal processing, natural language processing, and some kinds of image analysis. A particularly impressive result is the ability of deep neural networks to learn to identify objects even if they shift to different locations in a scene.
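One reason convolutional networks cope with shifted objects is that a sliding filter gives the same response to a pattern wherever it appears. The sketch below illustrates this on a hypothetical 1D signal with a hand-written filter; in a real network the filter would be learned, and the data would usually be 2D.

```python
def correlate_1d(signal, kernel):
    """Slide the kernel over the signal (valid positions only) and return the responses."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def shift_right(signal, steps):
    """Shift a signal to the right, padding with zeros on the left."""
    return [0] * steps + signal[: len(signal) - steps]

pattern = [1, 2, 1]                   # hypothetical filter matching the "object"
signal = [0, 1, 2, 1, 0, 0, 0, 0]     # the object sits near the start
shifted = shift_right(signal, 3)      # the same object, moved three steps right

response = correlate_1d(signal, pattern)
response_shifted = correlate_1d(shifted, pattern)

# The peak response is identical; only its position moves with the object.
print(max(response), response.index(max(response)))
print(max(response_shifted), response_shifted.index(max(response_shifted)))
```

Taking the maximum over all positions, as pooling layers do, then yields the same value for both inputs.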
Representing 3D Objects: Voxels, Point Clouds and Meshes
Litany said that researchers have successfully used grid-like structures, known as voxels, for classifying static 3D structures. These techniques, however, do not work as well with dynamic structures, such as a person bending or twisting, or with tasks such as identifying a single person walking across a street when multiple people are in the scene.
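A voxel grid is the 3D analogue of a pixel grid: space is divided into small cubes, and each cube records whether any part of the object falls inside it. The following is a minimal sketch, assuming hypothetical points, a voxel size of 1.0, and a 4 x 4 x 4 grid; the function name is illustrative.

```python
def voxelize(points, voxel_size, grid_shape):
    """Mark each voxel that contains at least one point.

    points: iterable of (x, y, z) coordinates
    voxel_size: edge length of one cubic voxel
    grid_shape: (nx, ny, nz) number of voxels along each axis
    Returns the set of occupied (i, j, k) voxel indices; points that
    fall outside the grid are ignored.
    """
    nx, ny, nz = grid_shape
    occupied = set()
    for x, y, z in points:
        i = int(x // voxel_size)
        j = int(y // voxel_size)
        k = int(z // voxel_size)
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            occupied.add((i, j, k))
    return occupied

# Hypothetical points sampled from an object's surface.
points = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.6, 0.2, 0.9), (3.9, 3.9, 3.9)]
occupied = voxelize(points, voxel_size=1.0, grid_shape=(4, 4, 4))
print(sorted(occupied))
```

The first two points land in the same voxel, which shows how detail finer than the voxel size is lost.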
These techniques are difficult to work with, Litany said, especially when the problem also entails filling in missing information, such as with an image that has some missing regions of an object.
Researchers on the leading edge of AI are adapting deep learning methods for dynamic structures that do not readily map to grid-like representation, according to Litany. For example, certain types of neural networks cannot directly use point clouds, which are sets of points in three-dimensional space. “There are two problems with this,” Litany said. “It lacks structure, and the other is that it lacks order.”
Another alternative to grid-like representations for deep learning is the mesh, a set of nodes, edges, and polygons that specifies the shape of a 3D object. Graph neural networks, for example, model the structure of a graph.
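The lack of order means that the same point cloud can be listed in any sequence, so a model's output should not depend on that sequence. One commonly used remedy, the idea behind PointNet-style architectures, is to apply the same function to every point and then combine the results with a symmetric operation such as an element-wise maximum. The sketch below assumes that approach, with a hand-written stand-in for the learned per-point function; this summary of the talk does not say which method Litany described.

```python
import random

def per_point_features(point):
    """Hypothetical per-point feature map (a learned network in practice)."""
    x, y, z = point
    return (x, y, z, x * x + y * y + z * z)

def global_feature(points):
    """Apply the same function to every point, then take an element-wise max."""
    features = [per_point_features(p) for p in points]
    return tuple(max(column) for column in zip(*features))

# Hypothetical three-point cloud.
cloud = [(0.0, 1.0, 2.0), (1.0, 0.0, -1.0), (-2.0, 0.5, 0.0)]

shuffled = cloud[:]
random.shuffle(shuffled)

# The result is the same for any ordering of the points.
print(global_feature(cloud) == global_feature(shuffled))
```

Because the maximum ignores the order of its inputs, the summary vector describes the set of points rather than any particular listing of them.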
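As a minimal sketch of what a mesh contains, the example below stores a tetrahedron as a list of vertex coordinates and a list of triangular faces, then derives the edges from the faces. The layout and names are illustrative assumptions rather than any particular library's format.

```python
# A tetrahedron: four vertices and four triangular faces.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
# Each face lists the indices of its corner vertices.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def edges_from_faces(faces):
    """Collect the unique undirected edges implied by a list of polygon faces."""
    edges = set()
    for face in faces:
        # Pair each corner with the next one, wrapping around to the first.
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)

edges = edges_from_faces(faces)
print(edges)
```

The vertices and edges together form a graph, which is why graph-based methods can be applied to meshes.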
This representation captures information about neighborhoods in the network, which can be mapped, or embedded, into a grid-like space. From there, the neighborhoods in the graph can be treated as convolutions and used with convolutional neural networks.
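The embedding step itself is beyond a short example, but the underlying idea, that each node is updated from its neighborhood much as a pixel is updated from the pixels around it, can be sketched directly. The code below is a simplified stand-in that averages feature values over each node's neighborhood on a hypothetical four-node graph; it is not the specific method from the talk.

```python
def aggregate_neighbors(adjacency, features):
    """One round of mean aggregation over each node's neighborhood (node included)."""
    updated = {}
    for node, neighbors in adjacency.items():
        group = [node] + list(neighbors)
        updated[node] = sum(features[n] for n in group) / len(group)
    return updated

# Hypothetical path graph: a - b - c - d
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

# A single scalar feature per node; only node "a" starts with a signal.
features = {"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0}

step1 = aggregate_neighbors(adjacency, features)
step2 = aggregate_neighbors(adjacency, step1)
print(step1)
print(step2)
```

After one round the signal from "a" has reached "b", and after two rounds it has reached "c". Stacked convolutional layers widen their receptive field in the same way.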
High-Dimensional Data Sets
Deep learning researchers are extending the reach of deep learning to data that can be modeled as networks or graphs, or as manifolds, which are non-grid-like structures that have grid-like properties within small areas.
For example, the Earth is a sphere, but over a small area it appears flat, so part of a 3D object can be described using a 2D object, the plane. The same principle applies to machine learning data sets with a large number of dimensions. When data points lie on or near such a lower-dimensional structure, they can be mapped to a smaller number of dimensions with little or no loss of information.
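A small worked example of the same idea, one dimension lower: points on a circle are stored with two coordinates, yet a single number, the angle, identifies each one. The sketch assumes points that lie exactly on the unit circle, and the function names are illustrative. Real data rarely sits on a known curve, so in practice the mapping has to be learned or estimated.

```python
import math

def from_angle(theta):
    """Place a point on the unit circle in 2D."""
    return (math.cos(theta), math.sin(theta))

def to_angle(point):
    """Map a point on the unit circle to its single intrinsic coordinate."""
    x, y = point
    return math.atan2(y, x)

# Hypothetical data: five points, each stored as two numbers.
thetas = [0.0, 0.5, 1.0, 2.0, 3.0]
points_2d = [from_angle(t) for t in thetas]

# Reduce each point to one number, then compare with the original angle.
recovered = [to_angle(p) for p in points_2d]
round_trip_error = max(abs(a - b) for a, b in zip(thetas, recovered))
print(round_trip_error < 1e-9)
```

Nothing is lost in the reduction because the points never used the full 2D plane in the first place.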
Litany concluded that by applying deep learning techniques to exotic data sets, previously unsolved problems become solvable. Moreover, the continued evolution of artificial intelligence (AI) is finding its way into everyday life, from voice-activated virtual assistants to medical diagnostics. Whether it’s through online shopping, medicine, security, or ecological preservation, machine learning is beginning to change the world.
Learn more about the ways deep learning researchers are tackling the challenges of complex data sets in Litany’s talk.