Bringing deep learning to IoT devices
Deep learning is well known for solving seemingly intractable problems in computer vision and natural language processing, but it typically does so by using massive CPU and GPU resources. Traditional deep learning techniques aren’t well suited to addressing the challenges of Internet of Things (IoT) applications, however, because they can’t apply the same level of computational resources.
When running deep learning analysis on mobile devices, developers must adapt to a more resource-constrained platform. Image analysis on resource-constrained platforms can consume significant compute and memory resources. For example, the SpotGarbage app uses convolutional neural networks to detect garbage in images but consumes 83 percent of CPU and takes more than five seconds to respond.
Fortunately, recent advances in network compression, approximate computing, and accelerators are enabling deep learning on resource-constrained IoT devices.
To understand the challenges of applying deep learning to IoT data, it helps to review relevant characteristics of IoT applications. These systems comprise a large number of distributed devices continuously generating a high volume of data.
Different types of devices generate different types of data, leading to heterogeneous data sets, but the data typically includes timestamps and location information. Finally, IoT applications must be designed to tolerate noise in the data due to transmission and acquisition errors.
These characteristics demand substantial computing power, which can be a limited resource on many IoT devices. Researchers have identified three ways to deploy deep learning to resource-constrained devices while reducing demand on CPU, memory, and power.
Network compression is the process of converting a densely connected neural network into a sparsely connected network. This technique does not work with all deep learning networks, but when it applies, it can reduce both storage and computation load.
Intelligently pruning redundant connections can reduce computational load by a factor of 10 without adversely affecting accuracy. One study found another important advantage of sparse network models: they can run on commonly used IoT platforms, including those from Qualcomm, Intel, and NVIDIA.
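The pruning idea can be sketched in a few lines of Python. This is a minimal illustration of magnitude-based pruning on a toy weight matrix, not the exact procedure used in the studies above; the function names and the 50 percent keep fraction are hypothetical choices for the example.

```python
# Magnitude-based pruning: zero out the weakest connections so the
# network becomes sparse and cheap to evaluate. A hypothetical sketch,
# not the exact procedure from any cited study.

def prune(weights, keep_fraction):
    """Keep only the largest-magnitude fraction of weights; zero the rest."""
    flat = sorted((abs(w) for row in weights for w in row), reverse=True)
    k = max(1, int(len(flat) * keep_fraction))
    threshold = flat[k - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def sparse_matvec(weights, x):
    """Multiply-accumulate that skips zeroed connections entirely."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w != 0.0:  # pruned connections cost no multiply-adds
                acc += w * xi
        out.append(acc)
    return out

weights = [[0.9, 0.01, -0.5], [0.02, -0.8, 0.03]]
pruned = prune(weights, keep_fraction=0.5)  # keep the 3 largest of 6 weights
kept = sum(1 for row in pruned for w in row if w != 0.0)
print(kept)  # 3 connections survive; the other half are skipped
print(sparse_matvec(pruned, [1.0, 1.0, 1.0]))
```

In a real deployment the surviving weights would be stored in a sparse format (for example, compressed sparse row), so both memory footprint and multiply-accumulate count shrink with the keep fraction.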
Another approach to reducing computational load is to approximate, rather than exactly compute, the values of the nodes that have the least effect on model accuracy. For example, in one study on low-power design, researchers produced approximate results by cutting the number of bits used to represent low-impact neurons by 33 to 75 percent, without reducing accuracy. While suitable for many applications, approximate computing is not appropriate where high precision is required.
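One simple form of this bit reduction is uniform quantization: storing a value as an index into a coarse grid rather than at full precision. The sketch below is an illustrative assumption, not the scheme from the cited study; the function name and the [-1, 1] range are hypothetical.

```python
# Reduced-precision storage: round a value onto a 2**bits-level uniform
# grid over [lo, hi]. A hypothetical sketch of approximate computing;
# the cited study's exact bit-reduction scheme may differ.

def quantize(value, bits, lo=-1.0, hi=1.0):
    """Round value to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    index = round((value - lo) / step)
    return lo + index * step

full = 0.7213
approx8 = quantize(full, bits=8)  # 256 levels: small rounding error
approx4 = quantize(full, bits=4)  # 16 levels: half the bits, larger error
print(abs(full - approx8))  # tiny
print(abs(full - approx4))  # noticeably larger
```

The trade-off is explicit: dropping from 8 bits to 4 halves the storage for that value while widening the worst-case rounding error, which is why this treatment is reserved for low-impact neurons.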
A third approach to enabling deep learning on IoT devices is to deploy specialized accelerator hardware. The Eyeriss accelerator, for example, is designed to minimize the energy consumed by convolutional neural networks (CNNs). The DianNao platform is designed for both convolutional and deep neural networks and optimizes for memory utilization as well as energy consumption.
The disadvantage of this approach is that it requires specialized hardware. Hardware-based accelerators are best suited for high-value applications that require low power consumption and fast computation.
Deep learning is not out of the reach of IoT devices. Network compression, approximate computing, and hardware accelerators all enable deep neural net modeling on devices with limited CPU, memory, and power. As with any specialized approach, there are limitations and disadvantages.
Network compression and approximate computing are well suited to applications that can tolerate some loss of precision. Both techniques can be coupled with accelerators to further improve performance, but at the cost of specialized hardware. Because both techniques discard information used in calculations, they should be combined with caution.
Our knowledge of deep learning on IoT devices is limited, and one should proceed with caution in several areas. Analyzing streaming data with compressed networks is not well understood, especially with regard to models that must adapt to changing patterns in the data stream.
Most of the research on deep learning for IoT employs CNNs. These networks are routinely applied to image analysis, but long short-term memory (LSTM) networks are widely used to analyze sequential streaming data sets. It is not clear whether the network compression techniques that work well with CNNs will be as effective when applied to LSTMs.
When using approximate computing and network compression, there will be a point of diminishing returns: further reductions in network or node size may produce unacceptable drops in accuracy and precision.
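The diminishing returns can be illustrated numerically: for a uniform quantizer, each bit removed roughly doubles the worst-case rounding error, so the error budget is consumed exponentially while the storage savings accrue only linearly. The toy calculation below assumes a uniform quantizer over [-1, 1]; it is illustrative arithmetic, not a measurement from any real model.

```python
# Diminishing returns of bit reduction: each bit removed roughly doubles
# the worst-case rounding error of a uniform quantizer over [-1, 1].
# Toy numbers only, not measurements from any deployed network.

def worst_case_error(bits):
    """Half the grid step of a 2**bits-level uniform quantizer on [-1, 1]."""
    return (2.0 / (2 ** bits - 1)) / 2

for bits in (8, 6, 4, 2):
    print(bits, round(worst_case_error(bits), 4))
# Going from 8 bits to 6 saves 2 bits for a small error increase;
# going from 4 bits to 2 saves the same 2 bits but the error explodes.
```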
Deep learning is the best solution to a number of analysis and detection problems. When used with an eye to conserving compute, memory, and power resources, deep learning can bring intelligence to IoT devices.