T-2: Distributed and Efficient Deep Learning

by
Wojciech Samek and Felix Sattler
Fraunhofer Heinrich Hertz Institute

Deep neural networks have recently demonstrated a remarkable ability to solve complex tasks. Today’s models are trained on millions of examples using powerful GPU cards and are able to reliably annotate images, translate text, understand spoken language or play strategic games such as chess or Go. Furthermore, deep learning will also be an integral part of many future technologies, e.g., autonomous driving, the Internet of Things (IoT) or 5G networks. Especially with the advent of IoT, the number of intelligent devices has grown rapidly in the last couple of years. Many of these devices are equipped with sensors that allow them to collect and process data at unprecedented scales. This opens up unique opportunities for deep learning methods.

However, these new applications come with several additional constraints and requirements, which limit the out-of-the-box use of current models.

Embedded devices, IoT gadgets and smartphones have limited memory and storage capacity and restricted energy resources. Deep neural networks such as VGG-16 require over 500 MB to store their parameters and up to 15 giga-operations for a single forward pass. In their current (uncompressed) form, such models cannot be used on-device.
Training data is often distributed across devices and cannot simply be collected at a central server because of privacy concerns or limited resources such as bandwidth. Since training a model locally on only a few data points is rarely effective, new collaborative training schemes are needed to bring the power of deep learning to these distributed applications.
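
To make the first constraint concrete, the VGG-16 figures quoted above can be reproduced with a short calculation. The sketch below assumes the standard 16-layer configuration (thirteen 3x3 convolutions, five 2x2 max-pooling stages, three fully connected layers), a 224x224 RGB input, 32-bit floating-point weights, and that one "operation" means one multiply-accumulate. The names `VGG16_CONV`, `VGG16_FC` and `vgg16_cost` are illustrative and not taken from the tutorial.

```python
# Assumed VGG-16 layout: integers are output channels of 3x3 convolutions,
# "M" marks a 2x2 max-pooling stage that halves the spatial resolution.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
VGG16_FC = [4096, 4096, 1000]  # fully connected layers, 1000 output classes


def vgg16_cost(input_size=224, in_channels=3, bytes_per_param=4):
    """Return (parameter count, multiply-accumulates per forward pass, bytes)."""
    params = 0
    macs = 0
    size, channels = input_size, in_channels
    for layer in VGG16_CONV:
        if layer == "M":
            size //= 2                         # pooling has no parameters
            continue
        weights = 3 * 3 * channels * layer     # one 3x3 kernel per channel pair
        params += weights + layer              # weights plus one bias per filter
        macs += weights * size * size          # kernel applied at every position
        channels = layer
    features = channels * size * size          # flattened input to the classifier
    for out in VGG16_FC:
        params += features * out + out
        macs += features * out
        features = out
    return params, macs, params * bytes_per_param


params, macs, size_bytes = vgg16_cost()
print(f"{params:,} parameters")
print(f"{size_bytes / 1e6:.1f} MB at 32-bit precision")
print(f"{macs / 1e9:.2f} giga multiply-accumulates per forward pass")
```

Under these assumptions the sketch gives about 138 million parameters, roughly 553 MB of 32-bit weights, and about 15.5 billion multiply-accumulates for one image, which is consistent with the figures above. Passing `bytes_per_param=1` shows how much storage 8-bit weights alone would save, without accounting for any loss in accuracy.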

This tutorial will discuss recently proposed techniques to tackle these two problems.
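
As an illustration of what a collaborative training scheme can look like, the sketch below uses federated averaging, one well-known approach, chosen here as an example and not necessarily the method covered in the tutorial. Each device refines a shared model on its own data, and only the resulting model parameters are sent to the server, which averages them weighted by the number of local samples. The one-parameter model `y = w * x`, the toy data, and the names `local_update` and `federated_averaging` are all assumptions made for this example.

```python
def local_update(w, data, lr=0.05, steps=5):
    """Run a few gradient-descent steps on one device's private data.

    The model is y = w * x with a mean-squared-error loss.
    """
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_averaging(clients, rounds=20, w=0.0):
    """Train a shared weight without moving raw data off the devices."""
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        # Every device starts from the current global weight.
        local_weights = [local_update(w, data) for data in clients]
        # The server only sees weights, averaged by local sample count.
        w = sum(len(data) * lw
                for data, lw in zip(clients, local_weights)) / total
    return w


# Hypothetical devices, each holding only a few (x, y) pairs with y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (3.0, 9.0), (2.5, 7.5)],
]
w = federated_averaging(clients)
print(f"learned weight: {w:.4f}")
```

On this noise-free toy data the shared weight converges to the true slope of 3, even though no device ever transmits its data points. In realistic settings the transmitted updates are themselves large, so the communication cost of such schemes is closely tied to the model-size constraint described above.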
