Better Deep Learning: Fewer Neurons, More Intelligence

  • By Florian Aigner / Martin Wagner (edt.)
  • 2020-10-15
  • Research
  • Innovation

AI becomes more efficient and reliable if it is more closely oriented to biological models: New approaches in AI research prove themselves in experiments.

Picture: Gerd Altmann

Thanks to the enormous computing power that has become available in recent years, artificial intelligence (AI) has arrived in our everyday lives, from search engines to self-driving cars. New results from AI research now show that simpler, smaller neural networks can solve certain tasks better, more efficiently, and more reliably than before.

A research team from TU Wien Informatics, IST Austria, and MIT has developed a new type of artificial intelligence based on biological models, such as simple threadworms. The new AI model can control a vehicle with an amazingly small number of artificial neurons. The system has decisive advantages over previous deep learning models: It copes much better with noisy input data, and because of its simplicity, its mode of operation can be explained in detail. It is not a complicated “black box,” but can be understood by humans. This deep learning model has now been published in the journal “Nature Machine Intelligence.”

Learning from Nature

Similar to living brains, neural networks on the computer consist of many individual cells. When one cell is active, it sends a signal to other cells. All signals received by the next cell together decide whether that cell will also become active. Exactly how strongly one cell influences the activity of the next is initially undetermined. These influence strengths are the network's parameters, and they are adjusted in an automatic learning process until the neural network can solve a specific task.
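The mechanism described above can be illustrated with a single artificial cell. The following is only a minimal sketch of a conventional textbook neuron, not the model developed by the research team: the logistic activation, the gradient-descent update, the learning rate, and the toy task (become active if either input cell is active) are all assumptions chosen for illustration.

```python
import math

def neuron(inputs, weights, bias):
    """Return the activation (between 0 and 1) of one artificial cell."""
    # All incoming signals are combined into one weighted sum.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # logistic activation

def train(samples, steps=5000, lr=0.5):
    """Adjust weights and bias by gradient descent on the squared error."""
    weights, bias = [0.0, 0.0], 0.0  # the influence strengths start undetermined
    for _ in range(steps):
        for inputs, target in samples:
            out = neuron(inputs, weights, bias)
            # Gradient of 0.5 * (out - target)**2 through the logistic function.
            delta = (out - target) * out * (1.0 - out)
            weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
            bias -= lr * delta
    return weights, bias

# Hypothetical toy task: the cell should fire if either input cell is active.
OR_SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(OR_SAMPLES)
predictions = [round(neuron(x, weights, bias)) for x, _ in OR_SAMPLES]
print(predictions)
```

After training, the rounded activations match the targets of the toy task. A real network repeats the same idea across many cells and many more parameters.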

“For years, we have been thinking about what can be learned from nature to improve artificial neural networks,” says Radu Grosu, head of the research unit Cyber-Physical Systems at TU Wien Informatics. “The nematode C. elegans, for example, manages with an amazingly small number of nerve cells, and yet it displays interesting behavioral patterns. It is the efficient and harmonious way in which its nervous system processes information that allows for these patterns.”

“Nature shows us that much can still be improved in artificial intelligence,” says Daniela Rus, director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. “Therefore, our goal was to massively reduce complexity and improve the interpretability of the neural network.”

“Inspired by nature, we have developed new mathematical models for neurons and synapses,” says Thomas Henzinger, president of IST Austria.

“The processing of the signals within the individual cells obeys different mathematical rules than in previous deep learning models,” says Ramin Hasani, a postdoc at the Institute for Computer Engineering at TU Wien Informatics and CSAIL, MIT. “In addition, not every cell is connected to every other cell, which makes the network simpler.”
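The effect of sparse wiring on network size can be sketched with a connection mask. This is an illustration only: the article does not specify how the connections are chosen, so the random fan-out rule, the layer sizes, and the function names below are hypothetical.

```python
import random

def dense_mask(n_in, n_out):
    """Every input cell connects to every output cell."""
    return [[1] * n_out for _ in range(n_in)]

def sparse_mask(n_in, n_out, fan_out, seed=0):
    """Each input cell connects to only `fan_out` randomly chosen output cells."""
    rng = random.Random(seed)
    mask = [[0] * n_out for _ in range(n_in)]
    for row in mask:
        for j in rng.sample(range(n_out), fan_out):
            row[j] = 1
    return mask

def count_connections(mask):
    """Number of connections, i.e. trainable weights, that the mask keeps."""
    return sum(sum(row) for row in mask)

def masked_forward(inputs, weights, mask):
    """Weighted sum per output cell; masked-out connections contribute nothing."""
    n_out = len(mask[0])
    return [
        sum(inputs[i] * weights[i][j] * mask[i][j] for i in range(len(inputs)))
        for j in range(n_out)
    ]

N_IN, N_OUT, FAN_OUT = 12, 6, 2  # hypothetical layer sizes
dense = dense_mask(N_IN, N_OUT)
sparse = sparse_mask(N_IN, N_OUT, FAN_OUT)
print(count_connections(dense), count_connections(sparse))
```

With these assumed sizes, the dense layer has 72 connections and the sparse one 24, so two thirds of the weights never need to be stored or trained.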

The Task: Autonomous Lane Keeping

To test the new ideas, the team chose a particularly important test task: lane-keeping during autonomous driving. The neural network receives a camera image of the road as input and automatically decides whether to steer to the right or left.

“For tasks like autonomous driving, deep learning models with millions of parameters are often used today,” says Mathias Lechner, TU Wien Informatics Distinguished Young Alumnus and Ph.D. student at IST Austria. “However, our new approach makes it possible to reduce the size of the network by two orders of magnitude. Our systems get by with 75,000 trainable parameters.”

Alexander Amini, a Ph.D. student at CSAIL, MIT, explains that the new system consists of two parts: The camera input is first processed by a so-called convolutional neural network, which only perceives the visual data in order to extract structural features from the pixels. This network decides which parts of the camera image are relevant and then passes signals to the crucial part of the network: the control system, which steers the vehicle.

Both subsystems are first trained together by feeding them real-life data: hours and hours of traffic videos of human-controlled driving and information on how to steer the car in a given situation. In the end, the system has learned the correct combination of image and steering direction and can independently handle new situations.
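Learning from recorded human driving in this way is commonly called imitation learning. The sketch below shows the idea on a deliberately tiny problem and is not the team's training setup: the one-number "lane offset" input, the hypothetical expert rule, the linear policy, and all hyperparameters are assumptions made for illustration.

```python
import random

def expert_steering(lane_offset):
    """Hypothetical human driver: steer back toward the lane center."""
    return -0.8 * lane_offset

def collect_demonstrations(n, seed=0):
    """Record (situation, steering) pairs from the hypothetical expert."""
    rng = random.Random(seed)
    offsets = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, expert_steering(x)) for x in offsets]

def fit_policy(demos, epochs=200, lr=0.5):
    """Fit steer = w * offset + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(demos)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in demos) / n
        grad_b = sum((w * x + b - y) for x, y in demos) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

demos = collect_demonstrations(200)
w, b = fit_policy(demos)

# The learned policy can now be queried for a situation it was not shown.
new_offset = 0.5
steering = w * new_offset + b
print(round(steering, 3))
```

The policy never sees the expert's rule directly. It only sees example pairs and still recovers a steering response close to the expert's for the new offset.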

The neural network’s control system (called “Neural Circuit Policy” or NCP), which translates the data from the visual network into a control command, consists of only nineteen cells. Mathias Lechner explains: “These NCPs are three orders of magnitude smaller than would be possible with previous state-of-the-art models.”

Causality and Interpretability

The new deep learning model was tested in a real autonomous vehicle. “Our model allows us to examine exactly what the network focuses its attention on while driving. It focuses on very specific areas of the camera image: the roadside and the horizon. This behavior is highly desirable, and it’s unique in systems based on artificial intelligence,” says Ramin Hasani. “We have also seen that the role of each individual cell in each individual decision can be identified. We can understand the function of cells and explain their behavior. This level of interpretability is impossible in larger deep learning models.”

Robustness

“To test how robust our NCPs are compared to previous deep learning models, we artificially degraded the images and analyzed how well the system copes with image noise,” says Mathias Lechner. “While this has become an unsolvable problem for other deep learning networks, our system is very resistant to artifacts in the input. This property is a direct consequence of the novel model and its architecture.”
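A robustness test of this kind can be sketched as follows: perturb the input with noise of increasing strength and measure how far the output moves from its clean value. The controller here is a hypothetical stand-in, not an NCP, and the Gaussian noise model, the 16-pixel "images", and the noise levels are assumptions for illustration.

```python
import random

def add_noise(image, sigma, rng):
    """Return a copy of `image` with zero-mean Gaussian noise of std `sigma`."""
    return [pixel + rng.gauss(0.0, sigma) for pixel in image]

def toy_controller(image):
    """Hypothetical stand-in for a trained network: right half minus left half."""
    half = len(image) // 2
    left = sum(image[:half]) / half
    right = sum(image[half:]) / (len(image) - half)
    return right - left

def mean_deviation(controller, images, sigma, trials=50, seed=0):
    """Average absolute change in output caused by noise of strength `sigma`."""
    rng = random.Random(seed)
    total = 0.0
    for image in images:
        clean = controller(image)
        for _ in range(trials):
            total += abs(controller(add_noise(image, sigma, rng)) - clean)
    return total / (len(images) * trials)

# Hypothetical 16-pixel image rows standing in for camera frames.
images = [[0.2] * 8 + [0.6] * 8, [0.9] * 8 + [0.1] * 8]
for sigma in (0.0, 0.1, 0.3):
    print(sigma, round(mean_deviation(toy_controller, images, sigma), 3))
```

Running the same measurement on two different models, with identical noise levels and seeds, gives a simple like-for-like comparison: the model whose output deviates less is the more robust one under this particular noise model.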

“Interpretability and robustness are the two key advantages of our new model,” says Ramin Hasani. “But there is more: Our new methods allow us to reduce the duration of the training and create the possibility to implement artificial intelligence in relatively simple systems. Our NCPs make imitation learning possible in a wide range of applications, from automated work in warehouses to robots’ motion control. The new results open up important new perspectives for the AI community: The fundamentals of data processing in biological nervous systems are a great knowledge resource for creating high-performance interpretable artificial intelligence. They are an alternative to the black-box machine learning systems we have known up to now.”

Resources