Knowledge Based Neural Network Compression: Context-Aware Model Abstractions 01/11/2021 - 31/10/2023

Abstract

In state-of-the-practice IoT platforms, complex decisions based on sensor information are made in a centralized data center. Each sensor sends its information to the data center, after which a decision is sent to the actuators. In certain applications, the latency imposed by this communication can lead to problems. To avoid this, decisions should be made on the edge devices themselves. This is what the research track on resource- and context-aware AI is about. We want to develop edge inference systems that dynamically reconfigure to adapt to changing environments and resource constraints. This work focuses on compressing neural networks. We want to extend the current state of the art in neural network compression by incorporating a knowledge-based pruning method. By knowledge-based we mean that we first determine where specific task-related knowledge is located in the network and then use this to guide the pruning. This way we can make the networks adjustable to environmental characteristics and hardware constraints. For some tasks in a specific environment, it might be favorable to reduce the accuracy of certain classes in exchange for resource savings. For example, the classification of certain traffic sign types can be less accurate on highways than in a city center. Based on these requirements we want to prune selectively by locating specific task-related concepts. By removing them we expect to achieve higher compression ratios than the state of the art.
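The idea of letting the environment decide what gets pruned can be illustrated with a minimal sketch. It is not the project's actual method: the function name `select_units_to_prune`, the per-unit relevance scores, and the class weights are all hypothetical, and the sketch assumes that some earlier analysis step has already estimated how much each hidden unit contributes to each class.

```python
def select_units_to_prune(relevance, class_weights, prune_fraction):
    """Return the indices of the hidden units to remove.

    relevance[u][c]  : assumed score of how much unit u contributes to class c
                       (hypothetical; how to obtain it is the research question).
    class_weights[c] : how much class c matters in the current environment.
    prune_fraction   : fraction of units to remove, between 0 and 1.
    """
    # Environment-specific importance of each unit: relevance to each class,
    # weighted by how much that class matters here.
    scores = [
        sum(w * r for w, r in zip(class_weights, unit_relevance))
        for unit_relevance in relevance
    ]
    n_prune = int(len(scores) * prune_fraction)
    # Rank units from least to most important and drop the lowest ones.
    ranked = sorted(range(len(scores)), key=lambda u: scores[u])
    return sorted(ranked[:n_prune])


# Toy example: 4 hidden units, 2 classes. Class 1 stands for a traffic sign
# type that is assumed to matter little on highways.
relevance = [
    [0.9, 0.0],
    [0.2, 0.9],
    [0.6, 0.1],
    [0.0, 0.8],
]
highway_weights = [1.0, 0.1]  # class 1 is down-weighted on highways
city_weights = [1.0, 1.0]     # both classes matter equally in the city

pruned_highway = select_units_to_prune(relevance, highway_weights, 0.5)
pruned_city = select_units_to_prune(relevance, city_weights, 0.5)
print(pruned_highway, pruned_city)
```

With these toy numbers the highway configuration removes the two units that mainly serve class 1, while the city configuration keeps one of them and removes a different unit instead. The same network thus yields different compressed variants depending on the context.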

Researcher(s)

Research team(s)