Dynamic Sharing in Multi-accelerators of Neural Networks on an FPGA Edge Device
Hsin-Yu Ting, Tootiya Giyahchi, Ardalan Amiri Sani and Eli Bozorgzadeh
University of California, Irvine, USA
Abstract
Edge computing can potentially provide abundant processing resources for compute-intensive applications while bringing services close to end devices. With increasing demand for computing acceleration at the edge, FPGAs have been deployed to provide custom deep neural network (DNN) accelerators. This paper explores a DNN accelerator sharing system on an edge FPGA device that serves various DNN applications from multiple end devices simultaneously. The proposed SharedDNN/PlanAhead policy exploits the regularity among requests for various DNN accelerators, determining which accelerator to allocate to each request and in what order to respond to requests so as to achieve maximum responsiveness for a queue of acceleration requests. Our results show a performance gain of up to 2.20x overall and improved utilization, reducing DNN library usage by up to 27%, while staying within the requests' requirements and resource constraints.