Córais Research Group

ACACES 2017 - Chapter: Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks

James Garland, David Gregg
HiPEAC, the European Network of Excellence on High Performance and Embedded Architecture and Compilation, 2017

• Convolutional neural networks (CNNs) are highly successful deep machine learning technologies.
• CNNs require large amounts of processing capacity and memory bandwidth.
• Proposed hardware accelerators typically contain large numbers of multiply-accumulate (MAC) units.
• One CNN accelerator approach is “weight sharing”:
  – The full range of trained CNN weight values is stored in bins;
  – An index to the bin is used instead of the original weight value, reducing data sizes and memory traffic.
• We propose a novel multiply-accumulate (MAC) circuit that exploits binning in weight-sharing CNNs.
• Rather than computing the MAC directly, we:
  – Count the frequency of each weight and place the count in a bin;
  – Compute the accumulated value in a subsequent multiply phase.
• The proposal allows the hardware multipliers in the MAC circuit to be replaced with adders and selection logic.
• This results in fewer gates, smaller logic, and reduced power, with a slight latency increase, in an application-specific integrated circuit (ASIC).
• It results in fewer cells and reduced power when implemented in resource-constrained field-programmable gate arrays (FPGAs).
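The two-phase idea above can be sketched in software. This is a minimal illustrative Python model (an assumption for exposition, not the authors' hardware design): inputs are accumulated into per-bin sums using only additions and index selection, and each bin's shared weight is multiplied just once at the end, giving the same result as a conventional per-input MAC.

```python
def direct_mac(weights, inputs):
    # Conventional MAC: one multiply per input element.
    return sum(w * x for w, x in zip(weights, inputs))

def binned_mac(bin_values, weight_indices, inputs):
    # Phase 1 (accumulate): route each input into the sum for its
    # weight bin -- only adders and selection logic, no multipliers.
    sums = [0] * len(bin_values)
    for idx, x in zip(weight_indices, inputs):
        sums[idx] += x
    # Phase 2 (multiply): one multiply per bin, not per input.
    return sum(v * s for v, s in zip(bin_values, sums))

# Hypothetical example: 2 weight bins shared across 5 inputs.
bins = [2, 5]                  # trained shared weight values
idxs = [0, 1, 0, 1, 0]         # per-input bin indices (replace raw weights)
xs = [1, 2, 3, 4, 5]           # input activations
```

With many inputs mapped to few bins, the multiply count drops from one per input to one per bin, which is the source of the gate and power savings reported above.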

(poster)