Vectorization of Multibyte Floating Point Data Formats


Andrew Anderson, David Gregg

The 25th International Conference on Parallel Architecture and Compilation Techniques, 2016

We propose a scheme for reduced-precision representation of floating point data on a continuum between IEEE-754 floating point types. Our scheme enables the use of lower precision storage formats for floating point numbers, reducing not just storage space requirements but also data transfer volume.
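One way to picture a storage format that sits between the standard types is sketched below. It is only an illustration and makes its own assumption: that a reduced-precision value is obtained by keeping the most significant bytes of an IEEE-754 double and dropping the rest. The abstract does not spell out the encoding, and the function names `pack_truncated` and `unpack_truncated` are hypothetical.

```python
import struct


def pack_truncated(value: float, nbytes: int) -> bytes:
    """Keep only the `nbytes` most significant bytes of a float64.

    Assumption for illustration: precision is reduced by plain truncation
    at byte granularity; the paper's actual encoding may differ.
    """
    if not 2 <= nbytes <= 8:
        # Two bytes are the minimum that still hold the sign bit
        # and the full 11-bit exponent of a double.
        raise ValueError("nbytes must be between 2 and 8")
    # Big-endian packing places sign, exponent, and high mantissa bits first.
    return struct.pack(">d", value)[:nbytes]


def unpack_truncated(stored: bytes) -> float:
    """Expand a truncated value to float64 by zero-filling the low bytes."""
    return struct.unpack(">d", stored.ljust(8, b"\x00"))[0]


if __name__ == "__main__":
    x = 3.141592653589793
    for n in (8, 6, 5, 3):
        y = unpack_truncated(pack_truncated(x, n))
        print(n, "bytes ->", repr(y), "abs error", abs(x - y))
```

Under this assumption, a 5-byte value keeps 28 of the 52 mantissa bits and occupies 37.5% less space than a full double. Truncation rounds toward zero, so a production format would also need to consider rounding and special values such as NaN.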

We describe how our scheme can be accelerated using existing hardware vector units on a general-purpose processor. Exploiting native vector hardware allows us to support reduced precision floating point with very low overhead.

Reducing data transfer volume is very important for embedded systems, because data transfer is slow. Our results demonstrate that it can also be very effective on general-purpose processors; in several benchmark applications, the performance gain from transferring much less data is enough to yield an overall speedup, even with the extra computational overhead of converting data to and from the storage format.
