We propose a scheme for reduced-precision representation of floating-point data on a continuum between the IEEE-754 floating-point types. Our scheme enables the use of lower-precision storage formats for floating-point numbers, reducing both storage space requirements and data transfer volume.
We describe how our scheme can be accelerated using the existing hardware vector units of a general-purpose processor. Exploiting native vector hardware allows us to support reduced-precision floating point with very low overhead.
Reducing data transfer volume is particularly important for embedded systems, because data transfer is slow. Our results demonstrate that it can also be very effective on general-purpose processors: in several benchmark applications, the performance gain from transferring much less data is enough to yield an overall speedup, even with the extra computational overhead of converting data to and from the storage format.
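
To make the idea of a storage format between the IEEE-754 types concrete, the following is a minimal scalar sketch, not the vectorized scheme described in this paper. It assumes one simple possibility: storing only the most significant bytes of each 64-bit value and zero-filling the dropped bytes on load, which rounds toward zero. The function names `pack_truncated` and `unpack_truncated` and the parameter `nbytes` are hypothetical and chosen for illustration only.

```python
import struct


def pack_truncated(values, nbytes):
    """Store each float64 using only its `nbytes` most significant bytes.

    Illustrative assumption: precision is reduced by truncation.
    At least 2 bytes are needed to keep the sign and the full
    11-bit exponent of a binary64 value.
    """
    if not 2 <= nbytes <= 8:
        raise ValueError("nbytes must be between 2 and 8")
    out = bytearray()
    for v in values:
        # Big-endian order places sign and exponent first, so a byte
        # prefix keeps them and drops only low-order mantissa bits.
        out += struct.pack(">d", v)[:nbytes]
    return bytes(out)


def unpack_truncated(buf, nbytes):
    """Expand the reduced-precision storage format back to float64."""
    if not 2 <= nbytes <= 8:
        raise ValueError("nbytes must be between 2 and 8")
    pad = b"\x00" * (8 - nbytes)  # dropped mantissa bytes become zero
    return [
        struct.unpack(">d", buf[i:i + nbytes] + pad)[0]
        for i in range(0, len(buf), nbytes)
    ]


if __name__ == "__main__":
    data = [1.0, 3.141592653589793, -2.5e10]
    stored = pack_truncated(data, 5)  # 40 bits per value instead of 64
    print(len(stored), unpack_truncated(stored, 5))
```

With `nbytes = 5`, each value occupies 40 bits: the sign, the 11-bit exponent, and the top 28 mantissa bits. Under this truncation assumption the relative error of each restored value is below 2^-28, and the stored data is 5/8 the size of the original.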