Well, I feel a bit embarrassed that I can't figure this out on my own, but here goes.
How can I reduce the mantissa (and exponent) bit-width of a floating-point number?
I am training a (convolutional) artificial neural network that I am implementing on an FPGA, and I'd like to study, on CPU (and GPU), how the mantissa (and exponent) bit-width affects testing (and training) accuracy. The next step would be converting my floats to a fixed-point representation (which is what I use on the FPGA) and seeing how that behaves.
Similar studies have already been done by others ([Tong, Rutenbar and Nagle (1998)] and [Leeser and Zhao (2003)]), so there should be a way of doing this, although the 'how' is not yet clear to me.
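To make the question concrete, the operation I imagine is something like the following (a minimal Lua sketch, more on my setup below; `reduce_mantissa` and its parameters are just names I made up): reinterpret the float32 as a 32-bit integer, zero out the low mantissa bits, and reinterpret it back, so the value only carries the precision a narrower mantissa would give.

```lua
local ffi = require("ffi")
local bit = require("bit")

-- A float/uint32 view onto the same 4 bytes, so we can poke at the bits.
local view = ffi.new("union { float f; uint32_t u; }")

-- Keep only `mant_bits` of the 23 mantissa bits of an IEEE-754 single
-- (plain truncation; round-to-nearest would take a little more work).
local function reduce_mantissa(x, mant_bits)
  view.f = x
  local drop = 23 - mant_bits                      -- low mantissa bits to zero out
  local mask = bit.bnot(bit.lshift(1, drop) - 1)   -- e.g. mant_bits = 10 -> 0xFFFFE000
  view.u = bit.band(view.u, mask)
  -- Shrinking the *exponent* would be a clamping problem instead: values whose
  -- exponent falls outside the narrower range get flushed to zero or saturated.
  return view.f
end

print(math.pi, reduce_mantissa(math.pi, 10))
```

Is this the right idea, or is there a more standard way to do it?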
Last point: I'm programming in Lua, but I can easily pull in C code through LuaJIT's ffi, so a C-based solution would work for me as well.
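For the later fixed-point step, I picture something along these lines (again only a sketch of what I mean; `to_fixed` and the Q-format parameters are arbitrary choices of mine): scale, round to the nearest integer code, saturate to the representable range, and scale back.

```lua
-- Simulate a signed fixed-point format with `int_bits` integer bits (sign included)
-- and `frac_bits` fractional bits: scale, round, saturate, scale back.
local function to_fixed(x, int_bits, frac_bits)
  local scale = 2 ^ frac_bits
  local max_q =  2 ^ (int_bits + frac_bits - 1) - 1   -- largest integer code
  local min_q = -2 ^ (int_bits + frac_bits - 1)       -- smallest integer code
  local q = math.floor(x * scale + 0.5)               -- round to nearest code
  if q > max_q then q = max_q end
  if q < min_q then q = min_q end
  return q / scale                                     -- back to a real number for simulation
end

print(to_fixed(1.2345, 4, 12))   -- e.g. Q4.12, 16 bits in total
```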