Questions tagged [single-precision]
57 questions
0
votes
1 answer
Writing floating point numbers in single precision to file in R?
I understand that R doesn't have a floating point format in single precision. However, I'm writing a very large number of data points from R to file, and I'd like to store them as single precision floats, rather than double precision. I don't need…

bayesian
- 65
- 8
0
votes
0 answers
MIPS single to double floating point precision
I have this program in MIPS and I want to change it to double precision. It looks like the single and double precision floating-point instructions are the same, except that the suffix is .d instead of .s. If anyone has comments or help, it would help me out a…

jmurphy1267
- 65
- 1
- 9
0
votes
1 answer
Binary operations on Numpy scalars automatically up-casts to float64
I want to do binary operations (like add and multiply) between np.float32 and builtin Python int and float and get a np.float32 as the return type. However, it gets automatically up-casted to a np.float64.
Example code:
>>> a = np.float32(5)
>>>…

jmd_dk
- 12,125
- 9
- 63
- 94
0
votes
1 answer
integer to single precision conversion in python
I can't figure out how to convert an integer to single precision using Python. I already tried to use numpy, but the float32() function doesn't help.
Example:
984761996 -> 1.360135E-3
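The example value suggests the goal is to reinterpret the integer's 32 bits as an IEEE 754 single-precision number, not to convert its numeric value. That reading is an assumption based on the sample pair above. A minimal sketch using only the standard library's struct module (the helper name int_bits_to_float32 is hypothetical):

```python
import struct

def int_bits_to_float32(n: int) -> float:
    """Reinterpret a 32-bit unsigned integer's bit pattern as a float32."""
    # Pack as a 4-byte unsigned int, then unpack the same bytes as a float.
    # Using the same byte order for both steps keeps the bit pattern intact.
    return struct.unpack("<f", struct.pack("<I", n))[0]

value = int_bits_to_float32(984761996)
print(f"{value:.6E}")
```

By contrast, np.float32(984761996) converts the numeric value itself, which is probably why it did not produce the expected result.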
0
votes
1 answer
MIPS - How to Convert a set of Integers into Single-Precision Floats
I'm really having a difficult time figuring out how to approach this problem. I get that I want to take the binary representation of both the integer and fraction, combine them for the mantissa, and assign the sign bit to the beginning, but I don't…

Nimbus
- 1
- 1
- 2
0
votes
2 answers
Accuracy of c_k = a + ( N + k ) * b
a, b are 32 bit floating point values, N is a 32 bit integer and k can take on values 0, 1, 2, ... M. Need to calculate c_k = a + ( N + k ) * b; The operations need to be 32 bit operations (not double precision). The concern is accuracy -- which…

user1823664
- 1,071
- 9
- 16
-1
votes
3 answers
Byte[] to float conversion
float b = 0.995f;
byte[] a = BitConverter.GetBytes(b);
Now my byte[] values are 82 184 126 63, i.e.,
a[0] = 82, a[1] = 184, a[2] = 126, and a[3] = 63.
I want to convert the bytes above back to a float, so I used BitConverter.ToSingle
float b =…

Indrajith Sukumar
- 11
- 3
-1
votes
3 answers
How many different values can be encoded in IEEE 754 32-bit base-2 floating-point system?
The wikipedia page states that
an IEEE 754 32-bit base-2 floating-point variable has a maximum value of (2 − 2^−23) × 2^127 ≈ 3.4028235 × 10^38
In that number, are +∞, −∞ and NaN included?
What is that 2 in "(2 − 2^−23)"?
Why 127 in 2^127?
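The quoted maximum can be checked against the bit layout directly. The sketch below assumes the standard binary32 layout (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits); the helper name float32_from_bits is hypothetical. It shows that the largest finite pattern has exponent field 254, so the exponent is 254 − 127 = 127, and that its significand 1.11…1 in binary equals 2 − 2^−23. Exponent field 255 is reserved for the infinities and NaNs, so they are not counted in that maximum.

```python
import math
import struct

def float32_from_bits(pattern: int) -> float:
    """Decode a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", pattern))[0]

# Largest finite value: sign 0, exponent field 254, all 23 fraction bits set.
max_from_bits = float32_from_bits(0x7F7FFFFF)
# Significand 1.111...1 (binary) equals 2 - 2**-23; exponent is 254 - 127 = 127.
max_from_formula = (2 - 2.0**-23) * 2.0**127

# Exponent field 255 is reserved: zero fraction is infinity, nonzero is NaN.
pos_inf = float32_from_bits(0x7F800000)
a_nan = float32_from_bits(0x7FC00000)

print(max_from_bits == max_from_formula)
print(math.isinf(pos_inf), math.isnan(a_nan))
```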

Joshua Leung
- 2,219
- 7
- 29
- 52
-1
votes
2 answers
Decimal to IEEE 754 single-precision code using C
We have an assignment in class to convert from decimal to single precision using C and I'm completely lost.
This is the assignment:
The last part of this lab involves coding a short C algorithm. Every
student must create a program that gets a…

Nor Alexanian
- 9
- 7
-1
votes
1 answer
Reinterpret bytes as float in C (IEEE 754 single-precision binary)
I want to reinterpret 4 bytes as IEEE 754 single-precision binary in C.
To obtain the bytes that represent float, I used:
num = *(uint32_t*)&MyFloatNumber;
aux[0] = num & 0xFF;
aux[1] = (num >> 8) & 0xFF;
aux[2] = (num >> 16) & 0xFF;
aux[3] = (num…

Guilherme
- 35
- 1
- 8
-1
votes
1 answer
How is single precision floating point number subtraction done?
Here is the example (I have converted them to decimal in advance).
A is 01000001000010000000000000000000 in base 2 (in decimal 8.5)
B is 01000000000100000000000000000000 in base 2 (in decimal 2.25)
The ((+A)-(+B)) should be 6.25 in decimal.
Normalizing A and B and…
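As a sanity check on the operands, the two bit patterns can be decoded and subtracted with the standard library. This is only a sketch: the helper names are hypothetical, and each pattern is written as sign, exponent, and fraction pieces purely for readability.

```python
import struct

def bits_to_float32(bits: str) -> float:
    """Decode a string of 32 binary digits as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", int(bits, 2)))[0]

def float32_to_bits(x: float) -> str:
    """Encode a value as single precision and return its 32 binary digits."""
    return format(struct.unpack(">I", struct.pack(">f", x))[0], "032b")

# sign | exponent (biased by 127) | fraction, padded to 23 bits
a = bits_to_float32("0" + "10000010" + "0001" + "0" * 19)  # 1.0001b * 2**3
b = bits_to_float32("0" + "10000000" + "001" + "0" * 20)   # 1.001b  * 2**1

diff = a - b
print(a, b, diff)
print(float32_to_bits(diff))
```

All three values here are exactly representable in single precision, so the subtraction involves no rounding.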

Sanone
- 11
- 2
-1
votes
3 answers
Add Two 32 bit Floating Point Numbers with AVR-Assembler
I'm trying to use AVR Studio to add two 32-bit floating point numbers together. I know that I will need to store the 32-bit number in 4 separate 8-bit registers. I'll then need to add the registers together using the carry flag. This is what I have so…

Supercreature
- 441
- 6
- 25