When you simply store a float somewhere in memory, the bytes in the floating point number can be laid out in little-endian or big-endian fashion. If you need to communicate this value to some other computer, then the layout should be converted to a standard one so that the other party can read it properly.
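To see the layout on a given machine, the raw bytes of a float can be copied out and printed. The following is only a minimal sketch: the helper names `float_bytes` and `print_float_bytes` are made up for illustration, and it assumes a 4-byte IEEE 754 `float`.

```c
#include <stdio.h>
#include <string.h>

/* Copy the in-memory bytes of a float into out[0..3], lowest address first.
   Assumption: sizeof(float) == 4. */
void float_bytes(float f, unsigned char out[4])
{
    memcpy(out, &f, sizeof f);
}

/* Print those bytes in hex, lowest address first. */
void print_float_bytes(float f)
{
    unsigned char b[4];
    int i;

    float_bytes(f, b);
    for (i = 0; i < 4; i++)
        printf("%s%02x", i ? " " : "", b[i]);
    printf("\n");
}
```

The value 1.0f has the bit pattern 0x3F800000, so `print_float_bytes(1.0f)` should show `00 00 80 3f` on a little-endian machine and `3f 80 00 00` on a big-endian one. Any other order would indicate one of the unusual layouts discussed next.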
Since the C platform lacks the necessary tools, I have to make my own. First of all, little-endian and big-endian are by far the most likely layouts, but they may not be the only options. For example, a machine could store each two-byte sequence in reverse, creating a NUXI problem. I just can't know without trying it out, because the standard libraries don't give me any information.
My end goal is to read and write floats in little-endian format. The x64 family of processors already does this. I don't know how ARM-based tablet and phone processors do it.
Ideally, the tests should be run before compilation and the results should be used to activate the relevant pieces of code. However, running a test program on the target might be too difficult. Since phones and tablets are pretty much walled-garden environments, you can't simply upload a.out and run it. You need to package it as an application, adapt your code to run within the weird environment and get the results over the network somehow. This isn't going to be worth it for a small test like this. However, a framework which simply downloads some code from a secure server and runs it could be a good project.
In any case, I'm going to go with a run-time test. The end result will work like this: First, I'll run the tests to figure out the floating point layout of the CPU. Based on the results, I'll compute two permutations, one for converting from native to little-endian and one for the other direction.
When I want to write a float, I'll write it in native format to a properly aligned area. From there, I'll permute it out into the output buffer. Reading will be done in a similar fashion.
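The write and read steps could look roughly like the sketch below. The function names `write_float_le` and `read_float_le` are hypothetical. It assumes that `toNative[i]` holds the native index of little-endian byte i and that `fromNative` is its inverse, and it uses `memcpy` into a small temporary instead of a dedicated aligned area, which is a substitution on my part.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed conventions:
   toNative[i]   = native index of little-endian byte i
   fromNative[n] = little-endian index of native byte n */

/* Write v into out[0..3] in little-endian order. */
void write_float_le(float v, uint8_t *out, const uint8_t *fromNative)
{
    uint8_t tmp[sizeof(float)];
    size_t n;

    memcpy(tmp, &v, sizeof v);           /* native byte order */
    for (n = 0; n < sizeof v; n++)
        out[fromNative[n]] = tmp[n];     /* permute into the output buffer */
}

/* Read a little-endian float from in[0..3]. */
float read_float_le(const uint8_t *in, const uint8_t *toNative)
{
    uint8_t tmp[sizeof(float)];
    float v;
    size_t i;

    for (i = 0; i < sizeof v; i++)
        tmp[toNative[i]] = in[i];        /* permute back to native order */
    memcpy(&v, tmp, sizeof v);
    return v;
}
```

On a little-endian machine both tables are the identity, so the loops reduce to plain copies. The output buffer needs no particular alignment because it is only ever accessed byte by byte.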
The IEEE 754 formats, from the highest bit to the lowest:

    32-bit: SIGN(bit 31)-EXPONENT(bits 30-23)-FRACTION(bits 22-0)
    64-bit: SIGN(bit 63)-EXPONENT(bits 62-52)-FRACTION(bits 51-0)

The CPU could scramble the bits around, storing the third bit of the fraction in the highest byte, putting the sign in the first byte etc. However, I doubt that any sane person would do such a thing. So, I'm going to assume that the exponent and fraction fields are each stored contiguously in memory. For a 32-bit value, I should have the following four bytes:
    bit  76543210
    A:   SEEEEEEE
    B:   EFFFFFFF
    C:   FFFFFFFF
    D:   FFFFFFFF

The order of these bytes may differ from CPU to CPU, but the contents of these should be the same. Otherwise, the algorithm I'm implementing here would fail spectacularly.
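To make the field boundaries concrete, here is a small sketch that splits a 32-bit pattern into its three fields. It works on the value as an integer, so it says nothing about byte order in memory. The type `f32_fields` and the function `split_f32` are names I made up for illustration.

```c
#include <stdint.h>

/* The three fields of a 32-bit floating point pattern. */
typedef struct {
    unsigned sign;       /* bit 31 */
    unsigned exponent;   /* bits 30-23 */
    uint32_t fraction;   /* bits 22-0 */
} f32_fields;

/* Split a 32-bit pattern into sign, exponent and fraction. */
f32_fields split_f32(uint32_t bits)
{
    f32_fields r;

    r.sign     = (unsigned)((bits >> 31) & 1u);
    r.exponent = (unsigned)((bits >> 23) & 0xFFu);
    r.fraction = bits & 0x7FFFFFu;
    return r;
}
```

For example, 1.0f is the pattern 0x3F800000, which splits into sign 0, exponent 127 and fraction 0. In terms of the bytes above, A is 0x3F and B is 0x80: the lowest exponent bit lands in the top bit of B.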
In order to find which byte goes where, I shall make special floating point numbers and then check which bytes have the 1s. Here is a template for 32 bit floating point numbers:
         SEEEEEEE EFFFFFFF FFFFFFFF FFFFFFFF
    bit  76543210 76543210 76543210 76543210

F is fraction, E is exponent and S is sign. For 32-bit numbers, finding out the order of the three bytes that contain fraction bits is sufficient to figure out the overall permutation, because the only remaining byte is the topmost one. For this purpose, I'll use denormalized floating point numbers. These numbers have 0 as exponent, so the only set bits are in the fraction.
         SEEEEEEE EFFFFFFF FFFFFFFF FFFFFFFF
    bit  76543210 76543210 76543210 76543210
         -----------------------------------
         00000000 00000000 00000000 00000001  = 1/2^149 = A
         00000000 00000000 00000001 00000000  = A * 256 = B
         00000000 00000001 00000000 00000000  = B * 256

    sign bit check
         10000000 00000000 00000000 00000000  = -0.0

Since these are denormalized numbers, they are awkward to write as exact decimal literals, so the program computes them at run time by repeated halving and multiplication. The last floating point number is used as a check to verify the position of the sign bit. gcc-4.8.3 has no problems parsing the negative zero "-0.0" into the proper encoding.
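Before trusting a probe, it may be worth checking that it really has exactly one non-zero byte, which is what the table above assumes. The sketch below does that for the 32-bit case. The names `probe` and `single_byte_index` are hypothetical, and the code assumes that denormals are not flushed to zero on the target.

```c
#include <string.h>

/* Build the k-th probe: 2^-149 multiplied by 256, k times (k = 0, 1, 2). */
float probe(int k)
{
    float f = 1.0f;
    int i;

    for (i = 0; i < 149; i++)
        f /= 2;                 /* ends at the smallest positive denormal */
    for (i = 0; i < k; i++)
        f *= 256;               /* move the set bit up by one byte */
    return f;
}

/* Return the index of the single byte equal to `want` in the stored float.
   Return -1 unless exactly one byte matches and all other bytes are zero. */
int single_byte_index(float f, unsigned char want)
{
    unsigned char b[sizeof(float)];
    int i, found = -1;

    memcpy(b, &f, sizeof f);
    for (i = 0; i < (int)sizeof f; i++) {
        if (b[i] == want) {
            if (found >= 0)
                return -1;      /* more than one match */
            found = i;
        } else if (b[i] != 0) {
            return -1;          /* unexpected extra bits */
        }
    }
    return found;
}
```

If `single_byte_index(probe(k), 1)` returns -1 for some k, or `single_byte_index(-0.0f, 0x80)` returns -1, then the target does not store floats the way this article assumes and the permutation approach should not be used.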
The 64-bit case works the same way. I just play with the fraction part to find out the encoding. The only byte which doesn't have a fraction bit is the topmost byte.
    SEEEEEEE EEEEFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    76543210 76543210 76543210 76543210 76543210 76543210 76543210 76543210
    -----------------------------------------------------------------------
    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001  = 1 / 2^1074 = A
    00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000  = A * 256 = B
    00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000
    00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000
    00000000 00000000 00000001 00000000 00000000 00000000 00000000 00000000
    00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000

Now, in order to find the permutations, I will store these floats somewhere in memory and then search that memory for a set byte. For the first 64-bit example above, the set byte corresponds to the lowest byte. If the index of this set byte is T, then little-endian byte 0 should go to position T (both zero-indexed) when converting the float to native layout:
    T= find_byte(1)
    toNative[0]= T
    fromNative[T]= 0
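The second assignment is what makes `fromNative` the inverse of `toNative`. Here is a sketch of that step on its own, together with a sanity check that the detected table really is a permutation. The names `invert_perm` and `is_permutation` are made up for illustration, and the check is my addition rather than part of the original scheme.

```c
#include <stdint.h>

/* Build fromNative as the inverse of toNative, for n bytes. */
void invert_perm(const uint8_t *toNative, uint8_t *fromNative, int n)
{
    int i;

    for (i = 0; i < n; i++)
        fromNative[toNative[i]] = (uint8_t)i;
}

/* Return 1 if p[0..n-1] contains each of 0..n-1 exactly once (n <= 8). */
int is_permutation(const uint8_t *p, int n)
{
    int seen[8] = {0};
    int i;

    for (i = 0; i < n; i++) {
        if (p[i] >= n || seen[p[i]])
            return 0;           /* out of range or duplicate */
        seen[p[i]] = 1;
    }
    return 1;
}
```

As a made-up example, a `toNative` table of {1, 2, 3, 0} inverts to {3, 0, 1, 2}. Running `is_permutation` on `toNative` before inverting it would catch a probe that was not found, since a failed search would typically produce a duplicate index.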
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    uint8_t spc[32];   /* scratch space with room for alignment */
    uint8_t *ali16;    /* 16-byte aligned pointer into spc */

    /* Point ali16 at the first 16-byte aligned address inside spc. */
    void align(void)
    {
      uintptr_t T;
      T= (uintptr_t) spc;
      if (T%16) { ali16= spc + (16-T%16); }
           else { ali16= spc; }
    }

    /* Print one byte as bits, highest bit first; '.' stands for 0. */
    void prt_byte(uint8_t V)
    {
      int i;
      for(i=7;i>=0;i--)
        printf("%c", (V&(1<<i)) ? '1' : '.');
    }

    /* Print the first N bytes of the aligned area, lowest address first. */
    void prt_bytes(int N)
    {
      int i;
      for(i=0;i<N;i++)
      {
        if (i) printf(" ");
        prt_byte(ali16[i]);
      }
      printf("\n");
    }

    void prt_flo(float *V)  { *(float*) ali16= *V; prt_bytes(4); }
    void prt_dou(double *V) { *(double*) ali16= *V; prt_bytes(8); }

    /* Index of the first byte equal to V among the first nb bytes of the
       aligned area. Returns 0 when nothing matches, so a failed search
       looks the same as a match at index 0. */
    static int find_byte(int nb, uint8_t V)
    {
      int i;
      for(i=0;i<nb;i++)
        if (ali16[i]==V) return i;
      return 0;
    }

    /* Find the byte permutations for float. *F is used as scratch. */
    void flo_layout(int dbg,float *F, uint8_t *fromNative,uint8_t *toNative)
    {
      const char *msg[]= { " 1 / 2^149 = A ", " A * 256 = B ", " B * 256 = C ", };
      int i;
      *F= 1;
      if (dbg) printf("finding float layout\n");
      for(i=0;i<149;i++) *F/= 2;   /* smallest denormal: only bit 0 is set */
      for(i=0;i<3;i++)             /* the three bytes holding fraction bits */
      {
        *(float*) ali16= *F;
        toNative[i]= find_byte(4, 1);
        if (dbg) { printf("%s\n", msg[i]); prt_flo(F); }
        *F *= 256;                 /* move the set bit up by one byte */
      }
      *F= -0.0;                    /* only the sign bit is set */
      *(float*) ali16= *F;
      if (dbg) { printf("negative zero= %g\n", *F); prt_flo(F); }
      toNative[i]= find_byte(4, 0x80);
      for(i=0;i<4;i++) fromNative[toNative[i]]= i;   /* inverse permutation */
      if (dbg)
      {
        printf("write: ");
        for(i=0;i<4;i++) printf(" %d", fromNative[i]);
        printf("\n");
        printf(" read: ");
        for(i=0;i<4;i++) printf(" %d", toNative[i]);
        printf("\n");
      }
    }

    /* Same as flo_layout, for double: seven probes plus the sign byte. */
    void dou_layout(int dbg,double *F, uint8_t *fromNative,uint8_t *toNative)
    {
      const char *msg[]= { " 1 / 2^1074 = A ", " A * 256 = B ", " B * 256 = C ",
                           " C * 256 = D ", " D * 256 = E ", " E * 256 = F ",
                           " F * 256 = G ", };
      int i;
      *F= 1;
      if (dbg) printf("finding double layout\n");
      for(i=0;i<1074;i++) *F/= 2;  /* smallest denormal: only bit 0 is set */
      for(i=0;i<7;i++)
      {
        *(double*) ali16= *F;
        toNative[i]= find_byte(8, 1);
        if (dbg) { printf("%s\n", msg[i]); prt_dou(F); }
        *F *= 256;
      }
      *F= -0.0;
      *(double*) ali16= *F;
      if (dbg) { printf("negative zero= %g\n", *F); prt_dou(F); }
      toNative[i]= find_byte(8, 0x80);
      for(i=0;i<8;i++) fromNative[toNative[i]]= i;
      if (dbg)
      {
        printf("write: ");
        for(i=0;i<8;i++) printf(" %d", fromNative[i]);
        printf("\n");
        printf(" read: ");
        for(i=0;i<8;i++) printf(" %d", toNative[i]);
        printf("\n");
      }
    }

    int main(void)
    {
      float F;
      double D;
      uint8_t Pwrite[8], Pread[8];
      align();
      flo_layout(1,&F, Pwrite, Pread);
      dou_layout(1,&D, Pwrite, Pread);
      return 0;
    }

Unsurprisingly, on an x64 machine the code results in trivial permutations:
    finding float layout
     1 / 2^149 = A
    .......1 ........ ........ ........
     A * 256 = B
    ........ .......1 ........ ........
     B * 256 = C
    ........ ........ .......1 ........
    negative zero= -0
    ........ ........ ........ 1.......
    write:  0 1 2 3
     read:  0 1 2 3
    finding double layout
     1 / 2^1074 = A
    .......1 ........ ........ ........ ........ ........ ........ ........
     A * 256 = B
    ........ .......1 ........ ........ ........ ........ ........ ........
     B * 256 = C
    ........ ........ .......1 ........ ........ ........ ........ ........
     C * 256 = D
    ........ ........ ........ .......1 ........ ........ ........ ........
     D * 256 = E
    ........ ........ ........ ........ .......1 ........ ........ ........
     E * 256 = F
    ........ ........ ........ ........ ........ .......1 ........ ........
     F * 256 = G
    ........ ........ ........ ........ ........ ........ .......1 ........
    negative zero= -0
    ........ ........ ........ ........ ........ ........ ........ 1.......
    write:  0 1 2 3 4 5 6 7
     read:  0 1 2 3 4 5 6 7