
I have created a TensorFlow Lite .tflite model which I plan to use on a microcontroller. However, this file must be converted to a C source file, i.e., a TensorFlow Lite for Microcontrollers model. The TensorFlow documentation provides a simple way to convert it to a C array with the Unix command xxd. I am using Windows 10 and do not have access to that command, and no alternative Windows methods are documented. After searching Super User, I saw that xxd for Windows now exists. I downloaded it and ran it on my .tflite model. The results were different from the hello world example.

First, the hello world example model.h file has a comment that says it was "Automatically created from a TensorFlow Lite flatbuffer using the command: xxd -i model.tflite > model.cc". When I ran the command, model.h was not "automatically created".

Second, the model.cc file from the hello world example and the model.cc file that I generated are quite different, and I'm not sure how to interpret this (I'm not referring to the differences in the actual array). Again, the example model.cc file states that it was "automatically created" using the xxd command. Line 28 in the example is `alignas(8) const unsigned char g_model[] = {` and line 237 is `const int g_model_len = 2488;`. In comparison, the equivalent lines in the file I generated are `unsigned char _________g_model[] = {` and `unsigned int _________g_model_len = 4009981;`.

I am not a C expert, so I am not sure how to interpret the differences between the files or whether I have generated the model.cc file incorrectly. I would greatly appreciate any insight or guidance on how to properly generate both the model.h and model.cc files from the original model.tflite file.

pppery
  • This method apparently just replaces loading the model from a file system with an initialized variable that contains the model. Are you sure that your microcontroller can _execute_ the TensorFlow software? Did you successfully compile it for your target? -- Anyway, of course your model should give you different output than "hello world". Did you compare the result when you convert "hello world"? -- Concerning the differing identifiers, the versions of xxd used for the example and by you are apparently different, or were called with different options. – the busybee Aug 10 '22 at 07:15
  • Hello @thebusybee - Yes my microcontroller can execute TensorFlow. I did try to convert [hello_world.tflite](https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/micro/examples/hello_world/hello_world.tflite). My resulting model.cc file doesn't match the example (including the array). However they don't have a current version of model.cc in github (my links above are from a historical snapshot). So there could be a difference in the models in the example given the difference in time, but I can't confirm. Still not sure how to properly generate model.h and model.cc. – leone.kutch Aug 11 '22 at 05:21

1 Answer


After doing some experiments, I think this is why you are getting differences:

  1. xxd replaces every character in the path to the input file that is not a letter or digit with an underscore ('_'). Apparently you called xxd with a path for the input file that has 9 such leading characters, perhaps something like "../../../g.model". C syntax allows only letters (a to z, A to Z), digits (0 to 9) and underscores in object names, and a name must not start with a digit. This is the only "manipulation" xxd does to the name of an input file.

  2. Since xxd knows nothing about TensorFlow, it could not have generated the copyright notice. Taking this as an indication, every other difference was inserted by other means by the TensorFlow authors, despite the statement "Automatically created from a TensorFlow Lite flatbuffer ...". This could have been done manually or by a script; unfortunately, I did not find any hint in some quick research on their repository. Apparently the statement refers only to the data values.
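
As a rough illustration of the renaming rule in item 1, the sketch below mimics it in C++. The function name `identifier_from_path` is made up for this example, and this is not xxd's actual source code; it only applies the replace-with-underscore rule described above.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not taken from xxd): derive a C identifier from an
// input path by replacing every character that is not a letter or digit
// with an underscore.
std::string identifier_from_path(const std::string& path) {
    assert(!path.empty());
    std::string name = path;
    for (char& c : name) {
        if (!std::isalnum(static_cast<unsigned char>(c))) {
            c = '_';
        }
    }
    return name;
}
```

Under this rule, "model.tflite" becomes `model_tflite`, while a path such as "../../../g.model" yields nine leading underscores followed by `g_model`.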

So you need to edit your result:

  1. Add any comment you see fit.

  2. Add alignas(8) to the array, if your compiler supports it (alignas is a standard keyword since C++11).

  3. Add the keyword const to the array and the length variable. This tells the compiler to prohibit any write access, and it will probably also place the data in read-only memory.

  4. Rename the array and length variables to g_model and g_model_len, respectively. Most probably TensorFlow expects these names.

  5. Copy "model.cc" into "model.h", and then apply further edits, as the example demonstrates.
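
Put together, the edits listed above might leave the two files looking roughly like this. Both files are shown in a single listing for brevity. The array bytes are placeholders rather than real model data, and `check_model_layout` is a hypothetical helper that is not part of the TensorFlow example.

```cpp
#include <cassert>
#include <cstdint>

// --- what "model.h" would declare (names as in the example) ---
extern const unsigned char g_model[];
extern const int g_model_len;

// --- what "model.cc" would define after the manual edits ---
// Placeholder bytes for illustration only; a real file holds the complete
// xxd output for your own model.tflite.
alignas(8) const unsigned char g_model[] = {
    0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33,
};
const int g_model_len = 8;

// Hypothetical sanity check: the length must match the array size and the
// array must start on an 8-byte boundary.
inline void check_model_layout() {
    assert(g_model_len == static_cast<int>(sizeof(g_model)));
    assert(reinterpret_cast<std::uintptr_t>(g_model) % 8 == 0);
}
```

In a real project the two declarations would live in "model.h" and the two definitions in "model.cc"; the check is optional and only guards against the length drifting out of sync after hand edits.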

Don't be bothered by different values. They simply reflect the different contents of the model file. The length variable is especially simple to check: it must have exactly the same value as the size of the input file in bytes.
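
If xxd is unavailable or distrusted, comparable output can be produced with a few lines of code. The following is a minimal sketch, not a TensorFlow tool: `read_file` and `bytes_to_cc` are hypothetical names, and the generated text already has the manual edits from the list above applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read a whole file as raw bytes. Binary mode matters on
// Windows, where text mode would translate line endings.
std::vector<unsigned char> read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Hypothetical helper: format the bytes as C++ source in the style of
// `xxd -i` (12 bytes per line), using the names g_model and g_model_len.
std::string bytes_to_cc(const std::vector<unsigned char>& bytes) {
    assert(!bytes.empty());  // an empty initializer list would not compile
    std::ostringstream out;
    out << "alignas(8) const unsigned char g_model[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i % 12 == 0) out << "\n ";  // start a new indented line
        char hex[8];
        std::snprintf(hex, sizeof hex, " 0x%02x",
                      static_cast<unsigned>(bytes[i]));
        out << hex;
        if (i + 1 < bytes.size()) out << ",";
    }
    out << "\n};\n";
    // The length is by construction the size of the input in bytes.
    out << "const int g_model_len = " << bytes.size() << ";\n";
    return out.str();
}
```

A caller would write `bytes_to_cc(read_file("model.tflite"))` to "model.cc". Because the length is taken from the byte count, it always equals the size of the input file.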

EDIT:

On line 28 which is this text alignas(8) const unsigned char as shown in the example converted model. When I attempt to convert a model (whether it's my custom model or the "hello_world.tflite" example model) the text that would be on line 28 is unsigned char (any other text on that line is not in question). How is line 28 edited & explained?

Concerning the "how": I firmly believe that the authors of TensorFlow literally used an editor (an IDE or a stand-alone program like Notepad++ or Geany) and edited the line, or used some script to automate this.

The reason for alignas(8) is most probably that TensorFlow expects the data with an alignment of 8 bytes, for example because it casts the byte array to a structure that contains values of 8 bytes width.
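
To make the alignment argument concrete, here is a small sketch. `ExampleHeader` is a stand-in invented for this illustration, not a real TensorFlow type, and both helpers are hypothetical. The sketch only shows that a cast to a structure with 8-byte members is valid only for suitably aligned addresses, and it assumes a platform where a 64-bit integer has an alignment requirement greater than 1.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for whatever 8-byte-wide values a runtime might read in place.
struct ExampleHeader {
    std::uint64_t magic;
};

// True if `p` is suitably aligned to be accessed as a T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Reinterpret a byte buffer as a header; only valid for aligned addresses.
inline const ExampleHeader* as_header(const void* p) {
    assert(is_aligned_for<ExampleHeader>(p));
    return static_cast<const ExampleHeader*>(p);
}

// With alignas(8) the array is guaranteed to start on an 8-byte boundary;
// without it, a plain unsigned char array only needs 1-byte alignment.
alignas(8) const unsigned char aligned_bytes[16] = {0};
```

Here `is_aligned_for<ExampleHeader>(aligned_bytes)` holds, while the same check on `aligned_bytes + 1` fails, which is the situation alignas(8) is meant to rule out.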

The insertion of const will also commonly locate the model in read-only memory, which is preferable on most microcontrollers. If it were left out, the model's data would not only be writable but would also be located in precious RAM.

On line 237, the text specifically is const int. When I attempt to convert a model (whether it's my custom model or the "hello_world.tflite" example model) the text that would be on line 237 is unsigned int (any other text on that line is not in question). Why are these two lines different in these specific places? It makes me believe that xxd on Windows is not functioning the same?

Again, I firmly believe this was edited manually or by a script. TensorFlow might expect this variable to be of data type int, but any xxd I tried (Windows and Linux) generates unsigned int. I don't think that your specific version of xxd functions differently on Windows.

For const the same thoughts apply as above.

Finally, when I attempt to convert the example model "hello_world.tflite" file using the xxd for windows utility, my resulting array doesn't match the example "hello_world.cc" file. I would expect the array values to be identical if the xxd worked. The last question is how to generate the "model.h" and "model.cc" files on Windows.

Did you note that the model you link is in another branch of the repository?

If I use the branch on GitHub as in your link to "hello_world.cc", I find in "../train/README.md" this archive hello_world_2020_12_28.zip. I unpacked it and ran xxd on the included "model.tflite". The result's data match the included "model.cc" in the archive. But it does not match the data of "hello_world.cc" in the same branch that you linked. The difference is already there.

My conclusion is that the example result was not generated from the example model. This happens because developers sometimes don't pay enough attention to what they commit. Yes, it's unfortunate, as it confuses and frustrates beginners like you.

But, as I wrote, don't let this give you headaches. Try the simple example and use the documentation as instructions for the process. Treat the differences in specific data as a quirk. You will encounter such things time and again when working with others' projects. It is quite normal.

the busybee
  • Thanks for doing some research on this for me. 1. xxd.exe and model.tflite are in the same directory. In Command Prompt, I navigated to the directory where both files are and ran the command xxd -i model.tflite > model.cc 2. Correct, I don't expect it to generate the comments/copyright. As noted in a reply to your comment above, when running the hello world example tflite file through my xxd command, I get different output (including the array). Therefore, I do not believe this method is converting the file correctly. Are there other ways to do it or validate it? Thanks. – leone.kutch Aug 12 '22 at 02:44
  • I tried two different versions of xxd on Win10, and neither invented new names for the variables. So I'm quite puzzled that you say you got `g_model` for your input file name "model.tflite"; I would have expected `model_tflite`. -- Anyway, don't overthink it. Do the conversion and try to compile and run the model. You don't need exactly the same result as the example, which is apparently not well maintained. – the busybee Aug 12 '22 at 05:54
  • I was using "model.tflite" as a generic file name. The file name itself is not a concern or issue. The focus of my question is that lines 28 and 237 within the generated file are different. The array I generate from the example hello_world.tflite is also different from the one in the example's .cc file. All other differences (including file name, comments, etc.) are non-issues. – leone.kutch Aug 12 '22 at 21:16
  • Since you wrote in your question that the length is 4009981, I was sure you converted your model, not the simple example. Would you mind checking whether this value is from the conversion of the example's hello world model? – the busybee Aug 13 '22 at 15:15
  • Oh, and line 28 is explained, it was edited. – the busybee Aug 13 '22 at 15:19
  • Comment 1/3: There seems to be confusion here. Please allow me to try to clear it up. It will take 3 comments as each is allowed ~600 characters. The length of my custom model is not in question. The issues I am asking questions about specifically are: On line 28 which is this text `alignas(8) const unsigned char` as shown in the example converted model. When I attempt to convert a model (whether it's my custom model or the hello_world.tflite example model) the text that would be on line 28 is `unsigned char` (any other text on that line is not in question). How is line 28 edited & explained? – leone.kutch Aug 14 '22 at 22:08
  • Comment 2/3: On line 237, the text specifically is `const int`. When I attempt to convert a model (whether it's my custom model or the hello_world.tflite example model) the text that would be on line 237 is `unsigned int` (any other text on that line is not in question). Why are these two lines different in these specific places? It makes me believe that xxd on Windows is not functioning the same? – leone.kutch Aug 14 '22 at 22:10
  • Comment 3/3: Finally, when I attempt to convert the example model [hello_world.tflite file](https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/micro/examples/hello_world/hello_world.tflite) using the xxd for windows utility, my resulting array doesn't match the example [hello_world.cc](https://github.com/tensorflow/tflite-micro/blob/fdc88436a1928b2314caf5e1445fdf16ccd83d68/tensorflow/lite/micro/examples/hello_world/model.cc) file. I would expect the array values to be identical if the xxd worked. The last question is how to generate the model.h and model.cc files on Windows. – leone.kutch Aug 14 '22 at 22:11