TINY MACHINE LEARNING Lesson 9

Warnings

With regard to safety, since the projects are based on a very low voltage power supply, provided by the USB port of the PC or by batteries or power supplies with a maximum output of 9 V, there are no particular risks of an electrical nature. It must however be pointed out that any short circuits caused during the exercises could damage the PC or the furnishings and, in extreme cases, even cause burns. For this reason, every time a circuit is assembled or modified, this must be done with the power disconnected, and at the end of the exercise the circuit must be disconnected by removing both the USB cable connecting it to the PC and any batteries from their compartments or external power connectors. In addition, again for safety reasons, it is strongly recommended to carry out the projects on insulating and heat-resistant mats, which can be purchased in any electronics store or on specialized websites.

At the end of the exercises it is advisable to wash your hands, as the electronic components could have processing residues that could cause harm if ingested or if they come into contact with eyes, mouth, skin, etc. Although the individual projects have been tested and found safe, those who decide to follow what is reported in this document assume full responsibility for what could happen while carrying out the exercises described in it. For younger children and/or for first experiences in the field of electronics, it is advisable to perform the exercises with the help, and in the presence, of an adult.

Roberto Francavilla

Packing the Hello World Model to make a Sketch

After having created and optimized our model, which manages to predict the values of the sin(x) function, we move on to the "packing" phase to produce the sketch.

This, unfortunately, is the longest and most tedious phase, also because it would be too complicated to explain in detail, especially without a strong computer science background. For this reason I will give you, for this phase, only procedural indications, working on our model.

The packing procedure is, however, easily generalizable and can therefore be applied to any other machine learning model.

Resizing or Quantizing the Model

In order to use a machine learning model on microcontrollers, it must be made "tiny", and it is already clear that our model will need some transformation. The first step is to "resize" it, that is, to reduce as much as possible the amount of memory committed to running the model.

To do this we return to the file in Google Colab that we had previously named "Seno_Function_2.ipynb" and save it with the name "Seno_Function_3.ipynb".

We go to the end of the notebook, add a code cell, and write the following code:

# Convert the model to the TensorFlow Lite format without quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model_2)
tflite_model = converter.convert()

# Save the model to disk
open("sine_model.tflite", "wb").write(tflite_model)

# Convert the model to the TensorFlow Lite format with quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model_2)

# Indicate that we want to perform the default optimizations,
# which includes quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Define a generator function that provides our test data's x values
# as a representative dataset, and tell the converter to use it
def representative_dataset_generator():
  for value in x_test:
    # Each scalar value must be inside of a 2D array that is wrapped in a list
    yield [np.array(value, dtype=np.float32, ndmin=2)]

converter.representative_dataset = representative_dataset_generator

# Convert the model
tflite_model = converter.convert()

# Save the model to disk
open("sine_model_quantized.tflite", "wb").write(tflite_model)

With this added code we perform two transformations: the first makes the model readable by TensorFlow Lite and saves the result in a file called "sine_model.tflite"; the second also performs an optimization of the memory used (called "quantization") and saves the result in "sine_model_quantized.tflite".

One of the automatic optimizations (by default) that are carried out with the command:

converter.optimizations = [tf.lite.Optimize.DEFAULT]

is to transform how numbers are handled, from floating-point numbers (requiring 32 bits of memory) to integers (requiring only 8 bits). Obviously this transformation loses precision, but the loss in the predicted value is considered acceptable (the model does not give an exact value anyway!). That said, you must be careful with this optimization: when the expected result is a very small number (in our case, remember, it ranges from 0 to ±1), going from a decimal number to an integer could lose significant digits and therefore distort the result. We will come back to this later; for now let's focus on the packing procedure.
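
To see concretely what quantization means for the numbers, we can inspect how the converter stored the input and output tensors, and reproduce the arithmetic by hand. The following is a small sketch, to be run after the conversion cell above; the scale and zero point in the second part are hypothetical values chosen only for illustration:

# Inspect how the converter stored the input and output tensors
interpreter = tf.lite.Interpreter("sine_model_quantized.tflite")
interpreter.allocate_tensors()
for detail in interpreter.get_input_details() + interpreter.get_output_details():
    # "quantization" is a (scale, zero_point) pair; (0.0, 0) means the
    # tensor was left in floating point
    print(detail["name"], detail["dtype"], detail["quantization"])

# Illustrative arithmetic of 8-bit quantization (hypothetical scale/zero point):
# a real value x is stored as the integer q = round(x / scale) + zero_point
# and read back as (q - zero_point) * scale
scale, zero_point = 1.0 / 128, 0
x = 0.7071  # sin(pi/4)
q = round(x / scale) + zero_point
print("x = %.4f -> q = %d -> back to %.4f" % (x, q, (q - zero_point) * scale))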

At this point we go to Runtime and click on "Restart and run all".

At this point we have two files: the model converted for TF Lite, and the model, again for TF Lite, but reduced, or quantized.

Let us verify that the two new models retain the predictive characteristics of the starting model. To do this you need to work with TF Lite, so let's add one more code cell in Colab and write the following code:

# Instantiate an interpreter for each model
sine_model = tf.lite.Interpreter('sine_model.tflite')
sine_model_quantized = tf.lite.Interpreter('sine_model_quantized.tflite')

# Allocate memory for each model
sine_model.allocate_tensors()
sine_model_quantized.allocate_tensors()

# Get indexes of the input and output tensors
sine_model_input_index = sine_model.get_input_details()[0]["index"]
sine_model_output_index = sine_model.get_output_details()[0]["index"]
sine_model_quantized_input_index = sine_model_quantized.get_input_details()[0]["index"]
sine_model_quantized_output_index = sine_model_quantized.get_output_details()[0]["index"]

# Create arrays to store the results
sine_model_predictions = []
sine_model_quantized_predictions = []

# Run each model's interpreter for each value and store the results in arrays
for x_value in x_test:
  # Create a 2D tensor wrapping the current x value
  x_value_tensor = tf.convert_to_tensor([[x_value]], dtype=np.float32)
  # Write the value to the input tensor
  sine_model.set_tensor(sine_model_input_index, x_value_tensor)
  # Run inference
  sine_model.invoke()
  # Read the prediction from the output tensor
  sine_model_predictions.append(
      sine_model.get_tensor(sine_model_output_index)[0])
  # Do the same for the quantized model
  sine_model_quantized.set_tensor(sine_model_quantized_input_index, x_value_tensor)
  sine_model_quantized.invoke()
  sine_model_quantized_predictions.append(
      sine_model_quantized.get_tensor(sine_model_quantized_output_index)[0])

# See how they line up with the data
plt.clf()
plt.title('Comparison of various models against actual values')
plt.plot(x_test, y_test, 'bo', label='Actual')
plt.plot(x_test, predictions, 'ro', label='Original predictions')
plt.plot(x_test, sine_model_predictions, 'bx', label='Lite predictions')
plt.plot(x_test, sine_model_quantized_predictions, 'gx', label='Lite quantized predictions')
plt.legend()
plt.show()

Click on the cell's play button and a graph will be generated which compares the reference values of the test set (Actual), the values predicted by the original model (Original predictions), the values predicted by the model converted for TF Lite (Lite predictions) and the values of the quantized TF Lite model (Lite quantized predictions).

As you can see, the values are all quite well aligned with each other, without significant deviations.
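
Besides the visual check, the deviation can be quantified numerically. Here is a quick sketch, assuming the variables from the previous cells (y_test, predictions, sine_model_predictions and sine_model_quantized_predictions) are still in memory:

import numpy as np

# Mean absolute error of each model against the test labels
for name, preds in [("Original", predictions),
                    ("Lite", sine_model_predictions),
                    ("Lite quantized", sine_model_quantized_predictions)]:
    mae = np.mean(np.abs(y_test - np.array(preds).flatten()))
    print("%s model: mean absolute error = %.4f" % (name, mae))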

We can also get information on how much the model has been resized by comparing the TF Lite model and the quantized TF Lite model, with the following code to be written in a new cell:

import os

basic_model_size = os.path.getsize("sine_model.tflite")
print("Basic model is %d bytes" % basic_model_size)
quantized_model_size = os.path.getsize("sine_model_quantized.tflite")
print("Quantized model is %d bytes" % quantized_model_size)
difference = basic_model_size - quantized_model_size
print("Difference is %d bytes" % difference)

We run the code in the cell once again via the play button, and you will see that the reduction is about 900 bytes out of roughly 4,300, that is, a reduction of more than 20%.
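
If you prefer the reduction expressed as a percentage, one extra line in the same cell is enough (it reuses the variables defined above):

# Express the size reduction as a percentage of the unquantized model
print("Reduction: %.1f%%" % (100.0 * difference / basic_model_size))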

At this point there is still one task to be done: converting the model into C language so that it can be used with TensorFlow Lite for Microcontrollers.

To do this, once again, we add a cell and write the following code:

# Install xxd if it is not available
!apt-get -qq install xxd
# Save the file as a C source file
!xxd -i sine_model_quantized.tflite > sine_model_quantized.cc
# Print the source file
!cat sine_model_quantized.cc

When the code is executed, a file with the extension ".cc" is generated. You can see all the files generated in Colab by clicking on the "folder" icon on the left of the notebook.
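
To get an idea of what was produced without scrolling through the whole dump, you can print just the beginning of the file: xxd -i emits an ordinary C byte array plus a variable holding its length. A short illustration:

# Peek at the first characters of the generated C source; it has the form
#   unsigned char sine_model_quantized_tflite[] = { 0x1c, 0x00, ... };
#   unsigned int sine_model_quantized_tflite_len = <size in bytes>;
with open("sine_model_quantized.cc") as f:
    print(f.read(250))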

And with this we have finished: our deep learning model has been trained, evaluated and converted for TensorFlow Lite, and can take a number between 0 and 2π and produce, with a satisfactory approximation, the value of sin(x).

This is only part of the packaging procedure: in order to use the model you need to create a real application around it.

Realization of the Application (Sketch) that uses the TinyML Model

The objective of this paragraph is to explain, in very general terms, the procedure for creating an application that implements the following process: take a numerical input value, properly formatted, feed it to the model, and obtain a forecast output close to the value of sin(x). In other words, we want to make a sketch for our Arduino Nano 33 BLE Sense board that does exactly this.

To do this we analyze the example sketch included in the Arduino IDE: launch the application and go to:

File -> Examples -> Arduino_TensorFlowLite -> hello_world

The sketch is loaded; it consists of 9 tabs:

1. hello_world
2. arduino_constants.cpp
3. arduino_main.cpp
4. arduino_output_handler.cpp
5. constants.h
6. main_functions.h
7. model.cpp
8. model.h
9. output_handler.h

The first tab is the main body of the sketch: in this tab all the libraries and functions used by the sketch are invoked, and it contains the function "void setup() { ... }" and the function "void loop() { ... }". In the setup function, the checks on the TF Lite version and the verification of compatibility with the model are carried out; the TF Lite interpreter is also initialized and the memory for the tensors used in the inference process is allocated.

In the loop function, the main operations are performed by invoking the functions defined in the other tabs. In particular, note how inference is processed to make the prediction: float variables are used to define the input and to obtain the corresponding output, also of float type, and there is a quantization phase in which float numbers are transformed into integers after applying appropriate scales to the input and output data.

This is because, as written previously, in our case the input is a number ranging from 0 to 6.28... and the output ranges from 0 to ±1, so going from a decimal (floating-point) number to an integer could lose significant digits and thus distort the result.
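
The scaling the sketch applies can be reproduced in a few lines. The following is only an illustration of the arithmetic, written in Python, with hypothetical scale and zero-point values; in the real code they are read from the input and output tensors' parameters:

import math

# Hypothetical quantization parameters (in the sketch they come from the
# input and output tensors, e.g. input->params.scale and zero_point)
in_scale, in_zero = 0.024639942, -128
out_scale, out_zero = 0.008472, -1

x = math.pi / 2                            # input in [0, 2*pi]
x_q = int(round(x / in_scale)) + in_zero   # quantize the input to int8
# ... the int8 model runs here and produces an int8 output y_q ...
y_q = 117                                  # hypothetical model output
y = (y_q - out_zero) * out_scale           # dequantize the output to float
print("x = %.3f -> x_q = %d; y_q = %d -> y = %.3f" % (x, x_q, y_q, y))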

The second tab, "arduino_constants.cpp", defines the number of inferences performed per cycle. It is set on the basis of experience and of the various attempts made, so it is an empirical value.

In the third tab, "arduino_main.cpp", the main functions for the Arduino IDE are invoked.

In the fourth tab, "arduino_output_handler.cpp", there is the function that takes the output value of the inference, that is, the predicted value of sin(x), and transforms it into a PWM parameter to set the brightness of the LED on board the Arduino Nano.
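
To make the idea concrete, here is a sketch in Python of the mapping the handler performs (the real code does this in C++, and the exact formula there may differ): the prediction y, which lies between -1 and +1, is rescaled to the 0-255 range used for the PWM duty cycle.

def brightness_from_prediction(y):
    # Map y in [-1, 1] linearly onto the PWM range [0, 255]
    return int(127.5 * (y + 1))

for y in (-1.0, 0.0, 0.5, 1.0):
    print("y = %+.1f -> PWM duty = %d" % (y, brightness_from_prediction(y)))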

The fifth tab, "constants.h", defines the range of values that the input can take. As can be seen, the number π is written in numerical form, truncated to the eleventh decimal place; this is done because otherwise another library would have to be loaded, with an additional memory commitment.

The sixth tab, "main_functions.h", is simply the header invoked from the third tab, so there is nothing relevant to highlight.

In the seventh and eighth tabs, respectively "model.cpp" and "model.h", there are the files produced by TF Lite in the first part of this lesson: they represent the model translated into C language.

In the ninth and last tab, "output_handler.h", the TF Lite functions for the execution of inferences and the prediction of the value of sin(x) are invoked.

Conclusions

What I wanted to describe in this lesson, which completes the Basic Course on Artificial Intelligence applied to Microcontrollers, is an overview of what is needed to move from the model built in previous lessons and translated into C language to a real sketch that can be used by the Arduino Nano 33 BLE Sense board.

I realize that this part is not as detailed as I would have liked, but had it been, I would have filled pages with numerical-calculation theory and descriptions of computing processes that would surely have bored the student.

In fact, the purpose of this course is to give the basics of AI, in particular of TinyML, applied to the Arduino Nano 33 BLE Sense, and to generate that healthy curiosity that pushes students to continue with whatever further study they consider useful.

In this regard, I suggest two books and the TensorFlow website, which I found very useful and which inspired what is reported in the pages of this course.

The links to the two books and the site are:

At the following link you can find a video that shows the Hello World sketch in operation, together with some changes made for experimentation.

If you found the lesson interesting, please consider making a donation: you will help me create many more.