Deploy the PyTorch model to a web browser using ONNX.js

Tram Ho


AI applications are getting closer and closer to users, and with that comes growing demand for running AI models in many kinds of environments: edge devices, web browsers, mobile apps, Arduino boards and so on. Exporting AI models to formats that can run on these platforms is therefore an essential job. In this article we will learn how to export a model from PyTorch to ONNX, a framework that is currently hot in the AI community, and test that model in a web browser with ONNX.js. Enough preamble, let's get started.

Deploying the model on the client side

To keep the terminology consistent in this article, I will use client-side deployment to mean deploying the AI model directly on edge devices, web browsers, mobile apps and so on, as opposed to deploying the model on the backend (server) side.

Why deploy the model on the client side

First of all, we need to weigh the pros and cons of these two deployment approaches:

  • Server-side deploy: The AI models are deployed on a centralized server, and clients communicate with them through APIs. The advantage of this method is that the models can be handled centrally on the server, with little dependence on the client's configuration and deployment environment. Centralized management also makes deploying, versioning and maintaining models easier, and in particular it ensures model privacy: you won't have to worry about someone stealing your model or copying your architecture. However, concentrating all the processing on the server can overload the system, and the cost of operating and scaling AI models on a centralized server is huge. Another downside is that data privacy is hard to guarantee, since users have to send their data to the server for processing.
  • Client-side deploy: This method brings the AI model and its processing down to the client side. Its advantage is that the computation is distributed, so you do not need a large server to run the model, and applications can run the AI model completely offline without accessing the server. Another advantage of this approach is user data privacy: the data never leaves the device. However, this method also has disadvantages: it is difficult to update and manage model versions; it is only suitable for small models, due to the computing limitations of current client-side hardware; you have to make sure the model runs across different platforms and deployment environments; and, most importantly, you cannot guarantee model privacy once the entire model and its processing have been shipped to the client. For an attacker, that is as good as being handed the model outright. The pros and cons of the two methods are summarized in the following figure.

When should you deploy the model on the client side

So the question is: in which cases should we deploy on the client side? From my experience on AI projects, client-side deployment is worth choosing in the following cases:

  • The application must run offline: in this case there is no other way than to bring the model down to the client.
  • The model is light enough while still accurate: this is a very important point. Models that are too heavy lead to very long load and inference times on client devices due to hardware limitations, which hurts the user experience.
  • The model does not need frequent updates: this depends on the problem; if yours does not require constantly updated models, as in online learning, bringing the model down to the client is worth considering.
  • User data privacy matters: if your application should not send user data to a centralized server for processing, running the AI model on the client is an advantage.
  • There is a solution to protect the model's intellectual property: when shipping the model to the client, an important point to consider is protecting the model against theft and extraction attacks.

Those are the points to consider before deciding to bring an AI model to the client for deployment. If you are ready, we will dive into the techniques and tools to do this. In this article I use ONNX (Open Neural Network Exchange), one of the well-known frameworks for model conversion, and demo an offline web application with ONNX.js.

What is ONNX

ONNX can be thought of as an intermediate framework for representing an AI model trained in any of several frameworks such as PyTorch, TensorFlow, Caffe and others. With the ONNX format we can run the model on many different platforms: web, desktop, FPGA, ARM, mobile and more. For that reason it is really useful if you want to develop cross-platform AI models.

That is basically all we need to understand about it for now. I won't go too deep into using this framework; you can find out more in its own documentation. Let's get into the actual code.

Show me your code

Build model on PyTorch

Very simply, we will build a digit recognition application on MNIST. This is a fairly simple task, so I won't explain too much; just follow the code step by step.

  • First, import the libraries

  • Next, declare the necessary hyperparameters

  • Next, declare the transforms for the data and load the corresponding datasets

  • Declare a simple network

  • Declare the loss function and the optimizer

  • Build the test function

  • Train the model

Then we conduct training as usual. Sip a cup of coffee and wait for the result …

OK, we will save this model, which reaches about 97% accuracy, for further processing.

Export model on ONNX

Exporting the model from PyTorch to ONNX is actually very simple. We just need to execute a few commands as follows

Test run in browser

Import ONNX.js libraries

Create an index.html file in the current directory and paste in the example code from the ONNX.js GitHub repository.
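Adapted from the ONNX.js README example, index.html might look roughly like this (the CDN script URL and model file name are assumptions; check the repository for the current ones):

```html
<html>
  <head>
    <script src="https://cdn.jsdelivr.net/npm/onnxjs/dist/onnx.min.js"></script>
  </head>
  <body>
    <script>
      // Load the exported model and run one dummy inference.
      async function main() {
        const session = new onnx.InferenceSession();
        await session.loadModel('./onnx_model.onnx');
        const input = new onnx.Tensor(new Float32Array(28 * 28), 'float32', [1, 1, 28, 28]);
        const outputMap = await session.run([input]);
        console.log(outputMap.values().next().value.data);
      }
      main();
    </script>
  </body>
</html>
```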

Let's create a small server to test our results. Open a terminal in the current directory and run a static file server, for example: python -m http.server 8000.

Check the result

After accessing the above address we see a white screen. To check whether the model has loaded successfully, open the browser's console.

The console shows a red error. Let's investigate where the error comes from.

Error investigation

The error above indicates that the LogSoftmax operator is not currently supported by ONNX.js. In the ONNX.js GitHub repository we find the Operators Support section, which clearly states which operators are currently supported. Looking for LogSoftmax, we see that current versions of ONNX.js do not support this operator.

So we have to go back to the PyTorch model, tweak it, and find a replacement operator.

Adjust the model

In the PyTorch model, the log_softmax operator is used at the end of the forward function as follows:
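Sketched with the illustrative layers used throughout this article, the offending forward pass looks like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x.view(x.shape[0], -1)))
        # log_softmax is the operator ONNX.js cannot run
        return F.log_softmax(self.fc2(x), dim=1)
```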

We change this to the softmax function, and for convenience we declare a new network so it is easy to edit. The new network is:
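A sketch of the new network (same illustrative layers as before; only the final activation changes from log_softmax to softmax, and argmax over either gives the same prediction):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InferenceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x.view(x.shape[0], -1)))
        # softmax instead of log_softmax: supported by ONNX.js
        return F.softmax(self.fc2(x), dim=1)
```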

Since replacing log_softmax with softmax does not affect the weights of the model, we can reload the old state_dict.

Then test the performance again on the test set

The accuracy of the model has not changed, so we do not need to finetune the network again. We re-export the model to ONNX to replace the old one and reload the web page. Now when we open the console we will see the output of the model:

This proves our model has been loaded successfully in ONNX.js

Build demo interface

We will not discuss this part in much detail; copy the code into index.html and run it.
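A minimal sketch of the markup to add (the element ids here are assumptions for illustration; the source code linked below has the full version):

```html
<canvas id="drawing-canvas" width="280" height="280"></canvas>
<button id="clear-button">Clear</button>
<table id="predictions">
  <!-- one row per digit, filled in by updatePredictions() -->
</table>
```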

And don't forget the CSS styles; you can extract them into a separate CSS file for convenience.

For the content of the style.css file, refer to the source code.

Running again we get the following interface

Now we will go through the code of the main handlers

Main processing code

We will not cover the canvas-handling code here; we will focus on the updatePredictions() function. This function runs every time a new stroke is drawn on the canvas: it takes the image from the current canvas, feeds it into the model for prediction, and then updates the results in the view. Let's try running the model.
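A sketch of what updatePredictions() might look like (the element ids and the argmax helper are assumptions; onnx.InferenceSession, loadModel, onnx.Tensor and session.run are the ONNX.js calls):

```javascript
// Index of the largest value -- the predicted digit.
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

let session = null;

async function updatePredictions() {
  // Lazily load the model on the first call.
  if (session === null) {
    session = new onnx.InferenceSession();
    await session.loadModel('./onnx_model.onnx');
  }

  // Raw RGBA pixels from the 280x280 canvas: 280 * 280 * 4 values.
  const ctx = document.getElementById('drawing-canvas').getContext('2d');
  const imgData = ctx.getImageData(0, 0, 280, 280);
  const input = new onnx.Tensor(new Float32Array(imgData.data), 'float32');

  // Run the model and read its first (only) output tensor.
  const outputMap = await session.run([input]);
  const output = outputMap.values().next().value;

  // Update the view with the winning digit.
  document.getElementById('predictions').textContent =
      'Prediction: ' + argmax(output.data);
}
```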

An error occurs, because the canvas input is not compatible with the model we exported. So we need to adjust the model in the PyTorch code again.

Adjust the model again

Notice that the input of the model is an image taken from the canvas with size 280 × 280 × 4, i.e. four channels, while the model expects a single-channel 28 × 28 input. We will edit the forward function of InferenceNet with the following lines.
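The extra preprocessing prepended to InferenceNet.forward could be sketched like this (a hedged reconstruction, not the article's exact code: keep the alpha channel of the flat RGBA canvas data and average-pool 280 down to 28; the exact steps are in the source code):

```python
import torch
import torch.nn.functional as F

def preprocess_canvas(x):
    """Turn a flat batch of 280*280*4 RGBA canvas values into 1x28x28 input.

    These lines go at the top of InferenceNet.forward so the exported
    graph accepts the raw canvas data directly.
    """
    batch_size = x.shape[0]
    x = x.reshape(batch_size, 280, 280, 4)
    x = torch.narrow(x, dim=3, start=3, length=1)   # keep only the alpha channel
    x = x.reshape(batch_size, 1, 280, 280)
    x = F.avg_pool2d(x, 10)                         # downsample 280x280 -> 28x28
    return x / 255.0                                # scale pixel values to [0, 1]
```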

Then we export the model again, this time with the new input shape.

Demo run

With the new model in place, we refresh the website and no longer see the error. You can try out the demo.

Source code

The source code of the article is provided here


Thus, together we have walked through converting a PyTorch model to ONNX and running it with ONNX.js. Hopefully this article helps you better understand how it works, and the benefits and drawbacks of client-side model deployment. See you in the next posts.


Source : Viblo