Hands-on Tutorials

The Full (Stack) Story of a Text Classifier

How to build a simple text classifier and deploy it through a web app

Oscar de Felice
Published in DeepBenBello
9 min read · Nov 2, 2020


One fine day, an idea comes to your mind: I could make good money by selling a machine learning model to a company that might be interested in it.

You like the idea, and you decide to approach a company with a press department and sell them a text classifier. You feel excited.

However, at this stage, reality strikes hard: “I am a data scientist. I can build a great classifier, but when I show my code to managers, they look at it the way I would look at hieroglyphs”.

If you get this reaction, feel lucky. Image created by the author, based on https://facciabuco.com

Well, your worries are over! Go through this article and you will be able to create and publish your own text-classifying website.

What are we going to do?

In the end, we will build a website like this, through the following steps:

  1. Create a classifier in Python, using a Deep Learning model written with the Keras API.
  2. Convert the Keras model (an .h5 file) to a TensorFlow.js file (.json).
  3. Convert the Keras tokeniser to a .json file that can be parsed in JavaScript.
  4. Write a JavaScript program to load the model and the tokeniser, and to define a function performing inference.
  5. Write the HTML code to put all of this nicely on a webpage.

For points one to three, you can refer to this GitHub repository or to its Colab version.

For the other steps, here is another repository containing the relevant code.

Let’s build the model in Python

As a data scientist, you are not afraid of this (and you can potentially substitute this part with whatever model you like).

Image created by the Author on the base from https://imgflip.com/

For this reason, I am not going too much in depth here; you can refer to other excellent publications to see how to build different models in Keras.

Load data

To write this piece, we used one of the TensorFlow datasets available, in particular the AG News subset. It is a dataset made up of 120,000 news articles labelled with four (balanced) categories: World, Sports, Business, and Sci/Tech.

We can download the dataset from Kaggle and import the train and test sets from the CSV files.
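A minimal loading sketch; the file paths and column names are assumptions, so check them against the actual Kaggle files:

```python
import pandas as pd

def load_ag_news(train_path="train.csv", test_path="test.csv"):
    """Load the AG News train/test splits from the Kaggle CSV files.

    The CSVs have no header row; the column names below are assumptions.
    """
    cols = ["label", "Title", "Text"]
    train_df = pd.read_csv(train_path, names=cols, header=None)
    test_df = pd.read_csv(test_path, names=cols, header=None)
    return train_df, test_df
```
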

Without performing too much feature engineering (that is not our aim here), let’s simply tokenise the Text column and pad the sequences to a fixed length.
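A sketch of this step, wrapped in a function for clarity; the vocabulary cap and the `oov_token` are assumptions, while the fixed length of 75 matches the model summary below:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

def build_sequences(train_texts, test_texts, vocab_size=60000, max_len=75):
    """Fit a Keras Tokenizer on the training texts and return padded
    integer sequences of a fixed length."""
    tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
    tokenizer.fit_on_texts(train_texts)
    X_train = pad_sequences(tokenizer.texts_to_sequences(train_texts),
                            maxlen=max_len)
    X_test = pad_sequences(tokenizer.texts_to_sequences(test_texts),
                           maxlen=max_len)
    return tokenizer, X_train, X_test
```

Applied to the dataframes, this would be `tokenizer, X_train, X_test = build_sequences(train_df["Text"], test_df["Text"])`.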

To conclude our ETL process, we just need to one-hot encode the labels.
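A possible encoding, assuming the CSV labels run from 1 to 4 as in the Kaggle files:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

def encode_labels(labels, num_classes=4):
    """One-hot encode the AG News labels; the CSV uses 1..4, so we
    shift to 0..3 before encoding."""
    return to_categorical(np.asarray(labels) - 1, num_classes=num_classes)
```
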

Build the model

We are ready to build the classifier model. First, we define a couple of hyper-parameters,
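for example as follows; the exact values are assumptions, chosen to be consistent with the model summary below (a vocabulary of 63,739 reproduces the 4,079,296 embedding parameters):

```python
# Hyper-parameters (values are assumptions consistent with the summary below)
VOCAB_SIZE = 63739   # rows of the embedding matrix
EMBED_DIM = 64       # embedding vector size
MAX_LEN = 75         # fixed input sequence length
NUM_CLASSES = 4      # World, Sports, Business, Sci/Tech
BATCH_SIZE = 64
EPOCHS = 7
```
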

Hence, the model. We choose a CNN, as it guarantees both quite good performance and a short training time. You can use whatever model is available, such as one of the BERT gang, XLNet, or some other transformer model.
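A possible reconstruction of such a model, consistent with the summary below; the dropout rates and kernel sizes are assumptions matching the reported shapes and parameter counts:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Embedding, Dropout, Conv1D,
                                     MaxPooling1D, GlobalMaxPooling1D,
                                     Dense, Activation)

model = Sequential([
    Input(shape=(75,)),                                # padded sequences
    Embedding(63739, 64),                              # 4,079,296 params
    Dropout(0.2),
    Conv1D(50, 3, padding="same", activation="relu"),  # 9,650 params
    MaxPooling1D(2),
    Dropout(0.2),
    Conv1D(100, 3, padding="same", activation="relu"), # 15,100 params
    MaxPooling1D(2),
    Dropout(0.2),
    Conv1D(200, 3, padding="same", activation="relu"), # 60,200 params
    GlobalMaxPooling1D(),
    Dropout(0.2),
    Dense(100),                                        # 20,100 params
    Activation("relu"),
    Dropout(0.2),
    Dense(4),                                          # 404 params
    Activation("softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])
model.summary()
```
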

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 75, 64) 4079296
_________________________________________________________________
dropout (Dropout) (None, 75, 64) 0
_________________________________________________________________
conv1d (Conv1D) (None, 75, 50) 9650
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 37, 50) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 37, 50) 0
_________________________________________________________________
conv1d_1 (Conv1D) (None, 37, 100) 15100
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 18, 100) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 18, 100) 0
_________________________________________________________________
conv1d_2 (Conv1D) (None, 18, 200) 60200
_________________________________________________________________
global_max_pooling1d (Global (None, 200) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 200) 0
_________________________________________________________________
dense (Dense) (None, 100) 20100
_________________________________________________________________
activation (Activation) (None, 100) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 100) 0
_________________________________________________________________
dense_1 (Dense) (None, 4) 404
_________________________________________________________________
activation_1 (Activation) (None, 4) 0
=================================================================
Total params: 4,184,750
Trainable params: 4,184,750
Non-trainable params: 0
_________________________________________________________________

Training

We trained the model above for 7 epochs with a batch size of 64. The results can easily be displayed using scikit-learn’s classification_report.
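A sketch of the training and evaluation step, assuming the `X_train`/`y_train`/`X_test`/`y_test` arrays come from the preprocessing described above:

```python
import numpy as np
from sklearn.metrics import classification_report

def train_and_report(model, X_train, y_train, X_test, y_test,
                     epochs=7, batch_size=64):
    """Train the classifier and print scikit-learn's classification_report
    on the test set."""
    model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size)
    # the model outputs softmax probabilities, so argmax gives the class
    y_pred = np.argmax(model.predict(X_test), axis=1)
    y_true = np.argmax(y_test, axis=1)
    print(classification_report(
        y_true, y_pred,
        target_names=["World", "Sports", "Business", "Sci/Tech"]))
```
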

Hence, the performance is the following,

Performance is not bad, partly because the classes are balanced. Image by Author

So our model is trained and complete. How do we make it run in production? How do we put it under the hood of a webpage?

Convert the model from Keras to TensorFlow.js

In order to integrate the model in a web script, we need to convert it into a proper form.

First, we need to install TensorFlow JS. This is done by the usual command

pip install tensorflowjs

Once the installation is complete, we can export the Keras model in .h5 format with model.save("modelCNN.h5"),

and then convert it to .json with the TensorFlow.js command,

tensorflowjs_converter --input_format=keras modelCNN.h5 ./model/

This command will produce a .json model file and some .bin weight files (how many depends on the model dimensions); these are the files that will be loaded by the JavaScript code.

The Tokeniser issue

As one can see from the model summary above, we do not feed the model with strings: to provide predictions, we need a tokeniser to convert text into vectors*.

*NOTE: There is another way: building the model with an Input layer taking tf.strings as input and performing the conversion with a TextVectorization layer. This, however, implies estimating a huge number of parameters, making the whole model far heavier, more difficult to train, and eventually less robust.

There are various ways to perform such a conversion; here we used the Keras Tokenizer, which simply maps each word in our corpus to an integer. This gives a dictionary whose keys are the words in the vocabulary and whose values are the integers corresponding to each word.

When the text to be classified is entered on our web page, we need to convert it into a form the model can digest. To do this coherently, we need that same map, so we have to export the tokeniser in a form JavaScript can read.

JSON format comes to the rescue. The command to export the dictionary is the following
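A minimal sketch of the export; the file name is an assumption and must match the one fetched later in the JavaScript code:

```python
import json

def export_word_index(tokenizer, path="tokenizer.json"):
    """Dump the tokeniser's word -> index map as JSON so that the
    JavaScript code can read it."""
    with open(path, "w") as f:
        json.dump(tokenizer.word_index, f)
```
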

We now have all the components we need to build our webpage. So let’s abandon Python and move on to building a simple webpage that contains our model, takes texts as input, and performs inference on them.

TensorFlow JS model

We are now ready to build our webpage, composing a JavaScript script to power an HTML page.

Copy and paste: one of the best resources of the FullStack Overflow developer. Image under Creative Commons license from Noun Project.

I am not an expert in web development, so for the HTML part I just took ready-made material from the web and modified it for my purposes. For the JavaScript part, TensorFlow.js allows you to build models, and even train them, directly in the browser. A good place to look for more information is the official TensorFlow.js site, whose documentation is excellent; a brief course is provided as well.

The HTML structure

We are going to compose a plain web page; for fancier looks, you can write or import a CSS style file to configure the layout of the page.

The following HTML code imports TensorFlow.js and then defines an input box where the end user can insert the text they want to classify. Finally, we define a button to trigger the prediction.
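A minimal sketch of such a page; the element id textInput is an assumption, while the id prediction and the onclick=”predict()” handler match what the JavaScript section below relies on:

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <!-- import TensorFlow.js from its CDN -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
  </head>
  <body>
    <h1>News classifier</h1>
    <!-- input box for the text to classify -->
    <input type="text" id="textInput" placeholder="Type a news headline..." />
    <!-- button triggering the prediction -->
    <button onclick="predict()">Predict</button>
    <!-- element where the predicted class will be written -->
    <p id="prediction"></p>
    <!-- our JavaScript code -->
    <script src="model.js"></script>
  </body>
</html>
```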

Note that here we focused on the basic stuff, you can beautify to the extent you like by adding options to HTML tags.

Without dwelling too much on the HTML, note that the code above just builds the layout of our page.

You should get something like this. Image by Author

To define the model for inference, use the input text, and specify what actually happens when the “Predict” button is clicked, we need to write another piece of code: the JavaScript part (we store this in a file called model.js, as indicated in the HTML page). This is the aim of the next section.

Load model in JavaScript

The scheme is in principle quite simple and analogous to its Python counterpart.

To predict a user-input text, we need the following

  1. Encode the text to feed the model, making use of the tokeniser map. The result of this step is a numerical vector encoding the text to be classified.
  2. Run model inference on the input text vector.

These are the same two steps we are going to perform in JavaScript.

Load Tokeniser

First, we need to load the Tokeniser in JavaScript. Recall we exported the word-to-index map as a JSON file exactly for this reason.

The piece of code importing such a map can be written as follows,
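for example like this; the file name tokenizer.json is an assumption and must match the one used on the Python side:

```javascript
// Path of the tokeniser JSON exported from Python (file name assumed).
const vocabPath = 'tokenizer.json';
let wordIndex; // will hold the word -> integer map once loaded

// Load the map asynchronously, so the page does not block while waiting.
async function loadVocab() {
  const response = await fetch(vocabPath);
  wordIndex = await response.json();
  return wordIndex;
}
```
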

where vocabPath is the path to the JSON file.

We are not going into detail about why we use asynchronous functions here, but a partial explanation is that we do not want our webpage to freeze while waiting for the JSON to load.

Use Tokeniser

We are ready to define a function to encode input texts,
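for example as follows. This is a sketch: the function name and the idea of passing the word map as a parameter are my choices, and the normalisation (lowercasing, whitespace splitting) only approximates what the Keras Tokenizer does to the training corpus:

```javascript
const MAX_LEN = 75; // same fixed length used on the Keras side

// Convert a string into a padded integer sequence using the
// word -> index map loaded from the tokeniser JSON. Unknown words
// map to 0; padding and truncation are done at the front ('pre'),
// mirroring the Keras pad_sequences defaults.
function tokenise(text, wordIndex) {
  const words = text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  let sequence = words.map((w) => wordIndex[w] || 0);
  if (sequence.length > MAX_LEN) {
    sequence = sequence.slice(sequence.length - MAX_LEN); // keep the tail
  }
  while (sequence.length < MAX_LEN) {
    sequence.unshift(0); // pad with zeros at the front
  }
  return sequence;
}
```
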

This does exactly what you would do to preprocess a text in Python. The function takes a string as input and returns its vector representation, based on the tokeniser. Note that the tokeniser is, correctly, the one we trained on our data using Keras.

Can you see the light at the end of the tunnel? Photo by Adrien Olichon on Unsplash

Model inference

What is left? The model!

Until now, we have not made use of TensorFlow.js. We are going to make up for this right now. The converted Keras model can be loaded in JavaScript very easily,
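for example like this; the directory name model/ matches the converter command above, while model.json is the file name the converter produces by default. tf is the global object provided by the TensorFlow.js script tag:

```javascript
// Path of the converted model (directory produced by tensorflowjs_converter).
const modelPath = 'model/model.json';
let model; // will hold the loaded model

// Load the converted Keras model with the TensorFlow.js built-in loader.
async function loadModel() {
  model = await tf.loadLayersModel(modelPath);
  return model;
}
```
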

where again modelPath is the location of the model. Note how we used the loadLayersModel function built into TensorFlow.js; it returns a model object which will serve us for inference.

Let’s go back for a second to the HTML. When we defined the “Predict” button, we gave it the instruction onclick=”predict()”, meaning, as one may expect, that on click the function predict has to be executed.

Hence, we only need to write such a function. Let us put it here and then explain its relevant details.
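A sketch of such a function; it assumes an encoding helper (here called tokenise) with the loaded wordIndex map, the loaded model object, and the element ids from the HTML page. The class order in the switch is also an assumption, and must match the order used when one-hot encoding the labels:

```javascript
function predict() {
  // tf.tidy disposes of all intermediate tensors created inside.
  tf.tidy(() => {
    // read the user text from the input box
    const text = document.getElementById('textInput').value;
    // encode it with the tokeniser map and wrap it in a [1, 75] tensor
    const sequence = tokenise(text, wordIndex);
    const input = tf.tensor2d([sequence]);
    // run the model: probs is a [1, 4] tensor of softmax probabilities
    const probs = model.predict(input);
    // take the index of the maximum probability
    const classIdx = probs.argMax(1).dataSync()[0];
    // map the index to a human-readable label
    let predictionText;
    switch (classIdx) {
      case 0: predictionText = 'World'; break;
      case 1: predictionText = 'Sports'; break;
      case 2: predictionText = 'Business'; break;
      case 3: predictionText = 'Sci/Tech'; break;
    }
    // print the result in the "prediction" element of the page
    document.getElementById('prediction').innerHTML = predictionText;
  });
}
```
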

First of all, tf.tidy is one of the great features of TensorFlow.js. It is a method that executes the function given as its argument and, after the execution, cleans up all intermediate tensors allocated in the process, except those returned by the function. Such behaviour is really important, especially in a web context where you know nothing (or do not want to think) about the infrastructure that will host your page.

The first part of the function makes the actual prediction. First, it takes the input text from the webpage’s input box and stores it in a string variable we called (in an effort of fantasy) text. The text is then tokenised and given to the model, which makes the prediction. Of course, we get a vector of probabilities (recall we defined a neural network with a softmax activation at the output layer), so we use argmax to get the index of the maximum probability.

At this stage, we just need to print the prediction on screen. The most immediate way to achieve this is to define a variable, populated according to the predicted value, holding the text to be printed. This is precisely what the switch statement does.

The only thing we are left with is to print the predictionText in the HTML element we called “prediction”. That’s all!

Now you have your classifier hosted on a webpage. To publish it, you need a web hosting service; you can pay for a domain or go for a free one. My suggestion is the free GitHub Pages, if you do not need anything too fancy (and the webpage we just built is not).

That’s all. Tired, right? 😴

We are at the end of this long post.

I hope you enjoyed it.

Image on creative commons by giphy.com

I think the integration between various environments is interesting and can inspire nice applications with machine learning web interfaces. Here we just built a predictor, but with TensorFlow.js you can also train a whole model in the browser, do transfer learning, and even save and download your model weights. I leave you with another example, making use of your webcam to perform hand gesture recognition, here. Have fun!

IMPORTANT: I publish this post also to get suggestions, start discussions, and be made aware of weak points in my coding. Please signal any mistakes/redundant code! 😫

