Creating a Multilabel Neural Network Classifier with TensorFlow 2.0 and Keras

November 16, 2020 by Chris

Neural networks can be used for a variety of purposes. One of them is what we call multilabel classification: creating a classifier where the outcome is not one out of multiple, but some out of multiple labels. An example of multilabel classification in the real world is tagging: for example, attaching multiple categories (or 'tags') to a news article. But many more exist.

There are many ways in which multilabel classifiers can be constructed. In other articles, we have seen how to construct them with Support Vector Machines. But in this article, we're going to use neural networks for that purpose. It is structured as follows. Firstly, we'll take a more detailed look at multilabel classification. What is it? How does it work? We're going to use an assembly line setting to demonstrate it conceptually.

Subsequently, we're going to continue in a more practical way - by introducing how neural networks can be used for multilabel classification. Using the bias-variance tradeoff, we will look at pros and cons of using them for creating a multilabel classifier. Once this is complete, we do the real work: using a step-by-step example, we're going to build a multilabel classifier ourselves, using TensorFlow and Keras.

Let's get to work! :)

What is multilabel classification?

Suppose that we are observing someone who is working in a factory. It's their task to monitor an assembly line for new objects. Once a new object appears, they must attach a label to the object about its size as well as its shape. Subsequently, the objects must be stored in a bucket, which can then be transported away.

This is classification, and to be more precise it is an instance of multilabel classification: each object receives more than one label at once (here, one for its size and one for its shape).

In machine learning, multi-label classification and the strongly related problem of multi-output classification are variants of the classification problem where multiple labels may be assigned to each instance.

Wikipedia (2006)

Formally, multi-label classification is the problem of finding a model that maps inputs x to binary vectors y (assigning a value of 0 or 1 for each element (label) in y).

Wikipedia (2006)
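To make the formal definition a bit more concrete, here is a minimal sketch that encodes the size and shape labels from the assembly line example as binary vectors. The label names and the helper to_binary_vector are hypothetical and were chosen only for illustration; they do not come from a library.

```python
# Hypothetical label vocabulary for the assembly line example (an assumption
# made for this sketch): two size labels and two shape labels.
LABELS = ["small", "large", "round", "square"]


def to_binary_vector(tags, labels=LABELS):
    """Return a 0/1 list with a 1 at the position of every label in `tags`."""
    return [1 if label in tags else 0 for label in labels]


# Each object on the line carries several labels at the same time.
objects = [
    {"small", "round"},
    {"large", "square"},
    {"large", "round"},
]

# One binary target vector per object: this is the y that a model must learn.
y = [to_binary_vector(tags) for tags in objects]

for tags, vector in zip(objects, y):
    print(sorted(tags), "->", vector)
```

Every row may contain more than one 1, which is exactly what separates a multilabel target from a one-hot multiclass target.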

Visually, this looks as follows:

Using Neural Networks for Multilabel Classification: the pros and cons

Neural networks are a popular class of Machine Learning algorithms that are widely used today. They are composed of stacks of neurons called layers: every network has an Input layer (where data is fed into the model) and an Output layer (where a prediction is output). In between, there are (often many) Hidden layers, which are responsible for capturing patterns from the data - providing the predictive capabilities that eventually result in a prediction for some input sample.

Today, in Deep Learning, neural networks have very deep architectures - partially thanks to the advances in compute power and the cloud. Having such deep architectures allows neural networks to learn many patterns, both abstract and detailed, meaning that since their rise Machine Learning models can be trained and applied in a wide variety of situations.

Among them, multilabel classification.

Nevertheless, if we want to use Neural networks for any classification or regression task - and hence also multilabel classification - we must also take a look at the pros and cons. These can be captured by looking at them in terms of the bias-variance tradeoff.

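Bias is the error that comes from a model being too rigid to capture the true pattern, while variance is the error that comes from a model being overly sensitive to the particular training samples it has seen. Funnily, the two are connected in a tradeoff: if your model has high bias, variance is often relatively low due to the rigidity of the function learned. If variance is high, meaning that small changes in the training data significantly change the function learned, then the function cannot be very rigid, and hence bias is low.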
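The tradeoff can also be illustrated numerically. The following is a minimal sketch that is not part of the original experiment: it assumes a sine curve as the true function, Gaussian noise, and polynomial regression with NumPy as a stand-in for a rigid model (degree 1) and a flexible model (degree 8). All names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed inputs; only the noise on the targets differs between datasets.
x_train = np.linspace(-1, 1, 20)
x_test = np.linspace(-1, 1, 50)


def true_fn(x):
    """Assumed ground-truth function for this sketch."""
    return np.sin(np.pi * x)


def bias_variance(degree, n_datasets=200, noise_std=0.3):
    """Estimate squared bias and variance of a polynomial fit of `degree`."""
    preds = np.empty((n_datasets, x_test.size))
    for i in range(n_datasets):
        # Draw a fresh noisy training set and refit the model on it.
        y_train = true_fn(x_train) + rng.normal(0.0, noise_std, x_train.size)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[i] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)
    # Squared bias: how far the average prediction is from the truth.
    bias_sq = float(np.mean((mean_pred - true_fn(x_test)) ** 2))
    # Variance: how much predictions move between training sets.
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance


results = {degree: bias_variance(degree) for degree in (1, 8)}
for degree, (bias_sq, variance) in results.items():
    print(f"degree={degree}: bias^2={bias_sq:.4f}, variance={variance:.4f}")
```

In this setup the straight line should show the larger squared bias, and the degree 8 polynomial the larger variance, which mirrors the rigid versus flexible distinction described above.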

If we want to use neural networks for multilabel classification, we must take this into account. Through nonlinear activation functions like ReLU, neural networks are systems of neurons that can approximate a very wide range of functions. This means that their bias is low - a nonlinear neural network is not rigid. However, it also makes them susceptible to variance: small changes in the dataset may trigger significant changes in the patterns learned. In other words, if you have a small dataset or already expect that your input data follows some specific function, you might also consider multilabel classification with other models, such as SVMs. In other cases, neural networks can definitely be useful.

Now that we know about Neural networks for multilabel classification, let's see if we can create one with TensorFlow and Keras.

Creating a Multilabel Classifier with TensorFlow and Keras

Creating a multilabel classifier with TensorFlow and Keras is easy. In fact, it is not so different from creating a regular classifier - except for a few minor details. Let's take a look at the steps required to create the dataset, and the Python code necessary for doing so.
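Judging from the model code in this section, two of those details are the Sigmoid activation in the output layer and the binary crossentropy loss. A multiclass network usually ends in Softmax, whose outputs sum to one so that the classes compete; Sigmoid scores every label independently, so several labels can be "on" at once. The sketch below shows the difference with plain NumPy. The logits and targets are made-up values, and the helper functions are simplified re-implementations for illustration, not the Keras internals.

```python
import numpy as np


def sigmoid(z):
    """Squash each logit independently into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Turn logits into probabilities that sum to one."""
    exp = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return exp / exp.sum()


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of the per-label binary crossentropy terms (simplified sketch)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))


# Made-up raw outputs of a final Dense layer for one sample with 3 labels.
logits = np.array([2.0, 1.5, -3.0])
y_true = np.array([1, 1, 0])  # the sample carries the first two labels

sigmoid_probs = sigmoid(logits)
softmax_probs = softmax(logits)

print("sigmoid:", sigmoid_probs, "labels above 0.5:", int((sigmoid_probs > 0.5).sum()))
print("softmax:", softmax_probs, "labels above 0.5:", int((softmax_probs > 0.5).sum()))
print("binary crossentropy on sigmoid outputs:", binary_crossentropy(y_true, sigmoid_probs))
```

With these values, Sigmoid lets both of the first two labels pass a 0.5 threshold, whereas Softmax can push at most one label above 0.5 because its outputs share a total of one.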

Here is the full Python code that results from these steps:

# Imports
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.optimizers import Adam

# Configuration options
n_samples = 10000
n_features = 6
n_classes = 3
n_labels = 2
n_epochs = 50
random_state = 42
batch_size = 250
verbosity = 1
validation_split = 0.2

# Create dataset
X, y = make_multilabel_classification(n_samples=n_samples, n_features=n_features, n_classes=n_classes, n_labels=n_labels, random_state=random_state)

# Split into training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=random_state)

# Create the model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=n_features))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(n_classes, activation='sigmoid'))

# Compile the model
model.compile(loss=binary_crossentropy,
              optimizer=Adam(),
              metrics=['accuracy'])

# Fit data to model
model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=n_epochs,
          verbose=verbosity,
          validation_split=validation_split)

# Generate generalization metrics
score = model.evaluate(X_test, y_test, verbose=0)
print(f'Test loss: {score[0]} / Test accuracy: {score[1]}')

Running it gives the following performance:

Test loss: 0.30817817240050344 / Test accuracy: 0.8562628030776978
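A trained model of this kind outputs one probability per label, so a threshold is needed to turn those probabilities into label assignments. The sketch below uses made-up probabilities in place of real model predictions, and a threshold of 0.5, which is a common default but still an assumption. It also contrasts two ways of counting accuracy for multilabel outputs. How Keras resolves the 'accuracy' string can depend on the version and on the shapes involved, so this sketch does not claim to reproduce the number reported above.

```python
import numpy as np

# Made-up ground truth and predicted probabilities for 4 samples and 3 labels.
# With a real model, y_prob would come from something like model.predict(X_test).
y_true = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])
y_prob = np.array([
    [0.9, 0.2, 0.7],
    [0.1, 0.8, 0.6],
    [0.7, 0.4, 0.3],
    [0.2, 0.1, 0.9],
])

# Assumed decision threshold: a label is assigned when its probability >= 0.5.
threshold = 0.5
y_pred = (y_prob >= threshold).astype(int)

# Per-label accuracy: fraction of individual label decisions that are correct.
per_label_accuracy = float((y_pred == y_true).mean())

# Exact-match (subset) accuracy: a sample counts only if all labels are right.
exact_match_accuracy = float((y_pred == y_true).all(axis=1).mean())

print("predicted labels:\n", y_pred)
print("per-label accuracy:", per_label_accuracy)
print("exact-match accuracy:", exact_match_accuracy)
```

Exact-match accuracy can never exceed per-label accuracy, so it is worth knowing which one a reported score refers to before comparing models.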

Summary

In this article, we looked at creating a multilabel classifier with TensorFlow and Keras. For doing so, we first looked at what multilabel classification is: assigning multiple classes, or labels, to an input sample. This is clearly different from binary and multiclass classification, which we may already be used to.

We also looked at how Neural networks can be used for multilabel classification in general. More specifically, we looked at the bias-variance tradeoff, and provided a few suggestions when to use Neural networks for the task, or when it can be useful to look at other approaches first.

Subsequently, we moved forward and provided a step-by-step example of creating a neural network for multilabel classification. We used the TensorFlow and Keras libraries for doing so, and generated a multilabel dataset using Scikit-learn. On the test set, the model reached an accuracy of roughly 0.86 with a loss of about 0.31.

I hope that you have learned something from today's article! If you did, please feel free to leave a comment in the comments section below 💬 Please do the same if you have questions or other remarks, or even suggestions for improvement. I'd love to hear from you and will happily adapt my post when necessary. Thank you for reading MachineCurve today and happy engineering! 😎

References

TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.

Wikipedia. (2006, October 16). Multi-label classification. Wikipedia, the free encyclopedia. Retrieved November 16, 2020, from https://en.wikipedia.org/wiki/Multi-label_classification

MachineCurve. (2020, November 2). Machine learning error: Bias, variance and irreducible error with Python. https://www.machinecurve.com/index.php/2020/11/02/machine-learning-error-bias-variance-and-irreducible-error-with-python/
