Tutorial: How to deploy your ConvNet classifier with Keras and FastAPI

Last Updated on 28 April 2020

Training machine learning models is fun – but what if you found a model that really works? You’d love to deploy it into production, so that others can use it.

In today’s blog post, we’ll show you how to do this for a ConvNet classifier using Keras and FastAPI. We begin with the software dependencies that we need, then walk through today’s model code, and finally show you how to run the deployed model.

Are you ready? Let’s go! 🙂



Software dependencies

In order to complete today’s tutorial successfully, and be able to run the model, it’s key that you install these software dependencies:

  • FastAPI
  • Pillow
  • Pydantic
  • TensorFlow 2.0+
  • Numpy

Let’s take a look at the dependencies first.

FastAPI

With FastAPI, we’ll be building the groundwork for the machine learning model deployment.

What is it? Simple:

FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.6+ based on standard Python type hints.

FastAPI. (n.d.). https://fastapi.tiangolo.com/

With the framework, we can build a web service that accepts requests over HTTP, allows us to receive inputs, and subsequently send the machine learning prediction as the response.

Installation goes through pip, with pip install fastapi. You’ll also need an ASGI (Asynchronous Server Gateway Interface) server, such as uvicorn: pip install uvicorn.
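To get a feel for the framework before we build the real thing, here is a minimal, self-contained sketch (the file name and route are purely illustrative, not part of today’s deployment code):

# minimal_app.py – a tiny FastAPI example, just to illustrate the framework
from fastapi import FastAPI

app = FastAPI()

@app.get('/hello')
def hello(name: str = 'world'):
    # FastAPI parses and validates the query parameter based on the type hint
    return {'message': f'Hello, {name}!'}

You could run this with uvicorn minimal_app:app --reload and visit http://127.0.0.1:8000/hello?name=Keras to see the JSON response.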

Pillow

Then Pillow:

Pillow is the friendly PIL fork by Alex Clark and Contributors. PIL is the Python Imaging Library by Fredrik Lundh and Contributors.

Pillow — Pillow (PIL Fork) documentation. (n.d.). https://pillow.readthedocs.io/en/3.1.x/index.html

We can use Pillow to manipulate images – which is what we’ll do, as the inputs for our ConvNet are images. Installation, once again, goes through pip:

pip install Pillow
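As a quick, hedged preview of what we’ll use Pillow for later on (the file name below is just an example), opening, resizing and converting an image takes only a few lines:

# pillow_example.py – a small sketch of the Pillow operations we will use later
from PIL import Image

pil_image = Image.open('some_image.png')     # open an image file (example path)
pil_image = pil_image.resize((28, 28))       # resize to the shape our ConvNet expects
if pil_image.mode == 'RGBA':
    pil_image = pil_image.convert('RGB')     # drop the alpha channel if there is one
pil_image = pil_image.convert('L')           # convert to grayscale, if the model needs it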

Pydantic

Now, the fun thing with web APIs is that you can send pretty much anything to them. For example, if you make any call (whether it’s a GET one with parameters or a PUT, POST or DELETE one with a body), you can send any data along with your request.

Now, the bad thing about this flexibility is that people may send data that is incomprehensible to the machine learning model. For example, it wouldn’t work if text was sent instead of an image, or if the image was sent in the wrong way.

Pydantic comes to the rescue here:

Data validation and settings management using python type annotations.

Pydantic. (n.d.). https://pydantic-docs.helpmanual.io/

With this library, we can check whether all data is ok 🙂
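As a small, purely illustrative sketch (the Item class below is not part of today’s code), this is roughly what Pydantic validation looks like:

# pydantic_example.py – a tiny illustration of Pydantic validation
from pydantic import BaseModel, ValidationError

class Item(BaseModel):
    name: str
    price: float

try:
    Item(name='MNIST sample', price='not a number')  # wrong type on purpose
except ValidationError as e:
    print(e)  # Pydantic tells us exactly which field failed validation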

TensorFlow 2.0+

The need for TensorFlow is obvious – we’re deploying a machine learning model.

What’s more, we need TensorFlow 2.0+ because of its deep integration with modern Keras, as the model that we’ll deploy is a Keras based one.

Fortunately, installing TensorFlow is easy – especially when you’re running it on your CPU. Click here to find out how.

Numpy

Now, last but not least, Numpy. As we all know what it is and what it does, I won’t explain it here 🙂 We’ll use it for data processing.


Today’s code

Next up: the code for today’s machine learning model deployment 🦾 It consists of three main parts:

  • Importing all the necessary libraries.
  • Loading the model and getting the input shape.
  • Building the FastAPI app.

The latter is split into three sub-stages:

  • Defining the Response.
  • Defining the main route.
  • Defining the /prediction route.

Ready? Let’s go! 🙂 Create a Python file, such as main.py, on your system, and open it in a code editor. Now, we’ll start writing some code 🙂

Just a break: what you’ll have to do before you go further

We don’t want to interrupt, but there are two things that you’ll have to do before you actually build your API:

  • Train a machine learning model with Keras, for example with the MNIST dataset (we assume that your ML model handles the MNIST dataset from now on, but this doesn’t really matter, as the API works with all kinds of CNNs).
  • Save the model instance, so that you can load it later – find out how here. (A minimal sketch of both steps follows right after this list.)
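If you don’t have such a model yet, here is a minimal, hedged sketch of what training and saving an MNIST ConvNet could look like – your own architecture, hyperparameters and number of epochs will likely differ; the only thing that matters for this tutorial is that a model ends up in the ./saved_model folder:

# train_and_save.py – a minimal MNIST ConvNet, trained briefly and saved for deployment
import tensorflow as tf
from tensorflow.keras import layers, models

# Load and scale the MNIST data
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = (x_train / 255.0).reshape((-1, 28, 28, 1))

# A very small ConvNet – just enough to have something to deploy
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1, batch_size=128)  # train briefly; use more epochs for a real model

# Save in the TensorFlow SavedModel format, in the folder our API will load from
model.save('./saved_model')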

Model imports

The first thing to do is to state all the model imports:

# Imports
from fastapi import FastAPI, File, UploadFile, HTTPException
from PIL import Image
from pydantic import BaseModel
from tensorflow.keras.models import load_model
from typing import List
import io
import numpy as np
import sys

Obviously, we’ll need parts from FastAPI, PIL (Pillow), pydantic and tensorflow, as well as numpy. But we’ll also need a few other things:

  • For the list data type, we’ll use typing
  • For input/output operations (specifically, byte I/O), we’ll be using io
  • Finally, we’ll need sys – for reading exception information when something goes wrong.

Loading the model and getting input shape

Next, we load the model:

# Load the model
filepath = './saved_model'
model = load_model(filepath, compile = True)

This assumes that your model is saved in the new TensorFlow 2.0 SavedModel format. If it’s not, click the link above – the article also covers the older format, and loading works in exactly the same way here.
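For instance, if your model lives in a single HDF5 file instead of a SavedModel folder, only the path changes (the file name below is just an example):

# Loading a model stored as a single HDF5 file instead (example file name)
model = load_model('./my_model.h5', compile = True)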

Then, we get the input shape as expected by the model:

# Get the input shape for the model layer
input_shape = model.layers[0].input_shape

That is, we wish to know what the model expects – so that we can transform any inputs into this shape. We do so by studying the input_shape of the first (i = 0) layer of our model.
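For a Sequential model whose first layer was built with input_shape=(28, 28, 1), like the MNIST sketch earlier, printing this value would give something like the following – the exact numbers depend entirely on your own model:

print(input_shape)
# (None, 28, 28, 1): batch dimension, the two image dimensions, and the number of channels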

Building the FastAPI app

On to the third and final part! Time to build the actual groundwork. First, let’s define the FastAPI app:

# Define the FastAPI app
app = FastAPI()

Defining the Response

Then, we can define the Response – or the output that we’ll serve if people trigger our web service once it’s live. It looks like this:

# Define the Response
class Prediction(BaseModel):
    filename: str
    contenttype: str
    prediction: List[float] = []
    likely_class: int

It contains four parts:

  • filename: the name of the uploaded file.
  • contenttype: the content type of the uploaded file (e.g. image/png).
  • prediction: the list of class probabilities generated by the model.
  • likely_class: the index of the most likely class.

Defining the main route

Now, we’ll define the main route – that is, when people navigate to your web API directly, without going to the /prediction route. It’s a very simple piece of code:

# Define the main route
@app.get('/')
def root_route():
    return { 'error': 'Use POST /prediction instead of the root route!' }

It simply tells people to use the correct route.

Defining the /prediction route

The /prediction route is a slightly longer one:

# Define the /prediction route
@app.post('/prediction/', response_model=Prediction)
async def prediction_route(file: UploadFile = File(...)):

    # Ensure that this is an image
    if file.content_type.startswith('image/') is False:
        raise HTTPException(status_code=400, detail=f'File \'{file.filename}\' is not an image.')

    try:
        # Read image contents
        contents = await file.read()
        pil_image = Image.open(io.BytesIO(contents))

        # Resize image to expected input shape
        pil_image = pil_image.resize((input_shape[1], input_shape[2]))

        # Convert from RGBA to RGB *to avoid alpha channels*
        if pil_image.mode == 'RGBA':
            pil_image = pil_image.convert('RGB')

        # Convert image into grayscale *if expected*
        if input_shape[3] and input_shape[3] == 1:
            pil_image = pil_image.convert('L')

        # Convert image into numpy format
        numpy_image = np.array(pil_image).reshape((input_shape[1], input_shape[2], input_shape[3]))

        # Scale data (depending on your model)
        numpy_image = numpy_image / 255

        # Generate prediction
        prediction_array = np.array([numpy_image])
        predictions = model.predict(prediction_array)
        prediction = predictions[0]
        likely_class = np.argmax(prediction)

        return {
            'filename': file.filename,
            'contenttype': file.content_type,
            'prediction': prediction.tolist(),
            'likely_class': likely_class
        }
    except:
        e = sys.exc_info()[1]
        raise HTTPException(status_code=500, detail=str(e))

Let’s break it into pieces:

  • We define the route and the response model, and specify as the parameter that a File can be uploaded into the attribute file.
  • Next, we check the content type of the file – to ensure that it’s an image (all image content types start with image/, like image/png). If it’s not, we throw an error – HTTP 400 Bad Request.
  • Then, we open up a try/except block, where if anything goes wrong, the error will be caught gracefully and sent back as a Response (HTTP 500 Internal Server Error).
  • In the try/except block, we first read the contents of the image – into a BytesIO buffer, which acts as temporary byte storage. We can feed this to Image from Pillow, allowing us to actually open the image sent over the network and manipulate it programmatically.
  • Once it’s opened, we resize the image so that it meets the input_shape of our model.
  • Then, we convert the image into RGB if it’s RGBA, to avoid alpha channels (our model hasn’t been trained for this).
  • If required by the ML model, we convert the image into grayscale.
  • Then, we convert it into Numpy format, so that we can manipulate it, and scale the image (this is dependent on your model! As we scaled the data before training, we need to do so here too, or the predictions will be way off).
  • Finally, we can generate a prediction and return the Response in the format that we specified.

Running the deployed model

That’s it already! Now, open up a terminal, navigate to the folder where your main.py file is stored, and run uvicorn main:app --reload:

INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [8960]
2020-03-19 20:40:21.560436: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-03-19 20:40:25.858542: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-03-19 20:40:26.763790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: GeForce GTX 1050 Ti with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.4175 pciBusID: 0000:01:00.0
2020-03-19 20:40:26.772883: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-03-19 20:40:26.780372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-03-19 20:40:26.787714: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-03-19 20:40:26.797795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: GeForce GTX 1050 Ti with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.4175 pciBusID: 0000:01:00.0
2020-03-19 20:40:26.807064: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-03-19 20:40:26.815504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-03-19 20:40:29.059590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-19 20:40:29.065990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-03-19 20:40:29.071096: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-03-19 20:40:29.076811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2998 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO: Started server process [19516]
INFO: Waiting for application startup.
INFO: Application startup complete.

Now, your API has started successfully.

Time to send a request. I’ll use Postman for this, a very useful HTTP client.
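If you’d rather script the request than click through Postman, here is a minimal sketch using the requests library (assuming the API runs locally on port 8000 and that an MNIST sample image called mnist_sample.png sits next to the script – both are just examples):

# request_example.py – send a test image to the running API (hypothetical file name)
import requests

with open('mnist_sample.png', 'rb') as f:
    # The form field must be named 'file', and the content type must start with 'image/',
    # because that is what our /prediction route checks for.
    response = requests.post(
        'http://127.0.0.1:8000/prediction/',
        files={'file': ('mnist_sample.png', f, 'image/png')}
    )
print(response.json())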

I’ll send an MNIST sample – a small PNG of a handwritten digit – as my model was trained on the MNIST dataset.

Specifying all the details – a POST request to http://127.0.0.1:8000/prediction/, with the MNIST sample attached as form data under the file key – and hitting Send…

Results in this output:

{ "filename": "mnist_sample.png", "contenttype": "image/png", "prediction": [ 0.0004434768052306026, 0.003073320258408785, 0.008758937008678913, 0.0034302924759685993, 0.0006626666290685534, 0.0021806098520755768, 0.000005191866875975393, 0.9642654657363892, 0.003465399844571948, 0.013714754022657871 ], "likely_class": 7 }
Code language: JSON / JSON with Comments (json)

Oh yeah! 🎉


Summary

In this blog post, we’ve seen how machine learning models can be deployed by means of a web based API. I hope you’ve learnt something today. If you did, please leave a comment in the comments section! 🙂

Sorry for the long delay in blogs again and happy engineering. See you soon! 😎

Full model code

If you wish to obtain the code at once, here you go:

# Imports
from fastapi import FastAPI, File, UploadFile, HTTPException
from PIL import Image
from pydantic import BaseModel
from tensorflow.keras.models import load_model
from typing import List
import io
import numpy as np
import sys

# Load the model
filepath = './saved_model'
model = load_model(filepath, compile = True)

# Get the input shape for the model layer
input_shape = model.layers[0].input_shape

# Define the FastAPI app
app = FastAPI()

# Define the Response
class Prediction(BaseModel):
    filename: str
    contenttype: str
    prediction: List[float] = []
    likely_class: int

# Define the main route
@app.get('/')
def root_route():
    return { 'error': 'Use POST /prediction instead of the root route!' }

# Define the /prediction route
@app.post('/prediction/', response_model=Prediction)
async def prediction_route(file: UploadFile = File(...)):

    # Ensure that this is an image
    if file.content_type.startswith('image/') is False:
        raise HTTPException(status_code=400, detail=f'File \'{file.filename}\' is not an image.')

    try:
        # Read image contents
        contents = await file.read()
        pil_image = Image.open(io.BytesIO(contents))

        # Resize image to expected input shape
        pil_image = pil_image.resize((input_shape[1], input_shape[2]))

        # Convert from RGBA to RGB *to avoid alpha channels*
        if pil_image.mode == 'RGBA':
            pil_image = pil_image.convert('RGB')

        # Convert image into grayscale *if expected*
        if input_shape[3] and input_shape[3] == 1:
            pil_image = pil_image.convert('L')

        # Convert image into numpy format
        numpy_image = np.array(pil_image).reshape((input_shape[1], input_shape[2], input_shape[3]))

        # Scale data (depending on your model)
        numpy_image = numpy_image / 255

        # Generate prediction
        prediction_array = np.array([numpy_image])
        predictions = model.predict(prediction_array)
        prediction = predictions[0]
        likely_class = np.argmax(prediction)

        return {
            'filename': file.filename,
            'contenttype': file.content_type,
            'prediction': prediction.tolist(),
            'likely_class': likely_class
        }
    except:
        e = sys.exc_info()[1]
        raise HTTPException(status_code=500, detail=str(e))

References

FastAPI. (n.d.). https://fastapi.tiangolo.com/

Pillow — Pillow (PIL Fork) documentation. (n.d.). https://pillow.readthedocs.io/en/3.1.x/index.html

Pydantic. (n.d.). https://pydantic-docs.helpmanual.io/

15 thoughts on “Tutorial: How to deploy your ConvNet classifier with Keras and FastAPI”

  1. Adesoji Alu

    how do i display picture alongside the predictions and likely class in fastAPI?

  2. Adesoji Alu

    Sir, if someone mistakenly uploads a different image from what you have trained your model with, how can someone write a route in fast API to return INVALID?

    1. Chris

      Hi there,

      To do that, you will need to train another model that is capable of detecting whether the image is from the expected training set. Then you have to apply that model first to every input image; only if it mentions “true”, then you can proceed with the other model. In all other cases, you would have to output ‘INVALID’.

      Best,
      Chris

  3. Adesoji Alu

    please chris how do i do this, i am confused? do i add another class to the training set and specify as invalid or ??

    1. Chris

      Hi Adesoji,

      As I wrote before, you will have to construct a new dataset with images from your target set and “fake” images — then train a binary classifier on detecting whether the image is ‘fake’ or ‘real’. It is a second model, not another class to the training set.

      Best,
      Chris

  4. Adesoji Alu

    ok. Thanks sir. it’s crystal clear

    1. Chris

      Best of luck!

      Best regards,
      Chris

  5. Adesoji Alu

    sir. I am stuck. not been able to achieve my mission.

  6. Adesoji Alu

    creating the route for the first model to detect the trained images or untrained images, then if trained, return invalid, else continue to the classifier model to predict. routing the codes is where i am havin issues

    1. Chris

      If you can share your code through an Ask Questions thread, I can take a look.

  7. Adesoji Alu

    ok sir. using fastapi . Thanks

    # Imports
    from fastapi import FastAPI, File, UploadFile, HTTPException
    from PIL import Image
    from pydantic import BaseModel
    from tensorflow.keras.models import load_model
    from typing import List
    import io
    import numpy as np
    import sys
    import matplotlib.pyplot as plt
    import cv2
    from starlette.responses import StreamingResponse

    # Load the model
    filepath1 = r"C:\Users\Sortol\Desktop\TASK" #(binary classifier model,fake or target, 0 or 1 prediction)
    filepath2 = r"C:\Users\Sortol\Desktop\TASK2" #(Target classification model for 54 classes)
    model1= load_model(filepath1, compile = False)
    model2 = load_model(filepath2, compile = False)

    import json

    with open(r"C:\Users\Sortol\Desktop\New folder (2)\categories.json", 'r') as f: #(Two labels. Fake and target)
    cat_to_name = json.load(f)
    classes1 = list(cat_to_name.values())

    print (classes1)
    print(len (classes1))

    import json

    with open(r"C:\Users\Sortol\Desktop\New folder (3)\categories.json", 'r') as g: #(54 labels: Apple_healthy leaf, Apple scab, etc)
    cat_to_name = json.load(g)
    classes2 = list(cat_to_name.values())

    # Get the input shape for the model layer
    input_shape1 = model1.layers[0].input_shape
    input_shape2 = model2.layers[0].input_shape

    # Define the FastAPI app
    app = FastAPI()

    # Define the Response
    class Prediction(BaseModel):
    filename: str
    contenttype: str
    prediction: List[float] = []
    likely_class: int
    Accuracy : float
    predicted_class : str
    : bool

    # Define the main route
    @app.get('/')
    def root_route():
    return { 'error': 'Use GET /prediction instead of the root route!' }

    # Define the /prediction route
    @app.post('/prediction/', response_model=Prediction)
    async def prediction_route(file: UploadFile = File(...)):

    # Ensure that this is an image
    if file.content_type.startswith('image/') is False:
    raise HTTPException(status_code=400, detail=f'File \'{file.filename}\' is not an image.')

    try:
    # Read image contents
    contents = await file.read()
    pil_image = Image.open(io.BytesIO(contents))

    # Resize image to expected input shape
    pil_image = pil_image.resize((input_shape1[1], input_shape1[2]))

    # Convert from RGBA to RGB *to avoid alpha channels*
    if pil_image.mode == 'RGBA':
    pil_image = pil_image.convert('RGB')

    # Convert image into grayscale *if expected*
    if input_shape1[3] and input_shape1[3] == 1:
    pil_image = pil_image.convert('L')

    # Convert image into numpy format
    numpy_image = np.array(pil_image).reshape((input_shape1[1], input_shape1[2], input_shape1[3]))

    # Scale data (depending on your model)
    numpy_image = numpy_image / 255

    # Generate prediction
    prediction_array = np.array([numpy_image])
    predictions = model1.predict(prediction_array)
    prediction = predictions[0]
    if prediction == 0 or 0.0: #(This checks wheteher the predicted value is either 0 0r 1 for the target vs fake class, if it' checks and the prediction is 0, the program ends)
    print('INVALID_PICTURE:UPLOAD THE ACCEPTED LABEL FOR INFERENCE')
    break
    elif prediction > 0: #(when the predicted value is greater than o, it will proceed to the classifier to find the exact classification in the 54 classes)
    pass
    prediction_array = np.array([numpy_image])
    predictions = model2.predict(prediction_array)
    prediction = predictions[0]

    likely_class = np.argmax(prediction)
    Accuracy = np.max(prediction)
    predicted_class = classes2[np.argmax(prediction)]

    return {
    'filename': file.filename,
    'contenttype': file.content_type,
    'prediction': prediction.tolist(),
    'likely_class': likely_class,
    'predicted_class': predicted_class,
    'Accuracy' : Accuracy,
    #'im' : im
    }
    except:
    e = sys.exc_info()[1]
    raise HTTPException(status_code=500, detail=str(e))
