From Flask API to a robust container "FROM scratch"

viktor

20 Jan 2025, 10 min read

Ready to be shipped!

Summary

Preamble
Requirements
I - WSGI server compliance and HTTPS
-- 1 - Prepare your app to run with gunicorn (headless)
-- 2 - Add HTTPS
II - Compile the python code
III - Go further
-- 1 - Docker Compose
-- 2 - Development and Engineering
IV - What I learned
-- 1 - My mistakes
---- A - Building on the machine
---- B - Using a dedicated compiler container the wrong way
---- C - Mounting Volumes
---- D - File permission pain
-- 2 - Lessons Learned
Last Words

Preamble

If you want to practice first before updating your own app, you can clone my GitHub repository here. You will find instructions to initialize the project in the README.md.

Requirements

-- A Linux/macOS workstation with a working terminal
-- Git, Python and curl installed
-- Docker ready and started (daemon running)

I - WSGI server compliance and HTTPS

1 - Prepare your app to run with gunicorn (headless)

Flask's built-in server (app.run()) is meant for development. Gunicorn is a WSGI (Web Server Gateway Interface) server that can serve your Flask app in production, with enhanced security.
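To make "WSGI" concrete, here is a minimal sketch using only the standard library. A WSGI application is a callable that takes environ and start_response; a Flask app object is such a callable, and Gunicorn is the server that calls it. The names hello_app and call_wsgi are made up for this illustration and are not part of Flask or Gunicorn.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application (hypothetical example, not Flask)."""
    body = b'{"status": "ok"}'
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_wsgi(app, path="/"):
    """Call a WSGI app the way a server would, without opening a socket."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_wsgi(hello_app, "/health"))
```

Because the interface is this small, the same application object can be handed to different WSGI servers without changing the app itself.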

Install the gunicorn package:

python -m pip install gunicorn

In your main file, import BaseApplication from gunicorn.app.base as follows:

#!/usr/bin/env python
"""
Filename    : main.py
Project     : XXXXX
Nickname    : YYYYY
Description : ZZZZZZZ ZZZ ZZZZZZ
"""

from flask import Flask, request, jsonify
from gunicorn.app.base import BaseApplication

...

Still in your main file, create a class like this:

# Disable pylint's abstract-method warning (W0223) for this class
# pylint: disable=W0223
class GunicornApplication(BaseApplication):
    """Class to run the API using gunicorn WSGI"""
    def __init__(self, application, options=None):
        self.options = options or {}
        self.application = application
        super().__init__()

    def load_config(self):
        config = {key: value for key, value in self.options.items()
                  if key in self.cfg.settings and value is not None}
        for key, value in config.items():
            self.cfg.set(key.lower(), value)

    def load(self):
        return self.application
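The dictionary comprehension in load_config silently drops any option that Gunicorn does not know or whose value is None. Here is a small standalone sketch of that filtering, assuming a hand-written KNOWN_SETTINGS set as a stand-in for self.cfg.settings (the real set is much larger); filter_options is a hypothetical helper, not a Gunicorn function.

```python
# Stand-in for self.cfg.settings: a tiny subset of Gunicorn setting names.
KNOWN_SETTINGS = {"bind", "workers", "certfile", "keyfile"}


def filter_options(options, known_settings):
    """Keep only known options whose value is not None, with lowercased keys."""
    return {key.lower(): value for key, value in options.items()
            if key in known_settings and value is not None}


if __name__ == "__main__":
    options = {
        "bind": "0.0.0.0:5001",
        "workers": 1,
        "certfile": None,        # dropped: value is None
        "not_a_setting": True,   # dropped: unknown key
    }
    print(filter_options(options, KNOWN_SETTINGS))
```

One consequence worth keeping in mind: a misspelled option name is ignored rather than reported, so a typo in the options dictionary will not raise an error.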

Now, instead of calling app.run(), for example like this:

    app.run(host="127.0.0.1", port=5001)

create a dictionary with your Gunicorn options and run your newly created class. Also change the listening IP address:

    # app.run(host="127.0.0.1", port=5001)
    options = {
        # Change from 127.0.0.1 (localhost) to 0.0.0.0 to listen on
        # every interface; this is required inside a container.
        "bind": "0.0.0.0:5001",
        "workers": 1
    }
    GunicornApplication(application=app, options=options).run()

Start your API and test it again. Everything should work as before.
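The difference between the two bind addresses can be checked with the standard ipaddress module. This is only a sketch for IPv4 "host:port" strings; describe_bind is a hypothetical helper. Inside a container, 127.0.0.1 is the container's own loopback interface, so a port published with --publish cannot reach an app bound to it.

```python
import ipaddress


def describe_bind(bind):
    """Classify the host part of an IPv4 'host:port' bind string."""
    host, _, port = bind.rpartition(":")
    address = ipaddress.ip_address(host)
    if address.is_unspecified:
        scope = "all interfaces"
    elif address.is_loopback:
        scope = "loopback only"
    else:
        scope = "one specific interface"
    return scope, int(port)


if __name__ == "__main__":
    for bind in ("127.0.0.1:5001", "0.0.0.0:5001"):
        print(bind, "->", describe_bind(bind))
```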

2 - Add HTTPS

Create a certificate in the same folder as your main file, using the command below. It will prompt you with a few questions to build the certificate. The most important one is Common Name (e.g. server FQDN or YOUR name) []: this is where you set the domain name for your certificate. You could use demo-api.local, which is safe to use because it shouldn't leave your private network.

openssl req -x509 -newkey rsa:4096 \
-keyout key.pem -out cert.pem \
-days 365 -nodes

With the command above, the certificate is valid for 1 year (-days 365); after that you should renew it. The -nodes option leaves the private key unencrypted, so keep key.pem private.
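To keep an eye on the renewal date, you can read the expiry with openssl x509 -enddate -noout -in cert.pem, which prints a line such as notAfter=Jan 20 10:00:00 2026 GMT (the date shown here is only an example). The sketch below turns such a date into a number of remaining days using ssl.cert_time_to_seconds from the standard library; days_until_expiry is a hypothetical helper.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now_seconds):
    """Whole days between now_seconds (Unix time) and a notAfter date string.

    not_after uses the OpenSSL format, e.g. "Jan 20 10:00:00 2026 GMT".
    The result is negative once the certificate has expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now_seconds) // SECONDS_PER_DAY)


if __name__ == "__main__":
    # Example date only; paste the value printed by openssl instead.
    print(days_until_expiry("Jan 20 10:00:00 2026 GMT", time.time()))
```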

Add the certificate and key to your Gunicorn options:

    options = {
        "bind": "0.0.0.0:5001",
        "certfile": "./cert.pem",
        "keyfile": "./key.pem",
        "workers": 1
    }

Start your API.

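Now test your API over HTTPS. Don't forget to set your domain name (replace THE_DOMAIN_YOU_SET) and adjust the port if needed. The --resolve option makes curl resolve that domain to 127.0.0.1 without touching DNS or /etc/hosts.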

curl --cacert path/to/cert.pem \
'https://THE_DOMAIN_YOU_SET:5001/YOUR_API_PATH' \
--resolve "THE_DOMAIN_YOU_SET:5001:127.0.0.1"

If you use my repository and set your domain name to demo-api.local, you can test the two API endpoints with the following commands:

curl --cacert cert.pem \
'https://demo-api.local:5001/' \
--resolve "demo-api.local:5001:127.0.0.1"

And :

curl --cacert cert.pem \
'https://demo-api.local:5001/health' \
--resolve "demo-api.local:5001:127.0.0.1"

II - Compile the python code

Create a Dockerfile with the following content:

# Stage 1: Build stage
FROM python:3.12-bookworm AS builder

WORKDIR /work

# Create an empty directory to use as /tmp in the final container
RUN mkdir /new_empty_dir
# Update the build container and install patchelf (used by staticx)
RUN apt-get update -y && apt-get upgrade -y && apt-get install -y patchelf
# Install the tools used to compile and bundle the Python app
RUN pip install pyinstaller staticx

# Copy the source code
COPY requirements.txt requirements.txt
COPY main.py main.py

# Install the dependencies, then compile the app into a single static binary
RUN pip install -r requirements.txt
RUN pyinstaller --hidden-import gunicorn.glogging --hidden-import gunicorn.workers.sync -F main.py -n app_unpackaged.elf
RUN staticx --strip dist/app_unpackaged.elf dist/app.elf

# Stage 2: Final container
FROM scratch

USER 65535

WORKDIR /app/

COPY --chown=65535:65535 --from=builder /work/dist/app.elf /app/app.elf
COPY --chown=65535:65535 --from=builder /new_empty_dir /tmp

ENTRYPOINT ["/app/app.elf"]
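The two --hidden-import flags in the pyinstaller command are most likely needed because Gunicorn loads its logger and worker classes from dotted names at runtime, which a tool that only scans import statements cannot see. Here is a minimal sketch of that pattern with a standard-library class; load_class below is an illustrative helper written for this article, not Gunicorn's own code.

```python
import importlib


def load_class(dotted_path):
    """Import 'package.module.ClassName' from a string at runtime."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


if __name__ == "__main__":
    # No "import json.decoder" statement appears anywhere in this file,
    # so static analysis of the imports alone would miss that module.
    decoder_class = load_class("json.decoder.JSONDecoder")
    print(decoder_class.__name__)
```

If your own app loads other modules this way, they may need their own --hidden-import flag as well.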

Before building, make sure you have a requirements.txt file. To create it, activate your venv and run the following command:

pip freeze > requirements.txt

The Dockerfile runs the app as user ID 65535 so that the container can run without root privileges. A scratch image has no /etc/passwd, so no named user is created; only the numeric ID is used.

To share the certificate with your app, you therefore need to chown your certificate and key files. This command requires root privileges; you can use it like this:

sudo chown -R 65535 ./*.pem

This command changes the owner of the files to user ID 65535, so that the user running in the final container can read them. If your app requires a configuration file or any other file, don't forget to chown it too so that it can be read from inside the container.

Then change the permissions of the certificate files to read-only for user 65535 and for the associated group:

sudo chmod 440 ./*.pem
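To see what mode 440 grants, the standard stat module can render it the way ls -l does. The can_read function is a deliberately simplified, hypothetical model of the POSIX read check (owner bits first, then group, then others); it ignores root's override, ACLs and similar mechanisms.

```python
import stat


def can_read(mode, file_uid, file_gid, uid, gids):
    """Simplified POSIX read check for a process with the given uid and gids."""
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)   # owner: only the owner bit counts
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)   # group member: group bit
    return bool(mode & stat.S_IROTH)       # everyone else: other bit


if __name__ == "__main__":
    print(stat.filemode(stat.S_IFREG | 0o440))  # -r--r-----
    # File owned by uid 65535, read by the container user (uid 65535).
    print(can_read(0o440, 65535, 1000, 65535, {0}))
```

This also shows why the chown step matters: if the file still belongs to your own user and the container user is not in its group, neither 400 nor 440 gives uid 65535 read access.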

Build your final container :

docker build --tag app .

Run it, mounting your certificate files:

docker run \
--mount type=bind,source=$(pwd)/cert.pem,target=/app/cert.pem,readonly \
--mount type=bind,source=$(pwd)/key.pem,target=/app/key.pem,readonly \
--publish 5001:5001 app

Test your API again, this time served from the scratch container:

curl --cacert path/to/cert.pem \
'https://THE_DOMAIN_YOU_SET:5001/YOUR_API_PATH' \
--resolve "THE_DOMAIN_YOU_SET:5001:127.0.0.1"

If you use my repository and set your domain name to demo-api.local, you can test the two API endpoints with the following commands:

curl --cacert cert.pem \
'https://demo-api.local:5001/' \
--resolve "demo-api.local:5001:127.0.0.1"

And :

curl --cacert cert.pem \
'https://demo-api.local:5001/health' \
--resolve "demo-api.local:5001:127.0.0.1"

III - Go further

1 - Docker Compose

Use a compose.yml file to harden your container deployment:

-- Set resource quotas (CPU, RAM, disk) to limit the impact of DoS attacks. To find out what your app requires, you can use the docker stats command.
-- Set the container and its volumes to read-only if no writes are required.
-- Use a dedicated network to reduce lateral movement.
-- Set a health check to verify that your app is running.
-- Reduce capabilities.
-- Use a seccomp profile.

Here is an example of a docker-compose.yml with some hardening :

version: '3.9'
services:
    app:
        container_name: gunicorn_API
        build:
            context: .
        logging:
            options:
                max-size: "10m"
                max-file: "3"
        deploy:
            resources:
                limits:
                    cpus: '0.1'
                    memory: 128M
                reservations:
                    cpus: '0.01'
                    memory: 64M
        restart: always
        volumes:
            - ./cert.pem:/app/cert.pem:ro
            - ./key.pem:/app/key.pem:ro
        ports:
            - '5001:5001'
        cap_drop:
            - ALL
        networks:
            - app_network

networks:
    app_network:
        driver: bridge
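As a quick way to compare a service definition with the hardening list above, here is a small sketch that works on a plain Python dictionary. The dictionary is written by hand to mirror the example file (loading the YAML itself would need a third-party parser, which is not used here), and missing_hardening is a hypothetical helper, not a Docker tool. The keys it looks for (read_only, healthcheck, cap_drop, security_opt, networks, deploy.resources.limits) are Compose service options.

```python
def missing_hardening(service):
    """Return the hardening items that a Compose service dict does not set."""
    limits = service.get("deploy", {}).get("resources", {}).get("limits", {})
    checks = {
        "resource limits": bool(limits),
        "read-only root filesystem": service.get("read_only") is True,
        "dedicated network": bool(service.get("networks")),
        "health check": "healthcheck" in service,
        "dropped capabilities": "ALL" in service.get("cap_drop", []),
        "seccomp profile": any(str(option).startswith("seccomp")
                               for option in service.get("security_opt", [])),
    }
    return [name for name, present in checks.items() if not present]


if __name__ == "__main__":
    # Hand-written mirror of the "app" service from the example file.
    app_service = {
        "deploy": {"resources": {"limits": {"cpus": "0.1", "memory": "128M"}}},
        "volumes": ["./cert.pem:/app/cert.pem:ro", "./key.pem:/app/key.pem:ro"],
        "cap_drop": ["ALL"],
        "networks": ["app_network"],
    }
    print(missing_hardening(app_service))
```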

2 - Development and Engineering

Use a venv when developing in Python, to keep the final package small and avoid embedding useless modules.

Use linting/quality tools, for example pylint (pip install pylint), to keep your code enjoyable to read.
Use security tools to check that your base app is safe (for Python you can use bandit: pip install bandit).
Write tests to simplify future development and to make sure your app runs as expected.

Write good documentation (for a Flask API you could expose a Swagger endpoint using the flask-restx module) and don't forget the architecture document, with clear diagrams, describing how the app will be deployed and integrated.

Part of good API documentation is making sure all your routes are tested; you could use the Bruno API client, for example.

To choose suitable resource quotas for your container, you can run a stress test using the Locust load-testing framework.
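If you just want a first rough number before setting up Locust, a few lines of standard library can send concurrent requests while you watch docker stats. This is a stand-in sketch, not Locust and not a replacement for it; http_get and run_load are hypothetical helpers, and http_get only covers plain HTTP (for the HTTPS setup above, urlopen would also need an SSL context that trusts cert.pem).

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url, timeout=5.0):
    """Send one GET request and return the HTTP status code.

    Plain HTTP only in this sketch; urllib raises on 4xx/5xx responses.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def run_load(send, total_requests=50, concurrency=5):
    """Call send() total_requests times from `concurrency` threads."""
    def timed(_):
        start = time.perf_counter()
        status = send()
        return status, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, range(total_requests)))
    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "requests": len(results),
        "ok": sum(1 for status, _ in results if status == 200),
        "median_seconds": latencies[len(latencies) // 2],
        "max_seconds": latencies[-1],
    }
```

A call could look like run_load(lambda: http_get("http://127.0.0.1:5001/YOUR_API_PATH")), with the URL adapted to your own app. Passing the request function as a parameter keeps the timing logic independent from how the request is actually sent.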

Integrate your tests into a CI/CD pipeline to make sure you didn't miss or forget something when testing on your workstation.

IV - What I learned

1 - My mistakes

A - Building on the machine

When trying to create this container from scratch, I first compiled the Python app directly on my PC. "It works on my machine", but once containerized the app crashed.

B - Using a dedicated compiler container the wrong way

I then used a dedicated container for the build, to reduce coupling to my host machine, but the first version of this compiler container was a failure.

What I did was run the compilation in the container, mounting the current directory as /work in the container.

When building this way, pyinstaller was able to compile, but staticx wasn't able to bundle all the packages and failed with the error staticx: /tmp/staticx-pyi-3ek9mteo/base_library.zip: Invalid ELF image: Magic number does not match.

To fix this error, I shared only my source file (main.py) and requirements.txt with the container, plus a volume at ./dist to retrieve the final ELF file.
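The "Magic number does not match" message means the file handed to the ELF parser did not start with the four bytes that every ELF binary begins with; a .zip archive starts with different bytes. Why staticx ended up processing that file in this setup is not something I established. The sketch below only shows how such a check works; sniff_file_type is a hypothetical helper, not part of staticx.

```python
ELF_MAGIC = b"\x7fELF"      # first four bytes of every ELF binary
ZIP_MAGIC = b"PK\x03\x04"   # local file header of a non-empty zip archive


def sniff_file_type(path):
    """Guess a file's type from its first four bytes."""
    with open(path, "rb") as handle:
        magic = handle.read(4)
    if magic == ELF_MAGIC:
        return "elf"
    if magic == ZIP_MAGIC:
        return "zip"
    return "unknown"
```

Running it on dist/app.elf and on a zip file is a quick way to confirm which kind of file a build step actually produced.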

C - Mounting Volumes

At this point I thought my problems were solved. But after building and running my final container, the API started, and then failed with this error as soon as I sent it a request:

Traceback (most recent call last):
  File "gunicorn/workers/sync.py", line 131, in handle
  File "gunicorn/sock.py", line 228, in ssl_wrap_socket
  File "gunicorn/sock.py", line 224, in ssl_context
  File "gunicorn/config.py", line 2024, in ssl_context
  File "gunicorn/sock.py", line 218, in default_ssl_context_factory
IsADirectoryError: [Errno 21] Is a directory

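Gunicorn seemed to accept my certificate and key files (cert.pem and key.pem) at startup, but wasn't able to read them when serving the app. A likely explanation is that with -v, a source that is not a path (here the bare name cert.pem) is treated as a named volume, which is mounted as a directory. So I changed my docker run options from -v: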

docker run -v cert.pem:/app/cert.pem \
-v key.pem:/app/key.pem \
--publish 5001:5001 final-container

To --mount:

docker run \
--mount type=bind,source=$(pwd)/cert.pem,target=/app/cert.pem,readonly \
--mount type=bind,source=$(pwd)/key.pem,target=/app/key.pem,readonly \
--publish 5001:5001 app
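Both this error and the permission problem in the next section only showed up when a request arrived. Here is a minimal sketch of a pre-flight check that could be called on the certfile and keyfile paths before starting Gunicorn, so that a bad mount fails at startup with a clearer message. check_tls_file is a hypothetical helper, not something Gunicorn provides.

```python
import os


def check_tls_file(path):
    """Raise a clear error if path is not a readable regular file."""
    if os.path.isdir(path):
        raise IsADirectoryError(
            f"{path} is a directory; was it mounted as a volume instead of a file?")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path} does not exist or is not a regular file")
    if not os.access(path, os.R_OK):
        raise PermissionError(f"{path} is not readable by uid {os.getuid()}")


if __name__ == "__main__":
    for tls_path in ("./cert.pem", "./key.pem"):
        check_tls_file(tls_path)
    print("TLS files look usable")
```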

D - File permission pain

After all of this, my key file still wasn't readable, and I had to chmod it (from 400 to 440).

Traceback (most recent call last):
  File "gunicorn/workers/sync.py", line 133, in handle
  File "gunicorn/http/parser.py", line 41, in __next__
  File "gunicorn/http/message.py", line 259, in __init__
  File "gunicorn/http/message.py", line 60, in __init__
  File "gunicorn/http/message.py", line 274, in parse
  File "gunicorn/http/message.py", line 326, in read_line
  File "gunicorn/http/message.py", line 262, in get_data
  File "gunicorn/http/unreader.py", line 36, in read
  File "gunicorn/http/unreader.py", line 63, in chunk
  File "gunicorn/workers/base.py", line 204, in handle_abort
SystemExit: 1

And my API finally worked.

2 - Lessons Learned

When building containers, reduce coupling to your hardware by building all your workloads in containers. Keep interaction with your bare-metal machine/OS to a minimum.

Pay attention to permissions and to details like --mount instead of -v.

Last Words

It is common for new things to fail during the engineering process, but if you have the time, keep pushing: it will pay off.