Flask Request Lifecycle: From Socket to Response

┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐
│   Client   │──▶│ WSGI Server│──▶│  Werkzeug  │──▶│  before_   │──▶│   Route    │
│ (browser)  │   │ (gunicorn) │   │  Routing   │   │  request   │   │  Resolver  │
└────────────┘   └────────────┘   └────────────┘   └────────────┘   └─────┬──────┘
                                                                          │
                                                                          ▼
┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐
│  Response  │◀──│ teardown_  │◀──│   after_   │◀──│ View Func  │◀──│ URL → View │
│ (HTTP 200) │   │  request   │   │  request   │   │ (handler)  │   │  matched   │
└────────────┘   └────────────┘   └────────────┘   └────────────┘   └────────────┘

Right-to-left bottom row is the RESPONSE leg. Hooks fire in fixed order: after_request first, then teardown_request.
before_request: auth, request-scoped DB session, request-id setup.
after_request: CORS headers, cache headers, response logging.
teardown_request: ALWAYS runs (even on exceptions); close the DB session here.
Top row: request flowing in. Bottom row (right-to-left): response flowing out, with hooks firing in fixed order.
Flask is a Python micro-framework for building WSGI web applications. It was created by Armin Ronacher in 2010 as an April Fools' joke that turned into one of the most widely deployed Python web frameworks in production. Flask is "micro" in the sense that it ships with a minimal core — request routing, a templating engine, and a development server — and leaves persistence, authentication, migrations, forms, and admin interfaces to external extensions or the application author.
The two load-bearing dependencies are:
- Werkzeug: the WSGI toolkit that supplies the request and response objects, URL routing (Map/Rule), the debugger, and the dev server. Flask is essentially a thin, opinionated layer over Werkzeug.
- Jinja2: the templating engine behind render_template().
Flask's design philosophy is explicit over implicit: no ORM is imposed, no project layout is enforced, and there is no built-in admin. This makes it a common default for ML model serving, internal tools, and small-to-medium HTTP APIs where the cost of a full framework is not justified.
Flask is a synchronous WSGI framework. WSGI (Web Server Gateway
Interface) is specified in PEP 3333 and defines the contract between a
Python web application and an HTTP server. The contract is deliberately simple: an
application is any callable that accepts two arguments — environ (a dict of
CGI-style request variables) and start_response (a callable used to emit the
status line and headers) — and returns an iterable of bytes representing the response
body.

    def application(environ, start_response):
        status = "200 OK"
        headers = [("Content-Type", "text/plain; charset=utf-8")]
        start_response(status, headers)
        return [b"hello from raw WSGI"]

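The server's half of this contract can be sketched with the standard library alone. The snippet below is a minimal illustration, not how any particular server is implemented: `call_wsgi` is a hypothetical helper that builds an `environ` with `wsgiref`, supplies its own `start_response`, and joins the returned byte chunks.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Same raw WSGI app as in the text above.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"hello from raw WSGI"]


def call_wsgi(app, path="/"):
    """Hypothetical helper: invoke a WSGI callable the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required CGI-style keys
    environ["PATH_INFO"] = path

    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would write the status line and headers to the socket.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_wsgi(application)
print(status, headers, body)
```

Everything a WSGI server adds on top of this (socket handling, HTTP parsing, worker management) sits outside the application callable.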
The Flask object is itself a WSGI application: calling
app(environ, start_response) dispatches through Werkzeug's routing, invokes
the matched view function, converts its return value into a Response, and
serialises the result back through start_response. Any WSGI-compatible
server (gunicorn, uWSGI, waitress, mod_wsgi) can host a Flask app without
modification. In production, Flask is typically served by gunicorn behind
nginx, with multiple sync worker processes. Each process has its own
interpreter and its own GIL, so requests run in parallel across cores.
When a request reaches the Flask application object, it flows through a well-defined sequence:
1. The WSGI server calls app(environ, start_response).
2. Flask wraps environ in a Werkzeug Request object and pushes the application and request contexts, exposing current_app, g, request, and session as context-local proxies.
3. Werkzeug's MapAdapter matches the path + method against the registered url_map, producing an endpoint name and view arguments.
4. Functions registered with @app.before_request run in registration order. If one returns a non-None value, the view is skipped and that value becomes the response.
5. The matched view function is called, and its return value (str, dict, tuple, or Response) is normalised into a Response object via make_response().
6. Each @app.after_request function receives the Response and may mutate or replace it (add headers, log, etc.). If the view raised, these functions run only when the error was handled and turned into a response, so they are not the place for mandatory cleanup.
7. The final Response is called as a WSGI application itself, invoking start_response and yielding body bytes.

    import time

    from flask import Flask, request, jsonify, g

    app = Flask(__name__)

    @app.before_request
    def start_timer():
        # Stash the start time on g, the per-request scratchpad.
        g.t0 = time.perf_counter()

    @app.after_request
    def log_latency(response):
        # Log latency and expose it to the client as a response header.
        dt_ms = (time.perf_counter() - g.t0) * 1000
        app.logger.info("%s %s -> %d in %.1fms",
                        request.method, request.path, response.status_code, dt_ms)
        response.headers["X-Response-Time-ms"] = f"{dt_ms:.1f}"
        return response

    @app.post("/predict")
    def predict():
        # force=True parses the body as JSON even without a JSON Content-Type.
        payload = request.get_json(force=True)
        # model.predict(...) would go here
        return jsonify(score=0.873, label="positive")

- Application object: Flask(__name__). Holds the URL map, view registry, config, extensions, and logger. Can be constructed inside an application factory (create_app()) to support multiple configs and testing.
- Routing: declared with decorators such as @app.route("/users/<int:uid>"). Backed by Werkzeug's routing, supporting converters (int, float, uuid, path), HTTP method constraints, and URL building via url_for().
- View functions: plain functions that return a str, a dict, a tuple of (body, status, headers), or a Response. Class-based views (MethodView) are available for REST-style dispatch.
- Request context: request and session are proxies resolved against the current request context. Essential for thread-safe access under sync WSGI.
- Application context: pushed alongside every request (or manually with a with app.app_context(): block). current_app and g (a per-request scratchpad) live here.
- Templates: render_template() loads from the templates/ folder, with autoescape on for .html. Supports inheritance ({% extends %}, {% block %}), macros, and custom filters registered via @app.template_filter().
- Configuration: app.config is a dict populated from objects, env vars, or files (from_object, from_envvar, from_pyfile).
Flask, FastAPI, and Django are all mature Python web frameworks but target different problems. The table below reflects real production trade-offs, not marketing positioning.
| Dimension | Flask | FastAPI | Django |
|---|---|---|---|
| Paradigm | Sync WSGI (async partial since 2.0) | Async-first ASGI, sync also supported | Sync WSGI + native ASGI since 3.0 |
| Philosophy | Micro; bring-your-own components | Micro; Pydantic + Starlette-based | Batteries-included; ORM, admin, auth, migrations |
| Typing / validation | None built-in (use marshmallow / Flask-Smorest) | Pydantic models native; runtime validation free | Form + serializer frameworks (DRF) add it |
| OpenAPI / docs | Extension (Flask-Smorest, apispec) | Auto-generated from type hints | Via DRF + drf-spectacular |
| Throughput (sync I/O) | Good with gunicorn + many workers | Excellent under async I/O; sync is similar to Flask | Good; overhead from middleware + ORM |
| ORM | None; SQLAlchemy via Flask-SQLAlchemy | None; SQLAlchemy / SQLModel / Tortoise | Django ORM (tightly coupled, opinionated) |
| Templating | Jinja2 | Jinja2 (optional; API-first) | Django Templates (or Jinja2) |
| Best for | ML serving, small APIs, legacy/integration glue | High-concurrency APIs, typed microservices | CMS, CRUD-heavy apps with admin, server-rendered sites |
| Learning curve | Low | Low–moderate (type hints, async) | Moderate–high (framework conventions) |
Honest take: for a greenfield typed JSON API in 2026, FastAPI is the default. Flask remains strong where sync code, simple deployment, and the extension ecosystem matter more than async throughput. Django wins when you need the admin, auth, and ORM on day one.
A typical Flask use case is ML model serving behind a /predict endpoint. The work is CPU-bound inference, not async I/O, so gunicorn with N sync workers is simpler than an async stack and often at least as fast. Flask is a common choice for MLflow-style one-shot model servers and for sidecar prediction APIs.
Flask's "micro" core is viable in production only because of a mature extension ecosystem. The canonical set:
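- Flask-Migrate: Alembic-based schema migrations driven from the Flask CLI (flask db migrate, flask db upgrade).
- Flask-Login: session-based login management with @login_required, remember-me cookies, and a pluggable user loader.

    # Typical production install for a Flask API
    pip install "flask>=3.0" gunicorn \
        flask-sqlalchemy flask-migrate \
        flask-jwt-extended flask-cors flask-smorest \
        psycopg2-binary

    # Run behind gunicorn with 4 worker processes, 2 threads each
    # (with --threads above 1, gunicorn uses the gthread worker class instead of sync)
    gunicorn -w 4 --threads 2 -b 0.0.0.0:8000 "app:create_app()"
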
- Partial async. Flask 2.0 added async def view support, but the core is still WSGI: every async view is run inside a per-request event loop by a sync worker. There is no free lunch versus an ASGI-native framework. High-concurrency async workloads belong on Starlette, FastAPI, or Quart (an ASGI reimplementation of the Flask API).
- No built-in validation. Request JSON arrives as a plain dict; validation is the developer's problem unless an extension like Flask-Smorest or a library like pydantic is added. In a typed codebase this feels archaic next to FastAPI's built-in Pydantic integration.
- Implicit globals. request, g, current_app, and session are context-locals: convenient, but they complicate testing and make dependency wiring less explicit than FastAPI's Depends() system.
None of these makes Flask wrong; they mark out where it is and is not the right
tool. For an ML-engineer workflow of "load a model, expose /predict, deploy
with gunicorn behind nginx, done," Flask is still one of the most operationally
predictable choices in the Python ecosystem.