As of October 18, 2022. For V2 version of the site.

1 Intro
- 1.1 Legend
2 1. Codebase
3 2. Dependencies
4 3. Config
5 4. Backing services
6 5. Build, release, run
7 6. Processes
8 7. Port binding
9 8. Concurrency
10 9. Disposability
11 10. Dev/prod parity
12 11. Logs
13 12. Admin processes
14 Additional Cloud Native Factors
- 14.1 13. API-First
- 14.2 14. Telemetry
- 14.3 15. Authentication and authorization

Intro

The 12 Factor App is a set of principles that describes a way of making software that, when followed, enables companies to create code that can be released reliably, scaled quickly, and maintained in a consistent and predictable manner.

Legend

✅ - app conforms to a factor aspect

❌ - app does not conform to a factor aspect

❓ - conformity to this factor aspect is unknown or ambiguous

⚫ - factor aspect is not applicable to the app

1. Codebase

One codebase tracked in revision control; many deploys.

✅ Code is stored in a single repo
⚫ If multiple repos, it is a distributed system with multiple apps, each app conforming to 12-factor
⚫ If multiple repos, use dependency management to include projects within each other
✅ Each app can have multiple deployments (running versions of the app), ex. Production/test/staging. Deploying is done from the same codebase, although they can be of different versions

2. Dependencies

Explicitly declare and isolate dependencies.

✅ Use package management system – declare complete and exact list of dependencies in a dependency declaration manifest
✅ Do not rely on system-wide packages

When running on bare metal, this may not be true. Global packages such as Jest and react-tools would be used from the system-wide scope. If running in docker this is not an issue

✅ Ensure implicit system-wide dependencies do not “leak in” from the surrounding system using isolation
✅ The dependency list is applied uniformly both in production and development
✅ Language runtime and dependency manager as pre-requisites are allowed
✅ Deterministic build command sets up everything required to run the code
✅ App should also not rely on any system tools, e.g., curl. If an app needs to shell out to a system tool, it should be vendored into the app

3. Config

Store config in the environment

✅ Use config to control variables that change between deployments (staging, production, etc.), e.g., database, credentials, per-deploy values, etc.
✅ Config should not be part of the code
✅ Application configuration that does not change between deployments can be stored in code
✅ Instead of scattering config files and risking them getting committed to source control, utilize environment variable files instead

There are two environment files - one for front end and one for backend. May make sense to unify

✅ Environment variables are not grouped together into “environments” for specific deployments, they are granular controls managed independently for each deployment

4. Backing services

Treat backing services as attached resources. A backing service is any service the app consumes over the network as part of its normal operation, e.g., datastores, caching systems, messaging/queuing systems, etc. These might be local or managed by a 3rd party.

✅ App does not make distinctions between local and 3rd party services (swapping out one for another can be done in config without changing code)
✅ Each backing service is an attached resource, which means the coupling between the resource and the deployment is loose
✅ Resources can be attached and detached from deploys at will without code changes (ex. If database is misbehaving, a new database instance can be created from a backup and attached instead of the original database)

5. Build, release, run

Strictly separate build and run stages. A non-development deployment goes through three stages:

✅ Build stage transforms code repo into an executable bundle (build). Using a version of the code at a commit specified by the deployment process, the build stage also fetches dependencies and complies binaries and assets
✅ The release stage combines build with deployments current config. The resulting release contains both the build and the config and is ready for immediate execution in the execution environment

In context of OpenShift release is an image

✅ The run stage runs the app in the execution environment, by launching some of app’s processes against a selected release
✅ The stages are strictly separated, e.g., there is no way to make changes to the code at runtime
❓ Releases should have a unique release identifier (ex. Timestamp or incremental versioning), be append-only (cannot mutate a release)

In context of OpenShift images are labelled with PR

We do use semver in our github releases, but it only applies when merging with master. If we improve of cicd pipeline to deploy on merge to master, we would have correct versioning

❌ It should be possible to rollback to previous release

This have not been exercised, but should be possible in principle

✅ The run stage should be as simple as possible, with the complexity being shifted into the build stage (since the run stage can be triggered automatically if an app crashes and there is no one around to debug any issues, for example)

6. Processes

Execute the app as one or more stateless processes. E.g., on one end it is a script launched by runtime (python my_script.py), on another it is a set of processes.

✅ Processes are stateless and do not share anything. Any persisted data goes through a stateful backing service, like a database
⚫ Do not assume anything cached in memory or on disk will be available on a future request or job
⚫ Any type of “sticky session” mechanic does not use memory caching, instead utilizing datastore
⚫ Asset packagers that use filesystem as a cache for compiled assets should be configured to run during the build stage

7. Port binding

Export services via port binding

✅ The app is self-contained, e.g., it includes a webserver or other software
✅ App does not rely on runtime injection of a webserver or other software to create a web-facing service
✅ App exports HTTP or other capability as a service by binding to a port and listening to requests coming in on that port
✅ Dependency declarations are used to add webserver library or other software within app code
✅ An app can become a backing service for another app by providing the URL as a resource handle for the consuming app

8. Concurrency

Scale out via the process model

⚫ Appropriate process types are used for corresponding tasks (e.g., web process is used to handle HTTP requests, while long-running background tasks are handled by a worker process)
✅ Application can span multiple processes running on multiple physical machines (horizontal scaling)
✅ Processes are not daemonized or write PID files, instead relying on operating system’s process manager (distributed process manager on a cloud platform)
⚫ Locally for development, a tool like Foreman is used to manage multiple processes
✅ Output streams, responses to crashed processes and user-initiated restarts and shutdowns are handled by a process manager as described above

9. Disposability

Maximize robustness with fast startup and graceful shutdown

✅ Processes are disposable, meaning they can be started and stopped at a moment’s notice, allowing fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys
❌ Startup time is minimized (ideally a few seconds)

Currently the pod startup time is prohibitively high. There should be a ticket to investigate

❓ The process shut down is graceful. For web processes this means ceasing to listen on the service port, allowing any current requests to finish and then exit

It is unknown if requests would be dropped if a pod is shutting down

✅ Http requests are short (no more than a few seconds)
In case of long polling, client should seamlessly attempt to reconnect when the connection is lost
⚫ Worker processes achieve graceful shutdown by returning current job the work queue. All jobs are reentrant through wrapping the results in a transaction or by making the operation idempotent.
❓ Processes should be robust against sudden death (non-graceful termination)

It is unclear how strapi/mongodb handles ungraceful termination during active processing, would cause database corruption?

10. Dev/prod parity

Keep development, staging, and production as similar as possible. Historically there have been substantial gaps between development and production (code making its way from developer’s local deploy to production). This results in 3 gaps, that can be reduced using continuous deployment

✅ Time gap – time it takes for the code developed by a programmer to make its way to production is short (hours/minutes)
✅ Personnel gap – developers write code, ops engineers deploy it. Developers who write code are integrally involved in deploying it and matching its behavior in production.
✅ Tools gap – developers use a different stack than that used in production – keep development and production as similar as possible
✅ Do not use different backing services between development and production (e.g., using SQLite locally and PostgreSQL on production). Doing otherwise creates friction that disincentivizes continuous deployment

11. Logs

Treat logs as event streams

✅ Logs are the stream of aggregated time-ordered events collected from the output streams of all running processes and backing services

Kibana handles this

✅ App should not concern itself with routing or storage of its output stream. Each process writes its events into stdout. Locally this is viewed in the terminal.
✅ In production, each stream is captured by the execution environment, collated together with all other streams, and routed to one or more final destinations for viewing and long-term archival

12. Admin processes

Run admin/management tasks as one-off processes. Developers may wish to run one-off administrative or maintenance tasks, such as database migrations, running a console to run arbitrary code or running one-time scripts committed into the app’s repository

✅ One-off admin processes should be run in an identical environment as the regular long-running process of the app
✅ These processes should run against a release, using the same codebase and config as any process run against that release
✅ Admin code is shipped with application code
✅ Same dependency isolation is used on all process types, e.g., if a python program normally uses virtualenv, bin/python should be used for running both the Tornado webserver and any admin processes
✅ Use REPL shell if available to make run one-off scripts. Locally this is done by starting the shell inside the app’s checkout directory, while in production developers can use SSH or other remote command execution mechanism.

We could start a debug pod and use remote code execution to run arbitrary code

Additional Cloud Native Factors

Additional factors for cloud native apps:

13. API-First

digital.gov.bc.ca is not a type of service that provides an API, since it mostly presents static content. GraphQL can be used to query some of the content from Strapi (not well documented)

API-first was introduced as a factor to place emphasis on the importance of APIs within cloud-native application development.

⚫ APIs are clearly defined and are ready to be integrated with other services
⚫ APIs are consistent and reusable, allowing teams to work against each other’s public contracts without interfering with internal development processes
⚫ APIs can be easily mocked up
⚫ API documentation is well-designed, comprehensive, and easy to follow
⚫ API documentation utilizes API description language for describing the API and focuses on the “what” rather than the “how”
⚫ API source code uses standard model from API specifications, allowing documentation to be generated from the APIs themselves (e.g., OpenAPI v3 specification)

14. Telemetry

Logging is used as a tool during development to diagnose errors and code flows, while telemetry, on the other hand, focuses on the data collection once the app is in production to monitor app’s performance, health, and key metrics in the distributed environment.

❓ Health and system metrics include application start/shutdown, scaling, web request tracing and results of periodic health checks are collected, e.g., Prometheus

Present but under-utilized. Could provide periodic health check reports sent via email.

❌ Domain specific metrics (those needed or required by our specific organization/department/team) are collected

Custom metrics have not been discussed or implemented

15. Authentication and authorization

Because cloud native applications can be transported across data centers, executed within multiple containers, and accessed by many clients, security must be considered strongly

✅ Ensure all security policies are in place
⚫ APIs are secured using OAuth, RBAC, etc.
✅ Web content is exposed externally on HTTPS
⚫ User security is used to maintain audit trails of the events that happened for a user session

Common Components Program

12 Factor App Alignment