Reverse proxy cache (API cache)

Pull request https://github.com/ecamp/ecamp3/pull/3610 introduces a reverse proxy cache in front of the API, in order to accelerate API responses.

[Work in progress]

General concept

What's the purpose

An HTTP reverse proxy sits in front of the application and caches HTTP responses. Originally, this was used mostly for static content. With a reliable invalidation mechanism in the application, however, HTTP caches can also be used for dynamic data.

Cache tags & surrogate keys

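Most HTTP caches implement invalidation with surrogate keys, which are a specific implementation of cache tags. Each cached response is labelled with one or more tags, and all responses carrying a given tag can be invalidated with a single request.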

This recording (Take your Http caching to the next level with xkey & Fastly) is a bit older (2018) but gives a good, simple overview of how cache tags work. The presentation is based on Varnish with xkey (=surrogate keys) and Symfony FOSHttpCache.
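To make the idea of cache tags concrete, here is a minimal sketch of a tag-aware cache in Python. It is a toy model, not how Varnish or xkey is implemented; the class name TaggedCache and the sample keys and tags are made up for illustration.

```python
class TaggedCache:
    """Toy in-memory cache where every entry carries a set of tags."""

    def __init__(self):
        # cache key -> (response body, tags attached to that response)
        self._entries = {}

    def store(self, key, body, tags):
        self._entries[key] = (body, frozenset(tags))

    def get(self, key):
        entry = self._entries.get(key)
        return entry[0] if entry else None  # None means cache miss

    def purge(self, tags):
        """Drop every entry that carries at least one of the given tags."""
        tags = set(tags)
        stale = [k for k, (_, entry_tags) in self._entries.items() if entry_tags & tags]
        for key in stale:
            del self._entries[key]
        return stale


cache = TaggedCache()
cache.store("/items/1", "item 1 with embedded parent 4", {"1", "4"})
cache.store("/items/2", "item 2", {"2"})

# Entity 4 changed, so every response tagged "4" is dropped.
purged = cache.purge({"4"})
```

After the purge, "/items/1" is a cache miss while "/items/2" is still served from the cache. A real implementation keeps an index from tag to entries instead of scanning all entries.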

Current implementation

Only API calls in haljson format are potentially cached. Everything else bypasses the cache (return(pass)).
Currently, only a limited number of endpoints have caching enabled. [To do: include link to code where the list of enabled endpoints can be seen]
Both JWT cookies are part of the cache key (=hash). This means that every user (more specifically, every login) has its own cached data. The cache is currently not shared between users.
The cache tag logic differs from the one shipped by default with api-platform/core. See the last section for a detailed explanation of the cache tag logic.
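The cacheability check and the per-login cache key described above can be sketched roughly as follows. This is an illustration in Python, not the project's VCL. The cookie names, the detection of haljson via the Accept header, and the restriction to GET/HEAD are all assumptions made for the example.

```python
import hashlib

# Hypothetical cookie names; the real names are defined by the project.
JWT_COOKIES = ("jwt_hp", "jwt_s")


def is_cacheable(method: str, accept: str) -> bool:
    """Assumption: only safe reads that ask for HAL JSON are cache candidates.

    Everything else would be passed straight to the backend.
    """
    return method in ("GET", "HEAD") and "application/hal+json" in accept


def cache_key(url: str, cookies: dict) -> str:
    """Hash the URL together with both JWT cookies.

    Two different logins therefore never share a cache entry.
    """
    parts = [url] + [cookies.get(name, "") for name in JWT_COOKIES]
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


alice = {"jwt_hp": "header.payload.alice", "jwt_s": "signature-alice"}
bob = {"jwt_hp": "header.payload.bob", "jwt_s": "signature-bob"}

key_alice = cache_key("/api/camps", alice)
key_bob = cache_key("/api/camps", bob)
```

Because the cookies are part of the hash, key_alice and key_bob differ even though both requests target the same URL, which is why the cache is not shared between users.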

Test cache during local development

When using http://localhost:3000 during local development, Varnish is bypassed, which means all API responses are coming directly from API platform (uncached).

In order to test caching, use http://localhost:3004. Now, all requests will be routed via Varnish. Requests to frontend, mail, etc. will also be routed via Varnish but will be ignored for caching purposes (=pass).

Even if only using http://localhost:3000, the http-cache container (i.e. varnish) has to be up and running. Otherwise, tag invalidation requests during create/update/delete operations will not be successful and the API will error. If you want to disable all cache functionality in the API, you can set API_CACHE_ENABLED=false in your .env file.

During development, you might need to purge the cache regularly, in order to test new code or when checking out new branches. To purge the cache completely, it is easiest to destroy the http-cache container and restart Varnish:

docker compose stop http-cache
docker compose rm http-cache
docker compose up -d

Alternatively, you can open a shell in the running container and then ban all cached objects:

docker compose exec -ti http-cache /bin/bash
varnishadm 'ban req.url ~ .'

From the shell inside the container, you can also run other Varnish commands, such as varnishlog -g raw to output the raw log stream or varnishreload to reload the configuration after changes to the VCL files.

Enable cache on deployment

For deployment, the API cache can be enabled or disabled via the Helm value apiCache.enabled. The GitHub workflows as well as the manual Helm deployment scripts look for the environment variable API_CACHE_ENABLED in order to populate apiCache.enabled.

Unlike the localhost setup, the reverse proxy cache on deployment sits in front of the API only, so other requests (e.g. frontend) are never routed via Varnish.

The deployment configuration also includes 2 sidecars for varnish:

prometheus-exporter: useful to collect metrics/statistics about varnish and display in Grafana
varnishncsa: logging of requests (1 line per request) in order to debug problems or to derive hit/miss/pass statistics for each endpoint
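As an illustration of how hit/miss/pass statistics could be derived from the varnishncsa output, the following Python sketch counts the handling per endpoint. It assumes a log format of the form '%m %U %{Varnish:handling}x' (method, path without query string, handling). The format actually configured in the deployment may differ, and the sample lines are invented.

```python
from collections import Counter, defaultdict


def handling_stats(lines):
    """Count hit/miss/pass per 'METHOD path' from varnishncsa-style lines.

    Assumes each line looks like: 'GET /api/camps hit'.
    """
    stats = defaultdict(Counter)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        method, path, handling = parts
        stats[f"{method} {path}"][handling] += 1
    return stats


# Invented sample lines in the assumed format.
sample = [
    "GET /api/camps miss",
    "GET /api/camps hit",
    "GET /api/camps hit",
    "POST /api/camps pass",
    "malformed line without three fields here",
]

stats = handling_stats(sample)
for endpoint, counts in sorted(stats.items()):
    print(endpoint, dict(counts))
```

For per-endpoint statistics on item routes, the path would additionally need to be normalized (for example by replacing IDs with a placeholder), which is left out here.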

In addition to the sidecars, the deployment configuration includes the following abilities:

Automatic recreation of the Pod for each new commit (rollme). This will purge the cache.
Separate port for tag invalidation (.Values.apiCache.varnishPurgePort). This port is accessible within the Kubernetes deployment only and not accessible from outside.

Caveats

Invalidation of cache tags only works when data operations (CRUD) are performed via the API. In the unlikely case that data is changed manually, directly in the database, the affected cache tags need to be purged manually (or, alternatively, the complete cache can be banned).

Cache Tag Logic

Also see this rejected PR on api-platform/core for a description of why the api-platform default cache tag logic is flawed.

Cache Tag compilation

During serialization, cache tags are compiled via the implemented TagCollector class.

Entity IDs are included in the cache tags for each entity that is fully embedded in the response (=requested item + embedded entities / subentities; or the items of a collection in case of a collection request). The response is purged from the cache whenever any of these entities changes.
For each relation property, exactly 1 cache tag is added in the format "{id}#{property}". For xToMany relations, too, only 1 cache tag is included, not 1 cache tag per referenced entity. The response is purged from the cache when the composition of this relation changes (changing the referenced entity, or adding/deleting entities from an xToMany relation).

Purging

Post/Create

all GetCollection operations the new item belongs to
all related objects, from the related object's side (=doesn't purge the related object if the relation is unidirectional)

Drop/Delete

the ID of the deleted item itself
all GetCollection operations the item belonged to
all related objects, from the related object's side (=doesn't purge the related object if the relation is unidirectional)

Update

the ID of the item itself
related objects of changed relations only, from the related object's side (=doesn't purge the related object if the relation is unidirectional)
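The three purge rules can be summarized in a small Python sketch. It is a simplified model and not the project's PHP implementation; the Relation class, the function names and the sample IDs are hypothetical. A relation without an inverse property stands for a unidirectional relation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    target_id: str                   # ID of the related entity
    inverse_property: Optional[str]  # property on the related entity pointing back,
                                     # None if the relation is unidirectional


def relation_tags(relations):
    """Tags of the form '{id}#{property}', seen from the related object's side."""
    return {
        f"{r.target_id}#{r.inverse_property}"
        for r in relations
        if r.inverse_property is not None
    }


def purge_on_create(collections, relations):
    return set(collections) | relation_tags(relations)


def purge_on_delete(item_id, collections, relations):
    return {item_id} | set(collections) | relation_tags(relations)


def purge_on_update(item_id, changed_relations):
    # changed_relations holds the old and the new target of every changed relation
    return {item_id} | relation_tags(changed_relations)


# Child 3 moves from parent 1 to parent 10 (bidirectional relation).
moved = purge_on_update(
    "3", [Relation("1", "linkedChildren"), Relation("10", "linkedChildren")]
)

# A new child is created below parent 1.
created = purge_on_create(["/linked_children"], [Relation("1", "linkedChildren")])

# A unidirectional relation contributes no tag for the related object.
deleted = purge_on_delete("7", ["/items"], [Relation("2", None)])
```

In this model, collection tags are represented by the collection path, and an update purges both the previous and the new target of a changed relation.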

Example

{
  "_links": {
    "self": {
      "href": "/items/1"
    },
    "linkedParent": {
      "href": "/linked_parents/2"
    },
    "linkedChildren": [
      {
        "href": "/linked_children/3"
      }
    ],
    "linkedSubresources": {
      "href": "/items/1/linked_subresources"
    },
    "embeddedParent": {
      "href": "/embedded_parents/4"
    },
    "embeddedChildren": [
      {
        "href": "/embedded_children/5"
      }
    ]
  },
  "_embedded": {
    "embeddedParent": {
      "_links": {
        "self": {
          "href": "/embedded_parents/4"
        }
      },
      "dummyProperty": "",
      "id": 4
    },
    "embeddedChildren": [
      {
        "_links": {
          "self": {
            "href": "/embedded_children/5"
          },
          "embeddedGrandchildren": []
        },
        "_embedded": {
          "embeddedGrandchildren": []
        },
        "dummyProperty": "",
        "id": 5
      }
    ]
  },
  "id": 1
}

The response to a GET request on /items/1 will include the following cache tags:

Request	Response cache tags
GET /items/1	1 1#linkedParent 1#linkedChildren 1#embeddedParent 1#embeddedChildren 4 5 5#embeddedGrandchildren

Note that 1#linkedSubresources is not included. This href is static and will never change, no matter whether subresources are added or removed. This only works with the new uriTemplate feature from api-platform. When using our own link substitution in RelatedCollectionLinkNormalizer, the cache tag for this relation would still be included.
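The tag list for the example response can be reproduced with a short Python sketch that walks the HAL document. This is only an approximation for illustration: the actual TagCollector runs during serialization rather than on the finished JSON, and the static_relations parameter is a hypothetical stand-in for relations whose href never changes (such as uriTemplate links).

```python
def collect_tags(resource, static_relations=frozenset()):
    """Collect cache tags for a HAL resource and everything embedded in it."""
    rid = str(resource["id"])
    tags = {rid}  # the entity itself
    for name in resource.get("_links", {}):
        if name == "self" or name in static_relations:
            continue  # static hrefs need no relation tag
        tags.add(f"{rid}#{name}")  # exactly one tag per relation property
    for embedded in resource.get("_embedded", {}).values():
        children = embedded if isinstance(embedded, list) else [embedded]
        for child in children:
            tags |= collect_tags(child, static_relations)
    return tags


# Condensed version of the example response for GET /items/1.
item = {
    "id": 1,
    "_links": {
        "self": {"href": "/items/1"},
        "linkedParent": {"href": "/linked_parents/2"},
        "linkedChildren": [{"href": "/linked_children/3"}],
        "linkedSubresources": {"href": "/items/1/linked_subresources"},
        "embeddedParent": {"href": "/embedded_parents/4"},
        "embeddedChildren": [{"href": "/embedded_children/5"}],
    },
    "_embedded": {
        "embeddedParent": {"id": 4, "_links": {"self": {"href": "/embedded_parents/4"}}},
        "embeddedChildren": [
            {
                "id": 5,
                "_links": {
                    "self": {"href": "/embedded_children/5"},
                    "embeddedGrandchildren": [],
                },
                "_embedded": {"embeddedGrandchildren": []},
            }
        ],
    },
}

tags = collect_tags(item, static_relations={"linkedSubresources"})
```

The resulting set contains the same eight tags as the table: the IDs 1, 4 and 5 plus one tag per relation property, with 1#linkedSubresources left out. Linked entities 2 and 3 contribute no ID tag because they are not embedded.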

Request	Purged tags (tags wrapped in * match the cache tags from above)	Comment
PATCH /linked_parents/2 { stringProperty: "dummy" }	2	Only responses where /linked_parents/2 is directly embedded are purged. The cached response of "GET /items/1" is still valid
PATCH /linked_children/3 { parent: "/items/10" }	3 *1#linkedChildren* 10#linkedChildren	"GET /items/1" is properly purged
POST /linked_children { parent: "/items/1" }	/linked_children *1#linkedChildren*	"GET /items/1" is properly purged
PATCH /embedded_parents/4 { dummyProperty: "dummy" }	*4*	"GET /items/1" is properly purged
POST /items { embeddedParent: "/embedded_parents/4" }	/items 4#children	"GET /items/1" is still valid, because the children property was not serialized on the embeddedParent entity
POST /embedded_grandchildren { parent: "/embedded_children/5" }	/embedded_grandchildren *5#embeddedGrandchildren*	"GET /items/1" is properly purged
