vfps


A very fast and resource-efficient pseudonym service.

Supports horizontal service replication for highly-available deployments.

Run it

Warning The provided docker-compose.yaml is not a production-ready deployment; it is merely meant for getting started and testing quickly. It sets very restrictive resource limits and uses the default password for an included, unoptimized PostgreSQL deployment.

docker compose -f docker-compose.yaml --profile=test up

Visit http://localhost:8080/ to view the OpenAPI specification of the Vfps API:

Screenshot of the OpenAPI specification

You can use the JSON-transcoded REST API described via OpenAPI or interact with the service using gRPC. For example, using grpcurl to create a new namespace:

grpcurl \
  -plaintext \
  -import-path src/Vfps/ \
  -proto src/Vfps/Protos/vfps/api/v1/namespaces.proto \
  -d '{"name": "test", "pseudonymGenerationMethod": "PSEUDONYM_GENERATION_METHOD_SECURE_RANDOM_BASE64URL_ENCODED", "pseudonymLength": 32}' \
  127.0.0.1:8081 \
  vfps.api.v1.NamespaceService/Create
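
The same call can be made against the JSON-transcoded REST API. A minimal curl sketch, assuming the transcoded route is POST /v1/namespaces and is served on port 8080 alongside the OpenAPI UI (check the OpenAPI specification for the exact paths):

curl -X POST http://localhost:8080/v1/namespaces \
  -H "Content-Type: application/json" \
  -d '{"name": "test", "pseudonymGenerationMethod": "PSEUDONYM_GENERATION_METHOD_SECURE_RANDOM_BASE64URL_ENCODED", "pseudonymLength": 32}'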

And to create a new pseudonym inside this namespace:

grpcurl \
  -plaintext \
  -import-path src/Vfps/ \
  -proto src/Vfps/Protos/vfps/api/v1/pseudonyms.proto \
  -d '{"namespace": "test", "originalValue": "to be pseudonymized"}' \
  127.0.0.1:8081 \
  vfps.api.v1.PseudonymService/Create

Production-grade deployment

See https://github.com/miracum/charts/tree/master/charts/vfps for a production-grade deployment on Kubernetes via Helm.
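
A minimal sketch of installing the chart from its OCI registry, using vfps as the release name and leaving all values at the chart defaults (see the chart repository above for configurable values):

helm upgrade --install vfps oci://ghcr.io/miracum/charts/vfps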

Configuration

The following configuration options can be set as environment variables:

Variable Type Default Description
ConnectionStrings__PostgreSQL string "" Connection string to the PostgreSQL database. See https://www.npgsql.org/doc/connection-string-parameters.html for options.
ForceRunDatabaseMigrations bool false Run database migrations as part of the startup. Only recommended when a single replica of the application is used.
Tracing__IsEnabled bool false Enable distributed tracing support.
Tracing__Exporter string "jaeger" The tracing export format. One of jaeger, otlp.
Tracing__ServiceName string "vfps" Tracing service name.
Tracing__RootSampler string "AlwaysOnSampler" Tracing parent root sampler. One of AlwaysOnSampler, AlwaysOffSampler, TraceIdRatioBasedSampler.
Tracing__SamplingProbability double 0.1 Sampling probability to use if Tracing__RootSampler is set to TraceIdRatioBasedSampler.
Tracing__Jaeger object {} Jaeger exporter options.
Tracing__Otlp__Endpoint string "" The OTLP gRPC Endpoint URL.
Pseudonymization__Caching__Namespaces__IsEnabled bool false Set to true to enable namespace caching.
Pseudonymization__Caching__Pseudonyms__IsEnabled bool false Set to true to enable pseudonym caching.
Pseudonymization__Caching__SizeLimit int 65534 Maximum number of entries in the cache. The cache is shared between the pseudonyms and namespaces.
Pseudonymization__Caching__AbsoluteExpiration D.HH:mm:ss 0.01:00:00 Time after which a cache entry expires.
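
For example, to point the service at an external PostgreSQL database and export traces via OTLP, the options can be exported as environment variables before starting the service. The connection string values and the OTLP endpoint below are placeholders:

export ConnectionStrings__PostgreSQL="Host=localhost;Port=5432;Database=vfps;Username=vfps;Password=change-me"
export Tracing__IsEnabled="true"
export Tracing__Exporter="otlp"
export Tracing__Otlp__Endpoint="http://localhost:4317"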

Observability

The service exports metrics in Prometheus format on :8082/metrics. Health-, readiness-, and liveness-probes are exposed at :8080/healthz, :8080/readyz, and :8080/livez respectively.
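
For example, to check these endpoints against a locally running instance:

curl -s http://localhost:8082/metrics | head
curl -s http://localhost:8080/healthz
curl -s http://localhost:8080/readyz
curl -s http://localhost:8080/livez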

FHIR operations

The service also exposes a FHIR operations endpoint. Sending a FHIR Parameters resource of the following schema to /v1/fhir/$create-pseudonym:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "namespace",
      "valueString": "test"
    },
    {
      "name": "originalValue",
      "valueString": "hello world"
    }
  ]
}

will create a pseudonym in the test namespace. The expected response looks as follows:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "namespace",
      "valueString": "test"
    },
    {
      "name": "originalValue",
      "valueString": "hello world"
    },
    {
      "name": "pseudonymValue",
      "valueString": "8KWwnm3TXR5R9iUDVVKD-jUezE4DEyeydOeq4v_a_b5ejSLmqOlT8g"
    }
  ]
}
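
A minimal curl sketch for this operation, assuming the endpoint is served on port 8080, that application/fhir+json is accepted as the content type, and that the request body shown above is saved as parameters.json:

curl -X POST 'http://localhost:8080/v1/fhir/$create-pseudonym' \
  -H "Content-Type: application/fhir+json" \
  -d @parameters.json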

Development

Prerequisites

The .NET SDK and Docker (with the compose plugin) are needed for the commands below.

Build & run

Start an empty PostgreSQL database for development (optionally add -d to run in the background):

docker compose -f docker-compose.yaml up

To additionally start an instance of Jaeger Tracing, you can specify the jaeger profile:

docker compose -f docker-compose.yaml --profile=jaeger up

Restore dependencies and run in Debug mode:

dotnet restore
dotnet run -c Debug --project=src/Vfps

Open https://localhost:8080/ to see the OpenAPI UI for the JSON-transcoded gRPC services. You can also use grpcurl to interact with the API:

Note In development mode, gRPC reflection is enabled and is used by grpcurl by default, so the -import-path and -proto flags can be omitted.
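
For example, listing the available services via reflection:

grpcurl -plaintext 127.0.0.1:8081 list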

grpcurl -plaintext \
    -d '{"name": "test", "pseudonymGenerationMethod": "PSEUDONYM_GENERATION_METHOD_SECURE_RANDOM_BASE64URL_ENCODED", "pseudonymLength": 32}' \
    127.0.0.1:8081 \
    vfps.api.v1.NamespaceService/Create

grpcurl -plaintext \
    -d '{"namespace": "test", "originalValue": "a test value"}' \
    127.0.0.1:8081 \
    vfps.api.v1.PseudonymService/Create

Run unit tests

dotnet test src/Vfps.Tests \
  --configuration=Release \
  --collect:"XPlat Code Coverage" \
  --results-directory=./coverage \
  -l "console;verbosity=detailed" \
  --settings=src/Vfps.Tests/runsettings.xml

Generate Code coverage report

If not installed, install the report generation tool:

dotnet tool install -g dotnet-reportgenerator-globaltool
reportgenerator -reports:"./coverage/*/coverage.cobertura.xml" -targetdir:"coveragereport" -reporttypes:Html
# remove the coverage directory so successive runs won't cause issues with their random GUID.
# See <https://github.com/microsoft/vstest/issues/2378>
rm -rf coverage/

Build container image

docker build -t ghcr.io/miracum/vfps:latest .

Run iter8 SLO experiments locally

kind create cluster

export IMAGE_TAG="iter8-test"

docker build -t ghcr.io/miracum/vfps:${IMAGE_TAG} .

kind load docker-image ghcr.io/miracum/vfps:${IMAGE_TAG}

helm upgrade --install \
  --set="image.tag=${IMAGE_TAG}" \
  -f tests/iter8/values.yaml \
  --wait \
  --timeout=15m \
  --version=^1.0.0 \
  vfps oci://ghcr.io/miracum/charts/vfps

kubectl apply -f tests/iter8/experiment.yaml

iter8 k assert -c completed --timeout 15m
iter8 k assert -c nofailure,slos
iter8 k report

Benchmarks

Micro benchmarks

The pseudonym generation methods are continuously benchmarked. Results are viewable at https://miracum.github.io/vfps/dev/bench/.

E2E load testing

Create a pseudonym namespace used for benchmarking:

grpcurl \
  -plaintext \
  -import-path src/Vfps/ \
  -proto src/Vfps/Protos/vfps/api/v1/namespaces.proto \
  -d '{"name": "benchmark", "pseudonymGenerationMethod": "PSEUDONYM_GENERATION_METHOD_SECURE_RANDOM_BASE64URL_ENCODED", "pseudonymLength": 32}' \
  127.0.0.1:8081 \
  vfps.api.v1.NamespaceService/Create

Generate 100,000 pseudonyms in the namespace from random original values:

ghz -n 100000 \
    --insecure \
    --import-paths src/Vfps/ \
    --proto src/Vfps/Protos/vfps/api/v1/pseudonyms.proto \
    --call vfps.api.v1.PseudonymService/Create \
    -d '{"originalValue": "{{randomString 32}}", "namespace": "benchmark"}' \
    127.0.0.1:8081

Sample output running on the following setup:

OS=Windows 11 (10.0.22000.978/21H2)
12th Gen Intel Core i9-12900K, 1 CPU, 24 logical and 16 physical cores
32GiB of DDR4 4800MHz RAM
Samsung SSD 980 Pro 1TiB
PostgreSQL running in WSL2 VM on the same machine.
.NET SDK=7.0.100-rc.1.22431.12
Summary:
  Count:        100000
  Total:        16.68 s
  Slowest:      187.81 ms
  Fastest:      2.52 ms
  Average:      8.00 ms
  Requests/sec: 5993.51

Response time histogram:
  2.522   [1]     |
  21.051  [99748] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  39.580  [201]   |
  58.109  [0]     |
  76.639  [0]     |
  95.168  [0]     |
  113.697 [0]     |
  132.226 [0]     |
  150.755 [0]     |
  169.285 [0]     |
  187.814 [50]    |

Latency distribution:
  10 % in 6.26 ms
  25 % in 6.91 ms
  50 % in 7.72 ms
  75 % in 8.93 ms
  90 % in 9.57 ms
  95 % in 10.01 ms
  99 % in 11.86 ms

Status code distribution:
  [OK]   100000 responses

Sub-10ms P99-latency

By default, each pseudonym creation request executes two database queries: one to fetch the namespace configuration and a second to persist the pseudonym if it doesn't already exist. There is an opt-in way to avoid the first query by caching namespaces in a non-distributed, in-memory cache. It can be enabled and configured using the following environment variables:

Variable Type Default Description
Pseudonymization__Caching__Namespaces__IsEnabled bool false Set to true to enable namespace caching.
Pseudonymization__Caching__SizeLimit int 32 Maximum number of entries in the cache.
Pseudonymization__Caching__AbsoluteExpiration D.HH:mm:ss 0.01:00:00 Time after which a cache entry expires.

Warning Deleting a namespace does not automatically remove it from the in-memory cache. Pseudonym creation requests against such a stale cached namespace will fail until either the entry expires or the service is restarted.
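
For example, to enable namespace caching while keeping the size limit and expiration at the values shown in the table above:

export Pseudonymization__Caching__Namespaces__IsEnabled="true"
export Pseudonymization__Caching__SizeLimit="32"
export Pseudonymization__Caching__AbsoluteExpiration="0.01:00:00"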

Using the same setup as above but with namespace caching enabled, we can lower the per-request latencies and increase throughput:

Summary:
  Count:        100000
  Total:        11.70 s
  Slowest:      27.23 ms
  Fastest:      1.55 ms
  Average:      5.47 ms
  Requests/sec: 8549.22

Response time histogram:
  1.546  [1]     |
  4.114  [17418] |∎∎∎∎∎∎∎∎∎∎
  6.682  [72827] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  9.251  [9382]  |∎∎∎∎∎
  11.819 [122]   |
  14.387 [0]     |
  16.956 [0]     |
  19.524 [0]     |
  22.092 [0]     |
  24.661 [49]    |
  27.229 [201]   |

Latency distribution:
  10 % in 4.00 ms
  25 % in 4.34 ms
  50 % in 5.81 ms
  75 % in 6.00 ms
  90 % in 6.66 ms
  95 % in 7.00 ms
  99 % in 8.00 ms

Status code distribution:
  [OK]   100000 responses

Resource efficiency

The sample deployment described in docker-compose.yaml sets strict resource limits for both CPU (1 CPU) and memory (128 MiB max). Even under these constraints, more than 1,000 requests per second are possible, although with significantly increased P99 latencies:

Summary:
  Count:        100000
  Total:        73.99 s
  Slowest:      268.06 ms
  Fastest:      5.26 ms
  Average:      36.69 ms
  Requests/sec: 1351.51

Response time histogram:
  5.257   [1]     |
  31.537  [57298] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  57.817  [21327] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  84.097  [17685] |∎∎∎∎∎∎∎∎∎∎∎∎
  110.377 [3395]  |∎∎
  136.656 [243]   |
  162.936 [0]     |
  189.216 [1]     |
  215.496 [0]     |
  241.776 [0]     |
  268.055 [50]    |

Latency distribution:
  10 % in 14.62 ms
  25 % in 18.47 ms
  50 % in 29.46 ms
  75 % in 47.53 ms
  90 % in 71.96 ms
  95 % in 79.95 ms
  99 % in 97.22 ms

Status code distribution:
  [OK]   100000 responses

Image signature and provenance verification

Prerequisites: the cosign, crane, and slsa-verifier CLI tools need to be installed.

All released container images are signed using cosign, and SLSA Level 3 provenance is available for verification.

IMAGE=ghcr.io/miracum/vfps:v1.3.5
DIGEST=$(crane digest "${IMAGE}")
IMAGE_DIGEST_PINNED="ghcr.io/miracum/vfps@${DIGEST}"
IMAGE_TAG="${IMAGE#*:}"

cosign verify \
   --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
   --certificate-identity-regexp="https://github.com/miracum/.github/.github/workflows/standard-build.yaml@.*" \
   --certificate-github-workflow-name="ci" \
   --certificate-github-workflow-repository="miracum/vfps" \
   --certificate-github-workflow-trigger="release" \
   --certificate-github-workflow-ref="refs/tags/${IMAGE_TAG}" \
   "${IMAGE_DIGEST_PINNED}"

slsa-verifier verify-image \
    --source-uri github.com/miracum/vfps \
    --source-tag ${IMAGE_TAG} \
    "${IMAGE_DIGEST_PINNED}"

See also https://github.com/slsa-framework/slsa-github-generator/tree/main/internal/builders/container#verification for details on verifying the image integrity using automated policy controllers.