itaru2622/bluesky-selfhost-env

https://github.com/itaru2622/bluesky-selfhost-env

this repository aims to make it easy to set up a self-hosted bluesky environment, with:

  • configurable hosting domain: easily tunable by an environment variable (DOMAIN).
  • reproducibility: all configs and operations are disclosed, including reverse proxy rules and patches to sources.
  • simplicity: all bluesky components run on one host, by docker-compose.
  • less remapping: rules among FQDN <=> reverse proxy <=> docker-container are kept as simple as possible, for easy understanding and tuning.

currently, my latest release is 2024-05-07, based on the bluesky-social code as of 2024-05-07.

as described below, most features now work in the self-hosting environment, but it may not yet work with full capabilities. some of the reasons are described in bluesky-social/atproto#2334

test results with 'asof-2024-04-18r1' and later:

  • ok: relaxing restriction on handle length in PDS (applied bluesky-social/atproto#2392)
  • ok: create account on pds (via social-app, bluesky API).
  • ok: sign-in social-app (with multiple accounts)
  • ok: basic usages on social-app
    • ok: edit profile (display name)
    • ok: post articles
    • ok: vote 'like' / repost to article
    • ok: follow/un-follow others, via their profile page.
    • ok: receive notifications when others 'like' or 'follow'
    • ok: search posts/users/feeds
  • ok: integration with firehose/websocket to pds/bgs(relay) => crawlable.
  • ok: integration with feed-generator (NOTE: feed-generator has some delay, so it may need a reload on social-app).
  • ok: integration with ozone/labeler.
    • ok: join ozone(labeler) to self-hosting env.
    • ok: sign-in to ozone-UI, which then automatically adds the labeler record to the DID doc, when the account was created via sign-up on social-app (i.e. it becomes a labeler lazily).
    • ok: configure a label on ozone-UI at /configure.
      • ok: 'labels' tab appears in profile page on social-app.
    • ok: 'subscribe to labeler' button appears in profile page on social-app.
    • ok: switch 'subscribe/unsubscribe to labeler' on social-app profile page => it appears/disappears in the moderation list in user's profile pages on social-app.
    • ok: send report from social-app UI to ozone via each post (dotted pulldown menu)
    • ok: receive and view report on ozone-UI (both in /events and /reports)
    • not yet: others.

below, it is assumed that the self-hosting domain is mysky.local.com (defined in Makefile).
you can change the domain name by environment variable, as below:

# 1) set domain name for self-hosting bluesky
export DOMAIN=whatever.yourdomain.com

# 2) set asof date, to distinguish docker images and their sources.
#    either the date of a prebuilt release in %Y-%m-%d (e.g. 2024-04-18), or latest (lazily following the docker image naming convention).
export asof=2024-05-07

# 3) set email addresses.

# 3-1) EMAIL4CERTS: passed to Let's Encrypt for signing certificates.
export EMAIL4CERTS=your@mail.address
# for self-signed certificates, use the below (`internal` is a reserved keyword).
# using `internal` is recommended to avoid hitting rate limits, until you are sure the environment is ready for self-hosting.
export EMAIL4CERTS=internal

# 3-2) PDS_EMAIL_SMTP_URL: for PDS,  like smtps://youraccount:your-app-password@smtp.gmail.com
export PDS_EMAIL_SMTP_URL=smtps://

# 3-3) FEEDGEN_EMAIL: for feed-generator account in bluesky
export FEEDGEN_EMAIL=feedgen@example.com

# 4) check your configuration, from the point of view of ops.
make echo

# 5) generate secrets for bluesky containers, and check those value:
make genSecrets

## install the make command as below, if you don't have it yet.
apt install -y make
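
before running `make echo`, it may help to see which of the variables above are actually set in the current shell. the following is a minimal sketch; `check_selfhost_env` is a hypothetical helper (not a Makefile target of this repository), and variables reported as not set may still get defaults from the Makefile, as DOMAIN does.

```shell
# Sketch: report which of the variables used in the steps above are not set in this shell.
# check_selfhost_env is a hypothetical helper, not part of this repository.
check_selfhost_env() {
  missing=""
  for name in DOMAIN asof EMAIL4CERTS PDS_EMAIL_SMTP_URL FEEDGEN_EMAIL; do
    # indirect lookup of the variable named by $name (POSIX-compatible)
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "not set:$missing"
    return 1
  fi
  echo "all variables are set"
}
```

for example, `check_selfhost_env && make echo` runs the configuration check only when every variable has been set explicitly.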
  1. make DNS A-Records in your self-hosting network.

at least the following two A-Records are required.
refer to the appendix for a sample DNS server (bind9) configuration.

     -    ${DOMAIN}
     -  *.${DOMAIN}
  1. generate and install a CA certificate (for private/closed networks, and other cases using self-signed certificates).
    • after generation, copy crt and key as ./certs/root.{crt,key}
    • note: don't forget to install root.crt on your host machine and browser.

the easiest way to get a self-signed CA certificate is below.

# get and store self-signed CA certificate into ./certs/root.{crt,key}, by using caddy.
make getCAcert
# install CA cert on host machine.
make installCAcert

# don't forget to install certificate to browser.
# check DNS server responses for your self-host domain
dig  ${DOMAIN}
dig  any.${DOMAIN}

# start containers for test
make    docker-start f=./docker-compose-debug-caddy.yaml services=

# test HTTPS and WSS with your docker environment
curl -L https://test-wss.${DOMAIN}/
# also open https://test-wss.${DOMAIN}/ in a browser.
# test WebSocket with wscat (CLI from the nodejs wscat package)
wscat -c wss://test-wss.${DOMAIN}/ws

# test whether the reverse proxy mapping works as expected for bluesky
#  these should be routed to PDS
curl -L https://pds.${DOMAIN}/xrpc/any-request | jq
curl -L https://some-hostname.pds.${DOMAIN}/xrpc/any-request | jq

#  these should be routed to social-app
curl -L https://pds.${DOMAIN}/others | jq
curl -L https://some-hostname.pds.${DOMAIN}/others | jq

# stop test containers, without persisting data
make    docker-stop-with-clean f=./docker-compose-debug-caddy.yaml

=> if the tests pass then go ahead, otherwise check your environment.
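
the two `dig` checks above can also be compared automatically. the following is a minimal sketch, assuming a single docker host so that `${DOMAIN}` and `*.${DOMAIN}` should resolve to the same address (as in the appendix); `resolve_a` and `check_wildcard` are hypothetical helper names.

```shell
# Sketch: confirm that the domain and an arbitrary subdomain resolve to the same address.
# resolve_a and check_wildcard are hypothetical helpers, not part of this repository.
resolve_a() {
  # first A record reported by dig; empty when the name does not resolve
  dig +short "$1" A | head -n 1
}

check_wildcard() {
  domain="$1"
  apex=$(resolve_a "$domain")
  wild=$(resolve_a "any.$domain")
  if [ -z "$apex" ] || [ -z "$wild" ]; then
    echo "NG: missing A record (apex='$apex' wildcard='$wild')"
    return 1
  fi
  if [ "$apex" != "$wild" ]; then
    echo "NG: apex=$apex but wildcard=$wild"
    return 1
  fi
  echo "ok: $domain and *.$domain -> $apex"
}
```

usage would be `check_wildcard "${DOMAIN}"`; a non-zero exit status suggests the A-Records from step 1 need another look.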

first, this describes deploying bluesky with prebuilt images.
building images from sources by yourself is described later.

# 0) pull prebuilt docker images from docker.io, to skip building images.
make docker-pull

# 1) deploy required containers (database, caddy etc).
make docker-start

# wait until the log messages settle down.

# 2) deploy bluesky containers(plc, bgs, appview, pds, ozone, ...)
make docker-start-bsky

# 3) set bgs parameter for perDayLimit via REST API.
make api_setPerDayLimit
# 1) check if social-app is ready to serve.
curl -L https://social-app.${DOMAIN}/

# 2) create account for feed-generator
make api_CreateAccount_feedgen

# 3) start bluesky feed-generator
make docker-start-bsky-feedgen  FEEDGEN_PUBLISHER_DID=did:plc:...

# 4) announce existence of feed ( by scripts/publishFeedGen.ts on feed-generator).
make publishFeed
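
the readiness check in step 1) above can be turned into a polling loop, which is handy right after starting the containers. the following is a minimal sketch; `wait_until` is a hypothetical helper, and the retry count and delay are arbitrary.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_until is a hypothetical helper, not part of this repository.
# usage: wait_until <tries> <delay-seconds> <command> [args...]
wait_until() {
  tries="$1"; delay="$2"; shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    # the probe command's output is discarded; only its exit status matters
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

for example, `wait_until 30 5 curl -fsSL "https://social-app.${DOMAIN}/"` polls social-app every 5 seconds, up to 30 times, and returns non-zero if it never answers successfully.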
# 1) create account for ozone service/admin
#  you need to use a valid email address, since ozone/PDS sends a confirmation code by email.
make api_CreateAccount_ozone email=your-valid@email.address.com

# 2) start ozone
# ozone-standalone uses the same DID for OZONE_SERVER_DID and OZONE_ADMIN_DIDS, according to HOSTING.md
make docker-start-bsky-ozone  OZONE_SERVER_DID=did:plc:  OZONE_ADMIN_DIDS=did:plc:
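
the `did:plc:` values above are placeholders to be replaced with the DID of the account just created. a small guard can catch a forgotten or truncated value before starting the container. this is a minimal sketch; `is_plc_did` is a hypothetical helper, and the pattern assumes the usual did:plc form of 24 lowercase base32 characters after the prefix.

```shell
# Sketch: reject placeholder or truncated DIDs before passing them to make.
# is_plc_did is a hypothetical helper, not part of this repository.
# assumption: a did:plc identifier is 'did:plc:' followed by 24 base32 characters (a-z, 2-7).
is_plc_did() {
  printf '%s\n' "$1" | grep -Eq '^did:plc:[a-z2-7]{24}$'
}
```

for example, `is_plc_did "${OZONE_SERVER_DID}" || echo "OZONE_SERVER_DID does not look like a did:plc value"`.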

in your browser, access https://social-app.${DOMAIN}/ such as https://social-app.mysky.local.com/

refer to the screenshots for UI operations to create an account and sign in on your self-hosted bluesky.

# choice1) shutdown containers but keep data alive.
make docker-stop

# choice2) shutdown containers and delete their data
make docker-stop-with-clean

export u=foo
make api_CreateAccount handle=${u}.pds.${DOMAIN} password=${u} email=${u}@example.com resp=./data/accounts/${u}.secrets

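# then, to make other accounts, just re-assign $u and repeat the above command, as below.
# (`!make` is bash history expansion: it re-runs the most recent command starting with `make`; it works in interactive shells only.)
export u=bar
!make

export u=baz
!make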
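
since `!make` depends on interactive shell history, a loop is an alternative when creating several accounts from a script. the following is a minimal sketch; `account_cmd` is a hypothetical helper that only prints the command line shown above for each user, so the result can be reviewed before running it.

```shell
# Sketch: build the api_CreateAccount command line for several users in one loop.
# account_cmd is a hypothetical helper; it only prints the command, it does not run make.
account_cmd() {
  u="$1"; domain="$2"
  echo "make api_CreateAccount handle=${u}.pds.${domain} password=${u} email=${u}@example.com resp=./data/accounts/${u}.secrets"
}

# print one command per user; mysky.local.com is the default domain assumed in this document
for u in foo bar baz; do
  account_cmd "$u" "${DOMAIN:-mysky.local.com}"
done
```

piping the printed lines to `sh` would execute them one by one.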

after configuring params and optional env, operate as below:

# get sources from all repositories
make    cloneAll

# create work branches and stay on them for all repositories (repos/*; optional but recommended for safety.)
make    createWorkBranch

then build docker images as below:

# 0) apply the minimum patch needed to build images, regardless of self-hosting.
#      as described in https://github.com/bluesky-social/atproto/discussions/2026 for feed-generator/Dockerfile etc.
# NOTE: this op checks out a new branch before applying the patch, and stays on the new branch
make patch-dockerbuild

# 1) build images with original
make build DOMAIN= f=./docker-compose-builder.yaml

# the ops below are now obsolete and unsupported because they are fragile (high cost and low return). also the patch below has no effect on PDS scaling out (multiple PDS domains).
# ~~ 2) apply optional patch for self-hosting, and re-build image ~~
# ~~  'optional' means, applying this patch is not mandatory to get self-hosting environment. ~~
# ~~ NOTE: this ops checkout new branch before applying patch, and keep staying new branch ~~
#
# ~~ make _patch-selfhost-even-not-mandatory ~~
# ~~ make build services=social-app f=./docker-compose-builder.yaml ~~

when you set the fork_repo_prefix variable before cloneAll, this op registers your remote fork repository with git remote add fork .... then you have additional easy ops against multiple repositories, as below.

export fork_repo_prefix=git@github.com:YOUR_GITHUB_ACCOUNT/

make cloneAll

# manage(push and pull) branches and tags for all repos by single operation against your remote fork repositories.
make exec under=./repos/* cmd='git push fork branch'
make exec under=./repos/* cmd='git tag -a "asof-XXXX-XX-XX" '
make exec under=./repos/* cmd='git push fork --tags'

# push something on justOneRepo to your fork repository.
make exec under=./repos/justOneRepo cmd='git push fork something'

# refer to the Makefile for details and samples.

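  1. get all env vars in docker-compose
# names and their values
# choose one of the below; the later assignment overrides the earlier one.
#   environment and build args:
_yqpath='.services[].environment, .services[].build.args'
#   environment only:
_yqpath='.services[].environment'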

# lists of var=val
cat ./docker-compose-builder.yaml | yq -y "${_yqpath}" \
  | grep -v '^---' | sed 's/^- //' | sort -u -f

# output in yaml
cat ./docker-compose-builder.yaml | yq -y "${_yqpath}" \
  | grep -v '^---' | sed 's/^- //' | sort -u -f  \
  | awk -F= -v col=":" -v q="'" -v sp="  " -v list="-" '{print   sp list sp q $1 q col sp q $2 q}' \
  | sed '1i defs:' | yq -y


# list of names
cat ./docker-compose-builder.yaml | yq -y "${_yqpath}" \
  | grep -v '^---' | sed 's/^- //' | sort -u -f \
  | awk -F= '{print $1}' | sort -u -f
  1. env vars regarding {URL | DID | DOMAIN} == mapping rules in docker-compose
# get {name=value} of env vars regarding { URL | DID | DOMAIN }
# (the yq path and {DOMAIN} are quoted so that the shell passes them through literally)
cat ./docker-compose-builder.yaml | yq -y '.services[].environment' \
 | grep -v '^---' | sed 's/^- //' | sort -u -f \
 | grep -e :// -e did: -e '{DOMAIN}'

# get names of env vars regarding { URL | DID | DOMAIN }
cat ./docker-compose-builder.yaml | yq -y '.services[].environment' \
 | grep -v '^---' | sed 's/^- //' | sort -u -f \
 | grep -e :// -e did: -e '{DOMAIN}' \
 | awk -F= '{print $1}' | sort -u -f \
 | tee /tmp/url-or-did.txt
  1. get mapping rules in reverse proxy (caddy)
# dump the rules as they are; I have no idea how to convert them into an easily readable format...
cat config/caddy/Caddyfile
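
to see what the URL/DID/DOMAIN filter in the docker-compose step above keeps and drops, it can be tried on a few lines first. the following is a minimal sketch; `mapping_vars` is a hypothetical helper wrapping the same grep/awk/sort steps, and the variable names and values in the sample are made up (they are not taken from docker-compose-builder.yaml).

```shell
# Sketch: the name filter from the docker-compose step, wrapped in a function reading stdin.
# mapping_vars is a hypothetical helper; the sample lines below are illustrative only.
mapping_vars() {
  # keep name=value lines that contain a URL, a DID, or a ${DOMAIN} reference; print the names only
  grep -e '://' -e 'did:' -e '{DOMAIN}' | awk -F= '{print $1}' | sort -u -f
}

printf '%s\n' \
  'EXAMPLE_SERVICE_URL=https://pds.${DOMAIN}' \
  'EXAMPLE_SERVER_DID=did:web:example.${DOMAIN}' \
  'EXAMPLE_LOG_LEVEL=debug' \
  | mapping_vars
```

with this sample, only the two names whose values hold a URL or a DID are printed, and `EXAMPLE_LOG_LEVEL` is dropped.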

  1. files related env vars in sources
# files named *env*
find repos -type f | grep -v -e /.git/  | grep -i env \
  | grep -v -e .jpg$ -e .ts$  -e .json$ -e .png$ -e .js$

# files containing 'export'
find repos -type f | grep -v /.git/  | xargs grep -l export \
  | grep -v -e .js$ -e .jsx$  -e .ts$ -e .tsx$ -e .go$ -e go.sum$ -e go.mod$ -e .po$ -e .json$ -e .patch$ -e .lock$ -e .snap$
  1. get all env vars from source code
# the easy way: search everything under repos
_files=repos
# or, narrow down the files to search for env vars (this overrides the above)
_files=`find repos -type f | grep -v -e '/.git' -e /__  -e /tests/ -e _test.go -e /interop-test-files  -e /testdata/ -e /testing/ -e /jest/ -e /node_modules/ -e /dist/ | sort -u -f`

# for javascripts families from process.env.ENVNAME
grep -R process.env ${_files} \
  | cut -d : -f 2- | sed 's/.*process\.//' | grep '^env\.' | sed 's/^env\.//' \
  | sed -r 's/(^[A-Za-z_0-9\-]+).*/\1/' | sort -u -f \
  | tee /tmp/vars-js1.txt

# for javascripts families from envXXX('MORE_ENVNAME'), refer atproto/packages/common/src/env.ts for envXXX
grep -R -e envStr -e envInt -e envBool -e envList ${_files} \
  | cut -d : -f 2- \
  | grep -v -e ^import -e ^export -e ^function  \
  | sed "s/\"/'/g" \
  | grep \' | awk -F\' '{print $2}' | sort -u -f \
  | tee /tmp/vars-js2.txt

# for golang  from EnvVar(s): []string{"ENVNAME", "MORE_ENVNAME"}
grep -R EnvVar ${_files} \
  | cut -d : -f 3- | sed -e 's/.*string//' -e 's/[,"{}]//g' \
  | tr ' ' '\n' | grep -v ^$ | sort -u -f \
  | tee /tmp/vars-go.txt

# for docker-compose from services[].environment
echo ${_files} \
  | tr ' ' '\n' | grep -v ^$ | grep -e .yaml$ -e .yml$ | grep compose \
  | xargs yq -y '.services[].environment' | grep -v ^--- | sed 's/^- //' \
  | sed 's/: /=/' | sed "s/'//g" \
  | sort -u -f \
  | awk -F= '{print $1}' | sort -u -f \
  | tee /tmp/vars-compose.txt


# get unique lists
cat /tmp/vars-js1.txt /tmp/vars-js2.txt /tmp/vars-go.txt /tmp/vars-compose.txt | sort -u -f > /tmp/envs.txt

# pick env vars related to mapping {URL, ENDPOINT, DID, HOST, PORT, ADDRESS}
cat /tmp/envs.txt  | grep -e URL -e ENDPOINT -e DID -e HOST -e PORT -e ADDRESS
  1. find {URL | DID | bsky } near env names in sources
find repos -type f | grep -v -e /.git  -e __ -e .json$ \
  | xargs grep -R -n -A3 -B3 -f /tmp/envs.txt \
  | grep -A2 -B2 -e :// -e did: -e bsky
  1. find bsky.{social,app,network} in sources ( to check hard-coded domain/FQDN )
find repos -type f | grep -v -e /.git -e /tests/ -e /__ -e Makefile -e .yaml$ -e .md$  -e .sh$ -e .json$ -e .txt$ -e _test.go$ \
  | xargs grep -n -e bsky.social -e bsky.app -e bsky.network  -e bsky.dev

this hack uses the result (/tmp/envs.txt) of the above as input.

# create table showing { env x container => value } with ops-helper script.
cat ./docker-compose-builder.yaml | ./ops-helper/compose2envtable/main.py -l /tmp/envs.txt -o ./docs/env-container-val.xlsx

this self-hosting env tries to use self-signed certificates as ordinary trusted certificates, by installing the certificates into containers. The expected behavior is: by sharing /etc/ssl/certs/ca-certificates.crt among all containers, each container treats the certificates in ca-certificates.crt as trusted.

unfortunately, this approach works in some containers but not all. It seems to depend on the distribution (debian/alpine/...) and the language (java/nodejs/golang), and no clear rule could be found from the actual behaviors. therefore, all of the methods below are used together for safety when self-signed certificates are used.

  • the host deploys /etc/ssl/certs/ca-certificates.crt to containers by volume mount.
  • env vars for self-signed certificates are defined for each language, such as GOINSECURE and NODE_TLS_REJECT_UNAUTHORIZED.
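
when a container still rejects the self-signed certificate, one thing worth checking is whether the CA bundle it sees actually contains root.crt. the following is a minimal, heuristic sketch; `bundle_has_cert` is a hypothetical helper that compares only the first base64 line of the PEM certificate, and it assumes both files use the same line wrapping.

```shell
# Sketch: heuristic check that a CA bundle contains a given PEM certificate.
# bundle_has_cert is a hypothetical helper, not part of this repository.
# usage: bundle_has_cert <bundle-file> <cert-file>
bundle_has_cert() {
  bundle="$1"; cert="$2"
  # first line after the BEGIN marker of the certificate
  first=$(awk '/-----BEGIN CERTIFICATE-----/ {getline; print; exit}' "$cert")
  # succeed only if that exact line also appears in the bundle
  [ -n "$first" ] && grep -qxF -- "$first" "$bundle"
}
```

for example, after copying /etc/ssl/certs/ca-certificates.crt out of a container to /tmp/bundle.crt, `bundle_has_cert /tmp/bundle.crt ./certs/root.crt` reports by exit status whether the volume mount delivered the expected bundle. a match does not prove that the language runtime in the container reads that bundle.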

components       url (origin)
atproto          https://github.com/bluesky-social/atproto.git
indigo           https://github.com/bluesky-social/indigo.git
social-app       https://github.com/bluesky-social/social-app.git
feed-generator   https://github.com/bluesky-social/feed-generator.git
pds              https://github.com/bluesky-social/pds.git
ozone            https://github.com/bluesky-social/ozone.git
did-method-plc   https://github.com/did-method-plc/did-method-plc.git

other dependencies:

components       url (origin)
reverse proxy    https://github.com/caddyserver/caddy (official docker image of caddy:2)
DNS server       bind9 or others, such as https://github.com/itaru2622/docker-bind9.git

description of test network:

DOMAIN for self-hosting: mysky.local.com

IP:
  - docker host for selfhost: 192.168.1.51
  - DNS server:               192.168.1.27
  - DNS forwarders:           8.8.8.8 (upper level DNS server;dns.google.)

DNS A-Records:
  -   mysky.local.com  : 192.168.1.51
  - *.mysky.local.com  : 192.168.1.51

the above would be described in bind9 configuration file as below:

::::::::::::::
/etc/bind/named.conf
::::::::::::::
include "/etc/bind/rndc.key";
controls {
        inet 127.0.0.1 allow { 127.0.0.1; } keys { "rndc-key"; };
};
options {
        directory         "/etc/bind";
        // UDP 53, from any
        listen-on         { any; };
        // HTTP 80, from any
        listen-on  port 80  tls none http default  { any; };
        listen-on-v6      { none; };
        forwarders        { 8.8.8.8 ; };  # dns.google.
        allow-recursion   { any; };
        allow-query       { any; };
        allow-query-cache { any; };
        allow-transfer    { any; };
};
zone "local.com" { type master; file "zone-local.com"; allow-query { 0.0.0.0/0; }; allow-update { 0.0.0.0/0; }; allow-transfer { 0.0.0.0/0; }; };
::::::::::::::
/etc/bind/zone-local.com
::::::::::::::
$ORIGIN .
$TTL 259200	; 3 days
local.com		IN SOA	local.com. root.local.com. (
				2024022809 ; serial
				3600       ; refresh (1 hour)
				900        ; retry (15 minutes)
				86400      ; expire (1 day)
				3600       ; minimum (1 hour)
				)
			NS	local.com.
			A	192.168.1.27
$ORIGIN local.com.
$TTL 3600	; 1 hour
mysky		A	192.168.1.51
$ORIGIN mysky.local.com.
*			A	192.168.1.51

cf. the simplest way to use the above DNS server (192.168.1.27) temporarily is
to add it to /etc/resolv.conf as below on all testing machines (docker host, client machines for browser)

nameserver 192.168.1.27

as described below, most features now work in the self-hosting environment, but it may not yet work with full capabilities. some of the reasons are described in bluesky-social/atproto#2334

test results with 'asof-2024-04-07r1':

  • ok: relaxing restriction on handle length in PDS (applied bluesky-social/atproto#2392)
  • ok: create account on pds (via social-app, bluesky API).
  • ok: sign-in social-app (with multiple accounts)
  • ok: basic usages on social-app
    • ok: edit profile (display name)
    • ok: post articles
    • ok: vote 'like' / repost to article
    • ok: follow/un-follow others, via their profile page.
    • ok: receive notifications when others 'like' or 'follow'
    • ok: search posts/users/feeds
  • ok: integration with firehose/websocket to pds/bgs(relay) => crawlable.
  • ok: integration with feed-generator (NOTE: the official feed-generator sample has some delay, so it may need a reload on social-app).
  • not tested: integration with moderation(ozone).

test results with 'asof-2024-04-07':

  • ok: create user on pds (via bluesky API).
  • ok: create user on pds on social-app
  • ok: sign-in via social-app (with multiple accounts)
  • ok: edit profile (display name) on social-app
  • ok: post articles on social-app
  • ok: vote 'like' to article on social-app
  • ok: reply to article on social-app
  • ok: start following in others' profile page on social-app
  • ok: receive notification in home, when others mark 'like' or 'follow', on social-app.
  • ok: find posts in 'search' on social-app
  • ok: find users in 'search' on social-app
    • ok: find users with 'display-name' after user configures it in his/her profile page.
    • ok: find users with fully qualified handle name before display-name is configured in his/her profile page.
  • ok: discover feed in '#feeds' on social-app after feed-generator joined and executed feed-generator/scripts/publishFeedGen.ts.
  • ok: pin/unpin feeds to home on social-app after discovering
  • ok: feed-generator subscribes and pushes posts into its feed channel => view them on social-app. (NOTE: it has some delay, so reload on social-app).
  • ok: websocket subscribing to pds/bgs with firehose/websocat.
  • not tested: regarding moderation

test results with 'asof-2024-04-03':

  • ok: create user on pds (via bluesky API).
  • ok: create user on pds on social-app
  • ok: sign-in via social-app (with multiple accounts)
  • ok: edit profile (display name) on social-app
  • ok: post articles on social-app
  • ok: vote 'like' to article on social-app
  • ok: reply to article on social-app
  • ok: start following in others' profile page on social-app
  • ok: receive notification in home, when others mark 'like' or 'follow', on social-app.
  • ok: find posts in 'search' on social-app
  • ok: find users in 'search' on social-app
    • ok: find users with 'display-name' after user configures it in his/her profile page.
    • ok: find users with fully qualified handle name before display-name is configured in his/her profile page.
  • ok: discover feed in '#feeds' on social-app after feed-generator joined and executed feed-generator/scripts/publishFeedGen.ts.
  • ok: pin/unpin feeds to home after discovering
  • not tested: post an article in feed.
  • not tested: view post in feed (channel) on social-app.
  • not tested: regarding moderation
  • ok: websocket subscribing; tested with firehose/websocat to pds/bgs, and feed-generator

test results with 'asof-2024-03-16' (now archived):

  • ok: create user on pds (via bluesky API).
  • NG: create user on pds on social-app (gets stuck after submitting 'continue'. <=> in dev-env, there is 'clear' in the upper right corner on social-app to get out of the stuck state.)
  • ok: sign-in via social-app (with multiple accounts)
  • ok: edit profile (display name) on social-app
  • ok: post articles on social-app
  • ok: vote 'like' to article on social-app
  • ok: reply to article on social-app
  • ok: start following in others' profile page on social-app
  • ok: receive notification in home, when others mark 'like' or 'follow', on social-app.
  • ok: find posts in 'search' on social-app
  • ??: find users in 'search' on social-app
    • ok: find users with 'display-name' after user configures it in his/her profile page.
    • no result (without any error): find users with fully qualified handle name before display-name is configured in his/her profile page.
  • ok: discover feed in '#feeds' on social-app after feed-generator joined and executed feed-generator/scripts/publishFeedGen.ts.
  • ok: pin/unpin feeds to home after discovering
  • not tested: post an article in feed.
  • not tested: view post in feed (channel) on social-app.
  • not tested: regarding moderation
  • ok: websocket subscribing; tested with firehose/websocat to pds/bgs, and feed-generator

test results with 'asof-2024-01-06' (now archived):

  • ok: create user on pds via social-app, and bluesky API.
  • ok: sign-in via social-app (with multiple accounts)
  • ok: post articles on social-app
  • ok: vote 'like' to article on social-app
  • ok: reply to article on social-app
  • ok: start following in others' profile page on social-app
  • ok: receive notification in home, when others mark 'like' or 'follow', on social-app.
  • ok: find posts in 'search' on social-app
  • NG: find users in 'search' on social-app <- reason not yet known.
  • NG: find feeds in 'search' on social-app <- investigation not started
  • not tested: regarding moderation
  • ok: websocket subscribing; tested with firehose/websocat to pds/bgs, and feed-generator

It seems that indexer and feed-generator are not working for an unknown reason, even though they stay in 'up' status.

special thanks to prior works on self-hosting.

hacks in bluesky:
