Deployment via gcloud shell
Before following the deployment instructions, you need to create your own resources on GCP. That includes:
- Topics (follow the link: Creating a topic)
- Subscriptions for streaming pull and pull (follow the link: Creating subscriptions)
- Buckets to store messages (follow the link: Creating storage buckets)
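As a hedged sketch, these resources can also be created from Cloud Shell with the gcloud and gsutil CLIs; the names below are illustrative examples, not fixed values:

```shell
# Illustrative resource names - substitute your own
gcloud pubsub topics create msg_demo_push

# One subscription for pull and one for streaming pull
gcloud pubsub subscriptions create sub-syn-aqua-demo-per-pull --topic msg_demo_push
gcloud pubsub subscriptions create sub-syn-aqua-demo-per-streaming --topic msg_demo_push

# Regional bucket for persisted messages
gsutil mb -l europe-west3 gs://syn-aqua-bkt-per-pull
```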
After creating the resources, open Cloud Shell and clone the GitHub repository.
git clone https://github.com/syntio/aquarium-persistor-gcp.git
Next, these services should be enabled:
gcloud services enable cloudbuild.googleapis.com
gcloud services enable cloudfunctions.googleapis.com
The current directory needs to be changed in order to create the vendor directory:
cd aquarium-persistor-gcp/push
go mod vendor
The following command must be executed; otherwise the error "problem reading metadata from context: unable to find metadata" occurs (for more details, see https://github.com/googleapis/google-cloud-go/issues/1947):
rm -r vendor/cloud.google.com/go/functions/metadata
It is now possible to execute the following commands to create a function with a Pub/Sub trigger.
export CF_PUSH_NAME=[NAME_OF_FUNCTION] #e.g. cf_demo_per_push
export TOPIC_NAME=[NAME_OF_CREATED_TOPIC] #e.g. msg_demo_push -> note: has to be created first!
export MSG_PREFIX=[PREFIX_OF_FILE_STORED_ON_STORAGE | msg]
export MSG_EXTENSION=[EXTENSION_OF_FILE_STORED_ON_STORAGE | txt]
export BUCKET_ID=[NAME_OF_CREATED_BUCKET] #e.g. syn-aqua-bkt-per-push -> note: has to be created first!
gcloud functions deploy ${CF_PUSH_NAME} \
--runtime go113 \
--entry-point PushHandler \
--trigger-topic ${TOPIC_NAME} \
--memory 256MB \
--timeout 60s \
--region europe-west3 \
--retry \
--set-env-vars=MSG_PREFIX=${MSG_PREFIX},MSG_EXTENSION=${MSG_EXTENSION},BUCKET_ID=${BUCKET_ID}
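Once the deployment finishes, the push function can be smoke-tested by publishing a message to the trigger topic and checking that an object appears in the bucket (a sketch reusing the environment variables set above):

```shell
# Publish a test message to the trigger topic
gcloud pubsub topics publish ${TOPIC_NAME} --message '{"test": "hello"}'

# After a few seconds, the persisted object should be listed in the bucket
gsutil ls gs://${BUCKET_ID}
```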
Definition of variables passed to the gcloud command:
- CF_PUSH_NAME - name of the Cloud Function that stores each message using the push mechanism
- TOPIC_NAME - name of the topic from which the messages will be received
- MSG_PREFIX - prefix of the file name
- MSG_EXTENSION - file extension (txt, json, yaml, etc.)
- BUCKET_ID - ID of the bucket in which messages will be stored
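As a purely hypothetical illustration (the actual naming scheme depends on the function's implementation), a persisted object name could combine the prefix, a Pub/Sub message ID, and the extension like this:

```shell
# Hypothetical object-name construction from the configured prefix and extension
MSG_PREFIX=msg
MSG_EXTENSION=txt
MESSAGE_ID=1234567890   # example Pub/Sub message ID
OBJECT_NAME="${MSG_PREFIX}_${MESSAGE_ID}.${MSG_EXTENSION}"
echo "${OBJECT_NAME}"
```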
The current directory needs to be changed in order to create the vendor directory:
cd aquarium-persistor-gcp/pull
go mod vendor
Executing the following commands will create a Cloud Function that pulls messages from a subscription.
gcloud services enable cloudscheduler.googleapis.com
gcloud services enable appengine.googleapis.com
export CF_PULL_NAME=[NAME_OF_FUNCTION] #e.g. cf_demo_per_pull
export PROJECT_ID=[PROJECT_ID]
export SUB_ID=[SUB_ID] #e.g. sub-syn-aqua-demo-per-pull -> note: has to be created first!
export REGION=[REGION] #e.g. europe-west3
export BUCKET_ID=[NAME_OF_CREATED_BUCKET] #e.g. syn-aqua-bkt-per-pull -> note: has to be created first!
export MSG_PREFIX=[PREFIX_OF_FILE_STORED_ON_STORAGE | msg]
export MSG_EXTENSION=[EXTENSION_OF_FILE_STORED_ON_STORAGE | txt]
export SYNCHRONOUS=[SYNCHRONOUS | true]
export MAX_OUTSTANDING_MSGS=[MAX_OUTSTANDING_MSGS | 3]
export MAX_OUTSTANDING_BYTES=[MAX_OUTSTANDING_BYTES | 1000000000]
export NUM_OF_GOROUTINS=[NUM_OF_GOROUTINS | 1]
export MAX_EXTENSION=[MAX_EXTENSION | -1]
gcloud functions deploy ${CF_PULL_NAME} \
--runtime go113 \
--entry-point PullHandler \
--trigger-http \
--memory 256MB \
--timeout 60s \
--region ${REGION} \
--set-env-vars=PROJECT_ID=${PROJECT_ID},SUB_ID=${SUB_ID},BUCKET_ID=${BUCKET_ID},MSG_PREFIX=${MSG_PREFIX},MSG_EXTENSION=${MSG_EXTENSION},SYNCHRONOUS=${SYNCHRONOUS},MAX_OUTSTANDING_MSGS=${MAX_OUTSTANDING_MSGS},MAX_OUTSTANDING_BYTES=${MAX_OUTSTANDING_BYTES},NUM_OF_GOROUTINS=${NUM_OF_GOROUTINS},MAX_EXTENSION=${MAX_EXTENSION}
NOTE: When creating the function, it is necessary to allow unauthenticated invocations when prompted: Allow unauthenticated invocations of new function [test-deploy-streaming-pull]? (y/N)? y
Definition of variables passed to the gcloud command:
- CF_PULL_NAME - name of the Cloud Function that pulls and stores messages
- PROJECT_ID - ID of the project in which the topic is located
- SUB_ID - ID of the subscription the messages will be pulled from
- BUCKET_ID - ID of the bucket in which messages will be stored
- MSG_PREFIX - prefix of the file name
- MSG_EXTENSION - file extension (txt, json, yaml, etc.)
These parameters are used for ReceiveSettings on the pull subscription:
- SYNCHRONOUS - bool; determines whether pull or streaming pull is used
- MAX_OUTSTANDING_MSGS - maximum number of unprocessed messages (unacknowledged but not yet expired)
- MAX_OUTSTANDING_BYTES - maximum total size of unprocessed messages
- NUM_OF_GOROUTINS - number of goroutines that each data structure along the Receive path will spawn
- MAX_EXTENSION - maximum period for which the subscription should automatically extend the ack deadline for each message
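As an illustrative (not prescriptive) sketch, the two pull modes might be configured as follows; the concrete values are assumptions for demonstration only:

```shell
# Synchronous pull: bounded batches, predictable per-invocation work
export SYNCHRONOUS=true
export MAX_OUTSTANDING_MSGS=3
export NUM_OF_GOROUTINS=1
export MAX_EXTENSION=-1    # -1 leaves the ack-deadline extension at the client default

# Streaming pull: keep the stream open and rely on client-side flow control
export SYNCHRONOUS=false
export MAX_OUTSTANDING_MSGS=100
```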
In order to run the pull function, it is necessary to create an auxiliary function called Invoker. Position yourself in the invoker folder and create the vendor directory:
cd ../invoker
go mod vendor
Executing the following commands will create an invoker.
NOTE: the pull function must be created first so that its URL can be set.
export CF_INVOKER_NAME=[NAME_OF_FUNCTION]
export NUM_OF_INSTANCES=[NUM_OF_INSTANCES | 3]
export FUNC_URL=https://${REGION}-${PROJECT_ID}.cloudfunctions.net/${CF_PULL_NAME}
export NUM_OF_MESSAGES=[NUM_OF_MESSAGES | 100]
export NUM_OF_SECONDS=[NUM_OF_SECONDS | 300]
gcloud functions deploy ${CF_INVOKER_NAME} \
--entry-point InvokerHandler \
--runtime go113 \
--trigger-http \
--memory 256MB \
--timeout 60s \
--region ${REGION} \
--set-env-vars=NUM_OF_INSTANCES=${NUM_OF_INSTANCES},FUNC_URL=${FUNC_URL},NUM_OF_MESSAGES=${NUM_OF_MESSAGES},NUM_OF_SECONDS=${NUM_OF_SECONDS}
Definition of variables passed to the gcloud command for deploying the Invoker Cloud Function:
- CF_INVOKER_NAME - name of the Cloud Function that invokes instances of the pull Cloud Function
- NUM_OF_INSTANCES - number of pull function instances that will run in parallel
- FUNC_URL - URL of the Cloud Function that will be triggered by the invoker (CF_PULL_NAME)
- NUM_OF_MESSAGES - number of messages the pull function will persist in one call (applies only to the synchronous version of pull)
- NUM_OF_SECONDS - time duration over which the messages will be received
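The deployed invoker (and, likewise, the pull function) can also be triggered manually over HTTP; a sketch using an identity token from the active gcloud account:

```shell
# Manually invoke the deployed invoker function once
curl -H "Authorization: bearer $(gcloud auth print-identity-token)" \
  "https://${REGION}-${PROJECT_ID}.cloudfunctions.net/${CF_INVOKER_NAME}"
```

Since unauthenticated invocations were allowed at deploy time, a plain curl without the Authorization header should also work.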
To fully automate the process, it is still necessary to create a Cloud Scheduler job for the Invoker function.
# create an App Engine app if it doesn't already exist
gcloud app create --region=${REGION}
# make sure this variable was declared in the previous step! Uncomment if necessary
# export CF_INVOKER_NAME=[NAME_OF_INVOKER_FUNCTION]
export SCHEDULER_JOB_NAME=[SCHEDULER_JOB_NAME] #e.g. job-syn-aqua-pull_scheduler
export SERVICE_ACCOUNT_EMAIL=[test@project.iam.gserviceaccount.com]
export SCHEDULER_JOB_URI=https://${REGION}-${PROJECT_ID}.cloudfunctions.net/${CF_INVOKER_NAME}
gcloud scheduler jobs create http ${SCHEDULER_JOB_NAME} \
--schedule "* * * * *" \
--uri=${SCHEDULER_JOB_URI} \
--http-method GET \
--oidc-service-account-email=${SERVICE_ACCOUNT_EMAIL}
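The scheduler job can be verified, and triggered immediately without waiting for the next cron tick, with:

```shell
# Trigger the job right away and confirm it exists
gcloud scheduler jobs run ${SCHEDULER_JOB_NAME}
gcloud scheduler jobs list
```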
The streaming pull deployment process is the same as for pull, with the exception of the MAX_EXTENSION parameter, which is not set.
gcloud services enable cloudscheduler.googleapis.com
gcloud services enable appengine.googleapis.com
export CF_PULL_NAME=[NAME_OF_FUNCTION]
export PROJECT_ID=[PROJECT_ID]
export SUB_ID=[SUB_ID]
export REGION=[REGION] #e.g. europe-west3
export BUCKET_ID=[NAME_OF_CREATED_BUCKET]
export MSG_PREFIX=[PREFIX_OF_FILE_STORED_ON_STORAGE | msg]
export MSG_EXTENSION=[EXTENSION_OF_FILE_STORED_ON_STORAGE | txt]
export MAX_OUTSTANDING_MSGS=[MAX_OUTSTANDING_MSGS | 3]
export MAX_OUTSTANDING_BYTES=[MAX_OUTSTANDING_BYTES | 1000000000]
export NUM_OF_GOROUTINS=[NUM_OF_GOROUTINS | 1]
gcloud functions deploy ${CF_PULL_NAME} \
--runtime go113 \
--entry-point StreamingPullHandler \
--trigger-http \
--memory 256MB \
--timeout 60s \
--region ${REGION} \
--set-env-vars=PROJECT_ID=${PROJECT_ID},SUB_ID=${SUB_ID},BUCKET_ID=${BUCKET_ID},MSG_PREFIX=${MSG_PREFIX},MSG_EXTENSION=${MSG_EXTENSION},MAX_OUTSTANDING_MSGS=${MAX_OUTSTANDING_MSGS},MAX_OUTSTANDING_BYTES=${MAX_OUTSTANDING_BYTES},NUM_OF_GOROUTINS=${NUM_OF_GOROUTINS}
As mentioned earlier in the pull description, it is necessary to create an Invoker. The only difference is that the NUM_OF_MESSAGES parameter isn't used in the commands.
Finally, a Cloud Scheduler job has to be created.