
MongoDB Batch to BigQuery not running on the specified machine type #1489

Open
jonatansthlmstratlab opened this issue Apr 30, 2024 · 1 comment
Labels
bug Something isn't working needs triage p2

Comments

@jonatansthlmstratlab

Related Template(s)

MongoDB_to_BigQuery

Template Version

latest

What happened?

Hello,

I've been trying to run a batch Dataflow job for MongoDB to BigQuery through both the console and Cloud Shell, and I keep getting this error in the Dataflow logs:

textPayload: "Failed to start the VM, launcher-XXX, used for launching because of status code: UNAVAILABLE, reason: One or more operations had an error: 'operation-XXX': [UNAVAILABLE] 'HTTP_503'.."

And this error in the VM-Instance logs:

message: "A n1-standard-1 VM instance is currently unavailable in the europe-north1-b zone. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."

I've tried running the job in different regions and zones without success.

I ran a test today with the word-count template: it failed with the same error on n1-standard-1 machines, but it worked once I switched to n2-standard-2.

However, when I run the MongoDB to BigQuery template and specify the n2-standard-2 machine type instead of the default (through both the console and Cloud Shell), I still get the error:

message: "A n1-standard-1 VM instance is currently unavailable in the europe-north1-b zone. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."

This is my code for running it in Cloud Shell:
gcloud dataflow flex-template run mongodbtobigquery \
  --project=XXX \
  --region=europe-north1 \
  --template-file-gcs-location=gs://dataflow-templates-europe-north1/latest/flex/MongoDB_to_BigQuery \
  --parameters outputTableSpec=XXX,mongoDbUri=XXX,database=XXX,collection=XXX,userOption=NONE,workerMachineType=n2-standard-2
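As a side note, a variant might be worth recording here: the gcloud reference documents a `--worker-machine-type` flag on `flex-template run`, so the machine type can also be passed as a flag rather than as a template parameter. This is a sketch only (flag name taken from the gcloud reference, XXX values are the same placeholders as above, not something I've verified fixes the launcher VM):

```shell
# Hypothetical variant: pass the worker machine type via the documented
# gcloud flag instead of the workerMachineType template parameter.
# XXX values are placeholders; untested against this template.
gcloud dataflow flex-template run mongodbtobigquery \
  --project=XXX \
  --region=europe-north1 \
  --template-file-gcs-location=gs://dataflow-templates-europe-north1/latest/flex/MongoDB_to_BigQuery \
  --worker-machine-type=n2-standard-2 \
  --parameters outputTableSpec=XXX,mongoDbUri=XXX,database=XXX,collection=XXX,userOption=NONE
```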

See below for full logs of the Error in VM-Instance.

Relevant log output

{
  insertId: "XXX"
  labels: {
    compute.googleapis.com/root_trigger_id: "XXX"
  }
  logName: "projects/XXX/logs/cloudaudit.googleapis.com%2Factivity"
  operation: {
    id: "operation-XXX"
    last: true
    producer: "compute.googleapis.com"
  }
  protoPayload: {
    @type: "type.googleapis.com/google.cloud.audit.AuditLog"
    authenticationInfo: {
      principalEmail: "service-XXX@dataflow-service-producer-prod.iam.gserviceaccount.com"
    }
    methodName: "beta.compute.instances.insert"
    request: {
      @type: "type.googleapis.com/compute.instances.insert"
    }
    requestMetadata: {
      callerIp: "private"
      callerSuppliedUserAgent: "cloud_workflow_service"
      destinationAttributes: {
      }
      requestAttributes: {
      }
    }
    resourceName: "projects/XXX/zones/europe-north1-b/instances/launcher-XXX"
    serviceName: "compute.googleapis.com"
    status: {
      code: 8
      details: [
        0: {
          @type: "type.googleapis.com/google.protobuf.Struct"
          value: {
            errorInfo: [
              0: {
                domain: "compute.googleapis.com"
                metadata: {
                  attachment: ""
                  vmType: "n1-standard-1"
                  zone: "europe-north1-b"
                  zonesAvailable: ""
                }
                reason: "resource_availability"
              }
            ]
            help: [
              0: {
                links: [
                  0: {
                    description: "Troubleshooting documentation"
                    url: "https://cloud.google.com/compute/docs/resource-error"
                  }
                ]
              }
            ]
            localizedMessage: [
              0: {
                locale: "en-US"
                message: "A n1-standard-1 VM instance is currently unavailable in the europe-north1-b zone. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."
              }
            ]
            zoneResourcePoolExhausted: {
              resource: {
                project: {
                  canonicalProjectId: "XXX"
                }
                resourceName: "europe-north1-b"
                resourceType: "ZONE"
                scope: {
                  scopeName: "global"
                  scopeType: "GLOBAL"
                }
              }
            }
          }
        }
      ]
      message: "ZONE_RESOURCE_POOL_EXHAUSTED"
    }
  }
  receiveTimestamp: "2024-04-30T08:26:19.959180125Z"
  resource: {
    labels: {
      instance_id: "XXX"
      project_id: "XXX"
      zone: "europe-north1-b"
    }
    type: "gce_instance"
  }
  severity: "ERROR"
  timestamp: "2024-04-30T08:26:19.100256Z"
}
@jonatansthlmstratlab (Author)

After some digging around, I found this in dataflow_job.tf:

variable "launcher_machine_type" {
  type        = string
  description = "The machine type to use for launching the job. The default is n1-standard-1."
  default     = null
}

variable "machine_type" {
  type        = string
  description = "The machine type to use for the job."
  default     = null
}

The launcher_machine_type parameter doesn't seem to be settable through either Cloud Shell or the console, which I assume is why the launcher VM always falls back to the default n1-standard-1 machine type instead of the machine I'm specifying in the workerMachineType parameter.
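If that's right, calling the Dataflow REST API directly might be a workaround: the flexTemplates.launch request body has an environment.launcherMachineType field (field name from the Dataflow API reference; I haven't verified it changes the launcher VM for this template). A sketch with the same XXX placeholders:

```shell
# Hypothetical workaround: launch the flex template via the Dataflow REST API
# and set environment.launcherMachineType (field name from the Dataflow API
# reference; XXX values are placeholders; untested against this template).
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://dataflow.googleapis.com/v1b3/projects/XXX/locations/europe-north1/flexTemplates:launch" \
  -d '{
    "launchParameter": {
      "jobName": "mongodbtobigquery",
      "containerSpecGcsPath": "gs://dataflow-templates-europe-north1/latest/flex/MongoDB_to_BigQuery",
      "parameters": {
        "outputTableSpec": "XXX",
        "mongoDbUri": "XXX",
        "database": "XXX",
        "collection": "XXX",
        "userOption": "NONE"
      },
      "environment": {
        "machineType": "n2-standard-2",
        "launcherMachineType": "n2-standard-2"
      }
    }
  }'
```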
