[Bug]: Unable to export logs to Datadog #1210

Open
huib-coalesce opened this issue Dec 8, 2023 · 4 comments
Labels
bug (Something isn't working), needs triage, p2

Comments

@huib-coalesce

Related Template(s)

Pub/Sub to Datadog template

What happened?

Exported log messages don't appear in Datadog.

Steps to reproduce

I used the following Terraform definition to set up the infrastructure.

main.tf

locals {
  datadog_iam_roles = [
    "roles/monitoring.viewer",
    "roles/compute.viewer",
    "roles/cloudasset.viewer",
    "roles/browser"
  ]
  dataflow_iam_roles = [
    "roles/dataflow.admin",
    "roles/dataflow.worker",
    "roles/pubsub.viewer",
    "roles/pubsub.subscriber",
    "roles/pubsub.publisher",
    "roles/storage.objectAdmin"
  ]
}

# Service Account for Datadog integration with Google Cloud Platform
resource "google_service_account" "datadog_service_account" {
  project      = var.project_id
  account_id   = "${var.short_name}-datadog-viewer"
  display_name = "${var.short_name}-datadog-viewer"
}

resource "google_project_iam_member" "datadog_iam" {
  for_each = toset(local.datadog_iam_roles)

  project = var.project_id
  role    = each.value
  member  = "serviceAccount:${google_service_account.datadog_service_account.email}"
}

# Generate Datadog service account from https://app.datadoghq.com/integrations/google-cloud-platform
resource "google_service_account_iam_member" "token-creator-iam" {
  service_account_id = google_service_account.datadog_service_account.id
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "serviceAccount:dd-abc@xyz.iam.gserviceaccount.com"
}

# Log Router Sink
module "log_export" {
  source                 = "terraform-google-modules/log-export/google"
  version                = "~> 7.0"
  destination_uri        = module.destination.destination_uri
  filter                 = "severity >= INFO"
  log_sink_name          = "${var.short_name}-dd-log-sink"
  parent_resource_id     = var.project_id
  parent_resource_type   = "project"
  unique_writer_identity = true
}

# Pub/Sub Topic and Pull Subscription
module "destination" {
  source                   = "terraform-google-modules/log-export/google//modules/pubsub"
  version                  = "~> 7.0"
  project_id               = var.project_id
  topic_name               = "${var.short_name}-dd-topic"
  log_sink_writer_identity = module.log_export.writer_identity
  create_subscriber        = true
  create_push_subscriber   = false
}

# Enable the Dataflow API
resource "google_project_service" "dataflow_job_service" {
  project = var.project_id
  service = "dataflow.googleapis.com"
}

# Topic for undeliverable messages
resource "google_pubsub_topic" "dead_letter_topic" {
  name                       = "${var.short_name}-dd-dead-letter"
  project                    = var.project_id
  message_retention_duration = "86400s"
}

# Cache bucket
resource "google_storage_bucket" "dataflow_tmp_bucket" {
  name     = "${var.project_id}-dataflow-cache"
  location = var.region
  project  = var.project_id
}

# Service Account for exporting to Datadog
resource "google_service_account" "dataflow_service_account" {
  project      = var.project_id
  account_id   = "${var.short_name}-datadog-dataflow"
  display_name = "${var.short_name}-datadog-dataflow"
}

resource "google_project_iam_member" "dataflow_iam" {
  for_each = toset(local.dataflow_iam_roles)

  project = var.project_id
  role    = each.value
  member  = "serviceAccount:${google_service_account.dataflow_service_account.email}"
}

# echo -n "my-datadog-api-key" | gcloud secrets create datadog-api-key --data-file=- --project my-google-project
data "google_secret_manager_secret_version" "datadog-api-key" {
  secret  = "datadog-api-key"
  project = "my-google-project"
}

resource "google_project_iam_member" "dataflow_secret_iam" {
  project = "my-google-project"
  role    = "roles/secretmanager.secretAccessor"
  member  = "serviceAccount:${google_service_account.dataflow_service_account.email}"
}

# Dataflow Job using the PubSub to Datadog template
resource "google_dataflow_job" "datadog_dataflow" {
  name                  = "${var.short_name}-dd-dataflow"
  project               = var.project_id
  region                = var.region
  template_gcs_path     = "gs://dataflow-templates/latest/Cloud_PubSub_to_Datadog"
  temp_gcs_location     = google_storage_bucket.dataflow_tmp_bucket.url
  service_account_email = google_service_account.dataflow_service_account.email
  parameters            = {
    inputSubscription     = module.destination.pubsub_subscription
    outputDeadletterTopic = google_pubsub_topic.dead_letter_topic.id
    url                   = "https://http-intake.logs.datadoghq.com"
    includePubsubMessage  = true
    apiKeySource          = "SECRET_MANAGER"
    apiKeySecretId        = data.google_secret_manager_secret_version.datadog-api-key.name
  }
}
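
As an aside, the roles/secretmanager.secretAccessor grant above is bound project-wide. It can be scoped to the single datadog-api-key secret instead; a minimal sketch, assuming the same secret and service account as above:

# Hypothetical narrower alternative to the project-wide secretAccessor grant
resource "google_secret_manager_secret_iam_member" "dataflow_secret_iam_scoped" {
  project   = "my-google-project"
  secret_id = "datadog-api-key"
  role      = "roles/secretmanager.secretAccessor"
  # Same Dataflow worker service account as in the config above
  member    = "serviceAccount:${google_service_account.dataflow_service_account.email}"
}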

variables.tf

variable "project_id" {
  description = "Google Project id"
}

variable "short_name" {
  type        = string
  description = "Short descriptor name of the Google Project"
  validation {
    condition     = length(var.short_name) < 8
    error_message = "Please use a descriptor shorter than 8 chars"
  }
}

variable "region" {
  type        = string
  description = "The Google Project default region"
}

terraform.tfvars

project_id = "my-project-id"
short_name = "dev"
region     = "us-central1"

Result

I have a Log Router Sink.

That's connected to a Pub/Sub Topic:
[screenshot]

It has a Topic Subscription with a Delivery type of Pull:
[screenshot]

And finally, there's the Dataflow Job:
[screenshot]

However, it appears to me that the messages are never sent to Datadog.

But there are no errors in the logs:
[screenshot]

And there's nothing in Datadog:
[screenshot]

Beam Version

Newer than 2.46.0

Relevant log output

No response

huib-coalesce added the bug, needs triage, and p2 labels on Dec 8, 2023
@bvolpato
Contributor

bvolpato commented Jan 7, 2024

@huib-coalesce Sorry, missed this before.

Aren't the messages being sent to the error output? Filing a Google Cloud case with job IDs might be useful so the team can look further.

@huib-coalesce
Author

No, they're not, as far as I'm aware.

ConvertToDataDogEvent shows data going in (left) and out (right):
[screenshot]

Create KV pairs shows data going in, but not out:
[screenshot]

Write Datadog events shows no incoming data:
[screenshot]

WrapDatadogWriteErrors shows no incoming/outgoing data:
[screenshot]

FlattenErrors shows no incoming/outgoing data:
[screenshot]

Same for WriteFailedRecords:
[screenshot]

And the dev-dd-dead-letter topic doesn't show anything either:
[screenshot]
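
Worth noting for anyone debugging this: the dead-letter topic in main.tf has topic-level retention (86400s) but no subscription attached, so there is no straightforward way to pull and inspect anything that does land on it. A minimal sketch of a pull subscription, assuming the resource names from main.tf above; the subscription name is hypothetical:

# Hypothetical pull subscription so dead-lettered messages can be inspected
resource "google_pubsub_subscription" "dead_letter_sub" {
  name    = "${var.short_name}-dd-dead-letter-sub"
  project = var.project_id
  topic   = google_pubsub_topic.dead_letter_topic.id
}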

@GurayCetin

Could you please check that the Log Router Sink is getting logs properly?
I suspect you are missing the "roles/pubsub.publisher" role for the sink.

In my case, I created the sink myself rather than with the module, but maybe it can show you how to add it.

resource "google_logging_project_sink" "datadog_sink" {
  name                   = "kubernetes_container_error_logs"
  destination            = "pubsub.googleapis.com/${google_pubsub_topic.export_logs_to_datadog.id}"
  filter                 = "resource.type=k8s_container AND log_id(stderr) AND severity>=ERROR"
  unique_writer_identity = true
}

resource "google_project_iam_member" "pubsub_publisher_permisson_sink" {
  project = var.project_id
  role    = "roles/pubsub.publisher"
  member  = google_logging_project_sink.datadog_sink.writer_identity
}
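
A narrower variant of the same grant binds roles/pubsub.publisher on the topic itself rather than project-wide; a sketch assuming the topic and sink names from the snippet above:

resource "google_pubsub_topic_iam_member" "pubsub_publisher_topic_sink" {
  project = var.project_id
  topic   = google_pubsub_topic.export_logs_to_datadog.id
  role    = "roles/pubsub.publisher"
  member  = google_logging_project_sink.datadog_sink.writer_identity
}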

@huib-coalesce
Author

Could you please check that the Log Router Sink is getting logs properly?

The Log Router Sink is receiving content:
[screenshot]

I suspect you are missing the "roles/pubsub.publisher" role for the sink

To try your suggestion, I've added:

resource "google_project_iam_member" "pubsub_publisher_permisson_sink" {
  project = var.project_id
  role    = "roles/pubsub.publisher"
  member  = module.log_export.writer_identity
}

And changed the filter from
(severity >= DEBUG AND resource.type=\"cloud_function\") OR (severity >= WARNING AND resource.type=\"cloud_run_revision\")
to
severity>=ERROR

which ensures there are regular messages to pass through:
[screenshot]

The Log Router Sink:

[screenshot]

The service account:

[screenshot]

The Topic Metrics show content arriving:

[screenshot]

Same for the Topic Subscription:

[screenshot]

The point where messages come in, but do not go out:

[screenshot]

I did, however, notice:

[screenshot]

But I don't know the significance of that.
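
One more hedged thought, since nothing at all reaches Datadog: https://http-intake.logs.datadoghq.com is the logs intake for the US1 Datadog site only. If the Datadog org lives on a different site (for example EU), the intake host differs, and events sent there with that org's API key will not show up. Assuming an EU org, only the url parameter of the Dataflow job would change:

parameters = {
  # ... all other parameters as in main.tf above ...
  url = "https://http-intake.logs.datadoghq.eu" # EU-site logs intake
}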
