Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerts & Reports - CSV as Message Content resulting HTTP 500 #28595

Open
3 tasks done
nivdann opened this issue May 19, 2024 · 3 comments
Open
3 tasks done

Alerts & Reports - CSV as Message Content resulting HTTP 500 #28595

nivdann opened this issue May 19, 2024 · 3 comments
Assignees
Labels
alert-reports Namespace | Anything related to the Alert & Reports feature

Comments

@nivdann
Copy link

nivdann commented May 19, 2024

Bug description

I managed to enable my Alerts & Reports as mentioned in documentation, BUT when I try to use the chart report as CSV to send to email, it fails with HTTP 500 as shown in the worker logs.

Alerts and Snapshots work and are sent to my email as expected.
The only thing that doesn't work and fails is CSV charts.

How to reproduce the bug

  1. Go to Alerts & Reports
  2. Add new alert (i added a "select 1;" query with trigger condition != 0 to have the trigger, and also choose a example chart with CSV option)
  3. Wait for trigger

Screenshots/recordings

image
image

Superset version

3.1.3

Python version

3.10

Node version

I don't know

Browser

Not applicable

Additional context

I using helm to deploy to my EKS running AWS.
Using Firefox and geckodriver (defaults).

Superset Config

 ROW_LIMIT = 5000
 ENABLE_PROXY_FIX = True
 ALERT_REPORTS_NOTIFICATION_DRY_RUN = False
FEATURE_FLAGS = {
     "DYNAMIC_PLUGINS": True,
     "ALLOW_ADHOC_SUBQUERY": True,
     "ENABLE_TEMPLATE_PROCESSING": True,
     "DASHBOARD_VIRTUALIZATION": True,
     "DRILL_BY": True,
     "GLOBAL_ASYNC_QUERIES": False,
     "ALERT_REPORTS": True
}
SMTP_HOST = os.getenv("SMTP_HOST")
SMTP_STARTTLS = ast.literal_eval(os.getenv("SMTP_STARTTLS"))
SMTP_SSL = ast.literal_eval(os.getenv("SMTP_SSL"))
SMTP_SSL_SERVER_AUTH = ast.literal_eval(os.getenv("SMTP_SSL_SERVER_AUTH"))
SMTP_PORT = os.getenv("SMTP_PORT")
SMTP_MAIL_FROM = os.getenv("SMTP_MAIL_FROM")
SMTP_USER = ""
SMTP_PASSWORD = ""


from celery.schedules import crontab

class CeleryConfig:
    broker_url = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
    imports = (
        "superset.sql_lab",
        "superset.tasks.cache",
        "superset.tasks.scheduler",
    )
    result_backend = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
    worker_prefetch_multiplier = 10
    task_acks_late = True
    task_annotations = {
        "sql_lab.get_sql_results": {
            "rate_limit": "100/s",
        },
    }
    beat_schedule = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=0, hour=0),
        },
    }
CELERY_CONFIG = CeleryConfig


LOG_LEVEL = 'DEBUG'
SCREENSHOT_LOCATE_WAIT = 100
SCREENSHOT_LOAD_WAIT = 600
EMAIL_PAGE_RENDER_WAIT = 60
WEBDRIVER_BASEURL = "http://{{ template "superset.fullname" . }}:{{ .Values.service.port }}"
WEBDRIVER_BASEURL_USER_FRIENDLY = "http://superset:8088"


SUPERSET_WEBSERVER_TIMEOUT = 300


from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

capabilities = DesiredCapabilities.FIREFOX.copy()
capabilities['proxy'] = {
    "proxyType": "DIRECT" # Direct connection, no proxy
}
WEBDRIVER_CONFIGURATION = {"capabilities": capabilities}
WEBDRIVER_OPTION_ARGS = ["--headless"]

SECRET_KEY = os.getenv("SUPERSET_SECRET_KEY")

Worker logs

[2024-05-19 10:58:00,074: INFO/ForkPoolWorker-1] Scheduling alert test eta: 2024-05-19 10:58:00
[2024-05-19 10:58:00,081: DEBUG/ForkPoolWorker-1] [stats_logger] (incr) reports.execute
[2024-05-19 10:58:00,081: INFO/ForkPoolWorker-1] Executing alert/report, task id: 39937d56-4280-4e16-bde8-011841a62d90, scheduled_dttm: 2024-05-19T10:58:00
[2024-05-19 10:58:00,082: INFO/ForkPoolWorker-1] session is validated: id 15, executionid: 39937d56-4280-4e16-bde8-011841a62d90
[2024-05-19 10:58:00,149: INFO/ForkPoolWorker-1] Running report schedule 39937d56-4280-4e16-bde8-011841a62d90 as user admin
[2024-05-19 10:58:00,385: DEBUG/ForkPoolWorker-1] Parsing with sqlparse statement: select 1;
[2024-05-19 10:58:00,394: DEBUG/ForkPoolWorker-1] Database._get_sqla_engine(). Masked URL: awsathena+rest://athena.us-west-2.amazonaws.com/superset_db?s3_staging_dir=s3%3A%2F%2Fdata%2Fquery_results%2F&work_group=primary
[2024-05-19 10:58:00,397: DEBUG/ForkPoolWorker-1] Database._get_sqla_engine(). Masked URL: awsathena+rest://athena.us-west-2.amazonaws.com/superset_db?s3_staging_dir=s3%3A%2F%2Fdata%2Fquery_results%2F&work_group=primary
[2024-05-19 10:58:01,943: INFO/ForkPoolWorker-1] Query for test took 1552.98 ms
[2024-05-19 10:58:01,963: INFO/ForkPoolWorker-1] Getting chart from http://superset:8088/api/v1/chart/2/data/?format=csv&type=post_processed&force=true as user admin
[2024-05-19 10:58:02,074: INFO/ForkPoolWorker-1] header_data in notifications for alerts and reports {'notification_type': 'Alert', 'notification_source': <ReportSourceFormat.CHART: 'chart'>, 'notification_format': 'CSV', 'chart_id': 2, 'dashboard_id': None, 'owners': [Superset Admin]}, taskid, 39937d56-4280-4e16-bde8-011841a62d90
[2024-05-19 10:58:02,164: DEBUG/ForkPoolWorker-1] Sent an email to ['admin@superset.com']
[2024-05-19 10:58:02,352: INFO/ForkPoolWorker-1] Report sent to email, notification content is None
[2024-05-19 10:58:02,352: DEBUG/ForkPoolWorker-1] [stats_logger] (gauge) reports.email.send.ok1
[2024-05-19 10:58:02,376: ERROR/ForkPoolWorker-1] A downstream exception occurred while generating a report: 39937d56-4280-4e16-bde8-011841a62d90. Failed generating csv HTTP Error 500: Internal Server Error
Traceback (most recent call last):
File "/app/superset/commands/report/execute.py", line 258, in _get_csv_data
csv_data = get_chart_csv_data(chart_url=url, auth_cookies=auth_cookies)
File "/app/superset/utils/csv.py", line 92, in get_chart_csv_data
response = opener.open(chart_url)
File "/usr/local/lib/python3.10/urllib/request.py", line 525, in open
response = meth(req, response)
File "/usr/local/lib/python3.10/urllib/request.py", line 634, in http_response
response = self.parent.error(
File "/usr/local/lib/python3.10/urllib/request.py", line 563, in error
return self._call_chain(*args)
File "/usr/local/lib/python3.10/urllib/request.py", line 496, in _call_chain
result = func(*args)
File "/usr/local/lib/python3.10/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/app/superset/tasks/scheduler.py", line 98, in execute
).run()
File "/app/superset/commands/report/execute.py", line 729, in run
raise ex
File "/app/superset/commands/report/execute.py", line 727, in run
).run()
File "/app/superset/commands/report/execute.py", line 689, in run
).next()
File "/app/superset/commands/report/execute.py", line 586, in next
raise first_ex
File "/app/superset/commands/report/execute.py", line 555, in next
self.send()
File "/app/superset/commands/report/execute.py", line 452, in send
notification_content = self._get_notification_content()
File "/app/superset/commands/report/execute.py", line 364, in _get_notification_content
csv_data = self._get_csv_data()
File "/app/superset/commands/report/execute.py", line 262, in _get_csv_data
raise ReportScheduleCsvFailedError(
superset.commands.report.exceptions.ReportScheduleCsvFailedError: Failed generating csv HTTP Error 500: Internal Server Error

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Copy link

dosubot bot commented May 19, 2024

The HTTP 500 error you're encountering when trying to send chart data as CSV via email in Apache Superset seems to be related to the process of fetching and generating the CSV data from the chart URL. Here are some targeted steps to troubleshoot and potentially resolve this issue:

  1. Verify Chart URL Accessibility: Ensure the chart URL is accessible and that any required authentication is correctly handled. The error could stem from issues in fetching data due to authentication or accessibility problems.

  2. Data Handling in CSV Generation: Check the process of converting chart data to a pandas DataFrame and the subsequent CSV conversion. Ensure that this process does not introduce errors, especially with complex data types or large datasets.

  3. CSV Content and Email Attachment: Confirm that the generated CSV content is correctly formatted and encoded for email attachment. Malformed or empty CSV data could lead to failures in the email sending process.

  4. Logging and Error Handling: Enhance logging around the CSV generation and email sending process to capture detailed information about the failure. This can provide insights into what part of the process is causing the HTTP 500 error.

  5. Configuration and Permissions: Review Superset and server configurations related to CSV export and email sending. Ensure there are no misconfigurations or permission issues that could be causing the problem.

  6. Superset Updates: Make sure your Superset installation is up to date. If this issue is known, it might have been addressed in a more recent version of Superset.

  7. Integration Tests for Insights: The tests/integration_tests/reports/commands_tests.py file in the Superset codebase contains tests related to report scheduling and sending, including CSV content. Reviewing these tests might offer clues to resolving your issue.

For further assistance, you might want to follow the discussion Alerts & Reports - CSV as Message Content resulting HTTP 500 for any updates or solutions from the community.

Details

Sources


About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

@dosubot dosubot bot added the alert-reports Namespace | Anything related to the Alert & Reports feature label May 19, 2024
@gaful
Copy link

gaful commented Jun 6, 2024

Hello, did you find a solution ? we are facing the same issue.

many thanks
G

@gaful
Copy link

gaful commented Jun 6, 2024

Hello to add a little more details :

We also faced some issue like that

  1. with large csv files (>10M)
  2. with charts containing null value -> I don't know why the charts export works fine, but the reports doesn't work when charts contains null value.

For our case :

  • Reports were working on 3.1 version
  • We are facing the issue since 4.0.1 migration
  • The issue is only with csv file, while the same charts works perfectly fine in txt , it is not a charts issue size.

cdlt
G

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alert-reports Namespace | Anything related to the Alert & Reports feature
Projects
None yet
Development

No branches or pull requests

3 participants