Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add direct syslog logging per workflow/task #340

Open
jbbarth opened this issue May 14, 2018 · 1 comment
Open

Add direct syslog logging per workflow/task #340

jbbarth opened this issue May 14, 2018 · 1 comment

Comments

@jbbarth
Copy link
Collaborator

jbbarth commented May 14, 2018

Today simpleflow only logs to stdout and with little contextual information on the performed actions. It falls flat when trying to identify logs from a specific workflow or activity at a given point in time, given that machines often perform different tasks simultaneously.

I'd like to add a second handler, optional, to a remote syslog target (think rsyslog remote server, or a hosted service such as logmatic or papertrail).

It would log things in the following format (please criticize):

# for tasks performed "inside" a workflow
<workflow_id>: <activity|decision>#<started_event_id> <log level> pid=<pid> <message>

# for tasks performed "outside" a workflow (workers initialization, etc.)
<message>
# or (tell me?)
simpleflow: <message>

The result would look like this in papertrail or your favorite syslog server:

May 14 10:22:19 127.0.0.1 logger: INFO pid=3817 using task list basic-bcb8a2b0092072ef-ca7d812541f544fe964ded26ae5cc8db 
May 14 10:22:19 127.0.0.1 logger: INFO pid=3823 starting <bound method ActivityPoller.start of ActivityPoller(domain=TestDomain, task_list=basic-bcb8a2b0092072ef-ca7d812541f544fe964ded26ae5cc8db)> 
May 14 10:22:19 127.0.0.1 logger: INFO pid=3822 starting <bound method DeciderPoller.start of DeciderPoller(TestDomain, basic-bcb8a2b0092072ef-ca7d812541f544fe964ded26ae5cc8db, basic)> 
May 14 10:22:19 127.0.0.1 logger: INFO pid=3824 starting ActivityPoller(task_list=basic-bcb8a2b0092072ef-ca7d812541f544fe964ded26ae5cc8db) on domain TestDomain 
May 14 10:22:19 127.0.0.1 logger: INFO pid=3825 starting DeciderPoller(workflow=basic) on domain TestDomain 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: decision#3 INFO pid=3826 taking decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: decision#3 INFO pid=3826 execution context from decider: {'workflow_id': u'basic-bcb8a2b0092072ef', 'version': u'example', 'name': u'basic', 'run_id': u'22ukk/DSnnnnnnnnnS5OR7mOxdUg1Y=', 'parent_workflow_id': None, 'tag_list': [u'a=1', u'b=foo'], 'parent_run_id': None, 'domain_name': u'TestDomain', 'continued_execution_run_id': None} 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: decision#3 INFO pid=3826 1 open activities (1 decisions) 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: decision#3 INFO pid=3826 completing decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: activity#6 INFO pid=3827 activity context: {'activity_id': u'activity-examples.basic.increment-1', 'name': u'examples.basic.increment', 'run_id': u'22ukk/DSnnnnnnnnnS5OR7mOxdUg1Y=', 'domain_name': u'TestDomain', 'workflow_id': u'basic-bcb8a2b0092072ef', 'version': u'example', 'input': u'{"args":[1],"kwargs":{}}'} 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: decision#9 INFO pid=3828 taking decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:19 127.0.0.1 basic-bcb8a2b0092072ef: decision#9 INFO pid=3828 execution context from decider: {'workflow_id': u'basic-bcb8a2b0092072ef', 'version': u'example', 'name': u'basic', 'run_id': u'22ukk/DSnnnnnnnnnS5OR7mOxdUg1Y=', 'parent_workflow_id': None, 'tag_list': [u'a=1', u'b=foo'], 'parent_run_id': None, 'domain_name': u'TestDomain', 'continued_execution_run_id': None} 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#9 INFO pid=3828 2 open activities (2 decisions) 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#9 INFO pid=3828 completing decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#16 INFO pid=3831 taking decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#16 INFO pid=3831 execution context from decider: {'workflow_id': u'basic-bcb8a2b0092072ef', 'version': u'example', 'name': u'basic', 'run_id': u'22ukk/DSnnnnnnnnnS5OR7mOxdUg1Y=', 'parent_workflow_id': None, 'tag_list': [u'a=1', u'b=foo'], 'parent_run_id': None, 'domain_name': u'TestDomain', 'continued_execution_run_id': None} 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#16 INFO pid=3831 result of (1 + 1) * 2 = 4 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#16 INFO pid=3831 1 open activities (0 decisions) 
May 14 10:22:20 127.0.0.1 basic-bcb8a2b0092072ef: decision#16 INFO pid=3831 completing decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:50 127.0.0.1 basic-bcb8a2b0092072ef: decision#21 INFO pid=3843 taking decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:50 127.0.0.1 basic-bcb8a2b0092072ef: decision#21 INFO pid=3843 execution context from decider: {'workflow_id': u'basic-bcb8a2b0092072ef', 'version': u'example', 'name': u'basic', 'run_id': u'22u4Mb2YpWB4lVc4ZabGNkk/DSPnRMocS5OR7mOxdUg1Y=', 'parent_workflow_id': None, 'tag_list': [u'a=1', u'b=foo'], 'parent_run_id': None, 'domain_name': u'TestDomain', 'continued_execution_run_id': None} 
May 14 10:22:50 127.0.0.1 basic-bcb8a2b0092072ef: decision#21 INFO pid=3843 result of (1 + 1) * 2 = 4 
May 14 10:22:50 127.0.0.1 basic-bcb8a2b0092072ef: decision#21 INFO pid=3843 completing decision for workflow basic-bcb8a2b0092072ef (basic) 
May 14 10:22:51 127.0.0.1 logger: INFO pid=3823 process: caught signal signal=SIGTERM pid=3823 
May 14 10:22:51 127.0.0.1 logger: INFO pid=3823 process: will stop workers, this might take up several minutes. Please, be patient. 
May 14 10:22:51 127.0.0.1 logger: INFO pid=3823 process: sending SIGTERM to pid=3824 
May 14 10:22:51 127.0.0.1 logger: INFO pid=3823 process: waiting for proces=psutil.Process(pid=3824, name='Python') to finish. 
May 14 10:22:51 127.0.0.1 logger: INFO pid=3824 process: caught signal signal=SIGTERM pid=3824 
May 14 10:22:51 127.0.0.1 logger: INFO pid=3824 stopping ActivityPoller(task_list=basic-bcb8a2b0092072ef-ca7d812541f544fe964ded26ae5cc8db)

Remarks:

  • here logger as the default program name, and some logs for a basic-bcb8a2b0092072ef workflow
  • domain is supposed to be obvious ; in Botify case domain matches the "environment" and given a specific machine, we know the SWF domain it uses directly ; YMMV
  • we could log a handful of other things but logs become very noisy and hard to read by a human (task list, activity id, activity name, etc.) => better have it in a tool above that, and this can always be changed later by making the logging pattern more flexible
  • I didn't test how bare client logging.getLogger().info("foo") behave, especially in pypy forks when using simpleflow.execute
  • I tested it with logging.handlers.SysLogHandler which has the following limitations: silent errors when logging with TCP (or maybe papertrail's fault?), explicit errors on UDP when above the max packet size (~9k bytes on my machine, but depends on MTU and remote syslog server I guess) ; no TCP+TLS support
  • logging with syslog introduce another limitation, that "program name" is the "workflow id" in our case ; in syslog program name is limited to 64 chars, but workflow id in SWF is limited to 256 ; in practice we don't have much workflows over 80 chars at Botify and they're all unique when truncated to 52 chars, but YMMV
  • the implementation would rely on environment variables which are 1/ non-intrusive and 2/ easy to retrieve in subprocesses, optionally not from simpleflow: any "poll_for_(activity|decision)_task" call would set a few environment variables, that the logger or any other program could use

Thoughts?

@ybastide
Copy link
Contributor

👍; needs simpleflow.execute testing 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants