Philip (flip) Kromer edited this page May 21, 2012 · 2 revisions

Later

The concrete swineherd runnables each have eponymous stage names.

  • CpRunner

  • ScpRunner

  • PigRunner

  • HadoopRunner

  • HadoopStreamingRunner

  • WukongRubyRunner

  • FlumeOgConfigRunner

Rake compatibility

Any simple Rake task should work as a swineherd flow

  • task
  • desc
  • namespace
  • ... (flesh out)

Rule-based multiplexing

Can dispatch to another stage dynamically -- for instance, the FilenameRule stage handles the familiar "build any .o file from its eponymous .c file" rule.

(mystery: how to handle resolution order? Part of the unsolved specificity question?)

FilenameRule
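The ".o from .c" rule described above could be sketched roughly as follows. This is an assumption about the shape of such a stage, not swineherd's actual FilenameRule interface: the class name `FilenameRuleSketch`, its methods, and the command string are all hypothetical.

```ruby
# Hypothetical sketch of rule-based dispatch by filename.
# Not the real FilenameRule API; names and signatures are illustrative.
class FilenameRuleSketch
  def initialize(target_ext, source_ext, &action)
    @target_ext = target_ext
    @source_ext = source_ext
    @action     = action
  end

  # True when the requested target has the extension this rule builds.
  def matches?(target)
    File.extname(target) == @target_ext
  end

  # Derive the eponymous source name, e.g. "main.o" -> "main.c".
  def source_for(target)
    target.sub(/#{Regexp.escape(@target_ext)}\z/, @source_ext)
  end

  # Dispatch to the wrapped action, or return nil if the rule does not apply.
  def dispatch(target)
    return nil unless matches?(target)
    @action.call(target, source_for(target))
  end
end

# A rule that "builds" any .o file from its .c file by producing a command line.
COMPILE_RULE = FilenameRuleSketch.new('.o', '.c') do |target, source|
  "cc -c #{source} -o #{target}"
end

COMPILE_RULE.dispatch('main.o')  # => "cc -c main.c -o main.o"
COMPILE_RULE.dispatch('README')  # => nil
```

The sketch sidesteps the resolution-order question: with a single rule there is nothing to disambiguate, and it says nothing about what should happen when several rules match the same target.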

Asynchronous Execution

Scheduling a stage for a later or recurring time

Background - Independent fire & forget Invocation

Parallel - invoke tasks to run in parallel and rejoin

Does not guarantee parallel execution -- just allows it
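One way to picture "run in parallel and rejoin" is with plain Ruby threads, as in the sketch below. The module name `ParallelSketch` and its `run` method are hypothetical, and a real implementation might well use processes or a job queue instead. Threads fit the "allows, but does not guarantee" wording: the runtime decides whether the stages actually overlap.

```ruby
# Hypothetical sketch of a parallel split followed by a rejoin.
# Each stage is any callable; threads permit concurrency but do not force it.
module ParallelSketch
  # Start every stage in its own thread, then wait for all of them.
  # Thread#value joins the thread and returns the block's result,
  # so results come back in the same order the stages were given.
  def self.run(*stages)
    threads = stages.map { |stage| Thread.new { stage.call } }
    threads.map(&:value)
  end
end

RESULTS = ParallelSketch.run(
  -> { 1 + 1 },
  -> { 'b'.upcase }
)
# RESULTS => [2, "B"]
```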

Dependency Reasoning

Triggering by 'outdated' not just 'missing'

  • many things have an updated_at date, which is nil if the thing doesn't exist.

mystery: what about things without a meaningful date?
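For file-like things, the 'outdated' check could be sketched with modification times, as below. This assumes `updated_at` maps onto `File.mtime`, which the notes above do not state. The module name `OutdatedSketch` and both method names are hypothetical. It leaves the open question about things without a meaningful date untouched.

```ruby
# Hypothetical sketch: trigger on 'outdated', not just 'missing'.
# Assumes updated_at can be taken from the file's modification time.
module OutdatedSketch
  # The modification time, or nil if the file does not exist.
  def self.updated_at(path)
    File.exist?(path) ? File.mtime(path) : nil
  end

  # A target is outdated when it is missing, or when any existing
  # source has been updated more recently than the target.
  # Sources that do not exist (updated_at is nil) are ignored here.
  def self.outdated?(target, sources)
    target_time = updated_at(target)
    return true if target_time.nil?

    sources.any? do |source|
      source_time = updated_at(source)
      !source_time.nil? && source_time > target_time
    end
  end
end
```

A call such as `OutdatedSketch.outdated?('main.o', ['main.c'])` would then return true when `main.o` is absent or older than `main.c`.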

Specificity


http://www.workflowpatterns.com/documentation/documents/phd_bartek.pdf

  • Activity represents an atomic piece of work

    • trigger conditions: any, all, custom
  • Work Node -- once executed, not again until reset

  • Merge Node

  • Split Node

  • Route Node

  • Abort Node

  • Reset Arc

  • Parallel Split

  • Synchronization

  • XOR

  • Simple Merge


  • resource: an entity that is capable of doing work
    • durable or consumable in nature
    • a resource may have a schedule and history associated with it

http://www.workflowpatterns.com/patterns/

In process-aware information systems various perspectives can be distinguished.

  • The control-flow perspective captures aspects related to control-flow dependencies between various tasks (e.g. parallelism, choice, synchronization etc). ...
  • The data perspective deals with the passing of information, scoping of variables, etc.
  • The resource perspective deals with resource to task allocation, delegation, etc.
  • Finally, the patterns for the exception handling perspective deal with the various causes of exceptions and the various actions that need to be taken as a result of exceptions occurring.