NATS-SPARK-CONNECTOR

The following are all the current flavors of the nats-spark-connector, each flavor has its own directory structure.

At this point, the partitioned connector is not being actively developed anymore. It is now considered legacy and included only for educational purposes only. It will be removed entirely in the future.

balanced: In this scenario, NATS utilizes a single JetStream partition, using a durable and queued configuration to load balance messages across Spark threads, each thread contributing to single streaming micro-batch Dataframe at each Spark "pull". Current message offset is kept in the NATS queue for the purpose of fault tolerance (FT). Spark simply acknowledges each message during a micro-batch 'commit', and resends a message if an ack is not received within a pre-set timeframe.

This flavor is contained in the subdirectory 'balanced', which has its own README.md containing further information.

LEGACY: USE BALANCED INSTEAD

partitioned: In this scenario, NATS utilizes a number of JetStream partitions named <partition_prefix>-0, <partition_prefix>-1, ..., <partition_prefix>-N, each partition containing pre-configured associated subjects. Each partition sends messages to its own Spark affinity thread, and all partition threads contribute to a single streaming micro-batch Dataframe at each Spark "pull". Current offset for each partition is kept in Spark for the purpose of fault tolerance (FT). There also is an option to always start from each partition's first offset during a Spark restart, instead of the FT configuration.

This flavor is contained in the subdirectory 'partitioned', which has its own README.md containing further information.

filtered: This is a future scenario TBD.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

NATS-SPARK-CONNECTOR

Files

README.md

Latest commit

History

README.md

File metadata and controls

NATS-SPARK-CONNECTOR