
The Embulk process sometimes throws java.lang.OutOfMemoryError: Metaspace when a plugin throws an exception #1567

Open
legiangthanh opened this issue Mar 17, 2023 · 7 comments


@legiangthanh

legiangthanh commented Mar 17, 2023

Issue Type: Bug Report

  • Write the following environmental information.
    • OS version: Ubuntu 22.04
    • Java version: 1.8
    • Embulk version: 0.10.41

The Embulk process sometimes throws java.lang.OutOfMemoryError: Metaspace when a plugin throws an exception.

Based on my investigation, this OOM occurs because Metaspace (128 MB in our configuration) is used up when a failed job is cleaned up.

When a failed job is cleaned up, classes are reloaded by another class loader because Embulk creates a new ExecSessionInternal (embulk/EmbulkEmbed.java at master · embulk/embulk). If a plugin contains many classes, the OOM is more likely to happen.
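As a minimal, self-contained illustration of the underlying mechanism (not Embulk code): when two independent class loaders each define the same class from the same bytes, the JVM keeps two separate copies of the class metadata in Metaspace. The class name `Payload` and the loader below are hypothetical stand-ins for a plugin class and a plugin class loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class DupLoad {
    public static class Payload {}  // hypothetical stand-in for a plugin class

    // Defines Payload itself instead of delegating to the parent loader,
    // the way each fresh plugin class loader re-reads the same class bytes.
    public static class IsolatingLoader extends ClassLoader {
        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.endsWith("$Payload")) {
                return super.loadClass(name, resolve);  // delegate everything else
            }
            try (InputStream in = DupLoad.class.getResourceAsStream(
                    "/" + name.replace('.', '/') + ".class")) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                byte[] bytes = out.toByteArray();
                Class<?> c = defineClass(name, bytes, 0, bytes.length);
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            } catch (Exception e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String name = DupLoad.class.getName() + "$Payload";
        Class<?> a = new IsolatingLoader().loadClass(name);
        Class<?> b = new IsolatingLoader().loadClass(name);
        // Same name and same bytes, but distinct Class objects backed by
        // separate Metaspace allocations, because the defining loaders differ.
        System.out.println("same class object? " + (a == b));  // prints false
    }
}
```

Each extra round of loading through a new class loader adds another copy, which is why a bounded Metaspace can be exhausted by the run-then-cleanup reload described above.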

Can we reuse ExecSessionInternal instead of creating a new one?

@dmikurube
Member

Thanks for investigating and reporting here! I'll have a deeper look. I intend to fix it in v0.10.44, or next v0.10.45.

@dmikurube dmikurube added this to the v0.10.44 milestone Mar 17, 2023
@dmikurube
Member

Looked into the history. I see that they used different ExecSession instances for run and cleanup, apparently intentionally. There seem to be background reasons, such as leftover resources of failed threads, ThreadLocal storage, and the like.

Unfortunately, then, reusing ExecSessionInternal here is not an easy and quick change, because those background reasons need deeper investigation first.

Instead, let me share some options:

  • I have a plan for a total renovation of the class loading and caching architecture in the future. One option is to wait for that, although it may come after v0.11.0.
  • Anyway, at a glance it seems a little strange that Metaspace is used up by only the "double" use across run and cleanup; that sounds like too much even before doubling. I'd suggest checking the number of classes loaded by your plugin. It may simply be too many, or Embulk may have another issue that loads classes in duplicate. If the latter is the case, we need another fix in the Embulk core.
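One self-contained way to check the class count (plain JDK, not part of Embulk) is the standard ClassLoadingMXBean; printing these figures before and after the suspect phase would show how many classes get loaded again:

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Reports how many classes the running JVM has loaded and unloaded.
public class ClassCount {
    public static void main(String[] args) {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        System.out.println("currently loaded:  " + bean.getLoadedClassCount());
        System.out.println("total ever loaded: " + bean.getTotalLoadedClassCount());
        System.out.println("unloaded:          " + bean.getUnloadedClassCount());
    }
}
```

A total-ever-loaded count far above the currently-loaded count hints that classes are being loaded repeatedly (and possibly unloaded too slowly for the Metaspace limit).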

@dmikurube dmikurube modified the milestones: v0.10.44, v0.11 Mar 20, 2023
@dmikurube
Member

@legiangthanh btw, when it happened, did you see log messages like Loaded plugin embulk-***-*** (***) many times for the same plugin in one execution?

If you saw it many times, it means that plugin classes were loaded in duplicate, and that's the main problem to be addressed, not just run vs. cleanup.

@legiangthanh
Author

@dmikurube, I can see this log message during cleanup if the plugin throws an exception; here is an example:

thanh.le@TLE-0932X6 embulk % java -XX:MaxMetaspaceSize=80m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/thanh.le/Work/private/embulk -jar build/libs/embulk-0.10.44.jar run config.yaml
2023-04-05 16:20:47.842 +0700 [INFO] (main): embulk_home is set by the location of embulk.properties found in: /Users/thanh.le/.embulk
2023-04-05 16:20:47.848 +0700 [INFO] (main): m2_repo is set as a sub directory of embulk_home: /Users/thanh.le/.embulk/lib/m2/repository
2023-04-05 16:20:47.849 +0700 [INFO] (main): gem_home is set as a sub directory of embulk_home: /Users/thanh.le/.embulk/lib/gems
2023-04-05 16:20:47.849 +0700 [INFO] (main): gem_path is set empty.
2023-04-05 16:20:47.849 +0700 [DEBUG] (main): Embulk system property "default_guess_plugin" is set to: "gzip,bzip2,json,csv"
2023-04-05 16:20:48.131 +0700 [INFO] (main): Started Embulk v0.10.44
2023-04-05 16:20:51.446 +0700 [INFO] (0001:transaction): Gem's home and path are set by system configs "gem_home": "/Users/thanh.le/.embulk/lib/gems", "gem_path": ""
2023-04-05 16:20:53.242 +0700 [INFO] (0001:transaction): Loaded JRuby runtime 9.1.15.0
2023-04-05 16:20:53.428 +0700 [INFO] (0001:transaction): Loaded plugin embulk-input-s3 (0.6.0)
2023-04-05 16:20:53.714 +0700 [INFO] (0001:transaction): Loaded plugin embulk-output-s3 (1.7.1)
2023-04-05 16:20:53.861 +0700 [INFO] (0001:transaction): Loaded plugin embulk-parser-csv
2023-04-05 16:20:54.481 +0700 [INFO] (0001:transaction): Start listing file with prefix [sample_]
2023-04-05 16:20:56.059 +0700 [INFO] (0001:transaction): Found total [2] files
2023-04-05 16:20:56.225 +0700 [INFO] (0001:transaction): Using local thread executor with max_threads=20 / output tasks 10 = input tasks 2 * 5
2023-04-05 16:20:56.237 +0700 [INFO] (0001:transaction): Loaded plugin embulk-formatter-csv
2023-04-05 16:20:56.414 +0700 [INFO] (0001:transaction): {done:  0 / 2, running: 0}
2023-04-05 16:20:56.792 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.005.00.csv'
2023-04-05 16:20:56.792 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.000.00.csv'
2023-04-05 16:20:56.839 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.006.00.csv'
2023-04-05 16:20:56.839 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.001.00.csv'
2023-04-05 16:20:56.882 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.007.00.csv'
2023-04-05 16:20:56.882 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.002.00.csv'
2023-04-05 16:20:56.925 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.008.00.csv'
2023-04-05 16:20:56.925 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.003.00.csv'
2023-04-05 16:20:56.950 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.004.00.csv'
2023-04-05 16:20:56.950 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.009.00.csv'
2023-04-05 16:20:58.025 +0700 [INFO] (0016:task-0001): Open S3Object with bucket [testcrash], key [sample_03.csv], with size [183]
2023-04-05 16:20:58.025 +0700 [INFO] (0015:task-0000): Open S3Object with bucket [testcrash], key [sample_02.csv], with size [179]
2023-04-05 16:21:03.757 +0700 [INFO] (0001:transaction): {done:  1 / 2, running: 1}
2023-04-05 16:21:04.694 +0700 [INFO] (0001:transaction): {done:  2 / 2, running: 0}
2023-04-05 16:21:04.751 +0700 [INFO] (0001:cleanup):Loaded plugin embulk-input-s3 (0.6.0)
java.lang.OutOfMemoryError: Metaspace
Dumping heap to /Users/thanh.le/Work/private/embulk/java_pid53576.hprof ...
Heap dump file created [143937058 bytes in 0.331 secs]
org.embulk.exec.PartialExecutionException: org.embulk.spi.DataException: Invalid record at s3://testcrash/sample_03.csv:5: 4,32864,TTTT20150127,embulk,{"k":true}
	at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:342)

If the plugin finishes normally, the OOM does not happen:


thanh.le@TLE-0932X6 embulk % java -XX:MaxMetaspaceSize=80m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/thanh.le/Work/private/embulk -jar build/libs/embulk-0.10.44.jar run config.yaml
2023-04-05 16:19:59.599 +0700 [INFO] (main): embulk_home is set by the location of embulk.properties found in: /Users/thanh.le/.embulk
2023-04-05 16:19:59.607 +0700 [INFO] (main): m2_repo is set as a sub directory of embulk_home: /Users/thanh.le/.embulk/lib/m2/repository
2023-04-05 16:19:59.607 +0700 [INFO] (main): gem_home is set as a sub directory of embulk_home: /Users/thanh.le/.embulk/lib/gems
2023-04-05 16:19:59.607 +0700 [INFO] (main): gem_path is set empty.
2023-04-05 16:19:59.607 +0700 [DEBUG] (main): Embulk system property "default_guess_plugin" is set to: "gzip,bzip2,json,csv"
2023-04-05 16:19:59.901 +0700 [INFO] (main): Started Embulk v0.10.44
2023-04-05 16:20:03.093 +0700 [INFO] (0001:transaction): Gem's home and path are set by system configs "gem_home": "/Users/thanh.le/.embulk/lib/gems", "gem_path": ""
2023-04-05 16:20:04.915 +0700 [INFO] (0001:transaction): Loaded JRuby runtime 9.1.15.0
2023-04-05 16:20:05.114 +0700 [INFO] (0001:transaction): Loaded plugin embulk-input-s3 (0.6.0)
2023-04-05 16:20:05.255 +0700 [INFO] (0001:transaction): Embulk system property "plugins.output.s3" is not set.
2023-04-05 16:20:05.255 +0700 [INFO] (0001:transaction): Embulk system property "plugins.default.output.s3" is not set.
2023-04-05 16:20:05.355 +0700 [INFO] (0001:transaction): Loaded plugin embulk-output-s3 (1.7.1)
2023-04-05 16:20:05.488 +0700 [INFO] (0001:transaction): Embulk system property "plugins.parser.csv" is not set.
2023-04-05 16:20:05.488 +0700 [INFO] (0001:transaction): Embulk system property "plugins.default.parser.csv" is not set.
2023-04-05 16:20:05.491 +0700 [INFO] (0001:transaction): Loaded plugin embulk-parser-csv
2023-04-05 16:20:06.058 +0700 [INFO] (0001:transaction): Start listing file with prefix [sample_]
2023-04-05 16:20:07.847 +0700 [INFO] (0001:transaction): Found total [2] files
2023-04-05 16:20:08.030 +0700 [INFO] (0001:transaction): Using local thread executor with max_threads=20 / output tasks 10 = input tasks 2 * 5
2023-04-05 16:20:08.042 +0700 [INFO] (0001:transaction): Embulk system property "plugins.formatter.csv" is not set.
2023-04-05 16:20:08.042 +0700 [INFO] (0001:transaction): Embulk system property "plugins.default.formatter.csv" is not set.
2023-04-05 16:20:08.044 +0700 [INFO] (0001:transaction): Loaded plugin embulk-formatter-csv
2023-04-05 16:20:08.211 +0700 [INFO] (0001:transaction): {done:  0 / 2, running: 0}
2023-04-05 16:20:08.807 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.005.00.csv'
2023-04-05 16:20:08.807 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.000.00.csv'
2023-04-05 16:20:09.028 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.006.00.csv'
2023-04-05 16:20:09.028 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.001.00.csv'
2023-04-05 16:20:09.082 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.007.00.csv'
2023-04-05 16:20:09.082 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.002.00.csv'
2023-04-05 16:20:09.141 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.008.00.csv'
2023-04-05 16:20:09.141 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.003.00.csv'
2023-04-05 16:20:09.172 +0700 [INFO] (0016:task-0001): Writing S3 file 'logs/out.009.00.csv'
2023-04-05 16:20:09.177 +0700 [INFO] (0015:task-0000): Writing S3 file 'logs/out.004.00.csv'
2023-04-05 16:20:10.208 +0700 [INFO] (0015:task-0000): Open S3Object with bucket [testcrash], key [sample_02.csv], with size [179]
2023-04-05 16:20:11.387 +0700 [INFO] (0016:task-0001): Open S3Object with bucket [testcrash], key [sample_03.csv], with size [183]
2023-04-05 16:20:11.399 +0700 [WARN] (0016:task-0001): Skipped line s3://testcrash/sample_03.csv:5 (org.embulk.util.rubytime.RubyDateTimeParseException: Text 'TTTT20150127' could not be parsed at index 0: no digits): 4,32864,TTTT20150127,embulk,{"k":true}
2023-04-05 16:20:15.991 +0700 [INFO] (0001:transaction): {done:  1 / 2, running: 1}
2023-04-05 16:20:17.349 +0700 [INFO] (0001:transaction): {done:  2 / 2, running: 0}
2023-04-05 16:20:17.354 +0700 [INFO] (0001:transaction): Incremental job, setting last_path to [sample_03.csv]
2023-04-05 16:20:17.418 +0700 [INFO] (main): Committed.
2023-04-05 16:20:17.419 +0700 [INFO] (main): Next config diff: {"in":{"last_path":"sample_03.csv"},"out":{}}

@legiangthanh
Author

Based on my tests, this issue is more likely to happen with a Ruby plugin. When I switch to the Maven plugin, the classes are still reloaded, but the OOM does not happen:

thanh.le@TLE-0932X6 embulk % java -XX:MaxMetaspaceSize=80m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/thanh.le/Work/private/embulk -jar build/libs/embulk-0.10.44.jar run config.yaml
2023-04-05 17:00:41.455 +0700 [INFO] (main): embulk_home is set by the location of embulk.properties found in: /Users/thanh.le/.embulk
2023-04-05 17:00:41.462 +0700 [INFO] (main): m2_repo is set as a sub directory of embulk_home: /Users/thanh.le/.embulk/lib/m2/repository
2023-04-05 17:00:41.462 +0700 [INFO] (main): gem_home is set as a sub directory of embulk_home: /Users/thanh.le/.embulk/lib/gems
2023-04-05 17:00:41.462 +0700 [INFO] (main): gem_path is set empty.
2023-04-05 17:00:41.462 +0700 [DEBUG] (main): Embulk system property "default_guess_plugin" is set to: "gzip,bzip2,json,csv"
2023-04-05 17:00:41.746 +0700 [INFO] (main): Started Embulk v0.10.44
2023-04-05 17:00:41.896 +0700 [INFO] (0001:transaction): Embulk system property "plugins.input.s3" is not set.
2023-04-05 17:00:42.008 +0700 [INFO] (0001:transaction): Loaded plugin embulk-input-s3 (maven:org.embulk:s3:0.6.0)
2023-04-05 17:00:42.208 +0700 [INFO] (0001:transaction): Loaded plugin embulk-output-s3 (maven:org.embulk:s3:1.7.0)
2023-04-05 17:00:42.316 +0700 [INFO] (0001:transaction): Loaded plugin embulk-parser-csv
2023-04-05 17:00:42.949 +0700 [INFO] (0001:transaction): Start listing file with prefix [sample_]
2023-04-05 17:00:44.656 +0700 [INFO] (0001:transaction): Found total [2] files
2023-04-05 17:00:44.841 +0700 [INFO] (0001:transaction): Using local thread executor with max_threads=20 / output tasks 10 = input tasks 2 * 5
2023-04-05 17:00:44.857 +0700 [INFO] (0001:transaction): Loaded plugin embulk-formatter-csv
2023-04-05 17:00:44.999 +0700 [INFO] (0001:transaction): {done:  0 / 2, running: 0}
2023-04-05 17:00:45.398 +0700 [INFO] (0013:task-0001): Writing S3 file 'logs/out.005.00.csv'
2023-04-05 17:00:45.398 +0700 [INFO] (0012:task-0000): Writing S3 file 'logs/out.000.00.csv'
2023-04-05 17:00:45.440 +0700 [INFO] (0013:task-0001): Writing S3 file 'logs/out.006.00.csv'
2023-04-05 17:00:45.445 +0700 [INFO] (0012:task-0000): Writing S3 file 'logs/out.001.00.csv'
2023-04-05 17:00:45.460 +0700 [INFO] (0012:task-0000): Writing S3 file 'logs/out.002.00.csv'
2023-04-05 17:00:45.460 +0700 [INFO] (0013:task-0001): Writing S3 file 'logs/out.007.00.csv'
2023-04-05 17:00:45.487 +0700 [INFO] (0012:task-0000): Writing S3 file 'logs/out.003.00.csv'
2023-04-05 17:00:45.492 +0700 [INFO] (0013:task-0001): Writing S3 file 'logs/out.008.00.csv'
2023-04-05 17:00:45.508 +0700 [INFO] (0012:task-0000): Writing S3 file 'logs/out.004.00.csv'
2023-04-05 17:00:45.527 +0700 [INFO] (0013:task-0001): Writing S3 file 'logs/out.009.00.csv'
2023-04-05 17:00:46.523 +0700 [INFO] (0012:task-0000): Open S3Object with bucket [testcrash], key [sample_02.csv], with size [179]
2023-04-05 17:00:46.523 +0700 [INFO] (0013:task-0001): Open S3Object with bucket [testcrash], key [sample_03.csv], with size [183]
2023-04-05 17:00:52.812 +0700 [INFO] (0001:transaction): {done:  2 / 2, running: 0}
2023-04-05 17:00:52.814 +0700 [INFO] (0001:transaction): {done:  2 / 2, running: 0}
2023-04-05 17:00:52.871 +0700 [INFO] (0001:cleanup): Loaded plugin embulk-input-s3 (maven:org.embulk:s3:0.6.0)
2023-04-05 17:00:52.957 +0700 [INFO] (0001:cleanup): Loaded plugin embulk-output-s3 (maven:org.embulk:s3:1.7.0)
org.embulk.exec.PartialExecutionException: org.embulk.spi.DataException: Invalid record at s3://testcrash/sample_03.csv:5: 4,32864,TTTT20150127,embulk,{"k":true}
	at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:342)
	at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:582)
	at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:36)
	at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:355)
	at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:352)
	at org.embulk.spi.ExecInternal.doWith(ExecInternal.java:26)
	at org.embulk.exec.BulkLoader.run(BulkLoader.java:352)
	at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:278)
	at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:288)
	at org.embulk.EmbulkRunner.run(EmbulkRunner.java:153)
	at org.embulk.cli.EmbulkRun.runInternal(EmbulkRun.java:107)
	at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:24)
	at org.embulk.cli.Main.main(Main.java:55)
Caused by: org.embulk.spi.DataException: Invalid record at s3://testcrash/sample_03.csv:5: 4,32864,TTTT20150127,embulk,{"k":true}
	at org.embulk.parser.csv.CsvParserPlugin.run(CsvParserPlugin.java:448)
	at org.embulk.spi.ParserPlugin.runThenReturnTaskReport(ParserPlugin.java:87)
	at org.embulk.spi.FileInputRunner.run(FileInputRunner.java:145)
	at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor.runInputTask(LocalExecutorPlugin.java:285)
	at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor.access$200(LocalExecutorPlugin.java:218)
	at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor$1.call(LocalExecutorPlugin.java:249)
	at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor$1.call(LocalExecutorPlugin.java:246)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.embulk.parser.csv.CsvParserPlugin$CsvRecordValidateException: org.embulk.util.rubytime.RubyDateTimeParseException: Text 'TTTT20150127' could not be parsed at index 0: no digits
	at org.embulk.parser.csv.CsvParserPlugin$1.timestampColumn(CsvParserPlugin.java:401)
	at org.embulk.spi.Column.visit(Column.java:77)
	at org.embulk.spi.Schema.visitColumns(Schema.java:124)
	at org.embulk.parser.csv.CsvParserPlugin.run(CsvParserPlugin.java:344)
	... 10 more
Caused by: org.embulk.util.rubytime.RubyDateTimeParseException: Text 'TTTT20150127' could not be parsed at index 0: no digits
	at org.embulk.util.rubytime.ParserWithContext.consumeDigitsInternal(ParserWithContext.java:606)
	at org.embulk.util.rubytime.ParserWithContext.consumeDigitsInInt(ParserWithContext.java:631)
	at org.embulk.util.rubytime.ParserWithContext.consumeYearWithCentury(ParserWithContext.java:532)
	at org.embulk.util.rubytime.ParserWithContext.parse(ParserWithContext.java:180)
	at org.embulk.util.rubytime.RubyDateTimeFormatter.parseUnresolved(RubyDateTimeFormatter.java:105)
	at org.embulk.util.rubytime.RubyDateTimeFormatter.parse(RubyDateTimeFormatter.java:119)
	at org.embulk.util.timestamp.LegacyTimestampFormatter.parse(LegacyTimestampFormatter.java:59)
	at org.embulk.parser.csv.CsvParserPlugin$1.timestampColumn(CsvParserPlugin.java:398)
	... 13 more

Error: org.embulk.spi.DataException: Invalid record at s3://testcrash/sample_03.csv:5: 4,32864,TTTT20150127,embulk,{"k":true}

@hiroyuki-sato
Contributor

Hello, @legiangthanh

Thank you for reporting this issue.
Could you narrow down the problem further?

  • What JRuby are you using?
  • Have you tried changing to another JRuby version, like 9.4.2.0 (latest)?
  • Have you ever tried local files instead of S3?
  • Have you ever tried null output instead of S3?

If you provide sample data for this issue, I'll try to reproduce it in my environment.

@legiangthanh
Author

legiangthanh commented May 4, 2023

Hi @hiroyuki-sato,

What JRuby are you using?

I am using JRuby 9.1.15.0.

Have you tried changing to another JRuby version, like 9.4.2.0 (latest)?

Not yet, let me try it.

Have you ever tried local files instead of S3?
Have you ever tried null output instead of S3?

Not yet. The reason I use the S3 input and output plugins is that, based on my investigation, when the plugin throws an exception, the cleanup code reloads classes because Embulk creates a new ExecSessionInternal (embulk/EmbulkEmbed.java at master · embulk/embulk). If these plugins contain many classes, the Metaspace OOM is more likely to happen.

If you provide sample data for this issue, I'll try to reproduce it in my environment.

id,account,purchase,comment,json_column
1,32864,20150127,embulk,{"k":true}
2,32864,20150127,embulk,{"k":true}
3,32864,20150127,embulk,{"k":true}
4,32864,TTTT20150127,embulk,{"k":true}

Configuration file

in:
  type: s3
  bucket: testcrash
  path_prefix: sample_
  auth_method: basic
  endpoint: s3.ap-southeast-1.amazonaws.com
  access_key_id:
  secret_access_key: 
  parser:
    type: csv
    charset: UTF-8
    newline: LF
    delimiter: ','
    quote: null
    stop_on_invalid_record: true
    trim_if_not_quoted: false
    skip_header_lines: 1
    allow_extra_columns: false
    allow_optional_columns: false
    columns:
    - {name: id, type: long}
    - {name: account, type: long}
    - {name: purchase, type: timestamp, format: '%Y%m%d'}
    - {name: comment, type: string}
    - {name: json_column, type: json}
out:
  type: s3
  path_prefix: logs/out
  file_ext: .csv
  bucket: testcrash
  endpoint: s3.ap-southeast-1.amazonaws.com
  auth_method: basic
  access_key_id:
  secret_access_key: 
  formatter:
    type: csv

The purchase field in the last row contains invalid data, so an exception is thrown while this row is processed (with stop_on_invalid_record: true), which triggers the failure path.
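For reference, the rejected value behaves the same way with the JDK's own date parser (a plain java.time sketch, not the RubyDateTime parser Embulk actually uses): a pattern roughly equivalent to %Y%m%d accepts 20150127 but rejects TTTT20150127.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class ParseDemo {
    public static void main(String[] args) {
        // "yyyyMMdd" is roughly the java.time analogue of the Ruby-style %Y%m%d.
        DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyyMMdd");
        System.out.println(LocalDate.parse("20150127", f));  // prints 2015-01-27
        try {
            LocalDate.parse("TTTT20150127", f);  // the value from the failing row
        } catch (DateTimeParseException e) {
            System.out.println("rejected: TTTT20150127");
        }
    }
}
```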
