Data must be less than or equal to 1MB in size #39

Open
jamesamurr-bind opened this issue Jan 3, 2020 · 1 comment

Comments


jamesamurr-bind commented Jan 3, 2020

This is the same as, or similar to, the closed issue #14. I can't reopen it myself.

In my use case I have a transaction that is larger than 1MB and this is causing pg2k4j to get stuck reading that same WAL message over and over. This causes my disk to fill up (because WAL is never successfully read) and the database crashes.

The recommendation in that issue was to set wal_writer_flush_after lower than 1MB and to set wal_writer_delay to something short.

I have set wal_writer_flush_after to 20 and set wal_writer_delay to 200. This does not resolve the problem. Do you have other suggestions?

A 1MB transaction is large, but it is not an unreasonable use case. For example, if I add a column to an existing table with a million records and then need to backfill data in that column in a single transaction, it would fail. Or if I had to bulk insert thousands of new members for my site, this would cause the same error. Because of this issue I can't use this software.

Here is my error:

2020-01-03T21:57:45.924263600Z [main] ERROR com.disneystreaming.pg2k4j.SlotReaderKinesisWriter - Received exception of type class java.lang.IllegalArgumentException
2020-01-03T21:57:45.924932900Z java.lang.IllegalArgumentException: Data must be less than or equal to 1MB in size, got 2346047 bytes
2020-01-03T21:57:45.924944900Z  at com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:517)
2020-01-03T21:57:45.924949500Z  at com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:406)
2020-01-03T21:57:45.924952800Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.lambda$processByteBuffer$0(SlotReaderKinesisWriter.java:242)
2020-01-03T21:57:45.924956100Z  at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
2020-01-03T21:57:45.924959100Z  at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
2020-01-03T21:57:45.924970900Z  at java.base/java.util.stream.Streams$StreamBuilderImpl.forEachRemaining(Streams.java:411)
2020-01-03T21:57:45.924974400Z  at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
2020-01-03T21:57:45.924977500Z  at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
2020-01-03T21:57:45.924980500Z  at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
2020-01-03T21:57:45.924983600Z  at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
2020-01-03T21:57:45.924986600Z  at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2020-01-03T21:57:45.924989600Z  at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
2020-01-03T21:57:45.924992700Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.processByteBuffer(SlotReaderKinesisWriter.java:234)
2020-01-03T21:57:45.924995900Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.readSlotWriteToKinesisHelper(SlotReaderKinesisWriter.java:195)
2020-01-03T21:57:45.924999700Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.readSlotWriteToKinesis(SlotReaderKinesisWriter.java:131)
2020-01-03T21:57:45.925003500Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.runLoop(SlotReaderKinesisWriter.java:86)
2020-01-03T21:57:45.925007300Z  at com.disneystreaming.pg2k4j.CommandLineRunner.run(CommandLineRunner.java:30)
2020-01-03T21:57:45.925011800Z  at java.base/java.util.Optional.ifPresent(Optional.java:183)
2020-01-03T21:57:45.925041500Z  at com.disneystreaming.pg2k4j.CommandLineRunner.main(CommandLineRunner.java:45)
2020-01-03T21:57:47.223098700Z [main] INFO com.disneystreaming.pg2k4j.PostgresConnector - Attempting to create replication slot pg2k4j
2020-01-03T21:57:47.271197900Z [main] INFO com.disneystreaming.pg2k4j.PostgresConnector - Slot pg2k4j already exists
2020-01-03T21:57:47.326846300Z [main] INFO com.amazonaws.services.kinesis.producer.KinesisProducer - Extracting binaries to /tmp/amazon-kinesis-producer-native-binaries
2020-01-03T21:57:47.819365800Z [main] INFO com.amazonaws.services.kinesis.producer.HashedFileCopier - '/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_489FA9AC71B1CD61A4002E9F16A279556D581D9D' already exists, and matches.  Not overwriting.
2020-01-03T21:57:47.829184100Z [main] INFO com.disneystreaming.pg2k4j.SlotReaderKinesisWriter - Consuming from slot pg2k4j

To reproduce:

  1. Run pg2k4j against a database
  2. Create a table with the following DDL
CREATE TABLE public.my_test_table (
	id serial,
	"name" varchar(100) NOT NULL,
	property_1 varchar(200) not null,
	property_2 varchar(200) not null,
	property_3 varchar(200) not null,
	property_4 varchar(200) not null,
	property_5 varchar(200) not null,
	property_6 varchar(200) not null,
	property_7 varchar(200) not null,
	property_8 varchar(200) not null,
	property_9 varchar(200) not null,
	CONSTRAINT my_test_table_pkey PRIMARY KEY (id)
);
  3. Create an insert script similar to the following, with 3500 inserts.
BEGIN;
INSERT INTO public.my_test_table
("name", property_1, property_2, property_3, property_4, property_5, property_6, property_7, property_8, property_9)
VALUES(concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)));
...
<add 3500 more of the previous insert statement here>
...
COMMIT;
  4. Run the insert script against the Postgres database that pg2k4j is connected to
psql --host=my.ip.address --port=5447 --username=test_user --dbname=test_database --file "C:\dev\sql_scripts\my_temp_table.sql"
  5. Watch the logs on your pg2k4j container and wait for the transaction to finish. Once it has finished, you will see the error.
docker logs reverent_pike --since 10m -t --follow
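
The insert script from step 3 does not have to be written by hand. Below is a minimal sketch of a generator, assuming the target is the table created in step 2 (public.my_test_table). The class name InsertScriptGenerator, the buildScript helper, and the default output file name are hypothetical and only serve this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes one large transaction of random-value inserts.
public class InsertScriptGenerator {

    // Same random 64-character value used in the hand-written example.
    private static final String RANDOM_VALUE =
            "concat(md5(random()::text), md5(random()::text))";

    // Builds BEGIN; <insertCount INSERT statements> COMMIT; as one string.
    public static String buildScript(int insertCount) {
        // One value per non-serial column: "name" plus property_1..property_9.
        StringBuilder values = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            if (i > 0) {
                values.append(", ");
            }
            values.append(RANDOM_VALUE);
        }

        // Assumption: the table name matches the DDL from step 2.
        String insert = "INSERT INTO public.my_test_table\n"
                + "(\"name\", property_1, property_2, property_3, property_4, "
                + "property_5, property_6, property_7, property_8, property_9)\n"
                + "VALUES(" + values + ");\n";

        StringBuilder script = new StringBuilder("BEGIN;\n");
        for (int i = 0; i < insertCount; i++) {
            script.append(insert);
        }
        script.append("COMMIT;\n");
        return script.toString();
    }

    public static void main(String[] args) throws IOException {
        // Output path is taken from the first argument, else a default name.
        Path out = Paths.get(args.length > 0 ? args[0] : "my_temp_table.sql");
        Files.write(out, buildScript(3500).getBytes(StandardCharsets.UTF_8));
    }
}
```

The generated file can then be passed to psql with --file as in step 4.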

dcupif commented Jan 18, 2022

AWS Kinesis max record size is 1MB (https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html).
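
Because the limit applies per record, one possible workaround is to split an oversized payload into several records before handing them to the producer. The following is only a sketch of that idea, not something pg2k4j is known to do. The class RecordChunker and its split method are hypothetical, the cap is assumed to be 1024 * 1024 bytes, and a consumer would need its own (unspecified here) way to reassemble the chunks in order.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts a payload into chunks no larger than a given cap.
public class RecordChunker {

    // Assumption: "1MB" in the error message means 1024 * 1024 bytes.
    public static final int MAX_RECORD_BYTES = 1024 * 1024;

    // Returns read-only views over consecutive slices of the input.
    // The caller's buffer position and limit are left untouched.
    public static List<ByteBuffer> split(ByteBuffer data, int maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<ByteBuffer> chunks = new ArrayList<>();
        // Independent position/limit, shared content, no copying.
        ByteBuffer view = data.asReadOnlyBuffer();
        while (view.hasRemaining()) {
            int length = Math.min(maxBytes, view.remaining());
            ByteBuffer chunk = view.slice(); // starts at the current position
            chunk.limit(length);             // cap the chunk at 'length' bytes
            chunks.add(chunk);
            view.position(view.position() + length);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Size taken from the error message in this issue.
        ByteBuffer payload = ByteBuffer.wrap(new byte[2346047]);
        for (ByteBuffer chunk : split(payload, MAX_RECORD_BYTES)) {
            System.out.println(chunk.remaining() + " bytes");
        }
    }
}
```

Each returned chunk would then be small enough to pass the size check that raised the IllegalArgumentException above. Whether chunking is acceptable depends on the consumers of the stream, since a single change message would arrive as several records.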
