Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature add docs on oldkeys #188

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mijoharas
Copy link
Contributor

I've written some docs. I've actually only just looked into the TOASTed column stuff. I also noticed a missing parameter, which I added.

@@ -105,6 +105,7 @@ Parameters
* `include-type-oids`: add type oids. Default is _false_.
* `include-domain-data-type`: replace domain name with the underlying data type. Default is _false_.
* `include-column-positions`: add column position (_pg_attribute.attnum_). Default is _false_.
* `include-origin`: add origin of a piece of data. Default is _false_.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this was missing.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops. Good catch!

| change.oldkeys.keytypeoids | array containing the oids of the replica identity columns | optional (absent `include-type-oids` only present for update/delete) |
| change.oldkeys.keyvalues | array containing the values of the replica identity columns | required (only present for update/delete) |

Note: Unchanged TOAST datum columns are not output for any of the rows.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, just dug into this one here. Is there a way to see if an unchanged TOAST column has been omitted? how are users supposed to deal with TOASTed column without the deprecated include-unchanged-toast option? it seems that without ddl changes in the logical replication log (which makes sense), you can't tell the difference between a column that's been removed, and an unchanged TOAST column that's been omitted for that reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, and I guess my suggestion would be a new field change.omittedunchangedtoastcolumns that is an array of the names of unchanged toast columns (i.e. if the row had a datum in column foo that was unchanged toast, the change would have omittedunchangedtoastcolumns: ["foo"] in it).

Let me know if you'd like that to be included, I can probably do a follow up PR since it seems reasonably straightforward to implement.

Also let me know if there's a different way we're supposed to be knowing whether a toasted column has been omitted.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mijoharas you table looks good. I have a few comments/suggestions.

  • section should be added before section Examples
  • title: 'JSON description' or 'JSON in details'
  • you are describing the format 1. However, we should describe all format versions. I suggest two tables (one for each format). Mixing fields could possibly cause some confusion.
  • table headers: use bold. The third column could be described as requirement and you should use mandatory and optional. If it is an optional use (enabled by include-foo).
  • acronyms in uppercase letters such as LSN and JSON.
  • origin value: the replication origin. You can link replication origin to https://www.postgresql.org/docs/current/replication-origins.html
  • change.kind value: separate each kind with comma such as insert, update, delete, and truncate.
  • change.schema value: table schema.
  • change.table value: table name.
  • change.pk value: s/primary keys/primary key/.
  • change.columnnames value: array containing all table column names.
  • change.columnntypes value: array containing all table column type names.
  • change.columntypeoids value: array containing all table column type oids.
  • change.columnpositions value: array containing column position(pg_attribute.attnum) for each column.
  • change.oldkeys value: use /current/ instead of /10/. Details about REPLICA IDENTITY is at https://www.postgresql.org/docs/current/sql-altertable.html
  • change.oldkeys.keynames value: array containing table replica identity column names.
  • change.oldkeys.keytypes value: array containing table replica identity column type names.
  • change.oldkeys.keytypeoids value: array containing table replica identity column type oids.

Regarding unchanged TOAST columns, you could say:

UPDATE will not provide columns for new tuple whose values have not been modified and its value is stored in a TOAST table. If your table has REPLICA IDENTITY FULL, the unchanged value will be provided in the oldkeys (format 1) or identity (format 2).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like change.omittedunchangedtoastcolumns. It fits the case: Postgres should provide support in order to use it. While writing the previous reply, I realized that we might copy unchanged TOAST columns from old tuple to new tuple iif REPLICA IDENTITY FULL.

Copy link
Contributor Author

@mijoharas mijoharas Nov 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I agree with all the suggestions and will apply. I hadn't dug into format 2 much, hence the omission but I'll dive in and add that as an extra table as suggested.

question on the TOAST stuff:

Postgres should provide support in order to use it.

is this referring to postgres supporting plugins in including unchanged toast columns? or is this referring to providing information on DDL changes for logical replication plugins? if so, I agree with you I guess. But given that it currently doesn't what can a user do to understand whether a TOASTed column was omitted? I agree putting in the extra field is a bit of a hack, but it's the only way for a user of the plugin to understand if the column changes are due to ddl remove column or unchanged TOAST.

I could very much be misunderstanding what you're meaning, so do let me know if that's the case.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres omits unchanged TOAST columns for performance reasons because it was initially designed with replication as the main use case. However, it is not so good for CDC solutions. When I said Postgres should provide support, I'm saying that Postgres should provide a mechanism that says doesn't omit unchanged TOAST columns.

@mijoharas
Copy link
Contributor Author

sorry, almost forgot to mention, addresses #187

@eulerto eulerto self-requested a review April 26, 2021 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants