error when inserting duped values into primary column with many values #11924

Closed
NickCrews opened this issue May 4, 2024 · 1 comment · Fixed by #12084

NickCrews commented May 4, 2024

Possibly related to #3789

I'm getting this error:

InternalException: INTERNAL Error: Could not find node in column segment tree!
Attempting to find row number "36028797019024731" in 2 nodes
Node 0: Start 36028797018960000, Count 0
Node 1: Start 36028797019082880, Count 0
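For anyone trying to read the message, here is a rough sketch that pulls the numbers out of it and checks whether either listed node could contain the requested row. The helper names are made up for illustration, and the idea that a node covers the half-open range [start, start + count) is my assumption about what the message means, not something confirmed by DuckDB's internals.

```python
import re

# Error text from the report (node lines separated for readability).
ERROR_TEXT = (
    'Attempting to find row number "36028797019024731" in 2 nodes\n'
    "Node 0: Start 36028797018960000, Count 0\n"
    "Node 1: Start 36028797019082880, Count 0"
)

def parse_segment_error(text):
    """Extract the requested row number and a list of (start, count) per node."""
    row = int(re.search(r'row number "(\d+)"', text).group(1))
    nodes = [
        (int(start), int(count))
        for _idx, start, count in re.findall(
            r"Node (\d+): Start (\d+), Count (\d+)", text
        )
    ]
    return row, nodes

def covering_nodes(row, nodes):
    """Indexes of nodes whose range contains `row`.

    Assumption: a node covers the half-open range [start, start + count).
    """
    return [
        i for i, (start, count) in enumerate(nodes)
        if start <= row < start + count
    ]

row, nodes = parse_segment_error(ERROR_TEXT)
print("row:", row)
print("nodes:", nodes)
print("covering nodes:", covering_nodes(row, nodes))
```

Under that assumption, the row number falls between the two node start values, but both nodes report a count of 0, so neither one can contain it.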

Repro, using the attached addresses.parquet.zip:

import duckdb

sql = """
CREATE TABLE addresses(
    address__id UUID PRIMARY KEY NOT NULL,
    person__id UUID NOT NULL,
    street1 VARCHAR,
    street2 VARCHAR,
    city VARCHAR,
    state VARCHAR,
    zipcode VARCHAR,
    country VARCHAR,
    mailing_status VARCHAR,
    is_mailing BOOLEAN,
    is_voting BOOLEAN,
    latitude DOUBLE,
    longitude DOUBLE,
    last_updated DATE,
    source VARCHAR,
);
INSERT OR REPLACE INTO addresses
FROM
    read_parquet('addresses.parquet')
-- WHERE address__id <> '8753b9f7-46fc-4fd5-b318-de20305d5462'
;
"""
ddb = duckdb.connect(":memory:")
ddb.sql(sql)

In the attached parquet file there is one duplicated UUID, "8753b9f7-46fc-4fd5-b318-de20305d5462". If you uncomment the WHERE line in the SQL, there is no error.

Things I've experimented with

  • reduce the number of rows in the parquet file: I get a normal ConstraintException
  • reorder the rows so the dupe rows are at the beginning of the file: I get a normal ConstraintException
  • drop some of the other columns from the DDL and the parquet: I get a normal ConstraintException

I am on a nightly build, duckdb-0.10.3.dev601, running on an M1 Mac.

Mytherin (Collaborator) commented May 4, 2024

Thanks for the report! I can reproduce the issue - we'll take a look.
