When I attempt to insert new data into a table indexed with pg_search, I encounter failure. #1047

Open
SilenceLeo opened this issue Apr 10, 2024 · 9 comments
Labels
bug Something isn't working priority-2-medium Medium priority issue user-request This issue was directly requested by a user

Comments

@SilenceLeo

SilenceLeo commented Apr 10, 2024

Bug Description

When I attempt to insert new data into a table indexed with pg_bm25, I encounter failure.

ERROR:  failed to create index reader while retrieving index
ERROR:  error inserting document during insert callback: WriterClientError(IOError(Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }))

How To Reproduce

x86_64  2C8G
pg_bm25-v0.5.11-pg16-amd64-ubuntu2204.deb

# psql --version
psql (PostgreSQL) 16.2 (Ubuntu 16.2-1.pgdg22.04+1)
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_bm25";

CREATE TABLE "book_local_embedding" (
  "id" serial8,
  "text" text NOT NULL DEFAULT '',
  "metadata" jsonb,
  "embedding" vector(1024),
  PRIMARY KEY ("id")
);

CREATE INDEX "index_book_local_embedding_cosine" ON "book_local_embedding" USING hnsw ("embedding" vector_cosine_ops);

CALL paradedb.create_bm25(
    index_name => 'index_book_local_embedding_text',
    schema_name => 'public',
    table_name => 'book_local_embedding',
    key_field => 'id',
    text_fields => '{text: {tokenizer: {type: "chinese_compatible"}}}',
    json_fields => '{"metadata": {}}'
);

-- INSERT

INSERT INTO book_local_embedding(
    "text", "embedding", "metadata"
)
VALUES (..., ..., ...), (..., ..., ...);

logs

2024-04-10 11:40:16.033 CST [1024906] LOG:  starting PostgreSQL 16.2 (Ubuntu 16.2-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-04-10 11:40:16.033 CST [1024906] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2024-04-10 11:40:16.035 CST [1024906] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-04-10 11:40:16.043 CST [1024910] LOG:  database system was shut down at 2024-04-10 11:40:15 CST
2024-04-10 11:40:16.050 CST [1024906] LOG:  database system is ready to accept connections
2024-04-10 11:40:16.051 CST [1024913] LOG:  starting pg_bm25 shutdown worker at PID 1024913
2024-04-10 11:40:16.051 CST [1024914] LOG:  starting pg_bm25 insert worker at PID 1024914
2024-04-10 11:45:16.139 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 11:45:16.152 CST [1024908] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.002 s, total=0.013 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB; lsn=3/203E5A40, redo lsn=3/203E5A08
2024-04-10 11:50:16.251 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 11:52:44.767 CST [1024908] LOG:  checkpoint complete: wrote 1484 buffers (9.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=148.504 s, sync=0.004 s, total=148.517 s; sync files=12, longest=0.003 s, average=0.001 s; distance=19843 kB, estimate=19843 kB; lsn=3/21746790, redo lsn=3/21746758
2024-04-10 12:10:17.148 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:10:17.862 CST [1024908] LOG:  checkpoint complete: wrote 8 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.703 s, sync=0.004 s, total=0.715 s; sync files=7, longest=0.003 s, average=0.001 s; distance=43 kB, estimate=17863 kB; lsn=3/217513E8, redo lsn=3/217513B0
2024-04-10 12:15:17.960 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:19:34.971 CST [1024908] LOG:  checkpoint complete: wrote 2573 buffers (15.7%); 0 WAL file(s) added, 0 removed, 1 recycled; write=256.876 s, sync=0.004 s, total=257.012 s; sync files=12, longest=0.003 s, average=0.001 s; distance=20341 kB, estimate=20341 kB; lsn=3/22B2E920, redo lsn=3/22B2E8E8
2024-04-10 12:20:18.014 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:20:49.409 CST [1024908] LOG:  checkpoint complete: wrote 52 buffers (0.3%); 0 WAL file(s) added, 0 removed, 1 recycled; write=31.383 s, sync=0.003 s, total=31.395 s; sync files=11, longest=0.002 s, average=0.001 s; distance=6376 kB, estimate=18944 kB; lsn=3/2347B910, redo lsn=3/23168B88
2024-04-10 12:30:18.597 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:32:49.565 CST [1024908] LOG:  checkpoint complete: wrote 1510 buffers (9.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=150.954 s, sync=0.003 s, total=150.968 s; sync files=17, longest=0.002 s, average=0.001 s; distance=25337 kB, estimate=25337 kB; lsn=3/24A27248, redo lsn=3/24A27210
2024-04-10 12:35:18.664 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:35:38.318 CST [1024908] LOG:  checkpoint complete: wrote 197 buffers (1.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=19.641 s, sync=0.004 s, total=19.654 s; sync files=16, longest=0.002 s, average=0.001 s; distance=9745 kB, estimate=23778 kB; lsn=3/253AB990, redo lsn=3/253AB958
2024-04-10 12:45:18.508 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:48:04.861 CST [1024908] LOG:  checkpoint complete: wrote 1664 buffers (10.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=166.336 s, sync=0.005 s, total=166.353 s; sync files=13, longest=0.004 s, average=0.001 s; distance=24210 kB, estimate=24210 kB; lsn=3/26B501F0, redo lsn=3/26B501B8
2024-04-10 12:55:19.060 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 12:56:51.909 CST [1024908] LOG:  checkpoint complete: wrote 269 buffers (1.6%); 0 WAL file(s) added, 0 removed, 2 recycled; write=92.836 s, sync=0.004 s, total=92.850 s; sync files=12, longest=0.003 s, average=0.001 s; distance=22078 kB, estimate=23996 kB; lsn=3/290AD6C8, redo lsn=3/280DFCF0
2024-04-10 13:05:21.080 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 13:08:42.763 CST [1024908] LOG:  checkpoint complete: wrote 1384 buffers (8.4%); 0 WAL file(s) added, 0 removed, 3 recycled; write=201.664 s, sync=0.007 s, total=201.683 s; sync files=12, longest=0.004 s, average=0.001 s; distance=54598 kB, estimate=54598 kB; lsn=3/2C9071F0, redo lsn=3/2B631580
2024-04-10 13:15:21.956 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 13:16:44.736 CST [1024908] LOG:  checkpoint complete: wrote 829 buffers (5.1%); 0 WAL file(s) added, 0 removed, 3 recycled; write=82.763 s, sync=0.004 s, total=82.780 s; sync files=12, longest=0.003 s, average=0.001 s; distance=48567 kB, estimate=53995 kB; lsn=3/2E59F470, redo lsn=3/2E59F438
2024-04-10 14:25:23.072 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 14:25:26.597 CST [1024908] LOG:  checkpoint complete: wrote 36 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=3.510 s, sync=0.006 s, total=3.525 s; sync files=26, longest=0.004 s, average=0.001 s; distance=154 kB, estimate=48611 kB; lsn=3/2E5C5F98, redo lsn=3/2E5C5F60
2024-04-10 14:45:23.979 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 14:46:04.354 CST [1024908] LOG:  checkpoint complete: wrote 405 buffers (2.5%); 0 WAL file(s) added, 0 removed, 0 recycled; write=40.364 s, sync=0.002 s, total=40.376 s; sync files=3, longest=0.002 s, average=0.001 s; distance=2718 kB, estimate=44021 kB; lsn=3/2E86D958, redo lsn=3/2E86D920
2024-04-10 14:48:22.866 CST [1024914] ERROR:  failed to create index reader while retrieving index
2024-04-10 14:48:22.872 CST [1093943] book@book ERROR:  error inserting document during insert callback: WriterClientError(IOError(Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }))
2024-04-10 14:48:22.872 CST [1093943] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:48:23.036 CST [1024906] LOG:  background worker "pg_bm25_insert_worker" (PID 1024914) exited with exit code 1
2024-04-10 14:50:23.448 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 14:50:37.593 CST [1094822] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 14:50:37.593 CST [1094822] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:50:37.593 CST [1094822] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 14:50:37.593 CST [1094822] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 14:50:44.202 CST [1024908] LOG:  checkpoint complete: wrote 208 buffers (1.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=20.738 s, sync=0.006 s, total=20.754 s; sync files=16, longest=0.003 s, average=0.001 s; distance=3024 kB, estimate=39922 kB; lsn=3/2EB72418, redo lsn=3/2EB61C00
2024-04-10 14:52:41.405 CST [1095526] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 14:52:41.405 CST [1095526] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:52:41.408 CST [1095526] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 14:52:41.408 CST [1095526] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 14:54:45.270 CST [1096246] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 14:54:45.270 CST [1096246] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:54:45.270 CST [1096246] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 14:54:45.270 CST [1096246] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 14:55:23.297 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 14:55:26.620 CST [1024908] LOG:  checkpoint complete: wrote 34 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=3.309 s, sync=0.003 s, total=3.323 s; sync files=10, longest=0.002 s, average=0.001 s; distance=140 kB, estimate=35943 kB; lsn=3/2EB84CD8, redo lsn=3/2EB84CA0
2024-04-10 14:56:48.807 CST [1096993] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 14:56:48.807 CST [1096993] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:56:48.808 CST [1096993] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 14:56:48.808 CST [1096993] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 14:58:52.344 CST [1097710] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 14:58:52.344 CST [1097710] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:58:52.344 CST [1097710] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 14:58:52.344 CST [1097710] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 14:59:56.193 CST [1098119] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 14:59:56.193 CST [1098119] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 14:59:56.194 CST [1098119] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 14:59:56.195 CST [1098119] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 15:00:23.714 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 15:00:26.635 CST [1024908] LOG:  checkpoint complete: wrote 30 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.909 s, sync=0.005 s, total=2.921 s; sync files=11, longest=0.003 s, average=0.001 s; distance=113 kB, estimate=32360 kB; lsn=3/2EBA14A0, redo lsn=3/2EBA1468
2024-04-10 15:02:00.453 CST [1098853] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 15:02:00.453 CST [1098853] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 15:02:00.453 CST [1098853] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 15:02:00.453 CST [1098853] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 15:04:04.297 CST [1099595] book@book ERROR:  error inserting document during insert callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(42603), path: "/", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }))
2024-04-10 15:04:04.297 CST [1099595] book@book STATEMENT:  
	      INSERT INTO book_local_embedding(
	        "text", "embedding", "metadata"
	      )
	      VALUES ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15), ($16, $17, $18), ($19, $20, $21), ($22, $23, $24), ($25, $26, $27), ($28, $29, $30), ($31, $32, $33), ($34, $35, $36), ($37, $38, $39), ($40, $41, $42), ($43, $44, $45), ($46, $47, $48), ($49, $50, $51), ($52, $53, $54), ($55, $56, $57), ($58, $59, $60), ($61, $62, $63), ($64, $65, $66), ($67, $68, $69), ($70, $71, $72), ($73, $74, $75), ($76, $77, $78), ($79, $80, $81), ($82, $83, $84), ($85, $86, $87), ($88, $89, $90), ($91, $92, $93), ($94, $95, $96), ($97, $98, $99), ($100, $101, $102), ($103, $104, $105), ($106, $107, $108), ($109, $110, $111), ($112, $113, $114), ($115, $116, $117), ($118, $119, $120), ($121, $122, $123), ($124, $125, $126), ($127, $128, $129), ($130, $131, $132), ($133, $134, $135), ($136, $137, $138), ($139, $140, $141), ($142, $143, $144), ($145, $146, $147), ($148, $149, $150)
	    
2024-04-10 15:04:04.299 CST [1099595] book@book ERROR:  error sending abort request in callback: error sending request for url (http://0.0.0.0:42603/): error trying to connect: tcp connect error: Connection refused (os error 111)
2024-04-10 15:04:04.299 CST [1099595] book@book WARNING:  AbortTransaction while in ABORT state
2024-04-10 15:05:23.731 CST [1024908] LOG:  checkpoint starting: time
2024-04-10 15:05:25.751 CST [1024908] LOG:  checkpoint complete: wrote 21 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.007 s, sync=0.004 s, total=2.020 s; sync files=10, longest=0.003 s, average=0.001 s; distance=80 kB, estimate=29132 kB; lsn=3/2EBB5710, redo lsn=3/2EBB56D8
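
The log above is long, and the failure sequence is easier to follow once the checkpoint noise is filtered out. Below is a minimal sketch for extracting that timeline, assuming the log line prefix is exactly the one shown above (timestamp, time zone, PID in brackets, optional `user@db`, level). The names `LINE_RE` and `error_timeline` are hypothetical helpers for illustration and are not part of PostgreSQL or pg_search.

```python
import re

# Prefix of a PostgreSQL log line as it appears in the log above:
# "<date> <time> <tz> [<pid>] [user@db ]<LEVEL>:  <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ "
    r"\[(?P<pid>\d+)\] (?:\S+@\S+ )?(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def error_timeline(lines, skip_levels=("LOG", "STATEMENT")):
    """Return (timestamp, pid, level, message) tuples for the interesting entries.

    Routine LOG and STATEMENT entries are dropped, except LOG entries that
    mention a background worker (for example, the worker exit notice).
    Continuation lines, such as the body of the INSERT statement, do not
    match the prefix and are skipped.
    """
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        level, msg = match["level"], match["msg"]
        if level in skip_levels and "background worker" not in msg:
            continue
        events.append((match["ts"], int(match["pid"]), level, msg))
    return events
```

Applied to the log above, this should surface the order of events: the insert worker (PID 1024914) reports the index reader error, the client insert fails with a broken pipe, the worker exits with code 1, and every later insert fails with "Connection refused".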
@rebasedming rebasedming added bug Something isn't working priority-2-medium Medium priority issue user-request This issue was directly requested by a user labels Apr 10, 2024
@neilyio
Contributor

neilyio commented Apr 11, 2024

There are two separate errors here that suggest that something is going wrong on the file system.

"failed to create index reader while retrieving index" suggests that Tantivy cannot read the index directory in <PG_DATA_DIR>/paradedb/pg_search.

"Broken pipe" is also a file system issue. We use a Unix named pipe to pass inserted data to a background writer process, and it's possible that something is wrong with the pipe's file descriptor.

I haven't seen this before. Could you try again on our latest release (0.6)? If you're still able to reproduce it, it would be helpful if you could share what kind of data you're trying to insert so I can attempt to reproduce it myself.
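
For readers unfamiliar with the "Broken pipe" error (OS error code 32 on Linux, as in the log), it is what a writer gets when the reading end of a pipe has gone away. The sketch below illustrates only that operating-system behavior. It uses an anonymous pipe from `os.pipe()` rather than a named pipe, it is not pg_search's code, and the function name `write_after_reader_closed` is a hypothetical name chosen for illustration.

```python
import errno
import os

# EPIPE is the errno behind "Broken pipe"; on Linux its value is 32,
# which matches "code: 32" in the error message quoted in this issue.
EXPECTED_ERRNO = errno.EPIPE


def write_after_reader_closed():
    """Write to a pipe whose read end is already closed; return the errno raised."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the reading process (e.g. a worker) going away
    try:
        # CPython ignores SIGPIPE by default, so this raises BrokenPipeError
        # instead of terminating the process.
        os.write(write_fd, b"document")
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None
```

This is consistent with the order of events in the log: the insert worker reports an error and exits, and the client that was sending it data then sees the broken pipe.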

@SilenceLeo
Author

Hi @neilyio ,

I'll update the version and give it a try. I'll let you know if there's any progress.

Thank you.

@philippemnoel
Collaborator

Going to close this in the meantime. Please feel free to re-open the issue if you still face problems on v0.6.0+

@SilenceLeo
Author

SilenceLeo commented Apr 18, 2024

Hi @neilyio @philippemnoel ,

I've upgraded to version 0.6, but the issue still persists.

I've identified the root cause of the problem and managed to bypass it using a simple workaround.

I have two local databases named book and book-test, each with an identical BM25 index.

First, create the index in book-test.

book-test=> CALL paradedb.create_bm25(
    index_name => 'index_book_local_embedding_text',
    schema_name => 'public',
    table_name => 'book_local_embedding',
    ...
);

Then, create the index in book.

book=> CALL paradedb.create_bm25(
    index_name => 'index_book_local_embedding_text',
    schema_name => 'public',
    table_name => 'book_local_embedding',
    ...
);

Insert into the database.

# Success
book-test=> INSERT INTO "book_local_embedding" (...) VALUES ...;

# Error: failed to create index reader while retrieving index.
# The error log is in the subject.
book=>  INSERT INTO "book_local_embedding" (...) VALUES ...;

After I renamed one of the indexes, the data was inserted successfully.

@philippemnoel philippemnoel reopened this Apr 18, 2024
@SilenceLeo
Author

SilenceLeo commented Apr 18, 2024

After observing for a while, I've noticed that the error persists. The successful insertion after renaming was just a coincidence. The issue is indeed related to the inserted data.

This is the insert SQL.
postgresql-2024-04-18_172045.log

@philippemnoel
Collaborator

philippemnoel commented Apr 25, 2024

> After observing for a while, I've noticed that the error persists. The successful insertion after renaming was just a coincidence. The issue is indeed related to the inserted data.
>
> This is the insert SQL. postgresql-2024-04-18_172045.log

Thank you for reporting, and sorry we missed this. We'll take a look and get back to you.

@hamedkhaledi

I'm experiencing the identical problem even after updating to version v0.6.1.

@philippemnoel
Collaborator

> I'm experiencing the identical problem even after updating to version v0.6.1.

Can you provide a reproduction? Are you able to share the data you're inserting?

@philippemnoel philippemnoel changed the title When I attempt to insert new data into a table indexed with pg_bm25, I encounter failure. When I attempt to insert new data into a table indexed with pg_search, I encounter failure. May 20, 2024
@hamedkhaledi

Thanks for the response.

Description:

I'm encountering an error while performing bulk insertions into a ParadeDB table with HNSW and BM25 indexes. The error message is:

[215] ERROR:  XX000: failed to create index reader while retrieving index

Notably, the error consistently occurs after approximately 5-6 insertions.

Steps to Reproduce:

  • Create Table and Indexes:
CREATE TABLE IF NOT EXISTS chunks
(
    id         BIGSERIAL PRIMARY KEY,
    content    TEXT,
    is_deleted BOOLEAN,
    index_bert Vector(768),
    index_text TEXT
);

CREATE INDEX IF NOT EXISTS semantic_index_bert ON chunks USING hnsw (index_bert vector_cosine_ops);

CALL paradedb.create_bm25(
        index_name => 'literal_index_text',
        schema_name => 'public',
        table_name => 'chunks',
        key_field => 'id',
        text_fields => '{index_text: {tokenizer: {type: "default"}, fast: true}}'
     );
  • Python Code for Bulk Insertion:
import random
import string

import psycopg
from tqdm import tqdm


def generate_random_word(length=5):
    return ''.join(random.choice(string.ascii_lowercase) for _ in range(length))


if __name__ == '__main__':
    connection_string = "postgresql://test:test@localhost:5432/test_db"
    sql_contexts = []
    for i in tqdm(range(100)):
        sql_context = "insert into chunks (content, is_deleted) values"
        for j in range(1000):
            sql_context += f"""('{generate_random_word(100)}', False)"""
            if j != 999:
                sql_context += ",\n"
        sql_context += ";"
        sql_contexts.append(sql_context)

    with psycopg.connect(connection_string, autocommit=True) as conn:
        for sql_context in tqdm(sql_contexts):
            cursor = conn.cursor()
            with conn.transaction():
                cursor.execute(sql_context)

Expected Behavior:

The Python code should successfully insert 100,000 rows (100 insertions with 1000 rows each) into the chunks table without encountering the error.
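
Since the error is reported to appear after roughly 5 to 6 insertions, it may help to record exactly which batch fails first. The following is a minimal sketch under that assumption. `first_failing_batch` is a hypothetical helper, and `execute` stands for any callable that runs one SQL string (for example, a small wrapper around the cursor used in the script above).

```python
from typing import Callable, Iterable, Optional, Tuple


def first_failing_batch(
    batches: Iterable[str],
    execute: Callable[[str], None],
) -> Optional[Tuple[int, Exception]]:
    """Run each batch through `execute` in order.

    Returns (zero-based batch index, exception) for the first failure,
    or None if every batch succeeds. Execution stops at the first failure.
    """
    for index, batch in enumerate(batches):
        try:
            execute(batch)
        except Exception as exc:  # broad on purpose: we only record where it failed
            return index, exc
    return None
```

With the reproduction script, `batches` would be `sql_contexts`, and the returned index could be included in the report together with the failing statement.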

ParadeDB version: v0.7.3
