
Error when creating an index on MacOS #1107

Closed
boris-glumpler opened this issue May 4, 2024 · 26 comments
Labels: bug (Something isn't working) · pg_search (Issue related to `pg_search/`) · priority-2-medium (Medium priority issue)

Comments

@boris-glumpler

Hello,

I've compiled and installed the pg_search extension on my Mac and added it to a database. But when I try to create an index I keep getting this error:

error dropping index teams_idx_bm25_index: error sending request for url (http://0.0.0.0:49157/): connection closed before message completed

And when I check http://0.0.0.0:49157/, I'm getting the following message:

unexpected error: io error: failed to fill whole buffer
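
In shell terms, that check was just something like the following sketch (the port changes on every restart, so 49157 is only an example):

curl http://0.0.0.0:49157/
# prints: unexpected error: io error: failed to fill whole buffer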

This is the index I'm trying to create:

CALL paradedb.create_bm25(
    index_name => 'teams_idx',
    schema_name => 'public',
    table_name => 'teams',
    key_field => 'id',
    text_fields => '{"name":{},"description":{"tokenizer":{"type":"default"}}}',
    numeric_fields => '{}',
    boolean_fields => '{}',
    json_fields => '{}'
);

I've reinstalled pg_search a couple times now, but the issue persists.

I'd appreciate some pointers on how I can fix this. Cheers!

@philippemnoel
Collaborator

Hey @boris-glumpler! Could you provide more details on your setup? macOS version, arch, Postgres version you're installing into, etc.? Anything that helps us reproduce it will help us fix it faster for you.

@boris-glumpler
Author

Yup, sorry, here it is:

  • Postgres 16.2 from Homebrew
  • macOS 14.4.1 on an M3 Pro

@philippemnoel
Collaborator

> Yup, sorry, here it is:
>
>   • Postgres 16.2 from Homebrew
>   • macOS 14.4.1 on an M3 Pro

Could you please provide full reproduction steps? From compilation to the query that fails?

@boris-glumpler
Author

boris-glumpler commented May 5, 2024

Ok, so I just started again with a fresh install of Postgres and ran through these commands:

brew install postgresql@16
git clone https://github.com/paradedb/paradedb.git
cd paradedb/pg_search
git checkout v0.6.1
cargo install --locked cargo-pgrx --version 0.12.0-alpha.1
cargo pgrx init --pg16=/opt/homebrew/opt/postgresql@16/bin/pg_config
cargo pgrx install --release

That all ended with a success message:

Finished installing pg_search

I then added pg_search to shared_preload_libraries and ran brew services restart postgresql@16.
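
For anyone following along, that step amounts to roughly the following; the config file path is an assumption based on a default Homebrew install of postgresql@16 on Apple Silicon (verify it with SHOW config_file;):

# in /opt/homebrew/var/postgresql@16/postgresql.conf
shared_preload_libraries = 'pg_search'

# then restart so the preload takes effect
brew services restart postgresql@16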

After that I created the extension (create extension pg_search;), ran my migrations, and tried to create an index:

 CALL paradedb.create_bm25(
    index_name => 'users_idx',
    schema_name => 'public',
    table_name => 'users',
    key_field => 'id',
    text_fields => '{"name":{},"bio":{"tokenizer":{"type":"default"}}}'
);

This then resulted in this error:

Query 1 ERROR at Line 1: : ERROR:  error dropping index users_idx_bm25_index: error sending request for url (http://0.0.0.0:57937/): connection closed before message completed
CONTEXT:  SQL statement "SELECT paradedb.drop_bm25_internal(format('%s_bm25_index', index_name))"
PL/pgSQL function paradedb.drop_bm25(text,text) line 10 at PERFORM
SQL statement "CALL paradedb.drop_bm25(index_name, schema_name => schema_name)"
PL/pgSQL function paradedb.create_bm25(text,text,text,text,text,text,text,text) line 33 at CALL

@boris-glumpler
Author

I just ran lsof -i:57937 and the process with the corresponding PID for that port is pg_search_insert_worker.

@philippemnoel
Collaborator

> I just ran lsof -i:57937 and the process with the corresponding PID for that port is pg_search_insert_worker.

Are you able to share part of the data you are inserting?

@boris-glumpler
Author

I haven't inserted anything yet. To test it out I just migrated the users table and it's still empty.

@neilyio
Contributor

neilyio commented May 5, 2024

@boris-glumpler Can you try to create an index with the mock data that we supply?

CALL paradedb.create_bm25_test_table();

CALL paradedb.create_bm25(
  schema_name => 'paradedb',
  index_name => 'bm25_test_table',
  table_name => 'bm25_test_table',
  key_field => 'id',
  text_fields => '{description: {}}'
);

SELECT * FROM bm25_test_table.search('description:shoes');

@boris-glumpler
Author

boris-glumpler commented May 5, 2024

The test table was generated just fine, but trying to create an index caused the same error:

Query 1 ERROR at Line 1: : ERROR:  error dropping index bm25_test_table_bm25_index: error sending request for url (http://0.0.0.0:49155/): connection closed before message completed
CONTEXT:  SQL statement "SELECT paradedb.drop_bm25_internal(format('%s_bm25_index', index_name))"
PL/pgSQL function paradedb.drop_bm25(text,text) line 10 at PERFORM
SQL statement "CALL paradedb.drop_bm25(index_name, schema_name => schema_name)"
PL/pgSQL function paradedb.create_bm25(text,text,text,text,text,text,text,text) line 33 at CALL

@philippemnoel added the bug (Something isn't working), priority-2-medium (Medium priority issue), and pg_search (Issue related to `pg_search/`) labels on May 5, 2024
@neilyio
Contributor

neilyio commented May 5, 2024

Thanks. Can you confirm that the process on port 49155 has the same message, unexpected error: io error: failed to fill whole buffer?

@neilyio
Contributor

neilyio commented May 5, 2024

Please also enable postgres logging (logging_collector = on) in your postgresql.conf, and let me know if you see anything in the log file.
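
For reference, that amounts to something like this in postgresql.conf (the other two settings are shown at their defaults; logging_collector needs a server restart to take effect):

logging_collector = on
log_directory = 'log'                               # default, relative to the data directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'     # default file name pattern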

@boris-glumpler
Author

Yup, it's the same message.

Here's the log output:

2024-05-06 00:39:50.267 IST [5540] LOG:  starting PostgreSQL 16.2 (Homebrew) on aarch64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 (clang-1500.1.0.2.5), 64-bit
2024-05-06 00:39:50.269 IST [5540] LOG:  listening on IPv6 address "::1", port 5432
2024-05-06 00:39:50.269 IST [5540] LOG:  listening on IPv6 address "fe80::1%lo0", port 5432
2024-05-06 00:39:50.269 IST [5540] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2024-05-06 00:39:50.272 IST [5540] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2024-05-06 00:39:50.317 IST [5544] LOG:  database system was interrupted; last known up at 2024-05-05 22:53:13 IST
2024-05-06 00:39:51.022 IST [5544] LOG:  database system was not properly shut down; automatic recovery in progress
2024-05-06 00:39:51.026 IST [5544] LOG:  redo starts at 0/67978E8
2024-05-06 00:39:51.026 IST [5544] LOG:  invalid record length at 0/67979D0: expected at least 24, got 0
2024-05-06 00:39:51.026 IST [5544] LOG:  redo done at 0/6797998 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2024-05-06 00:39:51.031 IST [5542] LOG:  checkpoint starting: end-of-recovery immediate wait
2024-05-06 00:39:51.033 IST [5542] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.003 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB; lsn=0/67979D0, redo lsn=0/67979D0
2024-05-06 00:39:51.034 IST [5556] LOG:  starting pg_search telemetry worker at PID 5556
2024-05-06 00:39:51.034 IST [5557] LOG:  starting pg_search shutdown worker at PID 5557
2024-05-06 00:39:51.034 IST [5540] LOG:  database system is ready to accept connections
2024-05-06 00:39:51.034 IST [5558] LOG:  starting pg_search insert worker at PID 5558
2024-05-06 00:39:51.035 IST [5556] ERROR:  could not initialize telemetry config: PosthogApiKey
2024-05-06 00:39:51.036 IST [5540] LOG:  background worker "pg_search_telemetry_worker" (PID 5556) exited with exit code 1
2024-05-06 00:40:06.812 IST [5756] ERROR:  error dropping index bm25_test_table_bm25_index: error sending request for url (http://0.0.0.0:49550/): connection closed before message completed
2024-05-06 00:40:06.812 IST [5756] CONTEXT:  SQL statement "SELECT paradedb.drop_bm25_internal(format('%s_bm25_index', index_name))"
	PL/pgSQL function paradedb.drop_bm25(text,text) line 10 at PERFORM
	SQL statement "CALL paradedb.drop_bm25(index_name, schema_name => schema_name)"
	PL/pgSQL function paradedb.create_bm25(text,text,text,text,text,text,text,text) line 33 at CALL
2024-05-06 00:40:06.812 IST [5756] STATEMENT:  CALL paradedb.create_bm25(
	  schema_name => 'paradedb',
	  index_name => 'bm25_test_table',
	  table_name => 'bm25_test_table',
	  key_field => 'id',
	  text_fields => '{description: {}}'
	);

@neilyio
Contributor

neilyio commented May 5, 2024

Since you're installing from source, we have an opportunity to poke around the source code too. I'd do this myself if I could reproduce, but I've never seen your error before.

In pg_search/src/writer/index.rs, you could try adding an early return at line 125:

124            WriterRequest::DropIndex { directory } => {
125                return Ok(()); // <-- add this line
126                self.drop_index(directory).map_err(ServerError::from)
127            }

That would help isolate whether something is going wrong inside Tantivy itself, or whether the communication with the insert worker is the problem.

@neilyio
Contributor

neilyio commented May 5, 2024

When you're reloading the extension, make sure to DROP EXTENSION and CREATE EXTENSION again after you run cargo pgrx install.
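
In SQL, that reload cycle is roughly the following sketch; CASCADE also drops objects that depend on the extension (such as the generated bm25 schemas), so use it only if you're fine recreating them:

DROP EXTENSION IF EXISTS pg_search CASCADE;
CREATE EXTENSION pg_search;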

@boris-glumpler
Author

I'm getting the same error message after recompiling the extension.

@neilyio
Contributor

neilyio commented May 5, 2024

Thanks. I meant to suggest this before... could you add a logging statement with pgrx::log! alongside the early return, just as a sanity check that the extension reloaded properly? Sometimes that can be a little confusing, and before we draw a conclusion, I want to make sure we're seeing the new version of the code.

124            WriterRequest::DropIndex { directory } => {
125                pgrx::log!("LOGGING STATEMENT FROM DROP INDEX"); // <-- add this line
126                return Ok(());                                   // <-- add this line
127                self.drop_index(directory).map_err(ServerError::from)
128            }

Then in your postgres logs, you can make sure the logging statement shows up. Sorry for the oversight.

@boris-glumpler
Author

The error is still the same, and the log message does not show up in the logs. I did get compiler warnings about unreachable code, though, and I did drop and recreate the extension.

@boris-glumpler
Author

I just upgraded to v0.7.0 and creating an index is working now.

@boris-glumpler
Author

I still get an error when I try to drop an index, though.

@boris-glumpler
Author

And trying to create another index now gives me the following error:

Query 1 ERROR at Line 1: : ERROR:  error inserting document during build callback: WriterClientError(ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(0.0.0.0)), port: Some(49154), path: "/", query: None, fragment: None }, source: hyper::Error(IncompleteMessage) }))
CONTEXT:  SQL statement "CREATE INDEX users_idx_bm25_index ON public.users USING bm25 ((users.*)) WITH (key_field='id', text_fields='{"name":{},"bio":{}}', numeric_fields='{}', boolean_fields='{}', json_fields='{}');"
PL/pgSQL function paradedb.create_bm25(text,text,text,text,text,text,text,text) line 42 at EXECUTE

@neilyio
Contributor

neilyio commented May 17, 2024

This is an odd error. Can you confirm whether you're able to create an index and write data to it? Can you try creating the index, inserting some rows, and then deleting some rows?
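
A minimal sequence for that test might look like the sketch below, assuming the users table from earlier with an auto-generated id plus name and bio text columns (the column types and sample values are assumptions):

CALL paradedb.create_bm25(
    index_name => 'users_idx',
    schema_name => 'public',
    table_name => 'users',
    key_field => 'id',
    text_fields => '{"name":{},"bio":{}}'
);

INSERT INTO public.users (name, bio) VALUES ('Test User', 'Sells running shoes');
DELETE FROM public.users WHERE name = 'Test User';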

@boris-glumpler
Author

No, I can't create another index now. I keep getting the error inserting document during build callback error from above. Also, the one index I was able to create right after installing v0.7.0 doesn't seem to have been set up properly: the schema was created, but there is no search function. The only functions in there are schema and rank_hybrid.

@philippemnoel
Collaborator

@boris-glumpler Could you DROP EXTENSION and start from scratch? I also noticed you compiled using 0.12.0-alpha.1 for the pgrx version. We recommend 0.11.3, which is the stable version we are on.
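
Concretely, that would mean redoing the pgrx setup from earlier in the thread with the stable version, roughly:

cargo install --locked cargo-pgrx --version 0.11.3 --force   # --force overwrites the previously installed cargo-pgrx
cargo pgrx init --pg16=/opt/homebrew/opt/postgresql@16/bin/pg_config
cargo pgrx install --release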

@boris-glumpler
Author

I used 0.12.0-alpha.1 before because that's what the README said. And then I changed to 0.11.3 when you changed the README ;) Anyway, I just compiled 0.7.1, dropped the extension and then tried to create it again, but got this error:

ERROR:  could not find function "range_timestamp_wrapper" in file "/opt/homebrew/lib/postgresql@16/pg_search.dylib"
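
One way to spot this kind of mismatch is to compare the extension version registered in the database with the library that was just installed, for example:

SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_search';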

@philippemnoel
Collaborator

@boris-glumpler We've fixed this, apologies. We were missing some SQL commands in our upgrade scripts. I would recommend you delete all traces of pg_search on your system and try again from the version in dev or main, which should be the equivalent of v0.7.3. Please let us know how it goes.
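
A rough sketch of that cleanup, assuming pg_config resolves to the Homebrew postgresql@16 one used earlier in the thread (verify the paths before deleting anything):

# in psql, for each database that has the extension:
#   DROP EXTENSION IF EXISTS pg_search CASCADE;

# remove the installed artifacts
rm "$(pg_config --pkglibdir)"/pg_search.dylib
rm "$(pg_config --sharedir)"/extension/pg_search*

# rebuild and reinstall from the main branch
cd paradedb
git checkout main && git pull
cd pg_search
cargo pgrx install --release

# restart Postgres, then CREATE EXTENSION pg_search; again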

@boris-glumpler
Author

Great, thanks! I'll be traveling the next couple days and will give it a go then and report back!
