Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Useful code for checking progress during indexing #8

Open
Chloe-Meinshausen opened this issue Feb 3, 2023 · 1 comment
Open

Useful code for checking progress during indexing #8

Chloe-Meinshausen opened this issue Feb 3, 2023 · 1 comment

Comments

@Chloe-Meinshausen
Copy link
Contributor

Hi all,
just wanted to share this very helpful query that will check the status on the create index process. -I have one thats been running for almost 24 hours and seeing the progress creep up has been a stress relief.

This is the results you get if you run it in pgadmin.

image

      SELECT
        now()::TIME(0),
        a.query,
        p.phase,
        round(p.blocks_done / p.blocks_total::numeric * 100, 2) AS "% done",
        p.blocks_total,
        p.blocks_done,
        p.tuples_total,
        p.tuples_done,
        ai.schemaname,
        ai.relname,
        ai.indexrelname
      FROM pg_stat_progress_create_index p
      JOIN pg_stat_activity a ON p.pid = a.pid
      LEFT JOIN pg_stat_all_indexes ai on ai.relid = p.relid AND ai.indexrelid = p.index_relid;

source:
https://dba.stackexchange.com/questions/11329/monitoring-progress-of-index-construction-in-postgresql/249784#249784?newreg=7a3d794dcb154b6782ca390ead375050

@DaveBathnes
Copy link
Member

Thank you! That's really useful. Those indexes on the json column I know are very slow to build. And probably not very efficient!

The whole thing took about 2 days for me last time I ran it. The indexes were a long time, and then the final full vacuum command, as that rebuilds the data. That also may be unnecessary. Because there's a lot of processing of the data, I thought it would be a useful way of reducing the final database size. I'll need to double check again how effective it is though.

One next stage to the project I will start having a look at is more tailored indexing. Some of the data would be useful in a full-text search, and I think that would mean pulling out the relevant data fields (e.g. book description, title) into a tsvector column. And then indexing only specific fields in the json.

Also, thanks for the PR! Apologies for the delay, it's been a busy week so I've been a bit delayed getting a chance to look at it. But I'm looking forward to seeing the tests run!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants