Library Browser showing OPL problem multiple times #2387

Open
Alex-Jordan opened this issue Apr 1, 2024 · 3 comments

Comments

@Alex-Jordan
Contributor

I see the following in both production and my develop server.

  1. Go to Library Browser.
  2. Click "Advanced Search" for the OPL.
  3. For "Textbook", select "Calculus: Concepts and Contexts by James Stewart (edition 5)"
  4. For "Text chapter" choose "12. Vector geometry"
  5. For "Text section" choose "1. Coordinate systems"

At this point there should be 17 matching problems. Click "View Problems". I see the first problem, Library/UMN/calculusStewartET/s_12_1_10.pg, three times. I also see Library/UMN/calculusStewartET/s_12_1_9.pg three times. And there are more examples. So far, I only see this with UMN problems.

@drgrice1
Sponsor Member

drgrice1 commented Apr 1, 2024

It seems that WeBWorK 2.18 also does this. So this is not a regression with develop. Of course it is a bug nonetheless.

@Alex-Jordan
Contributor Author

I see it also in a different search with Library/Rochester/setVectors1space3D/UR_VC_1_2.pg, so it is not just UMN problems.

@drgrice1
Sponsor Member

drgrice1 commented Apr 2, 2024

So this is probably technically an OPL bug, and not a webwork2 bug. The issue is a structural flaw in the design of the OPL database that goes back to 2014, when cross listing of problems in different subject areas was added to Taxonomy2. That implementation was not well thought out. It requires that at least the DBsubject be selected in order to uniquely identify the cross listings of a single problem.

So what is happening for the particular example in your first comment in this issue is that the Library/UMN/calculusStewartET problems for the textbook "Calculus: Concepts and Contexts by James Stewart (edition 5)", text chapter "12. Vector geometry", and text section "1. Coordinate systems" are all cross listed for "Calculus - multivariable", "Linear Algebra", and "Geometry", and so each is stored in the OPL_pgfile table three times (once for each cross listed subject). None of the top three selects in the advanced tab (which correspond to DBsubject, DBchapter, and DBsection) are selected, and we are filtering only by textbook, text chapter, and text section. That does not uniquely identify the cross listing, and so you get all three of each problem.

To see this, select the textbook, chapter, and section you mentioned. At this point it shows 24 matching WeBWorK problems. Now select "Calculus - Multivariable" for the database subject (the first select on the advanced tab). Then it drops to 8 matching WeBWorK problems. If you select "Linear Algebra" it also matches 8, and the same for "Geometry". Initially each problem is listed 3 times because it is cross listed in those 3 subjects. Selecting one of the subjects eliminates the cross listings.
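
The 24 versus 8 counts can be reproduced on a toy table. This is only a minimal sketch: the table `pgfile_demo`, its columns, and the `demo/...` file names are hypothetical and do not mirror the real OPL schema. It assumes one row is stored per (file, cross listed subject) pair, as described above.

```python
import sqlite3

# Hypothetical, simplified stand-in for the OPL tables (not the real schema):
# one row per (problem file, cross listed subject).
SUBJECTS = ["Calculus - multivariable", "Linear Algebra", "Geometry"]
FILES = [f"demo/problem_{n}.pg" for n in range(1, 9)]  # 8 made-up files

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pgfile_demo (path TEXT, subject TEXT, textsection TEXT)"
)
conn.executemany(
    "INSERT INTO pgfile_demo VALUES (?, ?, ?)",
    [(f, s, "1. Coordinate systems") for f in FILES for s in SUBJECTS],
)


def count_matches(textsection, subject=None):
    """Count rows matching the text section and, optionally, a subject."""
    sql = "SELECT COUNT(*) FROM pgfile_demo WHERE textsection = ?"
    params = [textsection]
    if subject is not None:
        sql += " AND subject = ?"
        params.append(subject)
    return conn.execute(sql, params).fetchone()[0]


# Filtering by text section only counts every cross listing.
rows_without_subject = count_matches("1. Coordinate systems")
# Adding a subject filter leaves one row per file.
rows_with_subject = count_matches("1. Coordinate systems", "Linear Algebra")
print(rows_without_subject, rows_with_subject)
```

With 8 files cross listed in 3 subjects, the unfiltered count is 3 × 8 = 24, and any single subject brings it back to 8.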

To fix this, either the OPL database will need to be redesigned to properly handle cross listings, or the database queries in getDBListings will need to be revised. Redesigning the database would give the most efficient result, but revising the queries in getDBListings also works, with some loss of efficiency. I am working on finding queries that work relatively well in all cases, but am not quite there yet.
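
As a rough illustration of the query-revision option, here is a sketch on a hypothetical toy table. It is not the actual getDBListings code or the real OPL schema; the table `pgfile_demo` and the `demo/...` paths are made up. The idea shown is simply to collapse cross listings by grouping on the file path when no subject is selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per (file, cross listed subject).
conn.execute("CREATE TABLE pgfile_demo (path TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO pgfile_demo VALUES (?, ?)",
    [
        ("demo/a.pg", "Calculus - multivariable"),
        ("demo/a.pg", "Linear Algebra"),
        ("demo/a.pg", "Geometry"),
        ("demo/b.pg", "Geometry"),
    ],
)


def list_problems(subject=None):
    """Return each matching file once, even if it is cross listed."""
    sql = "SELECT path FROM pgfile_demo"
    params = []
    if subject is not None:
        sql += " WHERE subject = ?"
        params.append(subject)
    # Grouping by path collapses the cross listings of a single file.
    sql += " GROUP BY path ORDER BY path"
    return [row[0] for row in conn.execute(sql, params)]


# Counting rows overstates the library size; counting distinct paths does not.
raw_rows = conn.execute("SELECT COUNT(*) FROM pgfile_demo").fetchone()[0]
unique_files = conn.execute(
    "SELECT COUNT(DISTINCT path) FROM pgfile_demo"
).fetchone()[0]
all_listed = list_problems()
linear_algebra = list_problems("Linear Algebra")
print(raw_rows, unique_files, all_listed, linear_algebra)
```

Grouping costs the database an extra pass over the matching rows, which is one plausible source of the efficiency loss mentioned above; a schema with one row per file and a separate cross-listing table would avoid it.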

Note that we have been lying about how many problems are in the OPL and Contrib since this already affects the counts when no database subject is selected. On the develop branch with the v2023-04-30 OPL release it shows 37,515 problems in the OPL and 11,087 problems in Contrib. I used a revised database query to find the correct numbers, and there are actually only 29,763 problems in the OPL, and 7,916 in Contrib.

drgrice1 added a commit to drgrice1/webwork2 that referenced this issue Apr 2, 2024
The database query used to count and list OPL and Contrib problems was
not good enough to narrow down to a single cross listed problem listing
in the OPL database. As a result a single problem would be listed
multiple times if a database subject is not selected.  For example, if a
textbook, text chapter, and text section are selected but not a database
chapter.  This fixes issue openwebwork#2387.  See that issue for details. Note that
this could be improved upon by changing the OPL database.  The structure
of the OPL database is not good. There is absolutely no reason that the
libraryroot, pg file path, and pg file basename should be stored in
separate columns, much less in separate tables. Just gotta say, what
were you thinking whoever designed it that way?

To make that work well the values of the select options needed to be
switched from being the human readable texts that are shown to being the
database ids of the things the select represent.  This also fixes
another issue that I have known about for a while now.  If you are on
the "Basic Search" page (the initial page) of the library browser,
select the subject "Calculus - single variable", chapter "Limits and
continuity", and section "Motivational applications (estimation)", and
then click on the "Advanced Search" button, this results in a heavy
spike in CPU usage and a long delay before the page actually loads.
That happens with other selections as well, but how bad it is varies
with what is selected.

Also remove the plurals from the select names that don't make sense
since they represent a singular quantity.

This also removes support for OPLv1 (as the TODO in SetMaker.pm
suggests). There is no reason to support that anymore. Not all of the
scripts have the version check removed yet though.

Generally clean up the ListingDB.pm file.