Library Browser showing OPL problem multiple times #2387

Alex-Jordan · 2024-04-01T03:04:46Z

I see the following in both production and my develop server.

Go to Library Browser.
Click "Advanced Search" for the OPL.
For "Textbook", select "Calculus: Concepts and Contexts by James Stewart (edition 5)".
For "Text chapter", choose "12. Vector geometry".
For "Text section", choose "1. Coordinate systems".

At this point there should be 17 matching problems. Click "View Problems". I see the first problem, Library/UMN/calculusStewartET/s_12_1_10.pg, three times. I also see Library/UMN/calculusStewartET/s_12_1_9.pg three times. And there are more examples. So far, I only see this with UMN problems.


drgrice1 · 2024-04-01T03:11:16Z

It seems that WeBWorK 2.18 also does this. So this is not a regression with develop. Of course it is a bug nonetheless.

Alex-Jordan · 2024-04-01T03:11:56Z

I see it also in a different search with Library/Rochester/setVectors1space3D/UR_VC_1_2.pg, so it is not just UMN problems.

drgrice1 · 2024-04-02T12:52:27Z

So this is probably technically an OPL bug, and not a webwork2 bug. The issue is a structural flaw in the design of the OPL database that goes back to 2014, when cross listing of problems in different subject areas was added to Taxonomy2. That implementation was not well thought out. It requires that at least the DBsubject be selected in order to uniquely identify the cross listings of a single problem.

So what is happening for the particular example in your first comment in this issue is that the Library/UMN/calculusStewartET problems for the textbook "Calculus: Concepts and Contexts by James Stewart (edition 5)", chapter "12. Vector geometry", and section "1. Coordinate systems" are all cross listed for "Calculus - multivariable", "Linear Algebra", and "Geometry", and so each is stored in the OPL_pgfile table three times (once for each cross listed subject). None of the top three selects in the advanced tab, which correspond to DBsubject, DBchapter, and DBsection, are selected, and we are filtering only by textbook, textchapter, and textsection. That does not uniquely identify the cross listing, and so you get all three copies of each problem.

To see this, select the textbook, chapter, and section you mentioned. At this point it shows 24 matching WeBWorK problems. Now select "Calculus - Multivariable" for the database subject (the first select on the advanced tab). Then it drops to 8 matching WeBWorK problems. If you select "Linear Algebra" it also matches 8, and the same for "Geometry". Initially each problem is listed 3 times because it is cross listed in those 3 subjects. Selecting one of the subjects eliminates the cross listings.
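The effect can be reproduced on a toy table. The following is only a minimal sketch: the table `pgfile_demo` and its columns are hypothetical stand-ins, not the real OPL schema, and the two file paths are taken from the report above. It stores one row per (file, cross listed subject) pair and shows that filtering by text section alone returns each file three times, while adding a subject filter returns it once.

```python
import sqlite3

# Hypothetical, simplified stand-in for the OPL tables: one row per
# (problem file, cross listed subject) pair. This is NOT the real OPL schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pgfile_demo (path TEXT, subject TEXT, text_section TEXT)"
)

paths = [
    "Library/UMN/calculusStewartET/s_12_1_9.pg",
    "Library/UMN/calculusStewartET/s_12_1_10.pg",
]
subjects = ["Calculus - multivariable", "Linear Algebra", "Geometry"]
rows = [(p, s, "1. Coordinate systems") for p in paths for s in subjects]
conn.executemany("INSERT INTO pgfile_demo VALUES (?, ?, ?)", rows)


def matching_paths(section, subject=None):
    """Return file paths for a text section, optionally limited to one subject."""
    sql = "SELECT path FROM pgfile_demo WHERE text_section = ?"
    params = [section]
    if subject is not None:
        sql += " AND subject = ?"
        params.append(subject)
    sql += " ORDER BY path"
    return [row[0] for row in conn.execute(sql, params)]


# Filtering by text section only: every cross listing is returned.
by_text_only = matching_paths("1. Coordinate systems")
# Adding a subject filter: exactly one row per file remains.
with_subject = matching_paths("1. Coordinate systems", "Geometry")

print(len(by_text_only), len(with_subject))
```

In this toy setup the ratio between the two counts is exactly the number of cross listed subjects, which matches the 24 versus 8 observation.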

To fix this, either the OPL database will need to be redesigned to properly handle cross listings, or the database queries in getDBListings will need to be revised. Redesigning the database would give the most efficient result, but revising the queries in getDBListings also works, with some loss of efficiency. I am working on finding queries that work relatively well in all cases, but am not quite there yet.
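One way to revise a query so that cross listings collapse to a single row is to group by the file path. The sketch below illustrates that general idea only; it is not the query used in getDBListings, and the table `pgfile_demo`, its columns, and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical toy table, NOT the real OPL schema: each file is stored once
# per cross listed subject, as described in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pgfile_demo ("
    "id INTEGER PRIMARY KEY, path TEXT, subject TEXT, text_section TEXT)"
)
files = [
    "Library/UMN/calculusStewartET/s_12_1_9.pg",
    "Library/UMN/calculusStewartET/s_12_1_10.pg",
]
subjects = ["Calculus - multivariable", "Linear Algebra", "Geometry"]
conn.executemany(
    "INSERT INTO pgfile_demo (path, subject, text_section) VALUES (?, ?, ?)",
    [(f, s, "1. Coordinate systems") for f in files for s in subjects],
)


def naive_count(section):
    """Count rows, so every cross listing is counted separately."""
    sql = "SELECT COUNT(*) FROM pgfile_demo WHERE text_section = ?"
    return conn.execute(sql, (section,)).fetchone()[0]


def unique_count(section):
    """Count each file path once, regardless of cross listings."""
    sql = "SELECT COUNT(DISTINCT path) FROM pgfile_demo WHERE text_section = ?"
    return conn.execute(sql, (section,)).fetchone()[0]


def unique_listing(section):
    """Return one (path, representative row id) pair per file."""
    sql = (
        "SELECT path, MIN(id) FROM pgfile_demo "
        "WHERE text_section = ? GROUP BY path ORDER BY path"
    )
    return conn.execute(sql, (section,)).fetchall()


print(naive_count("1. Coordinate systems"))
print(unique_count("1. Coordinate systems"))
print(unique_listing("1. Coordinate systems"))
```

Grouping makes the database do extra work on every search, which is consistent with the efficiency trade-off noted above; a schema that stores each file once and keeps cross listings in a separate relation would avoid it.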

Note that we have been misreporting how many problems are in the OPL and Contrib, since the cross listings also inflate the counts when no database subject is selected. On the develop branch with the v2023-04-30 OPL release it shows 37,515 problems in the OPL and 11,087 problems in Contrib. I used a revised database query to find the correct numbers, and there are actually only 29,763 problems in the OPL and 7,916 in Contrib.
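The size of the overcount follows directly from the four figures quoted above. A small sketch of the arithmetic (the variable names are illustrative only):

```python
# Reported (with cross listings) and corrected counts quoted in this thread.
reported = {"OPL": 37_515, "Contrib": 11_087}
corrected = {"OPL": 29_763, "Contrib": 7_916}

# Number of extra listings that come from cross listed duplicates.
overcount = {name: reported[name] - corrected[name] for name in reported}

for name in reported:
    share = overcount[name] / reported[name]
    print(f"{name}: {overcount[name]} duplicate listings ({share:.1%} of reported)")
```

So roughly a fifth of the reported OPL listings, and somewhat more of the Contrib listings, are duplicates.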

The database query used to count and list OPL and Contrib problems was not good enough to narrow down to a single cross listed problem listing in the OPL database. As a result, a single problem would be listed multiple times if a database subject is not selected, for example if a textbook, text chapter, and text section are selected but not a database chapter. This fixes issue openwebwork#2387. See that issue for details.

Note that this could be improved upon by changing the OPL database. The structure of the OPL database is not good. There is absolutely no reason that the libraryroot, pg file path, and pg file basename should be stored in separate columns, much less in separate tables. Just gotta say, what were you thinking, whoever designed it that way?

To make that work well, the values of the select options needed to be switched from being the human readable texts that are shown to being the database ids of the things the selects represent.

This also fixes another issue that I have known about for a while now. If you are on the "Basic Search" page (the initial page) of the library browser, select the subject "Calculus - single variable", chapter "Limits and continuity", and section "Motivational applications (estimation)", and then click on the "Advanced Search" button, this results in a heavy spike in CPU usage and a long delay before the page actually loads. That happens with other selections as well, but how bad it is varies with what is selected.

Also remove the plurals from the select names that don't make sense, since they represent a singular quantity.

This also removes support for OPLv1 (as the TODO in SetMaker.pm suggests). There is no reason to support that anymore. Not all of the scripts have the version check removed yet, though.

Generally clean up the ListingDB.pm file.

drgrice1 mentioned this issue Apr 2, 2024

Fix cross listed OPL problems in the library browser. #2388

Merged
