
Rethinking Requirement Computation and UX #522

Open
3 of 7 tasks
SamChou19815 opened this issue Sep 17, 2021 · 3 comments
Comments


SamChou19815 commented Sep 17, 2021

Paradigm Shift: From pre-computation invariant enforcement to post-computation violation fixing.

Rationale

In a recent team meeting, we discussed how requirement computation's complexity has exploded. We have several pressing flaws in our current system:

  1. lack of an opt-in/opt-out mechanism;
  2. a broken double-counting mechanism for minors.

Given our current design/UX philosophy, we need to force users to make some choices so that these problems cannot occur after a course is added. This can be confusing for users, since they might just want to mass-add many courses to see what happens. It is also difficult for us, since we have to design modals for all these choices in every possible flow.

This design document presents an alternative way to solve the problem: we shift the burden of producing a valid requirement graph from the pre-computation stage to the post-computation stage. Intuitively, it can be described by the following diagram:

[Figure: intuition]

I believe this new general model of thinking can help both users and developers understand what's going on:

  • For developers: instead of predicting what problems adding a class can cause before adding it, we can directly compute the list of problems on the computed requirement graph.
  • For users: instead of making choices without knowing why they have to be made, they can see the problem and keep it in mind when making their choices.

Finally, it can also unblock another planned feature, mass-adding courses, since users no longer have to make choices at add time!

Design/UX Implications

There should be an additional screen, entered from a prominent button, that lists all the actionable problems and provides inline fixes.

Here is a design sketch of what information and buttons need to be presented:

[Figure: sketch]

Example UX:

  1. A user adds MATH 4710. It can be used to fulfill "external spec", "tech elective", and "engineering distribution". None of those requirements allows double counting, but we don't force the user to choose a requirement to fulfill yet.
  2. The course is now added. A warning is now generated. Users can see the number of warnings on the navigation sidebar.
  3. The user might choose to ignore the warning for now and continue adding courses. (We might want to perform some user research on this. There are a variety of options we can take here, e.g. we bug the user to resolve some of the warnings after they add a few courses. We can also try to make the warning more prominent by adding the following text to the requirement sidebar: "Does the requirement fulfillment progress look a little off? Click on the warning icon to see the issues we discovered.")
  4. Eventually, the user decides to click on the warning icon and enters the warning page.
  5. The user clicks the "do not count for external spec" button for MATH 4710. The warning disappears and the progress bar for the requirement might drop. (Note for devs: the button's text hides the complexity of the underlying actions, e.g. clicking the button might imply choosing one course as the default and undoing an opt-in of another requirement. However, we handle this on the dev side and leave the user with a simpler message.)

Engineering

Correctness Theorem

The correctness of the new flow rests upon a single theorem:

For a graph G = (V, E) produced from user input with possible constraint violations, there exists a graph G_correct = (V, E') such that E' is a subset of E, and G_correct does not have any constraint violations.

In plain English, it states that starting from any bad graph, we can remove edges until it becomes good. (In the current system, being good means no illegal double-counting.)

To prove the theorem, we first need to define what a constraint is:

A constraint is a tuple (c, R) where c is a course and R is a set of requirements. The constraint mandates that there can be at most one edge (r, c) in the graph such that r is in R.

With this definition of constraint, the theorem can be easily proven: for every violated constraint, we keep only one of its edges. The resulting graph is then valid, and it has the same set of original vertices but a subset of the original edges.

The proof of the theorem also tells us how to guide users toward a valid requirement graph, which I will talk about more in the design/UX section.
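The edge-removal construction in the proof can be sketched directly in code. The following is a minimal illustration assuming a flat edge-list graph representation; all type and function names are invented for this sketch and are not actual codebase types:

```typescript
// A minimal sketch of the edge-removal argument. All type and function
// names here are invented for illustration; they are not codebase types.
type Edge = { requirement: string; course: string };
type Constraint = { course: string; requirements: Set<string> };

// A constraint (c, R) is violated when more than one edge (r, c) with r in R exists.
function violatedConstraints(edges: Edge[], constraints: Constraint[]): Constraint[] {
  return constraints.filter(
    ({ course, requirements }) =>
      edges.filter(e => e.course === course && requirements.has(e.requirement)).length > 1
  );
}

// Repair: for each violated constraint, keep only one of its matching edges.
// The result keeps every vertex but only a subset of the original edges.
function repairGraph(edges: Edge[], constraints: Constraint[]): Edge[] {
  let result = [...edges];
  for (const { course, requirements } of violatedConstraints(result, constraints)) {
    const matching = result.filter(
      e => e.course === course && requirements.has(e.requirement)
    );
    // Keep the first matching edge arbitrarily; the real flow would let the user choose.
    const keep = matching[0];
    result = result.filter(e => !matching.includes(e) || e === keep);
  }
  return result;
}
```

Here the sketch keeps an arbitrary edge per violation; the actual UX described below instead surfaces the violation and lets the user pick which edge survives.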

Double Counting as Constraint Violations

With constraints as a more primitive concept than the built-in double-counting concept, we now have the ability to model the complicated intra-minor double counting.

The fundamental change is that instead of directly computing the set of courses that are illegally double counted, we first compute constraints and then compute which constraints are violated.

The previous double-counting detection can be modeled as a "global constraint": for every course, there is a constraint (c, R) where R is the set of connected requirements that do not allow double counting. Note that the behavior that all minors allow double counting is unchanged at this stage. Now we add a new set of constraints to handle potential intra-minor double counting: for every course that is connected to at least one minor requirement, there is a constraint (c, R) where R is the set of connected minor requirements.
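To make the two families of constraints concrete, here is a sketch of the constraint-generation step. The types are assumptions for this example, not the actual codebase representation:

```typescript
// Illustrative sketch of the constraint-generation step; these types are
// assumptions for this example, not the actual codebase representation.
type ConnectedRequirement = { id: string; allowsDoubleCounting: boolean; isMinor: boolean };
type CourseNode = { id: string; connectedRequirements: ConnectedRequirement[] };
type Constraint = { course: string; requirements: Set<string> };

function computeConstraints(courses: CourseNode[]): Constraint[] {
  const constraints: Constraint[] = [];
  for (const course of courses) {
    // Global constraint: all connected requirements that forbid double counting.
    const noDoubleCount = course.connectedRequirements
      .filter(r => !r.allowsDoubleCounting)
      .map(r => r.id);
    // A constraint over zero or one requirement can never be violated, so skip it.
    if (noDoubleCount.length > 1) {
      constraints.push({ course: course.id, requirements: new Set(noDoubleCount) });
    }
    // Intra-minor constraint: all connected minor requirements.
    const minorRequirements = course.connectedRequirements
      .filter(r => r.isMinor)
      .map(r => r.id);
    if (minorRequirements.length > 1) {
      constraints.push({ course: course.id, requirements: new Set(minorRequirements) });
    }
  }
  return constraints;
}
```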

With this new algorithm, you can convince yourself that it can detect the following constraint violations (the two different ones colored differently):

[Figure: violations]

Helping Users to Fix the Graph

The requirement graph algorithm, in addition to the graph, also outputs a set of violated constraints. These violated constraints guide the user to either make a 1-of-N choice for each violated constraint or use the opt-out mechanism.

The requirement graph algorithm is now amended to:

  1. Build a rough graph by naively connecting requirements and courses based on the requirement JSON;
  2. Respect the user's choices on fulfillment strategies;
  3. Respect the user's opt-in choices;
  4. Respect the user's double-counting resolution choices;
  5. Compute constraint violations.

Representing User Choices

User choices for double-counting related issues are now refactored into two categories:

  1. Opt-in, which can only be applied to requirement-course edges that are not naturally formed.
  2. Opt-out, which can only be applied to requirement-course edges that are naturally formed.

Here, "naturally formed" means that the requirement data states that the course can be used to
satisfy that requirement.
The data format might look like:

{
  // course's uniqueId
  "12345": {
    "optIn": { "REQ3": ["slotA", "slotB"] },
    "optOut": ["REQ1", "REQ2"]
  }
}

The existing user-selectable-requirement-choices collection data, which is a simple course unique id -> requirement map, can be migrated to optOut by taking the set difference R \ {r}, where R is the set of all requirements that do not allow double counting and can be fulfilled by the course, and r is the chosen requirement id in the existing data.
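A migration along these lines might look like the following sketch. The data shapes are assumptions based on the description above, and the helper that looks up a course's conflicting requirements is hypothetical:

```typescript
// Sketch of migrating the old single-choice data to the new opt-out format.
// The data shapes and the lookup helper are assumptions for illustration only.
type OldChoices = Record<string, string>; // course uniqueId -> chosen requirement id
type NewChoice = { optIn: Record<string, string[]>; optOut: string[] };

function migrateChoices(
  old: OldChoices,
  // Hypothetical helper: all requirements fulfillable by this course
  // that do not allow double counting (the set R above).
  conflictingRequirements: (uniqueId: string) => string[]
): Record<string, NewChoice> {
  const migrated: Record<string, NewChoice> = {};
  for (const [uniqueId, chosen] of Object.entries(old)) {
    migrated[uniqueId] = {
      optIn: {},
      // optOut = R \ {r}: every conflicting requirement except the chosen one.
      optOut: conflictingRequirements(uniqueId).filter(r => r !== chosen),
    };
  }
  return migrated;
}
```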

Concrete Tasks

  • Set up requirement-graph-aware data migration infrastructure
  • Migrate existing data to the new format, first in a separate collection
  • Migrate the frontend to start using the new format
  • Implement UI that allows users to review the issues and make choices
  • Remove the current choice UI that makes users choose at add time
  • Launch
  • Clean up unused old data

Risks

This is a complete rethink of a major UX flow. Many modals currently under development might be abandoned, and interfaces need to be redesigned. However, the format of user choices remains largely the same, since we are only shifting when those choices are made.

Unsolved Problems

It's important to note that this is not a solution to every problem we currently know about. Notably, the following problems cannot be solved directly by this proposal.

AP/IB Override

This proposal does not consider AP/IB courses at all. I think the most principled solution is to treat them exactly as normal courses that happen to be placed in a special semester called "Pre-Cornell".

Complicated Sub-requirements

The most complicated sub-requirement we have seen so far is the applied math minor. It is not representable at all in the current system. However, that is a problem of sub-requirements, while this design document concerns top-level requirements only.

Cross-listed courses

Right now we treat cross-listed courses as exactly the same, but some requirements can only be fulfilled by taking one particular variant of a cross-listed class. Fixing that is not a concern of this design proposal.

@benjamin-shen
Collaborator

💯 I have some comments and questions about this proposal. I am overall in support, for all the reasons you've mentioned.

In plain English, it states that starting from any bad graph, we can remove edges until it becomes good. (In the current system, being good means no illegal double-counting.)

I believe this can also be extended to more complex constraints beyond double counting! For example, forfeiting AP credit (which was previously impossible).

The previous double counting detection can be modeled as a "global constraint". For every course, there is a constraint (c, R) where R is a set of connected requirement that does not allow double counting.

Do you propose we store (c1, R), (c2, R), ...(cn, R) as separate constraints? I feel like that would lead to high space complexity (with redundant data R). Maybe we can use a single constraint (c_all, R) where c_all has a matching function that returns true on all courses.

With this new algorithm, you can convince yourself that it can detect the following constraint violation (two different ones colored differently)

Just to double check: this means there are two constraints associated with one course (c, {R1, R2}) and (c, {R4, R5})?

The requirement graph algorithm is now amended to be

  1. Building a rough graph by naively connecting requirements and courses based on requirement JSON;
  2. Respect user's choices on fulfillment strategies;
  3. Respect user's opt-in choices;
  4. Respect user's double counting resolution choices;
  5. Compute constraints;
  6. Find constraint violations.

I noticed that constraints were resolved last. Are there scenarios where OTHER user choices would supersede constraint violations?

Something like
3. Respect user's constraint-friendly opt-in choices
...
7. Respect user's constraint choices;
8. Respect user's constraint-conflicting opt-in choices

(That's a lot of kinds of user choices... lol)

There should probably be an additional screen that can be entered from a prominent button. This screen should list all the actionable problems and provide inline-fixes.

Design question (not necessarily for Sam): What happens if there are constraint violations that are not resolved? Would an arbitrary requirement be fulfilled (or no requirement)?

There should also be a similar screen that lets users review all the choices they made and allows them to change them.

👍 This is SUPER exciting. In some situations, would it be possible for us to recommend a user choice to maximize edges/flow? I also wonder if there is potential for machine learning or predictive modeling here (based on other users' choices).

This proposal does not consider AP/IB courses at all. I think the most principled solution is just to treat them exactly as a normal courses, but happened to be put into a special semester called "Pre-Cornell".

That makes sense to me. Maybe we could bring back the "equivalent courses" feature (or a new feature for CASE exams) and put it in the special semester too.

@SamChou19815
Contributor Author

@benjamin-shen

Do you propose we store (c1, R), (c2, R), ...(cn, R) as separate constraints? I feel like that would lead to high space complexity (with redundant data R). Maybe we can use a single constraint (c_all, R) where c_all has a matching function that returns true on all courses.

In fact, I'm proposing we store (c, r1), (c, r2), .... However, it's important to note that we only compute the constraints that are violated, so the actual size of constraints should be pretty small.

Just to double check: this means there are two constraints associated with one course (c, {R1, R2}) and (c, {R4, R5})?

Yes, that's right.

I noticed that constraints were resolved last. Are there scenarios where OTHER user choices would supersede constraint violations?

Can't think of any at the moment. I think it's important to put all constraint checking at the end; otherwise it would defeat the core idea that we check constraint violations after graph construction.

@SamChou19815
Contributor Author

Background

There are concerns that the approach to computing requirements proposed in the original plan above is too liberal. In the previous proposal, we keep all edges by default even when there is double counting, and let users opt out of those edges later.

In response, we have explored an idea where these edges are not added by default. Instead, the user has to explicitly opt in for courses that might introduce double counting. This alternative does ensure that we are conservative about progress computation, but it also makes certain things hard to explain to users:

  • why the user has to make choices
  • how to find potential constraint violations when an edge is added
  • how to inform the user about making choices without exposing all the graph internals

Amended Solution

It turns out that we can get the best of both worlds by computing two graphs. The first graph is the one we would compute according to the original proposal above: it keeps all edges, and user choices to remove double counting are represented as opt-outs (phase 3 computation). Then, based on that graph, we remove all edges associated with constraint violations, which results in a graph that contains only safe edges (phase 4 computation).

Both graphs will be useful. The phase 3 graph helps us compute constraint violations, which serve as the source of suggestions and warnings for users, and the phase 4 graph provides a conservative and safe source for progress computation.
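The phase 4 computation can be sketched as follows: every edge that participates in a violated constraint is dropped, leaving only safe edges. As before, the names are illustrative, not the actual implementation:

```typescript
// Sketch of the phase 4 "safe graph" computation: drop every edge that
// participates in a violated constraint. Names are illustrative only.
type Edge = { requirement: string; course: string };
type Constraint = { course: string; requirements: Set<string> };

function safeGraph(phase3Edges: Edge[], constraints: Constraint[]): Edge[] {
  const unsafe = new Set<Edge>();
  for (const { course, requirements } of constraints) {
    const matching = phase3Edges.filter(
      e => e.course === course && requirements.has(e.requirement)
    );
    // A violated constraint marks all of its matching edges as unsafe,
    // so progress computation never counts a contested edge.
    if (matching.length > 1) matching.forEach(e => unsafe.add(e));
  }
  return phase3Edges.filter(e => !unsafe.has(e));
}
```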

Example

Consider the following example before we account for user choices. By default, all edges are kept, even if there are constraint violations (i.e. double counting).

[Screenshot: 2021-11-02 21:47:33]

Then we start to take user choices into account. In this case, the user opts out of the edge CS5414 --- MEng_CS_15_Credits. Now we get the graph on the left of the following figure.

[Screenshot: 2021-11-02 21:47:42]

This graph still has constraint violations (edges marked in red), so we remove all the edges associated with constraint violations and obtain the graph on the right. The graph on the right will then be the source of truth for computing fulfillment progress.
