0003-data-strategy.md

Data Strategy

Date: 2024-01-31

Status: accepted

Context

In order to implement "Single Fetch" in Remix (Issue, RFC), we need to expose some level of control over the internal data fetching behaviors of the @remix-run/router. This way, while React Router will run loaders in parallel by default, Remix can opt into making a single fetch call to the server for all loaders.

Decisions

dataStrategy

To achieve the above, we propose to add an optional dataStrategy config which can be passed in by the application. The idea is that dataStrategy will accept an array of matches to load and will return a parallel array of results for those matches.

function dataStrategy(arg: DataStrategyFunctionArgs): Promise<DataResult[]>;

interface DataStrategyFunctionArgs<Context = any>
  extends DataFunctionArgs<Context> {
  matches: AgnosticDataStrategyMatch[];
}

interface DataFunctionArgs<Context> {
  request: Request;
  params: Params;
  context?: Context;
}

There's a comment here from Jacob which does a good job of outlining the current responsibilities, but basically React Router in its current state handles 4 aspects when it comes to executing loaders for a given URL - dataStrategy is largely intended to handle step 3:

  1. Match routes for URL
  2. Determine what routes to load (via shouldRevalidate)
  3. Call loader functions in parallel
  4. Decode Responses

Inputs

The primary input is matches, since the user needs to know which routes match and need to have loaders executed. We also wanted to provide a way for the user to call the "default" internal behavior so they could easily change from parallel to sequential without having to re-invent the wheel and manually call loaders, decode responses, etc. The first idea for this API was to pass a defaultStrategy(match) parameter so they could call that per-match:

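// Call in parallel
function dataStrategy({ matches, defaultStrategy }) {
  return Promise.all(matches.map((m) => defaultStrategy(m)));
}

// Or call sequentially (async so we can await each match in turn)
async function dataStrategy({ matches, defaultStrategy }) {
  let results = [];
  for (let match of matches) {
    results.push(await defaultStrategy(match));
  }
  return results;
}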

⚠️ defaultStrategy was eliminated in favor of match.handler.

We also originally intended to expose a type: 'loader' | 'action' field as a way to presumably let them call match.route.loader/match.route.action directly - but we have since decided against that with the match.handler API.

⚠️ type was eliminated in favor of match.handler.

dataStrategy controls when handlers are called, not how. RR is in charge of calling them with the right parameters.

Outputs

Originally, we planned on making the DataResult API public, which is a union of the different types of results (SuccessResult, ErrorResult, RedirectResult, DeferResult). However, as we kept evolving and did some internal refactoring to separate calling loaders from decoding results - we realized that all we really need is a simpler HandlerResult:

interface HandlerResult {
  type: ResultType.success | ResultType.error;
  result: any;
}

If the user returns us one of those per-match, we can internally convert it to a DataResult.

  • If result is a Response then we can handle unwrapping the data and processing any redirects (may produce a SuccessResult, ErrorResult, or RedirectResult)
  • If result is a DeferredData instance, convert to DeferResult
  • If result is anything else we don't touch the data, it's either a SuccessResult or ErrorResult based on type
    • This is important because it lets the end user opt into a different decoding strategy of their choice. If they return us a Response, we decode it. If not, we don't touch it.
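
To make the conversion rules above concrete, here is a minimal sketch of what such a step could look like. Everything in it is an assumption for illustration: convertHandlerResult, the string result tags, and the DeferredData stand-in class are hypothetical names, not the router's internals, and the decoding rule (JSON by Content-Type, otherwise text) is a simplification.

```javascript
// Hypothetical sketch - not React Router's internal implementation.

// Stand-in for the router's DeferredData class (assumption for this sketch)
class DeferredData {
  constructor(data) {
    this.data = data;
  }
}

// Convert a { type, result } handler result into a tagged data result
async function convertHandlerResult({ type, result }) {
  if (result instanceof Response) {
    // A 3xx Response with a Location header is treated as a redirect
    let isRedirect = result.status >= 300 && result.status < 400;
    if (isRedirect && result.headers.has("Location")) {
      return { type: "redirect", location: result.headers.get("Location") };
    }
    // Otherwise decode the body using standard Fetch APIs
    let contentType = result.headers.get("Content-Type") || "";
    let data = /\bapplication\/json\b/.test(contentType)
      ? await result.json()
      : await result.text();
    return type === "error"
      ? { type: "error", error: data }
      : { type: "data", data };
  }

  if (result instanceof DeferredData) {
    return { type: "deferred", deferredData: result };
  }

  // Anything else is passed through untouched
  return type === "error"
    ? { type: "error", error: result }
    : { type: "data", data: result };
}
```

The last branch is the one that matters for custom decoding: a strategy that has already decoded its own payload simply returns the plain value, and this step leaves it alone.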

Decoding Responses

Initially, we intended for dataStrategy to handle (3), and considered an optional decodeResponse API for (4) - but we decided that the decoding of responses was a small enough undertaking using standard Fetch APIs (i.e., res.json) that it didn't warrant a custom property - and they could just call those APIs directly. The defaultStrategy parameter would handle performing (3) the normal way that RR would.

⚠️ decodeResponse is made obsolete by HandlerResult

Handling route.lazy

There's a nuanced step we missed in our sequential steps above. If a route was using route.lazy, we may need to load the route before we can execute the loader. There are two options here:

  1. We pre-execute all route.lazy methods before calling dataStrategy
  2. We let dataStrategy execute them accordingly

(1) has a pretty glaring perf issue in that it blocks any loaders from running until all route.lazy's have resolved. So if route A is super small but has a slow loader, and route B is large but has a fast loader:

|-- route a lazy  -->                      |-- route a loader --------------->|
|-- route b lazy  ------------------------>|-- route b loader -->             |

This is no bueno. Instead, we want option (2) where the users can run these sequentially per-route - and "loading the route" is just part of the "loading the data" step:

|-- route a lazy  -->|-- route a loader --------------->         |
|-- route b lazy  ------------------------>|-- route b loader -->|
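
The two timelines above can be reduced to simple arithmetic. The sketch below uses made-up durations (arbitrary units) for a small route with a slow loader and a large route with a fast loader; the function names are ours and purely illustrative.

```javascript
// Durations are invented for illustration only
const routes = [
  { id: "a", lazy: 10, loader: 100 }, // small module, slow loader
  { id: "b", lazy: 60, loader: 20 }, // large module, fast loader
];

// Option (1): no loader starts until the slowest route.lazy has resolved
function totalWithPreExecutedLazy(routes) {
  const slowestLazy = Math.max(...routes.map((r) => r.lazy));
  const slowestLoader = Math.max(...routes.map((r) => r.loader));
  return slowestLazy + slowestLoader;
}

// Option (2): each route chains its own lazy -> loader independently
function totalWithPerRouteLazy(routes) {
  return Math.max(...routes.map((r) => r.lazy + r.loader));
}

console.log(totalWithPreExecutedLazy(routes)); // 160
console.log(totalWithPerRouteLazy(routes)); // 110
```

With these numbers, route a's slow loader starts 50 units earlier under option (2) because it no longer waits on route b's larger module.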

Therefore, we're introducing the concept of a DataStrategyMatch which is just like a RouteMatch but the match.route field is a Promise<Route>. We'll kick off the executions of route.lazy and then you can wait for them to complete prior to calling the loader:

function dataStrategy({ matches, defaultStrategy }) {
  return Promise.all(
    // Wait for each route's lazy() to resolve before calling its loader
    matches.map((m) => m.route.then((route) => route.loader(/* ... */)))
  );
}

There are also statically defined properties that live outside of lazy, so those are extended right onto match.route. This allows you to define loaders statically and run them in parallel with route.lazy:

function dataStrategy({ matches, defaultStrategy }) {
  // matches[0].route => Promise
  // matches[0].route.id => string
  // matches[0].route.index => boolean
  // matches[0].route.path => string
}

⚠️ We are actively seeing if we can eliminate this via match.handler

Handling shouldRevalidate behavior

We considered how to handle shouldRevalidate behavior. There are roughly 2 basic approaches:

  1. We pre-filter and only hand the user matchesToLoad
  2. We hand the user all matches and let them filter
    • This would probably also require a new defaultShouldRevalidate(match) => boolean parameter passed to dataStrategy

I think (1) is preferred to keep the API at a minimum and avoid leaking into other ways to opt-out of revalidation. We already have an API for that so let's lean into it.

Additionally, another big con of (2) is that if we want to let them make revalidation decisions inside dataStrategy - we need to expose all of the information required for that (currentUrl, currentParams, nextUrl, nextParams, submission info, actionResult, etc.) - the API becomes a mess.

Therefore we are aiming to stick with (1) and let shouldRevalidate be the only way to opt-out of revalidation.
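
As a rough sketch of what approach (1) means in practice, the router would filter matches before dataStrategy ever sees them. The helper below is hypothetical, and its default rule (revalidate whenever the URL changed) is a deliberate simplification of the real default behavior; only the idea of passing defaultShouldRevalidate into a route's shouldRevalidate comes from the existing API.

```javascript
// Hypothetical helper - the default rule here is simplified for the sketch
function getMatchesToLoad(matches, args) {
  // Simplified default: revalidate whenever the URL changed
  const defaultShouldRevalidate = args.currentUrl.href !== args.nextUrl.href;

  return matches.filter((match) => {
    // Routes without a loader have nothing to run
    if (typeof match.route.loader !== "function") return false;

    // A route-level shouldRevalidate gets the final say
    if (typeof match.route.shouldRevalidate === "function") {
      return Boolean(
        match.route.shouldRevalidate({ ...args, defaultShouldRevalidate })
      );
    }
    return defaultShouldRevalidate;
  });
}

// Usage: navigating /a -> /b where the root route opts out of revalidation
const exampleMatches = [
  { route: { id: "root", loader() {}, shouldRevalidate: () => false } },
  { route: { id: "b", loader() {} } },
];
const matchesToLoad = getMatchesToLoad(exampleMatches, {
  currentUrl: new URL("http://localhost/a"),
  nextUrl: new URL("http://localhost/b"),
});
console.log(matchesToLoad.map((m) => m.route.id)); // [ 'b' ]
```

Under this approach dataStrategy only receives the filtered array, so none of the revalidation inputs need to appear in its arguments.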

Handling actions and fetchers

Thus far, we've been mostly concerned with how to handle navigational loaders where there are multiple matched routes and loaders to run. But what about actions and fetchers where we only run a handler for a single leaf match? The quick answer to this is to just send a single-length array with the match in question:

// loaders
let matchesToLoad = getMatchesToLoad(request, matches);
let results = await dataStrategy({
  request,
  params,
  matches: matchesToLoad,
  type: "loader",
  defaultStrategy,
});

// action
let actionMatch = getTargetMatch(request, matches);
let actionResults = await dataStrategy({
  request,
  params,
  matches: [actionMatch],
  type: "action",
  defaultStrategy,
});
let actionResult = actionResults[0];

// fetcher loader/action
let fetcherMatch = getTargetMatch(request, matches);
let fetcherResults = await dataStrategy({
  request,
  params,
  matches: [fetcherMatch],
  type: "loader", // or "action"
  defaultStrategy,
});
let fetcherResult = fetcherResults[0];

This way, the user's implementation can just always operate on the matches array and it'll work for all use cases.

// Sample strategy to run sequentially
async function dataStrategy({ request, params, matches, type }) {
  let results = [];
  for (let match of matches) {
    // type is "loader" or "action"
    let result = await match.route[type]({ request, params });
    results.push(result);
  }
  return results;
}

What about middlewares?

As we thought more and more about this API, it became clear that the concept of "process data for a route" (step 3 above) was not necessarily limited to the loader/action and that there are data-related APIs on the horizon such as middleware and context that would also fall under the dataStrategy umbrella! In fact, a well-implemented dataStrategy could alleviate the need for first-class APIs - even if only initially. Early adopters could use dataStrategy to implement their own middlewares and we could see which patterns rise to the top and adopt them as first class route.middleware or whatever.

So how would middleware work? The general idea is that middleware runs sequentially top-down prior to the loaders running. And if you bring context into the equation - they also run top down and middlewares/loaders/actions receive the context from their level and above in the tree - but they do not "see" any context from below them in the tree.

A user-land implementation turns out not to be too bad assuming routes define middleware/context on handle:

// Assume routes look like this:
let route = {
  id: "parent",
  path: "/parent",
  loader: () => {},
  handle: {
    // context can provide multiple keyed contexts
    context: {
      parent: () => ({ id: "parent" }),
    },
    // middleware receives context as an argument
    middleware(context) {
      context.parent.whatever = "PARENT MIDDLEWARE";
    },
  },
};

async function dataStrategy({ request, params, matches, type }) {
  // Run context/middleware sequentially
  let contexts = {};
  for (let m of matches) {
    if (m.route.handle?.context) {
      for (let [key, ctx] of Object.entries(m.route.handle.context)) {
        contexts[key] = ctx();
      }
    }
    if (m.route.handle?.middleware) {
      // Only contexts from this level and above exist at this point
      m.route.handle.middleware(contexts);
    }
  }

  // Run loaders in parallel (or run the solo action)
  return Promise.all(
    matches.map(async (m, i) => {
      // Only expose contexts from this level and above
      let context = matches.slice(0, i + 1).reduce((acc, m) => {
        Object.keys(m.route.handle?.context ?? {}).forEach((k) => {
          acc[k] = contexts[k];
        });
        return acc;
      }, {});
      try {
        return {
          type: ResultType.data,
          data: await m.route[type]?.({ request, params, context }),
        };
      } catch (error) {
        return {
          type: ResultType.error,
          error,
        };
      }
    })
  );
}

❌ Nope - this doesn't actually work!

Remember above where we decided to pre-filter the matches based on shouldRevalidate? That breaks any concept of middleware since even if we don't intend to load a route, we need to run middleware on all parents before the loader. So we must expose at least the matches at or above that level in the tree - and more likely all matches to dataStrategy if it's to be able to implement middleware.

And then, once we expose multiple matches - we need to tell the user if they're supposed to actually run the handlers on those matches or only on a leaf/target match.

I think there are a few options here:

Option 1 - routeMatches and handlerMatches

We could add a second array of the "full" set of matches for the route and then middleware would operate on that set, and handlers would operate on the filtered set (renamed to handlerMatches) here. This still preserves the pre-filtering and keeps shouldRevalidate logic out of dataStrategy.

async function dataStrategy({ request, params, routeMatches, handlerMatches, type }) {
  // Run context/middleware sequentially
  let contexts = {};
  for (let match of routeMatches) { ... }

  // Run loaders in parallel
  return Promise.all(
    handlerMatches.map(async (m, i) => { ... })
  );
}

Option 2 - new field on DataStrategyMatch

Since we're already introducing a concept of a DataStrategyMatch to handle route.lazy, we could lean into that and expose something on those matches that indicates whether they need to have their handler run or not.

// Inside React Router, assume navigate from /a/ -> /b and we don't need to
// re-run the root loader
let dataStrategyMatches = [{
  route: { id: 'root', loader() {}, ... },
  runHandler: false // determined via shouldRevalidate
}, {
  route: { id: 'b', loader() {}, ... },
  runHandler: true // determined via shouldRevalidate
}]

Then, the user could use this to differentiate between middlewares and handlers:

async function dataStrategy({ request, params, matches, type }) {
  // Run context/middleware sequentially
  let contexts = {};
  for (let match of matches) { ... }

  // Run loaders in parallel
  let matchesToLoad = matches.filter(m => m.runHandler);
  return Promise.all(
    matchesToLoad.map(async (m, i) => { ... })
  );
}

Option 3 - new function on DataStrategyMatch

Extending on the idea above - it all started to feel super leaky and full of implementation-details. Why are users manually filtering? Or manually passing parameters to loaders/actions? Using a type field to know which to call? Waiting on a match.route Promise before calling the loader?

That's wayyyy too many rough edges for us to document and users to get wrong (rightfully so!).

Why can't we just do it all? Let's wrap all of that up into a single match.handler() function that:

  • Waits for route.lazy to resolve (if needed)
  • No-ops if the route isn't supposed to revalidate
    • Open question here if we return the current data from these no-ops or return undefined?
  • Knows whether to call the loader or the action
  • Allows users to pass additional params to loaders/actions for middleware/context use cases.

function dataStrategy({ matches }) {
  // No more type, defaultStrategy, or match.route promise APIs!

  // Can implement middleware as above since you now get all matches
  let context = runMiddlewares(matches);

  // Call all loaders in parallel (no params to pass) but you _can_ pass your
  // own argument to `handler` and it will come in as `loader({ request }, handlerArg)`
  // So you can send middleware context through to loaders/actions
  return Promise.all(matches.map((m) => m.handler(context)));

  // Note we don't do any filtering above - if a match doesn't need to load,
  // `match.handler` is a no-op.  Just like `serverLoader` is a no-op in `clientLoader`
  // when it doesn't need to run
}
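
For illustration, here is one way the router side of match.handler could be wired up so that it covers the four bullets above. This is a sketch under stated assumptions, not the actual implementation: createDataStrategyMatch, shouldLoad, and lazyPromise are hypothetical names, and returning undefined from a no-op is just one answer to the open question above.

```javascript
// Hypothetical factory - not React Router's internal implementation
function createDataStrategyMatch(
  match,
  { type, request, params, shouldLoad, lazyPromise }
) {
  return {
    ...match,
    async handler(handlerArg) {
      // No-op when the route is not supposed to revalidate
      // (this sketch returns undefined rather than the current data)
      if (!shouldLoad) {
        return { type: "success", result: undefined };
      }
      try {
        // Wait for route.lazy to resolve (if needed) and merge its exports
        let route = lazyPromise
          ? { ...match.route, ...(await lazyPromise) }
          : match.route;
        // `type` ("loader" or "action") picks which function to call
        let result = await route[type]({ request, params }, handlerArg);
        return { type: "success", result };
      } catch (error) {
        return { type: "error", result: error };
      }
    },
  };
}

// A strategy then only decides *when* to call handlers
function dataStrategy({ matches }) {
  let context = { requestId: "hypothetical-id" }; // e.g. produced by middleware
  return Promise.all(matches.map((m) => m.handler(context)));
}
```

Note that the strategy never sees type, request, or params here; it only chooses ordering and the extra argument, which keeps with the idea that dataStrategy controls when handlers run, not how.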

Consequences