Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT]: Add initial types and unit tests for coercion of tricky inputs using typebox #772

Draft
wants to merge 21 commits into
base: devel
Choose a base branch
from

Conversation

eddie-atkinson
Copy link

@eddie-atkinson eddie-atkinson commented May 4, 2024

Just a WIP PR to show where I'm at with exploring the use of typebox as a replacement for ajv.

So far I'm reasonably happy with where this work is at. Typebox seems like it can fit the needs of the project and is well documented and supported.

So far what I've done is:

  • Create reusable typebox types for YahooFinanceNumber, YahooFinanceDate, DateInMs, and TwoNumberRange
  • Add unit tests for these types that copy the existing test cases and augment them in some cases

Outstanding TODOs:

  • Test the performance of a few interfaces vs the existing ajv validation
  • Write a validateAndParse replacement using the Typebox APIs. This should be pretty straightforward, only potential wrinkle is whether error formatting is sufficiently readable to not require any significant manipulation. I think this will be OK as Typebox appears to just pipe through JSON Schema validation errors which are pretty readable IMO

Copy link

codecov bot commented May 4, 2024

Codecov Report

Attention: Patch coverage is 96.57534% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 93.84%. Comparing base (5a6a6c8) to head (7ce1443).
Report is 1 commits behind head on devel.

Current head 7ce1443 differs from pull request most recent head f0359fc

Please upload reports for the commit f0359fc to get more accurate results.

Files Patch % Lines
src/lib/datetime.ts 95.23% 2 Missing ⚠️
src/lib/yahooFinanceTypes.ts 94.44% 2 Missing ⚠️
src/lib/moduleExecTypebox.ts 96.42% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel     #772      +/-   ##
==========================================
+ Coverage   93.30%   93.84%   +0.54%     
==========================================
  Files          27       31       +4     
  Lines         732      878     +146     
  Branches      247      286      +39     
==========================================
+ Hits          683      824     +141     
- Misses         49       54       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gadicc
Copy link
Owner

gadicc commented May 5, 2024

This all looks great, @eddie-atkinson!

General comments:

  • Glad you started first with the coercion to make sure it will work from the get-go.
  • Typebox looks great... love the composition and readability.
  • Style is all great, love everything... thanks for the unit tests early on (not a given for everyone), edge cases, error handling and reporting. Great job 🙏

Minor nitpick:

  • For the copied code in date/time, could you add the commit SHA so we can more easily track any further work upstream. Either just in the comment or as the link (i.e. instead of master in the URL). I also wonder if there's any better way to create a union with the existing upstream functions, rather than copying and editing. Did you look at that at all? Anyway, super minor, I wouldn't waste more time on this now - let's get everything done first and revisit later.

  • Re nullable naming, will response inline above.

Thanks again, this is all great. Looking forward to following along and very excited to write new code using this in the future... the current system works well but can be quite tedious constantly rebuilding the schema to test things or forgetting to compile/commit on changes `:) So yeah, I'm excited, and big thanks again for all your efforts here 🙏

@eddie-atkinson
Copy link
Author

Hi @gadicc

For the copied code in date/time, could you add the commit SHA so we can more easily track any further work upstream

Done.

I also wonder if there's any better way to create a union with the existing upstream functions, rather than copying and editing. Did you look at that at all?

So my understanding is that the example directory this code sits in does not get bundled with typebox itself and is instead "for reference purposes only" with the intent that you copy the code you need. That being said, I share your concern about having it just sitting there, I'd prefer to defer to standard library functionality if we could. I can look into it as a follow up.

I've added a couple more commits which add a validateAndCoerceTypebox function. It is pretty simplistic, it just attempts to decode a value for a given schema, catches the error if there is one, logs it if desired and then re-throws it. The format of the error can be seen in the snapshot of the test in the PR. I'm not sure how keen you are to keep the error format consistent with what was there previously, is that part of the library's interface? If not, I feel that the error message as produced by typebox is quite readable. It includes the schema error, the value and the path. However, I am happy to spend some time making it look more like the old one if desired.

There were also a couple of test cases I dropped from the old validateAndCoerceTypes function, notably the it('fails on invalid options usage') test. To my mind invalid options use is protected against by the Typescript compiler yelling if you pass an invalid option. However, I'm keen to understand if that's a test case you think is still needed; it would be simple to add if desired.

Looking forward to hearing more feedback, and I'll start hacking on the perf tests in the meantime

@gadicc
Copy link
Owner

gadicc commented May 7, 2024

All looks great as usual, @eddie-atkinson; thanks again!

date/time types from examples

Ah yes, right you are... didn't even notice that 😅 If it's not included in the npm package there's really not much we can do, but as you say, I guess this was their intention anyway. Definitely fine for now, and if we notice any bugs, we now have the commit hash (thanks) to report upstream or rebase upstream fixes (by hand).

validateAndCoerceTypebox & error format

Awesome, thanks! Absolutely fine with the new error reporting. All we did before was use ajv's format... no API promise here, just something the end user can understand - within reason. My only note is that - as you'll see in the existing validate() function - some common errors also logged a huge amount of unnecessary data - and we'd do our best to only log the relevant parts. This is something we'll have to address eventually, as it greatly affects usability / DX of the library... but I think once we have everything else in place, it shouldn't be too hard to create similar functionality as have real data to test against. Pity I didn't create tests for the existing functionality - my bad 😅 (Or maybe typebox logs better errors from the get-go, not sure).

invalid options use is protected against by the Typescript compiler yelling if you pass an invalid option

The non-obvious reason for this (which I should have mentioned in a comment) is for library usage by non-TS users. That's the only reason why we still need to validate at runtime. And I guess that - unfortunately - means we'll need to use typebox for each module's options too. Sorry 🙈

But step by step... the most important thing is to get the foundations up - which you've done an excellent job of. Both systems can exist side by side for a while and we can convert everything gradually.

So, thanks again. I continue to follow with interest and appreciation :)

await expect(rwo({ invalid: true })).rejects.toThrow(InvalidOptionsError);
});

it("accepts empty queryOptions", async () => {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is slightly weird because no other module other than search seems to support this. But I'm not really sure why

handleResultError(e, options);
break;
}
// TODO: The existing implementation of 'validate' assumes that this parameter may not be provided
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another weird thing I found. The implementation in validate uses a bare else to cover the condition that type !== 'result' instead of checking for type === 'options'. I'm guessing this is a catch all case for when someone ignores the TS and passes through completely garbage options.

I have two thoughts on this:

  1. I feel like this case shouldn't really exist as this function should only be called internally within the library where we can assume TS.
  2. If we do want to be defensive around this I think it would make more sense to hoist the handling of this out of the validateAndCoerce function. Ideally we should be able to do that in the modules themselves and then we can have a clean interface for internal consumption, and catch weird errors where they originate rather than deep in the library.

Just my $0.02. I concede there is probably something going on here that I don't quite grok.

@eddie-atkinson
Copy link
Author

Hi @gadicc,

Me again.

A few updates on where I'm at.

I've added some more functionality for using typebox to validate options and responses, notably moduleExecTypebox.ts and search.tb.ts. The current approach I have taken is to build the modules that use typebox in parallel. I'm not sure if you have thoughts on how we could roll this change out but my current plan was to build the typebox functionality in parallel and then we could replace modules one by one with the Typebox implementations as we gain more confidence that it worked. This does have the annoying side effect of increasing the module's bundle size as we will include both typebox and ajv in the final output for a while, so happy to discuss other options. The current implementation allows users to opt into the new functionality by calling yf.typebox.<moduleName>().

For the naming of the typebox files I have adopted .tb.ts as a file extension to distinguish them from the main code path. Ideally I'd have liked to segregate them into a typebox directory, but ts-json-schema-generator and typescript were interacting badly and throwing all sorts of weird errors during schema generation (can't wait to take a flamethrower to that codegen step 🫠 ).

In terms of the mechanical process of converting typescript interfaces to Typebox, that part is pretty easy. You can basically copy the entire file wholesale into the tool that the maintainer of Typebox created and copy out the result with two important caveats:

  1. The output will convert number and date types to Type.Number() and Type.Date() respectively. Running the unit tests I found the JSON outputs in the code base actually encode these values in a variety of ways (those captured in yahooFinanceTypes.ts). So for these values I switched the generated types to YahooNumber and YahooFinanceDate.
  2. The nice comments with examples you wrote get stripped so they have to be copied back

I had a bit of a play around with a performance test without getting very far, generally it seems like the example files in the repo aren't large enough for a meaningful performance difference to manifest.

Keen to discuss what the steps are from here, I'm not sure whether we want to merge this PR or split it up a bit to make reviewing more tractable

@gadicc
Copy link
Owner

gadicc commented May 15, 2024

Amazing work as usual @eddie-atkinson - big thanks! 🙏

Not worried at all about including both ajv and typebox for now... the small difference in bundle size won't make a big difference outside of the browser. In any event, it won't be long lived.

So yeah, current parallel staging looks great. I have some thoughts on best next steps, but I'm just in the middle of some travel, so will need to get back. The code I've reviewed so far all looks great but I'm a few commits behind. Need to get back to you on the code comment above too (you're right, in theory, if it's only used internally, no need to validate at runtime - but I'll double check).

Yeah, all the tschema scripts etc, although they're all stable and work great (if you don't change anything), as you saw, it can create become a hassle, so yeah, another reason why I was excited form your proposal and to move to something both clearer and more maintainable :) Thanks also for appeasing the coverage gods! No easy feat :D

In short, thanks again, and will be in touch soon!

@gadicc
Copy link
Owner

gadicc commented May 26, 2024

@eddie-atkinson, thanks again for your patience. I'm still abroad and have had a lot less time online than anticipated. To that end, I'm very happy where this is all going, and in light of my limited time now, happy for you to take the lead on how best to implement this.

My original thoughts were to rebase and squash the commits down into 1) all preliminary support (packages, coersion, etc), 2) commit(s) for module(s). For the latter, I had originally thought it would be nice to deal with a single file, so the diff would show the exact changes to switch from the old way to the new way. But that's not mutually exclusive to the parallel track we've taken here... we can have both the original and .tb.ts files co-exist as you've described which takes off the pressure somewhat, and later down the line when we want the use the new versions only, we can just rename the files over the old versions and the diffs will still look great. Hope that makes sense! Those were my thoughts but as I said since I have a lot less time online than expected now, and you have all this stuff fresh in your mind, happy to go another route on this. To save some time and energy I also don't mind to squash everything as is into a single commit.

So, just let me know how you feel about everything and if there's still more immediate work you'd like to do or if things are ready to merge now and to continue on from there. And most importantly, thanks again for all your exceptional work on this and patience too! 🙏

@eddie-atkinson
Copy link
Author

Hi @gadicc,

I'm still abroad and have had a lot less time online than anticipated.

Enjoy it! There's always more time to sit in front of a computer. In this profession time away from the computer is an essential thing to keeping the fire burning over the long-term.

My original thoughts were to rebase and squash the commits down into 1) all preliminary support (packages, coersion, etc), 2) commit(s) for module(s).

I'm happy to work it this way. My plan for this PR was to try out a few modules and see how it all hangs together. After having done a few of the more hairy modules I'm confident we can handle all of them.

The rough plan I was thinking of was to:

  1. Make a PR for all the setup work (coercion, validateAndParse etc)
  2. Make separate PRs for each module

I was keen to make a separate PR for each module to make reviews more tractable as some of the modules are incredibly large. For the smaller ones I'm also happy to combine a few together, I just know search for example will be a bit of a hefty review.

I had originally thought it would be nice to deal with a single file, so the diff would show the exact changes to switch from the old way to the new way. But that's not mutually exclusive to the parallel track we've taken here... we can have both the original and .tb.ts

We might be able to have our cake and eat it here. Just move files like so search.ts -> search.ajv.ts and add a new search.ts file.

Anyways, happy to work the PR flow however it's going to make it easiest to review. I am conscious that this shouldn't result in breaking changes, but it's hard to be certain the TS interface won't change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants