Move ReferencedClassFinder to base #1987

johnynek · 2022-04-11T02:55:39Z

Ideally anything that does not increase the dependencies should be moved to base. By removing a function it looks like no one was using (and is just a trivial redirection to Config) we can do that here (although the tests do reference things in code).

cc @navinvishy @daniel-sudz I think we could use this code in the beam and spark backends in each stage of writes to try to create tokens for all the classes we can see. In kryo, this is registering classes so they have names. Note, I would sort the classes first so we assign the classes in a reproducible way that never depends on any iteration order of a hash set.

codecov-commenter · 2022-04-11T03:54:30Z

Codecov Report

Merging #1987 (fe647f8) into develop (6434348) will decrease coverage by 0.04%.
The diff coverage is n/a.

@@              Coverage Diff              @@
##             develop    #1987      +/-   ##
=============================================
- Coverage      37.82%   37.78%   -0.05%     
- Complexity      1116     1335     +219     
=============================================
  Files            316      315       -1     
  Lines          21117    21061      -56     
  Branches        2924     2874      -50     
=============================================
- Hits            7988     7957      -31     
+ Misses         12141    12123      -18     
+ Partials         988      981       -7

Impacted Files	Coverage Δ
...a/com/twitter/scalding/ReferencedClassFinder.scala

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6434348...fe647f8. Read the comment docs.

navinvishy · 2022-04-11T16:59:18Z

In the beam backend, I see that we're getting a kryo instantiator here. We're using KryoHadoop by default which gets Cascading serialization tokens and registers those classes with Kryo. Are you suggesting removing the dependency on KryoHadoop?

daniel-sudz

this change seems fine to me

I don't quite understand this part though:

I think we could use this code in the beam and spark backends in each stage of writes to try to create tokens for all the classes we can see. In kryo, this is registering classes so they have names. Note, I would sort the classes first so we assign the classes in a reproducible way that never depends on any iteration order of a hash set.

what is the use case for this? Having writers be aware of all exiting classes?

Move ReferencedClassFinder to base

fe647f8

daniel-sudz reviewed Apr 11, 2022

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move ReferencedClassFinder to base #1987

Move ReferencedClassFinder to base #1987

johnynek commented Apr 11, 2022

codecov-commenter commented Apr 11, 2022 •

edited

navinvishy commented Apr 11, 2022

daniel-sudz left a comment •

edited

Move ReferencedClassFinder to base #1987

Are you sure you want to change the base?

Move ReferencedClassFinder to base #1987

Conversation

johnynek commented Apr 11, 2022

codecov-commenter commented Apr 11, 2022 • edited

Codecov Report

navinvishy commented Apr 11, 2022

daniel-sudz left a comment • edited

Choose a reason for hiding this comment

codecov-commenter commented Apr 11, 2022 •

edited

daniel-sudz left a comment •

edited