Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored client library code for Data Types for simplicity and code reuse #1

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

gunjan-juyal
Copy link
Owner

@gunjan-juyal gunjan-juyal commented Sep 25, 2023

Refactored client library code for Data Types for simplicity and code reuse

This refactoring is attempting the following benefits:

  1. Code reuse for common value encoding conversions
  2. Reducing the number of places when adding a new data-type, especially the type-related conversion logic.
  3. Refactor the existing primitive types to use the new helper functions to simplify existing code and tests.

Reference - code change required for adding a new type:

Fixes - (Bug to be created) ☕️

[Note: This PR is raised on a fork of the java-spanner client repo. It is currently a prototype, and once things look promising a separate PR will be raised for merging this to the official repo.]

…-reuse.

This refactoring is attempting the following benefits:
1. Code reuse for common value encoding conversions
2. Reducing the number of places when adding a new data-type, especially the type-related conversion logic.
3. Refactor the existing primitive types to use the new helper functions to simplify existing code.
@gunjan-juyal gunjan-juyal self-assigned this Sep 26, 2023
@gunjan-juyal gunjan-juyal marked this pull request as draft September 26, 2023 05:27
@charvisingla
Copy link

These changes are great for a first pass.
I think the Type specific logic is still essentially scattered all over the place for non-primitive or special handling. A developer would still need to look at all the areas to figure out whether they need to add a special case etc. Can we make generic interfaces to be overriden for new types to limit all set of specialization logic for a new type in 1-2 files? Say in either Type and Value classes? or a new generic interface that covers all methods required for type specific logics with a default implementation for each?

… set, setting query parameters and mutations
Copy link

@thiagotnunes thiagotnunes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the ideas was to centralize Type specific logic such that when we add a new type we would only need to touch one file, maybe two files. I see that if I add a new type I would still need to change: potentially Type.java, TypeHelper.java, Value.java and potentially ChecksumResultSet.java.

Could we see if we can centralize this further?

Perhaps all of the following could live in a TypeMapper of sorts:

  1. Code to primitive type
  2. Code to array type
  3. Supported primitive type classes
  4. Extracting value using code
  5. Extracting array using code

return date();
Code typeCode = Code.fromProto(proto.getCode(), proto.getTypeAnnotation());
if (isPrimitiveTypeCodeSupported(typeCode)) {
System.out.format("Type from map: %s\n", CODE_TO_PRIMITIVE_TYPE.get(typeCode));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for catching. Will clean up

import com.google.cloud.spanner.Type.Code;
import java.util.List;

public final class TypeHelper {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: *Helper is too generic of a name, perhaps PrimitiveTypeMapper or something alike?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Naming is hard! Will think of an alternative, otherwise Mapper sounds good.
@thiagotnunes I have named the types supported by these generic methods/interfaces as "Primitive" types, since I do not intend to handle complex types such as Struct or Protos. I am not sure if this is indicative though. Do comment if this sounds alright, or if you can think of alternatives

@gunjan-juyal
Copy link
Owner Author

These changes are great for a first pass. I think the Type specific logic is still essentially scattered all over the place for non-primitive or special handling. A developer would still need to look at all the areas to figure out whether they need to add a special case etc. Can we make generic interfaces to be overriden for new types to limit all set of specialization logic for a new type in 1-2 files? Say in either Type and Value classes? or a new generic interface that covers all methods required for type specific logics with a default implementation for each?

@charvisingla Yes, in the first parse I tried to just identify and reduce the duplication wherever I found common code fragments. Special cases are still left untouched.

In a second parse I am now looking at these two as you suggested:

  1. Limiting all type-specific transcoding, casting etc to 1 or 2 places
  2. Introduce a default implementation that is sufficient for future simple types (e.g. Number-compatible types such as INT32 and String-compatible types such as JSON/JSONB in the past), and only custom logic needs to override this.

I had faced the following challenges in phase-1 while exploring ways to add new generic interfaces. Do share if you have any specific ideas or suggestions:

  1. Public interfaces: We should not break or deprecate any existing interfaces, and prefer to support existing conventions for future types - e.g. if we have a ValueBinder.to(boolean) for Boolean type then we should later add a ValueBinder.to(int) for INT32.
  2. Private or internal interfaces: The volume of existing work is huge, and much of this is tightly coupled to specific types. E.g. separate method signatures for each data type.

@gunjan-juyal
Copy link
Owner Author

One of the ideas was to centralize Type specific logic such that when we add a new type we would only need to touch one file, maybe two files. I see that if I add a new type I would still need to change: potentially Type.java, TypeHelper.java, Value.java and potentially ChecksumResultSet.java.

Could we see if we can centralize this further?

Perhaps all of the following could live in a TypeMapper of sorts:

  1. Code to primitive type
  2. Code to array type
  3. Supported primitive type classes
  4. Extracting value using code
  5. Extracting array using code

Yes, and also (6) Checksum calculation for each type. Attempting this now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants