HollowProducer.Blob.write(HollowBlobWriter writer) should be public #524

Open
yonatang opened this issue Mar 19, 2021 · 0 comments

@yonatang
Contributor

A Relevant Use Case

There are use cases where adding objects directly to a HollowObjectMapper is a better choice than using runCycle. For example, suppose you have a file of user details that contains both sellers and buyers, and you want to generate two separate, independent blobs (one for sellers, one for buyers) while reading the list of users only once. This is not possible with Producer.runCycle, but it is fairly easy with two HollowWriteStateEngine instances and two HollowObjectMapper instances:

  val sellerObjectMapper = /*...*/
  val buyerObjectMapper = /*...*/
  // Single pass over the users; each record is added to exactly one mapper
  users.forEach { user ->
      when (user.type) {
          "seller" -> sellerObjectMapper.add(user)
          "buyer" -> buyerObjectMapper.add(user)
      }
  }

The Issue

Things get complicated when you want to use a publisher to publish these states. To do so, you need to create a HollowProducer.Blob, because the HollowPublisher interface accepts only the Blob type. Hollow provides two Blob types, in-memory and filesystem, which can be created using the factories in Hollow(Filesystem|InMemory)BlobStager. In both cases, however, it is impossible to populate them with a HollowBlobWriter, because Blob.write(HollowBlobWriter writer) is protected. This tightly couples the publisher to the producer: the publisher cannot be reused anywhere outside a HollowProducer.

The Solution

If someone would like to reuse their custom publisher (e.g. a publisher that uploads the blob to their internal corporate object storage while handling auth, etc.) without using a producer, it would be difficult. The possible options are:

  • Use a hack (reflection, writing to the FilesystemBlob file directly, etc.);
  • Extend the publisher to bypass the Blob abstraction and accept an InputStream plus metadata; or
  • Extend the BlobStager and its Blob, just to expose the write method.
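
To illustrate the third option, here is a minimal, self-contained sketch of what "extending the Blob just to expose the write method" amounts to. It does not use Hollow itself: StandInWriter, StandInBlob, StandInMemoryBlob, WritableBlob and publish are all hypothetical stand-ins that only mimic the shape described above (a blob type with a protected write and a publisher that accepts only the blob abstraction). It relies on the Java rule that an overriding method may widen visibility from protected to public.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ExposeWriteSketch {

    /** Hypothetical stand-in for HollowBlobWriter: serializes some state to a stream. */
    public static class StandInWriter {
        private final String state;

        public StandInWriter(String state) {
            this.state = state;
        }

        public void writeSnapshot(ByteArrayOutputStream out) {
            byte[] bytes = state.getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
    }

    /** Hypothetical stand-in for HollowProducer.Blob: write is protected. */
    public abstract static class StandInBlob {
        protected abstract void write(StandInWriter writer);

        public abstract byte[] contents();
    }

    /** Hypothetical stand-in for an in-memory blob handed out by a stager. */
    public static class StandInMemoryBlob extends StandInBlob {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        @Override
        protected void write(StandInWriter writer) {
            buffer.reset();
            writer.writeSnapshot(buffer);
        }

        @Override
        public byte[] contents() {
            return buffer.toByteArray();
        }
    }

    /** The workaround: a subclass whose only job is to widen write to public. */
    public static class WritableBlob extends StandInMemoryBlob {
        @Override
        public void write(StandInWriter writer) {
            super.write(writer);
        }
    }

    /** Hypothetical publisher that only knows about the blob abstraction. */
    public static String publish(StandInBlob blob) {
        return new String(blob.contents(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // In real code the caller would live in another package, where the
        // protected write is not reachable; WritableBlob makes it callable.
        WritableBlob sellers = new WritableBlob();
        sellers.write(new StandInWriter("sellers-snapshot"));
        System.out.println(publish(sellers)); // prints "sellers-snapshot"
    }
}
```

The subclass adds no behavior of its own, which is the point of the complaint: it exists purely to change an access modifier.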

Either way it is a no-win situation.

I believe the right thing to do is to make Blob.write() public. I don't see any harm in publicly allowing writes to blobs (increasing visibility is a non-breaking change). And if blobs can be created publicly outside of a producer, it only makes sense to also allow using (read: writing into) them. In any case, I think the alternatives are far worse for the use case I've described.

Of course, if I'm misusing Hollow and there is a better way to achieve my goals for this use case without modifying Hollow, I'd be delighted to learn it. Note that a splitter or a filter is not a viable solution, because the majority of the data is deeply nested in the user's record (for the sake of the example), and neither would remove orphaned nested types.
