Rust serialization #380

Open
acpeakhour opened this issue Mar 29, 2023 · 2 comments

Comments

@acpeakhour

Hi,

In a local branch I've added serialization using serde. One issue I came across is in nodestore.rs.

In the VectorNodeStore struct there is:

project_to_tree: fn(Vec<f32>) -> Vec<f32>,

which can't be serialized directly, so I had to resort to skipping that field and using a default initialiser.
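
For readers following along, here is a minimal sketch of that "skip the field, restore a default" workaround. It is not the code from the branch: to stay dependency-free it uses a hand-written byte encoding instead of serde, where the same idea would be expressed with `#[serde(skip, default = "...")]` on the field. Everything except `VectorNodeStore` and `project_to_tree` (the `dimensions` and `points` fields, `identity_projection`, `to_bytes`, `from_bytes`, `read_u64`) is a hypothetical stand-in.

```rust
// Hypothetical stand-in for the real struct; only `project_to_tree`
// and its type are taken from the issue.
struct VectorNodeStore {
    dimensions: usize,
    points: Vec<f32>,
    // Function pointers carry no serializable data, so this field is skipped.
    project_to_tree: fn(Vec<f32>) -> Vec<f32>,
}

// Assumed default initialiser: a projection that returns its input unchanged.
fn identity_projection(point: Vec<f32>) -> Vec<f32> {
    point
}

// Read a little-endian u64 at `at`, or None if the buffer is too short.
fn read_u64(bytes: &[u8], at: usize) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes.get(at..at + 8)?);
    Some(u64::from_le_bytes(buf))
}

impl VectorNodeStore {
    // Write every field except `project_to_tree`.
    fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(16 + 4 * self.points.len());
        out.extend_from_slice(&(self.dimensions as u64).to_le_bytes());
        out.extend_from_slice(&(self.points.len() as u64).to_le_bytes());
        for value in &self.points {
            out.extend_from_slice(&value.to_le_bytes());
        }
        out
    }

    // Rebuild the store, filling the skipped field with the default.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        let dimensions = read_u64(bytes, 0)? as usize;
        let len = read_u64(bytes, 8)? as usize;
        let body = bytes.get(16..)?;
        if body.len() != len.checked_mul(4)? {
            return None;
        }
        let points = body
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Some(Self {
            dimensions,
            points,
            project_to_tree: identity_projection,
        })
    }
}
```

The trade-off is that a restored store always gets the default projection, whatever function the original held, which is only safe while a single projection is in use.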

It would be great to have the serde annotations directly in the lib, and I am wondering as to your thoughts on how to address this issue.

@sudiptoguha
Contributor

Re "It would be great to have the serde annotations directly in the lib, and I am wondering as to your thoughts on how to address this issue." -- it's easy :) This is an Apache 2.0 project and contributions are welcome. But if for some reason that is not desirable, then that is ok too - we do plan to get to it in some time.

One of the lessons of the RCF journey was that serialization (or, more generally, how a model is consumed) impacts all notions of algorithmic complexity. If the model is deserialized and serialized on every input, that defines a different workload from sporadic ser-de. Ser-de is necessary for consumption, and it has two aspects: (i) performance and (ii) interoperability. Interoperability can be language aware or language agnostic. I think protobuf is an example of the first (my knowledge in this regard is limited), while text/JSON is language agnostic. Having a few representative serializations is sufficient; in the Java version we ended up just trying ProtoStuff and JSON/Jackson. The remainder of the effort can go to enabling features like project_to_tree (which is idempotent/trivial at the moment) :) One potentially nice thing about protobuf is that the same models could be passed between a Java and a Rust environment. I have myself used both of those environments simultaneously to debug.

Caveats: as newer usages appear, it is possible that the basic RCF will need an upgrade or re-orientation (for example, as soon as project_to_tree becomes a nontrivial projection). But if the ser-de object has a version string, then all of these are solvable. Testing serialization has also been an unclear area.
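
One way to picture the version idea is below: a rough sketch, not the library's format. It assumes a numeric tag in place of the version string and a hypothetical layout (4-byte little-endian version, then an opaque payload). The names `VERSION_IDENTITY`, `LoadError`, `Projection`, `encode` and `decode` are invented for illustration. The loader picks the projection from the version, so a later nontrivial project_to_tree could get its own version number without breaking old models.

```rust
// Hypothetical version tag: models saved while the projection is trivial.
const VERSION_IDENTITY: u32 = 1;

#[derive(Debug, PartialEq)]
enum LoadError {
    Truncated,
    UnsupportedVersion(u32),
}

type Projection = fn(Vec<f32>) -> Vec<f32>;

// Assumed trivial projection: returns its input unchanged.
fn identity_projection(point: Vec<f32>) -> Vec<f32> {
    point
}

// Prefix an already-serialized model with its format version.
fn encode(version: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&version.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

// Read the version first, then choose the projection that goes with it.
// Unknown versions are rejected instead of being loaded with a wrong default.
fn decode(bytes: &[u8]) -> Result<(Projection, &[u8]), LoadError> {
    if bytes.len() < 4 {
        return Err(LoadError::Truncated);
    }
    let (head, payload) = bytes.split_at(4);
    let version = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    match version {
        VERSION_IDENTITY => Ok((identity_projection as Projection, payload)),
        other => Err(LoadError::UnsupportedVersion(other)),
    }
}
```

Rejecting unknown versions also gives serialization tests something concrete to assert: a round trip for each supported version and an explicit error for everything else.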

@acpeakhour
Author

Our use case is to persist the model between restarts of the application. serde did work for this case; however, as you noted, it isn't exactly fast.
