Documentation #626

Open
JohannesLichtenberger opened this issue Jun 19, 2023 · 23 comments

Comments

@JohannesLichtenberger
Member

We'd have to write documentation about the overall architecture, the secondary indexes, and the path summary...

JohannesLichtenberger self-assigned this Jun 19, 2023

@Aminmalek
Contributor

Aminmalek commented Jun 21, 2023

I'm interested in doing this task.

@JohannesLichtenberger
Member Author

I think we should probably use this: https://sirix-docs.readthedocs.io/en/latest/

Currently, some documentation is linked here: https://sirix.io/documentation.html

However, I'm creating new diagrams in the sirix/images folder (using Excalidraw)

@Aminmalek
Contributor

How can I help with the new docs?

@JohannesLichtenberger
Member Author

You could, for instance, check if you can set up a SirixDB server and check if the documentation is correct.

Other than that maybe you can add the XQuery/JSONiq functions to the new docs...

BTW: What's your opinion on using readthedocs.io?

@Aminmalek
Contributor

Aminmalek commented Jun 21, 2023

Sure, I'll be happy to help with that. Please add me to this task! Regarding readthedocs: I have experience with it from working on small Python/Django libraries, such as the example you provided. In my opinion, this kind of documentation can be a bit tedious and difficult to read for beginners, but it is very useful and makes generating the docs fast and easy. Personally, I prefer documentation formats like those used by the Spring project and similar frameworks.

@JohannesLichtenberger
Member Author

We could of course also stick to the sirix.io markdown files for instance. Maybe also a complete redesign of the website would be amazing, but yeah...

@JohannesLichtenberger
Member Author

We also need Tutorials, HowTo Guides... https://youtu.be/t4vKPhjcMZg

@Aminmalek
Contributor

Yes, I feel the lack of tutorials and good documentation too :) I will help you with this. I need some time to read the code and understand it. Can you please help me with that? Where should I start, and which parts should I read?

@JohannesLichtenberger
Member Author

You can, for instance, check the usage of JsonDocumentCreator and debug a little bit.

I think a top-down approach might be best.

In general there's a Database instance which encapsulates Resources, the equivalent to tables in a relational database system. These resources are either JSON or XML based (we store a binary encoding of a tree, think of it as a persistent DOM -- firstChild/lastChild/parent/leftSibling/rightSibling encoding).
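
The firstChild/lastChild/parent/leftSibling/rightSibling encoding mentioned above can be illustrated with a small, self-contained sketch. This is not SirixDB code: the `Node` and `Tree` classes, the integer node keys, and the string labels such as "key:a" are all assumptions made up for illustration. It only shows how a JSON document might be navigated purely through these five pointers.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class Node:
    """One tree node; every pointer is a node key, or None when absent."""
    key: int
    value: object
    parent: Optional[int] = None
    first_child: Optional[int] = None
    last_child: Optional[int] = None
    left_sibling: Optional[int] = None
    right_sibling: Optional[int] = None


class Tree:
    """Hypothetical in-memory tree; nodes are looked up by key, not by reference."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self._next_key = 0

    def add(self, value: object, parent: Optional[int] = None) -> int:
        """Append a node as the last child of `parent` and return its key."""
        key = self._next_key
        self._next_key += 1
        node = Node(key=key, value=value, parent=parent)
        self.nodes[key] = node
        if parent is not None:
            p = self.nodes[parent]
            if p.last_child is None:
                p.first_child = key
            else:
                # Link the new node to the previous last child.
                self.nodes[p.last_child].right_sibling = key
                node.left_sibling = p.last_child
            p.last_child = key
        return key

    def children(self, key: int) -> Iterator[Node]:
        """Start at first_child, then follow right_sibling pointers."""
        child = self.nodes[key].first_child
        while child is not None:
            yield self.nodes[child]
            child = self.nodes[child].right_sibling


# Encode {"a": 1, "b": [true, null]} as a tree.
tree = Tree()
root = tree.add("object")
a = tree.add("key:a", root)
tree.add(1, a)
b = tree.add("key:b", root)
arr = tree.add("array", b)
tree.add(True, arr)
tree.add(None, arr)

child_values = [n.value for n in tree.children(root)]
array_values = [n.value for n in tree.children(arr)]
```

Because every pointer is a key rather than an object reference, such nodes can be serialized into pages and looked up again later, which is roughly what "persistent DOM" suggests here.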

Then, from the database instance, you can create a new resource or open a resource session to start N read-only trxs or a single read-write trx. Each JsonNodeTrx or JsonNodeReadOnlyTrx has a page reading trx dependency, which is essentially the storage engine (I think we could also rename the classes at some point ;-)). The page reading trx has a reader/writer dependency, which basically writes the pages to the storage device. Currently that means files, via "normal" FileChannel-based I/O, memory-mapped files, or io_uring. The io_uring variant is somehow slower than "normal" I/O at the moment, maybe due to the event loop used in the library we use, but I'm not sure. We're also currently working on a file-based + async mechanism to additionally store the pages in S3 buckets, for instance.
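
As a rough mental model of the transaction rule described above, here is a toy sketch of a session that hands out any number of read-only trxs but at most one read-write trx at a time. The class `ResourceSession` below and its method names are hypothetical and are not the SirixDB API. It also assumes that readers and the single writer may be open at the same time, which the comment above does not state explicitly.

```python
import threading


class ResourceSession:
    """Toy session: many read-only trxs, at most one read-write trx."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._writer_open = False
        self.open_readers = 0

    def begin_read_only_trx(self) -> str:
        """Always succeeds; returns a label standing in for a trx handle."""
        with self._lock:
            self.open_readers += 1
            return f"reader-{self.open_readers}"

    def begin_read_write_trx(self) -> str:
        """Fails if another read-write trx is still open."""
        with self._lock:
            if self._writer_open:
                raise RuntimeError("a read-write trx is already open")
            self._writer_open = True
            return "writer"

    def close_read_write_trx(self) -> None:
        with self._lock:
            self._writer_open = False


session = ResourceSession()
readers = [session.begin_read_only_trx() for _ in range(3)]
writer = session.begin_read_write_trx()

try:
    session.begin_read_write_trx()
    second_writer_rejected = False
except RuntimeError:
    second_writer_rejected = True

# Once the writer is closed, a new read-write trx can be started.
session.close_read_write_trx()
next_writer = session.begin_read_write_trx()
```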

The architecture is a huge tree of tries basically and new revisions are always appended. The data / key/value pages of the tries store the actual nodes (of the JSON or XML trees) or they store secondary indexes...

@Aminmalek
Contributor

Thank you for your very helpful comment. I will start reading right now :)

@Aminmalek
Contributor

Can you please assign this task to me?

@JohannesLichtenberger
Member Author

Will do, once I'm back home. I can't find the button on my phone :-D BTW: you can also check the existing documentation, and I hope that even the Excalidraw images provide a bit of an architecture overview (for instance, how a JSON document is mapped to the tree structure). That said, I still want to work on a new technical document about the concepts and architecture using the new illustrations/images...

@Aminmalek
Contributor

Sure, I will. Can we keep in touch via email? I will have questions about the code and architecture :))

@JohannesLichtenberger
Member Author

We have a discord channel, you can join. The link is in the README.

@Aminmalek
Contributor

First things first, the issue with the documentation is that it's hard to find specific information. For example, it's difficult to locate instructions for starting, running, and using the database. I think we need to organize it better and give users clearer navigation.

@JohannesLichtenberger
Member Author

Sounds great. We need more examples in the sirix-examples bundle, and we also need tutorials and HowTos. But yes, getting Sirix set up as a dependency, as an embedded DBS, would probably be a first step. Also, the examples bundle isn't mentioned anywhere, IIRC...

@Aminmalek
Contributor

Aminmalek commented Jul 4, 2023

The cert file is hard for users to find here:
>>> client = httpx.Client(base_url="https://localhost:9443", verify=<path/to/cert.pem/in/resources/folder>)
I found it by searching the codebase in /opt/.

@Aminmalek
Contributor

But the cert files I found are empty.
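
A quick way to check whether a candidate cert.pem is actually usable before handing it to `verify=` is to test that it is non-empty and that Python's standard `ssl` module can load it as a CA file. The helper name `usable_ca_file` and the temporary file names below are made up for this sketch. It says nothing about where SirixDB keeps its real certificate.

```python
import ssl
import tempfile
from pathlib import Path


def usable_ca_file(path: str) -> bool:
    """Return True if `path` is a non-empty file that loads as a PEM CA file."""
    candidate = Path(path)
    if not candidate.is_file() or candidate.stat().st_size == 0:
        return False
    try:
        # Raises ssl.SSLError if the file contains no valid certificate.
        ssl.create_default_context(cafile=str(candidate))
    except (ssl.SSLError, OSError):
        return False
    return True


# Demonstrate the failure cases described in this thread.
with tempfile.TemporaryDirectory() as tmp:
    empty = Path(tmp) / "cert.pem"
    empty.touch()  # an empty file, like the ones found under /opt/
    garbage = Path(tmp) / "garbage.pem"
    garbage.write_text("not a certificate\n")

    empty_ok = usable_ca_file(str(empty))
    garbage_ok = usable_ca_file(str(garbage))
    missing_ok = usable_ca_file(str(Path(tmp) / "missing.pem"))
```

If the helper returns True for a file, that path is a reasonable candidate for the `verify=` argument in the `httpx.Client(...)` call quoted above.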

@JohannesLichtenberger
Member Author

A bunch of new architecture diagrams and a reworked text are here: https://sirix.io/docs/concepts.html. However, I'll keep on adding more stuff...

@JohannesLichtenberger
Member Author

We'd need tutorials and how-tos :-)

@henrycotton1

Hey, I'm happy to work on some tutorials and how-tos as well as further documentation. Let me know if I can start working on this.

@Aminmalek
Contributor

We can work on this together.

@Aminmalek
Contributor

@henrycotton1 I replied to your email weeks ago. Let's plan this.
