Eliot + Jupyter #457

Open
timothyb0912 opened this issue Dec 6, 2020 · 3 comments

timothyb0912 commented Dec 6, 2020

Hi @itamarst, thanks for this great package!

I've recently gotten interested in using Eliot, especially in using it programmatically and through Jupyter notebooks.

I saw here, in the "The future: Eliot + Jupyter" section, that you were interested in others exposing Eliot logs through Jupyter. Are you still interested in this?

I have a proof of concept in this gist.
It uses treelib to draw tree-diagrams of the logs, based on the eliottree package for inspiration.

Is this the general direction that you'd like to see Eliot go?
I'm happy to add tests, open a PR to add this functionality, and work with you to get it done.
Just let me know, and thanks again.

Owner

itamarst commented Dec 7, 2020

Hi,

Glad you're interested, and finding it useful.

By the way, you can use eliot.parse for parsing; it also supports missing messages and out-of-order messages, which can become useful when doing threaded or multi-process tracing. And I wonder if eliottree would work as a library; I'm not sure the author considered that as a use case, but you can ask.

I'd love to hear about your use case in more detail:

  • My thought during that talk was that Eliot can complement Jupyter: Jupyter is great at exposing internal implementation details so long as code is in a Jupyter cell. As soon as you hit a library (or you're not using Jupyter), you no longer have that visibility. So my vague notion was "dump to disk, then explore in Jupyter".
  • What you're doing is probably going beyond what I was doing, in that you're using it in Jupyter code where you could in theory just be doing display(). And... now that I see it being done, that does seem like it might be useful, e.g. once you get complex recursion or repetition, or when you want to delay reporting information.
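A minimal sketch of the "dump to disk, then explore in Jupyter" idea, assuming the log is a file of JSON lines with one Eliot message per line. The helper names `load_messages` and `group_by_task` and the sample messages are hypothetical; only the `task_uuid`, `task_level`, `action_type`, `action_status`, and `message_type` fields come from Eliot's message format.

```python
import json
from collections import defaultdict


def load_messages(lines):
    """Parse an iterable of JSON lines into message dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def group_by_task(messages):
    """Group messages by task_uuid and order each group by task_level."""
    tasks = defaultdict(list)
    for message in messages:
        tasks[message["task_uuid"]].append(message)
    for group in tasks.values():
        # task_level is a list of ints, so plain list comparison gives tree order.
        group.sort(key=lambda m: m["task_level"])
    return dict(tasks)


# Stand-in for an open log file: hand-written lines in Eliot's JSON shape,
# deliberately out of order to show the sorting step.
sample_lines = [
    '{"task_uuid": "a", "task_level": [2, 1], "message_type": "demo:value", "value": 3}',
    '{"task_uuid": "a", "task_level": [1], "action_type": "demo:run", "action_status": "started"}',
]

tasks = group_by_task(load_messages(sample_lines))
```

In a notebook, `sample_lines` would be replaced by an open file object, and `tasks` could then be inspected cell by cell.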

So, yes, this does seem interesting. However, the idioms and requirements are not clear to me yet.

As a starting point, I suggest you just go and do it for whatever your real-world goals are (and I'd love to hear them), and when you hit roadblocks, ask here and we can figure it out. The output might then be some features, or bugfixes, or documentation, or all of them.

Does that make sense?

Author

timothyb0912 commented Dec 9, 2020

Yeah, definitely. I'll make use of Eliot for my current projects, and when I'm done, I'll have a more detailed set of feature requests, usage notes, and possibly bugfixes to share. These learnings can be used to figure out how best to expose Eliot's logs in Jupyter notebooks.

Here's my use case. I work as a data scientist, and at the completion of any proof of concept, I often have a collection of untested Jupyter notebooks. Alternatively, I inherit the legacy code of others, in a notebook or outside of it. At either point, I'll begin using test-driven development to rewrite the code in a well-designed manner or simply to learn how it works for myself. However, I'll want to be able to ensure that my new code preserves the functionality of my old code.

One way to do this is to execute many examples with both the original and new code to ensure equality. If differences are discovered, they indicate missing specifications / tests for my new code or bugs in the old code. Assuming the original code was correct, we'll want to add tests that guide us to remove the differences in output between the old and new code. Ideally, the test to be added would revolve around the comparison of the first internally computed value (as logged by Eliot) that exists but differs in the two code bases, given identical inputs.
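The comparison described above could be sketched roughly as follows. This is an illustration under stated assumptions, not part of Eliot: `first_difference`, `key_of`, and `payload` are hypothetical helpers, the sample messages are made up, and each message or action name is assumed to be logged at most once per run.

```python
# Fields Eliot attaches to every message; they differ between runs by design,
# so they are excluded from the comparison.
BOOKKEEPING = {"task_uuid", "task_level", "timestamp"}


def key_of(message):
    """Identify a message by its type plus action status (None for plain messages)."""
    name = message.get("action_type") or message.get("message_type")
    return (name, message.get("action_status"))


def payload(message):
    """Return the message without Eliot's per-run bookkeeping fields."""
    return {k: v for k, v in message.items() if k not in BOOKKEEPING}


def first_difference(old_messages, new_messages):
    """Return (key, old_payload, new_payload) for the first key that both runs
    logged with different payloads, or None if there is no such key."""
    new_by_key = {}
    for message in new_messages:
        new_by_key.setdefault(key_of(message), payload(message))
    for message in old_messages:
        key = key_of(message)
        if key in new_by_key and payload(message) != new_by_key[key]:
            return key, payload(message), new_by_key[key]
    return None


# Hypothetical logs from the legacy code and its rewrite, given identical inputs.
old_run = [
    {"task_uuid": "a", "task_level": [1], "timestamp": 1.0,
     "message_type": "step:mean", "value": 2.0},
    {"task_uuid": "a", "task_level": [2], "timestamp": 1.1,
     "message_type": "step:scaled", "value": [0.5, 1.5]},
]
new_run = [
    {"task_uuid": "b", "task_level": [1], "timestamp": 5.0,
     "message_type": "step:mean", "value": 2.0},
    {"task_uuid": "b", "task_level": [2], "timestamp": 5.1,
     "message_type": "step:extra", "value": 0},
    {"task_uuid": "b", "task_level": [3], "timestamp": 5.2,
     "message_type": "step:scaled", "value": [0.5, 1.0]},
]

difference = first_difference(old_run, new_run)
```

Here `difference` points at `step:scaled`, the first value that both code bases log but disagree on; `step:extra` is ignored because it exists only in the new run.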

Does this make sense? I want to use legacy code + examples + Eliot as a test oracle for the creation of test-driven-code that replaces the legacy code.

To start off, how would I use eliot.parse to parse the messages? I tried that originally, but I ran into errors when parsing, and I never got it to work.

Given the list of logged messages collected by the destination registered with eliot.add_destination, how would I parse them using eliot.parse? Can you show an example? It seemed to fail because I was using a list and not a stream from a file object as in the docs here.

Owner

itamarst commented Dec 9, 2020

Ooh, neat. For one of my consulting clients we were talking about a similar problem, seeing if a new version of e.g. Pandas broke code, and I did a little prototyping (and very briefly considered Eliot), but we didn't end up going down that route. Interested to hear how it works for you.

So yeah, makes total sense.

I believe you should be able to do:

from eliot.parse import Parser

tasks = list(Parser.parse_stream(your_list_of_messages))
