Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python interface #48

Closed
ejgallego opened this issue Mar 23, 2018 · 19 comments · May be fixed by #240
Closed

Python interface #48

ejgallego opened this issue Mar 23, 2018 · 19 comments · May be fixed by #240

Comments

@ejgallego
Copy link
Owner

We should provide a Python binding for SerAPI as this is what many people actually use.

In principle, we should have a doc object, with add and cancel methods, as well as a coqObject one with the proper subclasses.

I am not sure what is the best way to proceed here, I guess some help from Python experts would be welcome.

@ejgallego
Copy link
Owner Author

cc #17

@ejgallego ejgallego added this to the 0.7.0 milestone Dec 7, 2018
@brando90
Copy link

brando90 commented Feb 2, 2019

I am interested in a Python way to use SerAPI. At what stage is this?

I have 4 questions/ideas:

    1. Is there python interface version I can try?
    1. Does one for a specific use case already exists that I may start trying? i.e. someone has already integrated Serapi with python?
    1. I could be the first one to do it? Can I help build it or something?
    1. Or is the language agnostic way the only/best way to do it now?

@ejgallego
Copy link
Owner Author

I am interested in a Python way to use SerAPI. At what stage is this?

I have thought a lot about ways to do it well, however not having a Python expert close meant that I moved it down on my priority list. But I am personally very interested in making it happen so I'd be glad to help.

First the questions:

    1. Is there python interface version I can try?
    1. Does one for a specific use case already exists that I may start trying? i.e. someone has already integrated Serapi with python?

Unfortunately not.

    1. I could be the first one to do it? Can I help build it or something?

Sure, I'd be able to help a lot too!

    1. Or is the language agnostic way the only/best way to do it now?

I see a huge value in having a small Python library that exposes the methods of the protocol; the thing is, what'd be the best way to do it.

Interfacing Python with SerAPI directly could prove costly. One key goal of SerAPI is to have easy maintenance and some parts of the serialization are heavy. Despite that, I think that a good solution is doable, the two main challenges are:

  • Async: interaction with Coq is by nature async, what is the best Python idiom to handle async programming? Once we figure this out, it should be a matter of hours to have something working.
  • Data representation: presenting Coq objects as python objects may be pretty heavy. But we could go two routes: have the Python API present objects as SEXPS, or indeed write a serializer from SEXP to objects. This can be done at different levels, for example in https://github.com/ejgallego/ocaml-ts there is a shim that connects OCaml's AST to a printer [for example a SEXP to Python object transformer]

@brando90
Copy link

brando90 commented Feb 5, 2019

Sounds good! I'm happy to get together to help and chat about the interface/API/design etc. (if thats helpful/useful).

As a selfish comment, I am interesting to keeping in mind some of the functionality of TCoq in mind (#109), specially exposing compund tactics and their intermediate proof states. Just leaving this comment here so I don't forget (could be low priority etc).

Yes, I do think having Python objects will be useful for the useful.

@ejgallego
Copy link
Owner Author

Sounds good! I'm happy to get together to help and chat about the interface/API/design etc. (if thats helpful/useful).

That's great!

As a selfish comment, I am interesting to keeping in mind some of the functionality of TCoq in mind (#109), specially exposing compund tactics and their intermediate proof states. Just leaving this comment here so I don't forget (could be low priority etc).

I'd suggest we start with an example then; please write a small proof script and what you would expect from it. Then, we can first see if the generic protocol can handle it; and if yes, we could see what the most convenient Python interface would be.

@brando90
Copy link

brando90 commented Feb 5, 2019

this theorem might be more advance than my knowledge of coq but I had in mind something like this (example from http://geocoq.github.io/GeoCoq/html/GeoCoq.Highschool.midpoint_thales.html):

Lemma midpoint_thales_reci :
  ∀ a b c o: Tpoint,
   Per a c b →
   Midpoint o a b →
   Cong o a o b ∧ Cong o b o c.
Proof.
intros.

induction (Col_dec a b c).

induction (l8_9 a c b H);
treat_equalities;assert_congs_perm;try split;finish.
assert_diffs.
assert_congs_perm.
split.
Cong.
assert(Hmid := midpoint_existence a c).
destruct Hmid.
assert(Hpar : Par c b x o).
apply (triangle_mid_par c b a o x);finish.
assert(Hper : Perp c b c a)
 by (apply perp_left_comm;apply per_perp;Perp).
assert(HH := par_perp_perp c b x o c a Hpar Hper).
assert(Hper2 : Perp c x o x).
  apply (perp_col c a o x x).
  assert_diffs.
  finish.
  Perp.
  assert_cols;Col.
assert_diffs.
assert (Per o x c)
 by (apply perp_per_2;Perp).

unfold Per in H8.
destruct H8.
spliter.
apply l7_2 in H8.
assert(HmidU := l7_9 a x0 x c H2 H8).
subst.
unfold Midpoint in H2.
spliter.
eCong.
Qed.

steps like: treat_equalities;assert_congs_perm;try split;finish. are treated like a single compound tactic that affects multiple proof states (context+goal) in the proof tree. Ideally I'd like to control the granularity of this (getting hidden proof states at any granularity if I want to). So be able to break it down into its atomic (or sub compunded) tactics and perhaps more importantly, their proof states (and proof tree).

@brando90
Copy link

brando90 commented Feb 5, 2019

This is one of the difference with TCoq (that I am aware of). TCoq records the intermediate proof states encountered as opposed to being an API for machine-machine or human-machine interaction (as far as I know).

@ejgallego
Copy link
Owner Author

steps like: treat_equalities; assert_congs_perm; try split; finish. are treated like a single compound tactic that affects multiple proof states (context+goal) in the proof tree. Ideally I'd like to control the granularity of this (getting hidden proof states at any granularity if I want to). So be able to break it down into its atomic (or sub compunded) tactics and perhaps more importantly, their proof states (and tree).

Here we'd need a more concrete specification of what you mean by intermediate states. I see that TCoq instruments LTAC in a particular way, however it doesn't instrument proofview so the kind of traces you get would seem to be a bit arbitrary. [I didn't look at tcoq in depth tho] Note that the notion of state is a bit trickier than expected due to the "multi-goal" model of the 8.5+ proof engine.

I suggest we start with a very simple example with tactics a;b and we see what you would like to get.

As for now, SerAPI does allow to get the state before and after a;b execution. We could get the state just after a [just by running a speculatively], but if you instrument you may be many checkpoints depending on what the definition of a is.

@palmskog
Copy link
Collaborator

palmskog commented Feb 6, 2019

@brando90 I would suggest starting really minimalistically, e.g., a proof with 1-2 lines of proof scripts from here or here.

It may also help if you understand how multi-goal tactic systems work, one good source is a short note by Arnaud Spiwack.

@brando90
Copy link

brando90 commented Feb 7, 2019

One idea I had in mind was to try to reimplement something similar to TCoq but with a better-maintained library like SerAPI. Though I think you guys are right. I should think clearly what I want first and then we can discuss.

Yes, I will check something simple first and make sure I understand multi-goal tactic systems by reading what you posted. Thanks!

Hope to chat soon in the WG. :)

@ejgallego ejgallego modified the milestones: 0.7.0, 0.7.1 Oct 25, 2019
@ejgallego ejgallego modified the milestones: 0.7.1, 0.11.0 Jan 24, 2020
@ejgallego ejgallego modified the milestones: 0.11.0, 0.11.1 May 13, 2020
@ejgallego ejgallego modified the milestones: 0.11.1, 0.12.0 May 27, 2020
@ejgallego ejgallego modified the milestones: 0.12.0, 0.12.2 Aug 27, 2020
@ejgallego ejgallego modified the milestones: 0.12.2, 0.13.1 Mar 12, 2021
ejgallego added a commit that referenced this issue Sep 22, 2021
This PR adds a python interface to the SerAPI protocol using
`pythonlib`, which allows to call OCaml functions from Python.

Current setup is based on
https://github.com/jonathan-laurent/pyml_example , thanks a lot!

But a lot more work is needed in order for this to be useful and to
adhere to Python conventions.

Requires pinning `ppx_python` to commit
janestreet/ppx_python@86db99c

Closes: #48
ejgallego added a commit that referenced this issue Sep 23, 2021
This PR adds a python interface to the SerAPI protocol using
`pythonlib`, which allows to call OCaml functions from Python.

Current setup is based on
https://github.com/jonathan-laurent/pyml_example , thanks a lot!

But a lot more work is needed in order for this to be useful and to
adhere to Python conventions.

Requires pinning `ppx_python` to commit
janestreet/ppx_python@86db99c

Closes: #48
@ejgallego
Copy link
Owner Author

This took some time, but it is currently "solved" by https://github.com/ejgallego/pycoq

However, due to toolchain issues, python support has not yet landed in the main branches, so we will keep this issue open until indeed python support is part of an official release.

ejgallego added a commit that referenced this issue Sep 23, 2021
We add Python serialization for the complete protocol, modulo the
existing issues the current setup seems to work well!

Main hiccup was the lack of variants support
janestreet/ppx_python#4 , but fortunately it
was easy to work around.

Thanks to the `ppx_python` team for their quick resolution of issue
janestreet/ppx_python#1 which was essential
to get this commit working.

This PR just takes care of the serialization, main Python support is
done now in https://github.com/ejgallego/pycoq

Thus, this PR closes #48
@brando90
Copy link

@ejgallego how is that different from this version of pycoq? https://github.com/IBM/pycoq

@ejgallego
Copy link
Owner Author

Hey @brando90 ; oh, I totally missed that project when doing some search ; maybe I saw it back in March when I started PyCoq' :S would be helpful if people would add their tools to https://github.com/ejgallego/coq-serapi#clients-and-users so we keep track of them and can coordinate.

From a technical point of view, the projects are very different, IBM's PyCoq relies on serapi to access Coq, and this involves RPC and is not very flexible.

Our own PyCoq does instead create a Python .so object with the full Coq implementation, and doesn't rely on RPC, but links to Coq directly. This has several advantages, in particular, it should be possible to make the API much more idiomatic [as outlined in the issues] and avoiding RPC may matter in a lot of scenarios where for example performance is important. Maintenance should be also easier, tho that's still to see as the PyML stack is a heavy one too.

@brando90
Copy link

Thanks for the reply Emilio! A few questions I am curious about (+some comments):

Hey @brando90 ; oh, I totally missed that project when doing some search ; maybe I saw it back in March when I started PyCoq' :S would be helpful if people would add their tools to https://github.com/ejgallego/coq-serapi#clients-and-users so we keep track of them and can coordinate.

Agreed! Perhaps a better advertisement in the readme would be useful?

From a technical point of view, the projects are very different, IBM's PyCoq relies on serapi to access Coq, and this involves RPC and is not very flexible.

I am curious, why are RPCs not flexible? In what ways are they not flexible? Can't they do everything the server (serapi) does? So if it's not flexible would it be because Coq or serapi aren't flexible? Curious to understand (and hopefully these convs help us build a better tool for all of us)

Our own PyCoq does instead create a Python .so object with the full Coq implementation, and doesn't rely on RPC, but links to Coq directly. This has several advantages, in particular, it should be possible to make the API much more idiomatic [as outlined in the issues]

What do you mean by idiomatic here? Like that it looks like "the idioms of python" - pythonic?

and avoiding RPC may matter in a lot of scenarios where for example performance is important.

This makes sense. I guess the main reason is likely because we are skipping serapi alltogether and talking to coq directly? But perhaps some benchmarking would be nice and perhaps with things like asynchronous code (e.g. python 's asyncio's async and await might remove that issue in python).

Maintenance should be also easier, tho that's still to see as the PyML stack is a heavy one too.

Thanks in advance! Hope to try them soon and hopefully provide good feedback.

@ejgallego
Copy link
Owner Author

I am curious, why are RPCs not flexible?

You are limited by the RPC interface, in this sense you have several limits: for example, depend on the chosen serialization format; usually, you must serialize full calls to in the RPC, which is very costly, or implement complex call back mechanism, which bring their own gargabe-collection issues. Also, you depend on the public RPC interface, so if the API doesn't suit you, bad luck.

Agreed! Perhaps a better advertisement in the readme would be useful?

Maybe! If you have any suggestion please submit a PR!

What do you mean by idiomatic here? Like that it looks like "the idioms of python" - pythonic?

See for example issue ejgallego/pycoq#4

Indeed, it would be nice if we could generate a better object interface than the current, low level one.

This makes sense. I guess the main reason is likely because we are skipping serapi alltogether and talking to coq directly?

We are replacing printing and parsing for direct object serialization, you can bench, but it is for sure faster.

Looking forward to feedback!

@ejgallego ejgallego modified the milestones: 0.13.1, 0.13.2 Sep 24, 2021
@brando90
Copy link

I am curious, why are RPCs not flexible?

You are limited by the RPC interface, in this sense you have several limits: for example, depend on the chosen serialization format; usually, you must serialize full calls to in the RPC, which is very costly, or implement complex call back mechanism, which bring their own gargabe-collection issues. Also, you depend on the public RPC interface, so if the API doesn't suit you, bad luck.

Agreed! Perhaps a better advertisement in the readme would be useful?

Maybe! If you have any suggestion please submit a PR!

What do you mean by idiomatic here? Like that it looks like "the idioms of python" - pythonic?

See for example issue ejgallego/pycoq#4

Indeed, it would be nice if we could generate a better object interface than the current, low level one.

This makes sense. I guess the main reason is likely because we are skipping serapi alltogether and talking to coq directly?

We are replacing printing and parsing for direct object serialization, you can bench, but it is for sure faster.

I am curious, why is it faster? I guess I am still struggling to form a model of how this is different from RPCs. Both have to eventually or somehow parse Coq - no?

Looking forward to feedback!

:)

@ejgallego
Copy link
Owner Author

I am curious, why is it faster? I guess I am still struggling to form a model of how this is different from RPCs. Both have to eventually or somehow parse Coq - no?

We don't parse the protocol when using the non-RPC model, we build a Python object in direct style, using https://github.com/thierry-martinez/pyml , so that's faster and way more flexible in terms of sharing and delaying serialization.

Some people is really sending huge objects, and they reported parsing / printing as an slowdown.

@brando90
Copy link

as a side comment, my understanding is that ibm pycoq uses serapi as a backend but it's possible to change ithe backend to this pycoq too, so perhaps they aren't competing tools but rather complementary.

The name clashing is rather unfortunate though.

@ejgallego ejgallego modified the milestones: 0.13.2, 0.16.0 Jun 15, 2022
@ejgallego ejgallego removed this from the 0.16.0 milestone Sep 8, 2022
@ejgallego
Copy link
Owner Author

SerAPI is now replaced by coq-lsp and petanque, there are Python clients for both, which will work much better than the prototypes here:

@ejgallego ejgallego closed this as not planned Won't fix, can't repro, duplicate, stale Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants