Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC Add a 'Cython Best Practices, Conventions and Knowledge' section #25608

Merged
merged 17 commits into from
Mar 9, 2023

Conversation

jjerphan
Copy link
Member

@jjerphan jjerphan commented Feb 14, 2023

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This adds a section to scikit-learn documentation regarding some Cython know-how of the team.

Any other comments?

Feel free to suggest changes, adaptations or additions.

Co-authored-by: Meekail Zain <Micky774@users.noreply.github.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Vincent M <maladiere.vincent@yahoo.fr>
@lorentzenchr
Copy link
Member

@jakirkham might be interested.

@ArturoAmorQ
Copy link
Member

A first comment before a proper review: I think it would be nice to point towards this new section from the reading the existing code base and the performance tips for the cython developer sections. It could be a simple "for more information, see section ...".

@jjerphan
Copy link
Member Author

@ArturoAmorQ: feel free to open a pull-request on my branch if you think this can be better worded and if it is easier for you. 👍

@jjerphan
Copy link
Member Author

jjerphan commented Feb 19, 2023

Relevant notes from @da-woods people might be interested reading: https://cython-guidelines.readthedocs.io/en/latest/

doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
ArturoAmorQ pushed a commit to ArturoAmorQ/scikit-learn that referenced this pull request Feb 20, 2023
@ArturoAmorQ
Copy link
Member

@ArturoAmorQ: feel free to open a pull-request on my branch if you think this can be better worded and if it is easier for you. +1

You can cherrypick my commit here: ArturoAmorQ/scikit-learn@32648c5 (#2)

@jjerphan jjerphan marked this pull request as ready for review February 28, 2023 08:10
@jjerphan jjerphan added Waiting for Reviewer Waiting for Second Reviewer First reviewer is done, need a second one! labels Feb 28, 2023
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's something I would have liked to read when I joined the project, thanks for working on this. Here are some comments.

In addition in the performance page, there's this old TODO
TODO: html report, type declarations, bound checks, division by zero checks, memory alignment, direct blas calls...
I think this PR is a good opportunity to finally do it :)
In particular, I think it's worth adding a bullet point about direct calls to blas: when it's desirable, must be through the cython_blas extension... The rest is kind of already done I think

doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/performance.rst Outdated Show resolved Hide resolved
jjerphan and others added 3 commits March 3, 2023 15:21
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
@jjerphan
Copy link
Member Author

jjerphan commented Mar 4, 2023

@da-woods: how interested are you to merge such notes (with yours) (in Cython's documentation)?

Copy link
Contributor

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I much enjoy and appreciate the snippets of code I can just copy and paste.

doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
@da-woods
Copy link

da-woods commented Mar 5, 2023

@da-woods: how interested are you to merge such notes (with yours) (in Cython's documentation)?

I think we are interesting in merging more of this sort of this into the Cython documentation. It probably won't be all my notes - I know there's a few of my opinions (e.g. my abiding dislike of cpdef functions) that Stefan doesn't necessarily agree with so I'd rather keep those somewhere where it's "just my opinion".

I still think duplication doesn't hurt so there's no reason not have this section in PandasScikit-learn and then see what we can pick across.

@jjerphan
Copy link
Member Author

jjerphan commented Mar 6, 2023

I tend to agree with all of your notes.

While it is useful to have Cython's developers and users (e.g. you and me here) share their experience, tips and conventions about the language, I think everyone would benefit from having them unified at one canonical place (ideally Cython's docs).

Do you think it possible to identify and extract non-opiniated, factual elements from both yours and those notes and integrates them in Cython's docs ?

Also, were you referring to scikit-learn instead of pandas in your previous comments ?

@da-woods
Copy link

da-woods commented Mar 6, 2023

Do you think it possible to identify and extract non-opiniated, factual elements from both yours and those notes and integrates them in Cython's docs ?

I'll try to make a start on that in the next few weeks (probably bit-by-bit)

Also, were you referring to scikit-learn instead of pandas in your previous comments ?

Oops yes sorry! I got temporarily dislocated!

jjerphan and others added 2 commits March 6, 2023 11:49
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Adam Li <adam2392@gmail.com>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pass, mostly regarding the rendering this time.

doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
doc/developers/cython.rst Outdated Show resolved Hide resolved
jjerphan and others added 2 commits March 7, 2023 09:55
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
@jjerphan jjerphan removed Waiting for Reviewer Waiting for Second Reviewer First reviewer is done, need a second one! labels Mar 8, 2023
@betatim
Copy link
Member

betatim commented Mar 8, 2023

Added the "no changelog" label, as I think this wouldn't get added.

I like it and find it useful as is. As such, should we try and merge this as soon as this round of comments is addressed? It seems like work on this PR could go on for a long time because there is so much to know. However the longer it is unmerged, the longer people won't profit from what is already here. Can always do a second, third, fourth PR to keep working on this. WDYT?

@jjerphan
Copy link
Member Author

jjerphan commented Mar 8, 2023

Yes, I agree. I think Jérémie is busy with the 1.2.2 release, but other maintainers can give their +1.

I am waiting for this PR to get 2 approvals with potential proposals for subsequent PRs improving the section.

@jeremiedbb
Copy link
Member

Can always do a second, third, fourth PR to keep working on this

I'm also in favor of merging a good enough version and improve in subsequent PRs

jjerphan and others added 2 commits March 8, 2023 10:01
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
@jjerphan
Copy link
Member Author

jjerphan commented Mar 9, 2023

@jeremiedbb: I think I've treated all of your comments.

@jeremiedbb
Copy link
Member

Reading the discussions in #25780, I think it will be worth adding a bullet point about including cnp.import_array(), but it can be done in another PR

@lorentzenchr
Copy link
Member

lorentzenchr commented Mar 9, 2023

I like it and find it useful as is. As such, should we try and merge this as soon as this round of comments is addressed? It seems like work on this PR could go on for a long time because there is so much to know. However the longer it is unmerged, the longer people won't profit from what is already here. Can always do a second, third, fourth PR to keep working on this. WDYT?

I agree 💯 It is better to have 4 smaller PRs and to iterate faster.

Copy link
Member

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let‘s iterate.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. There's just this rendering issue that bothers me :/

doc/developers/cython.rst Outdated Show resolved Hide resolved
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last nitpick. The extra indentation made the first sentence bold, but all other bullet points don't so let's remove it for consistency

doc/developers/cython.rst Outdated Show resolved Hide resolved
@jeremiedbb
Copy link
Member

Thanks

@jjerphan jjerphan deleted the doc/cython-section branch March 9, 2023 13:52
@jjerphan
Copy link
Member Author

jjerphan commented Mar 9, 2023

Thanks everyone! I hope this will help people getting started using Cython for scikit-learn.


* Always prefer memoryviews instead over ``cnp.ndarray`` when possible: memoryviews are lightweight.

* Avoid memoryview slicing: memoryview slicing might be costly or misleading in some cases and
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I have an PR that should dramatically improve this. It almost certainly won't (and shouldn't!) make it into Cython 3, but hopefully won't be long after that. Essentially, a lot of the reference counting turns out to be pretty pointless since you're usually incrementing and decrementing exactly the same object so it can be skipped completely.

So hopefully this advice can change sometime in the next year. But not yet

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for mentioning this, @da-woods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants