New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cycle 4 Funding: Performance optimisation and C code review #390
Conversation
So happy to this request! Just want to confirm, this is to support you as an independent contractor, yes? |
Thank you! Yes, this will be for me as an independent contractor (based in Australia, if that matters) |
Two somewhat random notes:
More generally, I like that you are thinking of actually improving the code itself, rather than papering over slow stuff with caches or interpolation. The trickiest part may indeed be to find out what are the real bottlenecks. |
I am not surprised at all. Beating numpy requires either extra domain-specific implementation or (as you said) really deep knowledge in C. And as someone with said deep knowledge, I do not recommend going down that route unless absolutely essential. In fact, I can see that a useful project might be to evaluate whether existing C code for specific functions is actually faster than equivalent python/numpy calls and replacing C code wherever possible.
That's neat! Might have to figure out a way to add that into Corrfunc :)
Indeed. Putting bandaids to fix a semi-working code does not really work long-term, and I prefer to solve it in a (potentially) harder but more sustainable way. And I totally agree about the bottlenecks - finding out which functions are the real bottlenecks might be tricky - that's where I hope that the voice of community experience will come in handy, at least for a starting group of potential targets. |
While working on multi-threaded race condition in wcslib, I noticed several potential improvements to the
These changes above will improve code readability and make the C code more compliant with conventions, but the current C code in Additional context for the assignment + status checking There are a lot of potentially confusing statements where an assignment is done and status checked in the same line throughout the wcslib C codebase under cextern. These statements would be better off being split into two and replacing if ((wcs->types = calloc(naxis, sizeof(int))) == 0x0) {
//ideally should be broken into two lines as
wcs->types = calloc(naxis, sizeof(int));
if (wcs->types == NULL) { However, making such changes within astropy might cause the cextern package to diverge from upstream and potentially cause maintenance issues in the future. If such changes might be useful for the project, and there are funds for tackling them into |
Please react to this comment to vote on this proposal (👍, 👎, or no reaction for +0) |
Would be great to see improvements here, but I think we'd definitely not want to get into using a forked version of WCSLIB, so one should try to push any such enhancements upstream – since that is more or less single-handedly maintained by Mark Calabretta, this will probably not follow the standard repository procedure, but I think he is entirely open to useful reports and patches. Probably @mcara can comment more on how this might work out. |
Definitely worth a try but I think it would be much more effective if a long-time |
From the sidelines, Calabretta seems very responsive and receptive to patches if you can prove to him they are valid patches. |
Yeah that's my concern too. I would prefer changes to wcslib to be incorporated upstream so that it is less of a maintenance burden. Even the second/future part that I have suggested for modernising the code would also be better implemented upstream. Given the downvote on the idea about the changes from @mcara, that probably is a sign that the (second/future) changes should either not be done or at least be done upstream. |
Given the email exchanges so far, Mark does seem fairly responsive and reasonably open to discussion. Let's see what he thinks about my proposed |
Mark's style in supporting WCSLIB is to do all code modifications himself, which is different than the astropy approach. Nevertheless, he has done a great service supporting this library, and such suggestions with regard to personal style may be offensive to him. Personally, I don't see most of these warranting forking the code or engaging in an effort to persuade him to do things differently. One risk is that he offloads all support to the critics. Is that desired? As has been mentioned Mark has generally fixed things that he has been shown to be bugs or errors. |
Forking and maintaining is definitely not desired, especially since (as you noted) the issues do not fundamentally impact correctness. Plus, the goal is to reduce the long-term effort, and not add to it :D |
I'm writing on behalf of Astropy's Finance Committee. We are pleased to be able to let you know that, following consultation with the community, we are able to approve this request. We can currently fund the one year of work in the amount of $10,200 (US). Future funding beyond this amount will be contingent upon the availability of funds. (We will be using the full budgets of all of the approved requests to craft future grant and funding proposals.) We assume you will be using this full year 1 budget; if that is not the case, please contact us immediately. Ana Gabela and I will be your contacts on the Finance Committee to facilitate this award. Please get in touch with us if you have any questions or concerns. Please do not reach out to NumFOCUS directly. In addition, new to this funding cycle, is the assignment of a COTR (Contracting Officer's Technical Representative) to each funded project. This concept is borrowed from government funding agencies, although it is to be stressed that Astropy's goal is to make the COTR role as low-overhead as possible. The COTR’s primary responsibility is to make sure the work is happening at the expected pace and, if necessary, to be a liaison between the funded project and the Finance Committee or CoCo. The COTR for your project will be assigned shortly and we’ll also be sending out more details about how we see this working. The next steps are:
We expect the work to be carried out between 2024-05-01 and 2025-04-30 (the “period of performance”). If you come to realize that the work will take more time than originally planned, you can apply for an extension following the procedure described at https://github.com/astropy/astropy-project/blob/main/finance/process/request-extension.md. Congratulations — we are really looking forward to seeing you put these funds to good use! |
The file contains all the relevant info :)