Remove author attribution in file headers #377
I am also happy to remove the "begin date" in the headers, which doesn't serve much use. The earliest commit imported into GitHub is from 2008 and some of the headers do go back further to 2005. But it is more a point of historical curiosity than useful information in the source file. The git log provides much richer information here as well. |
These were added because someone who was using STP was very adamant about them. I don't remember exactly who it was, but it seemed really important for them. Ah, I found it, here: For Debian apparently. Debian takes copyright super-seriously so it makes sense. Please check that before changing authors. Looks pretty serious. |
Eh ok. I have never heard of another project that requires a list of authors in every header, ever. I am pretty sure 90% of the projects in Debian do not do this. LLVM does not, for example. I would be fine with a general "(c) STP contributors, see AUTHORS file" if we really need it. I can understand the issue described in #199 where the AUTHORS file / LICENSE file did not match up with some headers we imported from other projects, as we've recently discussed. I also can understand that that could spook Debian, if it seems like the project is claiming credit for other people's work. A copy of the license or a reference to the license file is common. This way contributors understand their modifications are licensed under the project license. Some projects require signing a Contributor License Agreement before a pull request will be accepted, which basically just makes me confirm I understand that the project uses an open source license. There are GitHub bots for that, if we want. |
Sure, what you said makes 100% sense. Just wanted to point out that those authors were there for a particular reason. Let's merge this PR but make sure we don't upset Debian and other projects that care about authorship in the meantime :)
|
So Michael Tautschnig is a prominent Debian contributor https://qa.debian.org/developer.php?login=mt@debian.org and packages CBMC for Debian -- however, even he has commits against CBMC that do not adhere to that rule: https://github.com/diffblue/cbmc/blob/develop/src/util/allocate_objects.cpp (file chosen totally at random). |
@hellovijay Understood. This pull request won't go forward if you are not satisfied. Note that you are listed as the first author in the AUTHORS file and you should and will remain there. The reasons for suggesting the removal are highlighted in my original post, such as having poor consistency throughout the project. A changelog / author block at the top of source files is uncommon nowadays with version control that tracks this information much better. The issue came to mind as @andrewvaughanj is contributing to many files and the author list in each file will grow longer. Also my name is spelled wrong in almost every file :) Perhaps this would be better:
|
There's certainly nothing wrong with this, but note that this information is already in the git history and surfaced in the GitHub website (see the avatars across the top on this page for example). Appending a name to the list though doesn't convey how much that person contributed. For example, I wouldn't want to put my name on a file that you wrote the vast majority of simply because I fixed a small bug. |
Sorry to triple-post, but in the example file (https://github.com/stp/stp/blob/master/lib/CMakeLists.txt) I linked to in the previous message, note there are 3 people credited for the file (including me, who did not contribute to that file), while there are 5 other contributors who are not credited. This is why it is a bit of a mess. |
Okay.
* How about the person who originates a file gets to put their name at the top. The term AUTHORS is replaced by "FIRST AUTHORS".
* Please do feel free to correct the spelling of your name where it appears.
* With regard to copyright, I am comfortable with the following:
(c) 2005-2015 Vijay Ganesh, David L. Dill
(c) 2015-2020 Vijay Ganesh, David L. Dill, Mate Soos
(c) 2020-present STP contributors (see AUTHORS file)
Let me know if the above approach works for you.
|
On the overall copyright
I have a few contributions here and there to STP, nowhere near the number of Trevor/Dan/Mate; however, I feel that this:
(c) 2015-2020 Vijay Ganesh, David L. Dill, Mate Soos
is a bit disingenuous to the other major contributors to STP during that five-year window (especially people like Trevor and Dan). I've omitted Ryan here, because if I'm not careful, we'll end up with an ever-growing list, and I'm bound to forget someone ... so I'll start as I mean to go on! Sorry, Ryan 😬
Anyway, in 2015, and on #199 you wrote:
"I am quite happy to share the copyright. STP wouldn't be where it is without the contributions of Trevor, Mate, Dew, Ryan, Dill, Khoo, and others. If I were to highlight the most important contributors, I would say they would be myself, Trevor Hansen, Mate Soos, and David L. Dill."
If we're agreeable, could we just retroactively assign the copyright to the maintainers starting in 2015 (which you and David would be one of)? That way, no-one's nose gets out of shape.
We can just ensure that the current AUTHORS file lists everyone who has contributed at all, but make it clear that between 2005 and 2015, the copyright was the sole ownership of Vijay Ganesh and David L. Dill.
Was there actually an official assignment of copyright of STP to Mate during that window? If we're retroactively assigning copyright to Mate, I would vote for retroactively assigning copyright to the "STP maintainers" (whoever they may be).
On 'Author' vs. 'Original Author'
I actually think the idea of "original author" is worse -- what happens if someone rewrites 99% of the file (but, say, leaves the header block + doesn't actually change the API); does that really mean that the original author applies here, and that the new contributor feels they can't add their name because they are not the "original author"?
When to update the copyright block
We also need to juggle Ryan's point about when you add copyright to the top of a file (I know that Mate and I had this with CMS, and Mate rightly removed me when I naively added myself thinking I was "doing the right thing").
Even if someone makes a 10 line change in a 1000 line file, surely they do actually retain "artistic copyright" over those 10 lines? The question we need to decide is whether that is enough to put the name at the top of the file.
Personal view
My personal view is that we should strip all of these copyrights (mainly because they're wrong in a lot of places, which is arguably even worse), ensure we maintain an up-to-date AUTHORS file and then require DCO (https://github.com/apps/dco) + squashed and signed-off commits (the same approach as Boolector and CVC4) for contributions going forwards.
This means that it is clear that anyone contributing to STP has assigned their copyright over to the STP project and that they have the authority to do so.
In case it helps, here is CVC4's guide to contributing: https://github.com/CVC4/CVC4/blob/master/CONTRIBUTING.md |
After sleeping on this, I would be fine with closing the pull request to avoid bruising anyone's feelings for attribution being removed, though I also recognize this shortchanges contributors who were not credited this way. I was raising a minor but thorny point that has the potential to grow into an outsized distraction. I agree with @andrewvaughanj that the three-line copyright isn't ideal. I don't want to start making judgments about the relative value of contributions. Also any sort of "retroactive" modification of copyright strikes me as ripe for creating conflict (contributors already expressed their feelings about this before), while simultaneously probably not being valid (can we query STP?). The exact dates don't really matter in my opinion, but the ones that make the most sense to me are either (a) when Vijay stopped working substantially on the project, and/or (b) when the project became open source. My view is that "first author" in a file header is also not ideal, for a number of philosophical and practical reasons. A large file can represent small contributions from many people. Files get rewritten, split up, and combined. Ponder the Ship of Theseus. We have git history, merged pull requests, etc. that are more accurate and don't require human judgment.
|
Hi Andrew,
On the overall copyright
I agree with you. I didn't mean to diminish the contributions of others and I clearly didn't contribute any code in the time period from 2015-2020. That must have been an oversight on my part in my previous email. Sorry about that. Feel free to change it to the following:
(c) 2015-2020 Trevor Hansen, Mate Soos, Ryan, Andrew,...
Further, I agree with the rest of your points in your email under the heading "On the Overall Copyright". I am fine with the CVC4 approach to copyright you mentioned.
On 'Author' vs. 'Original Author'
I don't think any of the proposed solutions are ideal. I really don't know how to resolve this one.
Cheers,
Vijay Ganesh.
|
I feel like we should be able to agree on some of these changes. As a start, can anyone see issues if we:
|
Authors
I (genuinely) think the best way to resolve this issue is to drop the authors bit from the top of each file and do this:
1. If the file existed before 2015, then we add (c) 2005-2015 Vijay Ganesh, David L. Dill
2. If the file has been changed after 2015, then we add (c) 2015-present STP contributors (see AUTHORS file)
I'm actually fine to have a more sweeping change that just does 1. in every file (irrespective of its genus, apart from the ext files) and 2. as above.
If we do this, then we just need to make sure that Ryan's name is spelled correctly in the AUTHORS file.
After doing this everyone appears in the git log and in the AUTHORS file, and we have no issues going forward.
Begin date
Remove it ✅
Copyright agreement
I'm happy to take this as a separate action not as part of this PR -- I will set up DCO + write a contributors guide (and see if we can have a GitHub Action that confirms commits are signed-off). |
I like Andrew's idea.
-Vijay.
|
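For reference, the "confirm commits are signed-off" check mentioned above could look roughly like the following. This is only a sketch, not the DCO app's actual logic: the function names `is_signed_off` and `unsigned_commits` are made up for illustration, and the input is assumed to be a list of (sha, message) pairs that some other step has already pulled out of the git history. It relies only on the fact that `git commit -s` appends a `Signed-off-by: Name <email>` trailer to the commit message.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
# The pattern is deliberately loose; it is an illustrative assumption, not the
# exact rule used by any particular DCO bot.
SIGNOFF_RE = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def is_signed_off(message: str) -> bool:
    """Return True if the commit message contains a sign-off trailer line."""
    return SIGNOFF_RE.search(message) is not None


def unsigned_commits(commits):
    """Given (sha, message) pairs, return the SHAs that lack a sign-off."""
    return [sha for sha, message in commits if not is_signed_off(message)]
```

A CI job could fail the build whenever `unsigned_commits` returns a non-empty list for the commits in a pull request.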
I don't think it's reasonable to transfer copyright over from other contributors to Vijay and David without the other contributors' permission. In the past some contributors have insisted on maintaining their copyright. Transferring copyright to David L. Dill seems a bit weird to me, too. Was his last contribution really in 2015? To sort this out, I suspect we're going to need something more nuanced. |
Okay, maybe it should just be all files that existed before 2015 get Vijay+David, and any file that was changed by anyone else (ever!) gets STP contributors (and the year it was first changed by someone who wasn't Vijay+David)? |
I agree with Trevor here. For example, my first commit is from 2009 and so is Trevor's. Here is some food for thought:
Note that number of commits is not equivalent to the amount of work. In particular, often the starting of a project is already a substantial amount of work without any commits. But it does show that a lot of people contributed to the project. Between project inception to 2014 (inclusive) we have:
Sorry for going for the data-driven approach :) Again, there is lots of work that does not show in the logs. Another example is the work that goes into responding to customer queries and aligning within the team. None of these are trivial work and they don't (or barely) show up in the above data. Just my 2 cents :) |
Perhaps we are overthinking this. We can have overlapping copyright assignments, see for instance OpenCV: https://github.com/opencv/opencv/blob/master/LICENSE
Copyright (C) 2000-2020, Intel Corporation, all rights reserved.
Copyright (C) 2009-2011, Willow Garage Inc., all rights reserved.
Copyright (C) 2009-2016, NVIDIA Corporation, all rights reserved.
Copyright (C) 2010-2013, Advanced Micro Devices, Inc., all rights reserved.
Copyright (C) 2015-2016, OpenCV Foundation, all rights reserved.
Copyright (C) 2015-2016, Itseez Inc., all rights reserved.
Copyright (C) 2019-2020, Xperience AI, all rights reserved.
|
Hi All,
I am positive you will come up with a fair way to resolve this issue. I like the idea that Ryan sent in his last email, with overlapping copyrights.
Thanks for your wonderful work!
Cheers,
Vijay Ganesh.
|
I've been trying to make a more polished attempt to process the commit history; I wrote a script here: https://github.com/andrewvaughanj/parse_stp_history/blob/master/parse_stp_history.py This script walks all commits from the start of the repo and tracks adds/deletes/renames. Importantly, the files that are part of the initial commit are marked as owned by Vijay and David, and not by Michael Katelman. Here are the current statistics:
You can find the complete output here: https://github.com/andrewvaughanj/parse_stp_history/blob/master/stp_history.txt My suggestion here is:
Does this seem reasonable to everyone? |
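To illustrate the idea of walking the history and tracking adds/deletes/renames, here is a minimal sketch. It is not the linked script: the function name `first_authors` and the input shape are assumptions made for this example. Commits are assumed to be supplied oldest first as (author, changes) pairs, where each change uses the A/M/D/R status letters that `git log --name-status` prints (with any similarity score on R already stripped).

```python
from typing import Dict, Iterable, List, Tuple

# A change is (status, path), or ("R", old_path, new_path) for a rename.
Change = Tuple[str, ...]
# A commit is (author, changes); commits are assumed to be ordered oldest first.
Commit = Tuple[str, List[Change]]


def first_authors(commits: Iterable[Commit]) -> Dict[str, str]:
    """Map each file that still exists to the author who first added it."""
    owner: Dict[str, str] = {}
    for author, changes in commits:
        for change in changes:
            status = change[0]
            if status == "A":
                # Record the adding author unless the path is already tracked.
                owner.setdefault(change[1], author)
            elif status == "D":
                # A deleted file no longer needs an owner.
                owner.pop(change[1], None)
            elif status == "R":
                # A rename carries the original owner over to the new path.
                old_path, new_path = change[1], change[2]
                owner[new_path] = owner.pop(old_path, author)
            # "M" (modify) leaves first authorship unchanged.
    return owner
```

For example, a file added by one person, modified by a second, and renamed by a third stays attributed to the first. Attributing the initial import to specific people, as described above, would need an extra override step that this sketch leaves out.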
This change removes author attribution from the tops of files. The intention is not to deny anyone recognition for the work, but I will let this sit for a little bit to see if anyone feels differently.
The AUTHORS file remains and contributors are invited to add themselves to it. (It does need an update.)
The git log already details exactly who contributed what to which file.
We no longer have to keep the file headers up-to-date, or feel compelled to add attribution every time a change is made.
The headers are not particularly accurate. Many were bulk-added to include me (or my alter ego, Ryan Gvostes), even though I'm sure I did not modify most of those files; other contributors whose names deserve to be up there are not.
Thoughts?