Renaming Bus Factor #632
Comments
In my experience, academia (at least the part that's aware of open source software) uses bus factor. In academia, most people who don't know it, after having it explained, generally either get it and laugh, or still don't understand it (and don't really understand open source either). I know I've also heard at least one other term that was more pleasant and still worked, but I can't quite remember it for the minute. (I've also heard truck factor, but I guess that's not really much of an improvement.) |
If we're to rename the metric, now is the time before we standardize
anything with a standards body.
Bus Factor and Lottery Factor both describe an external event that would
directly impact a project member and thus the project. There are a host of
other events with the same potential impact: becoming a parent, getting
laid off, moving to a new city, having to take care of family members,
meeting a significant other, ... - the metric could be named after any of
these events and still always require explanation.
Maybe we can choose a name that is directly descriptive of the problem or
threat: concentration of knowledge, distribution of effort, ...
|
When I talk about this issue, I generally frame it as a discussion of "Contributor Sustainability", but it's probably only one of a number of things that impact contributor sustainability. I still think the metric should be named something that's already established within our community and the literature, which might make Pony Factor a better choice. I like Lottery Factor, but it's definitely not as well known. |
There are two things I think don't work well about "Bus Factor"
I find skeptical looks and confusion when I'm explaining that a "bus factor" of a project is 3, and that's a bad thing. I feel that if we had some language that better indicated what the bus factor is, intuitively, that would be more persuasive and useful. I know "bus" is what we're moving away from, but... When I think of projects making progress, I usually think of forms of transportation. Trains, planes, boats, cars, etc. They move many people around and need critical pieces to keep them moving. We could borrow some of these ideas, and swap "factor" for "count" like:
I'm also happy to turn away from "bus" like things, maybe options from nature without conflicting with git?
That's if we're willing to get creative though. I think if we plan on changing the name to something, Bus Factor is certainly the most well known -- so we should update it to something more intuitive and descriptive. Out of the options I gave, I like Host, Pilot, Captain, and Monarch. Excited to have a discussion about it. The confusing wordplay also exists for "elephant factor" -- but that should probably be a separate discussion :) |
I would vote against Pony Factor, as it's based on an in-joke that is just confusing to people who aren't in the group |
@justaugustus I'm curious if the Inclusive Naming Initiative have had any discussions about this or related terminology? |
I usually use lottery as well, pony doesn't make a lot of sense to me. On count vs factor, what about something like frequent contributors count, which parallels with 'inactive contributors' and 'new contributors'. Bus factor assumes something about the impact of these specific people leaving which might or might not be true depending on additional context. Just calling it a count of people who undertake a certain level of activity leaves it more neutral and more clearly as just part of the fuller picture. |
I usually use lottery factor as well |
I like to use names/terms that are easy to read and do not have implied meaning or metaphor. For something like this in other kinds of projects or organizations, it is sometimes called 'key person risk' (or key people/member risk). I'll toss 'key maintainer risk' in here for consideration, since when I read things like 'lottery factor' or 'pony factor' I have to go look up what that means in this context (and it may be even harder for those who don't have English as a first language); 'key maintainer risk' seems closer to describing exactly what is being measured. |
I would love to not use "bus factor" and usually use "lottery factor" instead. And I always explain in a few words what that means, as I'd explain the naming for any of our metrics. In my opinion, nothing is intuitive to everyone. Even something like "event location inclusivity" requires a few words of explanation of what we mean by that.
What about "Key Maintainer Count" or "Core Maintainer Count"? |
@ElizabethN I would be concerned with using the term "maintainer" as for many projects that has a very specific meaning. Maybe "Key Contributor Count" |
KCC has a nice ring to it, and it's definitely more clear than an analogy. |
Oooh, I like Key Contributor Count. |
For some reason, I thought we had already addressed this one. Thanks for bringing this up @geekygirldawn. The name is definitely problematic, and I agree pony factor isn't a good option either. We could keep them as keywords, though, and link them to the new name. I like Key Contributor Count... Or Key Contributor Risk. |
...or Core Contributor Risk. We have defined Occasional Contributors (which were previously, and problematically, called "Drive-by Contributors"). However, we have not defined key or core contributors. Academic literature usually uses core but key may be more descriptive of contribution importance. |
I like Risk better than Count, as it has the same sense (of urgency/danger) that Bus Factor has. It also feels less like something people would try to game |
At risk of sounding like a typical tech exec.... Wouldn't "gaming" this metric be a good thing? More people contributing to oss at a level to constitute bus factor seems like a good thing... I'll put in that I think "risk" being a number runs the same risk (ha) as using a word like "factor". Without explanation, "my key contributor risk is 3" is a nonsensical phrase. |
I like Key Contributor Count.
My concern with "key" is that it adds a value judgement. Also following
Elizabeth's comment: not everyone needs to have a key to the project to be
included.
Returning to the definition of the metric, we're counting the smallest
number of contributors that made 50% of all contributions during the
analysis time window.
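
As a rough illustration of that definition (a minimal sketch, not the CHAOSS reference implementation): the function name `majority_contributor_count`, the `threshold` parameter, and the sample data below are all hypothetical, and treating "50%" as "at least 50%" is an assumption.

```python
from collections import Counter

def majority_contributor_count(authors, threshold=0.5):
    """Smallest number of contributors who together made at least
    `threshold` of all contributions (hypothetical helper)."""
    counts = Counter(authors)          # contributions per contributor
    total = sum(counts.values())
    covered = 0
    # Walk contributors from most to least active until the threshold is met.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= threshold * total:
            return rank
    return 0  # no contributions in the analysis window

# Made-up window: one entry per contribution, labeled by author.
commits = ["ana"] * 6 + ["ben"] * 3 + ["cy"] * 1
print(majority_contributor_count(commits))
```

Here one contributor accounts for 6 of the 10 contributions, so the count is 1. What gets counted as a "contribution" (commits, pull requests, issues) is left to whoever builds the input list.
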
How about: Majority Contributor Count
"Majority" because people know that concept from voting and other contexts.
We could also emphasize that a larger count is good and go with something
like:
- Majority Contributor Spread
- Majority Contributor Concentration
|
In truth, without explanation, any name we choose is likely to be nonsensical. Some are more descriptive than others though. The metric should describe what we are trying to measure - which is the risk associated with key contributors abandoning a project (i think). I wouldn't get hung up on a number. |
@klumb I get ya. The measurement is absolutely indicating the risk. I would agree more with "risk" if the metric itself wasn't a count/number. It's descriptive of what we're measuring to name the measurement. If we're renaming it anyways, why be vague? We could keep metaphorical names like "pilot risk" etc. but that hardly solves the #2 problem I mentioned above, where I regularly have to explain what the metric actually is for people to buy into why it's useful. Just my experience, though. I like @GeorgLink 's observation. Majority Contributor Count is ultra-succinct and descriptive. I didn't even think about how "key" could mean like "having a key". Majority is much more specific. |
Also, I think a value judgement is going to be necessary for this metric. What the value is, is the question. Is it related to ownership/authorship of a percentage of the codebase? |
If it is about percentage of codebase, rather than contributor, maybe we need it to be about contribution authorship. For example, Majority Contribution Authorship, Majority Contribution Spread, or Majority Contribution Maintainership? Contribution Maintenance Risk? Majority Contribution Count? Just throwing some more out there. ;) |
I really like majority, I think it removes the value judgement of 'key'. This count in the chaoss metric reads to me as just a naive count of how many contributions people make as a percent of the total number of contributions, it says nothing about the value of those contributions in terms of code quantity or quality, which I think argues for keeping the metric as more of a single neutral data point. The metric says it wants to answer "how many contributors can we lose before a project stalls?" but that seems packed with assumptions to me. |
Sorry, but I have no idea what majority means in this context. And given that not all open source contributions are captured in a repository, how would it be measured? |
It totally is! That's part of the magic IMO 😄 There's some stake-in-grounding happening here. What kind of assumptions need to get made to actually measure something, ya kno?
While it's perfectly possible this isn't a perfectly accurate representation, I believe that the metric and its implementations are usually disjointed. I believe most of the time, contributions are "counted" here as "commits" or "pull request open/close" or "issue open/close". That's just a function of using the GH API / history to make measurements... More tools could definitely get built to measure more though 😃
"Majority" here meaning who is making the majority of contributions. Majority Contributor Count = count of contributors who make the majority of contributions in the project for some time window. |
I like what the bus factor means, in that a project is one disaster away from being completely abandoned or unmaintained. Some maintainers might keep maintaining if they win the lottery :) I think the seriousness should be retained because that seriousness is what gets people (leaders, people with influence) to act (majority contributor count IMHO, not so much), but agree bus factor is morbid. Propose then something more like 'disaster factor' because it has meaning immediately. |
Adoption may be better for Disaster Factor because it is close enough to the previous problematic name. It also signals the risk part. I think that could work. |
Disaster Factor = The risk associated with a count of contributors, who authored a majority of contributions in the project for some time window, abandoning a project. It is probably a good idea to review the description and objective of this metric as well. |
This is such an interesting discussion! At the risk of making this more complicated, is there a rubric that is used to come up with this number, so the name of the metric might be less of a concern (its explanation would be the rubric)? I like the words 'adoption', 'risk', 'sustainability', and 'resilience' because they are less problematic (for me) than 'bus' or 'disaster' - |
We could use the GitHub poll capability in the Discussions area to create a poll from the names that have been suggested here and solicit votes from the community. Let me know if you'd like me to put that into a Discussion thread (I'm still relatively new to this community - sorry if that's been rehashed or if there is a different norm for this sort of thing!) |
Hello there! A bit late here, but even though this might be an unfortunate name, it seems to be used in many places, even in Wikipedia. According to Wikipedia, it seems there's an existing and older concept coming from the insurance world called Key Person Risk. This is indeed a term similar to other proposals mentioned here. Even in Google Scholar this seems to be a term used time and again. My proposal would be to either keep using this term, or if this is updated, we should identify it with this more usual term. As an example, we could say: Key Person Risk (aka bus factor). Some thoughts on this :). |
Crediting some recent work of @JustinGOSSES I'll also nominate 'Nebraska Factor' , which also shows that with one false move ... It also depends how granular we believe this metric to be - for me sustainability and resilience are more like metric-models, which would contain the bus factor among others, not itself the measure. |
I like some of the other names mentioned as well, but I do tend to agree with Daniel here. My favorite name is a variation on that - Key Contributor Risk. |
We could do a ranking poll in our community for the names mentioned in this discussion and then discuss the top three in the next metrics meeting. |
@emmairwin that's a great point! But this is perhaps too US centric...we can always talk about the Antarctica factor XD. Anyway, my understanding (without having the proper definitions in front of me) is that the Nebraska factor focuses more on a tiny but important piece in the whole SBoM ecosystem where there's a risk of not being aware of the use of such pieces of code and its risks, while the bus factor is specific to one project. Perhaps we could say the Nebraska factor is a meta-bus factor. |
Sure. I refer to the xkcd comic
Naming is hard :)
|
Ha! A joke on the joke ;). Let me rephrase myself, I've seen this used in SBoM contexts :). And yep xkcd is great! |
I'll just second the previous comments that the "key person risk" phrasing seems an immediately understandable replacement for bus factor, at least to me. I would also suggest it might be helpful when renaming to also call out that there are some common variations on this idea that are related but different metrics. Elephant factor is one I've heard confused before. There's also the DDS (Distributed Development Score). It is most similar to bus factor in that it is trying to measure something that correlates to the same negative event that people want to avoid but is calculated slightly differently. The negative event people are trying to predict a chance of is development suddenly stopping. Mechanisms for that stopping vary a bit. Methods of calculating metrics that approximate mechanisms that lead to the negative event vary even more. I will also note that in practice, I've sometimes also looked for # of maintainers (or approvers of PRs) in past year. Any bus-factor-ish metric used by itself will say there's no risk of abandonment of a project that's only had PRs approved by a single person in the past 3 years if 80% of the commits were developed equally by 10 people in 2016 who haven't touched the project in 5 years. In the other direction, a project developed 70% by one person in 2016 will have a risky key person factor, but if 20% of the other commits were done & approved equally by a group of 10 over the past 2 years the actual chance of sudden abandonment is probably very low. |
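The time-window point in the comment above can be sketched with made-up numbers. This is only an illustration under stated assumptions: the helper `majority_contributor_count`, the `(author, year)` history format, and the 2022 cutoff are hypothetical, and only commit authorship is counted (not PR approvals).

```python
from collections import Counter

def majority_contributor_count(authors, threshold=0.5):
    """Smallest number of contributors covering `threshold` of contributions."""
    counts = Counter(authors)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= threshold * total:
            return rank
    return 0  # no contributions in the window

# Hypothetical history: 10 people made 80% of the commits in 2016,
# and a single person made the remaining 20% recently.
history = [(f"dev{i}", 2016) for i in range(10) for _ in range(8)]
history += [("solo", 2023)] * 20

all_time = majority_contributor_count([a for a, _ in history])
recent = majority_contributor_count([a for a, y in history if y >= 2022])
print(all_time, recent)
```

Over the whole history the count is 5, which looks healthy; restricted to the recent window it drops to 1, which is the situation the comment warns a single unwindowed number would hide.
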
I have also been using the Lottery Factor, but I have to say I really like "Core Contributor Factor". The main idea to communicate is that a project with a single point of failure is risky to adopt and so if there are already several projects that depend on said project, the single point of failure must be addressed. After brainstorming with our AI friends I would like to propose SPOC Single or SPOC-R Single. Used in a sentence: Unfortunately, this project has a high SPOC risk. |
+1 for Lottery Factor. |
In an attempt to summarize this very long discussion, here are the options that seem to be getting the most traction:
If this seems reasonable, I would like to create a poll to gauge reactions with the caveat that the CHAOSS project operates by consensus, so this poll is limited to understanding what people think, and the one with the most votes will not necessarily be the "winner". Notes:
|
I think this "vote" seems like a good idea. Just as a point on one of the options, I think it will be confusing when the single point of critical failure metric is not 1... "Points of Critical Failure (PSOCF)" and "Critical Failure Points (CFPS)" are alternatives for this idea. |
OK, we now have a poll based on the above options. Please vote here: #634 |
Hey folks, sorry I am late to the party. Just wanted to throw in one more term that I have heard but didn't see mentioned here - succession plan, or succession strategy. I offer only for consideration, not because I think the poll needs to be changed - I like "lottery factor". |
@Jefro - this is an excellent point. I don't see it as part of the metric itself, but it's an important part of what you should be doing as an outcome or action as a result of what you learn from measuring this. I've just added a paragraph about succession planning to a Practitioner Guide for Contributor Sustainability that I've been working on. It's a first draft that's ready for feedback, so if you have time, feel free to have a look and leave comments / suggestions :) |
I do not like the term Lottery Factor. It reminds me strongly of problem gambling, which can be diagnosed as a pathological addiction disorder, and which is much more prevalent in poorer communities. I think any gains we have by getting rid of the term bus factor are lost when moving to Lottery factor. Disaster Factor also ignores the fact that a contributor may leave a project for good reasons, personally for themselves and for the project as a whole (which may have reached a point in its lifecycle where sunsetting is best for everyone). SPOC makes me think of Star Trek, which is fine, but it'll be confusing when talking to others. 🖖 |
I was a big fan of Lottery Factor, but with the comment from @RichardLitt, now I'm not so sure. Another option that @decause-gov proposed on the poll itself is "Nebraska Factor", which I think is something else to consider:
The benefit of Nebraska Factor is that it doesn't imply a reason (disaster, bus, lottery), because as mentioned above, a person can leave for positive reasons, too. |
I'll vote against Nebraska Factor, as it's an inside joke that's likely off-putting to those not in on it. It also doesn't have any inherent meaning, so you have to know what it is to know what it is; the name doesn't help you understand it (unless you already know the cartoon) |
Agree. Nebraska factor seems off-putting. When I used lottery factor it was largely in response to the negativity of bus factor, not because I particularly liked it. I ended up voting for the |
How about: Critical contributor index? |
Keep in mind, if we use key/core/critical contributor, we will also need to address this in our ongoing discussions about the boundaries between - core/regular contributors, occasional contributors, conversion rate, and 2nd contributors. What is a key/core/critical contributor? I think Bus Factor addresses the risk associated with 'people' leaving but I don't believe it defines who they are or why they are important. |
I'm thinking maybe we keep it simple. We don't need to make the judgement about whether someone is "core", "key", or "critical" in this metric, so maybe just "Contributor Risk"? |
I like "The Lottery Bus Factory". 🚌 💰 The Lottery Factor works as well. |
Hey All, just wanted to pop in here to let you all know that I came across an 'industry standard' (in quotes because I'm not sure it actually is) here in Australia wherein Dr. Jennifer Beckett has a measure for the bus factor but she referred to it more informally as the 'moses effect.' She "firmly believes" (this is in quotes because I'm pretty sure it was a joke) that Bus factor is a MUCH better term, but we are probably going to be using bus factor as a primary measurement for internet toxicity because it specifically acts to measure the Bridging Capital of individual influence from inside of a community, to the outside public. More importantly it connects the three social capitals (bonding, linking and bridging) together in one singular user journey so we can easily understand the risk of that person propping everything up. Reversing the metric also allows us to gauge the level of reputation and bridging capital that someone will have coming INTO the community (reputation being a social currency metric, and bridging capital being a social capital metric). If you'd like I can have a longer conversation with her about this and it may give us a new insight into the way that Bus Factor impacts the socio-cultural stability of online communities. Worth a look I think! |
In a separate comment on this, I also wonder if we should be basing the name of this off of what is actually happening when you graph the bus factor on a network diagram - not just on contributors to an open source project, but actually place it within the context of social capital and currency theories IN GENERAL. In reality the Bus factor occurs when a 'node' (member in a community) has garnered a lot of linking capital (they are connected to a lot of people) and Bonding capital (they have close ties and have grown in reputation so their voice is recognized), shows a potential threat or likelihood of leaving a community--and their linked members are only connected to them in the project, so those nodes are at risk of disappearing from the community network diagram. In other words, that individual has garnered the linking capital and bonding capital to prop up a community BUT if they were to leave the community would be at risk. Within the context of an open source project that is the likelihood that their leaving would put the project at risk but even for an entertainment community this can be measured in the amount of engagements that a member of import has caused, in comparison to the amount that surrounding nodes have caused. We also see this example in real life with malls (🤮) American malls were architected such that there was always a food court in the middle for people to connect with as a 3rd space. Then at either end of a mall (usually a line in 2 or 3 directions) there would be an 'anchor store' that people would go to for low-dollar, but high-value items such as grocery, or mid-market stores. The smaller novelty stores and specialty services such as asian-import stores, mini-gold outlets, video game labs, and whatnot relied on the big-box anchors to force people to go between the communal space, and the larger store. Larger stores relied on the people being there for those specific interests to keep them in the mall for long periods of time.
What caused malls to die, and also what causes bigger contributors who commit frequently to leave, is usually that the perceived reward becomes too small for the work that they have to commit. (I've talked about the burden of contribution in CHAOSS meetings before). So there could be something involved in the generalized issues for bus factor that we could use to rename it. This might be 'lost link likelihood' or 'at-risk supporter' or something along those lines? |
In the metrics meeting, we discussed renaming this to "Contributor Risk" to keep it simple and descriptive, similar to our other metrics names. |
This might be the least worst option 😄 |
Another idea: kujenga factor. This is the Swahili word meaning "to build", and it is the basis for the popular game Jenga™. As you may not have played it, this game involves taking wooden blocks out of a tower, until the tower eventually falls down, at which point the person who removed the last block loses. https://en.wikipedia.org/wiki/Jenga I believe that Jenga is trademarked. Kujenga, however, isn't. It's relatively easy to explain. Also, it looks like: |
Are we thinking about projects or people? |
Or ecosystems. |
I know that renaming what is probably our most widely used metric is going to be painful, but I think it's time to rename Bus Factor to something else.
The number of people I've had express pretty severe dislike of the name Bus Factor is quite high, and I often try to avoid calling it Bus Factor.
I often call it "Lottery Factor" because it's easy to understand. How likely is your project to survive if someone suddenly won the lottery, retired on a beach, and never looked at your project again?
Pony factor is more widely used, because it's been adopted by the Apache Software Foundation, but I find that it's harder for people to understand outside of the ASF. There isn't an easy narrative around it like what I have above for Lottery Factor.
I'd be really curious about the opinions from folks involved in inclusive naming initiatives and whether they've seen a commonly suggested substitute for Bus Factor.
I'm also curious about what the academic folks have seen. Is there a particular term that is more widely used in Academia / research?
cc-ing a few folks that I think would be interested in this discussion: @GeorgLink @germonprez @sgoggins @ElizabethN @klumb @dicortazar
I welcome any Chaotics to jump in with opinions.