Major/Minor/Patch should not be an int #26

Open
concatime opened this issue Mar 30, 2020 · 7 comments

@concatime

concatime commented Mar 30, 2020

Hi.

Major, minor and patch are “non-negative integers”, so there is no need for a signed integer.
Also, int in C is only guaranteed to be at least 16 bits wide, so the largest version number we can portably store is 32,767. This falls short for projects that use dates as versions (e.g. https://github.com/janmojzis/tinyssh/releases). In fact, 20190101 does not fit in a 16-bit signed int as a major version.
One solution would be to use uint64 directly, like here: https://docs.rs/crate/semver/0.9.0/source/src/version.rs. But I am guessing that this project may also be used on embedded devices, where 64 bits per field is a little too much.
Patch should be at least 16 bits (semver/semver#516 (comment)).
How about uint32 for major, and uint16 for minor and patch? This way, we can form a 64-bit version number by concatenating the uint32, uint16 and uint16. This also simplifies your semver_numeric function[1].

[1]

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Pack major into the high 32 bits, minor into the next 16 and patch into the low 16. */
uint64_t semver_numeric(void) {
	const uint32_t major = 20190101UL;
	const uint16_t minor = 42U, patch = 256U;
	return ((uint64_t)major << 32U) | ((uint32_t)minor << 16U) | patch;
}

int main(void) {
	printf("%" PRIu64 "\n", semver_numeric());
	return EXIT_SUCCESS;
}
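
As a rough sketch of why the packed form is convenient, assuming the uint32/uint16/uint16 layout proposed above: the packed value can be unpacked again without loss, and comparing two packed values as plain integers orders them by major, then minor, then patch. The names version_t, version_pack, version_unpack and version_compare are hypothetical and not part of the library, and pre-release and build metadata are ignored here.

```c
#include <stdint.h>

/* Hypothetical type for illustration; not the library's actual struct. */
typedef struct {
	uint32_t major;
	uint16_t minor;
	uint16_t patch;
} version_t;

/* Pack major (high 32 bits), minor (next 16) and patch (low 16) into one word. */
static uint64_t version_pack(version_t v) {
	return ((uint64_t)v.major << 32) | ((uint64_t)v.minor << 16) | v.patch;
}

/* Inverse of version_pack: split the 64-bit word back into its fields. */
static version_t version_unpack(uint64_t n) {
	version_t v;
	v.major = (uint32_t)(n >> 32);
	v.minor = (uint16_t)(n >> 16);
	v.patch = (uint16_t)n;
	return v;
}

/* Returns -1, 0 or 1, ordering by major, then minor, then patch. */
static int version_compare(version_t a, version_t b) {
	const uint64_t x = version_pack(a), y = version_pack(b);
	return (x > y) - (x < y);
}
```

Because major occupies the most significant bits, a single unsigned comparison is enough; no field-by-field branching is needed.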

Cheers,
Issam.

@h2non
Owner

h2non commented Mar 31, 2020

I fundamentally agree. Happy to merge a PR improving things!

@jwdonahue

Why don't you just use packed decimal digits? The SemVer spec says the whole thing is actually a string of ASCII characters, not binary scalar types.

@jwdonahue

jwdonahue commented May 12, 2020

For a seriously memory-constrained device, I would not use SemVer directly; I would use a simple build number. I would then maintain a web page that matched build numbers to SemVer version strings. For most modern embedded systems, however, a 64- to 256-character string would not put any strain on resources.

I have always maintained that version strings don't belong in your code. If you create libraries for small embedded systems, you should not store any version data in your code at all. Version strings are rarely needed by the device run-time and just wind up wasting space. It's up to the integrator of those libraries and hardware components to maintain a BOM anyway, so why waste the space? Plug-in cards for PCs and the like really don't need anything more than a model and build number, which can then be used to look up more human-targeted data.


By the time you write code to unpack your structure into a SemVer string for display, you've burned more bytes than the string would have taken up anyway.

@ojousima
Contributor

As a quick comment, a lot of embedded devices use version numbers as part of their firmware update checks. For example: "Is the software radio compatible with this application? Is this application newer than the one currently installed? If yes to both, update." It's a lot easier to do those checks on integers than on strings.
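
A minimal sketch of such an update check on integer fields might look like the following. Everything here is an assumption for illustration: the fw_version_t type and the fw_* function names are hypothetical, and "compatible" is taken to mean "same major version", which a real device would replace with its own rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical firmware version record; field widths follow the proposal in this thread. */
typedef struct {
	uint32_t major;
	uint16_t minor;
	uint16_t patch;
} fw_version_t;

/* Assumed rule: a candidate is compatible only if the major version is unchanged. */
static bool fw_is_compatible(fw_version_t installed, fw_version_t candidate) {
	return installed.major == candidate.major;
}

/* True if candidate is strictly newer, comparing major, then minor, then patch. */
static bool fw_is_newer(fw_version_t installed, fw_version_t candidate) {
	if (candidate.major != installed.major)
		return candidate.major > installed.major;
	if (candidate.minor != installed.minor)
		return candidate.minor > installed.minor;
	return candidate.patch > installed.patch;
}

/* "If yes to both, update." */
static bool fw_should_update(fw_version_t installed, fw_version_t candidate) {
	return fw_is_compatible(installed, candidate) && fw_is_newer(installed, candidate);
}
```

This only covers the version comparison itself; authenticating the update image is a separate concern.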

@jwdonahue

A lot easier? I think you're ignoring some details, but even if that is so, just embed a build number.

I've worked on several dozen embedded products over the years. Some had half a dozen or more microprocessors scattered across different boards. I never encountered a system where the version strings for those components were more than 3-4 bytes long (often just a hash of the image), and our automation wrote that into ROM/EPROM or Flash. So we generally got by with a short string of characters or a single unsigned integer. SemVer wasn't even on our radar, because it would have been useless and wasteful.

We never embedded code in a device that would decide whether to take an update. Many of the devices I developed had RF modems on them and were used in the construction/maintenance of public infrastructure, aircraft, nuclear reactors, etc., so it would have been foolish to allow such a device to contain any such exploitable logic. Instead, updates were encoded specifically for the version of the image they were to update (in other words, there could be only one valid update image) and the device had to be switched manually into a special mode that ran on a very tiny boot loader that would often have to pull additional bootstrap code onboard to perform the upgrade. We rarely used the RF link for such updates. It usually required a "special cable" (i.e., an expensive, proprietary JTAG or serial port cable).

But had I been working on, say, an internet-connected RF router, I'd have been even more security conscious and devised a scheme that did not browse a list of options to automatically download. I'd arrange for a secure channel to a feed that matched the one and only official update image for my current device state. So while there would be checks to run on the device side, they would not involve simple comparisons of a few integers or ASCII characters; they would involve decrypting the new image and checking that a record contained therein matched the hash code encrypted in my EPROM/Flash.

All of that being said, most devices have embedded binary version data in them. They also often contain code that converts strings to that data format or converts the format to a string. By the time you factor in the extra code for such conversions, say for display purposes or serializing to/from a communications channel, you might as well just store the string. I have found that there are very few cases where performance or storage requirements are a limiting factor when it comes to processing version data. It's rarely done in any critical path because it's just not needed.

On the few products where a boot-time check of the attached boards had to be performed before any of them could destroy something or incur a liability, we just used an 8-, 16- or 32-bit number (usually a hash of hashes) that identified the set of component versions that had been tested together for assembly into that particular device model number. Sometimes we simply designed the boards with an 8- or 16-bit mask, physically wired to a special address. If anything on the "bus" didn't have the correct pattern set when the board was flow-soldered, the bus controller would shut down the product and light an LED.

So my point is that in the places where SemVer is actually useful, there really aren't any memory or performance issues to be concerned about. I know of CI/CD build systems that scale across tens of thousands of machines and just use version strings everywhere. I've made the mistake of parsing those strings into binary fields, running comparisons, etc., and it's usually been a big waste of time. It adds complexity where none is needed and it unnecessarily limits the range of versions that can be tracked.

SemVer uses ASCII strings. The numeric fields in the version triple are rarely more than 4 bytes long. Converting ASCII decimal digits to binary numbers is wasteful in most cases, when you could simply load two, four or eight of them into a register and compare them directly. There's a reason SemVer doesn't allow leading zeros in numeric fields! That way, they can be sorted by ASCII code.
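
One caveat when comparing numeric fields as ASCII: a plain byte-wise comparison puts "10" before "9", so the field lengths have to be compared first. With no leading zeros, the longer field is always the larger number, and fields of equal length sort correctly by byte value. A minimal sketch, where numeric_field_compare is a hypothetical helper name and the inputs are assumed to be already-validated decimal fields:

```c
#include <string.h>

/*
 * Compare two numeric SemVer fields given as decimal ASCII strings with no
 * leading zeros. Returns a negative, zero or positive value, like strcmp.
 * Assumes both inputs contain only the digits '0'..'9'.
 */
static int numeric_field_compare(const char *a, const char *b) {
	const size_t la = strlen(a), lb = strlen(b);
	if (la != lb)
		return la < lb ? -1 : 1; /* more digits means a larger number */
	return memcmp(a, b, la);     /* equal length: byte order matches numeric order */
}
```

This places no upper limit on the size of a field, since nothing is ever converted to a fixed-width integer.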

@ojousima
Contributor

Let us save ourselves the trouble of arguing about best practices. I simply raised the point that a lot of embedded devices do use integers as part of their version number checks, as a counterexample to your feedback. One such example is the MCUboot version check. Security of updates is a separate matter and is usually handled via public-key cryptography.

@jwdonahue

It's a given, but that does not mean that this particular code base should do so. Why use a format that places arbitrary limits on the values of version numbers?
