thoughts on moving to prime-time #13

Open
zboldyga opened this issue Jan 23, 2020 · 26 comments

@zboldyga

zboldyga commented Jan 23, 2020

@KillingSpark hey there! Saw your post on Reddit.

I'm looking for an open source, systems-type side project to work on this year. I was considering a port of PM2 (Node.js) to Rust, for several reasons: PM2 has a good UX and a large user base, but JavaScript is slow, garbage-collected with a risk of memory leaks, and insecure.

But ultimately PM2 just leverages whatever daemon manager is available on the OS. And while rewriting that layer with speed and memory safety has benefits, I wonder whether it makes sense to cut out the middle layer and make a user-friendly service manager along the lines of systemd.

That's when I encountered your project. I think the premise is great! But I'll be honest, my experience with a lot of the underlying principles of a modern service manager is limited. I am reading a lot of opinions that systemd has far too many features and too much code, and as I dug deeper, a lot of the features do seem irrelevant to a basic service manager. But I think I'd need to learn a lot more to fully assess an appropriate design.

That said, it would be great to have your advice on a few points:

  1. Given your current progress, are you thinking that you want to expand this to a battle-hardened production system for service management? e.g. something that works on modern hardware supported by rustc, and some dominant Linux distributions? Or are you still viewing it as a toy project? (I don't mean that in an offensive way)

  2. Based on what you learned, what % complete do you think this is in comparison with a bare-bones service manager that could drop-in for basic usage of systemd?

  3. Why no cgroups? Admittedly, I don't fully understand the implications of this choice, nor the alternatives. But you seemed to get a good amount of backlash about that feature from the community.

  4. Any other major design decisions or forks-in-the-road you faced, or see ahead?

@KillingSpark
Owner

KillingSpark commented Jan 24, 2020

Great to see that the reddit posts reach people that are interested in rustysd :)
To answer your questions:

  1. I would be glad to donate this project to anyone who has these intentions. I myself don't think that I have the necessary experience/connections/time/energy to maintain a project on that scale. For now it is a toy project. (This might change if I ever feel like these counter-arguments don't apply anymore).

  2. Rustysd should be like 80%-ish there. The biggest thing missing to be a drop-in is probably the templating of units (unit files with an @ in their name). Otherwise there are some minor things like getting all the directories systemd uses etc. I tried to boot a Debian once but rustysd panicked somewhere, and debugging that is rather hard, so I stopped looking into it as there were more interesting things to work on. Other than that it's mostly code quality improvements, so that these failures are handled gracefully instead of causing panics.
    If your plan was to just run services and not replace your init system, rustysd is 95%-ish there I guess? There are probably some bugs lingering, and I am not sure if all possible deadlocks have been eliminated. Before I add any more features I should really write more unit tests and a test scenario with more than two services.

  3. It didn't seem really necessary at first since it's not really required to get a 'working' prototype. I have since added cgroup support for the Linux target, but rustysd still works on other platforms. It just doesn't try to move processes into cgroups on these. Since all I am using cgroups for is tracking processes, the usage is fairly simple (have a look at the platform/cgroups module if you are interested).

  4. I guess designing the UI for the systemctl equivalent rsdctl would be pretty important. Otherwise I am thinking about offering an alternative service-definition file format that more explicitly ties services and sockets together, which should be designed carefully.
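
To make the cgroup answer in point 3 a bit more concrete: on Linux, a cgroup directory exposes a `cgroup.procs` file listing the PIDs it contains, one per line, so "tracking processes" can amount to reading that file. The following is a minimal sketch under that assumption, not the code in rustysd's platform/cgroups module; the function names and the idea of one cgroup directory per service are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Parse the text of a `cgroup.procs` file: one decimal PID per line.
/// Blank or malformed lines are skipped rather than treated as errors.
fn parse_cgroup_procs(contents: &str) -> Vec<u32> {
    contents
        .lines()
        .filter_map(|line| line.trim().parse::<u32>().ok())
        .collect()
}

/// Read the PIDs currently inside the cgroup at `cgroup_dir`.
/// Assumption: each service gets its own cgroup directory.
fn pids_in_cgroup(cgroup_dir: &Path) -> io::Result<Vec<u32>> {
    let contents = fs::read_to_string(cgroup_dir.join("cgroup.procs"))?;
    Ok(parse_cgroup_procs(&contents))
}

/// A service counts as fully stopped once its cgroup holds no processes,
/// even if the main process forked children before exiting.
fn service_is_gone(pids: &[u32]) -> bool {
    pids.is_empty()
}
```

The benefit over tracking only the main PID is that forked children stay visible to the service manager for as long as they remain in the cgroup.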

@zboldyga
Author

Very interesting, thanks for the detailed breakdown!

I will review the code in more detail and also do a deeper dive into the inner workings of similar tools. My intuition is that the hop from its current state to a hardened production tool for the masses is probably quite large, but it's great to hear that you think it's as far along as 80% for a basic service management use case.

Since the competition is pretty tough, I would think the tool would need to be very well designed in order to compel developers to switch from their current service manager. Maybe as a next step I can understand the code and draw up a design document to illustrate the major design decisions behind the tool. Maybe we can discuss the major decisions to see what options there are, and the pros and cons of your current approaches... If we align on enough things, I can fork and start writing unit tests to tidy things up!

@cdbattags

I'd love to potentially use this in tandem with https://github.com/linuxkit/linuxkit.

Wouldn't be too hard because most usages so far in Justin's examples are using sysctl/getty and we'd probably be able to swap out almost 1 for 1.

IMO this would be the perfect greenfield exercise for using something like rustysd.

Thoughts, @justincormack?

@KillingSpark
Owner

@zboldyga

> I will review the code in more detail and also do a deeper dive into the inner workings of similar tools. My intuition is that the hop from its current state to a hardened production tool for the masses is probably quite large, but it's great to hear that you think it's as far along as 80% for a basic service management use case.

I should have said that those percentages meant "feature-wise 80% ready". In my experience, debugging and cleaning up can take a considerable amount of time. And I agree, for large numbers of people to use this there is probably a lot to be done.

> Since the competition is pretty tough, I would think the tool would need to be very well designed in order to compel developers to switch from their current service manager. Maybe as a next step I can understand the code and draw up a design document to illustrate the major design decisions behind the tool. Maybe we can discuss the major decisions to see what options there are, and the pros and cons of your current approaches... If we align on enough things, I can fork and start writing unit tests to tidy things up!

I would be glad to discuss any design changes/improvements! If you have questions about the current design don't hesitate to contact me. I have no concrete list of design decisions I want to change myself, but there is one I know I don't like: there are some hacky things around socket activation where units can 'ignore' activation until a later point. This might be one of the places where a redesign is much needed.
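
For readers unfamiliar with socket activation: in the systemd convention, the manager opens the listening sockets and hands them to the service as inherited file descriptors starting at 3, announcing them through the `LISTEN_PID` and `LISTEN_FDS` environment variables. The sketch below shows only the service side of that handoff, assuming a drop-in replacement honors the same convention. It is not rustysd's code, the function names are made up for illustration, and it does not model the 'ignore until later' behavior mentioned above.

```rust
use std::env;
use std::process;

/// First descriptor passed to an activated service; 0 to 2 are stdio.
const LISTEN_FDS_START: i32 = 3;

/// Decide which inherited descriptors belong to this process, given the
/// raw values of the `LISTEN_PID` and `LISTEN_FDS` variables.
fn inherited_listen_fds(
    own_pid: u32,
    listen_pid: Option<&str>,
    listen_fds: Option<&str>,
) -> Vec<i32> {
    // The variables only apply if LISTEN_PID names exactly this process.
    let pid_matches = listen_pid
        .and_then(|p| p.parse::<u32>().ok())
        .map_or(false, |p| p == own_pid);
    if !pid_matches {
        return Vec::new();
    }
    // A missing or malformed count is treated as "no sockets passed".
    let count = listen_fds
        .and_then(|n| n.parse::<u16>().ok())
        .unwrap_or(0);
    (LISTEN_FDS_START..LISTEN_FDS_START + i32::from(count)).collect()
}

/// Convenience wrapper that reads the real environment of this process.
fn listen_fds_from_env() -> Vec<i32> {
    let pid = env::var("LISTEN_PID").ok();
    let fds = env::var("LISTEN_FDS").ok();
    inherited_listen_fds(process::id(), pid.as_deref(), fds.as_deref())
}
```

Keeping the decision logic in a pure function makes it easy to unit test without touching real sockets or environment variables.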

@KillingSpark
Owner

@cdbattags I would be interested in how this experiment plays out if you go ahead with it!

@zboldyga
Author

@KillingSpark Sweet, I'll get to work soon :)

@KillingSpark
Owner

@zboldyga Hey just pinging if this is still active :)

FYI I started documenting inner mechanisms in /doc/PidHandling.md. If you take notes that could be useful to others trying to understand rustysd's inner workings I would love to include those in similar documents.

@justincormack

@cdbattags really interested in experimentation around this in LinuxKit. I would love to remove all C code from the base system... lots of pieces to work on to get closer to that.

@cdbattags

cdbattags commented Feb 12, 2020

@justincormack I believe @pwFoo has made the most progress so far. See #15.

@pwFoo do you mind open sourcing your Dockerfile for https://hub.docker.com/r/pwfoo/rustysd?

The one that goes along with:

kernel:
  image: linuxkit/kernel:4.19.99
  cmdline: "console=tty0 console=ttyS0 console=ttyAMA0"
init:
  - pwfoo/rustysd:latest

@zboldyga
Author

@KillingSpark Yup!

The biggest design decision I'm seeing, possibly the only thing that would really be a fork-in-the-road, is the user interface for the tool.

Lots of people have flocked to PM2 because it's simple, it provides a bundle of the functionalities that general application developers need, and it has cross-platform support. Some specific features that are perks from a usability standpoint:

  • All services are loaded from a processes.json file in userland (this isn't the only way to load them, but it's the recommended way). It's common to include these files in code repositories, which is nice from an application-development perspective.

  • Logs are managed intuitively - stored to a hierarchy of directories, easy to enforce size/time limits/rotation, and the CLI API allows viewing logs.

  • Basic 'top'-esque stats for the processes, accessible via the CLI API.

  • Ability to set startup rules, hot-reload rules, watching, etc.

An advanced feature: deals with clustering / forking.

It's a one-stop shop, and it's easy to use.

What I don't want to do is build another tool that application developers feel like they can't use directly. I'm not sure how much value this would really provide.

I would wager that users of PM2 who switch to a rust-based tool are going to get immediate, tangible benefits: faster log access, faster/better real-time monitoring. There's also the benefit that it's a lot less likely to have bugs in the future than untyped javascript.

How do you feel about mimicking the pm2 user interface (some of the CLI, a config-file for loading)? There is probably room for improvement as well - but aiming for similarity means existing users can switch easily.

@zboldyga
Author

@justincormack It's also great to see that you have a need and a possible greenfield exercise. Would a PM2-type interface be acceptable for your use case?

@KillingSpark
Owner

@zboldyga I am not opposed to having a pm2 style interface. The inner model of rustysd should allow for having both a systemctl and a pm2 interface side by side.

I am not sure how easy it will be to add different service description possibilities, but I don't see any technical reasons not to do so.

There is the problem that this could result in a lot of complexity, which isn't what I was aiming for, but arguably it is not in the core parts and should be able to be hidden behind feature flags sooo...
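
One way to keep a second service description format out of the core would be to translate it into unit file text and feed that to the loader that already exists. A rough sketch of that idea follows. The `AppSpec` struct and its fields are hypothetical, only loosely inspired by an entry in a PM2 process file, and whether rustysd accepts exactly these directives is an assumption.

```rust
/// Hypothetical application-level service description, loosely modeled on
/// one entry of a PM2 process file. The field names are illustrative only.
struct AppSpec {
    name: String,
    command: String,
    restart_always: bool,
}

/// Render the spec as systemd-style unit file text, so that a second
/// front end can reuse the existing unit loader instead of duplicating it.
fn to_unit_file(app: &AppSpec) -> String {
    let restart = if app.restart_always { "always" } else { "no" };
    format!(
        "[Unit]\nDescription={}\n\n[Service]\nExecStart={}\nRestart={}\n",
        app.name, app.command, restart
    )
}

/// File name under which the generated unit would be stored.
fn unit_file_name(app: &AppSpec) -> String {
    format!("{}.service", app.name)
}
```

With this shape, the alternative format is a thin translation step that could sit behind a feature flag, and the core only ever sees ordinary units.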

@pwFoo

pwFoo commented Feb 12, 2020

@justincormack @cdbattags
I replaced all the linuxkit images with custom ones (rustysd, crun, rngd, mdevd, ...). I use an older Alpine vanilla kernel because of missing nf_conntrack with kernel >= 4.19, and moved to cgroupv2 (just for testing...).

At the moment it boots to a shell in QEMU and on a Dell notebook. After the crun fix is merged I will try to move from the simple shell init script and /bin/sh to rustysd-controlled crun containers.

Quick and dirty repo called "DenglerOS"
https://github.com/pwFoo/DenglerOS

The goal is a minimal container-based Linux with a GUI (fluxbox, chromium, ...) running in docker / podman / crun containers. Similar to RancherOS, but focused on desktop use (but just a fun project at the moment!!!).

Rustysd / init part:
https://github.com/pwFoo/DenglerOS/tree/master/packages/init/rustysd

@zboldyga
Author

@KillingSpark I didn't consider that option, but that sounds like a great idea (two interfaces).

I'm also in agreement about managing the complexity / aiming for simplicity. That's precisely where systemd went wrong: trying to do too many features. So perhaps we start simple, then see what users think, and add features if needed.

I'm willing to set aside 10 hours between now and March 20th, e.g. 1 hour around lunch time, twice per week. I think that will be enough to write some test cases, raise questions around certain internal designs, provide documentation, possibly some small code adjustments. I'll start on Friday, will probably fork then and send you some questions.

@KillingSpark
Owner

@zboldyga That's cool! No pressure though, I am glad for whatever you can offer!

@KillingSpark
Owner

@zboldyga before you start designing a pm2 style interface to rustysd, you might want to make a concrete comparison of what pm2 offers over what a systemctl style interface would offer. I'd imagine many commands have equivalents, and duplicating code (and adding more chances to introduce bugs) just to satisfy multiple control interfaces should be avoided where possible.

I might be totally off here; I have no experience with pm2, so take this with a grain of salt.
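
A first pass at such a comparison could be as simple as a lookup from pm2 subcommands to their nearest systemctl verb, which makes the gaps explicit. The pairing below is a rough judgment call rather than an authoritative mapping, and the function names are invented for this sketch.

```rust
/// Rough systemctl counterpart for a pm2 subcommand, or `None` where no
/// direct equivalent is assumed. The pairing is a judgment call.
fn systemctl_equivalent(pm2_cmd: &str) -> Option<&'static str> {
    match pm2_cmd {
        "start" => Some("start"),
        "stop" => Some("stop"),
        "restart" => Some("restart"),
        "list" | "ls" => Some("list-units"),
        "describe" | "show" => Some("status"),
        // Log viewing and the live dashboard have no systemctl verb;
        // systemd handles logs through journalctl instead.
        "logs" | "monit" => None,
        _ => None,
    }
}

/// Return the pm2 subcommands that would need genuinely new code.
fn uncovered<'a>(pm2_cmds: &[&'a str]) -> Vec<&'a str> {
    pm2_cmds
        .iter()
        .copied()
        .filter(|cmd| systemctl_equivalent(cmd).is_none())
        .collect()
}
```

Anything `uncovered` returns is where a pm2 style front end would add real functionality; everything else could be a thin alias over the existing control interface.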

@cdbattags

When the pm2 "feature" was suggested I thought of it as an abstraction layer that'd more or less reuse what we have so far. I think we can get away with not adding much more for it.

@zboldyga
Author

Right, I'm all for adding as little code as possible. I just want to remove the need for wrapper code written in a higher-level language.

I won't add any of those bits yet. For now just going to write unit tests and ensure the foundation is solid.

@zboldyga
Author

@KillingSpark I misread one of your earlier comments, about having a systemctl and pm2 interface side by side. I thought you were saying a systemd and pm2 interface side by side.

I get having a very low-level interface. Advanced users will always want granular functionality.

For a quick, user-friendly interface, systemctl and pm2 are mostly the same. I think PM2 is a bit more usable for those higher up the software stack. systemctl manages systemd 'units' which are quite abstract, whereas in this case we are only managing services. So a tool that only manages services can provide commands that are closer to what application developers will expect out-of-box. e.g. pm2 monitoring is very quick and intuitive, whereas systemctl probably won't give you what you want without some tinkering.

I also think the application-level process file is an important part of PM2 that makes usability simple, e.g. loading that whole file into the service manager with one command. I'm not sure if systemctl supports that. But it would be great to load a whole configuration from a simple config file.

@zboldyga
Author

If you want to look at what PM2 is doing, the code is quite short and is mostly just wrapper if statements: https://github.com/Unitech/pm2/tree/master/lib/API . But in some cases there is extra logic that is quite handy for user experience.

@KillingSpark
Owner

Ohh! I thought pm2 was actually a service manager itself, I should have looked that up before writing anything, sorry about that. Yeah having pm2 support would probably be handy to people :)

@zboldyga
Author

Sweet. Yeah, and I'm totally cool with deviating from PM2's design choices if something is quirky or convoluted. We could e.g. end up with a blend of systemctl and pm2. I imagine PM2 is less well thought-out than its lower-level counterparts.

@KillingSpark
Owner

Additionally we could support all systemctl commands pm2 uses and have that tool just work in the rustysd ecosystem. I will look into how feasible this is. But having a sane user-friendly interface baked into rustysd/rsdctl would be pretty cool in any case!

@zboldyga
Author

@KillingSpark Also just a lingering thought - I'm wondering if the package name is appropriate for real-world usage. On one hand, using rust in the name is a common trend at the moment and a quick way to get the support of the rust community. But some downsides:

  • Most users won't know rust, and likely won't even know the benefits of rust over e.g. C/C++. I would think some people will shy away from the tool if it appears to revolve around a language they don't use.

  • If the tool is a service manager, then I would imagine naming it with something revolving around processes, services, daemons, etc. would be straightforward.

  • The C/C++ community is giving a lot of pushback around the idea of writing things in Rust just because it's memory-safe. Memory-safe != bug-free code, and tools like systemd have a long history and are hardened. I tend to agree, somewhat.

But of course, associating with Rust makes the tool look shiny and new, which could be important for getting traction. Maybe we can think about this and revisit when we're farther along?

Opening up some code now!

@KillingSpark
Owner

I guess that's true. But I am not that creative when it comes to naming projects. My counter argument would be: if someone bases their opinion about a project on its name, that's their problem.

I get that the C/C++ community sees a lot of the work done in the rust community as reinventing the wheel, which it sometimes is, but rustysd could bring real benefits to the table. I think the readme states that clearly enough so anyone with at least a bit of interest should be able to see past the somewhat uninspired name.

If you have any cool names that convince me to rename, I am open to suggestions, but I don't think changing the name just to drop the 'rust' part is necessary.

@KillingSpark
Owner

@zboldyga If you are interested: I started to compile a document containing ideas on how rustysd should be redesigned, to ensure that there is less technical debt before working on additional features :)

See the tracking issue
