Cor: a hobbyist x86_64 kernel

Cor explores how one could build a bare-metal kernel in 2015. It uses the Rust programming language to achieve memory safety.

We find that the complexity of modern CPU architectures doesn't necessarily mean that you can't build nice things yourself.

Non-goals

Speed (though Rust is pretty fast out of the box)
Security (you wouldn't attach this to your network at work)
Production readiness

Synopsis

If you're not running Linux, it's easiest to grab Vagrant and run vagrant up.
Install dependencies: build-essential, qemu, xxd, Ruby, Go, Rust
$ make
$ bin/run to start the system; you'll be connected to the machine's serial console
$ bin/debug to debug the kernel; this runs qemu and drops you into gdb
$ cucumber to run integration/blackbox tests
$ bin/debug_stage1 to debug the bootloader (also see how bin/debug skips it)

Roadmap

In the OSDev ontology, we're likely building a "Nick Stacky" system.

More safety things:

model I/O ports as slices of a 0-dimensional data type
InterruptLocked (see rust forums)

More unicorns:

Page-table-based IPC ("send" a page to another process, zero copy yadda yadda)
SMP support
Smarter kalloc (something like Linux's slab allocator?)
Smarter userspace malloc that allocates contiguous sections for a single task
FS: Journalling
Test on real hardware
Thread-local storage setup for Rustland:

jonasschneider: but figure out some place to put your F-segment, and stick the address in the GDT
jonasschneider: (and do an LGDT and all)
jonasschneider: and mov 0, %fs:0x70 if you're lazy
jonasschneider: or figure out some better place to put your stack

TODOs:

Compile with -O
  - correctly declare inline ASM memory barriers/volatility
Fix relative addressing in boot.s
Redzone thing

Memory map

TODO: Make this entire map part of the linker script

Physical memory map at the time of stage2 startup:

0x01000-0x05FFF: page tables courtesy of boot.s
0x6FFFF: 0x37 (magic)
0x70000-0x7FFFF: stage2's Stack (TODO: guard page)
(I just realized that 0xa0000 - 0xfffff is still free, fuu base16)
After that there are likely some memory holes

Additional virtual mapped memory:

0x0000008000000000-0x0000008000200000: linear map of the lowest 2 MiB of physical memory, starting at physical address 0 (this is where we keep & run the stage2 kernel)
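
The mapping above amounts to adding a fixed offset to a physical address. A minimal sketch of that translation, assuming the upper bound of the range is exclusive; the names KERNEL_WINDOW_BASE, KERNEL_WINDOW_SIZE, phys_to_virt and virt_to_phys are made up for illustration and are not taken from the Cor source:

```rust
/// Virtual base of the window onto low physical memory (from the map above).
const KERNEL_WINDOW_BASE: u64 = 0x0000_0080_0000_0000;
/// Size of the window: 0x200000 bytes (2 MiB). Assumes an exclusive upper bound.
const KERNEL_WINDOW_SIZE: u64 = 0x20_0000;

/// Translate a physical address into the window, or None if it is not mapped there.
fn phys_to_virt(phys: u64) -> Option<u64> {
    if phys < KERNEL_WINDOW_SIZE {
        Some(KERNEL_WINDOW_BASE + phys)
    } else {
        None
    }
}

/// Translate a virtual address inside the window back to its physical address.
fn virt_to_phys(virt: u64) -> Option<u64> {
    let window = KERNEL_WINDOW_BASE..KERNEL_WINDOW_BASE + KERNEL_WINDOW_SIZE;
    if window.contains(&virt) {
        Some(virt - KERNEL_WINDOW_BASE)
    } else {
        None
    }
}

fn main() {
    // Physical 0x6000 (stage2's IDT in the map below) seen through the window.
    let virt = phys_to_virt(0x6000).expect("0x6000 lies inside the window");
    println!("phys 0x6000 -> virt {:#x}", virt);
    assert_eq!(virt_to_phys(virt), Some(0x6000));
}
```

Returning an Option keeps addresses outside the 2 MiB window from being silently translated into unmapped memory.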

Additional physical memory used by stage2:

0x06000-0x06FFF: stage2's IDT (TODO: replace with kalloc)
0x81000: system timer jiffies counter (please don't ask)

All other memory is allocated in mm.c by kalloc, which uses the BIOS memory map provided by boot.s to place things into higher memory (usually, phys >= 0x100000.)
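
To illustrate the idea of placing allocations into higher memory based on a BIOS memory map, here is a rough sketch of a page-granular bump allocator. It is not the code in mm.c: the Region layout, the BumpAllocator type and the 4 KiB page size are assumptions, and the sketch further assumes the map is sorted by base address and non-overlapping.

```rust
/// One entry of a BIOS-style memory map. The field layout is hypothetical.
#[derive(Clone, Copy)]
struct Region {
    base: u64,
    len: u64,
    usable: bool,
}

const HIGH_MEMORY_START: u64 = 0x10_0000; // only hand out phys >= 1 MiB
const PAGE_SIZE: u64 = 0x1000; // assumed 4 KiB pages

/// Round `addr` up to the next page boundary (no overflow handling in this sketch).
fn align_up(addr: u64) -> u64 {
    (addr + PAGE_SIZE - 1) & !(PAGE_SIZE - 1)
}

/// Hands out pages in ascending order and never frees them.
struct BumpAllocator<'a> {
    regions: &'a [Region],
    idx: usize, // region currently being carved up
    next: u64,  // lowest physical address not yet handed out
}

impl<'a> BumpAllocator<'a> {
    fn new(regions: &'a [Region]) -> Self {
        BumpAllocator { regions, idx: 0, next: HIGH_MEMORY_START }
    }

    /// Return the physical address of a fresh page, or None when memory is exhausted.
    fn alloc_page(&mut self) -> Option<u64> {
        while let Some(&r) = self.regions.get(self.idx) {
            if r.usable {
                let start = align_up(self.next.max(r.base));
                let end = r.base.saturating_add(r.len);
                if start.saturating_add(PAGE_SIZE) <= end {
                    self.next = start + PAGE_SIZE;
                    return Some(start);
                }
            }
            // Region is reserved, too low, or full: move on to the next one.
            self.idx += 1;
        }
        None
    }
}

fn main() {
    let map = [
        Region { base: 0x0, len: 0x9fc00, usable: true },      // low memory, skipped
        Region { base: 0xf0000, len: 0x10000, usable: false }, // reserved
        Region { base: 0x10_0000, len: 0x3000, usable: true }, // three usable pages
    ];
    let mut alloc = BumpAllocator::new(&map);
    while let Some(page) = alloc.alloc_page() {
        println!("allocated page at {:#x}", page);
    }
}
```

A bump allocator cannot free memory, so it only stands in for the simplest possible policy here.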

Caveats

As this is an academic project, I'll try to document things I stumbled over.

%ah and %al are the high and low bytes of %ax, not separate registers. That means: don't write something into %ah, then zero out %ax, and expect your value in %ah to still be present.
gdb doesn't handle QEMU architecture switches well. This can bite you when trying to debug the bootloader. I'm not yet sure what exactly breaks, but I've seen different failure modes when switching the CPU into 64-bit mode:
1. gdb 7.6.2 on OS X (Homebrew) crashes after the switch, complaining about g packets. This seems to be a fairly well-known problem. The linked thread also supplies a patch. Applying it leads to symptom #2:
2. Patched gdb 7.6.2 on OS X (Homebrew) doesn't crash after the switch, but still displays the 32-bit registers, with wrong values. This is apparently also known, but is an issue with the gdb remote debugging protocol. (The QEMU monitor still displays the correct register values.)
After debugging these, I realized that Homebrew or the OS X libs might be the culprit. It turns out that under Linux (tested on Ubuntu and Arch), attaching gdb to QEMU's gdbserver port after the switch to 64-bit mode works, but gdb crashes if the switch happens while it is attached. On OS X, by contrast, the g packet crash happens even when attaching gdb after the switch to 64-bit mode.

I'm not yet sure how to properly solve this. So far, the workaround seems to be to (a) run gdb under Linux, and (b) restart it when switching architectures. Meh.
On Yosemite (not sure if relevant), gdb's readline occasionally doesn't play nice with iTerm2. gdb will hang when it asks you a yes/no question: it won't respond to the enter key after you type your answer. This happens both with a Homebrew-installed gdb and over an SSH connection to an Ubuntu VM (via Vagrant). Terminal.app doesn't have this problem.
QEMU has some limited tracing support built in. Running it with something like -d int,pcall,cpu_reset,ioport,unimp,guest_errors will spew various potentially helpful info to stderr. However, debugging generic errors like a General Protection fault still proves nontrivial. Using Homebrew's interactive_shell command in the qemu formula, qemu was patched to include some printf statements in the interrupt-handler code. This affects do_interrupt64 (see target-i386/seg_helper.c in the qemu tree); for an example, see this gist.
info mem in the qemu console will display the virtual memory map.
Memory below 0x10000 cannot, in fact, belong to any segment, since segment 0 is the null segment. This, for some cases, means you can't have things in this low memory. An example seems to be the stack segment register when returning from an interrupt routine.
Should maybe file a bug against QEMU: it doesn't check the CS/SS register contents when (or shortly after) entering protected mode. If you forget to set them up, it'll bite you later.
The red zone thingie? (When interrupted in ring0)
Relocation truncation https://www.technovelty.org/c/relocation-truncated-to-fit-wtf.html
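
The %ax/%ah/%al caveat at the top of this list can be shown with a toy model. This is plain Rust arithmetic on a u16, not inline assembly, and the Ax type is invented for the illustration:

```rust
/// Toy model of the 16-bit AX register: AH and AL are views into the same 16 bits.
#[derive(Debug, Default, Clone, Copy)]
struct Ax(u16);

impl Ax {
    /// Write the high byte (like `mov $v, %ah`), leaving AL untouched.
    fn set_ah(&mut self, v: u8) {
        self.0 = (self.0 & 0x00ff) | ((v as u16) << 8);
    }

    /// Write the low byte (like `mov $v, %al`), leaving AH untouched.
    fn set_al(&mut self, v: u8) {
        self.0 = (self.0 & 0xff00) | v as u16;
    }

    /// Write all 16 bits (like `mov $v, %ax`), overwriting both AH and AL.
    fn set_ax(&mut self, v: u16) {
        self.0 = v;
    }

    fn ah(self) -> u8 {
        (self.0 >> 8) as u8
    }

    fn al(self) -> u8 {
        self.0 as u8
    }
}

fn main() {
    let mut ax = Ax::default();
    ax.set_ah(0x42);
    ax.set_al(0x07);
    assert_eq!(ax.0, 0x4207);

    // Zeroing the full register also wipes the value stored in AH.
    ax.set_ax(0);
    assert_eq!(ax.ah(), 0);
    assert_eq!(ax.al(), 0);
    println!("after zeroing ax: ah = {:#04x}, al = {:#04x}", ax.ah(), ax.al());
}
```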

To investigate: ELF sizes --

SECTIONS
{
  . = 0x10000;
  .text : { *(.text) }
  . = 0x8000000;
  .data : { *(.data) }
  .bss : { *(.bss) }
}

produces a tiny ELF file, while swapping the two addresses gives a huge one

Design goal should probably be "as little resident/permanent state in C-land as possible", given entropy and all that
Continuity: The user space perspective is "do a syscall, then later return from the syscall", while the kernel has a completely different view.
Context-switching idea: make the kernel-level scheduling, yielding, parking etc. independent of the trampoline/userspace/syscall/interrupt logic. It looks like they are orthogonal problems, at least when approached naively. For maximum performance, it's probably faster to mix everything.

The Story

I don't have a history with writing anything low-level. I usually write Ruby or other dynamic languages with GC, and never really cared about what actually went down inside the computer. UNIX syscalls were my primitive instructions. gdb always scared me with its pointers, and how it could crash my entire process so easily. Finding .s files in a project repo was always a good sign for me to avoid touching it with a 10ft pole.

Takeaway: For high-level developers, the scare factor of low-level assembly programming might be so high because it's combined with the great complexity of a modern OS. If you take one of the factors away, you're back in a fairly comfortable zone; usually, you take away the low-level factor and deal with the complexity. It turns out that taking away the complexity works just as well. (Difficulty = Complexity x Scope)

Bibliography

http://wiki.osdev.org/Memory_Map_(x86)
Intel manual (TODO)
AMD manual (TODO)
http://idak.gop.edu.tr/esmeray/UnderStandingKernel.pdf

Lessons learned

Ownership is a powerful concept of resource management. Case study: CPU I/O ports.
1. If you are able to access a port, nobody else can (=unique owner)
2. You can temporarily give somebody else access, but during that time, you don't have access yourself (=borrowing)
3. You can give away access to a part of a port (=slice splitting)
Linux ops structs do dynamic dispatch much like vtables
Rust is great at moving on the ladder of abstraction. (Generics/Traits vs inline asm)
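
The I/O port case study above maps fairly directly onto Rust's ownership rules. The following is a sketch, not Cor's actual port API: PortRange, claim, split_at, write and init_uart are hypothetical names, and write only reports which port it would touch instead of executing an `out` instruction.

```rust
/// A contiguous range of I/O port numbers.
/// It is neither Clone nor Copy, so each range has exactly one owner.
#[derive(Debug)]
struct PortRange {
    base: u16,
    len: u16,
}

impl PortRange {
    /// Claim a range. A real kernel would have to guarantee (probably via `unsafe`)
    /// that no other PortRange overlaps this one.
    fn claim(base: u16, len: u16) -> Self {
        PortRange { base, len }
    }

    /// Slice splitting: consume the range and return two disjoint halves.
    fn split_at(self, mid: u16) -> (PortRange, PortRange) {
        assert!(mid <= self.len, "split point outside the range");
        (
            PortRange { base: self.base, len: mid },
            // wrapping_add only matters for an empty upper half at the top of port space.
            PortRange { base: self.base.wrapping_add(mid), len: self.len - mid },
        )
    }

    /// Writing needs `&mut self`, i.e. exclusive access for the duration of the call.
    /// Returns the (port, value) pair a real implementation would hand to `out`.
    fn write(&mut self, offset: u16, value: u8) -> (u16, u8) {
        assert!(offset < self.len, "offset outside the range");
        (self.base + offset, value)
    }
}

/// Borrowing: the caller lends the ports out and cannot use them until this returns.
fn init_uart(ports: &mut PortRange) -> (u16, u8) {
    ports.write(1, 0x00)
}

fn main() {
    // 0x3f8 is the conventional COM1 base; eight ports follow it.
    let mut com1 = PortRange::claim(0x3f8, 8);
    let (port, value) = init_uart(&mut com1);
    println!("would write {:#04x} to port {:#x}", value, port);

    // Give away part of the range. `com1` is moved here, so any later use of it
    // is rejected at compile time.
    let (data, control) = com1.split_at(1);
    println!("data = {:?}, control = {:?}", data, control);
}
```

The three numbered points correspond to the missing Clone/Copy impls (unique owner), the `&mut` parameter of init_uart (borrowing), and split_at taking `self` by value (slice splitting).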
