Skip to content

proposal: Go 2: type annotation ~ for stack allocated variables #37346

@JOT85

Description

@JOT85

Introduction

Within a typical Go program, a lot of heap allocations are made - which comes with significant overhead. One reason for this is interfaces. If an interface takes a reference to something - whether that be an explicit pointer, a slice, or something else - the compiler must assume that it escapes on to the heap.

How much overhead?

Firstly, the benefit of significantly reduced heap allocation is clear when you consider the implications to garbage collection time. If much less is on the heap, less time will be spent performing garbage collection.

To see how much of an overhead heap allocation actually has, I made this quick program:

package main

import (
	"fmt"
	"io"
	"time"
)

//Stealing a trick from the reflect package.
var dummy struct {
	b bool
	x []byte
}

//escape forces p to the heap, without doing much work.
func escapes(p []byte) {
	if dummy.b {
		dummy.x = p
	}
}

//ExampleReader copies as much of Data that it can to the dst on a call to Read.
type ExampleReader struct {
	Data []byte
}

//Read simply copies r.Data to dst
//Don't inline to prevent that from effecting the benchmarks.
//go:noinline
func (r *ExampleReader) Read(dst []byte) (int, error) {
	return copy(dst, r.Data), nil
}

//These functions all do the same: call the read method and check that the last byte is an 'o'
//readFrom directly uses the Read method.
func readFrom(r *ExampleReader) {
	buf := make([]byte, 4096)
	n, _ := r.Read(buf)
	b := buf[n-1]
	if b != 'o' {
		panic("o!")
	}
}

//readFromEscapey is the same as readFrom, except that the buffer escapes so must be heap allocated.
func readFromEscapey(r *ExampleReader) {
	buf := make([]byte, 4096)
	//force buf to be allocated on the heap
	escapes(buf)
	n, _ := r.Read(buf)
	b := buf[n-1]
	if b != 'o' {
		panic("o!")
	}
}

//readFromInterface is the same as readFrom, except that it uses the Read method from an interface, causing buf to escape.
func readFromInterface(r io.Reader) {
	buf := make([]byte, 4096)
	n, _ := r.Read(buf)
	b := buf[n-1]
	if b != 'o' {
		panic("o!")
	}
}

//Benchmark methods call the corresponding function n times.
func BenchmarkReadFrom(r *ExampleReader, n int) {
	for i := 0; i < n; i++ {
		readFrom(r)
	}
}
func BenchmarkReadFromEscapey(r *ExampleReader, n int) {
	for i := 0; i < n; i++ {
		readFromEscapey(r)
	}
}
func BenchmarkReadFromInterface(r io.Reader, n int) {
	for i := 0; i < n; i++ {
		readFromInterface(r)
	}
}

func main() {
	r := ExampleReader{
		Data: []byte("Hello"),
	}

	//Time each of the functions.
	const n = 10000000
	start := time.Now()
	BenchmarkReadFromInterface(&r, n)
	fmt.Println(time.Since(start) / n)
	start = time.Now()
	BenchmarkReadFromEscapey(&r, n)
	fmt.Println(time.Since(start) / n)
	start = time.Now()
	BenchmarkReadFrom(&r, n)
	fmt.Println(time.Since(start) / n)
}

Then the output of gc flag -m:

$ go build -gcflags '-m' test.go
# command-line-arguments
./test.go:16:6: can inline escapes
./test.go:49:9: inlining call to escapes
./test.go:93:13: inlining call to fmt.Println
./test.go:96:13: inlining call to fmt.Println
./test.go:99:13: inlining call to fmt.Println
./test.go:16:14: leaking param: p
./test.go:30:7: (*ExampleReader).Read r does not escape
./test.go:30:30: (*ExampleReader).Read dst does not escape
./test.go:36:15: readFrom r does not escape
./test.go:37:13: readFrom make([]byte, 4096) does not escape
./test.go:46:22: readFromEscapey r does not escape
./test.go:47:13: make([]byte, 4096) escapes to heap
./test.go:58:24: leaking param: r
./test.go:59:13: make([]byte, 4096) escapes to heap
./test.go:68:24: BenchmarkReadFrom r does not escape
./test.go:73:31: BenchmarkReadFromEscapey r does not escape
./test.go:78:33: leaking param: r
./test.go:85:2: moved to heap: r
./test.go:86:15: ([]byte)("Hello") escapes to heap
./test.go:92:29: &r escapes to heap
./test.go:93:32: time.Since(start) / n escapes to heap
./test.go:93:13: main []interface {} literal does not escape
./test.go:93:13: io.Writer(os.Stdout) escapes to heap
./test.go:96:32: time.Since(start) / n escapes to heap
./test.go:96:13: main []interface {} literal does not escape
./test.go:96:13: io.Writer(os.Stdout) escapes to heap
./test.go:99:32: time.Since(start) / n escapes to heap
./test.go:99:13: main []interface {} literal does not escape
./test.go:99:13: io.Writer(os.Stdout) escapes to heap
<autogenerated>:1: (*File).close .this does not escape

The important note here is that the slice in readFrom does not escape, so is stack allocated, and the slice does escape in both readFromEscapey and readFromInterface.

The result of running these benchmarks is:

1.073µs // readFromInterface
1.078µs // readFromEscapey
116ns   // readFrom

As you can see, the interface performs almost identically to the direct call when the slice escapes, so (I think) all the overhead is coming from the heap allocation, not the interface. (Which is amazing!)

Relative, the time difference here is quite large, >9x, however if the work being done was greater, or involved IO, this could end up being quite a small difference. But, there's a lot of interface use without IO where the difference could be significant.

The other, probably more important measure, is the difference to garbage collection impact that having less things allocated on the heap would make. I think that this could make a big difference. I'm not sure what the best method to measure this is though (without creating a test implementation, which I'm willing to attempt if this proposal doesn't get quickly rejected)!

Quickly back to the intro

The primary goal of this proposal is to reduce the number of heap allocations due to this by optionally allowing a functions type signature to contain information about escape analysis.

I've got a couple of ideas, the first one being the key one and the second being a nice feature that could come for free with the first.

Idea 1 - Annotating types with ~ if they won't escape to the heap

The syntax of this proposal may not be ideal. I had the idea first and this is the nicest syntax that I could come up with, any suggestions would be appreciated!

The basic idea, is that you could define a method as:

func (...) Example(x *~Type) { ... }

The ~ in front of the type of x is a way of indicating that pointer x can point to the stack, it's a promise that it won't do anything to x that would require it to be on the heap, for example setting a global variable to it, or setting a struct value to it - anything that would make the reference live longer than the method duration.

If the Example method were defined this way and its implementation did anything which caused escape analysis to determine that x must move to the heap, it would cause a compile time error.

For example, this interface:

interface ExampleInterface {
	Example(*~Type)
}

When using the Example method from this interface, escape analysis would know that calling Example won't do anything that would cause it's argument to need to be moved to the heap.

In the first program I gave, a new interface could be defined, for example:

interface ReaderStack { // That's probably a bad name for it, but the name is a minor detail.
	Read([]~byte) (int, error) // Since it's a slice of ~byte, it means that the bytes in the slice can be on the stack.
}

Then, if the Read method of ExampleReader took []~byte instead of []byte, it would satisfy this interface.

Following this, when the interface is used, the byte slice wouldn't need to be moved to the heap, leading to the better performance and fewer heap allocations as seen in the tests.

If this were used throughout a large project, a lot less would perhaps end up on the heap - again, I'm not sure how much of a benefit this would be but I'd be super interested to find out as I think it could be significant.

To maximise compatibility, a method with the signature Example(*~Type) would also implement the Example(*Type) interface, and a value of *Type could be passed as a value of *~Type - since a heap allocated pointer will still work with code which doesn't cause it to escape. But importantly, a value of *Type cannot be passed as a value of *~Type and a method Example(*Type) cannot implement the Example(*~Type) interface.

This means that implementations of existing interfaces can use ~ notation to give the advantages when such an interface is used, but can also satisfy the old interface.

Idea 2 - Explicitly defining a variable to be of type ~Type.

Idea 1 already brings rules about using types of type ~Type - when a reference is passed as a function argument.

Therefore, it might make sense to allow types to be explicitly declared as on the stack, for example:

var x ~BigStruct = ...

Then, doing anything that would cause x to escape would be a compile time error. This effectively allows a programmer to guarantee that a variable does not become heap allocated if they don't want it to.

This could also work with :=, perhaps with ~:= to prefix the type with a ~ and allocate it on the stack - I'm not convinced by that syntax yet.

More discussion and an existing issue

The idea of checking whether something is stack allocated was brought up in this proposal #34798 for vet so there is a demand for it.

The same issue does apply though; this means that a program being correct is determined by the compilers escape analysis, which would mean that for it to be well defined whether or not a variable escapes would also have to be defined within the language specification. A big step. The spec could be designed in a way which gives compilers a bit of freedom, since escape analysis could be better than what's defined in the spec, allowing for more optimisation, but still error when the specs escape analysis is not met and ~ is used. Plus, a compiler could choose to ignore the ~ notation entirely and just treat interfaces as always escaping and all valid code would still work (a tool to check the specs escape analysis would be required for the compiler to properly error though).

In addition, all changes to the escape analysis would be backwards compatible since if it didn't escape before then it won't now. So improvements can be made without breaking code.

A nice side effect to this, is that if you pass a reference to something to a function with a ~Type, then you know the reference cannot still be around after that function call, meaning that you know it's safe to recycle something - without something unintentionally still having a reference to it.

Proposal template

Would you consider yourself a novice, intermediate, or experienced Go programmer?

I would consider myself to be fairly experienced. I have been using go on an at least weekly basis for about 3 years and have worked on a project in Go which is running at a small scale - but my production Go experience is very minimal.

What other languages do you have experience with?

JavaScript (a lot), Python (a lot), Haskell (a bit less), TypeScript (a little less), C (a fair bit), Scala (a fair bit), Rust (a little), Java (a little) + various languages I've played around with. (That was not in order of experience, not preference!)

Would this change make Go easier or harder to learn, and why?

This idea would add a feature to Go which requires quite a lot of understanding. It would be possible to use Go as it currently is, since it's backwards compatible - and any ~ interfaces work with non-~ types.

The ability to write code without having to think about memory is wonderful. This feature doesn't remove that, since you can happily write a program without having to worry at all about it! You can use libraries that use it to get a free performance boost, but you don't have to define functions that use it yourself unless you are optimising or implementing an interface which uses a ~Type.

So the initial learning experience wouldn't be much different as you don't have to learn the feature to use the language but it is something extra to learn though.

Has this idea, or one like it, been proposed before?

Not this feature (as far as I'm aware), however a related idea has been discussed. See above.

Who does this proposal help, and why?

This proposal helps programmers who are trying to reduce heap allocations - by giving them the freedom to use interfaces and callbacks without forcing heap allocations and a tool to enforce stack allocation. It also benefits anyone who uses APIs with these types in the signatures since less will need to be moved to the heap.

Is this change backward compatible?

Yep!

Show example code before and after the change.

Before:

//ExampleReader copies as much of Data that it can to the dst on a call to Read.
type ExampleReader struct {
	Data []byte
}

//Read simply copies r.Data to dst
func (r *ExampleReader) Read(dst []byte) (int, error) {
	return copy(dst, r.Data), nil
}

After:

//ExampleReader copies as much of Data that it can to the dst on a call to Read.
type ExampleReader struct {
	Data []byte
}

//Read simply copies r.Data to dst
func (r *ExampleReader) Read(dst []~byte) (int, error) {
	return copy(dst, r.Data), nil
}

Before:

interface Mutator {
	DoStuff(*BigStruct) error
}

After:

interface Mutator {
	DoStuff(*~BigStruct) error
}

If all of your Mutator implentations don't let the struct escape to the heap, which they probably shouldn't, then making a call to this interface would no longer force the struct to escape.

Before:

var ThisShouldBeOnTheStack BigStruct = ...

After:

var ThisShouldBeOnTheStack ~BigStruct = ...
// Now the compiler will error, at compile time, if it gets moved to the heap.

What is the cost of this proposal?

  • This would require a change to gopls, since it would need to helpfully point out where an escape analysis rule is broken.
  • It would also require the compiler to check that an annotated variable does not get moved to the heap, but the compiler already does escape analysis, so I don't think this would be very expensive? (I may be incorrect.)
  • This does not require anything to happen at run time, but it does reduce the heap usage, saving run time time!

Can you describe a possible implementation?

(Perhaps a naive description)

The compiler already performs escape analysis to determine what escapes to the heap, a check would simply have to be added after the analysis to ensure that a variable marked as being on the stack is not moved to the heap.

The escape analysis would also have to be modified to understand the meaning of a ~ (stack annotation) in the type signature.

Prototype

I do not have a prototype, however I would be willing to attempt it!

How would the language spec change?

The language spec would have to include the concept of escaping. Which is a loss of abstraction, however an implementation could change the memory management entirely since ~ does not change the functionality of the code.

Orthogonality: how does this change interact or overlap with existing features?

This is a change to the type system. It doesn't change how types are represented, or how the runtime works. It is just an annotation to the type to hint to escape analysis and get the compiler to enforce things for us.

The annotation is similar to that of * in terms of appearance and would be placed on to the type of the thing being referenced.

Is the goal of this change a performance improvement?

Yes

What quantifiable improvement should we expect?

I'm not sure. There is a definite improvement simply by not having to allocate on the heap! See the test in the introduction for numbers. But I'm really not sure what the performance difference would be like in a large scale project; I'm not sure how much of a difference to garbage collection times it would make. But I think, if used extensively, or maybe just a little on things that happen a lot, it could make a large difference.

How would we measure it?

By making a test implementation and trying it on something big!

Does this affect error handling?

No

Is this about generics?

No

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions