proposal: spec: immutable data #37303

@embeddedgo

This issue describes a language feature proposal for immutable data.

There are more general proposals for Go 2 that postulate changes to the language type system:

Support read-only and practical immutable values in Go

Read-only and immutability

Immutable type qualifier

This proposal isn't as general as the ones mentioned above. It focuses only on data embedded in the code, as in the example below (taken from the unicode package):

var _C = &RangeTable{
        R16: []Range16{
                {0x0000, 0x001f, 1},
                {0x007f, 0x009f, 1},
                {0x00ad, 0x0600, 1363},
                {0x0601, 0x0605, 1},
                {0x061c, 0x06dd, 193},
                {0x070f, 0x08e2, 467},
                {0x180e, 0x200b, 2045},
                {0x200c, 0x200f, 1},
                {0x202a, 0x202e, 1},
                {0x2060, 0x2064, 1},
                {0x2066, 0x206f, 1},
                {0xd800, 0xf8ff, 1},
                {0xfeff, 0xfff9, 250},
                {0xfffa, 0xfffb, 1},
        },
        R32: []Range32{
                {0x110bd, 0x110cd, 16},
                {0x1bca0, 0x1bca3, 1},
                {0x1d173, 0x1d17a, 1},
                {0xe0001, 0xe0020, 31},
                {0xe0021, 0xe007f, 1},
                {0xf0000, 0xffffd, 1},
                {0x100000, 0x10fffd, 1},
        },
        LatinOffset: 2,
}

The problems this proposal tries to solve

  1. If a package exports some data (explicitly or implicitly) that is intended to be immutable, the current language specification/implementation offers no way to ensure immutability or to detect that some faulty code changes the exported data.

  2. In microcontroller-based embedded systems, mutable data is copied from Flash to RAM at system startup. Such systems have very little RAM because the immutable parts of the program (text and read-only data) are intended to be executed/read by the CPU directly from Flash. The current language implementation offers no way to leave immutable data in Flash, so the available RAM overflows very quickly as you import more packages.
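The first problem can be illustrated with a minimal sketch. The types below are simplified stand-ins for the ones in the unicode package, and Table and corrupt are hypothetical names used only for this illustration:

```go
package main

import "fmt"

// Range16 and RangeTable are simplified stand-ins for the unicode
// package types, declared here only to keep the sketch self-contained.
type Range16 struct {
	Lo, Hi, Stride uint16
}

type RangeTable struct {
	R16 []Range16
}

// Table is meant to be read-only, but the language cannot express that.
var Table = &RangeTable{
	R16: []Range16{
		{0x0000, 0x001f, 1},
		{0x007f, 0x009f, 1},
	},
}

// corrupt stands in for faulty code in some other package. It compiles
// and runs without any diagnostic.
func corrupt(t *RangeTable) {
	t.R16[0].Hi = 0xffff
}

func main() {
	fmt.Printf("before: %#x\n", Table.R16[0].Hi)
	corrupt(Table)
	fmt.Printf("after:  %#x\n", Table.R16[0].Hi)
}
```

Nothing in the current toolchain rejects or traps the write in corrupt, so every later user of Table silently sees the modified range.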

Language changes

This proposal doesn't require changes to the language specification. It can be implemented by adding a new compiler directive, as in the example below:

//go:immutable
var _C = &RangeTable1{
        R32: []Range32{
                {0x0000, 0x001f, 1},
        },
}

Edit: Another syntax has been proposed that requires a change to the language specification:

const var _C = &RangeTable1{
        R32: []Range32{
                {0x0000, 0x001f, 1},
        },
}

Unlike const x = 2, const var y = 2 allows taking the address of y.

Implementation

The go:immutable directive should make the variable and any composite literals used to construct it immutable. The compiler should return an error if the data on the right-hand side cannot be generated at compile time. Immutable data should be placed in the .rodata section.

The go:immutable directive can be documented as a hint directive that may or may not be implemented by the compiler, the hardware or the operating system.

An immutability violation is detected at runtime and causes the program to abort. The detection relies on the operating system, which usually uses read-only pages for read-only sections. In embedded systems the immutability violation can be detected by hardware, which generates an exception.

Design decision argumentation

Using a compiler directive instead of a new keyword or a combination of existing keywords such as const var has the advantage that it doesn't change the language specification. If a more general approach to immutability is developed later, the directive can easily be removed from the compiler.

Tests

I've done some tests simulating the go:immutable directive at the linker level by adding the following code to the Link.dodata function:

for _, s := range ctxt.Syms.Allsym {
        if strings.HasPrefix(s.Name, "unicode..stmp_") {
                s.Type = sym.SRODATA
        }
}

It moves to the .rodata section all "static temp" symbols from the unicode package, which correspond mainly to the composite literals used to initialize global variables. The impact on the code generated for a simple Hello, World! program:

package main

import "fmt"

func main() {
	fmt.Println("Hello, World!")
}

is as follows:

before:

   text    data     bss     dec     hex filename
 883610   58172   11128  952910   e8a4e helloworld.elf

after:

   text    data     bss     dec     hex filename
 931847    9700   11128  952675   e8963 helloworld.elf

As you can see, about 48 KB has been moved from the data segment to the text segment, all of it from the unicode package alone. This isn't impressive from the point of view of an OS-capable system, but it's a game changer for microcontroller-based embedded systems, which rarely have more than 256 KB of RAM.
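The "about 48 KB" figure can be rechecked from the two size listings above. The small sketch below only redoes that arithmetic; the sizes type and the helper names are hypothetical:

```go
package main

import "fmt"

// sizes holds the text, data and bss columns printed by the size tool,
// in bytes.
type sizes struct {
	text, data, bss int
}

// dataFreed returns how many bytes left the data segment.
func dataFreed(before, after sizes) int {
	return before.data - after.data
}

// textGrowth returns how many bytes the text segment gained.
func textGrowth(before, after sizes) int {
	return after.text - before.text
}

func main() {
	// Values copied from the size listings above.
	before := sizes{text: 883610, data: 58172, bss: 11128}
	after := sizes{text: 931847, data: 9700, bss: 11128}

	fmt.Println("bytes removed from data:", dataFreed(before, after))
	fmt.Println("bytes added to text:    ", textGrowth(before, after))
}
```

The data segment shrinks by 48472 bytes and the text segment grows by 48237 bytes, which matches the 235-byte drop in the dec column (952910 to 952675).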

Impact on the existing code

Introducing go:immutable directives for immutable data in the standard library and other packages shouldn't affect correct code in any way. Faulty code that writes to such data may stop working.

Additional explanation

See the additional explanation below, which is also an example of using const var instead of //go:immutable.
