Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental linking of spec segments #1204

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from
Draft

Conversation

Thar0
Copy link
Contributor

@Thar0 Thar0 commented Apr 24, 2022

Currently, all object files in all segments would be linked all at once in one call to ld. This PR adds incremental linking (with ld -r) by spec segment to the build process. Now all segments as defined in the spec are linked first, and then all of those are linked together in the final link step to produce zelda_ocarina_mq_dbg.elf.
Relocations are generated from the incrementally linked segment files rather than individual objects, for segments prefixed with ovl_. They no longer need to be manually written into the spec. Having a link step in between compilation and reloc generation gives us more control over the kinds of sections that fado sees, making it more robust when dealing with GCC compiled objects so the options passed to GCC need not be as restrictive.
This builds marginally slower from clean, however builds that only touch a subset of files are faster.

Thanks to glank for sorting out the incremental linking integration, especially the makefile dependencies

Copy link
Contributor

@glankk glankk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, this makes the build process and dependency management much more precise. I've done some nitpicking for your enjoyment.

"\t.debug_pubtypes 0 : { *(.debug_pubtypes) }\n"
"\t.debug_ranges 0 : { *(.debug_ranges) }\n"
"\n"
/* DWARF extensions */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are DWARF 5 sections.


build/$(SPEC): $(SPEC)
$(CPP) $(CPPFLAGS) $< > $@
ifeq ($(MAKECMDGOALS),$(filter-out clean assetclean distclean setup,$(MAKECMDGOALS)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An explanatory comment could be useful here. This conditional is to prevent the generation of segment dependency files for targets that don't touch any of the segment objects. In the case of cleaning targets, omitting this would cause these files to be generated just to be immediately removed again. Unfortunately I don't know of a better way to solve this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also causes weird issues when using exotic make calls like make assetclean all, but it's probably not worth worrying about.

Having a different makefile for the "clean" targets, or just a script given that it's just doing rms, sounds like a good fix 😄 but, as long as it works as is...


build/ldscript.txt: build/$(SPEC)
$(MKLDSCRIPT) $< $@
$(SEGMENT_DIR)/%.reloc.o: $(SEGMENT_DIR)/%.o
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably split this into two recipes unless it negatively affects the build time.

@@ -165,11 +165,17 @@ O_FILES := $(foreach f,$(S_FILES:.s=.o),build/$f) \
$(foreach f,$(C_FILES:.c=.o),build/$f) \
$(foreach f,$(wildcard baserom/*),build/$f.o)

OVL_RELOC_FILES := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '[^"]*_reloc.o' )
SEGMENTS := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '^[ \t]*name[ \t]\+".\+"' | sed 's/.*"\(.*\)".*/\1/g' )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SEGMENTS := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '^[ \t]*name[ \t]\+".\+"' | sed 's/.*"\(.*\)".*/\1/g' )
SEGMENTS := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '^[ \t]*name[ \t]\+".\+"' | sed 's/.*"\(.*\)".*/\1/g')

Copy link
Collaborator

@Dragorn421 Dragorn421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glankk this is rookie nitpicking, get on my level 😏

build/ldscript.txt is incredibly more readable now, great work. I'll try to do timings later for the performance side of this.
And I didn't realize reading the makefile but looking in build/segments/ we now have separate linker maps for every segment 🤩

I think tools/reloc_prereq can be removed (or repurposed)

For anyone interested in reviewing, you may need the make manual 😄
https://www.gnu.org/software/make/manual/make.html
(unless you're a pro already 😎, I'm not)

@@ -165,11 +165,17 @@ O_FILES := $(foreach f,$(S_FILES:.s=.o),build/$f) \
$(foreach f,$(C_FILES:.c=.o),build/$f) \
$(foreach f,$(wildcard baserom/*),build/$f.o)

OVL_RELOC_FILES := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '[^"]*_reloc.o' )
SEGMENTS := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '^[ \t]*name[ \t]\+".\+"' | sed 's/.*"\(.*\)".*/\1/g' )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this because it assumes things about the not-well-defined spec syntax, which makes it look really hacky.
But as I said before I personally don't even like the current spec syntax in the first place, so I'm biased here.

Still this could be improved by making a spec.c-based tool, that way all assumptions on the spec syntax are at least contained in spec.c, but it may be more trouble than it's worth.
I would be okay with keeping the grep | sed if you think it's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A spec.c based tool would probably be better, but my lack of knowledge on how to make a pipe-compatible program stopped me from pursuing this (I've read that it's as easy as reading from stdin but cat spec | program did not give expected results when I tried this...)

@@ -165,11 +165,17 @@ O_FILES := $(foreach f,$(S_FILES:.s=.o),build/$f) \
$(foreach f,$(C_FILES:.c=.o),build/$f) \
$(foreach f,$(wildcard baserom/*),build/$f.o)

OVL_RELOC_FILES := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '[^"]*_reloc.o' )
SEGMENTS := $(shell $(CPP) $(CPPFLAGS) $(SPEC) | grep -o '^[ \t]*name[ \t]\+".\+"' | sed 's/.*"\(.*\)".*/\1/g' )
SEGMENTS_OVL := $(filter ovl_%,$(SEGMENTS))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that overlays would be defined by segment name, but we already have/had this kind of name-dependant behavior in master (OVL_RELOC_FILES and _reloc in general) so whatever.

A possibility would be to add a spec directive for them but again may be more trouble than worth.

Copy link
Contributor Author

@Thar0 Thar0 Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like resorting to the name either, but left it at this since as you pointed out there was already name dependence. In an earlier commit I was toying with the idea of using flags OVL as a means to distinguish overlays, but I dropped this for two reasons. First, to implement this cleanly it would need a tool that can receive the spec through a pipe (see prior comment on why this is apparently a difficulty in it's own right), and second it would be a change to the spec format itself as this flag doesn't exist.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is changing the spec format itself an issue? (besides being more work than may be worth)
We already expanded it in the past afaik (at the very least include_data_with_rodata was surely not part of the "original", whatever the original is, and probably other directives)

SEGMENT_OBJECTS := $(SEGMENTS:%=$(SEGMENT_DIR)/%.o)
SEGMENT_RELOCS := $(SEGMENTS_OVL:%=$(SEGMENT_DIR)/%.reloc.o)
SEGMENT_SCRIPTS := $(SEGMENT_OBJECTS:.o=.ld)
SEGMENT_DEPENDS := $(SEGMENT_OBJECTS:.o=.d)

# Automatic dependency files
# (Only asm_processor dependencies and reloc dependencies are handled for now)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# (Only asm_processor dependencies and reloc dependencies are handled for now)
# (Only asm_processor dependencies are handled for now)

it's unclear what this comment is for but I assume it's about DEP_FILES
maybe mention relocs somewhere else
but idk an actual write up on the build system would be a better idea...


.PHONY: o_files asset_files
$(SEGMENT_SCRIPTS): build/$(SPEC)
$(SEGMENT_SCRIPTS): %.ld: %.d
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does each segment's linker script depend on the segment's dependency file? Isn't the dependency on the (preprocessed) spec enough?

Copy link
Contributor

@glankk glankk Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because mkldscript generates both the dependency file and the script in the same invocation. Having a dependency relation between the two is the neatest way to communicate this to make.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, they are indeed both generated at the same time, so why would one depend on the other?
Also, typically (always) .ld/.d will be newer than build/$(SPEC) and the .lds already depend on build/$(SPEC), so the .d would also be up to date?

Also unrelated to my question but even if this is needed, it assumes the .ld is written before after the .d, or "not too long" after before (whatever make's or the system's definition of "not too long" is)
(currently mkldscript writes the .ld after before the .d time-wise)

Copy link
Contributor

@glankk glankk Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove this dependency relation, then make would no longer know how to generate .ld. You would have to provide a recipe for it, which would be the same as .d. This would cause two invocations of mkldscript to occur, meaning both files would be generated twice, a waste of time. Having .d be newer than .ld is not a problem, because .ld does not have a recipe and thus will not be regenerated, even if it is considered "out of date" in a sense.

Edit: In practice this doesn't really make a difference, it would really only be necessary if you removed the -include $(SEGMENT_DEPENDS) line and then tried to do something like make build/segments/code.ld.


$(SEGMENT_DEPENDS): build/$(SPEC)
$(MKLDSCRIPT) -s $< $(SEGMENT_DIR) $(@:$(SEGMENT_DIR)/%.d=%)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo the makefile should pass the .d and .ld locations to mkldscript as well, instead of mkldscript deriving them from the segment name. That way paths are all contained in the makefile

Comment on lines 353 to 357
/* overlay relocation section */
fprintf(f, "\t\t_%sSegmentOvlStart = .;\n", seg->name);

if (strncmp(seg->name, "ovl_", 4) == 0)
fprintf(f, "\t\t%s/%s.reloc.o (.ovl)\n", dir, seg->name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do without the SegmentOvlStart & co symbols for segments that are not overlays, those symbols are useless anyway (still may be useful for actual overlays for debugging though)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking to do this but I wasn't sure if it's worth it since the OvlEnd symbol is used in calculation of the segment rom size

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the obvious solution is to switch between

_%sSegmentRomSize = ABSOLUTE(_%sSegmentOvlEnd - _%sSegmentTextStart);

and

_%sSegmentRomSize = ABSOLUTE(_%sSegmentRoDataEnd - _%sSegmentTextStart);

as appropriate

Which looked bad to me at first, but tbh may not be so bad of an idea. mkldscript is fully in control of the script it outputs after all, including the last section linked, so I don't think this can go wrong

I actually implemented this to try it out, so I figured I'd PR it and include some other suggested changes
Thar0#7
feel free to come up with your own implementation of course

Comment on lines 419 to 420
"\t%s -s <spec-file> <segment-dir> <segment-name>\n"
"\t%s -r <spec-file> <segment-dir> <output-file>\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tiniest bit of documentation never hurts 😃

Suggested change
"\t%s -s <spec-file> <segment-dir> <segment-name>\n"
"\t%s -r <spec-file> <segment-dir> <output-file>\n",
"\t%s -s <spec-file> <segment-dir> <segment-name>\n"
"\t\tWrite a linker script and dependencies for the specified segment\n"
"\t\n"
"\t%s -r <spec-file> <segment-dir> <output-file>\n"
"\t\tWrite a linker script for linking all the segments together\n",

EDIT: Thar0#7


spec = util_read_whole_file(argv[1], &size);
parse_rom_spec(spec, &g_segments, &g_segmentsCount);
spec = util_read_whole_file(argv[2], &size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Purely for the tool's source readability, I think it would be good to assign argv[i] to local variables with proper names so it doesn't seem so random what gets passed to functions

For example char* specFilePath = argv[2];


build/$(SPEC): $(SPEC)
$(CPP) $(CPPFLAGS) $< > $@
ifeq ($(MAKECMDGOALS),$(filter-out clean assetclean distclean setup,$(MAKECMDGOALS)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also causes weird issues when using exotic make calls like make assetclean all, but it's probably not worth worrying about.

Having a different makefile for the "clean" targets, or just a script given that it's just doing rms, sounds like a good fix 😄 but, as long as it works as is...


/* end segment */
fprintf(f,
"\t\t_%sSegmentEnd = .;\n"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have probably looked at it but just in case this is the makerom segment part of new build/ldscript.txt :

	_makeromSegmentRomStartTemp = _RomSize;
	_makeromSegmentRomStart = _makeromSegmentRomStartTemp;
	..makerom : AT(_makeromSegmentRomStart)
	{
		_makeromSegmentStart = .;

		build/segments/makerom.o (.text)
		build/segments/makerom.o (.data)
		build/segments/makerom.o (.rodata)
		_makeromSegmentOvlStart = .;
		_makeromSegmentOvlEnd = .;
		_makeromSegmentOvlSize = ABSOLUTE(_makeromSegmentOvlEnd - _makeromSegmentOvlStart);
	}
	_makeromSegmentRomSize = ABSOLUTE(_makeromSegmentOvlEnd - _makeromSegmentTextStart);
	_RomSize += _makeromSegmentRomSize;
	_makeromSegmentRomEndTemp = _RomSize;
	_makeromSegmentRomEnd = _makeromSegmentRomEndTemp;
	..makerom.bss (NOLOAD) :
	{
		build/segments/makerom.o (.bss)
	}
		_makeromSegmentEnd = .;
		_makeromSegmentSize = ABSOLUTE(_makeromSegmentEnd - _makeromSegmentStart);

So first congrats again because it's infinitely more legible than it was previously, but I must say the indent difference on _makeromSegmentEnd looks weird, unless that's intended to match _makeromSegmentStart?

Maybe you could _makeromSegmentStart = ... before ..makerom : AT(_makeromSegmentRomStart)
and change ..makerom : AT(_makeromSegmentRomStart) to ..makerom _makeromSegmentStart : AT(_makeromSegmentRomStart)
if the linker script format allows it. (I'm not an expert)
Then _makeromSegmentStart and _makeromSegmentEnd would be on the same indentation level
(think of my OCD tharo 😢)

EDIT: see Thar0#7 unless you want to implement your own thing

@EllipticEllipsis
Copy link
Contributor

I do appreciate the speed improvement.

Actual data collected to back up claims
This branch
Full
real    7m8.757s
user    6m5.644s
sys     1m5.222s

One file in code
real    0m2.323s
user    0m1.722s
sys     0m0.616s

One medium-sized overlay
real    0m2.508s
user    0m1.768s
sys     0m0.712s


master
Full
real    7m24.548s
user    6m9.258s
sys     1m7.429s

One file in code
real    0m4.148s
user    0m3.210s
sys     0m0.668s

One medium-sized overlay
real    0m4.792s
user    0m3.484s
sys     0m0.871s

I tested this several times with essentially the same numbers, of course there is some natural variation but it is consistent enough.

However, there are several significant problems that this introduces, some of which are avoidable, some of which seem to be inherent to partial linking.

  • As Dragorn has already pointed out, this makes significant assumptions about the forms of various file's names
  • However, it is significantly worse than that: with this, there is no way to use original reloc files, which will be required as soon as we have GLOBAL_ASM in overlays again, which we inevitably will need as soon as we begin supporting retail. Unless you have an idea for avoiding the reordering that the asm-processor induces on relocations?
  • It may well be that the ldscript is easier to read (I don't really see the difference myself, it simply has less in it), but this has wiped out essentially all of the useful information about the individual files from the mapfile:
..boot          0x0000000080000460    0x11f10 load address 0x0000000000001060
                0x0000000080000460                _bootSegmentStart = .
 build/segments/boot.o(.text)
 .text          0x0000000080000460     0x8fb0 build/segments/boot.o
                0x0000000080000460                _bootSegmentTextStart
                0x0000000080000460                cleararena
                0x0000000080000498                bootproc
                0x00000000800005a0                Main_ThreadEntry
                0x0000000080000694                Idle_ThreadEntry
                0x0000000080000a10                ViConfig_UpdateVi
--
                0x0000000080009310                __osSetWatchLo
                0x0000000080009320                D_80009320
                0x00000000800093f0                D_800093F0
                0x0000000080009410                _bootSegmentTextEnd
 build/segments/boot.o(.data)
 .data          0x0000000080009410     0x1bb0 build/segments/boot.o
                0x0000000080009410                D_80009410
                0x0000000080009410                _bootSegmentDataStart
                0x0000000080009430                D_80009430
                0x0000000080009434                gViConfigUseDefault
                0x0000000080009438                gViConfigAdditionalScanLines
--
                0x000000008000af84                D_8000AF84
                0x000000008000afa0                __osViDevMgr
                0x000000008000afbc                __additional_scanline
                0x000000008000afc0                _bootSegmentDataEnd
 build/segments/boot.o(.rodata)
  • This is a significant problem when the files change and move around, as they certainly do on the N64 versions (and even the NTSC versions have some files in different places on GC).
  • This also breaks, to a greater or lesser extent, many of our scripts that rely on the mapfile: I give a couple of examples:
sym_info.py
$ ./sym_info.py Idle_ThreadEntry
Symbol Idle_ThreadEntry (RAM: 0x80000694, ROM: 0x001294, build/segments/boot.o)
Some will argue that not knowing the file doesn't matter, you can just search for it, but why should that be necessary? The script is supposed to be able to do that anyway.
first_diff.py
$ ./first_diff.py
First difference at ROM addr 0x1004, <start of rom> (RAM 0x0, ROM 0x0, <no file>)
Bytes: 25:08:23:90 vs 25:08:23:70
Instruction difference at ROM addr 0x1080, cleararena (RAM 0x80000460, ROM 0x1060, build/segments/boot.o)
Bytes: AF:A2:00:1C vs 0C:00:11:14
Instruction difference at ROM addr 0x10B8, bootproc (RAM 0x800004B8, ROM 0x10B8, build/segments/boot.o)
Bytes: 27:BD:FF:E0 vs AF:AE:00:10
Instruction difference at ROM addr 0x11C0, Main_ThreadEntry (RAM 0x800005C0, ROM 0x11C0, build/segments/boot.o)
Bytes: 27:BD:FF:D8 vs 3C:04:80:01
Instruction difference at ROM addr 0x12B8, Idle_ThreadEntry (RAM 0x800006B4, ROM 0x12B4, build/segments/boot.o)
Bytes: AF:A4:00:20 vs 0C:00:08:4C
Instruction difference at ROM addr 0x1630, ViConfig_UpdateVi (RAM 0x80000A30, ROM 0x1630, build/segments/boot.o)
Bytes: 27:BD:FF:E8 vs 46:00:21:A1

Over 1000 differing words, must be a shifted ROM.
Map appears to have shifted just before bootproc (build/segments/boot.o) -- in cleararena?
This seems to be essentially unavoidable.
  • This is pretty catastrophic for understanding bss reordering, which from experience in MM will be one of our most significant problems with retail, and we still have no working solution.

Overall this PR seems to assume that the codebase is in a more mature and final form than it actually is: I think all of these issues are going to make it impractical while the codebase remains focussed on decompilation, matching, and even documentation for some things. I would also say that this appears to be trying to solve a problem that doesn't really exist, and while that's fine if it does not cause other regressions, I believe that the problems above are significant enough that this needs at least some adjustment (quite apart from the details of the actual implementation, which I have not yet examined in detail).

@Dragorn421
Copy link
Collaborator

Dragorn421 commented Apr 25, 2022

Answering to elliptic,

ELL-1 assumptions on segment names

As Dragorn has already pointed out, this makes significant assumptions about the forms of various file's names

😎
Only the ovl_ prefix though, right?
And that we could find alternatives I think.

ELL-2 non-matching overlay segments

However, it is significantly worse than that: with this, there is no way to use original reloc files, which will be required as soon as we have GLOBAL_ASM in overlays again, which we inevitably will need as soon as we begin supporting retail. Unless you have an idea for avoiding the reordering that the asm-processor induces on relocations?

I think this can be solved by not depending on the ovl_ prefix to identify overlays, for example if we had a "overlay" directive in the spec a matching segment could be:

beginseg
    name "ovl_select"
    overlay
    include "build/src/overlays/gamestates/ovl_select/z_select.o"
endseg

and a non matching entry could be (notice no overlay directive)

beginseg
    name "ovl_select"
    include "build/src/overlays/gamestates/ovl_select/z_select.o"
    include "build/src/overlays/gamestates/ovl_select/z_select_relocs.o"
endseg

idk the specifics of where the .o files would come from then but it wouldn't be up to the spec I think

EDIT: my example is flawed with respects to my idea of not linking the .ovl section unless the segment is an overlay but that could be addressed with more directives like overlay == link in .ovl, manual_relocs == don't auto-include %.relocs.o (ofc needs discussion but overall my point is I think this issue can be addressed)

ELL-3 losing information in build/z64.map compared to master

It may well be that the ldscript is easier to read (I don't really see the difference myself, it simply has less in it), but this has wiped out essentially all of the useful information about the individual files from the mapfile:

what is missing?

Also, build/ldscript.txt comparison:

master:

    _buffersSegmentRomStartTemp = _RomSize;
    _buffersSegmentRomStart = _buffersSegmentRomStartTemp;
    ..buffers : AT(_RomSize)
    {
        _buffersSegmentStart = .;
        . = ALIGN(0x10);
        _buffersSegmentTextStart = .;
        . = ALIGN(0x40);
            build/src/buffers/zbuffer.o (.text)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.text)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.text)
        . = ALIGN(0x10);
        _buffersSegmentTextEnd = .;
    _buffersSegmentTextSize = ABSOLUTE( _buffersSegmentTextEnd - _buffersSegmentTextStart );
        _buffersSegmentDataStart = .;
            build/src/buffers/zbuffer.o (.data)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.data)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.data)
        . = ALIGN(0x10);
        _buffersSegmentDataEnd = .;
    _buffersSegmentDataSize = ABSOLUTE( _buffersSegmentDataEnd - _buffersSegmentDataStart );
        _buffersSegmentRoDataStart = .;
            build/src/buffers/zbuffer.o (.rodata)
        . = ALIGN(0x10);
            build/src/buffers/zbuffer.o (.rodata.str1.4)
        . = ALIGN(0x10);
            build/src/buffers/zbuffer.o (.rodata.cst4)
        . = ALIGN(0x10);
            build/src/buffers/zbuffer.o (.rodata.cst8)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.rodata)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.rodata.str1.4)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.rodata.cst4)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.rodata.cst8)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.rodata)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.rodata.str1.4)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.rodata.cst4)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.rodata.cst8)
        . = ALIGN(0x10);
        _buffersSegmentRoDataEnd = .;
    _buffersSegmentRoDataSize = ABSOLUTE( _buffersSegmentRoDataEnd - _buffersSegmentRoDataStart );
        _buffersSegmentSDataStart = .;
            build/src/buffers/zbuffer.o (.sdata)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.sdata)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.sdata)
        . = ALIGN(0x10);
        _buffersSegmentSDataEnd = .;
        _buffersSegmentOvlStart = .;
            build/src/buffers/zbuffer.o (.ovl)
            build/src/buffers/gfxbuffers.o (.ovl)
            build/src/buffers/heaps.o (.ovl)
        _buffersSegmentOvlEnd = .;
    }
    _RomSize += ( _buffersSegmentOvlEnd - _buffersSegmentTextStart );
    _buffersSegmentRomEndTemp = _RomSize;
_buffersSegmentRomEnd = _buffersSegmentRomEndTemp;

    ..buffers.bss ADDR(..buffers) + SIZEOF(..buffers) (NOLOAD) :
    {
        . = ALIGN(0x10);
        _buffersSegmentBssStart = .;
        . = ALIGN(0x40);
            build/src/buffers/zbuffer.o (.sbss)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.sbss)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.sbss)
        . = ALIGN(0x10);
            build/src/buffers/zbuffer.o (.scommon)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.scommon)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.scommon)
        . = ALIGN(0x10);
            build/src/buffers/zbuffer.o (.bss)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (.bss)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (.bss)
        . = ALIGN(0x10);
            build/src/buffers/zbuffer.o (COMMON)
        . = ALIGN(0x10);
            build/src/buffers/gfxbuffers.o (COMMON)
        . = ALIGN(0x10);
            build/src/buffers/heaps.o (COMMON)
        . = ALIGN(0x10);
        . = ALIGN(0x10);
        _buffersSegmentBssEnd = .;
        _buffersSegmentEnd = .;
    }
    _buffersSegmentBssSize = ABSOLUTE( _buffersSegmentBssEnd - _buffersSegmentBssStart );

    _ovl_titleSegmentRomStartTemp = _RomSize;
    _ovl_titleSegmentRomStart = _ovl_titleSegmentRomStartTemp;
    ..ovl_title 0x80800000 : AT(_RomSize)
    {
        _ovl_titleSegmentStart = .;
        . = ALIGN(0x10);
        _ovl_titleSegmentTextStart = .;
            build/src/overlays/gamestates/ovl_title/z_title.o (.text)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.text)
        . = ALIGN(0x10);
        _ovl_titleSegmentTextEnd = .;
    _ovl_titleSegmentTextSize = ABSOLUTE( _ovl_titleSegmentTextEnd - _ovl_titleSegmentTextStart );
        _ovl_titleSegmentDataStart = .;
            build/src/overlays/gamestates/ovl_title/z_title.o (.data)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.data)
        . = ALIGN(0x10);
        _ovl_titleSegmentDataEnd = .;
    _ovl_titleSegmentDataSize = ABSOLUTE( _ovl_titleSegmentDataEnd - _ovl_titleSegmentDataStart );
        _ovl_titleSegmentRoDataStart = .;
            build/src/overlays/gamestates/ovl_title/z_title.o (.rodata)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/z_title.o (.rodata.str1.4)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/z_title.o (.rodata.cst4)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/z_title.o (.rodata.cst8)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.rodata)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.rodata.str1.4)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.rodata.cst4)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.rodata.cst8)
        . = ALIGN(0x10);
        _ovl_titleSegmentRoDataEnd = .;
    _ovl_titleSegmentRoDataSize = ABSOLUTE( _ovl_titleSegmentRoDataEnd - _ovl_titleSegmentRoDataStart );
        _ovl_titleSegmentSDataStart = .;
            build/src/overlays/gamestates/ovl_title/z_title.o (.sdata)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.sdata)
        . = ALIGN(0x10);
        _ovl_titleSegmentSDataEnd = .;
        _ovl_titleSegmentOvlStart = .;
            build/src/overlays/gamestates/ovl_title/z_title.o (.ovl)
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.ovl)
        _ovl_titleSegmentOvlEnd = .;
    }
    _RomSize += ( _ovl_titleSegmentOvlEnd - _ovl_titleSegmentTextStart );
    _ovl_titleSegmentRomEndTemp = _RomSize;
_ovl_titleSegmentRomEnd = _ovl_titleSegmentRomEndTemp;

    ..ovl_title.bss ADDR(..ovl_title) + SIZEOF(..ovl_title) (NOLOAD) :
    {
        . = ALIGN(0x10);
        _ovl_titleSegmentBssStart = .;
            build/src/overlays/gamestates/ovl_title/z_title.o (.sbss)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.sbss)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/z_title.o (.scommon)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.scommon)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/z_title.o (.bss)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (.bss)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/z_title.o (COMMON)
        . = ALIGN(0x10);
            build/src/overlays/gamestates/ovl_title/ovl_title_reloc.o (COMMON)
        . = ALIGN(0x10);
        . = ALIGN(0x10);
        _ovl_titleSegmentBssEnd = .;
        _ovl_titleSegmentEnd = .;
    }
    _ovl_titleSegmentBssSize = ABSOLUTE( _ovl_titleSegmentBssEnd - _ovl_titleSegmentBssStart );

and see Thar0#7 for the corresponding excerpt in this branch

There is indeed a lot less stuff (most of it moved into the individual segment linker scripts) which imo makes it heavily more readable

ELL-4 multi-version support

This is a significant problem when the files change and move around, as they certainly do on the N64 versions (and even the NTSC versions have some files in different places on GC).

Wouldn't that just be handled (though multiversion support has yet to happen) by different specs, ifdefs or idk what we'll end up with? The linker scripts only reflect what the spec indicates

ELL-5 this breaks tools relying on build/z64.map

This also breaks, to a greater or lesser extent, many of our scripts that rely on the mapfile

ah. I didn't think of that 😨

This seems to be essentially unavoidable.

We could make the tools read from the individual segment's map files as well. I'll try to whip something up (unless someone is interested too, let me know you're doing it so we're not going two times at it)

ELL-6 catastrophic for understanding bss reordering

This is pretty catastrophic for understanding bss reordering, which from experience in MM will be one of our most significant problems with retail, and we still have no working solution.

I don't have that experience but why? I feel like at that point it's enough to match the segment anyway and we don't have less information at the segment level?

ELL-boomer (jk)

I would also say that this appears to be trying to solve a problem that doesn't really exist

😠 but the cool technology 😢

@Dragorn421
Copy link
Collaborator

Timings for a full build / after modifying something in code / after modifying something in an actor:

## zeldaret/master

$ make distclean
$ make setup -j32
$ time make

real    6m25.648s
user    5m30.435s
sys     0m55.817s

$ touch src/code/z_actor.c
$ time make

real    0m4.526s
user    0m3.776s
sys     0m0.757s

$ touch src/overlays/actors/ovl_En_Daiku/z_en_daiku.c
$ time make

real    0m4.294s
user    0m3.528s
sys     0m0.772s


## tharo/segments

$ make distclean
$ make setup -j32
$ time make

real    6m10.779s
user    5m17.589s
sys     0m52.178s

$ touch src/code/z_actor.c
$ time make

real    0m3.016s
user    0m2.287s
sys     0m0.737s

$ touch src/overlays/actors/ovl_En_Daiku/z_en_daiku.c
$ time make

real    0m2.748s
user    0m2.106s
sys     0m0.657s
Graphs (careful dark mode users)

image

Elliptic's results are even better 😄

image

@EllipticEllipsis
Copy link
Contributor

Answering to elliptic,

ELL-1 assumptions on segment names

As Dragorn has already pointed out, this makes significant assumptions about the forms of various file's names

😎 Only the ovl_ prefix though, right? And that we could find alternatives I think.

See below.

ELL-2 non-matching overlay segments

However, it is significantly worse than that: with this, there is no way to use original reloc files, which will be required as soon as we have GLOBAL_ASM in overlays again, which we inevitably will need as soon as we begin supporting retail. Unless you have an idea for avoiding the reordering that the asm-processor induces on relocations?

I think this can be solved by not depending on the ovl_ prefix to identify overlays, for example if we had a "overlay" directive in the spec a matching segment could be:

beginseg
    name "ovl_select"
    overlay
    include "build/src/overlays/gamestates/ovl_select/z_select.o"
endseg

and a non matching entry could be (notice no overlay directive)

beginseg
    name "ovl_select"
    include "build/src/overlays/gamestates/ovl_select/z_select.o"
    include "build/src/overlays/gamestates/ovl_select/z_select_relocs.o"
endseg

idk the specifics of where the .o files would come from then but it wouldn't be up to the spec I think

EDIT: my example is flawed with respects to my idea of not linking the .ovl section unless the segment is an overlay but that could be addressed with more directives like overlay == link in .ovl, manual_relocs == don't auto-include %.relocs.o (ofc needs discussion but overall my point is I think this issue can be addressed)

This seems considerably worse for several reasons:

  • it changes the spec format yet again (if we're going to deviate this much from the original, it probably is time to use something else)
  • It obfuscates, fairly pointlessly, the simple operation of simply including one file or another... which is surely the point of the spec format in the first place!
  • Surely it's better for NON_MATCHING to be more symmetric than that?

ELL-3 losing information in build/z64.map compared to master

It may well be that the ldscript is easier to read (I don't really see the difference myself, it simply has less in it), but this has wiped out essentially all of the useful information about the individual files from the mapfile:

what is missing?

Files. Most of the issues I pointed out subsequently stem at least partially from the file splits being lost from the mapfile, which removes most of the functionality. "Just read the segment ldscripts instead" is hardly a sensible option for every new script that parses the mapfile to have to do (and I would know, having written several myself).

There is indeed a lot less stuff (most of it moved into the individual segment linker scripts) which imo makes it heavily more readable

This seems more of a flaw/redundancy in mkldscript than inherent to the current system, though: either the extra sections and alignment are not required, in which case they should not be mentioned in the first place, or you are simply sweeping the complexity under the rug into other files, creating a tree that has to be understood instead of a flat file should anything more precise be required. I'm not really sure why it helps that the main script is more readable if you end up having to troubleshoot in the messier individual ones anyway, which seems inevitable given what we've been dealing with so far.

ELL-4 multi-version support

This is a significant problem when the files change and move around, as they certainly do on the N64 versions (and even the NTSC versions have some files in different places on GC).

Wouldn't that just be handled (though multiversion support has yet to happen) by different specs, ifdefs or idk what we'll end up with? The linker scripts only reflect what the spec indicates

Again, this is not the point, the point is that the files have disappeared... suppose you want to compare two mapfiles to find where something has gone, or the ordering of bss...

ELL-5 this breaks tools relying on build/z64.map

This also breaks, to a greater or lesser extent, many of our scripts that rely on the mapfile

ah. I didn't think of that 😨

This seems to be essentially unavoidable.

We could make the tools read from the individual segment's map files as well. I'll try to whip something up (unless someone is interested too, let me know you're doing it so we're not going two times at it)

See above, this seems like pointless complication. I remind you that this is saving seconds, and possibly making the linker script easier to read, when surely the reason we're using the spec format in the first place is because we didn't want to deal with reading and editing a linker script...

I don't have that experience but why? I feel like at that point it's enough to match the segment anyway and we don't have less information at the segment level?

Segments are essentially unrelated to bss reordering, bss reordering occurs at compile time and is filewise. I remind you that OoT has exactly one file that we cannot match completely...

    include "build/src/code/fault.o"
    include "build/data/fault.bss.o"

Why? Because it is the only -g3 file with bss, which is reordered. But it's worse than that...

Suppose I have some required in-function static bss. Some of the bss in fault.c is like this on retail (which we know because fault is almost identical to MM's, and MM's needs it). collision_check.c needs it. sys_math3d.c is essentially unworkable at the moment in MM because of it: @Thar0 himself gave up on it because of it. Suppose now that somehow we do fix it, and then changing a header a little reorders it. The symbols themselves are not even in the map in the first place (now there's something in the linking system that it would be beneficial to fix!), so you have to find them by looking around them. And you do that my looking for other symbols in the file, for now. But with this, there are no indications of which files to check any more...

bss is hard enough to deal with with the current system, this exacerbates the problem. Until we have a full understanding of bss reordering and a solution for it, we really shouldn't be making this any worse.

ELL-boomer (jk)

I would also say that this appears to be trying to solve a problem that doesn't really exist

😠 but the cool technology 😢

Okay, so let's use Python 3.11, Rust, and C++23, and Ninja. And I'm not even joking about Ninja, I'm sure it has better mechanisms for dealing with some of this stuff than make. Hell, might as well just use yaml instead of the spec, seems to work pretty well for splat.

I also remind you that last time I suggested taking advantage of the advanced features of make like restarting, @Roman971 rejected all of my suggestions, and I redesigned the entire system of how Fado is used from scratch to take this into account.

Another problem is that this is employing advanced features of ld, which is going to make it even more difficult to replace should we want to try (which we should... I remind you that the original argument for this was efficiency, and using an inefficient program in a fancier way is surely a suboptimal way to do that, by definition).

@EllipticEllipsis
Copy link
Contributor

I'd love to see numbers on the RAM usage, but I don't know how to construct them.

@Dragorn421
Copy link
Collaborator

ELL-2 Changing the spec format

Why would this be an issue? If we're open to moving from our spec why not alter it in the mean time
It sounds reasonable to me to have more directives if the alternative is hardcoding things in mkldscript

It obfuscates, fairly pointlessly, the simple operation of simply including one file or another... which is surely the point of the spec format in the first place!

Well the idea would be relocs are entirely handled by the build system so I don't see why the "user" should have to see those in the spec (assuming you're talking about relocs)

Surely it's better for NON_MATCHING to be more symmetric than that?

I don't get what you mean there

ELL-3 effects on readability of the split segments on the scripts and maps

Files. Most of the issues I pointed out subsequently stem at least partially from the file splits being lost from the mapfile, which removes most of the functionality. "Just read the segment ldscripts instead" is hardly a sensible option for every new script that parses the mapfile to have to do (and I would know, having written several myself).

We could just have a script act as a "library" for reading the various map files and collecting all the information, even putting it all together possibly. I realize it's more work to do in the PR but I think the issue can be addressed in a satisfactory manner

I don't think (???) there is a way to carry the source file information all the way to the main map, but (disclaimer: possibly dumb idea ahead) we could put sections from each file into different sections. For example in code instead of

		build/src/code/z_en_a_keep.o (.text)
		. = ALIGN(0x10);
		build/src/code/z_en_item00.o (.text)
		. = ALIGN(0x10);

we could have

	.text.z_en_a_keep :
	{
		build/src/code/z_en_a_keep.o (.text)
		. = ALIGN(0x10);
	}
	.text.z_en_item00 :
	{
		build/src/code/z_en_item00.o (.text)
		. = ALIGN(0x10);
	}

then the main ldscript would link in .text.* and the map looks like

..code          0x000000008001ce60   0x13af30 load address 0x0000000156b52000
                0x000000008001ce60                _codeSegmentStart = .
 build/segments/code.o(.text.*)
 .text.z_en_a_keep
                0x000000008001ce60      0xad0 build/segments/code.o
                0x000000008001ce60                EnAObj_SetupAction
...
                0x000000008001d804                EnAObj_Draw
 .text.z_en_item00
                0x000000008001d930     0x24c0 build/segments/code.o
                0x000000008001d930                EnItem00_SetupAction

(note I didnt get this working, somehow my rushed changes ended up with

$ make
tools/elf2rom -cic 6105 zelda_ocarina_mq_dbg.elf zelda_ocarina_mq_dbg.z64
Makefile:269: recipe for target 'zelda_ocarina_mq_dbg.z64' failed
make: *** [zelda_ocarina_mq_dbg.z64] Segmentation fault

???)

This seems more of a flaw/redundancy in mkldscript than inherent to the current system, though: either the extra sections and alignment are not required, in which case they should not be mentioned in the first place, or you are simply sweeping the complexity under the rug into other files, creating a tree that has to be understood instead of a flat file should anything more precise be required. I'm not really sure why it helps that the main script is more readable if you end up having to troubleshoot in the messier individual ones anyway, which seems inevitable given what we've been dealing with so far.

It may be only personal preference then that the split scripts look better. Does it really matter anyway, we are mostly not going to look at them (though for lld support tinkering with shorter scripts sounds good).

ELL-5

I remind you that this is saving seconds

which is a lot when the whole thing also takes seconds! 30%-50% time saved when compiling after editing an actor!

!!! 😄

ELL-6 THE BSS AAAAAH

I still don't understand the stakes here. The bss doesn't move across segments (? :monkaBSS:) so you would only need the segment's own map file to figure it out, it actually seems like a shorter map would help to not get lost?

ELL-new-tech

Okay, so let's use Python 3.11, Rust, and C++23, and Ninja. And I'm not even joking about Ninja, I'm sure it has better mechanisms for dealing with some of this stuff than make. Hell, might as well just use yaml instead of the spec, seems to work pretty well for splat.

TOML > YAML 😏 (20% off your first monthly subscription by clicking this link https://toml.io/en/ )
But yeah I don't see a reason not to modernize everything.
Python 3.11 no because it doesn't come with older distros, it also doesn't matter much anyway for what we're doing here
C++23 I'm okay with the "23" part whatever (?) but it's the "C++" that scares me
Rust sure what for? I have no experience with it either. No clue if compiling it is a hassle (we use C for tools because it's unbelievably portable)
Why not ninja, why not literally anything if it can be easier to maintain and understand

(about why not use yaml though, a good reason is it's not in most standard libraries so a pain for tools to interact with. the current spec is also not good but at least the format is simplistic enough that regex can do "enough")

fwiw I would even get rid of ido or ido recomp if someone came up with a compiler that was better behaved, supported more recent C (I want indexed array initializers), and somehow happened to OK the rom. Probably not an opinion many would share (?) but it's unlikely enough we don't need to think about it, do we

Another problem is that this is employing advanced features of ld

Is it? Which ones?

@Thar0 Thar0 marked this pull request as draft June 1, 2022 21:13
@zeldaret zeldaret deleted a comment from abrasivezebra Feb 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants