Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor tasks from asynchronous review #742

Open
9 tasks done
kayceesrk opened this issue Nov 16, 2021 · 20 comments
Open
9 tasks done

Minor tasks from asynchronous review #742

kayceesrk opened this issue Nov 16, 2021 · 20 comments

Comments

@kayceesrk
Copy link
Contributor

kayceesrk commented Nov 16, 2021

This is a list of minor tasks from the asynchronous review phase. Major tasks will have their own issues. The multicore team members should feel free to start picking up these tasks and working on it. Please put your GitHub handle beside the task if you've picked up a task to avoid duplicating the work.

  • Please move runtime/gen_sizeclasses.ml to tools/ or remove it entirely. As far as I could tell, it's not used in normal builds of the system.
  • The function build_frame_descriptors should be static or should be renamed caml_build_frame_descriptors.
  • For runtime/lf_skiplist.c I trust Luc's extensive review. Just one cosmetic suggestion:
    (see also William Pugh, "Skip lists: a probabilistic alternative to
    balanced binary trees", Comm. ACM 33(6), 1990). */

Replace with

    A sequential implementation of skip lists is in file skiplist.c and is based on the paper by
    William Pugh, "Skip lists: a probabilistic alternative to
    balanced binary trees", Comm. ACM 33(6), 1990). */
  • caml_extern_state could be allocated on first use of the marshaling functions in a domain (and then stored in Caml_state). This would make domain creation faster in the common case where a domain will not use the marshaler.
  • runtime/frame_descriptors.[ch] need copyright headers.
  • The function build_frame_descriptors should be static or should be renamed caml_build_frame_descriptors.
  • indentation in caml_process_pending_signals_exn needs fixing.
  • sync.c needs copyright header
  • sync.c line 191 line too long
@kayceesrk kayceesrk changed the title Remove gen_sizeclasses? Minor tasks from asynchronous review Nov 16, 2021
@damiendoligez
Copy link
Contributor

damiendoligez commented Nov 19, 2021

  • Revisit the capping of work-to-do.
    if (p > 0.3) p = 0.3;
    /* cap accumulated work todo to p = 0.3 */
    In the sequential GC, the work was capped because you can't do more than 1/3 of a GC cycle in one slice. This turned out to cause problems. Here, you have an accumulated work-to-do, which is not subject to the constraint, so I think it's better not to cap it (never mind doing it twice...). I think it would be better to cap computed_work here:
    computed_work = (dom_st->major_work_todo > 0)
  • This debug code looks strange.
    #ifdef DEBUG
    #define Is_markable(v) (Is_block(v) && !Is_young(v) && v != Debug_free_major)
    #else
    If v is Debug_free_major (or any Debug_* value) you probably want a fatal error right there, instead of ignoring the value. Moreover, the Debug_* values are odd, so they can't be blocks. ( @sadiqj )
  • Misleading comment:
    /* This function shrinks the mark stack back to the MARK_STACK_INIT_SIZE size
    and is called at the end of a GC compaction to avoid a mark stack greater
    than 1/32th of the heap. */
    The function (caml_shrink_mark_stack) is also called by domain_terminate via caml_finish_marking. ( @sadiqj )
  • I don't understand this comment:
    *work -= Wosize_hd(chd); /* account for header */
    In fact, it seems to me that the header is not accounted for, in this case. ( @sadiqj )
  • Let's simplify the code by removing the naked pointer checker (this will need a consensus decision at the synchronous meeting).

@damiendoligez
Copy link
Contributor

damiendoligez commented Nov 19, 2021

About shared_heap.c

  • This is relying on the initial value of mapped memory:
    for (i=1; i<POOLS_PER_ALLOCATION; i++) {
    I'd rather initialize the first pool explicitly like the others.
  • Assert vs CAMLassert: we should choose one and get rid of the other. We'll decide which one at the meeting.
  • In this file, you are using 0 as a pointer value instead of NULL. I don't like it and neither does Xavier [citation needed]
  • This doesn't work with the way large_allocate is called in caml_shared_try_alloc:
    if (!a) caml_raise_out_of_memory();
    Since this function can be called from the minor GC, this line should be replaced with if (!a) return NULL;.
  • In pool_sweep, when you find a garbage object and add it to the free list, in DEBUG mode you should set its fields to Debug_free_major.

@xavierleroy
Copy link
Contributor

In this file, you are using 0 as a pointer value instead of NULL. I don't like it and neither does Xavier [citation needed]

I can live with 0 but I do find NULL clearer. Also, since NULL is (void *) 0, it is typed a little more precisely than 0. E.g. mistakenly passing NULL to a function expecting a double is an error, but passing 0 is fine.

@damiendoligez
Copy link
Contributor

damiendoligez commented Nov 19, 2021

In finalise.c :

  • Please add a comment to caml_final_merge_finalisable explaining why the young of the source are added to the old of the target. I think I know why but I'd like to be sure.

@kayceesrk
Copy link
Contributor Author

kayceesrk commented Nov 22, 2021

  • runtime/amd64.S:589: jl (jump if signed less) should be jb (jump if unsigned less), since you're comparing pointers.

In asmcomp/amd64/emit.mlp:

  • [ ] The preproc_fun function looks independent of the target processor, so eventually it should be moved to a target-independent module, e.g. Emitaux. But actually I'd prefer to refactor all the computations of various aspects of functions (whether they contain calls or not, initial stack size, maximal stack size, ...) to after Linearize and before Emit. (We had this discussion for poll point insertion already.) (moved to Improvements to emit.mlp organization #754)
  • [ ] Stack checking is done at the beginning of the function (in Emit.fundecl) while the rest of the function prologue is emitted by the Lprologue instruction since this PR: Add Lprologue ocaml/ocaml#2055 , for reasons related to "debugger support". (Say no more.) I would prefer all the prologue code to be emitted in one place. This could mean reverting PR#2055 and removing the Lprologue instruction. (moved to Improvements to emit.mlp organization #754)
  • 838: you moved the cfi_adjust before the I.push instruction (twice), while in the trunk we adjust after push instructions. Which is best?
  • 923: let (overflow,ret) = new_label(), new_label() in
    Inconsistent parentheses + too complicated:
    let overflow = new_label() and ret = new_label() in
  • 934: (match !handle_overflow with None -> () | Some (overflow,ret) -> begin ... end);
    Would look better as
    begin match !handle_overflow with
    | None -> ()
    | Some (overflow, ret) -> ...
    end;

In runtime/amd64.S:

  • Speaking of CFI: could we assume that CFI is always supported? is it on macOS? This would avoid a mess of #ifdefs.
  • 61: the section name is .text.caml.##name in the current trunk, maybe a recent change that needs to be reflected.
  • 123: #include "../runtime/caml/config.h" makes me nervous; what is needed exactly? could we include smaller files like m.h (already included at the beginning of amd64.S) and something else to be discussed?
  • 153 and following: could we have C-like pseudocode in comments for what the stack switching operations are doing? It's kind of hard to follow right now.
  • 346 and following: #define for field offsets: move them earlier, before they are used? mention where the corresponding C structs are defined?
  • 397: it looks like the bottom word of a gc_regs_bucket structure sometimes contains the saved value of rax and sometimes contains a pointer to the previous structure. What about using distinct slots for these two purposes? It's not like the size of the gc_regs_bucket structure matters.
  • 467: caml_call_realloc_stack: not sure if all registers need saving, but it doesn't hurt.
  • 596: /* Pop arguments back off the stack */ only needed in DEBUG mode?
  • 619-623: movq $0, ... is this debug code? should it be in an IF_DEBUG section, then?
  • 502, 512, 522: LBL(caml_alloc{1,2,3}) is not useful, as far as I can see.
  • 984: wrong comment
    .quad 2 /* three descriptors */

@xavierleroy
Copy link
Contributor

I would suggest to delay the following two suggestions for after the MVP (i.e. after merge of Multicore 5.00 in the main OCaml trunk), because they are not urgent at all.

  • The preproc_fun function looks independent of the target processor, so eventually it should be moved to a target-independent module, [...]
  • Stack checking is done at the beginning of the function (in Emit.fundecl) while the rest of the function prologue is emitted by the Lprologue instruction since this PR

I'd be happy to propose PRs for these two after the MVP merge.

@kayceesrk
Copy link
Contributor Author

Thanks @xavierleroy. I've moved them to a separate issue #754.

@xavierleroy
Copy link
Contributor

xavierleroy commented Nov 24, 2021

  • In runtime/intern.c:
    v = caml_alloc(size, tag);
    for (i = 0; i < size; i++) Field(v, i) = Val_unit;

    Unless I'm mistaken, there is no need to initialize the fields to Val_unit, caml_alloc already did it.

@xavierleroy
Copy link
Contributor

xavierleroy commented Nov 24, 2021

@damiendoligez
Copy link
Contributor

damiendoligez commented Nov 25, 2021

@damiendoligez
Copy link
Contributor

damiendoligez commented Nov 25, 2021

@kayceesrk
Copy link
Contributor Author

Did I miss something? Is NULL now a valid value despite NO_NAKED_POINTERS?

null_stk is Val_ptr(NULL). So it is ok.

@kayceesrk
Copy link
Contributor Author

kayceesrk commented Nov 26, 2021

stdlib/fiber.mli:

  • line 33, Continuation_alread_taken should be Continuation_already_taken
  • line 52, the type "handler" is defined as "('a,'b) handler" but is then used at line 60 as "('b, 'c) handler". I would suggest using the same type variables names in both definitions; otherwise the reader is forced to perform a mental renaming. Same comment about "effect_handler" at lines 63 and 69. Same comment about "handler" at lines 89, 97, 105.

stdlib/fiber.ml:

  • IMHO, this code cannot be understood without comments. I would suggest writing some "big picture" comments at the beginning of the file, plus comments about the meaning of each type and function definition.

@Octachron
Copy link
Contributor

Octachron commented Nov 26, 2021

  • Looking at the stdlib diff with the anchor point, there is a bit of rebasing noise in stdlib/filename.ml: in the definition of check_suffix_name at line 103, String.ends_with has been inlined in multicore.

@xavierleroy
Copy link
Contributor

xavierleroy commented Dec 3, 2021

  • In runtime/domain.c, function create_domain: please initialize domain->c_stack to NULL. So that if C code calls caml_raise before any OCaml code is run in the domain, e.g. caml_stat_alloc that raises Out_of_memory, a clean fatal error is reported.
  • More generally, let's go over create_domain with a fine comb to make sure all the fields of the domain state are initialized, or to add comments explaining why some fields are left uninitialized and why this is safe.

@kayceesrk
Copy link
Contributor Author

Speaking of CFI: could we assume that CFI is always supported? is it on macOS? This would avoid a mess of #ifdefs.

I was pointed out there is a --disable-cfi option #454. IIUC we can't remove the #ifdefs.

@kayceesrk
Copy link
Contributor Author

467: caml_call_realloc_stack: not sure if all registers need saving, but it doesn't hurt.

Agreed. The C callee saved registers need not be saved.

In the previous design, the fiber stacks were managed on the OCaml heap. caml_try_realloc_stack may invoke the GC. So we needed to save all the registers as potentially all the values may contain OCaml values, and needed to be considered as roots for the GC.

This is not something that we need immediately as stack realloc is a rare operation. I created an issue for tracking this feature for the future #772.

@kayceesrk
Copy link
Contributor Author

596: /* Pop arguments back off the stack */ only needed in DEBUG mode?

SWITCH_C_TO_OCAML expects the rsp to point to the c_stack_link on the C stack. Hence, popping arguments is necessary.

@kayceesrk
Copy link
Contributor Author

#include "../runtime/caml/config.h" makes me nervous; what is needed exactly? could we include smaller files like m.h (already included at the beginning of amd64.S) and something else to be discussed?

@xavierleroy the only use of this is for the debug-only function caml_assert_stack_invariants https://github.com/ocaml-multicore/ocaml-multicore/blob/5.00/runtime/amd64.S#L969.

I could duplicate the definitions of macros in amd64.S, but it may end up difficult to keep them in sync if we need to change them since it would have to be changed for each of the different architectures.

The other option is to inline the function in amd64/emit.mlp, the only place where this is function is called. This function will have to be implemented differently for each architecture anyway. We already have Config.stack_threshold being used in amd64/emit.mlp. We'd have to add Config.stack_ctxt_wosize.

Any other ideas to do this better?

@ctk21
Copy link
Collaborator

ctk21 commented Dec 17, 2021

  • stack_size_bucket will only return something if wosize is exactly a power of two times caml_fiber_wsz but alloc_stack_noexc seems to call it with a weaker assumption.

This is a fair review comment; the code is brittle and looks like it might have bad errors. However I will attempt a defence of the existing situation:

  • if we don't get exactly a power of two times, then the cache is never hit; we fall back to allocate/deallocate
  • it turns out we will (nearly always) be a power of 2 times caml_fiber_wsz; new stacks start at caml_fiber_wsz and double on reallocation

The defence is weak, however I (at least) couldn't see a fast fix as things would need to be tested and written carefully here. I have filed an issue to pick this up #805

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants