02 PhIP Undeclared Variables

Marcus Denker edited this page Sep 27, 2021 · 8 revisions

Undeclared Variables are variables that are used (read or written) but not defined. As such, we have no information about them other than the name.

They are handled differently in interactive and non-interactive mode. In interactive mode, we raise an exception that shows the developer a menu for defining the variable.

  • The resulting infrastructure on the compiler side is a bit complex, see the exception "ReparseAfterSourceEditing".
  • The user interaction can be annoying as it raises dialogs for each var.

For the case of compiling an Undeclared:

The compiler will add an association-like object (UndeclaredVariable) to the Undeclared dictionary (a Dictionary pointed to by the global variable "Undeclared"). It then compiles standard global access: it uses the bytecodes pushLiteralVariable/storeIntoLiteralVariable to read or modify the second ivar of the Variable object (binding), which is referenced from the literal frame of the method.

When defining new Globals, it will check if an Undeclared of that name already exists, copy over the value and remove its entry from the Undeclared dictionary. This fixes all accessing methods without needing a recompile.
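The following Python sketch models the mechanism described above; it is not Pharo code. The names `Binding`, `compile_global_read`, `compile_global_write` and `define_global` are made up for illustration, and the sketch assumes that the binding object itself is reused when the global is defined, which is what lets already compiled methods see the definition without a recompile.

```python
class Binding:
    """Association-like object: a name plus a mutable value slot."""
    def __init__(self, name, value=None):
        self.name = name
        self.value = value

undeclared = {}   # stands in for the Undeclared dictionary
globals_ = {}     # stands in for the dictionary of defined globals

def _lookup(name):
    """Find the binding for a name; register it in Undeclared if unknown."""
    binding = globals_.get(name)
    if binding is None:
        binding = undeclared.setdefault(name, Binding(name))
    return binding

def compile_global_read(name):
    """Return a 'method' that holds the binding, like a literal frame does."""
    binding = _lookup(name)
    return lambda: binding.value

def compile_global_write(name):
    binding = _lookup(name)
    def method(value):
        binding.value = value
    return method

def define_global(name):
    """Define a global, adopting an undeclared binding (and its value) if any."""
    binding = undeclared.pop(name, None) or Binding(name)
    globals_[name] = binding
    return binding

read_foo = compile_global_read("Foo")    # compiled before Foo is defined
write_foo = compile_global_write("Foo")
value_before_write = read_foo()          # None, playing the role of nil
write_foo(7)
foo = define_global("Foo")               # no recompile of read_foo/write_foo
value_after_define = read_foo()          # the value written earlier survives
```

Because both methods hold the same binding object, moving it from one dictionary to the other is enough to turn them into regular global accesses.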

For undeclared instance variables, it will just recompile all methods of the class as soon as you add a new ivar. The value stored in the global is not retrieved: it is just one value, and for an ivar it makes little sense to recover it. Thus a new ivar starts with nil even though code using an Undeclared of that name has written to it before.

Good Properties

  • We can load uses before definitions in non-interactive mode

  • We are asked to define a Variable at compile time when in interactive mode

  • When loading the definition, the uses are fixed automatically with no intervention needed

    • For Globals, this works even with no need for the compiler!
    • For Globals, this means that it fixes even hot methods that are on the stack
  • We do support limited execution capabilities, which allows code to not just load but also be executed with undeclared variables. Caveat: this is only correct for Globals! For ivars, there is just one global representing what should be per-object state.

Problem 1: Undeclared can be executed at runtime with no User Feedback

If you have an Undeclared, it reads as nil and we can write to it as a global.

On the one hand, this is good: being able to execute such code in non-interactive mode is nice.

But on the other hand, it is bad:

  • If an Undeclared is sneaking in, the Programmer might find it very late. Executing code with tests will not find it!
  • It is very confusing to developers. We get bug reports for this regularly!
    • Non-interactive compile happens more often than you think, e.g. on removing an ivar from a class
    • This is especially bad as the resulting execution uses a Global for a variable that is instance state and hides the fact that execution is completely incorrect, see the next point.

Problem 2: All Undeclared Variables are modeled as Globals

An undeclared instance var or temp leads to one shared global of that name.
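A small Python model of this problem, under the assumption that an undeclared `count` ivar is compiled as access to one shared binding. The class `Counter` and the names `Binding` and `undeclared` are illustrative only.

```python
class Binding:
    """Association-like object: a name plus a mutable value slot."""
    def __init__(self, name, value=None):
        self.name = name
        self.value = value

undeclared = {}   # stands in for the Undeclared dictionary

class Counter:
    """A class whose 'count' ivar was never declared."""
    def increment(self):
        # Compiled as global access: every instance hits the same binding.
        slot = undeclared.setdefault("count", Binding("count"))
        slot.value = (slot.value or 0) + 1
        return slot.value

a, b = Counter(), Counter()
first = a.increment()    # 1
second = b.increment()   # 2, although b has never been incremented before
```

The code runs without any error, which is exactly why the incorrect sharing is easy to miss.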

We could do better: we know that variable names that are not capitalized are either temps or instance state. Temps are only accessed in one method (and written to first).

We could see "Undeclared writes" as examples that teach the variable which role it has. Even the first step, compiling what should be a temp as an ivar, is much better than what we have now, as it would be correct for ivars. I am sure we can come up with a good heuristic for whether a variable should be a temp or not.
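One possible shape for such a heuristic, as a Python sketch. The function `guess_role` and the representation of accesses as ordered `(method, kind)` pairs are assumptions made for this example; the rules are only the ones stated above (capitalized means global, a temp is accessed in a single method and written first).

```python
def guess_role(name, accesses):
    """Guess the role of an undeclared variable.

    accesses: ordered list of (method_name, "read" | "write") pairs
    observed so far for this variable name.
    """
    if name[:1].isupper():
        return "global"
    methods = {method for method, _ in accesses}
    # A temp is used in exactly one method and is assigned before it is read.
    if len(methods) == 1 and accesses[0][1] == "write":
        return "temp"
    # Default for lowercase names: treat it as instance state.
    return "ivar"

role_a = guess_role("Transcript", [("run", "read")])
role_b = guess_role("total", [("sum", "write"), ("sum", "read")])
role_c = guess_role("count", [("increment", "write"), ("value", "read")])
```

As more code is loaded, the list of accesses grows, so a variable first guessed to be a temp may later have to become an ivar.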

Small bad properties:

  • To fix the instance state, we need a recompile and thus rely on the compiler anyway.
  • Due to combining "save" and "compile", we are forcing the programmer to define a variable every time a method is saved

What is the relationship to late-binding global access?

I would argue that this is related, but independent of the idea of late-binding global access in general.

  • We have to support undeclared local variables, too
  • We can late-bind Undeclared access without having to late bind all global access

Would it be a problem to rely on recompilation to fix undeclared variable access?

We do not have to. Recompilation would only be needed to remove the Undeclared read to make it fast and to make "references" to that var explicit (for the IDE).

By late-binding Undeclared access, we could make the late-bound version do the right thing in case a recompile is not possible.

  • This would just be slower
  • but it would work. This would e.g. mean that methods on the stack would be able to work after a var is defined

If we compile explicit reads and writes to the Undeclared, we could even fix that and replace them (without the need of a compiler) with the now valid variable after it has been defined. This would then just leave the difference of "pushLiteralVariable" vs. "pushLiteral, send, execute quickReturn" when e.g. loading code without a compiler.

What do we want?

  • In non-interactive mode: code with Undeclareds is loadable
  • The resulting code can be executed in non-interactive mode
  • We want an exception in interactive mode on execute (which shows a menu: define this var!)

How do we get this?

  • We compile a message to an instance of an Undeclared Variable

    • for globals (vars with a capital letter), we just compile a #read and #write
    • for ivars and temps, we compile a send that hands over the current context

  • read and write check whether we are in IDE mode or not

    • yes: show the menu
    • no: simulate an ivar/global/temp by storing the state on the global for globals, as state per var and instance for ivars, and even as context state if we know that this is a temp

  • read and write check whether the var has been defined in the meantime and, if so, need to be able to forward to the real var without a recompile.

  • We need to support fixing the var access if a compiler is available

    • either at read/write or at definition of vars. The current model does the latter, and I think we should do the same.
      • read/write checks if a definition has been added and forwards
      • any new definition of a variable checks if it can fix read/writes of Undeclareds by re-compiling
  • We have different kinds of Unknown Variable classes and select the right one according to a heuristic.

    • at the beginning, we just support ivar and global, heuristic is if the var name is capitalized
    • I think we can support temps, too (just accessed in one method, it is assigned first). But this then needs the possibility to mutate a temp to an instance var as more code is loaded. (in a way, we would teach the var its role slowly).
  • With this in place, we would not need to ask to define variables on "save". We could just compile an Undeclared and the developer fixes it with the gutter "repair" icon or as part of running the tests. Fixing undeclared vars would thus work the same as fixing a DNU. As a side effect, we can radically simplify the exception handling in the compiler and remove the "ReparseAfterSourceEditing" mechanism.

  • We can simplify the #doesNotUnderstand: on nil, which currently tries to find out whether it happens on a nil that comes from an Undefined Global in order to propose creating classes at execution time.
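The read/write protocol from the list above can be modeled roughly as follows. This is a Python sketch of the idea, not a proposal for the actual Pharo classes: `UndeclaredAccess`, `InstanceVariable`, the `interactive` flag and the `receiver` argument (standing in for the context that is handed over) are all assumptions made for the example. Temps are left out, and state written while the variable was undeclared is not migrated on definition.

```python
import weakref

class UndeclaredAccess(Exception):
    """Raised in interactive mode; an IDE would show a 'define this var' menu."""

class InstanceVariable:
    """Hypothetical 'real' variable: the state lives on the receiver itself."""
    def __init__(self, name):
        self.name = name
    def read(self, receiver):
        return getattr(receiver, self.name, None)
    def write(self, receiver, value):
        setattr(receiver, self.name, value)

class UndeclaredVariable:
    def __init__(self, name, interactive=False):
        self.name = name
        self.interactive = interactive
        self.is_global = name[:1].isupper()              # first-step heuristic
        self.global_value = None                         # simulated global state
        self.per_instance = weakref.WeakKeyDictionary()  # simulated ivar state
        self.real = None                                 # set once defined

    def define(self, real):
        self.real = real

    def _check_mode(self):
        if self.interactive:
            raise UndeclaredAccess(self.name)

    def read(self, receiver=None):
        if self.real is not None:        # defined in the meantime: forward
            return self.real.read(receiver)
        self._check_mode()
        if self.is_global:
            return self.global_value
        return self.per_instance.get(receiver)

    def write(self, value, receiver=None):
        if self.real is not None:        # defined in the meantime: forward
            self.real.write(receiver, value)
            return
        self._check_mode()
        if self.is_global:
            self.global_value = value
        else:
            self.per_instance[receiver] = value

class Point:
    pass

x = UndeclaredVariable("x")
p, q = Point(), Point()
x.write(3, receiver=p)
p_before = x.read(receiver=p)   # 3: state is kept per instance
q_before = x.read(receiver=q)   # None: q has its own, still empty, state
x.define(InstanceVariable("x"))
x.write(5, receiver=p)          # forwarded to the real ivar, no recompile
```

In this model, already compiled code keeps sending `read` and `write` to the same object, so methods on the stack keep working after the definition; a recompile would only make the access faster.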