Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store max-bound variable in terms #21

Open
brendanzab opened this issue Jul 8, 2018 · 1 comment
Open

Store max-bound variable in terms #21

brendanzab opened this issue Jul 8, 2018 · 1 comment
Labels
help wanted Extra attention is needed performance Making things faster, or use less memory

Comments

@brendanzab
Copy link
Owner

brendanzab commented Jul 8, 2018

Lean (apparently) stores the maximum bound variable to improve the performance of its locally nameless representation.

Here is a quote from section 3.3 in Elaboration in Dependent Type Theory (note that 'instantiate/abstract' maps to 'open/close' respectively in our nomenclature):

Although the locally nameless approach greatly simplifies the implementation effort, there is a performance penalty. Given a term t of size n with m binders, it takes O(nm) time to visit t while making sure there are no dangling bound variables. In [21], the authors suggest that this cost can be minimized by generalizing abstract and instantiate to process sequences of free and bound variables. This optimization is particularly effective when visiting terms containing several consecutive binders, such as λx1 : A1, λx2 : A2, … , λxn : An, t. Nonetheless, we observed that these two operations were still a performance bottleneck for several files in the Lean standard library. We have addressed this problem using a very simple complementary optimization. For each term t, we store a bound B such that all de Bruijn indices occurring in t are in the range [0,B). This bound can easily be computed when we create new terms: the bound for the de Bruijn variable with index n is n + 1, and given terms t and s with bounds Bt and Bs respectively, the bound for the application (t s) is max(Bt,Bs), and the bound for (λx:t,s) is max(Bt,_Bs_−1). We use the bound B to optimize the instantiate operation. The idea is simple: B enables us to decide quickly whether any subterm contains a bound variable being instantiated or not. If it does not, then our instantiate procedure does not even visit the subterm. Similarly, for each term t, we store a bit that is set to “true” if and only if t contains a free variable. We use this bit to optimize the abstract operation, since it enables us to decide quickly whether a subterm contains a free variable.

These optimizations are crucial to our implementation. The Lean standard library currently contains 172 files, and 41,700 lines of Lean code. With the optimizations, the whole library can be compiled in 71.06 seconds using an Intel Core i7 3.6Ghz processor with 32Gb of memory. Without the optimizations, it takes 2,189.97 seconds to compile the same set of files.

Perhaps our Scope<P, T> type could include this binding depth as a field?

@brendanzab brendanzab added help wanted Extra attention is needed and removed help wanted Extra attention is needed labels Jul 8, 2018
@brendanzab brendanzab added the performance Making things faster, or use less memory label Jul 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed performance Making things faster, or use less memory
Projects
None yet
Development

No branches or pull requests

1 participant