Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement garbage collector for dict backend #115

Open
youknowone opened this issue Apr 20, 2019 · 4 comments
Open

Implement garbage collector for dict backend #115

youknowone opened this issue Apr 20, 2019 · 4 comments

Comments

@youknowone
Copy link
Owner

youknowone commented Apr 20, 2019

Because dict backend doesn't have any gc machansm, it can be infinitely inflated.
For now, this is warned in documentation but GC will make everyone happy even who doesn't read document.

Decision background: The most important feature of dict backend is transparency. So that user always can access to the bare key/value data of the backend. Due to this limitation, encoding or adding metadata is not a preferred way for dict backend.

Policy:

  • Do not gc unless its size hits maxsize
  • Default value of maxsize is 128. Let's follow @ring.lru interface for maxsize.
  • Once it hits the maxsize, the gc removes at least 25% of maxsize.
  • GC runs 2 strategies one by one.
    • 1st strategy
      • (psuedo) Randomly pick an element and check it is expired.
      • If expired, remove it and run 1st strategy again; otherwise increase missing counter.
      • Once the missing counter hits 4, exit from the strategy.
    • Condition checking: If gc succeeed to remove more than 25% of garbages, stop here.
    • 2nd strategy
      • Randomly pick an element and remove until the size goes less than 75% of maxsize
@amsukdu
Copy link

amsukdu commented Jun 3, 2019

Hello sir,
I think I can work on this. Will you let me?

@youknowone
Copy link
Owner Author

Hello! Thanks to participate in this project.

Any kind of participation is welcomed and appreciated.
Of course you can take this over. Please don't hesitate to share any question, problem or suggestion you encounter during this work.

@amsukdu
Copy link

amsukdu commented Jun 5, 2019

Quite a few questions but I think you'll answer these with delight 😄

  1. if the user didn't define expire, we go straight to strategy 2, right?
  2. Does missing counter have to be parameterized?
  3. You noted

Condition checking: If gc succeeed to remove more than 25% of garbages, stop here.

but I think it could be maxsize not garbages. If it really is garbages, I assume we have to count all of the garbages based on the expire.

  1. storage_class seems like a wrapper of the actual backend cache storage, such as obj. If the storage_class can be injected by the user, how can we track all the getters and setters?

@youknowone
Copy link
Owner Author

  1. Yes
  2. For now, I don't think so. A constant variable seems enough.
  3. Yes. 25% of maxsize is correct.
  4. It is up to users. If someone replaced the storage (implementation), it means they needs their own behaviors. So we only need to implement this feature into the default storage class.

@amsukdu amsukdu mentioned this issue Jun 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants