Support temporarily disabling persisting nils for a model #501

Open
Bringer128 opened this issue Jul 13, 2021 · 3 comments

Comments

@Bringer128

Background

We have an odd scenario involving a model with a default scope.

class Model < ApplicationRecord
  enum create_status: [:in_progress, :success, :failed]
  default_scope { where(create_status: :success) }
end

class ModelBuilder
  def build
    model = Model.new(create_status: :in_progress)
    model.save

    # Do a bunch of work, taking a few seconds either inline or perhaps in a background job, then:
    model.update(create_status: :success)
  end
end

This enables us to hide this model from the rest of the system by default until it has reached :success.

Due to system complexity, a request can sometimes come in for this record while it is in the :in_progress state, which causes IDC to cache a nil value. For us, caching nil is a good thing if the model is in the :failed state or if the record is on another shard (good performance characteristics), but not in the :in_progress state, where it means the record is inaccessible until cache_ttl expires.
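
For concreteness, a minimal sketch of the failure mode, assuming Model includes IdentityCache and is looked up through the cache by its primary key (everything besides Model.fetch and Model.create! is scenario description, not asserted library behaviour):

model = Model.create!(create_status: :in_progress)

# A concurrent request arrives while the builder is still working:
Model.fetch(model.id) # the default scope filters the row out, so IDC caches nil

# In the scenario described above, readers going through the cache keep seeing
# nothing for this id until the cached nil goes away, i.e. until cache_ttl expires.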

Proposal

Add another flag that can be persisted to the db for a record, called idc_do_not_cache_nil. Internally, IDC will load this value and then attempt to load the record from the db. In this scenario there are two possibilities:

  1. ActiveRecord returns nil - the default scope couldn't find the record (either :in_progress, :failed or simply not there) - return the nil to the client but do not cache.
  2. ActiveRecord returns the record - cache as normal.

The maximum time idc_do_not_cache_nil is present could be cache_ttl, or it could be shorter in our case.
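
For illustration only, the fetch path under this proposal might look roughly like the following; fetch_record_from_db, cache_nil, cache_record and do_not_cache_nil? are hypothetical names, not existing IdentityCache internals:

def fetch_with_optional_nil_caching(id)
  record = fetch_record_from_db(id) # goes through the default scope
  if record.nil?
    # Possibility 1: return nil to the client, but only cache the nil when
    # the do-not-cache flag has not been set for this id.
    cache_nil(id) unless do_not_cache_nil?(id)
    nil
  else
    # Possibility 2: cache and return the record as normal.
    cache_record(record)
    record
  end
end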

This solves the exact problem we're encountering without introducing locking, as discussed for the thundering herd problem (#373).

@kuldeepaggarwal pinned this issue Jul 13, 2021
@dylanahsmith
Contributor

First of all, we should probably have a way to completely opt-out of nil caching for a cache key (e.g. the primary cache key for a model). Caching nil was added to handle the situation where there could be a large number of readers trying to read a record after it gets deleted (e.g. for a 404 page). However, it doesn't really make sense for records that don't get deleted (from the perspective of the default scope) and can introduce additional consistency issues around record creation.

I suspect that would be good enough for your use case as well.
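
Purely as a sketch of that first idea, a per-model opt-out could be declared something like this; cache_nil_values is a hypothetical setting that does not exist in IdentityCache today (class_attribute is used only so the snippet stands on its own):

class Model < ApplicationRecord
  include IdentityCache

  # Hypothetical knob; a real implementation would live inside the gem.
  class_attribute :cache_nil_values, default: true
  self.cache_nil_values = false
end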

Add another flag that can be persisted to the db for a record, called idc_do_not_cache_nil. Internally, IDC will load this value and then attempt to load the record from the db

If Identity Cache is loading from the database using the default scope, then the row would be filtered out in the database, so it wouldn't even see an idc_do_not_cache_nil column. Fundamentally, the database query itself would need to change, which also means that the default scope could no longer be used unchanged. Changing the default scope just for this caching concern doesn't seem appropriate. Also, using a separate scope for IdentityCache so that it can get these rows and filter them in memory seems like it could interfere with the desired coupling to the default scope's semantics about what gets returned.
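
As a quick illustration of that filtering, using the model from the issue description (the generated SQL is approximate and adapter-dependent):

Model.where(id: 42).to_sql
# => SELECT "models".* FROM "models"
#    WHERE "models"."create_status" = 1 AND "models"."id" = 42
# An :in_progress row never reaches Ruby, so no per-row column could be consulted.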

Also, if Identity Cache is going to have to load the records and do filtering in-memory, then it would already have the values that the proposed idc_do_not_cache_nil is derived from. So it feels unnecessary to have to add another column to the database for this derived data.

This solves the exact problem we're encountering without introducing locking, as discussed for the thundering herd problem (#373).

I don't see how this issue is related to that one. Cache fetching already uses the CAS/ADD cache operations to handle race conditions around cache fills, even without that change, and the decision about whether or not to cache something loaded from the database just affects whether or not the cache is filled using one of those operations.

@Bringer128
Author

If Identity Cache is loading from the database using the default scope, then the row would be filtered out in the database, so it wouldn't even see an idc_do_not_cache_nil column. Fundamentally, the database query itself would need to change, which also means that the default scope could no longer be used unchanged. Changing the default scope just for this caching concern doesn't seem appropriate. Also, using a separate scope for IdentityCache so that it can get these rows and filter them in memory seems like it could interfere with the desired coupling to the default scope's semantics about what gets returned.

My apologies, I made a mistake in my proposal wording. The goal was to persist an idc_do_not_cache_nil flag into the cache for the record, in the same way that the special deleted and nil values are persisted. Storing something in the db would rightfully not apply here.

First of all, we should probably have a way to completely opt-out of nil caching for a cache key (e.g. the primary cache key for a model). Caching nil was added to handle the situation where there could be a large number of readers trying to read a record after it gets deleted (e.g. for a 404 page). However, it doesn't really make sense for records that don't get deleted (from the perspective of the default scope) and can introduce additional consistency issues around record creation.

I suspect that would be good enough for your use case as well.

I think this would probably work too. I can look into measuring the impact (basically the number of requests per day, etc., for this model that should return 404s).

This solves the exact problem we're encountering without introducing locking, as discussed for the thundering herd problem (#373).

I don't see how this issue is related to that one. Cache fetching already uses the CAS/ADD cache operations to handle race conditions around cache fills, even without that change, and the decision about whether or not to cache something loaded from the database just affects whether or not the cache is filled using one of those operations.

I can definitely look into this again, but my recollection was that we didn't use CAS when it didn't exist in the cache in the first place.

@dylanahsmith
Contributor

My apologies, I made a mistake in my proposal wording. The goal was to persist an idc_do_not_cache_nil flag into the cache for the record, in the same way that the special deleted and nil values are persisted.

The special value for caching nil is there to avoid ambiguity in the backend APIs between a cached nil and the cache not being filled, which ideally we could get rid of.

The special value for delete is written on cache invalidation in order to invalidate any concurrent cache fills with old data. Semantically, it is treated as if the cache isn't filled.

However, in this case we can just not fill the cache when the record is in this state. This is only detected on a cache miss, when the cache is already not filled, so we can just leave it in that state.
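
In other words, on a cache miss the only decision is whether to fill the cache or leave it empty (load_from_database, fill_cache, want_to_cache_nil? and cached_nil_marker are illustrative names, not IdentityCache's actual code):

record = load_from_database(id)     # filtered by the default scope
if record
  fill_cache(id, record)            # cache as normal
elsif want_to_cache_nil?
  fill_cache(id, cached_nil_marker) # otherwise simply leave the cache unfilled
end
record                              # nil is still returned to the caller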

my recollection was that we didn't use CAS when it didn't exist in the cache in the first place.

Right, because CAS only works for updating an existing value, so we use ADD to fill the cache in this case, which will also be invalidated by writing the special delete value.
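
A rough sketch of that fill-on-miss pattern with generic memcached semantics (a Dalli-style client is assumed; load_from_database, should_cache? and encode are illustrative names, and this is not IdentityCache's actual implementation):

if cache.get(key).nil?
  record = load_from_database(key)
  # ADD only succeeds while the key is still absent, so a concurrent write
  # (such as the special delete value written on invalidation) wins over this fill.
  cache.add(key, encode(record), ttl) if should_cache?(record)
end
# CAS, by contrast, only updates a value that already exists in the cache.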
