Support temporarily disabling persisting nils for a model #501

Open
Bringer128 opened this issue Jul 13, 2021 · 3 comments

Comments

@Bringer128

Background

We have an odd scenario involving a model with a default scope.

class Model < ApplicationRecord
  enum create_status: [:in_progress, :success, :failed]
  default_scope { where(create_status: :success) }
end

class ModelBuilder
  def build
    model = Model.new(create_status: :in_progress)
    model.save

    # Do a bunch of work, taking a few seconds either inline or perhaps in a background job, then:
    model.update(create_status: :success)
  end
end

This enables us to hide this model from the rest of the system by default until it has reached :success.

Due to system complexity, a request can sometimes come in for this record while it is in the :in_progress state, which causes IDC to cache a nil value. For us, caching nil is a good thing if the model is in the :failed state or if the record is on another shard (good performance characteristics), but not in the :in_progress state, where it means the record is inaccessible until cache_ttl expires.
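
For concreteness, a minimal sketch of the failure mode, assuming Model includes IdentityCache and is looked up through the cache by its primary key (everything besides Model.fetch and Model.create! is scenario description, not asserted library behaviour):

model = Model.create!(create_status: :in_progress)

# A concurrent request arrives while the builder is still working:
Model.fetch(model.id) # the default scope filters the row out, so IDC caches nil

# In the scenario described above, readers going through the cache keep seeing
# nothing for this id until the cached nil goes away, i.e. until cache_ttl expires.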

Proposal

Add another flag that can be persisted to the db for a record, called idc_do_not_cache_nil. Internally, IDC will load this value and then attempt to load the record from the db. In this scenario there are two possibilities:

  1. ActiveRecord returns nil - the default scope couldn't find the record (either :in_progress, :failed or simply not there) - return the nil to the client but do not cache.
  2. ActiveRecord returns the record - cache as normal.

The maximum time idc_do_not_cache_nil is present could be cache_ttl, or it could be shorter in our case.
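
For illustration only, the fetch path under this proposal might look roughly like the following; fetch_record_from_db, cache_nil, cache_record and do_not_cache_nil? are hypothetical names, not existing IdentityCache internals:

def fetch_with_optional_nil_caching(id)
  record = fetch_record_from_db(id) # goes through the default scope
  if record.nil?
    # Possibility 1: return nil to the client, but only cache the nil when
    # the do-not-cache flag has not been set for this id.
    cache_nil(id) unless do_not_cache_nil?(id)
    nil
  else
    # Possibility 2: cache and return the record as normal.
    cache_record(record)
    record
  end
end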

This solves the exact problem we're encountering without introducing locking, as discussed for the thundering herd problem (#373).

@kuldeepaggarwal pinned this issue Jul 13, 2021
@dylanahsmith
Contributor

First of all, we should probably have a way to completely opt-out of nil caching for a cache key (e.g. the primary cache key for a model). Caching nil was added to handle the situation where there could be a large number of readers trying to read a record after it gets deleted (e.g. for a 404 page). However, it doesn't really make sense for records that don't get deleted (from the perspective of the default scope) and can introduce additional consistency issues around record creation.

I suspect that would be good enough for your use case as well.
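
Purely as a sketch of that first idea, a per-model opt-out could be declared something like this; cache_nil_values is a hypothetical setting that does not exist in IdentityCache today (class_attribute is used only so the snippet stands on its own):

class Model < ApplicationRecord
  include IdentityCache

  # Hypothetical knob; a real implementation would live inside the gem.
  class_attribute :cache_nil_values, default: true
  self.cache_nil_values = false
end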

Add another flag that can be persisted to the db for a record, called idc_do_not_cache_nil. Internally, IDC will load this value and then attempt to load the record from the db

If Identity Cache is loading from the database using the default scope, then the row would be filtered out in the database, so it wouldn't even see an idc_do_not_cache_nil column. Fundamentally, the database query itself would need to change, which also means that the default scope could no longer be used unchanged. Changing the default scope just for this caching concern doesn't seem appropriate. Also, using a separate scope for IdentityCache so that it can get these rows and filter them in memory seems like it could interfere with the desired coupling to the default scope's semantics about what gets returned.
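
As a quick illustration of that filtering, using the model from the issue description (the generated SQL is approximate and adapter-dependent):

Model.where(id: 42).to_sql
# => SELECT "models".* FROM "models"
#    WHERE "models"."create_status" = 1 AND "models"."id" = 42
# An :in_progress row never reaches Ruby, so no per-row column could be consulted.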

Also, if Identity Cache is going to have to load the records and do filtering in-memory, then it would already have the values that the proposed idc_do_not_cache_nil is derived from. So it feels unnecessary to have to add another column to the database for this derived data.

This solves the exact problem we're encountering without introducing locking, as discussed for the thundering herd problem (#373).

I don't see how this issue is related to that one. Cache fetching already uses the CAS/ADD cache operations to handle race conditions around cache fills, even without that change, and the decision about whether or not to cache something loaded from the database just affects whether or not the cache is filled using one of those operations.

@Bringer128
Author

If Identity Cache is loading from the database using the default scope, then the row would be filtered out in the database, so it wouldn't even see an idc_do_not_cache_nil column. Fundamentally, the database query itself would need to change, which also means that the default scope could no longer be used unchanged. Changing the default scope just for this caching concern doesn't seem appropriate. Also, using a separate scope for IdentityCache so that it can get these rows and filter them in memory seems like it could interfere with the desired coupling to the default scope's semantics about what gets returned.

My apologies, I made a mistake in my proposal wording. The goal was to persist an idc_do_not_cache_nil flag into the cache for the record, in the same way that the special deleted and nil values are persisted. Storing something in the db would rightfully not apply here.

First of all, we should probably have a way to completely opt-out of nil caching for a cache key (e.g. the primary cache key for a model). Caching nil was added to handle the situation where there could be a large number of readers trying to read a record after it gets deleted (e.g. for a 404 page). However, it doesn't really make sense for records that don't get deleted (from the perspective of the default scope) and can introduce additional consistency issues around record creation.

I suspect that would be good enough for your use case as well.

I think this would probably work too. I can look into measuring the impact (basically the number of requests per day, etc., for this model that should return 404s).

This solves the exact problem we're encountering without introducing locking, as discussed for the thundering herd problem (#373).

I don't see how this issue is related to that one. Cache fetching already uses the CAS/ADD cache operations to handle race conditions around cache fills, even without that change, and the decision about whether or not to cache something loaded from the database just affects whether or not the cache is filled using one of those operations.

I can definitely look into this again, but my recollection was that we didn't use CAS when it didn't exist in the cache in the first place.

@dylanahsmith
Contributor

My apologies, I made a mistake in my proposal wording. The goal was to persist an idc_do_not_cache_nil flag into the cache for the record, in the same way that the special deleted and nil values are persisted.

The special value for caching nil is there to avoid ambiguity in the backend APIs between a cached nil and the cache not being filled, which ideally we could get rid of.

The special value for delete is written on cache invalidation in order to invalidate any concurrent cache fills with old data. Semantically, it is treated as if the cache isn't filled.

However, in this case we can just not fill the cache when the record is in this state. This is only detected on a cache miss, when the cache is already not filled, so we can just leave it in that state.
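
In other words, on a cache miss the only decision is whether to fill the cache or leave it empty (load_from_database, fill_cache, want_to_cache_nil? and cached_nil_marker are illustrative names, not IdentityCache's actual code):

record = load_from_database(id)     # filtered by the default scope
if record
  fill_cache(id, record)            # cache as normal
elsif want_to_cache_nil?
  fill_cache(id, cached_nil_marker) # otherwise simply leave the cache unfilled
end
record                              # nil is still returned to the caller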

my recollection was that we didn't use CAS when it didn't exist in the cache in the first place.

Right, because CAS only works for updating an existing value, so we use ADD to fill the cache in this case, which will also be invalidated by writing the special delete value.
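
A rough sketch of that fill-on-miss pattern with generic memcached semantics (a Dalli-style client is assumed; load_from_database, should_cache? and encode are illustrative names, and this is not IdentityCache's actual implementation):

if cache.get(key).nil?
  record = load_from_database(key)
  # ADD only succeeds while the key is still absent, so a concurrent write
  # (such as the special delete value written on invalidation) wins over this fill.
  cache.add(key, encode(record), ttl) if should_cache?(record)
end
# CAS, by contrast, only updates a value that already exists in the cache.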
