Feature Request: Ignore certain function arguments #135

Open
phayes opened this issue Nov 15, 2022 · 7 comments

phayes commented Nov 15, 2022

I've got a function I'd like to memoize that looks like this:

fn do_expensive_db_query(db: &DbConnection, foo: &str) -> Bar {
    db.query("SELECT ...").args(foo).execute()
}

I'd like to be able to ignore the db argument, since it just contains a live connection to the database and has no impact on the memoized value. It also has a weird lifetime since it's Clone, but is tied back to a connection-pool.

@phayes phayes changed the title Feature Request: Ignore certain arguments Feature Request: Ignore certain function arguments Nov 15, 2022
jaemk (Owner) commented Nov 15, 2022

Hey @phayes, you can ignore arguments by specifying a convert block, which generates the cache key from your arguments. Here's an example:

#[cached(key = "String", convert = r#"{ format!("{}{}", a, b) }"#)]

There are more examples in the docs: https://docs.rs/cached/latest/cached/proc_macro/index.html. When you specify a convert block, you also need to specify either key (the type that will be used as the cache key) or type (the full cache type). This is because the macro can no longer infer the cache key type from your function arguments. So for your function, you might do something like:

#[cached(
    type = "SizedCache<String, Bar>",
    create = "{ SizedCache::with_size(20) }",
    convert = r#"{ String::from(foo) }"#
)]
fn do_expensive_db_query(db: &DbConnection, foo: &str) -> Bar {
    db.query("SELECT ...").args(foo).execute()
}
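To make the behavior concrete: conceptually, the macro expands to a wrapper that computes the key from foo alone and never looks at db when checking the cache. A hand-rolled, synchronous sketch of that idea (with DbConnection and Bar stubbed out as hypothetical stand-ins for the real types) might look like:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for the real types in the issue.
struct DbConnection;

#[derive(Clone, PartialEq, Debug)]
struct Bar(String);

impl DbConnection {
    fn query(&self, sql: &str, arg: &str) -> Bar {
        Bar(format!("{sql}:{arg}"))
    }
}

// Global cache keyed only on `foo`; `db` never participates in the key.
static CACHE: Mutex<Option<HashMap<String, Bar>>> = Mutex::new(None);

fn do_expensive_db_query(db: &DbConnection, foo: &str) -> Bar {
    let key = String::from(foo); // the `convert` block
    {
        let mut guard = CACHE.lock().unwrap();
        if let Some(hit) = guard.get_or_insert_with(HashMap::new).get(&key) {
            return hit.clone(); // cache hit: `db` is ignored entirely
        }
    }
    // Cache miss: `db` is only touched here, on the slow path.
    let result = db.query("SELECT ...", foo);
    CACHE
        .lock()
        .unwrap()
        .get_or_insert_with(HashMap::new)
        .insert(key, result.clone());
    result
}
```

The macro-generated code follows the same shape but with an async-aware lock and whatever cache store you configure; this sketch is only meant to show why ignoring db in the key is safe.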

phayes (Author) commented Nov 15, 2022

Thanks @jaemk! I appreciate the detailed answer.

Will the generated wrapper do the right thing with db: &DbConnection and just drop it on the ground when writing the cache or reading with a cache-hit?

jaemk (Owner) commented Nov 16, 2022

No problem!

The caller still passes the db connection reference, but it's not included in the value that is cached and it's not used to determine if there is a cached value. The only time it's used is when there's a cache miss and the body of your function needs to be executed to produce a value that will be returned and cloned to the cache.

phayes (Author) commented Nov 16, 2022

Thanks @jaemk ,

I ask because I've noticed that when caching functions that take a db: &DbConnection, my database connections eventually stall and I get errors about the connection pool running out of connections.

My entire application is async, and my Cargo.toml looks like this:

cached = { version = "0.40", features = ["async", "async_tokio_rt_multi_thread"] }

Is there anything in particular we need to do to use the async side of the cached crate correctly?

jaemk (Owner) commented Nov 16, 2022

That's interesting. Have you confirmed that it only happens when a function is #[cached]? The core of the macro is right here:

async fn inner(#inputs) #output #body;
let result = inner(#(#input_names),*).await;
let mut cache = #cache_ident.lock().await;
#set_cache_block
result

There are no modifications made to the function contents you are wrapping; the body is simply copied into the "inner" function and called when a cached value isn't found. The chunk above appears as #do_set_return_block in the generated wrapper:

#visibility #signature_no_muts {
    use cached::Cached;
    let key = #key_convert_block;
    {
        // check if the result is cached
        let mut cache = #cache_ident.lock().await;
        if let Some(result) = cache.cache_get(&key) {
            #return_cache_block
        }
    }
    #do_set_return_block
}

My only thought is that the additional await points and potential lock contention introduced by the global cache may mean connections are held longer than necessary when many instances of this task are running concurrently.

It may help to pass a connection pool to this function instead of a connection instance. Then a connection is pulled, used, and returned only when your function body actually executes, instead of every time the cached function is called.
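A sketch of that suggestion, with a hypothetical Pool/PooledConn API standing in for whatever pool crate is in use, so a connection is only checked out on the miss path:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for a real connection-pool API.
struct Pool;
struct PooledConn;

impl Pool {
    // Checkout happens here, not at the call site of the cached function.
    fn get(&self) -> PooledConn {
        PooledConn
    }
}

impl PooledConn {
    fn query(&self, sql: &str, arg: &str) -> String {
        format!("{sql}:{arg}")
    }
}

static CACHE: Mutex<Option<HashMap<String, String>>> = Mutex::new(None);

// Take the pool, not a connection: on a cache hit, no connection is checked out.
fn do_expensive_db_query(pool: &Pool, foo: &str) -> String {
    let key = String::from(foo);
    {
        let mut guard = CACHE.lock().unwrap();
        if let Some(hit) = guard.get_or_insert_with(HashMap::new).get(&key) {
            return hit.clone(); // hit: the pool is never touched
        }
    }
    // Miss: the connection is held only for the duration of this block,
    // then dropped (returned to the pool) as soon as `conn` goes out of scope.
    let conn = pool.get();
    let result = conn.query("SELECT ...", foo);
    CACHE
        .lock()
        .unwrap()
        .get_or_insert_with(HashMap::new)
        .insert(key, result.clone());
    result
}
```

With this shape, heavy cache-hit traffic puts no pressure on the pool at all, which should make the exhaustion symptom much less likely.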

phayes (Author) commented Nov 16, 2022

Thanks @jaemk ,

The issue is frustratingly non-deterministic. I suspect it may have something to do with lock contention.

Thank you for taking the time to explain all this to me, it's very helpful and deeply appreciated.

Let's close this issue for now, I'll open a new one if I manage to get to the bottom of it in a meaningful / reproducible way.

jaemk (Owner) commented Nov 16, 2022

Hope you can figure it out! I'd definitely try passing in a connection pool if possible.
