Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loads old, incorrect, picture from it's cache instead of new picture #333

Open
IanQuilter opened this issue Jan 27, 2024 · 23 comments
Open

Comments

@IanQuilter
Copy link

I'm using Home Assistant to send a picture in a Gotify notification when a cctv camera is triggered. It's working and has been working since I started using last July. The pictures have a name and a number ie Drive1, Front3, Side1 etc. The number is incremented up to 100 then reset and starts from 1 again. ie Drive98 then Drive 99 then back to Drive1 then Drive2 etc.
The problem is that sometimes an old picture is served up from the app's cache instead of the new picture. Clearing the cache stops it happening for a while, then it does it again.

This has been doing it since I started using it. It's annoying and I have put up with it until now but it's happening more often and has finally made me annoyed enough report it.

The way the pictures are handled in Home Assistant means that the numbers added to the picture name aren't always consecutive and can have quite a few left out, ie Drive 34 may be followed by Drive 45 then Drive46 then Drive50 etc. Which means that it can be a day, or a week or more, before a picture is sent with the same number as one already in the cache.
Is there anything that can be done to the app to correct this?

@jmattheis
Copy link
Member

Are you able to modify the image url before pushing the message to gotify? You could include the current timestamp inside the url e.g. https://example.org/images/drive98.png?timestamp=1706353928517 then the cache won't use an old images because the url is unique.

@IanQuilter
Copy link
Author

Thanks for your suggestion of the time stamp I'll see if I can use it. I should have thought of that myself!

The reason I used a rotating 1 - 100 number suffix was that the pictures are stored locally and by re-using the numbers the old pictures eventually get overwritten as they don't need to be kept forever, but might need to be looked at again in a short period of time. Using 100 gives a long enough retention time without using too much storage space.

I'm not sure there is an easy way of deleting the old pictures if they all have unique file names. I'll see what I can do.

@IanQuilter
Copy link
Author

IanQuilter commented Jan 27, 2024

I've looked at changing the Home Assistant automation to add a time stamp to the picture. While it can do it it would mean completely recreating my automations and it will not be easy to do as multiple cameras use the same automation, using variables to enter camera name, picture name, number etc.

As a temporary test I've increased the number 100 that it resets to 1 at to 200 in the hope that the pictures in Gotify's cache will have expired by the time the same picture name reappears.

Do you know how long items in Gotify's cache are kept for?

@jmattheis
Copy link
Member

It is a LRU cache with a fixed size of 50mb, if it's more than that the least recently used picture is removed. Default caching of the http client is used here, so it's likely configurable via cache-control headers on the server that hosts the images.

@IanQuilter
Copy link
Author

IanQuilter commented Jan 27, 2024

Thanks for the reply. A 50mb cache may allow my picture number increase to 200 to work, as last time I cleared the cache it was around 35mb if I remember correctly. If 200 isn't enough increasing it a bit more may be all that's needed. I'll have to wait and see, which may take a few days or even a week or two.

By "server that hosts the images" do you mean the server that hosts Gotify or the Home Assistant server that is sending the images? I've never had to configure "cache-control headers" before so I'll need to look into exactly how they can be configured.

Thanks for your help and prompts replies.

@jmattheis
Copy link
Member

By "server that hosts the images" do you mean the server that hosts Gotify or the Home Assistant server that is sending the images?

Neither, or at least they are not required to be one of the ones listed. Example, if you push a message like this.

{
  "title": "Nextcloud",
  "message": "![](https://goverter.jmattheis.de/favicon.png)",
  "extras": {
    "client::display": {
      "contentType": "text/markdown"
    }
  }
}

The "server that hosts the images" is the one behind goverter.jmattheis.de.

@IanQuilter
Copy link
Author

Thanks for the explanation.

Home Assistant sends the Gotify notifications using a Rest api. I've added the cache-control: "no-cache" line to it and changed my reset number back to 100. Now I just need to wait and see if it's worked! I'll post back when I know in case it helps someone else.

Thanks for your help

@IanQuilter
Copy link
Author

Unfortunately adding the line cache-control: "no-cache" hasn't worked as I've started to get old pictures sent again. It's taken a while to show as there have been less notification being triggered recently. Checking the Gotify cache it has reached 50Mb so it appears that adding the no-cache line hasn't stopped Gotify caching the pictures.

The configuration entry in Home Assistant contains :-

platform: rest
resource: http://192.168.0.36:9080/message
method: POST_JSON
headers:
  cache-control: "no-cache"
  X-Gotify-Key: XXXXXXXXXXX
message_param_name: message
title_param_name: title
target_param_name: priority
data:
  priority: 1

It's possible I've not entered the command correctly, although I think it's correct. If it is correct do you have any other suggestions for stopping Gotify caching the pictures?
Thanks.

@jmattheis
Copy link
Member

Your setup looks fine. This then has to be fixed in the app, maybe the image loading framework has to be replaced with something supporting expiring images.

@IanQuilter
Copy link
Author

Thanks for the confirmation. As it doesn't look like this issue effects many users, or at least no one else seems to have reported it, I assume it won't be high on your fix agenda. But if you do make future changes to the app are you likely to fix this at the same time?

@jmattheis
Copy link
Member

Maybe, but I can't say this for sure.

@cyb3rko
Copy link
Contributor

cyb3rko commented Mar 11, 2024

As Picasso has not been updated in 4 years, switching to a more modern solution sounds reasonable.
Coil is a very modern and well maintained alternative:
https://coil-kt.github.io/coil
https://github.com/coil-kt/coil

By default it also respects cache headers of network responses.
Maybe that would already fix your issue.

@cyb3rko
Copy link
Contributor

cyb3rko commented Mar 12, 2024

I've tried it out and it works well for all kinds of image loading in the app as far as I can tell.
I will upload it soon so you two can test it out.

@IanQuilter
Copy link
Author

Sorry, I'm away and replying using a mobile and couldn't work out how to add a comment to the pull thread.
Yes, can you provide an apk as I have no idea how to make use of the coil code myself.
Thanks.

@aagit
Copy link

aagit commented May 12, 2024

method: POST_JSON
headers:
cache-control: "no-cache"
X-Gotify-Key: XXXXXXXXXXX

The cache-control of the http POST HA sends to gotify server isn't going to alter the caching behavior of the images you receive in gotify, you probably need to tweak the GET request on the http server that delivers the image to alter the behavior of the image cache.

@aagit
Copy link

aagit commented May 12, 2024

My workflow is about the opposite and my problem is that I'm not getting enough caching.

To avoid wasting space, I give gotify max 1 min to fetch the image into the android app cache, after that the image is gone and gotify can't fetch it anymore.

So ultimately the only place where the image is stored for longer than 1 minute is in the cache in the android client, or the web interface if I'm logged in.

To make sure gotify stores the image into the android cache while it's available I post the url of the image both on message markdown as well as the url of the bigImageUrl. If the android client is offline, I'll lose the image but that's fine with me, it's useless if it's not in real time anyway.

bigImageUrl never fails to show the image, just sometime the images are missing from the app in which case killing the app and restarting it shows all the images from the cache even if they were not shown before.

I tried the new upstream code with coil discussed above, well my workflow doesn't work anymore there. The whole disk cache is deleted a refresh which triggers when the app is opened and I believe the disk key and memory key need to be manually specified as url for the bigImageUrl request case.

So I just wonder if you could consider optimizing away all unnecessary http requests, so I could keep using my current workflow.

Thanks!

@aagit
Copy link

aagit commented May 12, 2024

You could include the current timestamp

Depending on the use case for HA, filename_{{ this.context.id }}.jpg or {{ trigger.event.data.camera + '_' + trigger.event.context.id }}.jpg may be simpler than adding a timestamp parameter because it avoids the non atomic overwriting which may cause other glitches, not that it matters in practice though.

@aagit
Copy link

aagit commented May 14, 2024

Just a minor update: it looks like the problem with coil is two parallel requests on the same url lead to data corruption despite the diskcache is serialized (by jvm means). Sometime I see corrupted images in the notification url, other times the image isn't received in the app but it looks ok in the bigImageUrl. I wonder if the notification context invokes the coil request of the bigImageURL from a different process that isn't able to serialize by normal means, or if it's a bug in coil disk cache.

If I use the latest release (pre-coil) it works better, but it is still flakey and I stilll get some image not shown occasionally, which is why I tried the upstream version to see if it resolved it.. I never seen corrupted images on the stable version and pulling down to invoke a refresh resolves it. In the new app pulling down wipes the disk cache and prevents to see any past image. Killing the app without refreshing it fixed it with coil too.

It's just the first request that goes bad if app and notification fetch the same url in parallel. Likely the problem is the two requests step on each other and one of the two sees partially fetched data from a partially written diskcahe file.

You should be able to reproduce my issue by opening the app and running this after filling in URL and IMAGE.

export URL=<your post handler url>; export IMAGE="<your favorite large image url>?$RANDOM"; curl -s -S --data '{"message": "'""\!"[x]($IMAGE)"'", "extras": {"client::display": {"contentType": "text/markdown"}, "client::notification": { "bigImageUrl": "'"$IMAGE"'" } } }' -H 'Content-Type: application/json' "$URL"

@jmattheis
Copy link
Member

Could be related to #337 (comment) as a quick fix we can probably use different folders for both coil instances. @cyb3rko what do you think?

@aagit
Copy link

aagit commented May 15, 2024

I see in #337 you already predicted there could be a race condition in the disk cache.

I wouldn't entirely rule out a bug in coil DiskLruCache, if nothing else because it looks fishy that the journal file is missing when the corrupted image show up, only the cache .0 and .1 files are in the cache dir at that time. And the journal didn't leave me impressed because is some append only text file, at best it's not going to scale and perform as well as it could.

Said that using different folders would most certainly solve it even if it'd be a bug in coil However that wouldn't be very different than disabling the disk cache for my purposes: I would be back to square one.

I'm deleting the images pointed by the bigImageUrl 1min after sending the message, I don't want to keep two caches one in the app and one on the server sending the message.

So I need coil to fetch the message at notification time, not later when I open the app and the markdown is rendered, by that time the image can't be downloaded anymore.

If making the same disk cache work reliably for both bigImageUrl and markdown is problematic, my preferred solution would be to add an explicit option to parse the markdown and pre-fetch the url in the markdown at notification time. So that the disk cache is filled while the image is still available.

The primary reason I use bigImageUrl is that it's the only thing fetched at notification time and I can't pre-fill the disk cache if I only use markdown.

@cyb3rko
Copy link
Contributor

cyb3rko commented May 15, 2024

@jmattheis I will take a look at this when I find some time

@IanQuilter
Copy link
Author

IanQuilter commented May 16, 2024

Not sure if it helps but.
I've been using the Dev version for about 3 weeks now and I haven't had any corruption of images or URL's. I send a click URL and a jpg image by the bigImageUrl. I only send jpg images so perhaps the format of the image makes a difference?

I have very occasionally received a notification with no image but one, two or sometimes three refreshes by pulling down eventually loads it.

But I'm not deleting the images pointed to by the bigImageUrl for quite some time, maybe hours or even days depending on how often the images are sent, so refreshing isn't an issue for my use case.

@aagit
Copy link

aagit commented May 17, 2024

I see in #337 you already predicted there could be a race condition in the disk cache.

I didn't notice yet there's two instances of the CoilHandler pointing to the same disk cache directory when I answered, I agree that's the most likely explanation for the image corruption...

For now I disabled all caching for the notification request only and it solves the corruption fine.

It still has some other issue, even if I don't delete the files scrolling down after a refresh doesn't show all images that are in the disk cache, scrolling up then shows them. There's no network request done when it failed to load some image during the scroll down. It's a scrolling rendering bug of some kind, I guess it doesn't even make the request to coil when this happens.

diff --git a/app/src/main/kotlin/com/github/gotify/CoilHandler.kt b/app/src/main/kotlin/com/github/gotify/CoilHandler.kt
index 9478111..be5169c 100644
--- a/app/src/main/kotlin/com/github/gotify/CoilHandler.kt
+++ b/app/src/main/kotlin/com/github/gotify/CoilHandler.kt
@@ -5,6 +5,7 @@ import android.graphics.Bitmap
 import android.graphics.BitmapFactory
 import android.graphics.drawable.BitmapDrawable
 import coil.ImageLoader
+import coil.request.CachePolicy
 import coil.annotation.ExperimentalCoilApi
 import coil.decode.SvgDecoder
 import coil.disk.DiskCache
@@ -38,7 +39,9 @@ internal class CoilHandler(private val context: Context, private val settings: S
     @Throws(IOException::class)
     fun getImageFromUrl(url: String?): Bitmap {
         val request = ImageRequest.Builder(context)
-            .data(url)
+            .data(Utils.resolveAbsoluteUrl("${settings.url}/", url))
+	    .memoryCachePolicy(CachePolicy.DISABLED)
+	    .diskCachePolicy(CachePolicy.DISABLED)
             .build()
         return (imageLoader.executeBlocking(request).drawable as BitmapDrawable).bitmap
     }
@@ -48,9 +51,7 @@ internal class CoilHandler(private val context: Context, private val settings: S
             return BitmapFactory.decodeResource(context.resources, R.drawable.gotify)
         }
         try {
-            return getImageFromUrl(
-                Utils.resolveAbsoluteUrl("${settings.url}/", app.image)
-            )
+            return getImageFromUrl(app.image)
         } catch (e: IOException) {
             Logger.error(e, "Could not load image for notification")
         }
@@ -59,10 +60,18 @@ internal class CoilHandler(private val context: Context, private val settings: S
 
     fun get() = imageLoader
 
-    @OptIn(ExperimentalCoilApi::class)
     fun evict() {
+    	evictInternal(false);
+    }
+    fun evictMemory() {
+    	evictInternal(true);
+    }
+    @OptIn(ExperimentalCoilApi::class)
+    private fun evictInternal(onlyMemory: Boolean) {
         try {
-            imageLoader.diskCache?.clear()
+	    if (!onlyMemory) {
+	        imageLoader.diskCache?.clear()
+            }
             imageLoader.memoryCache?.clear()
         } catch (e: IOException) {
             Logger.error(e, "Problem evicting Coil cache")
diff --git a/app/src/main/kotlin/com/github/gotify/messages/MessagesActivity.kt b/app/src/main/kotlin/com/github/gotify/messages/MessagesActivity.kt
index 006a981..48e09e6 100644
--- a/app/src/main/kotlin/com/github/gotify/messages/MessagesActivity.kt
+++ b/app/src/main/kotlin/com/github/gotify/messages/MessagesActivity.kt
@@ -175,7 +175,7 @@ internal class MessagesActivity :
     }
 
     private fun onRefresh() {
-        viewModel.coilHandler.evict()
+        viewModel.coilHandler.evictMemory()
         viewModel.messages.clear()
         launchCoroutine {
             loadMore(viewModel.appId).forEachIndexed { index, message ->

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants