Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to use geocoder client side #213

Open
bozdoz opened this issue Mar 4, 2023 · 52 comments
Open

Option to use geocoder client side #213

bozdoz opened this issue Mar 4, 2023 · 52 comments

Comments

@bozdoz
Copy link
Owner

bozdoz commented Mar 4, 2023

From an email:

Instead of storing the latitude and longitude of all locations in the database, as individual entries in the database,

I would like better if the lat lng were not to be saved in the database and if they could instead to be calculated on the runtime, in the client side, when loading the page.

I sent you a snippet that demonstrates this possibility.

I have hundreds of locations. So there are huge entries saved in the database, more than WordPress and plugins combined.

This could be a new feature, a new attribute which you toggle on or off while generating the shortcode.
" to retrieve the data from database: toggle on or off".

If the short code is ON by default, it maintains current functionality.

If the value is OFF, it is going to calculate the lat and long from "London" in real time, without storing in database while doing so.

@webd-uk
Copy link

webd-uk commented Apr 19, 2023

From https://wordpress.org/support/topic/wp_options-table-with-lots-of-leaflet_osm_-rows/#post-16668472

In that case surely the plugin should use the WordPress Transients API rather than the options API?

That would negate the requirement for the “Clear Geocoder Cache” button and allow old data to be forgotten.

I would appreciate if this could be considered.

Thank you.

Oliver

@hupe13
Copy link
Collaborator

hupe13 commented Apr 19, 2023

This question also relates to this issue: https://wordpress.org/support/topic/leaflet-marker-shortcode-check-if-address-exists/

@designzwang
Copy link

Quite difficult to decide how to do the lookup ... I found this issue while searching a solution for a ( potential ) database problem.

Looking up the location at runtime will of course case lots of hits on the geocoder, and a plugin like yours is probably meant to be easy to use without any additioinal dependencies (especially when it comes to commercial vendors).

Otherwise, like with google maps or others, you need an API key if you create larger amounts of Geocoding requests. I know that some plugins, like TEC (see below), come with a "builtin" key, but in the end someone has to pay for it and I am not aware of a geocoder that is actually "free" and unlimited at the same time.

So I think your combination of Nominatim and a local cached storage in the DB is a very good concept - but only for sites with only a few maps. Therefore, IMHO improving the caching solution would probably be better than following the suggestion to do a runtime client side lookup.

Regarding my DB problem with geocoding/caching:

I want to use the "TheEventsCalendar" plugin for all sorts of events. It comes with a google maps integration , and it has a builtin API key for some basic features, so it can do geocoding out of the box. Unfortunately it has no OSM features.

So I wanted to replace google maps with leaflet , using your plugin and using do_shortcode(...). Basically it works, but it will fill our options table very quickly, because we will have many events in different locations, and the Calendar does not store Lat/Long information, but only addresses. So there will be a lookoup for every new event or location, and this will end up in the options table.

Problems here:

  • hundreds, if not thousands, of entries in the option table, and they are autoloaded ( btw - why autoload?)
  • one huge array with all addresses in leaflet_geocoded_locations' option, which will eventually grow so big that it may break things
  • creates the necessity for manual "housekeeping" , i.e. deleting cache
  • will flood the geocoder with requests after "housekeeping"

All this makes me feel a bit uncomfortable without really a good idea for a solultion

@webd-uk - When commenting on https://wordpress.org/support/topic/wp_options-table-with-lots-of-leaflet_osm_-rows/, I first thought that using transients would be a better solution, but actually it will not really change the DB usage - transients will also end up in the options table, and they will not be deleted automatically on expire unless there are housekeeping plugins in place who clean up.

How about this:

  • using transients instead of "normal" option entries , with a reasonable life span of , say, a month or so and no autoloading
  • change the housekeeping/delete job to have a choice : clean out all or only expired transients
  • create a cron task (and/or wpcli action) to clean expired transients
  • skip creating the potentially huge "count" array in leaflet_geocoded_locations - identify them by a common prefix

This would avoid to clean out all cached entries at once, causing many new coding requests, and would be a more stable self-cleaning setup that will not store an unlimited amount of lookups if nobody uses the delete function. The maximum amount of cached entries would be limited by the default cache ttl of the entries, and that coud be made to be a configurable value.

I think creating and using a separate table is considered an antipattern for plugin developers these days, isn't it? Or would that work?

Regards
Ulrich

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 19, 2023

In that case surely the plugin should use the WordPress Transients API rather than the options API?

From the Transients link:

storing cached data in the database temporarily

This is not meant to be a temporary cache; it's intended to reduce the calls to Nominatum or Google Geocoder API.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 19, 2023

hundreds, if not thousands, of entries in the option table, and they are autoloaded ( btw - why autoload?)

I'm not sure how or why autoload is there. Any advice on that?

https://github.com/bozdoz/wp-plugin-leaflet-map/blob/master/class.geocoder.php#L54-L57

                $location = (Object) $this->$geocoding_method( $address );

                /* add location */
                add_option($cached_address, $location);

Does add_option automatically add autoload?

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 19, 2023

One more comment: if you clear the geocoder cache and then visit the shortcode helper page in the admin, you should see a significant delay (because it's looking up addresses). That's one problem I'm trying to mitigate. Also, as mentioned above, I want to hit the geocoder endpoints as little as possible; I believe Nominatum and maybe Google Geocoder both require that consumers cache their lookups.

@webd-uk
Copy link

webd-uk commented Apr 19, 2023

I'm not sure how or why autoload is there. Any advice on that?

Basically unless an option is used on all or lots of pages then it shouldn’t be auto loaded …

From https://developer.wordpress.org/reference/functions/add_option/

$autoload string|bool Optional
Whether to load the option when WordPress starts up.
Default is enabled. Accepts 'no' to disable for legacy reasons.
Default: 'yes'

@webd-uk
Copy link

webd-uk commented Apr 19, 2023

In that case surely the plugin should use the WordPress Transients API rather than the options API?

From the Transients link:

storing cached data in the database temporarily

This is not meant to be a temporary cache; it's intended to reduce the calls to Nominatum or Google Geocoder API.

A cache, by definition, is temporary. You can set the expiration time in seconds to whatever you like but the longer you make this the more you will be reducing the calls to the API. I would probably set this to a week or a month.

But to be honest I don’t think the wp_options table is the right place for this data be it transient or otherwise. It’s not right to be meta data either seeing as your plug-in doesn’t create anything that can have meta data added so that leaves a custom database table. Especially if you are looking to keep the data permanently (until it’s flushed in settings).

Oliver

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 19, 2023

A cache, by definition, is temporary. You can set the expiration time in seconds to whatever you like but the longer you make this the more you will be reducing the calls to the API. I would probably set this to a week or a month.

A month sounds reasonable to me. Are transients autoloaded? Should I be autoloading any of these wp options? I've never worked with wp options performance issues before.

@designzwang
Copy link

Hi,
regarding transients vs. options table:

The transient API serves as a transparent interface to whatever caching technology is active. So using transients will benefit from any type of cache ( memcache, redis, you name it) if there is one. If no other cache is enabled, it will fall back to storage in the options table. And then, @bozdoz, looking at the set_transient()-function it seems they are not set to be autoloaded if they have an expiry - see https://developer.wordpress.org/reference/functions/set_transient/

Regarding the temporary nature of transients - it is a free choice how long the expiry is. So "transient" can in fact be a very long time and serve the purpose of saving nominatim requests until cleanup
Regards
Ulrich

@webd-uk
Copy link

webd-uk commented Apr 19, 2023

@designzwang Yes, I concur. All that I would add to that, and in response to @bozdoz question about wp_options performance ... Wordpress checks ("Dashboard - Tools - Site Health") that you have less than 500 rows in the wp_options table set to autoload (in addition to checking that their total size is less than 100Kb). So moving to expiring transients should also help to reduce the total rows of autoloaded wp_options.

I'm still not convinced that this data should be stored here, however. We had 280 on our site which is a lot of rows when you consider how many there should be in total on a normal Wordpress install.

Oliver

@designzwang
Copy link

@webd-uk - even if it feels bad to throw in so many records into the options table, it is done by many plugins, probably for the same reason it is used here. It is easy, it is an accepted design pattern and will work in any environment. The overhead of creating your own caching layer or handle custom tables is quite large, and actually I don't like plugins that create unnecessary custom tables ... Now for transients and the leaflet plugin, I will try how things work out with transients and redis, the latter being a long time favorite on my to-do list for performance improvements. I guess it is not a high priority , so if @bozdoz can wait a couple of days I could probably try to work on a PR.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 20, 2023

Just specifically about autoload: Should any of this plugin's options be set to autoload?

@webd-uk
Copy link

webd-uk commented Apr 20, 2023

Just specifically about autoload: Should any of this plugin's options be set to autoload?

I believe the rule of thumb is that any option that is used on all or most requests should be auto loaded. Otherwise not.

And in an ideal world you would have a single wp_options row that was a serialised array of the plugin’s options.

Oliver

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 20, 2023

Let's make that ideal world happen

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 21, 2023

I've added a PR to move all the geocoded addresses to a single transient with a month expiry date; what do you think?

#217

@webd-uk
Copy link

webd-uk commented Apr 21, 2023

Although this sounds good it may have undesired results as in this situation all addresses would expire at exactly the same time? If you’re OK with that then no problem!

@designzwang
Copy link

Short update on this: I made a deep dive into the transients documentation and found some good news. There actually IS a garbage collection that takes care of expired transients. ( The cron event is delete_expired_transients )
So no need to put additional work into that, and everything that is needed is to replace the option storage with transients.
Also, I found a way to delete the cached entries on-demand without having to keep track of them in a (potentially) huge option array. PR is prepared with minimal changes, however I will run the changed code for a couple of days on a test system.
Regards urich

@webd-uk
Copy link

webd-uk commented Apr 22, 2023

We use the following hooks to dynamically control various aspects of Leaflet JS plugin ...

                add_filter('option_leaflet_js_url', 'our_plugin_class::option_leaflet', 10, 2);
                add_filter('default_option_leaflet_js_url', 'our_plugin_class::option_leaflet', 10, 2);
                add_filter('option_leaflet_css_url', 'our_plugin_class::option_leaflet', 10, 2);
                add_filter('default_option_leaflet_css_url', 'our_plugin_class::option_leaflet', 10, 2);
                add_filter('option_leaflet_map_tile_url', 'our_plugin_class::option_leaflet', 10, 2);
                add_filter('default_option_leaflet_map_tile_url', 'our_plugin_class::option_leaflet', 10, 2);

Please confirm ASAP if the proposed changes will affect any of these options "leaflet_js_url", "leaflet_css_url", "leaflet_map_tile_url" and what the format of new options will be accordingly.

Many thanks,

Oliver

@designzwang
Copy link

designzwang commented Apr 22, 2023

@bozdoz Dammit, I did not see that you already created a PR, so I started
working on my own. There are not many changes however.
#218

Compared to your solution, I think it is more scalable to use
one transient per record, and not collect them all in a single and potentially
huge transient.

If a persistant cache is used, the cache deletion function to delete
everything will not work - it only takes care of transients stored in the
options table, and that only happens if no 3rd party cache is used.
However all transients set with this code will have an expiry,
so that should not be a problem.

I saw some other changes in your branch, so I based my PR on master.
If you consider my PR for inclusion, I could re-do it whichever way you prefer.

@webd-uk i am not aware that my code would have any impact on your url filters.

@webd-uk
Copy link

webd-uk commented Apr 22, 2023

And in an ideal world you would have a single wp_options row that was a serialised array of the plugin’s options.

Let's make that ideal world happen

Please confirm ASAP if the proposed changes will affect any of these options "leaflet_js_url", "leaflet_css_url", "leaflet_map_tile_url" and what the format of new options will be accordingly.

Thanks @designzwang, this was more for @bozdoz and less about the transients as I believe the existing options may be due to be merged into a single option array.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 22, 2023

We use the following hooks to dynamically control various aspects of Leaflet JS plugin ...

                add_filter('option_leaflet_js_url', 'our_plugin_class::option_leaflet', 10, 2);
                add_filter('default_option_leaflet_js_url', 'our_plugin_class::option_leaflet', 10, 2);

Please confirm ASAP if the proposed changes will affect any of these options "leaflet_js_url", "leaflet_css_url", "leaflet_map_tile_url" and what the format of new options will be accordingly.

I legitimately don't understand any of this. To my knowledge there are no filters created using those names.

@webd-uk
Copy link

webd-uk commented Apr 22, 2023

Oh, sorry. OK, so basically there is a global option_{option name} filter in Wordpress that allows us to change the values of get_option() based on the option name. This in turn allows us to change the settings of your plug-in dynamically in our application (which we require).

So … to cut to the chase, if you are intending to merge all the options that your plug-in saves in the options table into one option that has a serialised array of options then I would like to know before you push that out because it would adversely affect our application.

Does that make sense?

Oliver

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 24, 2023

so basically there is a global option_{option name} filter in Wordpress that allows us to change the values of get_option() based on the option name

Didn't know that. Good to know. Does it work for transients?

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 24, 2023

Looks like there's a transient one but not a default transient one. I'm assuming this would be an issue.

@webd-uk
Copy link

webd-uk commented Apr 24, 2023

That would be transient_{option name} :)

@webd-uk
Copy link

webd-uk commented Apr 24, 2023

Looks like there's a transient one but not a default transient one. I'm assuming this would be an issue.

Not for us. We cache latlngs a completely different way. They're saved into post meta as per this recommendation ... so wouldn't need to filter your transient data.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 26, 2023

@webd-uk @designzwang

I'm probably going to opt for something like in this PR: #217 (comment)

  1. Added filters for getting, setting, removing, and updating all geocoder addresses.
  2. Sets geocoder options to autoload=false.

This is the best backwards-compatible option, and allows advanced users to customize their geocoder caching.

Thoughts?

@webd-uk
Copy link

webd-uk commented Apr 26, 2023

Hmm ... as per https://developer.wordpress.org/reference/functions/update_option/#parameters

$autoload string|bool Optional
Whether to load the option when WordPress starts up. For existing options, $autoload can only be updated using update_option() if $value is also changed.

... I'd be inclined to perform some kind of (one time?) check on existing options to delete them and then add them again to force them to be set to $autoload 'no' seeing as $value is unlikely to be changed.

As for your filters on getting / setting / removing / updating geocoded addresses ... I'm not sure I'd have any need for these filters but so long as they're being saved as transients and have an expiry date, I'd imagine that it would be fine for our purposes because it would mean un-used geo-data would eventually be removed automatically from the wp_options table.

Similarly to the above described check on the options ... would you be having a (one time?) check on existing cached geo data to delete from wp_options and re-add as transients?

Oliver

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 26, 2023

would you be having a (one time?) check on existing cached geo data to delete from wp_options and re-add as transients?

You might be missing what that PR is intending to do: there are no transients. The only db change is to set autoload to false going forward. If someone wants to swap their existing options with non-autoloaded options, then they can clear the geocode cache.

@designzwang
Copy link

@bozdoz well, I see your point regarding backward compatibility. However that is quite a lot of logic being transferred to functions.php, I wonder whether many users , especially small site operators, would be able to follow ... My proposals were not done with functions.php in mind, so I will check/review your PR and try it over the weekend on a dev site, and report the results.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 27, 2023

I wonder whether many users , especially small site operators, would be able to follow

I see all of this as a power user request/customization

@designzwang
Copy link

@bozdoz I think your solution is good for backward compatibility and has minimal impact for existing users if no custom filters/actions are used by the site operators ( other than not using autoload any more, which is good :) ). At the same time, it allows me, or others, to implement my desired caching and transient handling so I can actually use leaflet-map for our calendar ( which would not have been possible with the previous implementation). Thank you for considering my proposal! I added some comments in the PR regarding key length and return values, though.

I verified (briefly) that the code from the try-out-transients branch works as expected, at least for one or two hours now, and will keep an eye on it. I got no entries in my options table, and i got the transients I expected, stored in memcached. Great!

One last thing: for me, the usage of the options table was surprising, combined with the large "list-of-all-records" entry. I would add a hint to the docs to point out this behaviour for people who plan to store more than a handful locations can make an informed deceision which storage they want to use.

I will create a public gist with the filter examples, which might be helpful if someone else is interested in using transient storage.

Thanks again and best regards
Ulrich

@webd-uk
Copy link

webd-uk commented Apr 28, 2023

@bozdoz OK, I see what you're doing now. You're passing the onus of deciding to use transients rather than options to cache the location data onto the plugin user. Can I ask why you have decided to do that rather than to move over to use transients entirely? I find it unlikely that a user would know to do that so the plugin will continue to fill up the wp_options table. Thanks.

@designzwang
Copy link

@webd-uk Let me speculate ... after looking into the code, and reading about the project's history, I guess the main goal of this plugin is to offer easy map embedding with a lot of useful options for sites using just a handful maps, like on a contact page with a handful of branch offices or so. It was clearly not designed for large-scale operation. So what you call "fill up the option table" might be true for you and me, but not for the average user.

What @bozdoz did in the latest PR makes perfect sense, given the limited amount of resources available: few changes to the
core functionality while offering custom handling of cache entries to those who want it. Even if it would have been better to use transients from day one, you can not turn back time, and refactoring is more time-consuming than simply adding the filters. IMHO it is still ok to use options to store values for a small number of maps if this is what the majority of users are doing. Look at the issue queue, there is more stuff to take care of....

On the other hand : from looking at the description of the plugin I had different expectations regarding the suitability for our site with many (potentially 1000s) of location maps (I have done quite a lot of plugin research and evaluation ). The professional logo design problably raised expectations, too ;) So probably there are more users who might get bitten by the option storage. Now they can tailor their own storage concept.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 29, 2023

I could see adding an option to save to transients instead of plain options. But if transients are just saving to the options table (not all transients would) then the only benefit I see is setting auto load to false, which I have done for options. I'd have to give some more thought towards fixing all the other settings. Ideally I think maybe the settings should take a single serialized db entry, maybe also set to auto load false

@hupe13
Copy link
Collaborator

hupe13 commented Apr 29, 2023

This is all very interesting for me. Maybe I should rethink "autoload" as well. Bozdoz, if you change options please let me know. Some I use in my plugin.

@designzwang
Copy link

@bozdoz I thought your deceision to add the filters was because of backwards compatiblity, and to avoid refactoring. Otherwise I would advise against making the choice about transient vs options table an option. This is something the developer should decide, not the users (who might lack the knowledge about the implications of their decision). The Transient API was created explicitly for this type of data. And it will automatically scale for larger sites, because these are likely to implement an object cache like Redis or memcached - which means that for them the transients are stored in the cache automatically. Making it configurable could lead to users chosing database storage even if they have an object cache - simply because they don't know the benefits of transients.

Autoloading the config settings or not depends on assumptions about how large the percentage of pages with maps is - something you can not guess reliably. I think this is a minor optimization - the number of options is small so I think they could as well be autoloaded without problems. Looking at the options tables of my sites I get the impression that very few deveopers even care about this and simply use autoloading.

@bozdoz
Copy link
Owner Author

bozdoz commented Apr 30, 2023

Well yeah, I imagine many developers don't even think about the autoload option because it defaults to true.

Switching from option to transient means I ought to do some migration script when updating to the newest version. Something I haven't implemented so far and something I may not have the stomach or experience for. Which brought me back to the question: why am I doing this? It's a 9-year old plug-in and the vast majority of users aren't concerned about this topic at all.

Im still open to adding a migration script. If anyone has experience doing that in Wordpress that would be great to see.

Otherwise, I'm leaning on my pr to add filters and actions to simply make this extendable.

@designzwang
Copy link

Oh, I didn't mean to push you , sorry if it sounded like that. For me, the filter PR would be good enough. Like I said it works on a dev/test site without problems. I can't help out with a migration script - never have done one.

@webd-uk
Copy link

webd-uk commented Apr 30, 2023

As a plugin developer, I am also guilty of using more than one row in the wp_options table for multiple options rather than an array of options. I'm also guilty of not knowing ("lazy" is too strong as I genuinely wasn't aware) that options should be set to not autoload if they're not being used on most page loads. However, with the recent addition in the "Dashboard - Tools - Health Check" of the notice to use a persistent object cache if the site has more than 500 options (or more than 100Kb of options) we have definitely started paying much more close attention to what the cause is on our client sites. Which is what lead me to start the thread on your plugin's support forum.

And I'm sure this is going to be much more common going forward. I've actually written an article about the notice in "Site Health".

For our plugins I'm definitely going to take a closer look at how we're using options and tidying them up accordingly.

@bozdoz
Copy link
Owner Author

bozdoz commented May 18, 2023

Updated the PR #217 by going back to transients, and added a migration system to move existing options over to transients.

It works by checking for the plugin version in the db; if it doesn't find it, assumes v3.3.0 and runs a migration script to work with v3.4.0.

@webd-uk
Copy link

webd-uk commented May 18, 2023

That sounds great! The only thing I would say is that rather than checking the plugin version, why not just check for the existence of an option which would in turn require the migration to go ahead?

@bozdoz
Copy link
Owner Author

bozdoz commented May 18, 2023

That sounds great! The only thing I would say is that rather than checking the plugin version, why not just check for the existence of an option which would in turn require the migration to go ahead?

Well it is doing that. It's looking for an option in the db for the last installed plug-in version. If it's not there we can assume at most 3.3.0

@webd-uk
Copy link

webd-uk commented May 18, 2023

I'm not sure that's the same thing. What if the new version was replaced with an older version then the new version installed again? There would be options that would not be detected.

@bozdoz
Copy link
Owner Author

bozdoz commented May 18, 2023

I'm not sure that's the same thing. What if the new version was replaced with an older version then the new version installed again? There would be options that would not be detected.

Any ideas on what should be done in such a situation?

@webd-uk
Copy link

webd-uk commented May 18, 2023

Yes :) That's why I thought it might be better to check if an option existed and run the migration in that instance rather than rely on the version number. And by "check for an option" I mean check to see if the options table contains something that should be a transient?

@bozdoz
Copy link
Owner Author

bozdoz commented May 18, 2023

Yes :) That's why I thought it might be better to check if an option existed and run the migration in that instance rather than rely on the version number. And by "check for an option" I mean check to see if the options table contains something that should be a transient?

This seems like it would make for a much more complicated migration system. I might have to give it more thought.

@webd-uk
Copy link

webd-uk commented May 18, 2023

Something like this :)

if (is_admin()) {
$alloptions = wp_load_alloptions();
$found = false;
foreach ($alloptions as $option_name => $option_value) {
if (str_starts_with($option_name, 'leaflet_' . $settings->get('geocoder'))) {
$found = true;
break;
}
}
if ($found) {
$this->run_migrations();
}
}

@bozdoz
Copy link
Owner Author

bozdoz commented May 18, 2023

Definitely not. I was considering changing the key for the "all locations" db entry, and then checking for the pre-3.4.0 key name; but, again, this would make migrations much more complicated, when it's supposed to be just a one-time operation

@webd-uk
Copy link

webd-uk commented May 18, 2023

I don’t understand how this makes the migration any different, let alone more complicated. Or perhaps I’m not understanding what you’re migrating. Either way, it’s your shout.

@bozdoz
Copy link
Owner Author

bozdoz commented May 19, 2023

Just complication of constantly checking if a db value exists as opposed to checking an auto-loaded value that we expect to be there. It's less intuitive in my mind. I like the idea if there was a good clear way to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants