Excessive growth of .mbtiles file size with updates #1655

Open
antonioabelgc opened this issue Apr 15, 2024 · 1 comment

@antonioabelgc

I am currently using OpenMapTiles to generate an .mbtiles file for my project. My process involves generating an .mbtiles file from an initial .osm.pbf file and then applying periodic updates using a change file. However, I have run into a problem with this process.

Each time I apply an update from a change file and run "make generate-changed-tiles", the resulting .mbtiles file grows considerably, even when the update is small. This becomes a problem in the long run, especially if I plan to perform weekly updates, as the file can become disproportionately large over time.

Could it be that obsolete tiles are not being removed properly, or that empty tiles are taking up significant space in the .mbtiles file, so that the file grows excessively with each update? Is there a more efficient way to carry out periodic updates?

Thanks

@antonioabelgc

I've been researching and testing with the .osm.pbf file of a very small region. The .mbtiles file generated from scratch from this .osm.pbf file has a size of 784K. However, if I take this .mbtiles file as a base and generate the changes for several days using an .osc file and the generate-changed-tiles command, it grows to 856K (tiles.mbtiles2).

$ ls -lah tiles.mbtiles tiles.mbtiles2 
-rw-r--r-- 1 user group 784K abr 16 11:47 tiles.mbtiles
-rw-r--r-- 1 user group 856K abr 16 11:57 tiles.mbtiles2

When I extract the contents of both .mbtiles files with mb-util to compare them, the extracted directories have the same total size, and the only differences are in a few .png files. These do not explain the growth: the .png files differ in content but are listed with the same size.

$ mb-util tiles.mbtiles tiles1/
$ mb-util tiles.mbtiles2 tiles2/
$ du tiles1/ | tail -n1 && du tiles2/ | tail -n1
4824	tiles1/
4824	tiles2/
$ diff -r -q tiles1/ tiles2/
Files tiles1/1/0/0.png and tiles2/1/0/0.png differ
Files tiles1/10/503/404.png and tiles2/10/503/404.png differ
Files tiles1/2/1/1.png and tiles2/2/1/1.png differ
Files tiles1/3/3/3.png and tiles2/3/3/3.png differ
Files tiles1/6/31/25.png and tiles2/6/31/25.png differ
Files tiles1/7/62/50.png and tiles2/7/62/50.png differ
Files tiles1/9/251/202.png and tiles2/9/251/202.png differ
$ ls -lah tiles1/1/0/0.png tiles2/1/0/0.png tiles1/10/503/404.png tiles2/10/503/404.png tiles1/2/1/1.png tiles2/2/1/1.png tiles1/3/3/3.png tiles2/3/3/3.png tiles1/6/31/25.png tiles2/6/31/25.png tiles1/7/62/50.png tiles2/7/62/50.png tiles1/9/251/202.png tiles2/9/251/202.png
-rw-r--r-- 1 user group  20K abr 16 11:57 tiles1/1/0/0.png
-rw-r--r-- 1 user group  20K abr 16 11:58 tiles2/1/0/0.png

-rw-r--r-- 1 user group 6,0K abr 16 11:57 tiles1/10/503/404.png
-rw-r--r-- 1 user group 6,0K abr 16 11:58 tiles2/10/503/404.png

-rw-r--r-- 1 user group  23K abr 16 11:57 tiles1/2/1/1.png
-rw-r--r-- 1 user group  23K abr 16 11:58 tiles2/2/1/1.png

-rw-r--r-- 1 user group 5,9K abr 16 11:57 tiles1/3/3/3.png
-rw-r--r-- 1 user group 5,9K abr 16 11:58 tiles2/3/3/3.png

-rw-r--r-- 1 user group 2,2K abr 16 11:57 tiles1/6/31/25.png
-rw-r--r-- 1 user group 2,2K abr 16 11:58 tiles2/6/31/25.png

-rw-r--r-- 1 user group 2,6K abr 16 11:57 tiles1/7/62/50.png
-rw-r--r-- 1 user group 2,6K abr 16 11:58 tiles2/7/62/50.png

-rw-r--r-- 1 user group 3,9K abr 16 11:57 tiles1/9/251/202.png
-rw-r--r-- 1 user group 3,9K abr 16 11:58 tiles2/9/251/202.png
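
The same comparison can be made without extracting anything, since an MBTiles file is an SQLite database. The following is a minimal sketch, assuming both files expose the standard `tiles` table or view with the columns `zoom_level`, `tile_column`, `tile_row` and `tile_data`; the function names `tile_digests` and `changed_tiles` are made up for this illustration.

```python
import hashlib
import sqlite3


def tile_digests(path):
    """Map (zoom, column, row) to the SHA-256 digest of each tile blob."""
    con = sqlite3.connect(path)
    try:
        rows = con.execute(
            "SELECT zoom_level, tile_column, tile_row, tile_data FROM tiles"
        )
        # Treat a NULL blob as empty so hashing never fails.
        return {
            (z, x, y): hashlib.sha256(data or b"").hexdigest()
            for z, x, y, data in rows
        }
    finally:
        con.close()


def changed_tiles(path_a, path_b):
    """Return sorted tile coordinates that are missing from one file or differ."""
    a = tile_digests(path_a)
    b = tile_digests(path_b)
    return sorted(key for key in a.keys() | b.keys() if a.get(key) != b.get(key))
```

Calling `changed_tiles("tiles.mbtiles", "tiles.mbtiles2")` would list the coordinates whose content differs. Row numbers in the database may not match the directory names produced by mb-util, depending on the tiling scheme used when extracting.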

Later on, I found this tool: https://maplibre.org/martin/mbtiles-meta.html and generated a summary for each .mbtiles file. This is where I observed what appears to be the main difference behind the size increase: the number of pages.

$ ./mbtiles summary tiles.mbtiles
MBTiles file summary for tiles.mbtiles
Schema: normalized
File size: 784.00KiB
Page size: 4.00KiB
Page count: 196

 Zoom |   Count   | Smallest  |  Largest  |  Average  | Bounding Box
    0 |         1 |   10.8KiB |   10.8KiB |   10.8KiB | -180,-85,180,85
    1 |         2 |   19.9KiB |   61.3KiB |   40.6KiB | -180,0,180,85
    2 |         2 |   22.5KiB |   63.5KiB |   43.0KiB | -90,0,90,67
    3 |         4 |    5.8KiB |   35.0KiB |   17.4KiB | -45,-0,45,67
    4 |         4 |    5.8KiB |   26.6KiB |   11.7KiB | -23,22,22,56
    5 |         4 |    2.8KiB |    7.0KiB |    4.1KiB | -11,32,11,49
    6 |         6 |      971B |    2.6KiB |    1.6KiB | -6,32,6,45
    7 |        20 |       20B |    2.5KiB |      899B | -6,34,6,45
    8 |        63 |       20B |    2.3KiB |      519B | -6,34,4,44
    9 |       208 |       20B |    3.8KiB |      295B | -5,35,4,44
   10 |       744 |       20B |    5.9KiB |      166B | -5,35,4,44
  all |      1058 |       20B |   63.5KiB |      534B | -180,-85,180,85
$ ./mbtiles summary tiles.mbtiles2
MBTiles file summary for tiles.mbtiles2
Schema: normalized
File size: 856.00KiB
Page size: 4.00KiB
Page count: 214

 Zoom |   Count   | Smallest  |  Largest  |  Average  | Bounding Box
    0 |         1 |   10.8KiB |   10.8KiB |   10.8KiB | -180,-85,180,85
    1 |         2 |   19.9KiB |   61.3KiB |   40.6KiB | -180,0,180,85
    2 |         2 |   22.5KiB |   63.5KiB |   43.0KiB | -90,0,90,67
    3 |         4 |    5.8KiB |   35.0KiB |   17.4KiB | -45,-0,45,67
    4 |         4 |    5.8KiB |   26.6KiB |   11.7KiB | -23,22,22,56
    5 |         4 |    2.8KiB |    7.0KiB |    4.1KiB | -11,32,11,49
    6 |         6 |      971B |    2.6KiB |    1.6KiB | -6,32,6,45
    7 |        20 |       20B |    2.5KiB |      899B | -6,34,6,45
    8 |        63 |       20B |    2.3KiB |      519B | -6,34,4,44
    9 |       208 |       20B |    3.8KiB |      295B | -5,35,4,44
   10 |       744 |       20B |    5.9KiB |      166B | -5,35,4,44
  all |      1058 |       20B |   63.5KiB |      534B | -180,-85,180,85
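
The page figures in these summaries can also be read straight from the file, because an MBTiles file is an SQLite database and SQLite reports them through PRAGMA statements. The following is a minimal sketch; the helper name `page_stats` is made up for this illustration, and it additionally reads `freelist_count` (pages that are allocated in the file but currently unused), which the summaries above do not show.

```python
import sqlite3


def page_stats(path):
    """Return SQLite page statistics for an MBTiles (SQLite) file."""
    con = sqlite3.connect(path)
    try:
        page_size = con.execute("PRAGMA page_size").fetchone()[0]
        page_count = con.execute("PRAGMA page_count").fetchone()[0]
        freelist = con.execute("PRAGMA freelist_count").fetchone()[0]
    finally:
        con.close()
    return {
        "page_size": page_size,
        "page_count": page_count,
        "freelist_count": freelist,
        # Total size implied by the page figures.
        "file_bytes": page_size * page_count,
        # Space held by pages that are allocated but unused.
        "free_bytes": page_size * freelist,
    }
```

For the first file, 196 pages of 4 KiB give 784 KiB, which matches the reported file size. A non-zero `freelist_count` for tiles.mbtiles2 would suggest that part of the growth is unused space inside the file; whether that is the case here has not been checked in this thread.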

Although the contents of both .mbtiles files are apparently the same, the first has 196 pages and the second has 214, each of 4.00KiB. The 18 extra pages account for exactly the 72KiB difference in file size (856KiB - 784KiB). As far as I understand, the "page count" is the total number of pages in the MBTiles database file, where a page is the fixed-size storage unit in which the database holds the tile data. While this might not be significant for a small region, for an entire country each update increased the file size by around 1GB in my case. The question now is: what generates these pages, and can they be avoided? I will continue investigating...
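
One possible explanation, not confirmed anywhere in this thread, is ordinary SQLite behaviour: when rows are deleted or replaced, the freed pages normally stay in the file for later reuse instead of being returned to the filesystem, so the file can grow even though the stored tiles are the same. SQLite's `VACUUM` statement rebuilds the database into the minimum number of pages. The following is a minimal sketch to test that hypothesis on a copy of the file, assuming nothing else is writing to it; the function name `vacuum_copy` is made up for this illustration.

```python
import os
import shutil
import sqlite3


def vacuum_copy(src, dst):
    """Copy src to dst, VACUUM the copy, and return (size_before, size_after) in bytes."""
    shutil.copyfile(src, dst)
    # isolation_level=None selects autocommit mode; VACUUM cannot run
    # inside an open transaction.
    con = sqlite3.connect(dst, isolation_level=None)
    try:
        con.execute("VACUUM")
    finally:
        con.close()
    return os.path.getsize(src), os.path.getsize(dst)
```

If `vacuum_copy("tiles.mbtiles2", "tiles-vacuumed.mbtiles")` returns a second value close to the size of the original tiles.mbtiles, the extra pages were reclaimable space rather than additional tile data. The original file is left untouched, so the result can be checked before replacing anything.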
