Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensuring cache presence before starting borg create. Fixes #7450 #7475

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 25 additions & 0 deletions src/borg/cache.py
Original file line number Diff line number Diff line change
Expand Up @@ -514,6 +514,15 @@ def __init__(
self.txn_active = False

self.path = cache_dir(self.repository, path)
if os.path.exists(self.path) and not self.check_cache_integrity():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the first term redundant as the check call would also return False in that case?

logger.warning("Rebuilding cache as some cache-files have gone missing")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess this should be more precise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also: no "-" in english.

if not os.path.exists(os.path.join(self.path, "chunks")):
self.recreate_cache_config(delete_existing=True) # recreating to remove integrity data stored in config
self.chunks = ChunkIndex()
self.chunks.write(os.path.join(self.path, "chunks"))
Comment on lines +521 to +522
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will create an empty (and thus incorrect) chunks index on disk, how is that helpful?

also creation of that file is done in a rather unsafe way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I was assuming that creating an empty chunks cache should be enough for borg create to continue and re-populate it as it normally would.

  1. Should we be entirely rebuilding it here?
  2. How else would be creating the file. Is there another piece of code that I can take a look at?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

borg create needs the chunks cache in a valid state (refcounts, sizes for all chunks in the repo) and updates refcounts on that basis.

elif not os.path.exists(os.path.join(self.path, "config")):
self.recreate_cache_config()

self.security_manager = SecurityManager(self.repository)
self.cache_config = CacheConfig(self.repository, self.path, lock_wait)

Expand Down Expand Up @@ -900,6 +909,13 @@ def create_master_idx(chunk_idx):
self.do_cache = os.path.isdir(archive_path)
self.chunks = create_master_idx(self.chunks)

def check_cache_integrity(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only tests existence, not integrity.

how about:

return all(os.path.exists(os.path.join(self.path, file)) for file in "chunks", "config")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also: only called from one place.

maybe rather do the joins only once and have the rest in-line?

cache_files = ["chunks", "config"]
for file in cache_files:
if not os.path.exists(os.path.join(self.path, file)):
return False
return True

def check_cache_compatibility(self):
my_features = Manifest.SUPPORTED_REPO_FEATURES
if self.cache_config.ignored_features & my_features:
Expand All @@ -921,6 +937,15 @@ def wipe_cache(self):
self.chunks = ChunkIndex()
with SaveFile(os.path.join(self.path, files_cache_name()), binary=True):
pass # empty file
self.recreate_cache_config()

def recreate_cache_config(self, delete_existing=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe call this create... - it only mutates to the more special recreate_... in you give delete_existing=True.

if delete_existing and os.path.exists(os.path.join(self.path, "config")):
os.unlink(os.path.join(self.path, "config"))
if not os.path.exists(os.path.join(self.path, "config")):
self.cache_config = CacheConfig(self.repository, self.path)
self.cache_config.create()
self.cache_config.load() # loading config here to initialize property `_config`
self.cache_config.manifest_id = ""
self.cache_config._config.set("cache", "manifest", "")

Expand Down
25 changes: 25 additions & 0 deletions src/borg/testsuite/archiver/create_cmd.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@

from ... import platform
from ...constants import * # NOQA
from ...helpers import get_cache_dir
from ...manifest import Manifest
from ...platform import is_cygwin, is_win32, is_darwin
from ...repository import Repository
Expand Down Expand Up @@ -659,6 +660,30 @@ def test_file_status_cs_cache_mode(self):
)
self.assert_in("M input/file1", output)

@pytest.mark.allow_cache_wipe
def test_cache_rebuild_on_missing_cache_files(self):
"""test if borg rebuilds cache if any of the cache files have gone missing"""
self.cmd(f"--repo={self.repository_location}", "rcreate", RK_ENCRYPTION)
self.create_regular_file("file1", size=10)
self.cmd(f"--repo={self.repository_location}", "create", "archive1", "input")
info = json.loads(self.cmd(f"--repo={self.repository_location}", "rinfo", "--json"))
repository = info["repository"]
cache_dir = os.path.join(get_cache_dir(), repository["id"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repo_cache_dir

config_cache_path = os.path.join(cache_dir, "config")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the cache config, not the config cache.

chunk_cache_path = os.path.join(cache_dir, "chunks")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunks_cache_path

os.unlink(config_cache_path)
self.create_regular_file("file2", size=10)
self.cmd(f"--repo={self.repository_location}", "create", "archive2", "input")
assert os.path.exists(config_cache_path)
os.unlink(chunk_cache_path)
self.create_regular_file("file3", size=10)
self.cmd(f"--repo={self.repository_location}", "create", "archive3", "input")
assert os.path.exists(chunk_cache_path)
self.create_regular_file("file4", size=10)
os.unlink(config_cache_path)
self.cmd(f"--repo={self.repository_location}", "create", "archive4", "input")
assert os.path.exists(config_cache_path)

def test_file_status_ms_cache_mode(self):
"""test that a chmod'ed file with no content changes does not get chunked again in mtime,size cache_mode"""
self.create_regular_file("file1", size=10)
Expand Down