Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.4.3] Failed to get user/group name for some UID/GIDs although they all exist #1666

Open
jean-christophe-manciot opened this issue Mar 13, 2024 · 12 comments
Labels
Bug Discussion decision or consensus needed High

Comments

@jean-christophe-manciot

In ~/.local/share/backintime/takesnapshot.log:

DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 181: 'getgrgid(): gid not found: 181'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 171: 'getgrgid(): gid not found: 171'
DEBUG: [common/snapshots.py:297 Snapshots.userName] Failed to get user name for UID 169: 'getpwuid(): uid not found: 169'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 180: 'getgrgid(): gid not found: 180'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 154: 'getgrgid(): gid not found: 154'

despite:

# for gid in 181 171 180 154; do grep x:${gid} /etc/group; done
postdrop:x:181:
ntpsec:x:171:
postfix:x:180:
Debian-snmp:x:154:

# grep x:169 /etc/passwd
postfix:x:169:180::/var/spool/postfix:/usr/sbin/nologin

That's strange.

bit-diags.txt

@buhtz
Copy link
Member

buhtz commented Mar 13, 2024

I only see DEBUG messages in your output. Do you have any errors or the snapshots taken are not complete?

@buhtz
Copy link
Member

buhtz commented Mar 13, 2024

Seems happen in context of backupPermissions(). What a smelly piece of code:

def userName(self, uid):
"""
Get the username for the given uid.
uid->name will be cached to speed up subsequent requests.
Args:
uid (int): User identifier (UID) to search for
Returns:
str: name of the user with UID uid or '-' if not found
"""
if uid in self.userCache:
return self.userCache[uid]
else:
name = '-'
try:
name = pwd.getpwuid(uid).pw_name
except Exception as e:
logger.debug('Failed to get user name for UID %s: %s'
%(uid, str(e)),
self)
self.userCache[uid] = name
return name

Based on your diagnostics info you run this as "root" with "Python 3.11.8". Can you please modify the code to re-raise the Exception. Based on the pwd-docs I assume it is a KeyError.

EDIT:
Can you please insert some debug code like this:

import grp
print(f'{grp.getgrall()=}')

Maybe you run BIT (or its tests?) somehow isolated from the system?

EDIT2:
And another idea. Can you use your two shell commands (for gid in ... and grep x:169 /etc/passwd) in a subprocess.run() call inside this BIT method?

@jean-christophe-manciot
Copy link
Author

jean-christophe-manciot commented Mar 14, 2024

@buhtz

I only see DEBUG messages in your output. Do you have any errors or the snapshots taken are not complete?

No, but can these false reports impact the quality of a successful snapshot? In theory, it shouldn't, but I do not know the code.
Also, I still do not understand why bit needs to retrieve the user/group names from UIDs/GIDs. Everything which is not needed should not be done.

Regarding your second message, I have no python skills.

@buhtz
Copy link
Member

buhtz commented Mar 14, 2024

In which context ocured that debug messages? Did you created a package for your PPA? You run tests? Or is it really a real take snapshot run?

Can you reproduce the problem on your machine?

You you send the full debug output please.

If it impacts the quality? I am also not sure. But I assume that restoring that snapshot will cause problems.

@jean-christophe-manciot
Copy link
Author

jean-christophe-manciot commented Mar 14, 2024

In which context ocured that debug messages? Did you created a package for your PPA? You run tests? Or is it really a real take snapshot run?

It's just a regular snapshot taken with the command:

/usr/bin/nice -n 19 /usr/bin/ionice -c2 -n7 /usr/bin/backintime --debug --profile "$profile_name" backup 2>&1

The Snapshots.backup rsync command returns 24.

Can you reproduce the problem on your machine?

It happens only when backing up one particular server, not the others.

The binary comes from backintime-qt_1.4.3+23.10_all.deb; I'll try later with backintime-qt_1.4.3-1+23.10_all.deb.

@jean-christophe-manciot
Copy link
Author

You you send the full debug output please.

========== Take snapshot (profile <profile_name>): Wed 13 Mar 2024 07:12:35 PM CET ==========
DEBUG: [common/backintime.py:589 argParse] Arguments: {'debug': True, 'profile': '<profile_name>', 'command': 'backup', 'func': <function backup at 0x767661bbb6a0>} | unknownArgs: []
DEBUG: [common/configfile.py:591 Config.setCurrentProfile] Change current profile: 1=Main profile
DEBUG: [common/tools.py:186 initiate_translation] No language code. Use systems current locale.
DEBUG: [common/backintime.py:677 getConfig] config file: /root/.config/backintime/config
DEBUG: [common/backintime.py:678 getConfig] share path: /root/.local/share/backintime
DEBUG: [common/backintime.py:679 getConfig] profiles: <list of profiles>
DEBUG: [common/configfile.py:591 Config.setCurrentProfile] Change current profile: 12=<profile_name>
DEBUG: [common/pluginmanager.py:245 PluginManager.load] Register plugin path /usr/share/backintime/plugins
DEBUG: [common/pluginmanager.py:249 PluginManager.load] Probing plugin usercallbackplugin.py
DEBUG: [common/pluginmanager.py:263 PluginManager.load] Add plugin usercallbackplugin.py
DEBUG: [common/pluginmanager.py:249 PluginManager.load] Probing plugin notifyplugin.py
DEBUG: [common/pluginmanager.py:263 PluginManager.load] Add plugin notifyplugin.py
DEBUG: [common/pluginmanager.py:249 PluginManager.load] Probing plugin systrayiconplugin.py
DEBUG: [common/tools.py:735 is_Qt5_working] Qt5 probing result: exit code 2
DEBUG: [common/tools.py:738 is_Qt5_working] Qt5 probing stdout:

DEBUG: [common/tools.py:739 is_Qt5_working] Qt5 probing errout:
DEBUG: [common/qt5_probing.py:89 <module>] /usr/share/backintime/common/qt5_probing.py started... Call args: ['/usr/share/backintime/common/qt5_probing.py', '--debug']
DEBUG: [common/qt5_probing.py:90 <module>] Display system: ($XDG_SESSION_TYPE is not set)
DEBUG: [common/qt5_probing.py:91 <module>] XDG_RUNTIME_DIR=($XDG_RUNTIME_DIR is not set)
DEBUG: [common/qt5_probing.py:92 <module>] XAUTHORITY=/home/actionmystique/.Xauthority
DEBUG: [common/qt5_probing.py:93 <module>] QT_QPA_PLATFORM=($QT_QPA_PLATFORM is not set)
DEBUG: [common/qt5_probing.py:95 <module>] Current euid: 0
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
DEBUG: [common/qt5_probing.py:126 <module>] isSystemTrayAvailable for Qt5: True
DEBUG: [common/qt5_probing.py:131 <module>] /usr/share/backintime/common/qt5_probing.py is terminating normally (exit code: 2)

DEBUG: [plugins/systrayiconplugin.py:76 init] System tray is available to show the BiT system tray icon
DEBUG: [common/pluginmanager.py:263 PluginManager.load] Add plugin systrayiconplugin.py
DEBUG: [common/applicationinstance.py:169 flockExclusiv] Trying to put an advisory lock on the flock file /root/.local/share/backintime/worker12.lock.flock
DEBUG: [common/applicationinstance.py:201 flockUnlock] Trying to remove the advisory lock from the flock file /root/.local/share/backintime/worker12.lock.flock
INFO: [common/snapshots.py:733 Snapshots.backup] Lock
DEBUG: [common/tools.py:1637 inhibitSuspend] Inhibit Suspend failed because BIT was started as root.
DEBUG: [common/tools.py:1134 keyringSupported] No keyring due to import error.
DEBUG: [common/mount.py:161 Mount.__init__] pw-cache is not running
DEBUG: [common/mount.py:169 Mount.__init__] Call command: /usr/bin/backintime pw-cache start
DEBUG: [plugins/usercallbackplugin.py:81 UserCallbackPlugin.callback] Call user-callback: /root/.config/backintime/user-callback 12 <profile_name> 7
DEBUG: [plugins/systrayiconplugin.py:90 processBegin] Trying to start systray icon sub process...
DEBUG: [plugins/usercallbackplugin.py:81 UserCallbackPlugin.callback] Call user-callback: /root/.config/backintime/user-callback 12 <profile_name> 1
INFO: [common/snapshots.py:764 Snapshots.backup] Take a new snapshot. Profile: 12 <profile_name>
INFO: [common/snapshots.py:1215 Snapshots.takeSnapshot] Call rsync to take the snapshot
DEBUG: [common/snapshots.py:796 Snapshots.backup] Call command "rsync --recursive --times --devices --specials --hard-links --human-readable -s --links --acls --xattrs --perms --executability --group --owner --info=progress2 --no-inc-recursive --delete --delete-excluded -v -i --out-format=BACKINTIME: %i %n%L --link-dest=<path> --chmod=Du+wx --exclude=<path> --include=<path> / <path>/root/12/new_snapshot/backup"
DEBUG: [common/tools.py:2500 run] Starting command 'rsync --recursiv...'
DEBUG: [qt/qtsystrayicon.py:250 <module>] Sub process tries to show systray icon...
DEBUG: [qt/qtsystrayicon.py:251 <module>] qtsystrayicon.py call args: ['/usr/share/backintime/qt/qtsystrayicon.py', '12', '--debug']
DEBUG: [common/configfile.py:591 Config.setCurrentProfile] Change current profile: 1=Main profile
DEBUG: [common/tools.py:186 initiate_translation] No language code. Use systems current locale.
DEBUG: [common/configfile.py:591 Config.setCurrentProfile] Change current profile: 12=<profile_name>
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
DEBUG: [qt/qttools.py:229 createQApplication] QT QPA platform plugin: xcb
DEBUG: [qt/qttools.py:230 createQApplication] QT_QPA_PLATFORMTHEME=<not set>
DEBUG: [qt/qttools.py:235 createQApplication] QT_STYLE_OVERRIDE=<not set>
DEBUG: [qt/qttools.py:238 createQApplication] QT active style: fusion
DEBUG: [qt/qttools.py:239 createQApplication] QT fallback style: hicolor
DEBUG: [qt/qttools.py:240 createQApplication] QT supported styles: ['Breeze', 'Windows', 'Fusion']
DEBUG: [qt/qttools.py:241 createQApplication] themeSearchPaths: ['/root/.local/share/icons', '/usr/local/share/icons', '/usr/share/icons', ':/icons']
DEBUG: [qt/qttools.py:242 createQApplication] fallbackSearchPaths: ['/usr/share/pixmaps']
DEBUG: [qt/qttools.py:247 createQApplication] Is SystemTray available: True
DEBUG: [qt/qttools.py:260 createQApplication] Trying to set App ID for root user
DEBUG: [qt/qttools.py:304 initiate_translator] No language code. Use systems current locale.
DEBUG: [qt/icon.py:20 <module>] Checking if the current theme contains the BiT icon...
DEBUG: [qt/icon.py:36 <module>] Probing theme: ubuntu-mono-dark (activated as ubuntu-mono-dark)
DEBUG: [qt/icon.py:32 <module>] Found an installed theme: ubuntu-mono-dark
WARNING: [common/snapshots.py:796 Snapshots.backup] Command "rsync --recursive --times --devices --specials --hard-links --human-readable -s --links --acls --xattrs --perms --executability --group --owner --info=progress2 --no-inc-recursive --delete --delete-excluded -v -i --out-format=BACKINTIME: %i %n%L --link-dest=<path> --chmod=Du+wx --exclude=<path> --include=<path> / <path>/root/12/new_snapshot/backup" returns 24
INFO: [common/snapshots.py:965 Snapshots.backupConfig] Save config file
INFO: [common/snapshots.py:1041 Snapshots.backupPermissions] Save permissions
DEBUG: [common/snapshots.py:1296 Snapshots.takeSnapshot] Call command "rsync --dry-run -s -r --out-format=%n <path>/root/12/new_snapshot/backup/ /tmp/tmp8joyn8wq/"
DEBUG: [common/tools.py:2500 run] Starting command 'rsync --dry-run ...'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 181: 'getgrgid(): gid not found: 181'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 171: 'getgrgid(): gid not found: 171'
DEBUG: [common/snapshots.py:297 Snapshots.userName] Failed to get user name for UID 169: 'getpwuid(): uid not found: 169'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 180: 'getgrgid(): gid not found: 180'
DEBUG: [common/snapshots.py:322 Snapshots.groupName] Failed to get group name for GID 154: 'getgrgid(): gid not found: 154'
DEBUG: [common/snapshots.py:1296 Snapshots.takeSnapshot] Command "rsync --dry-run ..." returns 0
INFO: [common/snapshots.py:1014 Snapshots.backupInfo] Create info file
DEBUG: [common/tools.py:1585 writeTimeStamp] write timestamp '20240313 1912' into file '/root/.local/share/backintime/anacron/12_<profile_name>'
DEBUG: [common/snapshots.py:2013 Snapshots.createLastSnapshotSymlink] Create symlink <path>/root/12/last_snapshot => 20240313-190431-178
DEBUG: [common/snapshots.py:1666 Snapshots.freeSpace] Remove snapshots older than: 20240304-000000
DEBUG: [common/snapshots.py:1462 Snapshots.smartRemoveList] Considered: [20240313-190431-178, 20240305-185732-918]
DEBUG: [common/snapshots.py:1353 Snapshots.smartRemoveKeepAll] Keep all >= 20240313-000000-824 and < 20240314-000000-378
DEBUG: [common/snapshots.py:1381 Snapshots.smartRemoveKeepFirst] Keep first >= 20240313-000000-491 and < 20240314-000000-792
DEBUG: [common/snapshots.py:1381 Snapshots.smartRemoveKeepFirst] Keep first >= 20240310-000000-563 and < 20240318-000000-926
DEBUG: [common/snapshots.py:1381 Snapshots.smartRemoveKeepFirst] Keep first >= 20240301-000000-554 and < 20240401-000000-866
DEBUG: [common/snapshots.py:1381 Snapshots.smartRemoveKeepFirst] Keep first >= 20240101-000000-963 and < 20250101-000000-464
DEBUG: [common/snapshots.py:1526 Snapshots.smartRemoveList] Keep snapshots: {20240313-190431-178}
INFO: [common/snapshots.py:1632 Snapshots.smartRemove] [smart remove] remove snapshots: [20240305-185732-918]
DEBUG: [common/snapshots.py:1637 Execute.smartRemove] Call command "rsync -a --delete -s /tmp/tmpiittw_hi/ <path>/root/12/20240305-185732-918"
DEBUG: [common/tools.py:2500 run] Starting command 'rsync -a --delet...'
DEBUG: [common/snapshots.py:1637 Execute.smartRemove] Command "rsync -a --delet..." returns 0
DEBUG: [common/snapshots.py:1702 Snapshots.freeSpace] Keep min free disk space: 1024 MiB
DEBUG: [common/snapshots.py:1740 Snapshots.freeSpace] Keep min 1% free inodes
DEBUG: [plugins/usercallbackplugin.py:81 UserCallbackPlugin.callback] Call user-callback: /root/.config/backintime/user-callback 12 <profile_name> 3 20240313-190431-178 <path>/root/12/20240313-190431-178
DEBUG: [plugins/usercallbackplugin.py:81 UserCallbackPlugin.callback] Call user-callback: /root/.config/backintime/user-callback 12 <profile_name> 2
INFO: [plugins/usercallbackplugin.py:93 UserCallbackPlugin.callback] user-callback returned '-------------------------------
Backup of <profile_name> is done
-------------------------------'
DEBUG: [common/tools.py:1134 keyringSupported] No keyring due to import error.
DEBUG: [common/mount.py:161 Mount.__init__] pw-cache is not running
DEBUG: [common/mount.py:169 Mount.__init__] Call command: /usr/bin/backintime pw-cache start
DEBUG: [plugins/usercallbackplugin.py:81 UserCallbackPlugin.callback] Call user-callback: /root/.config/backintime/user-callback 12 <profile_name> 8
INFO: [common/snapshots.py:851 Snapshots.backup] Unlock

Back In Time
Version: 1.4.3

Back In Time comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; type `backintime --license' for details.

@buhtz
Copy link
Member

buhtz commented Mar 14, 2024

Thanks for your feedback.

The Snapshots.backup rsync command returns 24.

Can you please take a look into the snapshot log. But don't post it here. It may contain sensible information.

Can you reproduce the problem on your machine?

It happens only when backing up one particular server, not the others.

Mhm... This supports my assumption that the reason depends on the specific configuration of a machine.

I am a bit out of ideas.

Please can you try this on your root shell:

python3 -c "import grp;priint(grp.getgrall());"

@jean-christophe-manciot
Copy link
Author

Can you please take a look into the snapshot log. But don't post it here. It may contain sensible information.

Nothing significant in the log except maybe this:

[I] Take snapshot (rsync: sent 3.96G bytes  received 324.18K bytes  19.44M bytes/sec)
[I] Take snapshot (rsync: total size is 41.46G  speedup is 10.48)
[I] Take snapshot (rsync: rsync warning: some files vanished before they could be transferred (code 24) at main.c(1338) [sender=3.2.7])
[I] 'rsync' ended with exit code 24: Partial transfer due to vanished source files (see 'man rsync')
[I] Saving config file…
[I] Saving permissions…
[I] Removing old snapshots
[I] Smart remove
[I] Smart remove 1/1
[I] Trying to keep min free space
[I] Trying to keep min 1% free inodes
[I] Finalizing
python3 -c "import grp;print(grp.getgrall());" | grep -P "(181|171|180|154)"
[grp.struct_group(...gr_gid=171,...gr_gid=180,...gr_gid=181,...gr_gid=154,...])]

@jean-christophe-manciot
Copy link
Author

With a new snapshot, I get the same UIDs/GIDs DEBUG issues with a successful rsync return code (0).

@jean-christophe-manciot
Copy link
Author

jean-christophe-manciot commented Mar 15, 2024

IMHO, the code that tries to get user/group names from UIDs/GIDs should be removed altogether.

@buhtz buhtz added Discussion decision or consensus needed Medium Heisenbug a problem that is not reproducible but random (non-deterministic) labels Mar 16, 2024
@jean-christophe-manciot
Copy link
Author

@buhtz
I've discovered why this issue happens: bit tries to get the username/group name on the host which started the snapshot, although the snapshot was made over a remote host accessible through an sshfs tunnel (and a mount point)!
Of course, both hosts don't necessarily share the same UIDs/GIDs; even worse, they can share some with different usernames/group names!

One more reason to remove that code.

@buhtz buhtz added High Bug and removed Medium Heisenbug a problem that is not reproducible but random (non-deterministic) labels Mar 25, 2024
@buhtz buhtz added this to the Upcoming release (1.5.0) milestone Mar 25, 2024
@buhtz
Copy link
Member

buhtz commented Mar 25, 2024

Thanks for analyzing this. A real bug.

But I am not convinced yet that just removing that code will solve the problem. But I am also not enough into it to come to a final decision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Discussion decision or consensus needed High
Projects
None yet
Development

No branches or pull requests

2 participants