
A full backup includes Whisper's model files, raising the backup size from 440MB to 1730MB #3545

Open
SaintTDI opened this issue Apr 4, 2024 · 3 comments


SaintTDI commented Apr 4, 2024

Describe the issue you are experiencing

Yesterday I installed a local voice assist pipeline. Before installing it, each full backup (created automatically by the OneDrive Backup add-on) was about 440MB. Since I installed Whisper, Piper and openWakeWord, yesterday's backup was 1GB and this morning's is 1.6GB.

A partial backup containing only the Whisper add-on is 1266MB. After unzipping the file, I can see the 3 models that I tried, each with some large files (e.g. path: core_whisper\data\models--rhasspy--faster-whisper-medium-int8\blobs).
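To check which folders account for the size, the archive members can be totalled per folder prefix. The following is a minimal sketch, assuming the add-on archive from the backup is available as a tar file; the function name `sizes_by_folder` and its `depth` parameter are illustrative and not part of any Home Assistant tooling.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def sizes_by_folder(archive_path, depth=2):
    """Return total bytes per folder prefix (first `depth` path parts)."""
    totals = Counter()
    # "r:*" lets tarfile detect the compression (gzip or none) automatically.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, etc.
            parts = PurePosixPath(member.name).parts[:depth]
            totals["/".join(parts)] += member.size
    return totals
```

Calling `sizes_by_folder("core_whisper.tar.gz").most_common(5)` (the file name here is only an example) would list the five largest folder prefixes, which should make any `models--...` folders stand out.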

The Whisper add-on documentation says:

Backups
Whisper model files can be quite large, so they are automatically excluded from backups. The models will be re-downloaded when the backup is restored.

But this does not seem to happen.
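For illustration, the exclusion the documentation describes can be thought of as a glob filter applied to each path before it is archived. The sketch below uses Python's `fnmatch`; the pattern list and the `is_excluded` helper are hypothetical and are not taken from the add-on's configuration or the Supervisor's code, whose matching rules may differ.

```python
from fnmatch import fnmatch

# Hypothetical exclusion pattern; the add-on's real patterns are not shown in this issue.
EXCLUDE_PATTERNS = ["*/models--*/*"]


def is_excluded(path, patterns=EXCLUDE_PATTERNS):
    """Return True if `path` matches any of the glob patterns."""
    # Normalise Windows-style separators so one pattern set covers both styles.
    normalised = path.replace("\\", "/")
    return any(fnmatch(normalised, pattern) for pattern in patterns)
```

Under this assumed pattern, the path quoted in this report would be filtered out, while ordinary add-on data such as an options file would still be backed up. If the real patterns do not match where the models are actually stored, the models would end up in the backup, which is consistent with what is observed here.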

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Which add-on are you reporting an issue with?

Whisper

What is the version of the add-on?

2.0.0

Steps to reproduce the issue

  1. Install Whisper
  2. Switch between different models
  3. Perform a full backup
  4. Unzip the file and find the blobs for each model
    ...

System Health information

System Information

version core-2024.3.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.6.20-haos
arch x86_64
timezone Europe/Rome
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 5000
Installed Version 1.34.0
Stage running
Available Repositories 1402
Downloaded Repositories 35
Home Assistant Cloud
logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 12.1
update_channel stable
supervisor_version supervisor-2024.03.1
agent_version 1.6.0
docker_version 24.0.7
disk_total 468.7 GB
disk_used 27.1 GB
healthy true
supported true
board generic-x86-64
supervisor_api ok
version_api ok
installed_addons File editor (5.8.0), Advanced SSH & Web Terminal (17.2.0), Mosquitto broker (6.4.0), Zigbee2MQTT (1.36.1-1), Studio Code Server (5.15.0), Duck DNS (1.16.0), OneDrive Backup (2.3.1), Grocy (0.21.0), ESPHome (2024.3.1), Piper (1.5.0), Whisper (2.0.0), openWakeWord (1.10.0)
Dashboards
dashboards 7
resources 26
views 52
mode storage
Recorder
oldest_recorder_run March 20, 2024 at 13:28
current_recorder_run April 3, 2024 at 15:23
estimated_db_size 996.68 MiB
database_engine sqlite
database_version 3.44.2
Spotify
api_endpoint_reachable ok

Anything in the Supervisor logs that might be useful for us?

2024-04-04 11:21:17.572 INFO (MainThread) [supervisor.api.middleware.security] /backups access from de91e161_hassio_onedrive_backup
2024-04-04 11:21:18.008 INFO (MainThread) [supervisor.api.middleware.security] /backups access from de91e161_hassio_onedrive_backup
2024-04-04 11:22:42.609 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state running
2024-04-04 11:22:42.609 INFO (MainThread) [supervisor.resolution.checks.base] Run check for security/core
2024-04-04 11:22:42.610 INFO (MainThread) [supervisor.resolution.checks.base] Run check for no_current_backup/system
2024-04-04 11:22:42.610 INFO (MainThread) [supervisor.resolution.checks.base] Run check for trust/supervisor
2024-04-04 11:22:42.617 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_failed/dns_server
2024-04-04 11:22:42.670 INFO (MainThread) [supervisor.resolution.checks.base] Run check for pwned/addon
2024-04-04 11:22:42.670 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system
2024-04-04 11:22:42.671 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_ipv6_error/dns_server
2024-04-04 11:22:42.671 INFO (MainThread) [supervisor.resolution.checks.base] Run check for docker_config/system
2024-04-04 11:22:42.671 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system
2024-04-04 11:22:42.671 INFO (MainThread) [supervisor.resolution.checks.base] Run check for ipv4_connection_problem/system
2024-04-04 11:22:42.672 INFO (MainThread) [supervisor.resolution.check] System checks complete
2024-04-04 11:22:42.672 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running
2024-04-04 11:22:42.754 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-04 11:22:42.754 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running
2024-04-04 11:22:42.754 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
2024-04-04 11:23:18.455 WARNING (MainThread) [supervisor.addons.options] Unknown option 'serial' for Zigbee2MQTT (45df7312_zigbee2mqtt)
2024-04-04 11:23:18.455 WARNING (MainThread) [supervisor.addons.options] Unknown option 'advanced' for Zigbee2MQTT (45df7312_zigbee2mqtt)
2024-04-04 11:23:44.647 INFO (MainThread) [supervisor.backups.manager] Found 45 backup files
2024-04-04 11:23:44.713 INFO (MainThread) [supervisor.backups.manager] Found 45 backup files
2024-04-04 11:23:50.924 INFO (MainThread) [supervisor.backups.manager] Backup 8e7bb872 starting stage addon_repositories
2024-04-04 11:23:50.925 INFO (MainThread) [supervisor.backups.manager] Backup 8e7bb872 starting stage docker_config
2024-04-04 11:23:50.925 INFO (MainThread) [supervisor.backups.manager] Creating new full backup with slug 8e7bb872
2024-04-04 11:23:50.928 INFO (MainThread) [supervisor.backups.manager] Backup 8e7bb872 starting stage addons
2024-04-04 11:23:50.944 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on core_configurator
2024-04-04 11:23:50.949 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon core_configurator
2024-04-04 11:23:50.956 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on a0d7b954_ssh
2024-04-04 11:23:50.963 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon a0d7b954_ssh
2024-04-04 11:23:50.970 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on core_mosquitto
2024-04-04 11:23:50.975 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon core_mosquitto
2024-04-04 11:23:50.983 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on 45df7312_zigbee2mqtt
2024-04-04 11:23:50.987 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon 45df7312_zigbee2mqtt
2024-04-04 11:23:50.995 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on a0d7b954_vscode
2024-04-04 11:23:52.111 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon a0d7b954_vscode
2024-04-04 11:23:52.119 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on core_duckdns
2024-04-04 11:23:52.137 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon core_duckdns
2024-04-04 11:23:52.147 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on de91e161_hassio_onedrive_backup
2024-04-04 11:23:52.152 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon de91e161_hassio_onedrive_backup
2024-04-04 11:23:52.160 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on a0d7b954_grocy
2024-04-04 11:23:52.191 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon a0d7b954_grocy
2024-04-04 11:23:52.200 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on 5c53de3b_esphome
2024-04-04 11:23:52.203 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon 5c53de3b_esphome
2024-04-04 11:23:52.211 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on core_piper
2024-04-04 11:23:52.216 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon core_piper
2024-04-04 11:23:52.223 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on core_whisper
2024-04-04 11:24:06.122 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon core_whisper
2024-04-04 11:24:06.130 INFO (MainThread) [supervisor.addons.addon] Building backup for add-on core_openwakeword
2024-04-04 11:24:06.135 INFO (MainThread) [supervisor.addons.addon] Finish backup for addon core_openwakeword
2024-04-04 11:24:06.135 INFO (MainThread) [supervisor.backups.manager] Backup 8e7bb872 starting stage home_assistant
2024-04-04 11:24:06.143 INFO (MainThread) [supervisor.homeassistant.module] Backing up Home Assistant Core config folder
2024-04-04 11:24:17.050 INFO (MainThread) [supervisor.homeassistant.module] Backup Home Assistant Core config folder done
2024-04-04 11:24:17.057 INFO (MainThread) [supervisor.backups.manager] Backup 8e7bb872 starting stage folders
2024-04-04 11:24:17.058 INFO (SyncWorker_3) [supervisor.backups.backup] Backing up folder share
2024-04-04 11:24:17.060 INFO (SyncWorker_3) [supervisor.backups.backup] Backup folder share done
2024-04-04 11:24:17.062 INFO (SyncWorker_6) [supervisor.backups.backup] Backing up folder addons/local
2024-04-04 11:24:17.065 INFO (SyncWorker_6) [supervisor.backups.backup] Backup folder addons/local done
2024-04-04 11:24:17.066 INFO (SyncWorker_4) [supervisor.backups.backup] Backing up folder ssl
2024-04-04 11:24:17.069 INFO (SyncWorker_4) [supervisor.backups.backup] Backup folder ssl done
2024-04-04 11:24:17.070 INFO (SyncWorker_2) [supervisor.backups.backup] Backing up folder media
2024-04-04 11:24:17.072 INFO (SyncWorker_2) [supervisor.backups.backup] Backup folder media done
2024-04-04 11:24:17.073 INFO (MainThread) [supervisor.backups.manager] Backup 8e7bb872 starting stage finishing_file
2024-04-04 11:24:17.076 INFO (MainThread) [supervisor.backups.manager] Creating full backup with slug 8e7bb872 completed
2024-04-04 11:24:17.082 INFO (MainThread) [supervisor.backups.manager] Found 46 backup files
2024-04-04 11:24:39.304 INFO (MainThread) [supervisor.backups.manager] Found 46 backup files
2024-04-04 11:24:40.099 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
2024-04-04 11:24:50.068 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
2024-04-04 11:26:17.992 INFO (MainThread) [supervisor.api.middleware.security] /backups access from de91e161_hassio_onedrive_backup
2024-04-04 11:26:18.687 INFO (MainThread) [supervisor.api.middleware.security] /backups access from de91e161_hassio_onedrive_backup
2024-04-04 11:28:18.460 WARNING (MainThread) [supervisor.addons.options] Unknown option 'serial' for Zigbee2MQTT (45df7312_zigbee2mqtt)
2024-04-04 11:28:18.460 WARNING (MainThread) [supervisor.addons.options] Unknown option 'advanced' for Zigbee2MQTT (45df7312_zigbee2mqtt)
2024-04-04 11:31:18.490 INFO (MainThread) [supervisor.api.middleware.security] /backups access from de91e161_hassio_onedrive_backup
2024-04-04 11:31:18.955 INFO (MainThread) [supervisor.api.middleware.security] /backups access from de91e161_hassio_onedrive_backup
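
The per-add-on timings in the log above already hint at where the data is: every other add-on finishes in well under two seconds, while `core_whisper` takes roughly 14 seconds. A minimal sketch for extracting these durations, assuming the log line format shown above (the `addon_backup_durations` helper is illustrative):

```python
import re
from datetime import datetime

# Matches the "Building backup" / "Finish backup" lines as they appear in the log above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"(?P<event>Building backup for add-on|Finish backup for addon) (?P<slug>\S+)"
)


def addon_backup_durations(log_text):
    """Map add-on slug -> seconds between its 'Building' and 'Finish' lines."""
    started, durations = {}, {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
        slug = match["slug"]
        if match["event"].startswith("Building"):
            started[slug] = ts
        elif slug in started:
            durations[slug] = (ts - started.pop(slug)).total_seconds()
    return durations
```

Sorting the returned dictionary by value puts the slowest add-on first, which is a quick way to spot an archive that contains unexpectedly large files.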

Anything in the add-on logs that might be useful for us?

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service whisper: starting
s6-rc: info: service whisper successfully started
s6-rc: info: service discovery: starting
[11:15:30] WARNING: Your CPU does not support Advanced Vector Extensions (AVX). Whisper will run slower than normal.
INFO:__main__:Ready
[11:15:34] INFO: Successfully send discovery information to Home Assistant.
s6-rc: info: service discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started

Additional information

No response


PengBG commented Apr 8, 2024

I can confirm this behavior: the Whisper add-on is fully backed up and the backup file is huge. Every model that I tried is in there.

@carldebilly

Yep, models should not be saved in the /data folder, because that folder is included in backups, and backing up the models is not useful.


donburch888 commented May 13, 2024

Bump. My Full Backups also increased by 2.8GB, which means I now have to delete backups frequently to stay within my 32GB disk partition.
My system is currently:

HA OS in a Proxmox partition on an x86-64 PC
Core 2024.5.3
Supervisor 2024.05.1
Operating System 12.3
Frontend 20240501.1

I created a Full Backup (System > Backups > Create backup > Full Backup), which took 3840.1MB. I downloaded it to my PC and opened it:
[Screenshot from 2024-05-13 20-43-03]
Note the last entry: core_whisper.tar.gz accounts for 2.8GB of the 3.8GB file, and it contains the “medium.en” Whisper model I am currently using, plus tiny folders for previously used models.
