Storage backend: SMB #4185

Open
aneesh-n opened this issue Jan 30, 2023 · 8 comments · May be fixed by #4186

Comments

@aneesh-n
Contributor

aneesh-n commented Jan 30, 2023

Output of restic version

restic 0.15.0 compiled with go1.19.5 on windows/amd64

What should restic do differently? Which functionality do you think we should add?

Ability to configure SMB as a storage backend in restic.

What are you trying to do? What problem would this solve?

It would be useful to be able to configure SMB as a backend in restic. This way we would not have to mount the SMB share on the OS and then back up to it. This implementation would also be more cross-platform.
I propose to use the https://github.com/hirochachacha/go-smb2 library which is also used by rclone for its SMB integration.
While it is possible to use SMB as a backend through rclone, having an SMB backend directly on restic will have less overhead and will be cleaner.
I have created an implementation for this and can create a PR for review.

Did restic help you today? Did it make you happy in any way?

@aneesh-n aneesh-n linked a pull request Jan 31, 2023 that will close this issue
8 tasks
@rawtaz
Contributor

rawtaz commented Jan 31, 2023

having an SMB backend directly on restic will have less overhead

By what metrics?

@aneesh-n
Contributor Author

aneesh-n commented Feb 1, 2023

Memory and CPU overhead for HTTP processing due to the local web server required with rclone.

@MichaelEischer
Member

MichaelEischer commented Feb 1, 2023

Memory and CPU overhead for HTTP processing due to the local web server required with rclone.

Can you quantify that a bit more? Yes, obviously an additional process adds some overhead, but adding 1k+ lines of code to restic is not free either. So we have to weigh a possible performance benefit (that's all it is up to now) against the added complexity and maintenance overhead. Reviewing such a large PR is also a significant time investment, which competes with other features that provide larger benefits for restic (for example, features that cannot be properly implemented outside of restic, such as a restore progress bar).

Additional backends increase the work necessary to maintain them and to e.g. evolve the internal interfaces. They also add even more new dependencies, which is somewhat problematic as well (see e.g. #4119). So, in general, we are very reluctant to add new backends, even more so if they are already available via rclone.

@aneesh-n
Contributor Author

aneesh-n commented Feb 2, 2023

Thank you for the insights.

I can run some benchmarks and try to provide some more details and speed comparisons in a couple of days.

Some other minor considerations:

  1. The rest backend, which has to be used with rclone, does not support atomic replace during save. The smb backend follows the same logic as the local backend, adds pooling of SMB connections, and does support atomic replace. (Some refactoring may be possible to reduce the code duplicated from the local backend and improve maintainability, but I thought that could be done as a subsequent step; it would significantly reduce the line count. The test setup and documentation also added many lines, as I was trying to be thorough.)
  2. The smb backend calls sync on the files and marks them read-only to avoid accidental modifications. I believe those steps would be skipped when using the REST API with rclone.
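The save sequence described in the two points above (write, sync, atomic replace, mark read-only) can be sketched roughly as follows. This is only an illustration in Python against a local filesystem. It is not the code from the PR, from restic's local backend, or from go-smb2, and the name `save_atomically` is hypothetical.

```python
import os
import stat
import tempfile


def save_atomically(final_path: str, data: bytes) -> None:
    """Write data so that readers see either the old file or the complete new one."""
    directory = os.path.dirname(final_path) or "."
    # The temporary file lives in the target directory so the final rename
    # stays on one filesystem, which is what makes the replace atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the contents to stable storage
        os.replace(tmp_path, final_path)  # atomic rename over the final name
    except BaseException:
        # Clean up the partial temporary file on any failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    # Mark the finished file read-only to guard against accidental edits.
    os.chmod(final_path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
```

Whether each of these steps maps cleanly onto an SMB share depends on the server and client library, so the sketch should be read as a description of the intent rather than of the SMB calls involved.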

But I do understand that this needs to be weighed against the big picture for restic.

@aneesh-n
Contributor Author

aneesh-n commented Feb 5, 2023

Here are the benchmarks:

Test data used:
Files - 23130
Dirs - 3599
Total size - 54.076 GiB

Note: All tests were performed on the same environment.

Test Environment:
Source:
Windows 10
Processor Intel(R) Core(TM) i7-6650U CPU @ 2.20GHz, 2201 Mhz, 2 Core(s), 4 Logical Processor(s)
Installed Physical Memory (RAM) 16.0 GB

Destination: SMB server (SMB3) on an Asustor NAS device.

Restic version: Build from #4186 commit - d062a82
Rclone version: Release v1.61.1 Intel/AMD - 64 Bit

Performance Monitor was used to capture memory (in bytes, converted to MB) and CPU usage (in % Processor Time) for the restic and rclone processes every 5 seconds.
For the SMB backend, the rclone process was not running.
For the Rclone backend, the usage of the rclone and restic processes is summed.

The average CPU and memory figures give an idea of the resource usage at a given instant.
However, since the SMB backend completes faster than the Rclone backend, the difference in total CPU and memory usage is larger than the difference in the averages.
The total CPU and memory readings are only meant for comparing the two backends and have no significance on their own.
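As a rough illustration of how the average and total figures could be derived from the 5-second samples, here is a minimal Python sketch. It assumes that the total is simply the sum of all samples and that, for the Rclone runs, the restic and rclone samples are added per interval. The function name and the sample values are made up for the example.

```python
from statistics import mean


def summarize(restic_samples, rclone_samples=None):
    """Return (average, total) for one metric sampled at a fixed interval.

    If rclone_samples is given, the two processes are summed per interval
    before averaging, as described for the Rclone backend runs.
    """
    if rclone_samples is None:
        combined = list(restic_samples)
    else:
        if len(restic_samples) != len(rclone_samples):
            raise ValueError("both processes need the same number of samples")
        combined = [a + b for a, b in zip(restic_samples, rclone_samples)]
    return mean(combined), sum(combined)


# Hypothetical CPU samples (% Processor Time), one per 5-second interval.
avg, total = summarize([100.0, 200.0], [50.0, 50.0])
print(f"average={avg}, total={total}")
```

Under this assumption the total grows with the run time, which is why a faster run lowers the total even when the average is unchanged.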

| Metric | Run 1 | Run 2 | Run 3 | Run 4 | Average |
| --- | --- | --- | --- | --- | --- |
| Rclone (time, min.sec) | 17.55 | 18 | 17.35 | 18.25 | 17 min 59 secs |
| SMB (time, min.sec) | 16.15 | 16.1 | 16.05 | 16.1 | 16 min 11 secs |
| Rclone (average CPU, % Processor Time) | 244.44 | 243.8 | 245.78 | 241.38 | 243.85 |
| SMB (average CPU, % Processor Time) | 243.21 | 242.8 | 242.21 | 246.41 | 243.6575 |
| Rclone (average memory used, MB) | 241.47 | 241.19 | 242.57 | 241.07 | 241.575 |
| SMB (average memory used, MB) | 188.57 | 192.53 | 191.55 | 191.52 | 191.0425 |
| Rclone (total CPU, % Processor Time) | 52799.76 | 52659.87 | 52105.11 | 53585.63 | 52787.5925 |
| SMB (total CPU, % Processor Time) | 47670.07 | 47345.42 | 46988.5 | 48050.05 | 47513.51 |
| Rclone (total memory used, MB) | 52156.57 | 52097.08 | 51424.11 | 53518.49 | 52299.0625 |
| SMB (total memory used, MB) | 36959.34 | 37542.87 | 37160.29 | 37347.15 | 37252.4125 |

Conclusions for backup statistics:
SMB backend is 10% faster than Rclone SMB backend.
SMB backend uses about the same average CPU as Rclone SMB backend (0% difference).
SMB backend uses 26% less average Memory than Rclone SMB backend.
SMB backend uses 11% less total CPU than Rclone SMB backend.
SMB backend uses 40% less total Memory than Rclone SMB backend.
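The percentages can be recomputed from the Average column with a short Python sketch. The helper names are hypothetical, and the result depends on which figure is used as the denominator. With the Rclone figure as the baseline, the averages give roughly 21%, 10% and 29% for average memory, total CPU and total memory. With the SMB figure as the baseline, they give roughly 26%, 11% and 40%, which matches the numbers stated above.

```python
def percent_less(candidate: float, baseline: float) -> float:
    """How much smaller candidate is, as a percentage of baseline."""
    return (baseline - candidate) / baseline * 100.0


def percent_more(candidate: float, baseline: float) -> float:
    """How much larger candidate is, as a percentage of baseline."""
    return (candidate - baseline) / baseline * 100.0


# (SMB average, Rclone average), copied from the Average column of the table.
averages = {
    "average CPU": (243.6575, 243.85),
    "average memory": (191.0425, 241.575),
    "total CPU": (47513.51, 52787.5925),
    "total memory": (37252.4125, 52299.0625),
}

for name, (smb, rclone) in averages.items():
    print(
        f"{name}: SMB uses {percent_less(smb, rclone):.1f}% less than Rclone; "
        f"Rclone uses {percent_more(rclone, smb):.1f}% more than SMB"
    )
```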

Benchmarks -
ResticRcloneBenchmark1.xlsx
ResticRcloneBenchmark2.xlsx
ResticRcloneBenchmark3.xlsx
ResticRcloneBenchmark4.xlsx

ResticSMBBenchmarkC5-1.xlsx
ResticSMBBenchmarkC5-2.xlsx
ResticSMBBenchmarkC5-3.xlsx
ResticSMBBenchmarkC5-4.xlsx

Summary -
BenchmarksSummary.xlsx

So the SMB backend is 10% faster than the Rclone backend when the same maximum number of connections is used, and it saves 11% total CPU usage, 26% average memory usage and 40% total memory usage.
These numbers might increase as the data size increases.

Besides this, the SMB backend makes sync calls on files and can mark files as read-only to avoid accidental modifications, just like the local backend. Neither would happen with the rclone+SMB backend. The SMB backend also supports atomic replace during save.

@aneesh-n
Contributor Author

aneesh-n commented Feb 11, 2023

The test results below indicate the pattern for performance characteristics with increasing data:

Test data used:
Files - 12985
Dirs - 139
Total size - 1.238 TiB

| Metric | SMB test | Rclone test |
| --- | --- | --- |
| Time in minutes | 301.25 | 369.58 |
| Average CPU, % Processor Time | 285.56 | 253.06 |
| Average memory used, MiB | 318.39 | 395.72 |
| Total CPU, % Processor Time | 1032303.48 | 1122336.28 |
| Total memory used, MiB | 1150974.63 | 1755033.81 |

Conclusions for backup statistics:
SMB backend is 18% faster than Rclone SMB backend.
SMB backend uses 13% more average CPU than Rclone SMB backend.
SMB backend uses 20% less average Memory than Rclone SMB backend.
SMB backend uses 8% less total CPU than Rclone SMB backend.
SMB backend uses 34% less total Memory than Rclone SMB backend.
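One way to sanity-check these figures is to compare them against the reported run times. The Python sketch below assumes the same 5-second sampling interval as in the earlier runs and assumes that each total is the sum of all samples, so that total divided by average approximates the number of samples. Both are assumptions and are not stated for this run.

```python
SAMPLE_INTERVAL_SECONDS = 5  # assumed to match the earlier benchmark runs


def implied_minutes(total: float, average: float) -> float:
    """Run time implied by a summed total and its per-sample average."""
    samples = total / average
    return samples * SAMPLE_INTERVAL_SECONDS / 60


# Total and average CPU (% Processor Time) taken from the table above.
smb_minutes = implied_minutes(1032303.48, 285.56)
rclone_minutes = implied_minutes(1122336.28, 253.06)

print(f"SMB: {smb_minutes:.2f} min implied, 301.25 min reported")
print(f"Rclone: {rclone_minutes:.2f} min implied, 369.58 min reported")
```

If the implied and reported times agree, the totals and averages are at least internally consistent with the sampling method described for the first benchmark.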

Benchmarks:

ResticSMBRcloneBenchmark-1TB .xlsx

Summary:

BenchmarksSummary.xlsx

For 1.24 TiB, the SMB backend is 18% faster than the Rclone SMB backend and it saves 8% total CPU usage, 20% average memory usage and 34% total memory usage.

@greatroar
Contributor

I know that adding backends adds to the maintenance burden, but SMB is apparently very popular, judging by the number of bug reports about the Linux CIFS driver. The speedup is also impressive.

@andro42

andro42 commented Feb 19, 2024

I agree that SMB is very popular. In our environment, we would benefit from SMB support. I have had many problems with cifs. Mounting the share, or going through rclone, also adds complexity that I would like to avoid in a backup solution.
