
An unexpected error occurred: TypeError: Network request failed - on Macbook M1 #876

Open
DPG7332 opened this issue Sep 1, 2022 · 22 comments · May be fixed by #2088
Labels: bug (Something isn't working), MacOS (Issue pertains specifically to MacOS)

Comments

@DPG7332 commented Sep 1, 2022

Information:

  • Chainner version: 0.11.6
  • OS: MacOS Monterey 12.5.1
  • Device: Macbook Air (M1, 2020)

Description
I'm using PyTorch Upscale with the UltraSharp 4k model, and I can successfully upscale a 256x256 image. However, when I try a larger 2048x2048 image, the process runs for around 20 minutes and then an error message is displayed that reads: "An unexpected error occurred: TypeError: Network request failed."

Logs
main.log
renderer.log

[screenshot: chainner_error]

Thank you

DPG7332 added the bug label Sep 1, 2022
@joeyballentine (Member)

[screenshot showing very high RAM usage]

It appears you are running out of RAM while upscaling. Since you are on a mac, upscaling via PyTorch takes place on the CPU and RAM. I would recommend using NCNN instead and converting any models you want to use, since that will take advantage of your GPU. Though, NCNN has been having some issues on Mac as well so YMMV. I plan on adding support for the apple silicon ONNX runtime, so that will be an option at some point in the future as well, but for now I would recommend attempting to use NCNN.
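For reference, a minimal sketch (not chaiNNer's actual code) of how one might check which device PyTorch can use in this situation, assuming a PyTorch build new enough to expose the MPS backend:

```python
import torch

# Pick the best available device. On Apple silicon there is no CUDA,
# so without MPS support PyTorch falls back to the CPU and system RAM.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
    device = torch.device("mps")  # Apple silicon GPU via Metal (PyTorch 1.12+)
else:
    device = torch.device("cpu")

print(f"Upscaling would run on: {device}")
```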

@DPG7332 (Author) commented Sep 3, 2022

Thanks for the reply! I'll work on that. :) Looking forward to apple silicon support in the future.

@RunDevelopment (Member)

It appears you are running out of RAM while upscaling.

Is that really a reason for chaiNNer to fail? I mean, wouldn't the system just use a swap file in the background and continue running, albeit more slowly?

@joeyballentine (Member)

@RunDevelopment I based that assumption off the fact that it just crashed without warning and the screenshot shows very high RAM usage. I'm not actually 100% sure that's the reason.

@iamwavecut

Getting the same error while trying to use the SBER-finetuned RealESRGAN_x2 model, converted from PTH to fp16 NCNN.

Log
[2022-09-12 17:48:19 +0200] [51092] [INFO] Iterating over frames in video file: C:\Users\WaveCut\Downloads\JUSTaFiLezqcixlbymv.mp4
[2022-09-12 17:48:19 +0200] [51092] [INFO] {'044186ff-1820-47b7-8136-dd8a35c2fba7': {'schemaId': 'chainner:image:resize_factor', 'id': '044186ff-1820-47b7-8136-dd8a35c2fba7', 'inputs': [{'id': '242ce50b-f483-41cb-93df-bd59e8296053', 'index': 0}, 50, 1], 'child': True, 'nodeType': 'regularNode', 'hasSideEffects': False, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}, '242ce50b-f483-41cb-93df-bd59e8296053': {'schemaId': 'chainner:ncnn:upscale_image', 'id': '242ce50b-f483-41cb-93df-bd59e8296053', 'inputs': [{'id': '5db9f9ee-c5f8-4f8c-bdf2-439bd900ce31', 'index': 0}, {'id': '77e8126a-65e6-4c7a-900c-36631542d480', 'index': 0}, 0], 'child': True, 'nodeType': 'regularNode', 'hasSideEffects': False, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}, '77e8126a-65e6-4c7a-900c-36631542d480': {'schemaId': 'chainner:image:simple_video_frame_iterator_load', 'id': '77e8126a-65e6-4c7a-900c-36631542d480', 'inputs': [None], 'child': True, 'nodeType': 'iteratorHelper', 'hasSideEffects': True, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}, '82d52179-ccad-49b7-a86d-2768b966aa60': {'schemaId': 'chainner:image:view', 'id': '82d52179-ccad-49b7-a86d-2768b966aa60', 'inputs': [{'id': '242ce50b-f483-41cb-93df-bd59e8296053', 'index': 0}], 'child': True, 'nodeType': 'regularNode', 'hasSideEffects': True, 'cacheOptions': {'shouldCache': False, 'maxCacheHits': 0, 'clearImmediately': False}}, 'd0691ce0-d489-45ee-b8e9-195094aaf9bd': {'schemaId': 'chainner:image:simple_video_frame_iterator_save', 'id': 'd0691ce0-d489-45ee-b8e9-195094aaf9bd', 'inputs': [{'id': '044186ff-1820-47b7-8136-dd8a35c2fba7', 'index': 0}, 'C:\\Users\\WaveCut\\Downloads', 'musa', 'mp4'], 'child': True, 'nodeType': 'iteratorHelper', 'hasSideEffects': True, 'cacheOptions': {'shouldCache': False, 'maxCacheHits': 0, 'clearImmediately': False}}, '5db9f9ee-c5f8-4f8c-bdf2-439bd900ce31': {'schemaId': 'chainner:ncnn:load_model', 'id': '5db9f9ee-c5f8-4f8c-bdf2-439bd900ce31', 'inputs': ['I:\\NN\\models\\sber_realesrgan_tuned\\RealESRGAN_x2.param', 'I:\\NN\\models\\sber_realesrgan_tuned\\RealESRGAN_x2.bin'], 'child': False, 'nodeType': 'regularNode', 'hasSideEffects': False, 'cacheOptions': {'shouldCache': True, 'maxCacheHits': None, 'clearImmediately': False}}}

[2022-09-12 17:48:19.295] [info] Backend: [51092] [INFO] Execution options: fp16: True, device: cuda:0

[2022-09-12 17:48:19.954] [error] Backend: find_blob_index_by_name onnx::Unsqueeze_703 failed

[2022-09-12 17:48:19.955] [error] Backend: find_blob_index_by_name onnx::Squeeze_712 failed

[2022-09-12 17:48:20.000] [error] Backend: parse layer_type failed

[2022-09-12 17:48:20.045] [error] Backend: load_model error at layer 1376, parameter file has inconsistent content.

[2022-09-12 17:48:21.183] [error] Python subprocess exited with code 3221225477 and signal null
[2022-09-12 17:49:23.995] [info] Attempting to kill backend...
[2022-09-12 17:49:23.995] [error] Error killing backend.
[2022-09-12 17:49:24.034] [info] Cleaning up temp folders...
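As a quick way to narrow down whether the converted files themselves are at fault, here is a minimal sketch using the ncnn Python bindings (pip install ncnn) to check that a .param/.bin pair loads at all; the file paths are placeholders, and this is not how chaiNNer loads models internally:

```python
import ncnn

net = ncnn.Net()

# Both calls return 0 on success; failures print errors similar to the
# "parse layer_type failed" / "load_model error" lines in the log above.
param_err = net.load_param("RealESRGAN_x2.param")  # placeholder path
model_err = net.load_model("RealESRGAN_x2.bin")    # placeholder path

if param_err != 0 or model_err != 0:
    print("NCNN model failed to load; the param/bin pair may be inconsistent.")
else:
    print("NCNN model loaded successfully.")
```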

@iamwavecut

I mean it's the same in terms of non-verbosity: there's no useful error message, and the backend dies.

@joeyballentine (Member)

I think there are two action items based on this discussion (besides fixing the actual issues):

  1. When the backend dies like this, we need to tell the user in a more verbose way.
  2. We need a way to restart the backend without closing chaiNNer. This would also be useful for installing/updating dependencies. I figure we can refactor how we handle the backend event handlers, trigger that in these cases, and follow it with a frontend refresh on button press (see the sketch after this list).
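To illustrate the second item: the idea is to supervise the backend process and bring it back up (and tell the user) instead of leaving the UI talking to a dead server. Below is a minimal sketch of that pattern in Python; chaiNNer's frontend is actually Electron/TypeScript and its backend invocation differs, so the command here is a placeholder:

```python
import subprocess
import time

BACKEND_CMD = ["python", "backend.py", "--port", "8000"]  # placeholder command


def run_backend_forever() -> None:
    while True:
        proc = subprocess.Popen(BACKEND_CMD)
        exit_code = proc.wait()
        # Instead of failing silently with "Network request failed",
        # report the crash and restart so the UI can refresh and reconnect.
        print(f"Backend exited with code {exit_code}; restarting in 1 second...")
        time.sleep(1)


if __name__ == "__main__":
    run_backend_forever()
```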

@theflyingzamboni (Collaborator)

@iamwavecut Could you link the model used? Also, did this error occur when trying to convert, or when trying to load the model/upscale? And if the latter, how did you convert it?

@iamwavecut commented Sep 13, 2022

@iamwavecut Could you link the model used?

https://icedrive.net/s/VWT5tiYG53W9jCxwztB5yWaBtagk both original and converted fp16

Also, did this error occur when trying to convert, or when trying to load the model/upscale? And if the latter, how did you convert it?

The error occurs when I press the RUN button. The first popup is just a general error message (I believe it says it's unable to load the model or something), and right after it a second popup reports a network error (I believe it's a pipeline status request), so the backend is already dead at that point.

The model was converted using the recently introduced chaiNNer functionality.

@theflyingzamboni (Collaborator)

The model was converted using recently introduced chaiNNer functionality.

How were you able to get the pth model to convert using chaiNNer? I tried doing it myself, but I get this error when PyTorch attempts to export the model to ONNX (the intermediate step in converting to NCNN):

Exporting the operator pixel_unshuffle to ONNX opset version 14 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

The opset version of ONNX we export as does not appear to support one of the operators in the pth model, so I don't know how you got it to convert in the first place.
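For context, the export step that fails looks roughly like the sketch below; the tiny model is a stand-in that just uses pixel_unshuffle the way the 2x arch does, and on PyTorch versions without export support for that operator this call raises the error quoted above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyUnshuffleNet(nn.Module):
    """Stand-in model whose forward pass uses pixel_unshuffle, like 2x RealESRGAN."""

    def forward(self, x):
        return F.pixel_unshuffle(x, downscale_factor=2)


model = TinyUnshuffleNet()
dummy_input = torch.rand(1, 3, 64, 64)

# On PyTorch builds without pixel_unshuffle export support, this raises:
# "Exporting the operator pixel_unshuffle to ONNX opset version 14 is not supported"
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=14)
```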

@iamwavecut

I believe that depends on the model. All of the (tens of) ESRGAN-based models I tried converted successfully for me.

@joeyballentine (Member)

@iamwavecut the RealESRGAN_x2 model uses pixelunshuffle though, at least the official one. Is the one you were converting an unofficial one?

@iamwavecut

Yeah, that's the Sberbank-ai variant of RealESRGAN.

@iamwavecut

The x4 and x8 models work OK, though.

@joeyballentine (Member)

Just checked it out and based on the arch in the repo, it should be using pixelunshuffle for 2x and 1x scales, so the same as the official ones.
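For anyone following along, this is why only the 2x (and 1x) variants hit the export problem: in this family of architectures, sub-4x scales are handled by pixel-unshuffling the input first so the rest of the network can stay the same as the 4x version. A simplified sketch of that input handling (not the actual arch code):

```python
import torch
import torch.nn.functional as F


def prepare_input(x: torch.Tensor, scale: int) -> torch.Tensor:
    # 4x/8x models take the image as-is; 2x and 1x models fold spatial
    # pixels into channels first, which is where pixel_unshuffle appears.
    if scale == 2:
        return F.pixel_unshuffle(x, downscale_factor=2)  # 3 -> 12 channels
    if scale == 1:
        return F.pixel_unshuffle(x, downscale_factor=4)  # 3 -> 48 channels
    return x


img = torch.rand(1, 3, 64, 64)
print(prepare_input(img, 2).shape)  # torch.Size([1, 12, 32, 32])
```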

@theflyingzamboni (Collaborator)

Which is why I'm wondering how it converted for you, @iamwavecut. Given that the opset we export to ONNX with does not support that op, you should never have been able to get an ONNX model through chaiNNer with this particular model, never mind an NCNN model.

@iamwavecut

Oh, I forgot to mention something important: I'm using chaiNNer with a local Python installation, which is Python 3.9.6.

@joeyballentine (Member)

That shouldn't matter

@nocturnal808 commented Sep 22, 2022

Hi there, I receive the same error, also while upscaling using a PyTorch model, but with the 'image file iterator' in this case. MacBook Pro (not M1). For anything over a small handful of images, it quits the process at some point with that error (the most I have managed is 4 in one go). RAM looks likely here too, I guess, as it's showing a red circle like the OP's.

RunDevelopment added the MacOS label Mar 9, 2023
@stonerl (Collaborator) commented Aug 13, 2023

The problem here is the Tile Size. I ran into the same issue with my M2 Pro with 32 GB of RAM, even with MPS.

@nocturnal808 @DPG7332 If you set the Tile Size manually from Auto to a lower value, e.g. 1024 (which works flawlessly on my machine), you shouldn't see these issues anymore.

@joeyballentine When set to automatic, I assume PyTorch tries to figure it out by itself?
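Conceptually, a fixed tile size bounds peak memory by upscaling one crop at a time instead of the whole image at once. A minimal sketch of the idea (not chaiNNer's actual tiling code, which also handles tile overlap and seam blending):

```python
import numpy as np


def upscale_tiled(img: np.ndarray, upscale_fn, tile_size: int = 1024, scale: int = 4) -> np.ndarray:
    """Upscale an (H, W, C) image in tile_size x tile_size chunks to limit peak memory."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile_size):
        for x in range(0, w, tile_size):
            tile = img[y:y + tile_size, x:x + tile_size]
            out[y * scale:(y + tile.shape[0]) * scale,
                x * scale:(x + tile.shape[1]) * scale] = upscale_fn(tile)
    return out


def dummy_upscale(tile: np.ndarray) -> np.ndarray:
    # Stand-in for a real model: nearest-neighbour 4x upscale.
    return tile.repeat(4, axis=0).repeat(4, axis=1)


result = upscale_tiled(np.zeros((256, 256, 3), dtype=np.float32), dummy_upscale, tile_size=128)
print(result.shape)  # (1024, 1024, 3)
```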

@RunDevelopment (Member)

I assume PyTorch tries to figure it out by itself?

Yes. We once did a few tests to empirically figure out a rough formula to estimate the VRAM a model needs to upscale an image of a certain size. The result was this function.
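The linked function isn't reproduced here; purely as an illustration of the shape such an estimate takes, it typically scales with the tile's pixel count, the bytes per element (fp16 vs fp32), and a model-dependent overhead factor. Every constant in the sketch below is made up:

```python
def estimate_peak_bytes(height: int, width: int, channels: int = 3,
                        bytes_per_element: int = 2, model_overhead: float = 80.0) -> float:
    """Hypothetical illustration only: peak memory grows roughly linearly with
    the number of input pixels, scaled by a model-dependent overhead factor."""
    return height * width * channels * bytes_per_element * model_overhead


# e.g. a 1024x1024 fp16 tile with a made-up overhead factor of 80:
print(estimate_peak_bytes(1024, 1024) / 2**30, "GiB")  # ~0.47 GiB (illustrative)
```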

@joeyballentine (Member)

Also, btw, the RAM indicator always showing a red circle was a bug that only recently got fixed.

stonerl linked a pull request Aug 14, 2023 that will close this issue