Mac OS current error #5

Open
henrykironde opened this issue Oct 22, 2021 · 14 comments

@henrykironde
Contributor

> model = df_model()
Reading config file: /Users/henrysenyondo/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/deepforest/data/deepforest_config.yml
> model$use_release()
Model from DeepForest release https://github.com/weecology/DeepForest/releases/tag/1.0.0 was already downloaded. Loading model from file.
Loading pre-built model: https://github.com/weecology/DeepForest/releases/tag/1.0.0
> 
> annotations_file = get_data("testfile_deepforest.csv")
> model$config$cpus = 1L
> model$config$workers = 1L
> model$config$epochs = 1
> model$config["save-snapshot"] = FALSE
> model$config$train$csv_file = annotations_file
> model$config$train$root_dir = get_data(".")
> 
> model$config$train$fast_dev_run = TRUE
> 
> model$create_trainer()
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Running in fast_dev_run mode: will run a full train, val, test and prediction loop using 1 batch(es).
> model$trainer$fit(model)

  | Name  | Type      | Params
------------------------------------
0 | model | RetinaNet | 32.1 M
------------------------------------
31.9 M    Trainable params
222 K     Non-trainable params
32.1 M    Total params
128.592   Total estimated model params size (MB)
Epoch 0:   0%|          | 0/1 [00:00<00:00, 4152.78it/s]  /Users/henrysenyondo/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/pytorch_lightning/trainer/data_loading.py:106: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 16 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  f"The dataloader, {name}, does not have many workers which may be a bottleneck."
/Users/henrysenyondo/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/pytorch_lightning/trainer/data_loading.py:327: UserWarning: The number of training samples (1) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  f"The number of training samples ({self.num_training_batches}) is smaller than the logging interval"
/Users/henrysenyondo/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/pytorch_lightning/trainer/data_loading.py:382: UserWarning: One of given dataloaders is None and it will be skipped.
  rank_zero_warn("One of given dataloaders is None and it will be skipped.")
[W ParallelNative.cpp:212] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
@henrykironde
Contributor Author

Some more error output from the terminal; the above was from RStudio.

> model$trainer$fit(model)

  | Name  | Type      | Params
------------------------------------
0 | model | RetinaNet | 32.1 M
------------------------------------
31.9 M    Trainable params
222 K     Non-trainable params
32.1 M    Total params
128.592   Total estimated model params size (MB)
/Users/henrysenyondo/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/pytorch_lightning/trainer/data_loading.py:327: UserWarning: The number of training samples (1) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  f"The number of training samples ({self.num_training_batches}) is smaller than the logging interval"
/Users/henrysenyondo/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/pytorch_lightning/trainer/data_loading.py:382: UserWarning: One of given dataloaders is None and it will be skipped.
  rank_zero_warn("One of given dataloaders is None and it will be skipped.")
Epoch 0:   0%|                                                                                                | 0/1 [00:00<00:00, 4782.56it/s][W ParallelNative.cpp:212] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:212] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:212] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:212] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
zsh: abort      R

@henrykironde
Contributor Author

Looks like there is a crash in the libiomp binary:
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
Some references:

  1. OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized. dmlc/xgboost#1715
  2. https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial

Worked for me after setting Sys.setenv("KMP_DUPLICATE_LIB_OK" = "TRUE").
We have to be careful, though: libiomp5.dylib and libomp.dylib may give us different results.
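The same unsafe workaround can be applied at the Python level with an environment variable. The key constraint (implied by the OMP error above) is that it must be set before torch or any other OpenMP-linked library is imported, because the runtime initializes at import time. A minimal sketch:

```python
import os

# Unsafe, unsupported workaround named in the OMP error message: allow
# duplicate OpenMP runtimes to coexist in one process. This line must run
# BEFORE importing torch (or anything else linked against OpenMP), since
# the runtime is initialized during import.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# ...only now import torch / deepforest and proceed as usual.
```

As the error message warns, this can crash or silently produce incorrect results; it is a stopgap, not a fix.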

@spono

spono commented Aug 18, 2022

same OMP issue on W10 when running model = df_model():

  • a plain crash in RStudio
  • a more informative message in R:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

How do you suggest we "ensure that only a single OpenMP runtime is linked into the process"?

Your solution using Sys.setenv("KMP_DUPLICATE_LIB_OK" = "TRUE") seems risky for actual use in a production environment (there's no way to know if and when it may cause issues).
Thanks in advance.
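One way to see whether an environment really does contain more than one OpenMP runtime (the condition behind OMP Error #15) is to scan it for the known library names. This is an illustrative, hypothetical helper — the find_openmp_libs function and the filename patterns are assumptions, not part of deepforestr or DeepForest:

```python
from pathlib import Path

# Filename patterns for the common OpenMP runtimes: Intel (libiomp5,
# libiomp5md on Windows), LLVM (libomp), and GNU (libgomp), covering
# .dylib/.so/.dll suffixes via the trailing wildcard.
OPENMP_PATTERNS = ("libiomp5*", "libomp.*", "libgomp*")

def find_openmp_libs(env_root):
    """Return all OpenMP runtime libraries found under an environment root."""
    root = Path(env_root)
    hits = set()
    for pattern in OPENMP_PATTERNS:
        hits.update(p for p in root.rglob(pattern) if p.is_file())
    return sorted(hits)
```

Running this over, e.g., ~/Library/r-miniconda/envs/r-reticulate and finding both an Intel and an LLVM copy confirms the duplicate-runtime diagnosis; the durable fix is then to remove whichever package drags in the second copy, rather than setting KMP_DUPLICATE_LIB_OK.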

@ethanwhite
Member

ethanwhite commented Mar 5, 2023

I've now fixed the OMP issue via a change in the installation instructions that removes the mkl package, which was causing this issue: e10c158

Can someone using macOS follow the new installation instructions and see if the rest of the issues reported here remain? I'm still seeing training issues on Windows, but things now work properly for predicting from the release model.

@mirandateats

Using macOS, I ran into the following issues during installation:

  1. reticulate::conda_remove('r-reticulate', packages = 'mkl') returned the following:

"+ '~/Library/r-miniconda/bin/conda' 'remove' '--yes' '--name' 'r-reticulate' 'mkl'
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed

PackagesNotFoundError: The following packages are missing from the target environment:

  • mkl

Error: Error 1 occurred removing conda environment r-reticulate"

After this error, I continued with the installation anyway...

  2. I had to run install.packages('devtools') (not included in the installation code) before running devtools::install_github('weecology/deepforestr')

  3. It seems that any code including the df_model() function crashes RStudio. Examples that have caused a crash:
    model <- df_model()
    deepforestr::df_model()

@ethanwhite
Member

Thanks for the report @mirandateats! Unfortunately we've had ongoing stability issues with reticulate (which is how we run the core Python package from within R) on non-Linux systems. We'll keep trying to address those issues, but at the moment my recommendation is to do the core DeepForest work using the Python package directly and then import the results to R for further analysis and visualization.
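The hand-off described here — run DeepForest in Python, then analyze in R — only needs a flat file as a bridge. A minimal sketch of the Python side, using only the standard library; the boxes_to_csv helper and its column names are illustrative assumptions (they mirror DeepForest's annotation format), not part of either package:

```python
import csv

def boxes_to_csv(boxes, path):
    """Write a list of bounding-box dicts to CSV for analysis in R.

    Intended use: collect DeepForest predictions (e.g. from
    model.predict_image(...)) as dicts, then hand them to R, where
    read.csv(path) picks them up directly.
    """
    fields = ["image_path", "xmin", "ymin", "xmax", "ymax", "label", "score"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(boxes)
```

On the R side, predictions <- read.csv("predictions.csv") then feeds the usual plotting and analysis tooling without reticulate being involved at all.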

@ethanwhite
Member

@mirandateats - it looks like some of the upstream issues have been resolved now and I have things running properly on Windows 10. Can you try a fresh install and let me know if you're still running into issues?

@ethanwhite
Member

@spono - after some upstream fixes everything seems to be working on Windows now. Can you try a fresh install and then see if the test code below runs:

library(deepforestr)

model = df_model()
model$use_release()

annotations_file = get_data("testfile_deepforest.csv")

model$config$train$csv_file = annotations_file
model$config$train$root_dir = get_data(".")

model$create_trainer()
model$trainer$fit(model)

@ethanwhite
Member

@henrykironde - can you test again on macOS since our upstream issues seem to be resolved now (at least on Windows)?

@robAndrus34

@henrykironde and @ethanwhite - I'm curious if you've resolved this issue. I ran into the same problem on macOS yesterday. After a basic install according to the directions on the website, RStudio crashed when I ran model = df_model().

Thank you.

@ethanwhite
Member

Thanks for the report @robAndrus34! We haven't managed to reproduce this locally, in part due to not having many Macs in the lab. If you have time to work with us on debugging on macOS, we'd be happy to do that. If you need to get something up and running quickly, it's pretty easy to do in Python even if you don't do much Python work. Let us know which direction you'd like to go and we'll be happy to help.

@robAndrus34

Thanks @ethanwhite. I decided to go the Python route for now. At some future date, I may be interested in troubleshooting the R issue. Thanks.

@ethanwhite
Member

Sounds good @robAndrus34 - let us know if you have any questions as you get things up and running in Python.

@ethanwhite
Member

This failure is now reflected in our failing macOS tests which may help us explore this further.
