
Commit

release
jph00 committed Mar 31, 2021
1 parent 7c9cefa commit d28a027
Showing 2 changed files with 285 additions and 324 deletions.
259 changes: 4 additions & 255 deletions CHANGELOG.md
@@ -2,271 +2,20 @@

<!-- do not remove -->

## 2.2.8
## 2.3.0
### Breaking Changes

- fix optimwrapper to work with param_groups (closes #2829) ([#3241](https://github.com/fastai/fastai/pull/3241)), thanks to [@tmabraham](https://github.com/tmabraham)
- Currently, `OptimWrapper` does not support `param_groups`, preventing the use of model freezing/unfreezing and discriminative learning rates.

This PR incorporates @muellerzr's [fix](https://muellerzr.github.io/fastai_minima/optimizer.html#Differential-Learning-Rates-and-Groups-with-Pytorch-Optimizers) into `OptimWrapper`. In doing so, the usage of `OptimWrapper` changes slightly; for example, from:
```
def opt_func(params, lr, **kwargs): return OptimWrapper(torch.optim.SGD(params, lr=lr))
```
to
```
def opt_func(params, **kwargs): return OptimWrapper(torch.optim.SGD, params, **kwargs)
```

This PR also resolves issue #2829.
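For example, with the new signature the wrapped optimizer plugs straight into a `Learner` (a minimal sketch, assuming a `dls` such as the PETS `ImageDataLoaders` built further down in these notes):
```python
from fastai.vision.all import *

def opt_func(params, **kwargs): return OptimWrapper(torch.optim.SGD, params, **kwargs)

# `dls` is assumed to be any ImageDataLoaders, e.g. the PETS example shown below
learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=opt_func)
learn.fit_one_cycle(1, lr_max=slice(1e-5, 1e-3))  # discriminative learning rates now work
```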
- fix optimwrapper to work with `param_groups` ([#3241](https://github.com/fastai/fastai/pull/3241)), thanks to [@tmabraham](https://github.com/tmabraham)
- OptimWrapper now has a different constructor signature, which makes it easier to wrap PyTorch optimizers

### New Features

- Support discriminative learning with OptimWrapper ([#2829](https://github.com/fastai/fastai/issues/2829))
- Currently, the following code gives an error:
```python
from fastai.vision.all import *

def SGD_opt(params, **kwargs): return OptimWrapper(torch.optim.SGD(params, **kwargs))

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=SGD_opt)
learn.fit_one_cycle(1)
```

The error is as follows:
```python
TypeError Traceback (most recent call last)
<ipython-input-133-20a3ebb82957> in <module>
10 label_func=is_cat, item_tfms=Resize(224))
11
---> 12 learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=SGD_opt)
13 learn.fit_one_cycle(1)

~/miniconda3/lib/python3.8/site-packages/fastcore/logargs.py in _f(*args, **kwargs)
50 log_dict = {**func_args.arguments, **{f'{k} (not in signature)':v for k,v in xtra_kwargs.items()}}
51 log = {f'{f.__qualname__}.{k}':v for k,v in log_dict.items() if k not in but}
---> 52 inst = f(*args, **kwargs) if to_return else args[0]
53 init_args = getattr(inst, 'init_args', {})
54 init_args.update(log)

~/miniconda3/lib/python3.8/site-packages/fastai/vision/learner.py in cnn_learner(dls, arch, loss_func, pretrained, cut, splitter, y_range, config, n_out, normalize, **kwargs)
175 model = create_cnn_model(arch, n_out, ifnone(cut, meta['cut']), pretrained, y_range=y_range, **config)
176 learn = Learner(dls, model, loss_func=loss_func, splitter=ifnone(splitter, meta['split']), **kwargs)
--> 177 if pretrained: learn.freeze()
178 return learn
179

~/miniconda3/lib/python3.8/site-packages/fastai/learner.py in freeze(self)
513
514 @patch
--> 515 def freeze(self:Learner): self.freeze_to(-1)
516
517 @patch

~/miniconda3/lib/python3.8/site-packages/fastai/learner.py in freeze_to(self, n)
508 @patch
509 def freeze_to(self:Learner, n):
--> 510 if self.opt is None: self.create_opt()
511 self.opt.freeze_to(n)
512 self.opt.clear_state()

~/miniconda3/lib/python3.8/site-packages/fastai/learner.py in create_opt(self)
139 def _bn_bias_state(self, with_bias): return norm_bias_params(self.model, with_bias).map(self.opt.state)
140 def create_opt(self):
--> 141 self.opt = self.opt_func(self.splitter(self.model), lr=self.lr)
142 if not self.wd_bn_bias:
143 for p in self._bn_bias_state(True ): p['do_wd'] = False

<ipython-input-133-20a3ebb82957> in SGD_opt(params, **kwargs)
1 from fastai.vision.all import *
2
----> 3 def SGD_opt(params, **kwargs): return OptimWrapper(torch.optim.SGD(params, **kwargs))
4
5 path = untar_data(URLs.PETS)/'images'

~/miniconda3/lib/python3.8/site-packages/torch/optim/sgd.py in __init__(self, params, lr, momentum, dampening, weight_decay, nesterov)
66 if nesterov and (momentum <= 0 or dampening != 0):
67 raise ValueError("Nesterov momentum requires a momentum and zero dampening")
---> 68 super(SGD, self).__init__(params, defaults)
69
70 def __setstate__(self, state):

~/miniconda3/lib/python3.8/site-packages/torch/optim/optimizer.py in __init__(self, params, defaults)
49
50 for param_group in param_groups:
---> 51 self.add_param_group(param_group)
52
53 def __getstate__(self):

~/miniconda3/lib/python3.8/site-packages/torch/optim/optimizer.py in add_param_group(self, param_group)
208 for param in param_group['params']:
209 if not isinstance(param, torch.Tensor):
--> 210 raise TypeError("optimizer can only optimize Tensors, "
211 "but one of the params is " + torch.typename(param))
212 if not param.is_leaf:

TypeError: optimizer can only optimize Tensors, but one of the params is list
```

The error occurs because PyTorch optimizers expect the parameter list in the format `list(dict(params='model_parameters'))`, whereas fastai passes `list(list(params='model_parameters'))`.
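To make the difference concrete, here is a standalone sketch (the toy `nn.Sequential` model is only for illustration and not from the issue):
```python
import torch
import torch.nn as nn

# Toy model with two parameter groups, purely for illustration
model = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 1))
g1, g2 = list(model[0].parameters()), list(model[1].parameters())

# What PyTorch expects: a flat list of tensors, or a list of dicts with a 'params' key
opt = torch.optim.SGD([{'params': g1, 'lr': 1e-3}, {'params': g2, 'lr': 1e-2}], lr=1e-3)

# What fastai's splitter produces: a list of lists of tensors, which raises the
# "optimizer can only optimize Tensors" TypeError shown above
# torch.optim.SGD([g1, g2], lr=1e-3)
```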

It was also verified that the error does not occur when discriminative learning is not used.

One possible solution is to update the splitter to output a `list(dict)`, as shown below:
```python
def splitter(m):
    # split the model into three parameter groups
    ps = L(m[0][:3], m[0][3:], m[1:]).map(params)
    # wrap each group in a dict, as PyTorch optimizers expect
    param_list = []
    for p in ps: param_list.append(dict(params=p))
    return param_list
```

I don't know if this is the best solution, but for now it gets the job done.

Working code:
```python
from fastai.vision.all import *

def SGD_opt(params, **kwargs): return OptimWrapper(torch.optim.SGD(params, **kwargs))

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=SGD_opt, splitter=splitter)
learn.fit_one_cycle(1)
```
Maybe the splitter code can be made part of `OptimWrapper`, but I am not sure.

### Bugs Squashed

- Updated to support adding transforms to multiple dataloaders ([#3268](https://github.com/fastai/fastai/pull/3268)), thanks to [@marii-moe](https://github.com/marii-moe)
- Fixes the issue here: https://forums.fast.ai/t/performance-degradation-between-fastai-2-2-5-and-2-2-7/86069

We add `add_tfms` to support adding transforms to both dataloaders. Why?

Previously `dls.train.after_batch.fs == dls.valid.after_batch.fs`, so adding a transform to the training dataloader automatically added it to the validation dataloader. Removing that requirement broke validation, so this PR adds that functionality back. (This moves my analysis from Discord into the forum thread.)
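As a rough illustration of that shared-pipeline assumption (a sketch only; `dls` is assumed to be an `ImageDataLoaders` such as the PETS example earlier):
```python
# Sketch of the check described above, not new API
shared = dls.train.after_batch.fs == dls.valid.after_batch.fs
print(shared)  # previously True, so a tfm added to the train loader also ran on valid
```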

- BrokenProcessPool in fastai.vision.utils.download_images() on Windows ([#3196](https://github.com/fastai/fastai/issues/3196))

**Describe the bug**
<details><summary>When calling `fastai.vision.utils.download_images()` with a URL list and destination Path, some students of mine* who were running on Windows received a BrokenProcessPool exception (click to expand):</summary>

```
~\Documents\hands on ai\Notebook 7\u7_utils.py in download_all_images(path)
105 dest = path/classname
106 dest.mkdir(exist_ok=True)
--> 107 download_images(dest, url_file=csv, max_pics=200)
108 print("Done.")
109
~\miniconda3\envs\handsoai\lib\site-packages\fastai\vision\utils.py in download_images(dest, url_file, urls, max_pics, n_workers, timeout, preserve_filename)
35 dest = Path(dest)
36 dest.mkdir(exist_ok=True)
---> 37 parallel(partial(_download_image_inner, dest, timeout=timeout, preserve_filename=preserve_filename),
38 list(enumerate(urls)), n_workers=n_workers)
39
~\miniconda3\envs\handsoai\lib\site-packages\fastcore\parallel.py in parallel(f, items, n_workers, total, progress, pause, threadpool, timeout, chunksize, *args, **kwargs)
104 if total is None: total = len(items)
105 r = progress_bar(r, total=total, leave=False)
--> 106 return L(r)
107
108 # Cell
~\miniconda3\envs\handsoai\lib\site-packages\fastcore\foundation.py in __call__(cls, x, *args, **kwargs)
95 def __call__(cls, x=None, *args, **kwargs):
96 if not args and not kwargs and x is not None and isinstance(x,cls): return x
---> 97 return super().__call__(x, *args, **kwargs)
98
99 # Cell
~\miniconda3\envs\handsoai\lib\site-packages\fastcore\foundation.py in __init__(self, items, use_list, match, *rest)
103 def __init__(self, items=None, *rest, use_list=False, match=None):
104 if (use_list is not None) or not is_array(items):
--> 105 items = listify(items, *rest, use_list=use_list, match=match)
106 super().__init__(items)
107
~\miniconda3\envs\handsoai\lib\site-packages\fastcore\basics.py in listify(o, use_list, match, *rest)
54 elif isinstance(o, list): res = o
55 elif isinstance(o, str) or is_array(o): res = [o]
---> 56 elif is_iter(o): res = list(o)
57 else: res = [o]
58 if match is not None:
~\miniconda3\envs\handsoai\lib\concurrent\futures\process.py in _chain_from_iterable_of_lists(iterable)
482 careful not to keep references to yielded objects.
483 """
--> 484 for element in iterable:
485 element.reverse()
486 while element:
~\miniconda3\envs\handsoai\lib\concurrent\futures\_base.py in result_iterator()
609 # Careful not to keep a reference to the popped future
610 if timeout is None:
--> 611 yield fs.pop().result()
612 else:
613 yield fs.pop().result(end_time - time.monotonic())
~\miniconda3\envs\handsoai\lib\concurrent\futures\_base.py in result(self, timeout)
437 raise CancelledError()
438 elif self._state == FINISHED:
--> 439 return self.__get_result()
440 else:
441 raise TimeoutError()
~\miniconda3\envs\handsoai\lib\concurrent\futures\_base.py in __get_result(self)
386 def __get_result(self):
387 if self._exception:
--> 388 raise self._exception
389 else:
390 return self._result
BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
```
</details>

**To Reproduce**
Steps to reproduce the behavior:
1. Create a list of URLs, such as:
```python
urls = ['https://www.bk.com/sites/default/files/03156-5%20EggnormousBurrito%20500x540_CR.png',
'https://s3.eu-central-1.amazonaws.com/food-truck-data-eu-central-1/media/operator/app/a3c936cad578f6e01b6b54809fc10f1e.jpg',
'http://www.noracooks.com/wp-content/uploads/2018/03/IMG_6057-2.jpg',
'https://www.nahundfrisch.at/cache/images/recipes/resizecrop_mm_h370_w782_q100/0ae000020b896f9015780ed9fd21b168.jpg']
```
2. Import `download_images`:
```python
from fastai.vision.utils import download_images
```
3. Run it:
```python
from pathlib import Path
download_images(Path('.'), urls=urls)
```

**Expected behavior**
The code should download all images, irrespective of the platform it is run on.

**Error with full stack trace**
At least for some versions of Windows, it fails with the stack trace given above. As I lack access to Windows machines, I unfortunately don't know exactly which circumstances must be met, but at least 2 in 250 people were affected.

**Additional context**
Judging from https://stackoverflow.com/questions/43836876/processpoolexecutor-works-on-ubuntu-but-fails-with-brokenprocesspool-when-r and https://bugs.python.org/issue17874, this is a general problem with the `ProcessPoolExecutor` running from an interactive shell. Using a `ThreadPoolExecutor` would be an easy fix. As suggested in https://github.com/fastai/fastai/issues/978, this would require adding `threadpool=True` to the call in https://github.com/fastai/fastai/blob/0e01131/fastai/vision/utils.py#L37. (Sorry, I haven't looked into how to submit a PR to fastai, given that all the code seems to be generated from notebooks, so I'm just reporting the problem and a potential fix here.)
Pinging @jph00 since he wanted to be informed.
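If someone wants to try the suggested change locally, a hypothetical workaround sketch (not the merged fix) would route the downloads through fastcore's thread pool; it reuses the private `_download_image_inner` helper visible in the stack trace and the `urls` list from the reproduction steps:
```python
# Hypothetical workaround sketch, not the merged fix: pass threadpool=True so fastcore
# uses a ThreadPoolExecutor instead of a ProcessPoolExecutor.
from pathlib import Path
from functools import partial
from fastcore.parallel import parallel
from fastai.vision.utils import _download_image_inner  # private helper seen in the trace above

dest = Path('images'); dest.mkdir(exist_ok=True)
# `urls` is the list from the reproduction steps above
parallel(partial(_download_image_inner, dest), list(enumerate(urls)),
         n_workers=4, threadpool=True)  # threads avoid the BrokenProcessPool on Windows
```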

(\* In case you're wondering, I've used the image downloader and cnn_learner in a university class to demonstrate what's possible with transfer learning and what isn't, and am of course giving proper credit to the fastai library and the corresponding [fastbook chapter](https://github.com/fastai/fastbook/blob/master/02_production.ipynb)! Thank you for the great work!)
- This fixes an issue in 2.2.7 which resulted in incorrect validation metrics when using Normalization


## 2.2.7
