Unable to import LaserEncoderPipeline #280

Closed
sumedhan-r opened this issue Mar 26, 2024 · 10 comments

Comments

@sumedhan-r

While calling LaserEncoderPipeline for the purpose of downstream NLP tasks, the first error that popped up was a ValueError, which stated

mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory

I asked ChatGPT about this and got code that slightly modifies the lines where the Config classes are declared.

After making that change to all of the Config-related classes, I got a different error, a ValidationError, which stated

Object of unsupported type: '_MISSING_TYPE'
full_key:
reference_type=None
object_type=None

Some screenshots of the errors are attached below. Please look into this at your earliest convenience.

[Screenshots: LASER 1_1, LASER 1_2, LASER 2_1, LASER 2_2, LASER 3_1, LASER 3_2]
@sumedhan-r
Author

sumedhan-r commented Mar 26, 2024

The suggestion given by ChatGPT is as follows :

    @dataclass
    class FairseqConfig(FairseqDataclass):
        common: CommonConfig = field(default_factory=CommonConfig)
        common_eval: CommonEvalConfig = field(default_factory=CommonEvalConfig)

@avidale
Contributor

avidale commented Apr 16, 2024

@sumedhan-r can you please indicate the versions of fairseq and omegaconf that you are using, and the minimal code required to reproduce the problem?

@dpdeb

dpdeb commented Apr 22, 2024

I am getting this error too. I was able to recreate it in Python 3.11 right from the import statement:
from laser_encoders import LaserEncoderPipeline

My sense is that ChatGPT's suggestion is on the right track. Specifically, there are a number of statements toward the end of fairseq/dataclass/configs.py that assign a mutable type as a default. Assigning these using the pattern field(default_factory=x) makes these errors go away. (See https://stackoverflow.com/questions/53632152/why-cant-dataclasses-have-mutable-defaults-in-their-class-attributes-declaratio for an additional explanation)

It appears the dubious pattern is still used through the latest version of fairseq, v0.12.2, which was released in 2022.
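
For illustration, here is a minimal sketch of the pattern in question. It uses a hypothetical stand-in `CommonConfig`, not fairseq's real class. It relies on the fact that Python 3.11 started rejecting any unhashable dataclass field default (earlier versions only rejected `list`, `dict`, and `set`), and a regular dataclass instance is unhashable. Whether `default_factory` alone is enough for fairseq itself is not established here, since the original report above hit a further omegaconf error after patching.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class CommonConfig:
    """Hypothetical stand-in for a nested config class (not fairseq's real one)."""
    seed: int = 1


def try_define_with_instance_default():
    """Declare a dataclass whose field default is a dataclass instance.

    Returns the error message if the declaration is rejected, else None.
    """
    try:
        @dataclass
        class BrokenConfig:
            # An instance used directly as a default: rejected on Python 3.11+.
            common: CommonConfig = CommonConfig()
    except ValueError as exc:
        return str(exc)
    return None


@dataclass
class FixedConfig:
    # default_factory builds a fresh CommonConfig for every FixedConfig instance.
    common: CommonConfig = field(default_factory=CommonConfig)


error_message = try_define_with_instance_default()
print("Python", sys.version_info[:2], "->", error_message)

a, b = FixedConfig(), FixedConfig()
a.common.seed = 42          # does not leak into b, each has its own CommonConfig
print(a.common.seed, b.common.seed)
```

On Python 3.10 and below the first declaration is accepted and `error_message` stays `None`; on 3.11 and above it holds the same "mutable default ... use default_factory" text quoted in this issue.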

@avidale
Contributor

avidale commented Apr 22, 2024

@dpdeb @sumedhan-r can you please indicate the versions of fairseq and omegaconf that cause the error?

When I install all packages from scratch (see this Colab notebook for repro), I get fairseq-0.12.2, omegaconf-2.0.6 and laser_encoders-0.0.1 installed by default, and they work fine.
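
To collect the versions being asked about without importing the packages (the import itself is what fails), a small sketch using the standard-library `importlib.metadata` could look like this. The package list is an assumption based on the names mentioned in this thread, and `collect_versions` is a hypothetical helper.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Assumed list of relevant distributions, taken from names in this thread.
PACKAGES = ("laser_encoders", "fairseq", "omegaconf", "hydra-core")


def collect_versions(packages=PACKAGES):
    """Return a dict of the Python version plus installed package versions."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            # Reads installed metadata only; the package is never imported.
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


for name, ver in collect_versions().items():
    print(f"{name}: {ver}")
```

Including the Python version in the report matters here, since the package versions alone can be identical between a working and a failing environment.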

@TheHappyLemon
Contributor

So, I have got the same error:

raise ex # set end OC_CAUSE=1 for full backtrace
^^^^^^^^
omegaconf.errors.ValidationError: Object of unsupported type: '_MISSING_TYPE'
full_key:
reference_type=None
object_type=None

I also changed the classes in fairseq\dataclass\configs.py and hydra\conf\__init__.py to use default_factory, and then received the error above. Has anyone come up with a fix? :)

@avidale
Contributor

avidale commented Apr 25, 2024

@TheHappyLemon how can I reproduce your error?
Can you please share a Colab notebook or something that reproduces it?

@TheHappyLemon
Contributor

TheHappyLemon commented Apr 25, 2024

@avidale
Well, I haven't done anything special. First, I installed laser_encoders in an Anaconda environment with pip install laser_encoders. Then, when I tried to just import the library, I got an error stating that I should use default_factory in some configs.py. So it is reproduced with just

from laser_encoders import LaserEncoderPipeline

Initially, I thought I had to install the newest version of the dependency fairseq. But I couldn't install fairseq at all, because I was getting the error FileNotFoundError: [Errno 2] No such file or directory: 'VERSION.txt'. There are multiple issues about this error, such as skrub-data/skrub#476. I also tried to install it from a local clone, but then I got the errors described in facebookresearch/demucs#423. So I decided to abandon this idea and fix the default_factory error instead. I made the fixes in miniconda3\Lib\site-packages\fairseq\dataclass\configs.py and in miniconda3\Lib\site-packages\hydra\conf\__init__.py, and then finally I got the error from my earlier comment.

I have following packages versions (ran pip install laser_encoders to get this info):
Requirement already satisfied: laser_encoders in c:\users\artem\miniconda3\lib\site-packages (0.0.1)
Requirement already satisfied: fairseq>=0.12.2 in c:\users\artem\miniconda3\lib\site-packages (from laser_encoders) (0.12.2)
Requirement already satisfied: omegaconf<2.1 in c:\users\artem\miniconda3\lib\site-packages (from fairseq>=0.12.2->laser_encoders) (2.0.6)

So I don't know what would be the best way to reproduce it. Maybe do a clean install? Idk :(

@TheHappyLemon
Contributor

@avidale

I just created a fresh conda virtual environment and ran pip install laser_encoders, which successfully installed the following packages:

Installing collected packages: tbb, sentencepiece, intel-openmp, bitarray, antlr4-python3-runtime, unicategories, portalocker, omegaconf, mkl, cython, torch, sacremoses, sacrebleu, hydra-core, torchaudio, fairseq, laser_encoders
Successfully installed antlr4-python3-runtime-4.8 bitarray-2.9.2 cython-3.0.10 fairseq-0.12.2 hydra-core-1.0.7 intel-openmp-2021.4.0 laser_encoders-0.0.1 mkl-2021.4.0 omegaconf-2.0.6 portalocker-2.8.2 sacrebleu-2.4.2 sacremoses-0.1.0 sentencepiece-0.2.0 tbb-2021.12.0 torch-2.3.0 torchaudio-2.3.0 unicategories-0.1.2

And after running from laser_encoders import LaserEncoderPipeline, I got:

raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory

@TheHappyLemon
Contributor

TheHappyLemon commented Apr 25, 2024

The error is reproduced on Windows 11 with this environment:
conda create -n test_env python=3.11.8 anaconda

@avidale
Contributor

avidale commented Apr 25, 2024

Apparently, Fairseq does not support Python 3.11 or newer; see e.g. facebookresearch/fairseq#5191.

Thus, there are 3 possible solutions for you:

  1. Downgrade your Python to 3.10 or below
  2. Fork Fairseq and fix the error (the pull request "Add support for Python3.11", fairseq#5359, might be what you need)
  3. Migrate from the LASER encoder (which uses Fairseq, which has already become pretty stale) to the SONAR encoder (which performs better and is based on Fairseq2, a package that currently enjoys better support).
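
Whichever option is chosen, a script can fail early with a readable message instead of the dataclass traceback. The sketch below assumes, based only on this thread, that fairseq 0.12.2 works on Python 3.10 and below and breaks on 3.11 and above. The helper names are hypothetical and not part of laser_encoders.

```python
import sys

# Assumption from this thread: fairseq 0.12.2 fails at import on Python >= 3.11.
FIRST_UNSUPPORTED = (3, 11)


def python_supports_fairseq(version_info=sys.version_info):
    """Return True if the given Python version is below the assumed cutoff."""
    return tuple(version_info[:2]) < FIRST_UNSUPPORTED


def load_laser_pipeline_class():
    """Import LaserEncoderPipeline, or raise a clear error on newer Pythons."""
    if not python_supports_fairseq():
        raise RuntimeError(
            "laser_encoders depends on fairseq, which reportedly does not "
            "support Python 3.11+; use Python 3.10 or below, a patched "
            "fairseq fork, or the SONAR encoder."
        )
    # Imported lazily so the version check runs before fairseq is touched.
    from laser_encoders import LaserEncoderPipeline
    return LaserEncoderPipeline


print(python_supports_fairseq((3, 10, 14)))  # True
print(python_supports_fairseq((3, 11, 8)))   # False
```

The cutoff is kept in one constant so it can be changed if a patched Fairseq fork is used.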

@avidale avidale closed this as completed Apr 25, 2024