[Bug]: sec-filings reader load_data() TypeError #13598

Closed
fdejax90 opened this issue May 20, 2024 · 4 comments · Fixed by #13602
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

@fdejax90

Bug Description

from llama_index.readers.sec_filings import SECFilingsLoader

loader = SECFilingsLoader(tickers=["TSLA"], amount=1, filing_type="10-K", num_workers=1)

TypeError: fake_decorator.<locals>.inner() got an unexpected keyword argument 'company'

Version

0.1.19

Steps to Reproduce

pyenv local 3.11.9

pip install --upgrade pipenv

pipenv run pip install llama-index-readers-sec-filings


```python
from llama_index.readers.sec_filings import SECFilingsLoader

loader = SECFilingsLoader(tickers=["TSLA"], amount=1, filing_type="10-K", num_workers=1)
loader.load_data()
```

Relevant Logs/Tracebacks

Started for TSLA
---------------------------------------------------------------------------
_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/floriandejax/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floriandejax/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/process.py", line 210, in _process_chunk
    return [fn(*args) for args in chunk]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floriandejax/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/process.py", line 210, in <listcomp>
    return [fn(*args) for args in chunk]
            ^^^^^^^^^
  File "/Users/floriandejax/.pyenv/versions/3.11.9/lib/python3.11/site-packages/llama_index/readers/sec_filings/sec_filings.py", line 243, in get_text_from_url
    text = self.get_filing(
           ^^^^^^^^^^^^^^^^
TypeError: fake_decorator.<locals>.inner() got an unexpected keyword argument 'company'
"""

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 loader.load_data()

File ~/.pyenv/versions/3.11.9/lib/python3.11/site-packages/llama_index/readers/sec_filings/base.py:77, in SECFilingsLoader.load_data(self)
     72 with concurrent.futures.ThreadPoolExecutor(
     73     max_workers=thread_workers
     74 ) as executor:
     75     results = executor.map(self.multiprocess_run, self.tickers)
---> 77 for res in results:
     78     curr_tic = next(iter(res.keys()))
     79     for data in res[curr_tic]:

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:619, in Executor.map.<locals>.result_iterator()
    616 while fs:
    617     # Careful not to keep a reference to the popped future
    618     if timeout is None:
--> 619         yield _result_or_cancel(fs.pop())
    620     else:
    621         yield _result_or_cancel(fs.pop(), end_time - time.monotonic())

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:317, in _result_or_cancel(***failed resolving arguments***)
    315 try:
    316     try:
--> 317         return fut.result(timeout)
    318     finally:
    319         fut.cancel()

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:449, in Future.result(self, timeout)
    447     raise CancelledError()
    448 elif self._state == FINISHED:
--> 449     return self.__get_result()
    451 self._condition.wait(timeout)
    453 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:401, in Future.__get_result(self)
    399 if self._exception:
    400     try:
--> 401         raise self._exception
    402     finally:
    403         # Break a reference cycle with the exception in self._exception
    404         self = None

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/thread.py:58, in _WorkItem.run(self)
     55     return
     57 try:
---> 58     result = self.fn(*self.args, **self.kwargs)
     59 except BaseException as exc:
     60     self.future.set_exception(exc)

File ~/.pyenv/versions/3.11.9/lib/python3.11/site-packages/llama_index/readers/sec_filings/base.py:57, in SECFilingsLoader.multiprocess_run(self, tic)
     53 with concurrent.futures.ProcessPoolExecutor(
     54     max_workers=self.num_workers
     55 ) as executor:
     56     results = executor.map(self.se.get_text_from_url, field_urls)
---> 57 for idx, res in enumerate(results):
     58     all_text, filing_type = res
     59     text_dict[tic].append(
     60         {
     61             "year": years[idx],
   (...)
     65         }
     66     )

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/process.py:620, in _chain_from_iterable_of_lists(iterable)
    614 def _chain_from_iterable_of_lists(iterable):
    615     """
    616     Specialized implementation of itertools.chain.from_iterable.
    617     Each item in *iterable* should be a list.  This function is
    618     careful not to keep references to yielded objects.
    619     """
--> 620     for element in iterable:
    621         element.reverse()
    622         while element:

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:619, in Executor.map.<locals>.result_iterator()
    616 while fs:
    617     # Careful not to keep a reference to the popped future
    618     if timeout is None:
--> 619         yield _result_or_cancel(fs.pop())
    620     else:
    621         yield _result_or_cancel(fs.pop(), end_time - time.monotonic())

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:317, in _result_or_cancel(***failed resolving arguments***)
    315 try:
    316     try:
--> 317         return fut.result(timeout)
    318     finally:
    319         fut.cancel()

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:449, in Future.result(self, timeout)
    447     raise CancelledError()
    448 elif self._state == FINISHED:
--> 449     return self.__get_result()
    451 self._condition.wait(timeout)
    453 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:

File ~/.pyenv/versions/3.11.9/lib/python3.11/concurrent/futures/_base.py:401, in Future.__get_result(self)
    399 if self._exception:
    400     try:
--> 401         raise self._exception
    402     finally:
    403         # Break a reference cycle with the exception in self._exception
    404         self = None

TypeError: fake_decorator.<locals>.inner() got an unexpected keyword argument 'company'

fdejax90 added the bug and triage labels May 20, 2024

dosubot bot commented May 20, 2024

The error you're encountering, where fake_decorator.<locals>.inner() got an unexpected keyword argument 'company', suggests there's a mismatch between the expected arguments in a function call within the SECFilingsLoader class or its dependencies and the arguments being passed. This issue typically arises when a function is called with an argument it does not expect or require.

To address this issue, ensure that any function within the SECFilingsLoader or its dependencies that is being called with the 'company' argument actually expects such an argument. It's possible that the error stems from an internal inconsistency where a decorator or function has been updated without proper adjustments to its calls.

Given the traceback, the problem seems to be rooted in how get_text_from_url or a similar method is implemented or called. If you have made custom modifications to the library or are using a version of the library that has known issues with Python 3.11.9, consider reviewing these changes or testing with a different Python version. The llama-index-readers-sec-filings package specifies compatibility with Python versions 3.9.x and 3.10.x, so using Python 3.11.9 might introduce unexpected behavior.

If you haven't already, checking for any updates to the llama-index-readers-sec-filings package or related dependencies could also be beneficial. There might have been a fix released for the issue you're experiencing.

Lastly, ensure that your environment matches the package's requirements, including the Python version and any other dependencies. If the issue persists after these checks, it might be helpful to reach out to the maintainers of the llama-index-readers-sec-filings package for more targeted support or to report a potential bug.
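The environment check suggested above can be done from Python with the standard library. This is a minimal sketch, not part of the reader itself: `describe_distribution` is a hypothetical helper name, and it only reports what the installed distribution's metadata declares.

```python
import sys
from importlib import metadata


def describe_distribution(name):
    """Return version, declared Python range and requirements, or None if not installed."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "version": dist.version,
        # Requires-Python is optional metadata, so this may be None.
        "requires_python": dist.metadata.get("Requires-Python"),
        # dist.requires is None when no requirements are declared.
        "requires": dist.requires or [],
    }


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    print(describe_distribution("llama-index-readers-sec-filings"))
```

Comparing the `requires` list with what is actually installed can hint at dependencies the package uses but does not declare.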


About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


@fdejax90
Author

fdejax90 commented May 20, 2024

@dosu The llama-index-readers-sec-filings package specifies compatibility with Python versions >=3.9,<3.12

dosubot bot commented May 20, 2024

Given the compatibility range you mentioned, it seems the issue is not directly related to the Python version. The error might be due to a recent change in the package or an oversight in the handling of arguments within the get_text_from_url method or its related decorators.

A potential next step is to review the implementation of get_text_from_url and any decorators it uses to ensure they correctly handle the 'company' argument. If this method or its decorators have been modified in a recent update, there could be a regression or change that hasn't been fully integrated across the package.

If you're comfortable diving into the source code, you could look for where get_text_from_url is defined and check how the 'company' argument is used or expected. This might give you insight into whether the issue is a simple typo, a missing argument in a function call, or a more complex integration problem.

If this doesn't resolve the issue or if you're unsure about making these changes, I recommend reaching out for support directly through the package's issue tracker or support channels. Providing them with the detailed traceback and explaining the steps you've already taken can help them diagnose the problem more effectively. They might also be aware of a recent change that could have introduced this issue and could offer a workaround or patch.
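One way to do the check described above without reading the whole source is to ask Python which keyword arguments a callable accepts. The sketch below uses `inspect.signature`. The two `stand_in_*` functions are hypothetical placeholders, not the reader's real methods, and their parameter lists are assumptions made for illustration.

```python
import inspect

# Parameter kinds that can be supplied as name=value.
_KEYWORD_KINDS = (
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
    inspect.Parameter.KEYWORD_ONLY,
)


def accepts_keyword(func, keyword):
    """Return True if `func` can be called with `keyword=...`."""
    for param in inspect.signature(func).parameters.values():
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True  # a **kwargs parameter accepts any keyword
        if param.name == keyword and param.kind in _KEYWORD_KINDS:
            return True
    return False


def stand_in_get_filing(url, company, email):
    """Hypothetical function that expects a `company` argument."""
    return url, company, email


def stand_in_inner(func):
    """Hypothetical decorator internals that only expect a function."""
    return func


if __name__ == "__main__":
    print(accepts_keyword(stand_in_get_filing, "company"))
    print(accepts_keyword(stand_in_inner, "company"))
```

Running the same check on the object that `get_filing` is actually bound to at runtime would show whether it is still the original function or something a decorator put in its place.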


@logan-markewich
Collaborator

logan-markewich commented May 20, 2024

sec-filings had some weird hidden dependencies. I added them to the actual dependencies.

This error is solved with that change (but tbh, I just get a 403, but that's outside of the scope; this reader seems quite hacky haha)
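The thread does not show the reader's source, so the following is only a guess at how a missing dependency could produce this exact message. It assumes a fallback named `fake_decorator` stands in for a decorator from a package that failed to import, and that the stand-in is applied without parentheses. Every name other than `fake_decorator` and `inner` (which appear in the error) is hypothetical.

```python
def fake_decorator(*args, **kwargs):
    """Assumed stand-in used when the real decorator's package is not installed."""
    def inner(func):
        return func
    return inner


# Called as a factory (with parentheses), the stand-in is harmless:
# inner(fetch_ok) returns fetch_ok unchanged.
@fake_decorator(calls=10, period=1)
def fetch_ok(url, company=None):
    return f"{company}:{url}"


# Applied bare, fake_decorator receives the function itself and returns
# `inner`, so the name fetch_broken is now bound to `inner`.
@fake_decorator
def fetch_broken(url, company=None):
    return f"{company}:{url}"


def call_and_report(func):
    """Call func the way a caller passing `company=` would, reporting any TypeError."""
    try:
        return func("https://example.invalid/filing", company="TSLA")
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(call_and_report(fetch_ok))
    print(call_and_report(fetch_broken))
```

In this sketch the second call fails because `inner` accepts only a single positional `func`. Under that assumption, declaring and installing the missing package means the real decorator is imported instead of the stand-in, which would match the fix described above.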
