NFL Boxscore: Document is Empty #729

Open
mikeyru123 opened this issue Jun 3, 2022 · 12 comments

@mikeyru123

mikeyru123 commented Jun 3, 2022

Describe the bug
Trying to pull boxscores for the 2021 NFL season raises an error saying "Document is empty".

To Reproduce
Sample code which causes an issue.

!pip install sportsipy

from sportsipy.nfl.boxscore import Boxscores, Boxscore

game_str = Boxscores(7,2021).games['7-2021'][0]['boxscore']
game_stats = Boxscore(game_str)
game_stats.dataframe

Expected behavior
I would like to see the boxscores of the games played in week 7 of the 2021 season.

software
Using Google Colab on Chrome


@pepaananen

I am having the same issue. I am unable to get any stats for a given boxscore.

@datasportslab

I am also having this issue. Any solution yet?

@CodeeMcCoderson

I am having a similar issue. The traceback is pasted below.

Traceback (most recent call last):
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\pyquery\pyquery.py", line 57, in fromstring
    result = getattr(etree, meth)(context)
  File "src\lxml\etree.pyx", line 3252, in lxml.etree.fromstring
  File "src\lxml\parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src\lxml\parser.pxi", line 1793, in lxml.etree._parseDoc
  File "src\lxml\parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src\lxml\parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src\lxml\parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src\lxml\parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\users\python scripts\SportsNFL.py", line 266, in <module>
    pred_games_df, comp_games_df = prep_test_train(current_week,weeks,year)
  File "C:\users\python scripts\SportsNFL.py", line 242, in prep_test_train
    game_data = Boxscore('202112190buf')
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\sportsipy\nfl\boxscore.py", line 296, in __init__
    self._parse_game_data(uri)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\sportsipy\nfl\boxscore.py", line 784, in _parse_game_data
    value = self.parse_name(short_field, boxscore)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\sportsipy\nfl\boxscore.py", line 447, in parse_name
    return pq(str(boxscore(scheme)).strip())
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\pyquery\pyquery.py", line 217, in __init__
    elements = fromstring(context, self.parser)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\pyquery\pyquery.py", line 61, in fromstring
    result = getattr(lxml.html, meth)(context)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\lxml\html\__init__.py", line 875, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\lxml\html\__init__.py", line 763, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty

@CodeeMcCoderson

In case you have not seen it yet, this was fixed in the following commit:

e2aabf3

@RichardSJTotten

RichardSJTotten commented Jul 14, 2022

I'm getting a similar error when attempting to access boxscores.

  File "src/lxml/etree.pyx", line 3252, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1793, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 663, in lxml.etree._raiseParseError
  File "<string>", line 3543
lxml.etree.XMLSyntaxError: line 3543: b'Tag use invalid'

@CodeeMcCoderson do you know how I can resolve this?

@CodeeMcCoderson

@RichardSJTotten my problem seemed almost identical to yours. I fixed it by editing the sportsipy module directly on my hard drive; I did not pull down the fix that I posted earlier.

Navigate to where your modules are stored on your local machine, find the 'sportsipy' package and go into it. Then go into the 'nfl' subpackage and open the 'constants.py' script.

Within that script, go to line 81 and change this line of code:
'home_name': 'a[itemprop="name"]:first',
To this:
'home_name': 'div[class="linescore_wrap"] table tbody tr:last td:nth-child(2)',

Next, go to line 84 and change this line of code:
'away_name': 'a[itemprop="name"]:last',
To this:
'away_name': 'div[class="linescore_wrap"] table tbody tr:first td:nth-child(2)',

After changing both lines, save the script, then go back to the script that was throwing the error and run it again.
It should work.

Let me know if anything is unclear or if it does not work, and I will try to help more.
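
For anyone who would rather not edit the installed file, the same two selector changes could in principle be applied at runtime. The following is only a sketch of that idea: `apply_overrides` and `example_scheme` are hypothetical names, and `example_scheme` is a stand-in that reproduces just the two entries quoted above, not the real dictionary from constants.py.

```python
# Selector replacements taken from the comment above.
SELECTOR_OVERRIDES = {
    'home_name': 'div[class="linescore_wrap"] table tbody tr:last td:nth-child(2)',
    'away_name': 'div[class="linescore_wrap"] table tbody tr:first td:nth-child(2)',
}


def apply_overrides(scheme, overrides=SELECTOR_OVERRIDES):
    """Update `scheme` in place and return the previous values of the changed keys."""
    previous = {key: scheme.get(key) for key in overrides}
    scheme.update(overrides)
    return previous


# Stand-in (assumption) for the dictionary in sportsipy/nfl/constants.py that
# holds these entries; only the two lines quoted in this thread are included.
example_scheme = {
    'home_name': 'a[itemprop="name"]:first',
    'away_name': 'a[itemprop="name"]:last',
}

old_values = apply_overrides(example_scheme)
print(example_scheme['home_name'])
```

To try this against the installed package, you would pass the real dictionary from `sportsipy.nfl.constants` before creating any `Boxscore`. The name of that dictionary is not given in this thread, so check constants.py for it. Because the update happens in place, modules that already imported the dictionary should see the change, but that is an untested assumption.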

@RichardSJTotten

Thanks @CodeeMcCoderson !! Really appreciate the response 👍

This fix worked for me after tinkering a little. Turns out I had to pip uninstall the package first and then install locally for it to work as expected.
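
If the edit seems to have no effect, it may help to check which copy of the package Python actually imports. A minimal sketch using only the standard library follows; `installed_location` is a hypothetical helper name, and the example call uses `json` so it runs anywhere. Swap in "sportsipy" on your own machine.

```python
import importlib.util


def installed_location(package_name):
    """Return the file Python would import `package_name` from, or None if it is not found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return None
    return spec.origin


# Example with a standard-library package; replace "json" with "sportsipy" locally.
print(installed_location("json"))
```

If the printed path is not inside the folder you edited, Python is importing a different copy, which would be consistent with having to uninstall and reinstall as described above.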

Do you know if @roclark is planning to merge any of the new commits that have been worked out that solve issues like this? Also @roclark - this package is great! Thanks for creating it.

@selvamshan

@RichardSJTotten my problem seemed almost identical to yours. I fixed it by editing the sportsipy module directly on my hard drive; I did not pull down the fix that I posted earlier.

Navigate to where your modules are stored on your local machine, find the 'sportsipy' package and go into it. Then go into the 'nfl' subpackage and open the 'constants.py' script.

Within that script, go to line 81 and change this line of code:
'home_name': 'a[itemprop="name"]:first',
To this:
'home_name': 'div[class="linescore_wrap"] table tbody tr:last td:nth-child(2)',

Next, go to line 84 and change this line of code:
'away_name': 'a[itemprop="name"]:last',
To this:
'away_name': 'div[class="linescore_wrap"] table tbody tr:first td:nth-child(2)',

After changing both lines, save the script, then go back to the script that was throwing the error and run it again. It should work.

Let me know if anything is unclear or if it does not work, and I will try to help more.

@khampel

khampel commented Sep 13, 2022

Thanks for the code adjustment, @selvamshan and @CodeeMcCoderson. I am also getting an empty document, but in my case the problem may be with SCHEDULE_SCHEME in the constants file. The URL still appears to work. I also made the changes to lines 81 and 84. Here are the code and the error. Thanks.

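import pandas as pd
from sportsipy.nfl.schedule import Schedule

# team_one was not defined in the original snippet; 'GNB' is assumed from the variable name below.
team_one = 'GNB'
GNB_schedule = Schedule(team_one)

# Collect each game's extended DataFrame, then concatenate once
# (DataFrame.append was removed in pandas 2.0).
frames = [game.dataframe_extended for game in GNB_schedule]
team_one_df_org = pd.concat(frames, ignore_index=True)

team_one_df_org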

src/lxml/etree.pyx in lxml.etree.fromstring()
src/lxml/parser.pxi in lxml.etree._parseMemoryDocument()
src/lxml/parser.pxi in lxml.etree._parseDoc()
src/lxml/parser.pxi in lxml.etree._BaseParser._parseUnicodeDoc()
src/lxml/parser.pxi in lxml.etree._ParserContext._handleParseResultDoc()
src/lxml/parser.pxi in lxml.etree._handleParseResult()
src/lxml/parser.pxi in lxml.etree._raiseParseError()

XMLSyntaxError: Document is empty, line 1, column 1 (<string>, line 1)

@Laneville

@CodeeMcCoderson It looks like this patch works for previous games that have already been played, but I am still getting this DocumentEmpty error for any games that have not been played yet

@ericmk52

@CodeeMcCoderson It looks like this patch works for previous games that have already been played, but I am still getting this DocumentEmpty error for any games that have not been played yet

The fix appears to work for all seasons prior to the current 2022 season.
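
Until unplayed games are handled by the library, one way to keep a loop running is to skip any game whose boxscore fails to load. This is a sketch only: `load_available` and `fake_loader` are hypothetical names, and `fake_loader` stands in for `Boxscore` so the example runs without sportsipy or network access.

```python
def load_available(game_ids, loader, expected_errors=(Exception,)):
    """Call loader(game_id) for each id and return (loaded, skipped)."""
    loaded, skipped = {}, []
    for game_id in game_ids:
        try:
            loaded[game_id] = loader(game_id)
        except expected_errors as error:
            # Record the id and the message instead of stopping the whole loop.
            skipped.append((game_id, str(error)))
    return loaded, skipped


# Hypothetical stand-in for Boxscore: fails for one "unplayed" game.
def fake_loader(game_id):
    if game_id == "unplayed":
        raise ValueError("Document is empty")
    return {"id": game_id}


loaded, skipped = load_available(["202112190buf", "unplayed"], fake_loader)
print(sorted(loaded), skipped)
```

With sportsipy, `loader` would presumably be `Boxscore`, and `expected_errors` could be narrowed to the lxml exceptions shown in the tracebacks in this thread rather than catching every `Exception`.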

@calebhacala

@CodeeMcCoderson is this fix still relevant? I tried it but am still getting the same error.
