Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a possible fix to historical quotes #48

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

a possible fix to historical quotes #48

wants to merge 4 commits into from

Conversation

amalek215
Copy link

Since things broke when the API changed I wrote this.
It relies on the fact that the web page has all the data in json objects.
So I pattern match to grab the json fragments for the quotes (so a bit brittle).

I do this instead of the download link b/c the download link is generated via javascript and requires cookie handling... so while the data is better formatted getting to it requires reverse engineering, so brittle in a different way.

@amalek215
Copy link
Author

so works in python 2.7 but not 3.x

content = str(resp.read().decode('utf-8').strip())
daily_data = content.splitlines()
content = resp.read()
quotes = re.findall('{"date":\d+[^}]+}', content)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you would need to call .decode() on content before passing it to re.findall.
It seems that you would also have to check if it's Python 3 and call .decode() only if it's Python 3. There may be other, more elegant, way of doing this -- I'm not used to port Python code.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes adding .decode('utf-8').strip() seems to get past this

for quote in quotes:
j = json.loads(quote)
for k in ('open', 'close', 'high', 'low', 'unadjclose'):
j[k] = "%.2f" % j[k]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ystockquote.get_historical_prices('GOOGL', '2006-01-01', '2017-05-29') would throw KeyError: 'open' at this line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starforever I do not get that error either in my own tests or running the test suite.
(note I have not tested in Python 3.x)
Does it happen in Python 3.x?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using Python 2.7. I'm not sure if you are using the exact same arguments. It only happens for some specific symbol and time ranges. I haven't got time to dive deep on this. So no idea how it happens.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing it doesn't throw the error in the tests since the specific arguments are not used in tests. I silently caught the exception and it worked fine then. But I think this is only a quick fix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I see the issue, when the quote/date range includes a split or dividend the pattern match pulls it in but there is no 'open', 'close' etc.
I think I may add to the tests a request that includes a split and will hard code some values to also check for the adj/un adj case you noted in another comment

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yahoo's current data does not include un-adjusted prices anymore
All prices are adjusted for splits
I had to stop using yahoo a while ago b/c of this

@cgoldberg cgoldberg self-requested a review May 30, 2017 15:26
@cgoldberg cgoldberg self-assigned this May 30, 2017
ystockquote.py Outdated
@@ -13,7 +13,7 @@
# Requires: Python 2.7/3.3+


__version__ = '0.2.6dev' # NOQA
__version__ = '0.2.7dev' # NOQA
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should still be 0.2.6dev.. next release will be 0.2.6 (assuming we can fix this) .... and afterwards, 0.2.7dev will open

ystockquote.py Outdated
@@ -24,6 +24,9 @@
# py2
from urllib2 import Request, urlopen
from urllib import urlencode
from datetime import datetime, timedelta
import re
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphabetize imports says PEP8

content = str(resp.read().decode('utf-8').strip())
daily_data = content.splitlines()
content = resp.read()
quotes = re.findall('{"date":\d+[^}]+}', content)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes adding .decode('utf-8').strip() seems to get past this

d = timedelta(seconds=j["date"])
# these keys are for backwards compatibility
for key in j.keys():
j[key.capitalize()] = j[key]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im getting an error running the tests on this line. not sure what you are trying to do here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at #52

ystockquote.py Outdated
# these keys are for backwards compatibility
for key in j.keys():
j[key.capitalize()] = j[key]
j['Adj Close'] = j['unadjclose']
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unadjclose means the real price, while Adj Close should be adjusted price w.r.t. splits and dividends. Assigning one to the other would lead to confusion and potential incorrect results.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks starforever that is a straight up bug.
The issue is w/ the api change "close" used to be the unadjusted close but now it is the adjusted close.
I will adjust shortly.

from cgoldberg: "this should still be 0.2.6dev.. next release will be 0.2.6"
cgoldberg: "alphabetize imports says PEP8"
use the lower case keys to get the "new" values
Not sure if under the old api "High" / "Low" was adjusted or not
If so more work will need to be done to be backwards compatible
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants