Dtype returned from get_next_message #14

Open
cjelsa opened this issue Jun 6, 2020 · 1 comment


cjelsa commented Jun 6, 2020

I am trying to parse an IEX file and (eventually) fill a database with tick data. I think the best way to do this is to build a DataFrame first, which also gives me more control over sizes, batches, and writes to the DB. I am just running some tests, but the DataFrame does not come out the way I would expect, especially the timestamp field. I would like it to be either a Timestamp or datetime64[ns], and to be the index, so that I can fill the time-series DB.

import IEXTools
from datetime import datetime
from IEXTools import Parser, messages
import pandas as pd

p = Parser(r'/Users/XXXX/Projects/XXXX/XXXXX/IEX_data/20200604_IEXTP1_TOPS1.6.pcap')

allowed = [messages.QuoteUpdate]

i=1
df = pd.DataFrame(columns=('timestamp', 'symbol', 'bid_size', 'bid_price', 'ask_price', 'ask_size'))
df = df.astype({'timestamp': 'datetime64[ns]', 'symbol': 'str', 'bid_size': 'int64', 'bid_price': 'float', 'ask_price': 'float', 'ask_size': 'int64'})
df = df.set_index(['timestamp'])
for i in range(5000):
    msg = p.get_next_message(allowed)
    df = df.append({'timestamp': msg.timestamp, 'symbol': msg.symbol, 'bid_size': msg.bid_size, 'bid_price': msg.bid_price_int/1000, 'ask_price': msg.ask_price_int/1000, 'ask_size': msg.ask_size}, ignore_index='True')

    print(msg.symbol)
    print(msg.timestamp)
    print(i)
    i += 1

df

output:

     symbol  bid_size  bid_price  ask_price  ask_size     timestamp
0      BOIL       100      351.4      352.5       100  1.591273e+18
1      KOLD       100      585.8      587.8       100  1.591273e+18
2       FXE      4000     1060.7     1063.1      4000  1.591273e+18
3       FXE      4000     1060.7     1063.0      4000  1.591273e+18
4       FXB      1000     1213.0     1216.3      1000  1.591273e+18
...     ...       ...        ...        ...       ...           ...
4995   MARK         0        0.0        0.0         0  1.591273e+18
4996    PCG         0        0.0        0.0         0  1.591273e+18
4997   CIDM       300       25.9        0.0         0  1.591273e+18
4998    RSX       100      211.2      212.4       100  1.591273e+18
4999    UCO       100      266.8      267.5       100  1.591273e+18

df.dtypes

output:

symbol object
bid_size int64
bid_price float64
ask_price float64
ask_size int64
timestamp float64
dtype: object

lvfrazao (Owner) commented Jul 6, 2020

Hey @cjelsa, sorry for not getting to this earlier. The timestamp attribute is the Unix timestamp with nanosecond precision (as an integer). There is another attribute on the message object, date_time, that gives you a datetime object. Does that help?

Example:

Python 3.8.3 (default, Jun 17 2020, 15:49:49)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import IEXTools
>>> from datetime import datetime
>>> p = IEXTools.Parser('IEX_data/20200604_IEXTP1_TOPS1.6.pcap')
>>> allowed = [IEXTools.messages.QuoteUpdate]
>>> msg = p.get_next_message(allowed)
>>> msg
QuoteUpdate(flags=64, timestamp=1591269627171164339, symbol='A', bid_size=0, bid_price_int=0, ask_price_int=0, ask_size=0)
>>> msg.date_time
datetime.datetime(2020, 6, 4, 11, 20, 27, 171164, tzinfo=datetime.timezone.utc)
>>> msg.timestamp
1591269627171164339
>>> type(msg.timestamp)
<class 'int'>
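
Since msg.timestamp is an integer count of nanoseconds, one way to get a datetime64[ns] index is to collect the rows first and convert the whole column with pd.to_datetime(..., unit="ns"). The sketch below is only an illustration under a few assumptions: Quote is a hypothetical stand-in for IEXTools.messages.QuoteUpdate (same field names as the repr above) so that it runs without a pcap file, quotes_to_frame is a made-up helper name, and PRICE_SCALE assumes the price integers carry four implied decimal places, whereas the original post divides by 1000. Note also that the date_time value shown above is a standard datetime and so holds microseconds only; converting the integer keeps the full nanosecond value.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for IEXTools.messages.QuoteUpdate, so the sketch
# runs without a pcap file. Field names follow the repr shown in the thread.
Quote = namedtuple(
    "Quote", "timestamp symbol bid_size bid_price_int ask_price_int ask_size"
)

# Assumption: price integers have 4 implied decimal places.
# The original post divides by 1000 instead; adjust if that is what you need.
PRICE_SCALE = 10_000


def quotes_to_frame(msgs):
    """Build a DataFrame with a UTC DatetimeIndex from quote-like messages."""
    rows = [
        {
            "timestamp": m.timestamp,  # int, nanoseconds since the Unix epoch
            "symbol": m.symbol,
            "bid_size": m.bid_size,
            "bid_price": m.bid_price_int / PRICE_SCALE,
            "ask_price": m.ask_price_int / PRICE_SCALE,
            "ask_size": m.ask_size,
        }
        for m in msgs
    ]
    # Building the frame once from a list of dicts avoids calling
    # DataFrame.append in a loop (append was removed in pandas 2.0).
    df = pd.DataFrame(rows)
    # Convert while the column is still integer, so no precision is lost
    # to a float64 round trip.
    df["timestamp"] = pd.to_datetime(df["timestamp"], unit="ns", utc=True)
    return df.set_index("timestamp")


sample = [
    Quote(1591269627171164339, "A", 0, 0, 0, 0),
    Quote(1591269627171164340, "BOIL", 100, 351400, 352500, 100),
]
df = quotes_to_frame(sample)
print(df.dtypes)
print(df.index)
```

With the real parser, the list could be filled with something like msgs = [p.get_next_message(allowed) for _ in range(5000)], using the same calls as in the snippets above. The float64 timestamp column in the original output most likely comes from appending rows to an empty frame, which upcasts the integers; a float64 cannot represent every nanosecond value at this magnitude, so converting before any such upcast matters.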
