get_timeseries returning incorrect time period from what was declared

pmorlen

I have this code:

data = ek.get_timeseries(rics, fields='CLOSE',
start_date='2019-01-01',
end_date='2019-06-30')

but it returns data starting 5/20/2019 and ignores the start_date declared in the code:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 29 entries, 2019-05-20 to 2019-06-28
Columns: 103 entries, MMM to YUM
dtypes: float64(103)
memory usage: 23.6 KB

Based on other posts, it appears to be related to the 3000 shared row limit.
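The figures in the output above are consistent with that explanation. As a quick check, assuming the 3000 limit is shared across all instruments in a single get_timeseries call (rows times instruments):

```python
# Assumption: the 3000 limit is counted as rows x instruments for one request.
ROW_LIMIT = 3000
n_instruments = 103  # "Columns: 103 entries" in the output above

rows_per_instrument = ROW_LIMIT // n_instruments
total_points = rows_per_instrument * n_instruments

print(rows_per_instrument)  # 29, matching "DatetimeIndex: 29 entries"
print(total_points)         # 2987
```

Under that assumption, 103 instruments leave room for only 29 daily rows each, which would explain why the result starts on 2019-05-20 rather than at start_date.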

Here is a snippet of what I'd like returned - with daily closing price dates going back to 1/1/2019 for 103 equity tickers in total:

Close Date MMM AFL T ABBV ABT

2019-05-28 163.35 51.30 31.93 78.03 75.71

2019-05-29 161.40 51.43 31.91 78.06 75.67

Here is my current data retrieval code:

'rics' is a list of tickers

data3 = ek.get_data(rics, ['TR.PriceClose', 'TR.PriceCloseDate'], {'Sdate':'2019-01-01', 'EDate':'2019-06-30'})

But it returns a tuple:

(      Instrument  Price Close                  Date
 0            MMM       190.95  2019-01-02T00:00:00Z
 1            MMM       183.76  2019-01-03T00:00:00Z
 2            MMM       191.32  2019-01-04T00:00:00Z
 3            MMM       190.88  2019-01-07T00:00:00Z
 4            MMM       191.68  2019-01-08T00:00:00Z
 ...          ...          ...                   ...
 12767        YUM       110.66  2019-06-24T00:00:00Z
 12768        YUM       110.31  2019-06-25T00:00:00Z
 12769        YUM       110.12  2019-06-26T00:00:00Z
 12770        YUM       110.56  2019-06-27T00:00:00Z
 12771        YUM       110.67  2019-06-28T00:00:00Z

I'm having trouble setting up 'get_data()' to return a dataframe instead of a tuple.
Can you please provide some guidance on how to correct this?
Thank you

eikon eikon-data-api workspace workspace-data-api refinitiv-dataplatform-eikon python

Jul 29, 2019 at 01:31 PM

Answer 1 · 2019-08-05T05:44:02Z

Jirapongse

@pmorlen

I have modified the code as shown below.

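import pandas as pd

# Assumes ek (import eikon as ek) is already configured and rics is a list of RICs.
data3 = ek.get_data(rics, ['TR.PriceCloseDate', 'TR.PriceClose'], {'Sdate': '2019-01-01', 'EDate': '2019-06-30'})

# data3 is a tuple; element 0 is the dataframe. Build one date-indexed frame per RIC.
dfarray = []
for ric, data in data3[0].groupby('Instrument'):
    df_tmp = data.dropna()                     # drop rows with missing values
    df_tmp = df_tmp.drop_duplicates()          # drop repeated rows
    df_tmp = df_tmp.set_index('Date')
    df_tmp = df_tmp.drop(['Instrument'], axis=1)
    df_tmp = df_tmp.rename(columns={'Price Close': ric})
    dfarray.append(df_tmp)

# Join the per-RIC frames side by side: one row per date, one column per RIC.
result = pd.concat(dfarray, axis=1, sort=False)
result.columns.name = 'CLOSE'
result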

It uses df.dropna() and df.drop_duplicates().
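The same long-to-wide reshaping can also be sketched with DataFrame.pivot. This is an alternative illustration, not the code from the answer: the helper name to_wide is hypothetical, and the sample frame is made up to mimic the first element of the get_data() tuple (YUM prices are invented).

```python
import pandas as pd

# Made-up sample shaped like data3[0]; only the column names come from the thread.
long_df = pd.DataFrame({
    "Instrument": ["MMM", "MMM", "YUM", "YUM", "YUM"],
    "Price Close": [190.95, 183.76, 91.0, 92.0, None],
    "Date": ["2019-01-02T00:00:00Z", "2019-01-03T00:00:00Z",
             "2019-01-02T00:00:00Z", "2019-01-03T00:00:00Z", None],
})


def to_wide(frame):
    """Return one row per date and one column per instrument."""
    # pivot() raises on repeated (Date, Instrument) pairs, so clean first.
    clean = frame.dropna(subset=["Date", "Price Close"])
    clean = clean.drop_duplicates(subset=["Instrument", "Date"])
    clean = clean.assign(Date=pd.to_datetime(clean["Date"]))
    wide = clean.pivot(index="Date", columns="Instrument", values="Price Close")
    wide.columns.name = "CLOSE"
    return wide


wide = to_wide(long_df)
print(wide)
```

With real data, the call would be to_wide(data3[0]).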

ktb.png (10.2 KiB)

Aug 05, 2019 at 05:44 AM

pmorlen Aug 05, 2019 at 09:03 AM

@jirapongse.phuriphanvichai

Excellent! Exactly what I needed. Thank you!

Answer 2 · 2019-07-30T01:43:03Z

chavalit-jintamalit

Hi @pmorlen

get_data returns a tuple of (dataframe, error). To get the dataframe into data3, you can unpack it like this:

data3,err = ek.get_data(rics, ['TR.PriceClose', 'TR.PriceCloseDate'], {'Sdate':'2019-01-01', 'EDate':'2019-06-30'})

or

data3 = ek.get_data(rics, ['TR.PriceClose', 'TR.PriceCloseDate'], {'Sdate':'2019-01-01', 'EDate':'2019-06-30'}) [0]

In both cases, data3 will be a dataframe.

For the datapoint limitation, please refer to the last section of this document, "Try to detect and address datapoint limits".

Jul 30, 2019 at 01:43 AM

pmorlen Jul 30, 2019 at 09:36 AM

Thank you - this did return a dataframe, however, it returns the data for only 1 of the 103 tickers in the 'rics' list. It returned the data for the last ticker in the list.

chavalit-jintamalit Jul 31, 2019 at 12:54 AM

I can successfully receive multiple data points for multiple RICs.

ahs01.png (34.9 KiB)

pmorlen Jul 31, 2019 at 10:02 AM

Hello @chavalit.jintamalit

Again, thank you for your assistance. I should have started this thread with my goal, which is to calculate the correlations for a list of stocks over a certain time period. In my current code, it is a period of 6 months. Your results are close to what I need. However, to calculate the correlations I would like each price date to be a row in the dataframe.

chavalit-jintamalit pmorlen Jul 31, 2019 at 10:40 AM

@pmorlen

Can you give me an example of the data in the DF that you would like to have?

pmorlen chavalit-jintamalit Jul 31, 2019 at 11:17 AM

Attached is an example. From this DF, I calculate the log returns, then correlations. What caused me problems was the data limit for the get_timeseries() function, which is why I am trying get_data. Thank you.

df-example.jpg (121.0 KiB)
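For reference, the log-return and correlation step described in this comment can be sketched as follows once the prices are in a date-by-ticker dataframe. The tickers and prices here are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration only: rows are dates, columns are tickers.
prices = pd.DataFrame(
    {"AAA": [100.0, 101.0, 102.5, 101.5], "BBB": [50.0, 50.5, 51.0, 50.8]},
    index=pd.to_datetime(["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07"]),
)

# Daily log return: ln(P_t / P_{t-1}); the first row has no previous price.
log_returns = np.log(prices / prices.shift(1)).dropna(how="all")

# Pairwise Pearson correlation between the tickers' log returns.
correlations = log_returns.corr()
print(correlations)
```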

Answer 3 · 2019-07-30T04:19:51Z

Jirapongse

get_data and get_timeseries return dataframes in different formats.

I have implemented a simple script that reshapes the dataframe from get_data to look like the one from get_timeseries.

data3 = ek.get_data(rics,['TR.PriceCloseDate','TR.PriceClose'], {'Sdate':'2019-01-01', 'EDate':'2019-06-30'}) 

dfs = dict(tuple(data3[0].groupby('Instrument'))) 
dfarray = [] 
for ric, data in dfs.items(): 
   df_tmp = dfs[ric][pd.notnull(dfs[ric]['Date'])] 
   df_tmp = df_tmp.set_index('Date') 
   df_tmp = df_tmp.drop(['Instrument'], axis=1) 
   df_tmp = df_tmp.rename(columns={"Price Close":ric}) 
   dfarray.append(df_tmp) 

result = pd.concat(dfarray, axis=1, sort=False) 
result.columns.name = 'CLOSE' 
result

Jul 30, 2019 at 04:19 AM

pmorlen Jul 30, 2019 at 09:37 AM

Thank you - this did return a dataframe, however, it returns the data for only 1 of the 103 tickers in the 'rics' list. It returned the data for the last ticker in the list.

Jirapongse Jul 30, 2019 at 11:50 PM

@pmorlen

Could you please share the rics list used in the code?

It may relate to the usage limit mentioned in the EIKON DATA API USAGE AND LIMITS GUIDELINE.

pmorlen Jul 31, 2019 at 10:34 AM

@jirapongse.phuriphanvichai

Thank you again for your time. I was incorrect in stating that your code returned the correct results. I am attaching 3 images showing 1) the rics list used, 2) the intermediate results of your code, specifically dfarray, and 3) the error I'm receiving at 'result = pd.concat(dfarray, axis=1, sort=False)'. Third image will be in a separate comment.

rics.jpg (63.9 KiB)

dfarray.jpg (51.4 KiB)

pmorlen Jul 31, 2019 at 10:35 AM

@jirapongse.phuriphanvichai

Continuing comment above (system would not let me attach a 3rd image).

error.jpg (135.9 KiB)

pmorlen pmorlen Jul 31, 2019 at 11:46 AM

@jirapongse.phuriphanvichai

I thought this information may be useful to you:

Attached is an example of the DF I'm trying to create. From this DF, I calculate the log returns, then correlations. What caused me problems was the data limit for the get_timeseries() function, which is why I am trying get_data. Thank you.

df-example.jpg (121.0 KiB)

Answer 4 · 2019-07-31T11:29:00Z

chavalit-jintamalit

Hi @pmorlen

Just an idea: if you hit the limit, you can split the request into smaller ones and add a delay between them.

For example, you can query period1, period2, ..., periodN and combine the results.

See this sample:

ahs.png (38.1 KiB)
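Since the sample is only available as an image, here is a minimal sketch of the splitting idea. The helper split_period and the 35-day window are assumptions, not part of the Eikon API: 35 calendar days contain at most 25 weekdays, so 103 instruments would stay at or below 2575 points per request if the limit is counted as rows times instruments.

```python
from datetime import date, timedelta


def split_period(start, end, max_days):
    """Yield (window_start, window_end) pairs covering start..end inclusive."""
    current = start
    while current <= end:
        window_end = min(current + timedelta(days=max_days - 1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)


windows = list(split_period(date(2019, 1, 1), date(2019, 6, 30), max_days=35))
for w_start, w_end in windows:
    print(w_start.isoformat(), w_end.isoformat())

# Hypothetical usage, not run here because it needs a live Eikon session:
# import time
# import pandas as pd
# frames = []
# for w_start, w_end in windows:
#     frames.append(ek.get_timeseries(rics, fields='CLOSE',
#                                     start_date=w_start.isoformat(),
#                                     end_date=w_end.isoformat()))
#     time.sleep(1)  # pause between requests; the right delay is an assumption
# data = pd.concat(frames).sort_index()
```

Splitting by date keeps every instrument in each request, so the pieces can be stacked row-wise with pd.concat.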

Jul 31, 2019 at 11:29 AM
