
Python API function to use to retrieve a span of tick data

Hi,

(My previous question on StreamingPrices seemed to be poorly phrased, and customersupport@refinitiv.com directed me back here, so let me try this again.)

I'm working on a benchmark project where I need tick data (fields TIMACT, BID, BIDSIZE, ASK, ASKSIZE) for a set of RICs (SON3H0, SON3M0, SON3U0, SON3Z0, SON3H1) for a specific 20-minute window, let's say 0800-0820 UTC. I want to issue my request and retrieve all the ticks within that window, ideally within one minute after the window closes (by 0821 UTC). I need to retrieve this set of ticks programmatically, since I want my program to run unattended outside of my normal business hours and use these ticks to produce an output. So using the Eikon GUI or Eikon for Excel is not an option.

I don't want to use StreamingPrices if I don't have to, since it seems that would require that I keep the streaming connection open for ~20 minutes, issuing a constant stream of snapshot requests.

I am able to configure Python appropriately and get data back from rdp.StreamingPrices, so connectivity does not appear to be an issue.

My questions:

(1) Is there a function in the Python API that will retrieve all ticks within a time window? If so, what is the lag between the real-time tick and when it is available from the API?

(2) For future reference, are there documentation, tutorials, or examples that demonstrate the available functionality within the Python API? I've found the documentation within Python (e.g., help(refinitiv.dataplatform)), but in general, most of the description and documentation I've seen is buried in presentations, where code appears in a screenshot rather than in a form that can be cut and pasted.

(3) Where would I be able to get information about delays or lags in data from various functions? For example, in using get_historical_price_events during NYSE trading hours, there appears to be a 15-minute delay.

Tags: python, rdp-api, refinitiv-data-platform, documentation, example, tutorial

Accepted Answer

Hi @jeff.kenyon

If you want to consume tick data after the realtime event then it is classed as historical data and will, therefore, require the use of the historical data source and related functions. The lags will be dependent on the underlying data source. My colleague has reached out to the API Product owner to ask him to comment on the lags etc.
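To make the historical route concrete, here is a minimal sketch of how such a windowed tick request might be framed. This assumes the beta refinitiv.dataplatform library discussed in this thread and its get_historical_price_events function; check the library's own documentation for the exact parameter names, and note that an authenticated session must already be open before the call is made.

```python
import datetime

# Explicit UTC window for the tick request (the 0800-0820 window
# from the original question, using an example date)
window_start = datetime.datetime(2020, 4, 15, 8, 0, tzinfo=datetime.timezone.utc)
window_end = datetime.datetime(2020, 4, 15, 8, 20, tzinfo=datetime.timezone.utc)


def fetch_ticks(ric):
    # Requires an open rdp session; parameter names are per the beta
    # library's help() output and may differ in later releases.
    import refinitiv.dataplatform as rdp
    return rdp.get_historical_price_events(
        universe=ric,
        start=window_start,
        end=window_end,
        fields=['BID', 'BIDSIZE', 'ASK', 'ASKSIZE'],
    )
```

A caller could then loop over the RICs of interest (SON3H0, SON3M0, etc.) once the window has closed, subject to whatever lag the underlying data source imposes.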

As the Refinitiv Data Platform library is in Beta, full documentation is not yet available - but will be added in due course. The Product owner should again be able to advise on this.

If you want the data in the most timely manner then you can use the Streaming Price interface in event-driven mode - and thereby avoid having to make constant snapshot requests. i.e. open the request once, handle all the responses for the required duration and then close the stream once done.

So, you could open the StreamingPrices instance at, say, 08:00 and close it at 08:20.

Two examples you can refer to:

Example 2.3.1 Streaming Price with all Events

Example 2.3.0 - Update Dataframe

Note that when you connect to ERT in Cloud, i.e. a Platform session, the feed is Trade Safe conflated - i.e. up to 3 Quote events per second + all Trade events.

If, however, you are using a Deployed session, then you will need to check with your Market Data team that you are using a full-tick service - if that is what you are after.

You can optionally specify the service name (for a deployed session) as follows:

prices = rdp.StreamingPrices(session = mySession, 
    service = 'ELEKTRON_EDGE',
    universe = ['EUR=','CHF=', 'AED='],
    fields = ['BID', 'ASK','DSPLY_NAME', 'QUOTIM'])

If you don't specify a service name, then it will use whichever service your Market Data team has specified as the default service for Websocket-type connections.



Hi @Umer.Nalla,

Thanks very much for the detailed response. After getting permissioned for the data I need, I used the example you cited, and the ticks I need are now showing up in my Jupyter notebook. Final question: I need to collect the ticks into a CSV file for the time window of interest. The options seem to be to collect everything in a data frame and write it out after I close the streaming prices, or to write each tick to a file as it arrives. Unfortunately, I can't seem to do either at the moment. Are there any examples of writing the tick data to a file?

Note that I can't just use an ending summary...I need all the individual ticks, which I'll then be doing some analysis on in another program.
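For reference, a minimal sketch of the collect-then-write option (accumulate ticks in memory, dump once the stream is closed). The on_update and write_csv names here are hypothetical illustrations, not part of the RDP library; in a real script on_update would be driven by the StreamingPrices update callback.

```python
import csv
import io

# Accumulator: the update callback appends one row per tick,
# and the rows are written out once the stream is closed.
rows = []


def on_update(instrument_name, fields):
    # In the real script this would be wired to the StreamingPrices
    # update event; here we just record whatever we are given.
    rows.append({'ric': instrument_name, **fields})


def write_csv(fileobj, fieldnames):
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)


# Example with synthetic ticks:
on_update('SON3H0', {'BID': 99.5, 'ASK': 99.51})
on_update('SON3M0', {'BID': 99.4, 'ASK': 99.42})
buf = io.StringIO()
write_csv(buf, ['ric', 'BID', 'ASK'])
```

Writing each tick as it arrives works too; the in-memory version just keeps file I/O out of the event callbacks.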

Thanks!



An addendum to the above... I've defined my handlers for the refresh, update, and status events (below), and it runs great in a Jupyter notebook: I have a file handle f that is opened before the RDP stream, and flushed and closed after. I get my desired CSV file of events. When I put the Jupyter notebook code in a standard module and run it from the Windows command line... nothing happens. The StreamState is Opened (I checked), I wait for n seconds, then close the stream and the session, but my destination file is empty. It seems as if no events are coming through on the open stream.

import datetime

# 'f' is a file handle opened at module level before the stream is
# opened, and flushed/closed after the stream is closed.

def _write_event(event_type, instrument_name, payload):
    current_time = datetime.datetime.now()
    timestamp_str = current_time.strftime("%Y-%m-%d %H:%M:%S.%f")
    f.write(f'{timestamp_str},{event_type},{instrument_name},"{payload}"\n')
    print(current_time, "-", event_type.title(), "received for", instrument_name, ":", payload)

def display_refreshed_fields(streaming_price, instrument_name, fields):
    _write_event("REFRESH", instrument_name, fields)

def display_updated_fields(streaming_price, instrument_name, fields):
    _write_event("UPDATE", instrument_name, fields)

def display_status(streaming_price, instrument_name, status):
    _write_event("STATUS", instrument_name, status)

Anyway...very puzzled as to why this works perfectly in Jupyter Notebook, and not at all from console. Suggestions welcome.



Hi @jeff.kenyon

Apologies that I did not get back to your first follow-up question of Apr 15th - it seems there was a problem with (my?) email notifications and I was not aware of the follow-up.

So, the current situation is that you managed to answer that question for yourself but are now stuck when you run the code as a script rather than a notebook.

If you can share the full script (minus any username etc.), I would be happy to try it out here and see if I can recreate your issue.



Hi @jeff.kenyon, I recommend you upload the script as a .txt file.

Hi,

Here you go.

TestStreamingScript.txt

This program produces the following console output:

About to open!
Opened! 
StreamState.Open

As expected, it returns after the amount of time specified by the time.sleep command... but the created file is empty. My thinking is that the display_* functions called by rdp.StreamingPrices are running in a thread that doesn't know about the f file handle, and rather than erroring out, everything is just getting directed to /dev/null.


Hi @jeff.kenyon,

One thing I noticed when using the streaming interfaces from a command line vs. a notebook is that within a Jupyter notebook, an event loop is created for you automatically. Try the following as the last line within your main when running from a command line:

asyncio.get_event_loop().run_forever() 


Hi @jeff.kenyon

Ok - this had me scratching my head for several minutes: my near-identical script worked fine, but the one you supplied did not.

A line-by-line comparison showed it came down to a typo...

universe = ['EUR=', 'GBP=', 'JPY=', 'CAD='] 
fields   = ['BID', 'ASK', 'OPEN_PRC', 'HST_CLOSE']

If you remove the surrounding quotes from the universe and fields variable definitions, so that you are defining a list rather than a string, it should work.
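To illustrate the difference the quotes make (a quick sanity check, not from the original script): with quotes around the brackets, Python sees one string, so the library never receives a list of RICs.

```python
# With surrounding quotes you get a single string, not a list of RICs:
universe_wrong = "['EUR=', 'GBP=']"
universe_right = ['EUR=', 'GBP=']

print(type(universe_wrong).__name__)  # str
print(type(universe_right).__name__)  # list
print(len(universe_wrong))  # 16 -- character count of the whole string
print(len(universe_right))  # 2  -- two instruments
```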

Oh, and you will need to change the sleep to something like the following:

end_time = time.time() + WINDOW_LENGTH 
while (time.time() < end_time):      
    asyncio.get_event_loop().run_until_complete(asyncio.sleep(1)) 
streaming_prices.close() 

Otherwise, the update handlers will never trigger.

A modified version of your script: teststreamingscriptx.txt



Hi Umer, thanks so much for the head-scratching effort. All good now!
