
Recommended practice for querying many RIC time combinations

I would like to query maybe 20000 RIC/time-period pairs, in each instance picking up the tick data: top of the book, trades, sizes and timestamps. Each RIC might have 200 time periods, each about 30 seconds long. What is the best-practice way to query this via the API?

tick-history-rest-api


@johnD,

AWS improves the data download time (not the extraction time), but the benefit is noticeable only for medium to large data sets; for small data sets the faster download is offset by the overhead, as detailed in this article.

I'm sorry, but there is no API call that lets you specify combinations of time ranges and instruments; all API calls apply the same time range to the entire instrument list.

An analysis of your required combinations of instruments and time frames should allow you to group a fair number of instruments (hopefully several hundred or more) for a defined time frame (a few hours or up to a day), request data for that combination, and then pick out what you require. I agree this is not optimal: you will retrieve more data than required and then discard part of it. For your use case, though, this should deliver better performance than making numerous very small queries.
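The grouping-then-filtering idea described above can be sketched in plain Python. This is only an illustration of the bookkeeping, not of the API itself: the function names `group_by_day` and `keep_wanted`, the tick layout (a dict with `ric` and `time` keys) and the sample RICs and times are all assumptions made for the example, and it assumes no window spans midnight.

```python
from collections import defaultdict
from datetime import datetime

def group_by_day(pairs):
    """Bucket (ric, start, end) pairs by calendar day (hypothetical helper).

    Each bucket holds the RICs needed that day and one covering time window,
    i.e. the inputs for a single extraction request.
    """
    buckets = defaultdict(lambda: {"rics": set(), "start": None, "end": None})
    for ric, start, end in pairs:
        b = buckets[start.date()]
        b["rics"].add(ric)
        b["start"] = start if b["start"] is None else min(b["start"], start)
        b["end"] = end if b["end"] is None else max(b["end"], end)
    return dict(buckets)

def keep_wanted(ticks, pairs):
    """Discard ticks that fall outside the windows requested for their RIC."""
    windows = defaultdict(list)
    for ric, start, end in pairs:
        windows[ric].append((start, end))
    return [t for t in ticks
            if any(s <= t["time"] <= e for s, e in windows[t["ric"]])]

# Made-up sample data: three RIC/time-period pairs over two days.
pairs = [
    ("VOD.L", datetime(2018, 3, 1, 9, 0, 0), datetime(2018, 3, 1, 9, 0, 30)),
    ("BARC.L", datetime(2018, 3, 1, 14, 30, 0), datetime(2018, 3, 1, 14, 30, 30)),
    ("VOD.L", datetime(2018, 3, 2, 10, 0, 0), datetime(2018, 3, 2, 10, 0, 30)),
]
groups = group_by_day(pairs)  # two buckets, so two requests instead of three

# Made-up ticks, as they might look after a broad per-day extraction.
ticks = [
    {"ric": "VOD.L", "time": datetime(2018, 3, 1, 9, 0, 10)},   # inside a window
    {"ric": "VOD.L", "time": datetime(2018, 3, 1, 12, 0, 0)},   # outside
    {"ric": "BARC.L", "time": datetime(2018, 3, 1, 9, 0, 10)},  # wrong window for this RIC
]
wanted = keep_wanted(ticks, pairs)
```

With 200 short windows per RIC, a narrower bucket (an hour, say) may reduce the amount of data discarded; the right bucket size depends on how the windows cluster.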

EDIT Oct 2018:

IMPORTANT CHANGE in API capabilities: it is now possible to use API calls specifying combinations of time ranges and instruments.

For more information, see this thread.


Yes, I understand, and my tests suggested that AWS was probably slower for small (2-3 kB) requests. I think it is better for picking up a month's data set, though. (I'm still trying to get the comparison working: I'm picking up the data direct from AWS, but the larger files from TR are giving me some curl errors.)

Everyone seems to agree that there's no batching of queries. The approach is to pick up, suboptimally, all the data with a minimum of fields to cover most of the instances, and then any others later. Performance is also very variable: it's working a lot better for me today, e.g. 1 minute instead of 6 minutes to run a report for a small data set.

Suggestions for the service:

1/ Provide a way of accepting multiple ExtractionRequests in the same query (the API doesn't need changing, just the processing of the query).

2/ Look at publishing some form of SLA rather than "it depends on how busy things are". You're using AWS already, so you should be able to make it scalable and speed it up.

It should be a good solution for MiFID and FRTB compliance requests.

Thanks to those who gave feedback.


Hi @johnD,

This cannot be done with a single API call; you will have to make multiple calls, one for each time period. Multiple RICs for a particular time period should be added to a single request.

You can issue multiple queries in parallel rather than waiting for each one to complete.

Follow the extraction limit guidelines specified here: LINK



Thanks for that; that's what I feared. With 50 concurrent queries and 6 minutes to process 5 minutes of data (what I've seen over the last week through the REST API), that makes 40 hours for 20000 combinations. Hmmm. I can't believe there isn't a batch mode and that it's going to be quicker to download whole months of data. Maybe MiFID etc. is a very well kept secret.
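For reference, the 40-hour estimate follows from the figures in this post, assuming the 50 concurrent queries run in lockstep waves and every query takes the full 6 minutes:

```python
import math

combinations = 20_000     # RIC/time-period pairs to query
concurrent = 50           # queries running at the same time
minutes_per_query = 6     # observed processing time per query

waves = math.ceil(combinations / concurrent)   # batches of 50 queries
total_hours = waves * minutes_per_query / 60

print(waves, total_hours)  # 400 40.0
```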

Thanks for replying though and I hope I've missed something.



I'd just like to add more information.

This is the TRTH best practices and limits document: LINK.


Thanks for that. That's where I picked up the 50 concurrent queries. There's detail on picking up via AWS in the API User Guide, though it's not helpful on a small test of 10 combinations with 4 KB data sets. But none of that solves the problem: with MiFID II one should really verify all trades against an external market price. Tick history should be a nice way of doing it, but I've found no good way of getting large numbers of combinations of RICs and small time periods.

Sensible approaches would have been the possibility to submit a single scheduled request of many RIC/time combinations, or a service that didn't take 5 minutes to process a request for a few minutes' worth of data, or the ability to queue many more than 50 requests, or others that don't come to mind. But what remains is to download lots of data and filter it at this end, and then have the gaps filled in with the concurrent requests. I'm still hoping there's a more sensible approach.


It's possible to get very good results with the following approach:

use multiple accountIds
multi-thread the calls (using a semaphore to guarantee that the maximum number of threads is always in use)
group your identifiers by time range (one call for all identifiers for a day/hour etc.)
use AWS
cache identifiers

-MWA
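The multi-threading point can be sketched roughly as follows, in Python. This is a generic concurrency pattern rather than anything specific to the tick history API: `run_all` is a hypothetical helper, `fake_extract` is a stand-in for whatever function submits one extraction request and waits for its result, and the limit of 4 is only for the demo (the thread above mentions 50).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_all(requests, extract, max_concurrent=50):
    """Run extract(req) for every request, never more than max_concurrent at once.

    The semaphore enforces the cap; the pool is deliberately larger so that
    a waiting thread is ready as soon as a slot frees up.
    """
    slots = threading.BoundedSemaphore(max_concurrent)

    def guarded(req):
        with slots:  # blocks while max_concurrent calls are in flight
            return extract(req)

    with ThreadPoolExecutor(max_workers=2 * max_concurrent) as pool:
        return list(pool.map(guarded, requests))  # results keep input order

# Demo with a stand-in extraction call that records peak concurrency.
state = {"active": 0, "peak": 0}
state_lock = threading.Lock()

def fake_extract(req):
    with state_lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)  # pretend to wait for the server
    with state_lock:
        state["active"] -= 1
    return req * 2

results = run_all(range(20), fake_extract, max_concurrent=4)
```

Sizing the pool to `max_concurrent` would cap concurrency without a semaphore; the semaphore is shown because the post mentions it, and it remains useful when the pool is shared with other work.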
