Retrieving bulk data

I made a series of requests for 1500 RICs. My Composite, FundAllocation, and TermsAndConditions requests all came back, but my two PriceHistory requests (adjusted and unadjusted prices) and CorporateActions requests have not come back yet.

These requests are all available under X-Client-Session-Id:

6e8cdf96-0508-11e9-8014-525400a87d41_6933

6e8cdf96-0508-11e9-8014-525400a87d41_6938

6e8cdf96-0508-11e9-8014-525400a87d41_6932

For the two PriceHistory requests I am using

https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractRaw

For the CorporateActions I am using

https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractWithNotes

I have set a 4-minute (!) timeout on my requests, but I am still occasionally timing out when hitting the Location URL received with a 202 response.

The requests were all created at 08:09, yet half an hour later they're still not resolved.

1500 RICs is only part of the overall data we need to retrieve, so this is a bit worrying.

dss-rest-api datascope-select dss bulk-download
@davet1,

Could you please share the bodies of the two PriceHistory requests and the CorporateActions requests? That will help us analyze the issue.

Please note that for 1500 instruments, depending on the number of fields and the total amount of data, the extraction time can be long.

They're a bit big. Can you find them from the X-Client-Session-Ids?

I expect the CorporateActions problem is data size, so I am changing that to use the Csv/RawExtract flow.

@davet1,

You can share them as attachments. Only the DataScope product support team can access requests from session IDs; we moderators cannot.

For very big requests (many RICs and/or a long time range), I'd suggest segmenting them into more manageable ones, maybe 500 RICs at a time and time ranges of 1 year (these are ballpark numbers, in the absence of details). We recommend making fewer queries for more RICs rather than many queries for few RICs, but there is a point where very large requests become a disadvantage.

Yes, requesting compressed data instead of CSV will speed up download.

1 Answer

@davet1,

Those are quite big requests (more than 1200 RICs, 20 years of data). For such large requests (many RICs and/or a long time range and/or many data fields), I recommend segmenting them into more manageable ones.
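As a rough illustration of what segmenting could look like on the client side, here is a minimal Python sketch that splits a RIC list into batches and a date range into calendar-year windows. The function names (`chunk_rics`, `year_ranges`, `plan_requests`) are hypothetical helpers, not part of any DSS library, and the defaults of 500 RICs and 1-year windows are only the ballpark figures mentioned in the comments, so expect to tune them.

```python
from datetime import date
from typing import Iterator, List, Tuple


def chunk_rics(rics: List[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` RICs."""
    for start in range(0, len(rics), size):
        yield rics[start:start + size]


def year_ranges(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (from, to) windows, each within one calendar year."""
    current = start
    while current <= end:
        window_end = min(date(current.year, 12, 31), end)
        yield current, window_end
        if window_end == end:
            break
        current = date(current.year + 1, 1, 1)


def plan_requests(
    rics: List[str], start: date, end: date, size: int = 500
) -> List[Tuple[List[str], date, date]]:
    """Build one (batch, from, to) entry per extraction request to submit."""
    return [
        (batch, window_start, window_end)
        for batch in chunk_rics(rics, size)
        for window_start, window_end in year_ranges(start, end)
    ]
```

Each entry in the returned plan would then become one extraction request body. Note that this multiplies the number of requests (1500 RICs over 20 years gives 60 requests with these defaults), so it trades per-request size against request count.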

In the DSS best practices we recommend making fewer queries for more RICs, rather than many queries for fewer RICs, to minimize the overhead. This works well, but there is a point where very large requests become a disadvantage. This is not documented, but on reflection it is logical: every system has its limits, and in your case it seems your queries have hit one.

The difficulty is evaluating when a request becomes too big. This is more of an art than a science, and it will take some experimenting with your queries to determine what works best for you. As a very rough rule of thumb, if a query takes more than an hour or two to complete, it is probably too big.

For your PriceHistory and Corporate Actions requests I suggest you try one of the following approaches:

  1. Request 1 instrument over 20 years. If it takes less than 5 minutes, try 10 instruments. If that also takes less than 5 minutes, increase the instrument count further. Once the wait time increases too much, stop increasing the number of instruments.
  2. Alternatively, request more instruments per request, but for a shorter time period. I'd try ranges of 1 year as a start, and apply a similar methodology as above.
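The first approach can be sketched as a small ramp-up loop. This is only an illustration under assumptions: `run_extraction` is a hypothetical callable you would supply that submits one request for the given RICs and blocks until the result is available, and the size ladder and the 300-second threshold are arbitrary starting points taken from the 5-minute figure above.

```python
import time
from typing import Callable, List, Sequence


def find_batch_size(
    run_extraction: Callable[[List[str]], None],
    rics: Sequence[str],
    sizes: Sequence[int] = (1, 10, 50, 100, 250, 500),
    max_seconds: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Return the largest tried batch size that completed within max_seconds.

    Returns 0 if even the smallest batch was too slow.
    """
    best = 0
    for size in sizes:
        if size > len(rics):
            break  # not enough instruments to test this size
        started = clock()
        run_extraction(list(rics[:size]))  # hypothetical blocking call
        elapsed = clock() - started
        if elapsed > max_seconds:
            break  # wait time grew too much; stop increasing
        best = size
    return best
```

The same loop could be applied to the second approach by keeping the instrument count fixed and stepping through time-range lengths instead.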

I agree this is not ideal, but there is no silver bullet I'm afraid.

I hope this helps.

Note that one must be careful when comparing execution times, as the extraction time also depends on server load, which can strongly fluctuate.
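One simple way to reduce the effect of such fluctuations when comparing two request sizes is to time each variant several times and compare medians rather than single runs. A minimal sketch, where `run` is a hypothetical callable that performs one complete extraction:

```python
import statistics
import time
from typing import Callable


def median_runtime(
    run: Callable[[], None],
    repeats: int = 3,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Time `run` several times and return the median duration in seconds."""
    durations = []
    for _ in range(repeats):
        started = clock()
        run()
        durations.append(clock() - started)
    return statistics.median(durations)
```

Back-to-back repeats will see similar server load, so spreading the runs over different times of day is likely to give a more representative figure.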
