
Search explore Skip/Top cannot retrieve results past 10k records

Hi,


I am using the endpoint

https://api.refinitiv.com/discovery/search/v1/explore

with this query:

{
  "View": "Entities",
  "Filter": "ContributorCode eq 'TWEB'",
  "Select": "RIC",
  "Top": 10000
}


The number of available RICs is more than 10k (which is the maximum value for Top).

There is a parameter 'Skip' that allows skipping the first records. I assumed this is the one to use to get further results, e.g. 10000-20000, 20000-30000, etc.

Using something like:

{
  "View": "Entities",
  "Filter": "ContributorCode eq 'TWEB'",
  "Select": "RIC",
  "Skip": 10000,
  "Top": 5000
}


However, I get an error:

    "message": "Invalid result window: (top + skip) must not exceed 10,000",


How can I get records beyond the first 10k?


Thanks,

Sergei



Hello @sergei.ermakov ,

Requesting more than 10k result items (docs) is not supported. Pagination within that 10k record set is supported, however, so you should be able to request 5k and then another 5k. See API Playground -> Search -> Reference Guide for more details.
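As an illustration of paging inside the 10k window, here is a minimal Python sketch that only builds the request bodies. The helper name `window_pages` and the constant `MAX_WINDOW` are hypothetical; the limit itself is taken from the error message quoted in the question.

```python
from typing import Dict, Iterator

# Limit quoted in the error message: (top + skip) must not exceed 10,000.
MAX_WINDOW = 10_000


def window_pages(base_query: Dict, page_size: int) -> Iterator[Dict]:
    """Yield copies of base_query with Skip/Top set so Skip + Top <= MAX_WINDOW."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    skip = 0
    while skip < MAX_WINDOW:
        # Shrink the last page so it never crosses the window boundary.
        top = min(page_size, MAX_WINDOW - skip)
        yield {**base_query, "Skip": skip, "Top": top}
        skip += top


base = {"View": "Entities", "Filter": "ContributorCode eq 'TWEB'", "Select": "RIC"}
pages = list(window_pages(base, 5000))  # two bodies: Skip 0 and Skip 5000
```

Each body would then be posted to the explore endpoint. This only covers the first 10k records, so anything beyond that still needs narrower filters.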

However, the set of TWEB RICs is much bigger than 10k, so the question becomes how to request them all.

A working approach could be to split the oversized single request into several smaller ones using filters.

I am not a content expert, so I would choose a straightforward way of splitting:

{
  "View": "Entities",
  "Filter": "ContributorCode eq 'TWEB' and RIC eq '0*'",
  "Select": "RIC",
  "Top": 10000
}

resulting in

{
    "Total": 1048,
    "Hits": [
        {
            "RIC": "0#USBMK=TWEB"
        },
...

Consequently, I would then request the remainder:

{
  "View": "Entities",
  "Filter": "ContributorCode eq 'TWEB' and not(RIC eq '0*')",
  "Select": "RIC",
  "Top": 10000
} 

and then either examine the result and decide how to cut off the next subset of <= 10000, for example:

{
  "View": "Entities",
  "Filter": "ContributorCode eq 'TWEB' and RIC eq 'A*'",
  "Select": "RIC",
  "Top": 10000
}

Or, in a loop, just run through the possible first characters (letters alphabetically and digits 0-9), concatenating the resulting subsets into a single result set.
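The loop could be sketched roughly as below. This is not an official client: `search` is a hypothetical callable that posts a request body to the explore endpoint and returns the parsed JSON, and the prefix alphabet (digits plus uppercase letters) is an assumption, so RICs starting with any other character would be missed.

```python
import string
from typing import Callable, Dict, List

# Hypothetical transport: takes a request body, returns the parsed JSON response.
Search = Callable[[Dict], Dict]

# Assumed set of first characters; adjust if RICs can start with other symbols.
PREFIXES = string.digits + string.ascii_uppercase


def build_query(contributor: str, prefix: str) -> Dict:
    """Build one request body restricted to RICs starting with `prefix`."""
    return {
        "View": "Entities",
        "Filter": f"ContributorCode eq '{contributor}' and RIC eq '{prefix}*'",
        "Select": "RIC",
        "Top": 10000,
    }


def collect_rics(search: Search, contributor: str) -> List[str]:
    """Run one request per first character and concatenate the hits."""
    rics: List[str] = []
    for prefix in PREFIXES:
        response = search(build_query(contributor, prefix))
        rics.extend(hit["RIC"] for hit in response.get("Hits", []))
    return rics
```

Note that any single prefix whose "Total" exceeds 10000 would still be truncated here, so the totals are worth checking.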

I expect there may be more efficient ways to "divide and conquer" for people with a more precise understanding of Tradeweb symbology...

Hope this helps.



Hi @zoya faberov - thanks for the suggested approach.

I was trying a similar partitioning of the set by DbType, but your approach seems to be better.

However, there is no guarantee that RICs are spread more or less evenly across the alphabet. E.g., there might be a case where more than 10k RICs start with 'A'.


I'll try this out as a workaround, but it seems to me that a proper solution for paginating results should be present in the API. Are there any plans to support that?

I understand the limit on the returned results (Top), but why limit Skip + Top?


Thanks,

Sergei

I just checked another contributor: FINR.

This one is much bigger than TWEB, so I had to split the pattern further into 00, 01, 02, ...

{'View': 'Entities', 'Filter': "ContributorCode eq 'FINR' and RIC eq '00*'", 'Select': 'RIC,CUSIP', 'Top': 10000}

{'View': 'Entities', 'Filter': "ContributorCode eq 'FINR' and RIC eq '01*'", 'Select': 'RIC,CUSIP', 'Top': 10000}

In total this returned 31850 RICs starting with 0.
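The manual refinement (0 -> 00, 01, ...) could be automated along these lines: a rough sketch that lengthens the prefix only where "Total" exceeds the limit. `search` is again a hypothetical callable returning the parsed JSON response, and the alphabet of digits plus uppercase letters is an assumption. A RIC such as 0#USBMK=TWEB would fall outside the child prefixes 00..0Z, so the sketch warns when the children do not add up to the parent's total.

```python
import string
import warnings
from typing import Callable, Dict, List

Search = Callable[[Dict], Dict]  # hypothetical: request body -> parsed JSON
ALPHABET = string.digits + string.ascii_uppercase  # assumed character set


def collect_by_prefix(search: Search, contributor: str, prefix: str = "",
                      limit: int = 10_000, max_depth: int = 4) -> List[str]:
    """Collect RICs for `prefix`, splitting into longer prefixes when needed."""
    ric_filter = f" and RIC eq '{prefix}*'" if prefix else ""
    query = {
        "View": "Entities",
        "Filter": f"ContributorCode eq '{contributor}'{ric_filter}",
        "Select": "RIC",
        "Top": limit,
    }
    response = search(query)
    total = response.get("Total", 0)
    if total <= limit:
        # The whole subset fits into a single response.
        return [hit["RIC"] for hit in response.get("Hits", [])]
    if len(prefix) >= max_depth:
        raise RuntimeError(f"prefix {prefix!r} still has {total} results")
    rics: List[str] = []
    for char in ALPHABET:
        rics.extend(collect_by_prefix(search, contributor, prefix + char,
                                      limit, max_depth))
    if len(rics) != total:
        # Some RICs continue with a character outside ALPHABET (or end here).
        warnings.warn(f"prefix {prefix!r}: got {len(rics)} of {total} RICs")
    return rics
```

The number of requests grows with every level of splitting, so this is only a workaround for moderately oversized sets.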


Hello @sergei.ermakov ,

The RDP Search service is currently in Early Access; it is actively evolving and being improved.

For working with very large result sets, the approach presently available is to divide the result set into several smaller subsets via Filters and Navigators on the request.

I hear you; this is worth raising with the product team as a possible area for improvement going forward.

I would also suggest reviewing the article Building Search into your Application Workflow to learn how to scout out Search Metadata and work effectively with Filters and Navigators to narrow down the search results to your requirements.


Hello @sergei.ermakov and all,

Here is additional information, on RDP Search product direction, that you may find helpful:

Search is meant for fast searching/retrieval of individual entities, or groups of hundreds or, at the very most, a few thousand. Bulk download of tens or hundreds of thousands of documents is beyond the scope and technical capabilities of this API, and there are currently no plans to change that.


thanks, @zoya faberov , appreciate your help


Hi @sergei.ermakov

I would suggest you review the Limits section within the article @zoya faberov referred to. There is also a working Notebook that provides greater detail and suggestions that may allow for more accurate groupings for larger data sets.
