Changing delimiter in CSV data from TRTH Tick history

Hi,

Did something change in the TickHistoryTimeAndSalesExtractionRequest CSV delimiter recently?

When pulling the following query from 2018-12-05

{'method': 'POST', 
 'url': 'https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractRaw', 
 'json': {'ExtractionRequest': {'@odata.type': '#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest', 'ContentFieldNames': ['Trade - Price', 'Trade - Volume', 'Trade - Qualifiers', 'Trade - Sequence Number'], 'IdentifierList': {'@odata.type': '#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList', 'InstrumentIdentifiers': [{'Identifier': 'HSIZ8', 'IdentifierType': 'Ric'}], 'ValidationOptions': {'AllowHistoricalInstruments': True}}, 'Condition': {'ReportDateRangeType': 'Range', 'QueryStartDate': '2018-12-04T00:00:00', 'QueryEndDate': '2018-12-05T00:00:00'}}}, 'stream': None}

I get the following data (unzipped). Note that `Seq. No.` now appears to have a semicolon `;` delimiter.

#RIC,Domain,Date-Time,Type,Price,Volume,Qualifiers,Seq. No.
HSIZ8,Market Price,2018-12-05T09:14:00.148335969+08,Trade,26788,8,3[ACT_FLAG1];707327933366141519[TRADE_ID],
HSIZ8,Market Price,2018-12-05T09:14:00.148335969+08,Trade,26788,1,2[ACT_FLAG1];707327933366141519[TRADE_ID], 

If I pull this data for 2018-11-01, `Seq. No.` seems to have a regular comma `,` delimiter:

#RIC,Domain,Date-Time,Type,Price,Volume,Qualifiers,Seq. No.
HSIX8,Market Price,2018-11-01T09:14:00.147380832+08,Trade,25080,3,3[ACT_FLAG1],705982406011650308
HSIX8,Market Price,2018-11-01T09:14:00.147380832+08,Trade,25080,1,2[ACT_FLAG1],705982406011650308

Tags: python, tick-history-rest-api


@bmcelroy,

Has the separator changed? (Hint: no.)

The separator is always a comma. The semicolon is an integral part of the Qualifiers field.

I ran tests based on your queries. Taking December 2018 data as an example:

#RIC,Alias Underlying RIC,Domain,Date-Time,GMT Offset,Type,Price,Volume,Qualifiers,Seq. No.
HSIZ8,,Market Price,2018-12-04T09:15:00.131944328Z,+8,Trade,27266,1,1[ACT_FLAG1];707288350947813826[TRADE_ID],

The Qualifiers field contains: 1[ACT_FLAG1];707288350947813826[TRADE_ID]

The Seq. No. contains nothing (hence the final comma, which is the separator).

Just to double-check, if I do not request the Sequence Number field, then the result is:

#RIC,Alias Underlying RIC,Domain,Date-Time,GMT Offset,Type,Price,Volume,Qualifiers
HSIZ8,,Market Price,2018-12-04T09:15:00.131944328Z,+8,Trade,27266,1,1[ACT_FLAG1];707288350947813826[TRADE_ID]

This confirms the separator is the comma.
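The same check can be done in code. Below is a minimal sketch, assuming the extraction has already been downloaded and unzipped into a string; the `read_rows` helper is a hypothetical name used for illustration, and the sample row is the December one quoted above. Parsing with a comma delimiter leaves the semicolon inside the Qualifiers value and yields an empty Seq. No. value.

```python
import csv
import io

# Sample copied from the December 2018 extraction shown above.
SAMPLE = (
    "#RIC,Alias Underlying RIC,Domain,Date-Time,GMT Offset,Type,Price,Volume,"
    "Qualifiers,Seq. No.\n"
    "HSIZ8,,Market Price,2018-12-04T09:15:00.131944328Z,+8,Trade,27266,1,"
    "1[ACT_FLAG1];707288350947813826[TRADE_ID],\n"
)


def read_rows(text):
    """Parse extraction output as comma-separated rows keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter=",")
    header = next(reader)
    header[0] = header[0].lstrip("#")  # the first column arrives as "#RIC"
    return [dict(zip(header, row)) for row in reader]


rows = read_rows(SAMPLE)
print(rows[0]["Qualifiers"])      # the semicolon stays inside this one field
print(repr(rows[0]["Seq. No."]))  # empty string: the trailing comma was the separator
```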

As a side comment, there are other cases where a semicolon is used inside the qualifiers field, like in this example:

HSIZ8,,Market Price,2018-11-01T01:15:23.137752529Z,+8,Trade,25085,1,1[ACT_FLAG1];Combo Trade[USER],705982406011651801,
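If the individual qualifiers are needed, the field can be split on the semicolon after the CSV row has been parsed. This is only a sketch based on the `value[TAG]` shape seen in the samples here; it assumes that qualifier values never contain a semicolon themselves, which the samples do not prove. The `parse_qualifiers` name is hypothetical.

```python
import re

# Each qualifier in the samples looks like "value[TAG]"; several are joined with ";".
_QUALIFIER = re.compile(r"^(?P<value>.*)\[(?P<tag>[^\[\]]+)\]$")


def parse_qualifiers(field):
    """Return a list of (value, tag) pairs from a Qualifiers field."""
    pairs = []
    for part in field.split(";"):
        part = part.strip()
        if not part:
            continue
        match = _QUALIFIER.match(part)
        if match:
            pairs.append((match.group("value"), match.group("tag")))
        else:
            # Keep text that does not fit the pattern instead of dropping it.
            pairs.append((part, None))
    return pairs


print(parse_qualifiers("1[ACT_FLAG1];Combo Trade[USER]"))
```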

Where is the trade ID?

For November 2018 the data (including the sequence number) does indeed look different:

#RIC,Alias Underlying RIC,Domain,Date-Time,GMT Offset,Type,Price,Volume,Qualifiers,Seq. No.
HSIZ8,,Market Price,2018-11-01T01:15:16.929679422Z,+8,Trade,25085,1,1[ACT_FLAG1],705982406011651553

What you suggest in your answer is correct: the comma is the separator, and the contents of the Qualifiers and Seq. No. fields have changed. It looks like the Hang Seng was initially publishing trade IDs in the sequence number field, and then changed that.

I looked for where the trade ID could be by running various tests with the fields "Trade - Qualifiers", "Trade - Sequence Number" and "Trade - Unique Trade Identification".

Using:

"ContentFieldNames": [ "Trade - Price", "Trade - Volume", "Trade - Qualifiers", "Trade - Sequence Number", "Trade - Unique Trade Identification" ],

I get in November:

#RIC,Alias Underlying RIC,Domain,Date-Time,GMT Offset,Type,Price,Volume,Qualifiers,Seq. No.,Unique Trade Identification
HSIZ8,,Market Price,2018-11-01T01:15:16.929679422Z,+8,Trade,25085,1,1[ACT_FLAG1],705982406011651553,

I get in December:

#RIC,Alias Underlying RIC,Domain,Date-Time,GMT Offset,Type,Price,Volume,Qualifiers,Seq. No.,Unique Trade Identification
HSIZ8,,Market Price,2018-12-04T09:15:00.131944328Z,+8,Trade,27266,1,1[ACT_FLAG1];707288350947813826[TRADE_ID],,707288350947813826

    I also tried field "Trade - Trade Sequence Number" but it delivered no data.
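For reference, the request body used in these tests can be assembled as below. This is a sketch only: it builds the JSON payload from the values quoted in this thread and does not send it, so authentication and the HTTP call are left out. The function name `build_time_and_sales_request` is hypothetical.

```python
import json

# Field list from the test above, including the extra trade identification field.
FIELDS = [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Qualifiers",
    "Trade - Sequence Number",
    "Trade - Unique Trade Identification",
]

_PREFIX = "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests."


def build_time_and_sales_request(ric, start, end, fields):
    """Build the ExtractRaw payload for one RIC and one date range."""
    return {
        "ExtractionRequest": {
            "@odata.type": _PREFIX + "TickHistoryTimeAndSalesExtractionRequest",
            "ContentFieldNames": list(fields),
            "IdentifierList": {
                "@odata.type": _PREFIX + "InstrumentIdentifierList",
                "InstrumentIdentifiers": [
                    {"Identifier": ric, "IdentifierType": "Ric"}
                ],
                "ValidationOptions": {"AllowHistoricalInstruments": True},
            },
            "Condition": {
                "ReportDateRangeType": "Range",
                "QueryStartDate": start,
                "QueryEndDate": end,
            },
        }
    }


payload = build_time_and_sales_request(
    "HSIZ8", "2018-12-04T00:00:00", "2018-12-05T00:00:00", FIELDS
)
print(json.dumps(payload, indent=2))
```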

    So in conclusion:

    • November: trade ID is in field "Trade - Sequence Number"
    • December: trade ID is in field "Trade - Unique Trade Identification" (and it is also concatenated with the qualifiers, using a semicolon as separator between the qualifier and trade ID).

    Conclusion

    Refinitiv publishes the data it receives from the exchanges, as it receives it. If the exchange modifies the way it publishes data, then the data format will change. That seems to be the case here, and there is not much one can do about it except manage it in the code.
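One way to manage it in code is to look for the trade ID in each of the places observed in this thread, in a fixed order. The sketch below assumes each row has already been parsed into a dict keyed by column name; the `trade_id` helper is a hypothetical name, and the lookup order is a choice made for illustration, not something documented by Refinitiv.

```python
import re

# In the December 2018 samples the ID appears in Qualifiers as "<digits>[TRADE_ID]".
_TRADE_ID = re.compile(r"(\d+)\[TRADE_ID\]")


def trade_id(row):
    """Best-effort trade ID lookup for one parsed row (dict keyed by column name)."""
    # December layout: dedicated field, present only if it was requested.
    uti = (row.get("Unique Trade Identification") or "").strip()
    if uti:
        return uti
    # December layout: ID concatenated into the qualifiers.
    match = _TRADE_ID.search(row.get("Qualifiers") or "")
    if match:
        return match.group(1)
    # November layout: ID carried in the sequence number field.
    seq = (row.get("Seq. No.") or "").strip()
    return seq or None


november = {"Qualifiers": "1[ACT_FLAG1]", "Seq. No.": "705982406011651553"}
december = {
    "Qualifiers": "1[ACT_FLAG1];707288350947813826[TRADE_ID]",
    "Seq. No.": "",
}
print(trade_id(november))
print(trade_id(december))
```

Returning strings rather than integers keeps the IDs exactly as published.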

    If you want to understand precisely why this has changed: as it is a data query, the best and quickest way to get an answer is to open a content-related enquiry via MyRefinitiv or to call the Refinitiv Help Desk directly.


    Thanks Christiaan! A very detailed and satisfactory response, appreciate the time taken. We will fix this on our end.


    Further investigation leads me to believe that what I took to be the `Seq. No.` value is now just being appended to the Qualifiers field?

    Dropping `Seq. No.` from the request still returns the following:

    #RIC,Domain,Date-Time,Type,Price,Volume,Qualifiers
    HSIZ8,Market Price,2018-12-05T09:14:00.148335969+08,Trade,26788,8,3[ACT_FLAG1];707327933366141519[TRADE_ID],

    Could someone confirm that the `Qualifiers` field now includes a `[TRADE_ID]`?
