Why are not all instruments received when AutomaticDecompression is set to true on ExtractionsContext for TickHistoryTimeAndSaleExtractionRequest?

Hi,

We are trying to migrate to TickHistory v2 using the REST API in .Net C#.

Currently we retrieve a gzip file containing the following information for each Equity Spot. When I set the AutomaticDecompression option on ExtractionsContext to true, the received file (.csv) does not contain all the instruments I specified. If I do not set this option, the received compressed file (.gz) does contain all the instruments. The template I am using is TickHistoryTimeAndSaleExtractionRequest.

Do you know the reason for this?

Thanks.

tick-history-rest-api

@mojtaba.danai, to help answer your queries, I have just updated .Net Tutorial 5. It now shows how you can:

  • Save the extracted data directly as a compressed file, on hard disk.
  • Save the extracted data as a CSV file, for later usage (out of scope here).
  • Read and decompress the data from the compressed file that was saved to hard disk.
  • Read and decompress the data on the fly, as it is received from the server. This variant also saves the data as a second CSV file (identical to the one created by the other variant).

It also shows how to download the data from AWS (the Amazon Web Services cloud), which is faster than the standard download.

You can download this new code; it is part of the .Net SDK Tutorials code package. The code uses the .Net SDK.

Note: on-the-fly processing is not recommended for large data sets. Instead, we recommend saving the compressed data to a file, and then reading, decompressing and processing it from there.

This problem has been mentioned in this document.

Most Tick History reports deliver output as a gzip file. If a report is large, it delivers its output as several smaller gzip files concatenated into a single large gzip file. If your HTTP client does not support concatenated gzip files, you may not receive some of your output. At this time, most HTTP clients do not fully support concatenated gzip files.
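
To see what "concatenated gzip" means in practice, here is a minimal sketch, assuming only the standard System.IO.Compression library. It builds two small gzip members, joins them, and checks whether GZipStream returns the data from both. The class name ConcatenatedGzipCheck and the sample rows are hypothetical, and the result depends on the .NET runtime version you run it on, so treat it as a local check rather than a statement about any particular client.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class ConcatenatedGzipCheck
{
    // Compress a string into one complete gzip member (header, data, trailer).
    static byte[] GzipMember(string text)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            }
            return buffer.ToArray();
        }
    }

    static void Main()
    {
        // Two independent gzip members, joined byte for byte (hypothetical sample rows).
        byte[] first = GzipMember("RIC1,row\n");
        byte[] second = GzipMember("RIC2,row\n");
        byte[] combined = new byte[first.Length + second.Length];
        Buffer.BlockCopy(first, 0, combined, 0, first.Length);
        Buffer.BlockCopy(second, 0, combined, first.Length, second.Length);

        using (var input = new MemoryStream(combined))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            string text = Encoding.UTF8.GetString(output.ToArray());

            // If the second row is missing, the decompressor stopped after the first member.
            Console.WriteLine(text.Contains("RIC2")
                ? "All gzip members were read."
                : "Only the first gzip member was read.");
        }
    }
}
```

If this reports that only the first member was read, a decompressor of that kind would drop part of a large report in the same way.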

@jirapongse.phuriphanvichai

Thanks for the reply. As mentioned, we are using the REST API in .NET C#. How do we check whether our HTTP client in .NET supports concatenated gzip files?

If we are using .NET C#, how does it use the HTTP client?

If the decompression flag is not set, we get a large gz file containing everything. If the decompression flag is set to true, the file is of course not compressed, and it contains only some of the instruments.

@jirapongse.phuriphanvichai

We have been using HttpClient or HttpClientHandler in our code. What we actually want is to download the whole gzip file and decompress it once the download has finished. We have been using System.IO.Compression.GZipStream in another PowerShell script to decompress the file downloaded from Reuters FTP (version 1 of the API).

As mentioned, we are using ExtractionsContext to save the gzip file:

using (var response = _extractionContext.RawExtractionResultOperations.GetReadStream(result))
{
    using (var fileStream = File.Create(fullFileName))
    {
        response.Stream.CopyTo(fileStream);
    }
}

as shown in the Reuters sample code provided in the TickHistoryExamples.cs file.

I am not sure whether we have to use HttpClientHandler or HttpClient.

There is no sample code using this.

And, as mentioned, what if we wish to decompress the downloaded file using System.IO.Compression.GZipStream in another PowerShell script?

Thanks

Internally, ExtractionsContext uses the .NET HTTP client (HttpClient and HttpClientHandler) to send requests and receive responses.

To avoid this problem and get the data as a gzip file, the application needs to set AutomaticDecompression to false on ExtractionsContext, which disables automatic decompression in the .NET HTTP client.

extractionsContext.Options.AutomaticDecompression = false;

Therefore, if you are using ExtractionsContext, you don't need to use HttpClientHandler or HttpClient anymore.

@jirapongse.phuriphanvichai

Unfortunately the Reuters documentation on this is very poor. We are not sure how to decompress the gz file. Please see the comment below.

It was mentioned in the advisory on page 4 (Decompress the Gzip file). It recommends using SharpZipLib to decompress your gzip output files. It is available from ICSharpCode and NuGet.

You can apply the above methods to your code. For example, if you need to save the output in a CSV file, you can use the code below:

// Requires: using System.IO; using ICSharpCode.SharpZipLib.GZip;
using (var response = extractionsContext.RawExtractionResultOperations.GetReadStream(result))
using (var gzip = new GZipInputStream(response.Stream))
using (var outputStream = File.Create("output.csv"))
{
    // Decompress the gzip data as it is received and write it out as CSV.
    gzip.CopyTo(outputStream);
}

@mojtaba.danai, have you tried the code of .Net Tutorial 5? It uses our .Net SDK, which simplifies your coding work (there is no need to manage the HTTP connection yourself), and illustrates both saving the compressed data to file and decompressing on the fly. It follows the recommendations outlined in the advisory. I believe it works for large data sets as well.

That said, we recommend you save the compressed data as a file on local hard disk, and then read it from there, instead of decompressing it on the fly.
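
The "save first, decompress later" approach can be sketched as follows. This is a minimal sketch, not the tutorial's code: it assumes the gz file has already been saved to disk and that the SharpZipLib NuGet package mentioned in the advisory is referenced. The class name GzFileDecompressor, the method name DecompressToCsv and the file paths are hypothetical.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.GZip;

static class GzFileDecompressor
{
    // Reads a previously saved .gz file and writes the decompressed data to csvPath.
    // GZipInputStream comes from SharpZipLib, the library the advisory recommends.
    public static void DecompressToCsv(string gzPath, string csvPath)
    {
        using (var input = File.OpenRead(gzPath))
        using (var gzip = new GZipInputStream(input))
        using (var output = File.Create(csvPath))
        {
            // Streams the data through in chunks, so the whole file is never held in memory.
            gzip.CopyTo(output);
        }
    }
}
```

Usage would be a single call such as GzFileDecompressor.DecompressToCsv("abcd.csv.gz", "abcd.csv"), run after the download has completed.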

@Christiaan Meihsl,

I am using the following code to get the gz file at the moment:

using (var response = _extractionContext.RawExtractionResultOperations.GetReadStream(result))
{
    using (var fileStream = File.Create(fullFileName))
    {
        response.Stream.CopyTo(fileStream);
    }
}

fullFileName is the name of the gz file (abcd.csv.gz).

- Do you know where I can download the sample code for decompressing the gz file at runtime using the ExtractionsContext?

- Can I keep the above code and decompress afterwards?

- Or what method shall I use on extractionsContext to decompress the file?

@mojtaba.danai, I shall modify .Net SDK Tutorial Code sample 5 now to answer your queries. I'll post it as soon as it is ready. It will be available under the downloads tab.

@Christiaan Meihsl

Unfortunately the Reuters documentation on this is very poor. We are not sure how to decompress the gz file. Please see the comment above.

It would be nice if we could use the standard .NET System.IO.Compression library to unzip the files, but unfortunately this doesn't work. Presumably this is why TR have included a third-party library in their tutorial code.

The root of the problem seems to be that the TR code that compresses the data file incorrectly records the size of the original file (this is stored in the last 4 bytes of a gzip file). 7Zip, SharpZipLib and other utilities are robust enough to handle this error, but the Microsoft library isn't.
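
For reference, the gzip format (RFC 1952) stores the uncompressed size of a member, modulo 2^32, in the last 4 bytes of that member, in little-endian order. If the file is several gzip members concatenated together, as described earlier in this thread, those last 4 bytes would describe only the final member. That may be why the size looks wrong, but it is an assumption and not something TR has confirmed. A minimal sketch for reading that field from a saved file, so it can be compared with the real decompressed size (GzipTrailer and ReadTrailingSize are hypothetical names):

```csharp
using System;
using System.IO;

static class GzipTrailer
{
    // Returns the 32-bit size field stored in the last 4 bytes of a gzip file.
    // Per RFC 1952 this is the uncompressed size of the last member, modulo 2^32.
    public static uint ReadTrailingSize(string gzPath)
    {
        using (var file = File.OpenRead(gzPath))
        {
            if (file.Length < 4)
            {
                throw new InvalidDataException("File is too short to be a gzip file.");
            }

            file.Seek(-4, SeekOrigin.End);

            var bytes = new byte[4];
            int read = 0;
            while (read < bytes.Length)
            {
                int n = file.Read(bytes, read, bytes.Length - read);
                if (n == 0)
                {
                    throw new EndOfStreamException("Unexpected end of file.");
                }
                read += n;
            }

            // Assemble the little-endian value explicitly, independent of machine byte order.
            return (uint)bytes[0]
                 | ((uint)bytes[1] << 8)
                 | ((uint)bytes[2] << 16)
                 | ((uint)bytes[3] << 24);
        }
    }
}
```

If the value returned is much smaller than the decompressed CSV, that would be consistent with a multi-member file rather than a single malformed one.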

It would be great if TR could fix their compression code so that the gzip file is correctly formed and we wouldn't have to use third-party libraries.
