question

Upvotes
Accepted
7 1 0 1

Accessing HTML content of news using the RDP .NET API

Hello,

What is the best way of accessing the HTML content of a news story? The IStoryData part of the returned IStoryResponse contains only the text/plain content of the story, in the NewsStory field.

Many thanks,

Darko Roje
Spreadex Ltd

rdp-api refinitiv-data-platform refinitiv-dataplatform-libraries

Upvote
Accepted
9.7k 49 38 60

Hi @darko.roje,

The Refinitiv Data Library for .NET doesn't make any attempt to manipulate the response from the platform. However, I would imagine the Playground takes the raw response and, when presenting it in the browser, removes <CR><LF> and other formatting sequences that do not carry content, which would explain the differences. That being said, the critical content itself should not be affected.


Upvotes
7 1 0 1

To add, I am able to see that IStoryResponse.Data is a Refinitiv.Data.Content.News.StoryData object and has a Raw field which contains the whole response from the server. However, since Refinitiv.Data.Content.News.StoryData is internal, I am unable to cast to that type to get access to the Raw field.

I see that there is an IStoryDefinition.HtmlFormat method, but calling it with true just seems to convert story.Data.NewsStory to some strange type of text/plain document, even though it has an <html> header.


Upvotes
38.1k 71 35 53

@darko.roje

From my testing, when setting HtmlFormat(true), the ContentType is text/html.

1625742866470.png




Upvotes
7 1 0 1

Hello,

Yes, you are right that the content type is text/html when using HtmlFormat(true), but even then the story actually returned is not the original HTML received from Refinitiv, but some kind of conversion to text.


Upvotes
38.1k 71 35 53

@darko.roje

I have verified the retrieved data from the API Playground.

1625816056687.png

The HTML response from the RDP .NET API is similar to the one shown in the API Playground.

1625816164459.png


You can refer to the Reference guide of the /data/news/v1/stories/{storyId} endpoint in the API Playground regarding HTML view response.



Upvotes
7 1 0 1

Hello,

Thanks for your response. For me, the HTML content retrieved via the RDP API is similar to, but not the same as, the one received by the API Playground, or indeed the one I can see in the C# debugger if I look at the returned Story. I have attached two files, "HTML from Playground.txt" and "HTML from API.txt", which demonstrate the difference.
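One way to check whether two saved responses differ only in line endings and whitespace (as suggested earlier in the thread) or in actual content is to normalize both before diffing. The following is a minimal Python sketch, not part of the RDP libraries; the helper names `normalize` and `content_diff` are hypothetical and chosen only for illustration.

```python
import difflib


def normalize(text):
    """Unify CR/LF line endings, strip trailing whitespace, drop blank lines."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return [line.rstrip() for line in lines if line.strip()]


def content_diff(playground_text, api_text):
    """Return unified-diff lines; an empty list means only formatting differed."""
    return list(
        difflib.unified_diff(
            normalize(playground_text),
            normalize(api_text),
            fromfile="playground",
            tofile="api",
            lineterm="",
        )
    )
```

To use it, read each saved response into a string and pass both to `content_diff`. If the result is empty, the two responses differ only in line endings or blank lines; otherwise the diff lines show where the markup itself diverges.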



Upvotes
7 1 0 1

Thank you. I have figured out that when HtmlFormat is on, the library changes the Accept header of the request, and the server sends back different HTML than it does when the Accept header is text/plain. This explains what I was seeing.
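For readers who want to reproduce this at the HTTP level, the sketch below builds (but does not send) a request to the /data/news/v1/stories/{storyId} endpoint with either Accept value. It is written in Python with the standard library only and is not the .NET library's implementation. The base URL, the bearer-token Authorization header, and the function name `build_story_request` are assumptions made for illustration.

```python
import urllib.parse
import urllib.request

# Assumption: base URL of the RDP REST gateway; adjust for your environment.
BASE_URL = "https://api.refinitiv.com"


def build_story_request(story_id, access_token, html=False):
    """Build a story request whose Accept header selects HTML or plain text."""
    accept = "text/html" if html else "text/plain"
    # Percent-encode the story ID so characters such as ':' are safe in the path.
    path = "/data/news/v1/stories/" + urllib.parse.quote(story_id, safe="")
    return urllib.request.Request(
        BASE_URL + path,
        headers={
            "Accept": accept,
            # Assumption: the token is passed as a standard bearer token.
            "Authorization": "Bearer " + access_token,
        },
    )
```

Sending two such requests for the same story (for example with `urllib.request.urlopen`) and comparing the bodies should show whether the server itself returns different markup for each Accept value, independently of any client library.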
