How to retrieve news story via Eikon Data API, if the underlying news is a PDF file

Hi team,

How can I retrieve a news story via the Eikon Data API if the underlying news is a PDF file? Many thanks.

Regards,

Sunny

Below is my Python code:

---------------------------------------------------------------------------------------------

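import eikon as ek

# ek.set_app_key(...) must have been called with a valid app key beforehand

ric = '3333.HK'
start_date = '10/5/2020'
end_date = '10/5/2021'

# Search headlines for the RIC within the date range
news_headline = ek.get_news_headlines(f'{ric} HIIS UNAUDITED OPERATING',
                                      date_from=start_date,
                                      date_to=end_date,
                                      count=100)

# Fetch the body of the third headline; .iloc selects by position
news_body = ek.get_news_story(news_headline['storyId'].iloc[2])
news_body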


@sunny.to@refinitiv.com

According to the response on this thread, this feature is still not available in the Eikon Data API.

However, I can get the PDF file with this URL:

https://newsfile.refinitiv.com/getnewsfile/v1/story?guid=urn:newsml:reuters.com:20210701:nHKS6BgsFL

The same link appears, in truncated form, inside the news body:

'<div class="storyContent" lang="en"><p>UNAUDITED OPERATING STATISTICS OF PROPERTIES OF THE GROUP FOR JUNE 2021(with URL)</p><p class="line-break"><br/></p><p class="line-break"><br/></p><p class="line-break"><br/></p><p class="line-break"><br/></p><p>Exchange T1 category code 10000:"Announcements and Notices"</p><p class="line-break"><br/></p><p class="line-break"><br/></p><p>Exchange T2 category code 19800:"Other – Trading Update"</p><p class="line-break"><br/></p><p class="line-break"><br/></p><p><a href="reuters://screen/verb=Open/url=cpurl%3A%2F%2Fviews.cp.%2Fnewsfile%2Fgetnewsfile%2Fv1%2Fstory%3Fguid%3Durn%3Anewsml%3Areuters.com%3A20210701%3AnHKS6BgsFL" data-type="cpurl" data-cpurl="cpurl://views.cp./newsfile/getnewsfile/v1/story?guid=urn:newsml:reuters.com:20210701:nHKS6BgsFL" translate="no">http://newsfile.refinitiv.com/getnewsfile/v1/story...</a></p><p class="line-break"><br/></p><p class="line-break"><br/></p><p class="line-break"><br/></p><p>Double click on the URL above to view the article. Please note that internet access is required. If you experience problem accessing the internet, please consult your network administrator or technical support</p><p class="line-break"><br/></p><p class="line-break"><br/></p><p class="line-break"><br/></p><p>Latest version of Adobe Acrobat reader is recommended to view PDF files.  The latest version of the reader can be obtained from <a href="http://www.adobe.com/products/acrobat/readstep2.html" data-type="url" class="tr-link" translate="no">http://www.adobe.com/products/acrobat/readstep2.html</a></p></div>'

Hi @Jirapongse ,

Thanks for your reply! How can I retrieve the complete link using Python code? It appears to be split up in the news body.

Thanks,

Danni


@Danni Qiu

You need to parse the news body to get the link.

For example, you can use the BeautifulSoup library to parse the news body.

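from bs4 import BeautifulSoup

# Name the parser explicitly to avoid a warning and parser-dependent results
parsed_html = BeautifulSoup(news_body, 'html.parser')

# The PDF link is the <a> tag whose data-type attribute is "cpurl"
link = parsed_html.find('a', attrs={'data-type': 'cpurl'})
link_cpurl = link['data-cpurl']

# Join the visible (truncated) URL with the query string from data-cpurl
link_text = link.text.split('...')[0] + "?" + link_cpurl.split('?')[1]
link_text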

This is just sample code and may not work in all cases.
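
If BeautifulSoup is not installed, or a story contains no PDF link at all, a slightly more defensive variant may help. The sketch below uses only the Python standard library and rebuilds the URL from the guid in data-cpurl rather than from the truncated link text. The names CpurlCollector, extract_pdf_links and NEWSFILE_BASE are made up for this example, and the base URL is assumed from the working link quoted earlier in this thread; other stories may use a different layout.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs, urlencode

# Assumption: base endpoint copied from the working URL quoted in this thread.
NEWSFILE_BASE = "https://newsfile.refinitiv.com/getnewsfile/v1/story"


class CpurlCollector(HTMLParser):
    """Collect the data-cpurl value of every <a data-type="cpurl"> tag."""

    def __init__(self):
        super().__init__()
        self.cpurls = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("data-type") == "cpurl":
            cpurl = attr_map.get("data-cpurl")
            if cpurl:
                self.cpurls.append(cpurl)


def extract_pdf_links(story_html):
    """Return the newsfile URLs found in a story body (empty list if none)."""
    collector = CpurlCollector()
    collector.feed(story_html)
    links = []
    for cpurl in collector.cpurls:
        # Take the guid from the query string of the cpurl
        guids = parse_qs(urlsplit(cpurl).query).get("guid")
        if guids:
            # Keep ':' unescaped so the URL matches the form shown above
            links.append(NEWSFILE_BASE + "?" + urlencode({"guid": guids[0]}, safe=":"))
    return links
```

Calling extract_pdf_links(news_body) returns a list, so a story without a cpurl link gives an empty list instead of raising an error.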
