I have fetched news stories into a dataframe and saved it to a CSV. The story text contains HTML tags. Is there a way I can get the stories as plain text?
import eikon as ek
import pandas as pd

# Search expression: facility terms AND construction/expansion terms
syntax1 = "(Hospital OR Health Center OR Medical center OR health system OR university hospital OR Emergency Department OR Inpatient OR Rehabilitat OR ICU ) AND ( build OR reopen OR construct OR expansion OR upgrade OR develop OR repurpose OR modern )"

df = ek.get_news_headlines(syntax1, 100, date_from="2021-03-25T00:00:00", date_to="2021-04-10T00:00:00")

# Fetch the full story for each headline (the story text comes back with HTML tags)
rows = []
for index, headline_row in df.iterrows():
    story = ek.get_news_story(headline_row['storyId'])
    rows.append({'DATE': index, 'STORY': story})

# Build the frame from a list; DataFrame.append was removed in pandas 2.0
stories = pd.DataFrame(rows, columns=['DATE', 'STORY']).set_index('DATE')
result = pd.concat([df, stories], axis=1)
result.to_csv("news.csv")
The resulting dataframe looks like this. I want to get rid of the HTML tags.
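
For reference, here is a minimal sketch of one way the tags could be stripped, using only Python's standard-library html.parser. The names html_to_text, _TextExtractor and BLOCK_TAGS are hypothetical helpers made up for this illustration, not part of the Eikon API, and the sketch assumes each story is a plain HTML string. It keeps all text nodes (including any script or style content), so it is a starting point rather than a complete cleaner.

```python
from html.parser import HTMLParser

# Tags treated as line/paragraph boundaries (an assumption for this sketch)
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "pre"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags around them."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._chunks.append(" ")  # keep words in adjacent blocks apart

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace into single spaces
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string; non-strings pass through."""
    if not isinstance(html, str):
        return html
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()
```

With that helper defined, the STORY column could be cleaned before writing the CSV with something like result['STORY'] = result['STORY'].apply(html_to_text). A dedicated HTML library such as BeautifulSoup is another option if the markup turns out to be messy.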