I am downloading data in a bulk for all public firms in 47 countries, I used the add-in first to download some data but when I realized that this method was inefficient, I moved to Phyton. I used exactly the same code to download the data in the add-in and in Phyton, but for some reason, when I downloaded through Phyton some data of some variables for some specific years would be completely missing. I know this because I compared the data downloaded through phyton with the data downloaded previously by using de add-in.
More notably, all the data of firms from China ('CN') in most of variables I tried to download were missing from 1999 to 2016 and for some reason, all the data from the same variables was completely downloaded from 2017 to 2020. All using the same CODE in one run! The same happened for Latin American countries ("'PE', 'CL', 'BR', 'CO', 'MX', 'AR', 'EC'") in 2013, but this happened ONLY for 2013, the data was completely downloaded for all variables in the rest of the years.
I actually run it again to see whether this was a random occurrence and it was not, it happened again every time I run the code! I tried also to download using only the year (e.g. 2013) instead of CY2013, and then, the data is downloaded completely but I am not sure whether the default is CY and not fiscal year. Moreover, the data for market capitalization changes completely when I used only the year (e.g. 2013) instead of CY2013.
I would really appreciate any guidance in this respect, considering that I am downloading now the annual data, but I am planning to download the quarterly data soon and there will be more scope for problems in the downloading process...
I highly appreciate any guidance and thank you in advance for the attention given to this question
I am using this code in phyton:
#=========
#Regions to be downloaded
#You can take out the regions that you do not need anymore here
region_list = [g7, adv_other, asia, lac, asia_CN, asia_IN] #each region has the ISO2 codes of the countries from which I am getting the data
#=========
#Variables to be downloaded
fields = ''
fields = fields + 'TR.HeadquartersCountry' + ' ' +'TR.TotalAssetsReported' + ' ' + 'TR.TotalDebtOutstanding' + ' ' + 'TR.TotalRevenue' + ' ' + 'TR.TotalLiabilities' + ' '
fields = fields + 'TR.CashandEquivalents' + ' ' + 'TR.CashandSTInvestments' + ' ' + 'TR.PptyPlantEqpmtTtlGross' + ' '
fields = fields + 'TR.PropertyPlantEquipmentTotalNet' + ' ' + 'TR.CapitalExpenditures' + ' '
fields = fields + 'TR.InterestExpense' + ' ' + 'TR.CostofRevenueTotal' + ' ' + 'TR.SgaExpenseTotal' + ' ' + 'TR.TotalEquity' + ' '
fields = fields + 'TR.CapitalExpendituresCFStmt' + ' ' + 'TR.TotalCurrentAssets' + ' ' + 'TR.TotalCurrentLiabilities' + ' '
fields = fields + 'TR.TotalLongTermDebt' + ' ' + 'TR.HistPE' + ' '
fields = fields + 'TR.EBITDA' + ' '
fields = fields + 'TR.EBIT' + ' ' + 'TR.ROATotalAssetsPercent' + ' ' + 'TR.TotalInventory' + ' '
fields = fields + 'TR.NetIncomeCFStmt' + ' ' + 'TR.LTDebt' + ' ' + 'TR.TangibleBookValueRptd' + ' ' + 'TR.HQCountryCode'
fields_series = fields.split()
#=========
# Time period of the data
#yearly data
periods = ''
for yr in range(1999, 2021, 1):
periods = periods + ' ' + 'CY'+ str(yr)
periods_yr = periods.split()
#=========
#Some values to store
currency = 'USD'
scale = 6
#=========
#Downloading the data
for region in region_list:
#saving the name of the region
region_name = [ k for k,v in locals().items() if v is region ][0]
print("Refinitiv is downloading data to " + region_name)
for j in periods_yr: #for j in range(len(periods)):
df1, err = ek.get_data(firms_ric_list[region_name],
'TR.CompanyMarketCap',
{'Scale': scale,
#'FRQ' : frequency,
'SDate': j,
'Curn' : currency,
'NULL':'BLANK',
'FXRate':'PeriodEnd'})
dfj, err = ek.get_data(firms_ric_list[region_name],
fields_series,
{'Scale': scale,
#'FRQ' : frequency,
'Curn' : currency,
'ConsolBasis':'Consolidated',
'ReportType':'Final',
'reportingstate':'Rsdt',
'NULL':'BLANK',
'FXRate':'PeriodEnd',
'Period': j})
df1 = pd.merge(df1, dfj, on='Instrument')
#including all other vars in the excel sheet
with pd.ExcelWriter(os.path.join(d_bs_yr_dir, 'bs_' + region_name + '.xlsx'), engine='openpyxl',mode='a') as writer:
df1.to_excel(writer, sheet_name= j, index=True)
#printing the batch name
print( j + " was downloaded")