Marcelo Perlin, Guilherme Kirch, Daniel Vancin
Unisinos/Porto Alegre - January 19, 2018
GetDFPData
provides an open R interface to all financial statements distributed by B3 and CVMJournal | Surveyed | Total | Percent |
---|---|---|---|
Contabilidade Vista & Revista | 19 | 80 | 0.24 |
Revista Brasileira de Finanças | 9 | 85 | 0.11 |
Revista Contabilidade & Finanças | 34 | 118 | 0.29 |
Revista Contabilidade e Organizações | 22 | 82 | 0.27 |
Revista Contemporânea de Contabilidade | 21 | 112 | 0.19 |
GetDFPData
Made by researchers, to researchers
Always tabular format (tidy data principles (Wickham 2014)) and always the correct class in R
R and web/shiny interface
Open and free!
GetDFPData
A web version was developed and published in the internet as a shiny application.
GetDFPData
# Release version in CRAN
install.packages('GetDFPData')
GetDFPData
The starting point of GetDFPData
is looking for information about current companies.
library(GetDFPData)
library(tibble)
df.info <- gdfpd.get.info.companies(type.data = 'companies')
## Found cache file. Loading data..
glimpse(df.info)
## Observations: 691
## Variables: 8
## $ name.company <chr> "521 PARTICIPAÇOES S.A. - EM LIQUIDAÇÃO EXTRAJ...
## $ id.company <int> 16330, 16284, 108, 20940, 21725, 19313, 18970,...
## $ situation <chr> "ATIVO", "ATIVO", "CANCELADA", "CANCELADA", "A...
## $ listing.segment <chr> NA, "Tradicional", "Tradicional", "Tradicional...
## $ main.sector <chr> NA, "Financeiro e Outros", "Materiais Básicos"...
## $ tickers <chr> NA, "QVQP3B", NA, NA, "ADHM3", "AELP3", "TIET1...
## $ first.date <date> 1998-12-31, 2001-12-31, 2009-12-31, 2009-12-3...
## $ last.date <date> 2016-12-31, 2016-12-31, 2009-12-31, 2009-12-3...
The current number of active and inactive companies, as of 2018-01-19, is available in column situation
. From it, we can see that there are 519 active companies and 167 cancelled.
We can also look at the distribution of sectors of traded companies at B3:
t.sector <- table(df.info$main.sector)
print(as.data.frame(sort(t.sector, decreasing = TRUE)))
## Var1 Freq
## 1 Financeiro e Outros 145
## 2 Consumo Cíclico 92
## 3 Bens Industriais 80
## 4 Utilidade Pública 74
## 5 Materiais Básicos 47
## 6 Consumo não Cíclico 35
## 7 Não Classificados 28
## 8 Saúde 18
## 9 Construção e Transporte 14
## 10 Petróleo. Gás e Biocombustíveis 13
## 11 Tecnologia da Informação 10
## 12 Telecomunicações 8
## 13 Petróleo, Gás e Biocombustíveis 2
The distribution of current governance listings is also available:
t.listings <- table(df.info$listing.segment)
print(as.data.frame(sort(t.listings, decreasing = TRUE)))
## Var1 Freq
## 1 Tradicional 359
## 2 Novo Mercado 143
## 3 Corporate Governance - Level 1 27
## 4 Corporate Governance - Level 2 21
## 5 Bovespa Mais 14
## 6 Bovespa Mais - Level 2 2
Tickers of companies are also available in column tickers
. As an example, we can check all currently traded stocks of Gerdau:
tickers.Gerdau <- df.info$tickers[which(df.info$name.company == 'GERDAU S.A.')]
print(tickers.Gerdau)
## [1] "GGBR3;GGBR4"
Every company in the database can be identified by its official name. Function gitrd.search.company
allow the user to search for the official name of a company.
Let’s find the official name of Ambev, one of the largest companies in Brazil:
my.names <- gdfpd.search.company('ambev')
## Found cache file. Loading data..
##
## Found 2 companies:
## COMPANHIA DE BEBIDAS DAS AMÉRICAS-AMBEV | situation = CANCELADA | first date = 2010-12-31 | last date - 2012-12-31
## AMBEV S.A. | situation = ATIVO | first date = 2012-12-31 | last date - 2016-12-31
Its official name in Bovespa records is AMBEV S.A..
name.companies <- 'AMBEV S.A.'
first.date <- '2012-12-31'
last.date <- '2016-12-31'
df.reports <- gdfpd.GetDFPData(name.companies = name.companies,
first.date = first.date,
last.date = last.date)
## Found cache file. Loading data..
##
## Downloading data for 1 companies
## First Date: 2012-12-31
## Laste Date: 2016-12-31
## Inflation index: dollar
##
## Downloading inflation data
## Caching inflation RDATA into tempdir() Done
##
## Inputs looking good! Starting download of files:
##
## AMBEV S.A.
## Available periods: 2016-12-31 2015-12-31 2014-12-31 2013-12-31 2012-12-31
##
##
## Processing 23264 - AMBEV S.A.
## Finding info from Bovespa
## Found cache file DFP Cache Folder/23264_AMBEV S./GetDFPData_BOV_cache_23264_AMBE.rds
## Processing 23264 - AMBEV S.A. | date 2016-12-31
## Acessing DFP data | Found DFP cache file
## Acessing FRE data | Found FRE cache file
## Acessing FCA data | Found FCA cache file
## Processing 23264 - AMBEV S.A. | date 2015-12-31
## Acessing DFP data | Found DFP cache file
## Acessing FRE data | Found FRE cache file
## Acessing FCA data | Found FCA cache file
## Processing 23264 - AMBEV S.A. | date 2014-12-31
## Acessing DFP data | Found DFP cache file
## Acessing FRE data | Found FRE cache file
## Acessing FCA data | Found FCA cache file
## Processing 23264 - AMBEV S.A. | date 2013-12-31
## Acessing DFP data | Found DFP cache file
## Acessing FRE data | Found FRE cache file
## Acessing FCA data | Found FCA cache file
## Processing 23264 - AMBEV S.A. | date 2012-12-31
## Acessing DFP data | Found DFP cache file
## Acessing FRE data | Found FRE cache file
## Acessing FCA data | Found FCA cache file
The resulting object from gdfpd.GetDFPData
is a tibble
, a data.frame type of object that allows for list columns (Müller and Wickham 2017):
glimpse(df.reports)
## Observations: 1
## Variables: 33
## $ company.name <chr> "AMBEV S.A."
## $ company.code <int> 23264
## $ company.tickers <chr> "ABEV3"
## $ min.date <date> 2012-12-31
## $ max.date <date> 2016-12-31
## $ n.periods <int> 5
## $ company.segment <chr> "Tradicional"
## $ current.stockholders <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ current.stock.composition <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.assets <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.liabilities <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.income <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.cashflow <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.assets.consolidated <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.liabilities.consolidated <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.income.consolidated <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ fr.cashflow.consolidated <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.dividends <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.stockholders <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.capital.issues <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.mkt.value <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.capital.increases <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.capital.reductions <list> [<>]
## $ history.stock.repurchases <list> [<c("AMBEV S.A.", "AMBEV S.A.")...
## $ history.other.stock.events <list> [<AMBEV S.A., 2015-12-31, 2013-...
## $ history.compensation <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.compensation.summary <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.transactions.related <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.debt.composition <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.governance.listings <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.board.composition <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.committee.composition <list> [<c("AMBEV S.A.", "AMBEV S.A.",...
## $ history.family.relations <list> [<c("AMBEV S.A.", "AMBEV S.A.")...
All financial statements for the different quarters are available within df.reports
. For example, the income statements for all desired quarters of AMBEV are:
df.income.long <- df.reports$fr.income[[1]]
glimpse(df.income.long)
## Observations: 130
## Variables: 6
## $ name.company <chr> "AMBEV S.A.", "AMBEV S.A.", "AMBEV S.A.", "...
## $ ref.date <date> 2016-12-31, 2016-12-31, 2016-12-31, 2016-1...
## $ acc.number <chr> "3.01", "3.02", "3.03", "3.04", "3.04.01", ...
## $ acc.desc <chr> "Receita de Venda de Bens e/ou Serviços", "...
## $ acc.value <dbl> 20634193, -11263620, 9370573, 4006806, -399...
## $ acc.value.infl.adj <dbl> 6332420.75, -3456688.66, 2875732.09, 122964...
If desired, the wide format can be forced in the resulting financial reports with function gdfpd.convert.to.wide
:
temp.df <- dplyr::filter(df.income.long, ref.date >= as.Date('2015-01-31'))
df.income.wide <- gdfpd.convert.to.wide(temp.df)
knitr::kable(df.income.wide)
acc.number | acc.desc | name.company | 2015-12-31 | 2016-12-31 |
---|---|---|---|---|
3.01 | Receita de Venda de Bens e/ou Serviços | AMBEV S.A. | 22106965 | 20634193 |
3.02 | Custo dos Bens e/ou Serviços Vendidos | AMBEV S.A. | -11431627 | -11263620 |
3.03 | Resultado Bruto | AMBEV S.A. | 10675338 | 9370573 |
3.04 | Despesas/Receitas Operacionais | AMBEV S.A. | 3371176 | 4006806 |
3.04.01 | Despesas com Vendas | AMBEV S.A. | -3591038 | -3999531 |
3.04.02 | Despesas Gerais e Administrativas | AMBEV S.A. | -1277065 | -1190964 |
3.04.03 | Perdas pela Não Recuperabilidade de Ativos | AMBEV S.A. | 0 | 0 |
3.04.04 | Outras Receitas Operacionais | AMBEV S.A. | 1974913 | 2631206 |
3.04.05 | Outras Despesas Operacionais | AMBEV S.A. | -530970 | -245490 |
3.04.06 | Resultado de Equivalência Patrimonial | AMBEV S.A. | 6795336 | 6811585 |
3.05 | Resultado Antes do Resultado Financeiro e dos Tributos | AMBEV S.A. | 14046514 | 13377379 |
3.06 | Resultado Financeiro | AMBEV S.A. | -1304158 | -2509282 |
3.06.01 | Receitas Financeiras | AMBEV S.A. | 3988762 | 2502321 |
3.06.02 | Despesas Financeiras | AMBEV S.A. | -5292920 | -5011603 |
3.07 | Resultado Antes dos Tributos sobre o Lucro | AMBEV S.A. | 12742356 | 10868097 |
3.08 | Imposto de Renda e Contribuição Social sobre o Lucro | AMBEV S.A. | -318585 | 1678513 |
3.08.01 | Corrente | AMBEV S.A. | 1920185 | 2157328 |
3.08.02 | Diferido | AMBEV S.A. | -2238770 | -478815 |
3.09 | Resultado Líquido das Operações Continuadas | AMBEV S.A. | 12423771 | 12546610 |
3.10 | Resultado Líquido de Operações Descontinuadas | AMBEV S.A. | 0 | 0 |
3.10.01 | Lucro/Prejuízo Líquido das Operações Descontinuadas | AMBEV S.A. | 0 | 0 |
3.10.02 | Ganhos/Perdas Líquidas sobre Ativos de Operações Descontinuadas | AMBEV S.A. | 0 | 0 |
3.11 | Lucro/Prejuízo do Período | AMBEV S.A. | 12423771 | 12546610 |
3.99 | Lucro por Ação - (Reais / Ação) | AMBEV S.A. | 0 | 0 |
3.99.01 | Lucro Básico por Ação | AMBEV S.A. | 0 | 0 |
3.99.02 | Lucro Diluído por Ação | AMBEV S.A. | 0 | 0 |
GetDFPData
is specially designed for handling large scale download of data. Let’s build a case with three randomly selected active companies:
my.companies <- c("COMPANHIA CELG DE PARTICIPAÇÕES", "DOHLER S.A.", "KLABIN S.A.")
first.date <- '2012-01-31'
last.date <- '2017-01-31'
df.reports <- gdfpd.GetDFPData(name.companies = my.companies,
first.date = first.date,
last.date = last.date)
And now we can check the resulting tibble
:
glimpse(df.reports )
## Observations: 3
## Variables: 33
## $ company.name <chr> "KLABIN S.A.", "COMPANHIA CELG D...
## $ company.code <int> 12653, 21393, 5207
## $ company.tickers <chr> "KLBN11;KLBN3;KLBN4", "GPAR3", "...
## $ min.date <date> 2012-12-31, 2012-12-31, 2012-12-31
## $ max.date <date> 2016-12-31, 2016-12-31, 2016-12-31
## $ n.periods <int> 5, 5, 5
## $ company.segment <chr> "Corporate Governance - Level 2"...
## $ current.stockholders <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ current.stock.composition <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.assets <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.liabilities <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.income <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.cashflow <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.assets.consolidated <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.liabilities.consolidated <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.income.consolidated <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ fr.cashflow.consolidated <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.dividends <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.stockholders <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.capital.issues <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.mkt.value <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.capital.increases <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.capital.reductions <list> [<>, <c("COMPANHIA CELG DE PART...
## $ history.stock.repurchases <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.other.stock.events <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.compensation <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.compensation.summary <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.transactions.related <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.debt.composition <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.governance.listings <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.board.composition <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.committee.composition <list> [<c("KLABIN S.A.", "KLABIN S.A....
## $ history.family.relations <list> [<c("KLABIN S.A.", "KLABIN S.A....
gdfpd.GetDFPData
Function gdfpd.GetDFPData
provides many options to the user. Here we will describe and explain its features.
The package also includes function gdfpd.export.DFP.data
for exporting the financial data in the Excel or csv format.
Next we present an example of saving the output as an Excel file.
my.basename <- 'MyExcelData'
my.type.export <- 'xlsx' # only supported so far
gdfpd.export.DFP.data(df.reports = df.reports,
base.file.name = my.basename,
type.export = my.type.export)
## File MyExcelData.xlsx already exists. Deleting it..
## Start Copying Dataframes to MyExcelData.xlsx
## Copying table metadata - 3 rows, 5 columns
## Copying table metadata - 3 rows, 5 columns
## Copying table current.stock.composition - 9 rows, 4 columns
## Copying table fr.assets - 240 rows, 6 columns
## Copying table fr.liabilities - 375 rows, 6 columns
## Copying table fr.income - 400 rows, 6 columns
## Copying table fr.cashflow - 316 rows, 6 columns
## Copying table fr.assets.consolidated - 225 rows, 6 columns
## Copying table fr.liabilities.consolidated - 390 rows, 6 columns
## Copying table fr.income.consolidated - 504 rows, 6 columns
## Copying table fr.cashflow.consolidated - 350 rows, 6 columns
## Copying table history.dividends - 146 rows, 12 columns
## Copying table history.stockholders - 71 rows, 14 columns
## Copying table history.capital.issues - 13 rows, 5 columns
## Copying table history.mkt.value - 14 rows, 5 columns
## Copying table history.capital.increases - 23 rows, 8 columns
## Copying table history.capital.reductions - 3 rows, 12 columns
## Copying table history.stock.repurchases - 8 rows, 12 columns
## Copying table history.other.stock.events - 4 rows, 8 columns
## Copying table history.compensation - 43 rows, 18 columns
## Copying table history.compensation.summary - 43 rows, 9 columns
## Copying table history.transactions.related - 48 rows, 13 columns
## Copying table history.debt.composition - 27 rows, 9 columns
## Copying table history.governance.listings - 15 rows, 5 columns
## Copying table history.board.composition - 316 rows, 17 columns
## Copying table history.committee.composition - 10 rows, 15 columns
## Copying table history.family.relations - 18 rows, 8 columns
##
## Export sucessful. Data available at file MyExcelData.xlsx
## Warning: Ignoring unknown parameters: binwidth, bins, pad
## Warning: Ignoring unknown parameters: binwidth, bins, pad
GetDFPData
has a clear impact, facilitating access to a large dataset from B3
Researchers will benefit with more time for research and less time for data acquisition and cleaning
The software is in constant development
Müller, Kirill, and Hadley Wickham. 2017. Tibble: Simple Data Frames. https://CRAN.R-project.org/package=tibble.
Wickham, Hadley. 2014. “Tidy Data.” The Journal of Statistical Software 59 (10). http://www.jstatsoft.org/v59/i10/.