Brazilian Telecommunications Data

Data Collection and Cleaning

Author

Thiago Alckmin

Published

March 15, 2023

Abstract

This file documents the collection and extraction of telecommunications data for all municipalities in Brazil since the start of the century. The objective is to compile a municipality-year-month panel that documents each municipality’s access to communication between 2000 and 2020. However, we were only able to find municipality-year-month communications data starting in 2007.

Originally, the plan was to find the date on which 3G data first became available in each municipality. This information was not readily available. Since the true objective is to gauge connectivity, we implement an arguably better approach than relying on 3G start dates alone.

We collect granular access-point contract information for all telecom providers in Brazil since 2007 using Anatel’s API and compute the total number of accesses (number of communication contracts) for each municipality-year-month. We then collect and clean yearly population estimates for each municipality from IBGE to control for a municipality’s overall size.
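The aggregation described above can be sketched as follows. This is a minimal Python illustration, not the code in clean_telecoms_data.R. The field names (`municipality_id`, `year`, `month`, `accesses`), the toy records, and the choice to divide by population are all assumptions made for the example; the source only states that population is used to control for size.

```python
from collections import defaultdict

def total_accesses(records):
    """Sum accesses over all contracts in each (municipality, year, month) cell."""
    totals = defaultdict(int)
    for r in records:
        key = (r["municipality_id"], r["year"], r["month"])
        totals[key] += r["accesses"]
    return dict(totals)

def accesses_per_capita(totals, population):
    """Scale monthly totals by the yearly population estimate.

    Cells without a population estimate are skipped rather than guessed.
    """
    scaled = {}
    for (mun, year, month), n in totals.items():
        pop = population.get((mun, year))
        if pop:
            scaled[(mun, year, month)] = n / pop
    return scaled

# Hypothetical micro-data: one row per contract group (values are made up).
records = [
    {"municipality_id": "mun_a", "year": 2007, "month": 1, "accesses": 120},
    {"municipality_id": "mun_a", "year": 2007, "month": 1, "accesses": 80},
    {"municipality_id": "mun_a", "year": 2007, "month": 2, "accesses": 210},
    {"municipality_id": "mun_b", "year": 2007, "month": 1, "accesses": 15},
]
# Hypothetical yearly population estimates keyed by (municipality, year).
population = {("mun_a", 2007): 1000}

totals = total_accesses(records)
per_capita = accesses_per_capita(totals, population)
```

In this sketch `mun_b` appears in `totals` but not in `per_capita`, because it has no population estimate for 2007. Whether to drop or flag such cells is a design choice the source does not specify.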

There are many alternative ways to proxy the flow of communications in and out of a municipality. Our method ignores differences in telecommunication technologies, transmission mechanisms, operators, and speed, though we have this information on hand in case a more refined approach is preferred. For example, we could restrict attention to land-line, mobile, or internet connections, or introduce speed into the measure.
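A refinement of this kind could look like the sketch below, which restricts the aggregation to selected connection types before summing. The `service_type` field and its values ("mobile", "landline", "internet") are hypothetical labels chosen for illustration; the actual categories in the Anatel micro-data may differ.

```python
from collections import defaultdict

def total_accesses_by_type(rows, keep_types):
    """Sum accesses per (municipality, year, month), keeping only chosen service types."""
    sums = defaultdict(int)
    for row in rows:
        if row["service_type"] in keep_types:
            key = (row["municipality_id"], row["year"], row["month"])
            sums[key] += row["accesses"]
    return dict(sums)

# Hypothetical rows (values are made up for the example).
rows = [
    {"municipality_id": "mun_a", "year": 2007, "month": 1,
     "service_type": "mobile", "accesses": 120},
    {"municipality_id": "mun_a", "year": 2007, "month": 1,
     "service_type": "landline", "accesses": 80},
    {"municipality_id": "mun_a", "year": 2007, "month": 1,
     "service_type": "internet", "accesses": 30},
]

mobile_only = total_accesses_by_type(rows, {"mobile"})
mobile_or_internet = total_accesses_by_type(rows, {"mobile", "internet"})
```

Passing the set of types as an argument keeps the baseline measure (all types) and any refined measure on the same code path, so the two panels stay comparable.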

The current final data set is available on the server at /politics_at_work/analysis/data/telecoms/processing/accesses_per_municipio_year_month.csv. The clean Anatel micro-data is at /politics_at_work/analysis/data/telecoms/processing/anatel_microdata_full.csv. The clean population data is at /politics_at_work/analysis/data/telecoms/input/population_mun_year.csv. The script used to download and clean the data is clean_telecoms_data.R. The GitHub repository for this sub-project is https://github.com/Thiago-Alckmin/telecoms; it is private, so you need to request access to view it.