Chapter 1 Trade and Economy

1.1 Data Download

For this chapter, we’ll be using data from the UN Comtrade. The data file can be downloaded directly here.

Alternatively, the same data file can be downloaded from the UN Comtrade website using the steps listed below. The UN Comtrade website is a great, easy-to-use resource which hosts rich trade data. To familiarize yourself more with this resource, you can practice using their interface by downloading the data set yourself using the following steps:

  • First create a Comtrade account. The account is free, and this permits you to download the different data sets.
  • Once you are logged into your account, go back to the Comtrade home page.
  • Under the data quick search, click the Advanced Search Page option.
  • Make the following selections:
    • Type of product: Goods
    • Reporting frequency: Annual
    • Commodity classification: HS
    • HS (as reported) commodity codes: TOTAL (default)
    • Reporters: All (default)
    • Trade Flows: select both Export and Import
    • Periods: select 2020, 2019, 2018, 2017 and 2016
    • Partners: World (default)
    • Keep defaults for all other parameters
    • Preview and then Download the dataset to your local computer

Note: How much data you can download through this interface in one go depends on the type of your subscription. Please check the corresponding section of UN Comtrade website for further details and download options.

Now the data should be downloaded, either directly as a CSV file from a link from above or by following the steps for gathering the data via UN Comtrade website.

Additionally, before we start handling the data, we need to load a few R packages.

#install.packages("tidyverse")
library(tidyverse) # for manipulating data
#install.packages("readxl")
library(readxl) # for reading in data files in a clean format
#install.packages("WDI")
library(WDI) # for accessing the WDI API
options(scipen = 100) # turn off scientific notation

Set the path to your directory in which you will save the downloaded datasets used in this and the subsequent sections.

data_path <- "your/path/to/data/"

Now that we’ve activated the packages we need, we will read in the trade data. You will need to include the correct link to the data file stored on your computer. Then, we will drop columns we don’t need and rename the ones that we need. Now the data file is prepped and ready.

# import the data
data <- read_csv(paste0(data_path, "TradeData_all-world-16-20.csv"))

# select the required columns
data <- data %>% 
  select(RefYear, ReporterDesc, FlowDesc, PartnerDesc, PrimaryValue)

# rename the columns
data <- data %>% 
  rename(year = RefYear, reporter = ReporterDesc, trade_direction = FlowDesc, 
         partner = PartnerDesc, trade_value_usd = PrimaryValue)

data
## # A tibble: 1,642 × 5
##     year reporter    trade_direction partner trade_value_usd
##    <dbl> <chr>       <chr>           <chr>             <dbl>
##  1  2016 Afghanistan Import          World       6534140413 
##  2  2016 Afghanistan Export          World        596455337 
##  3  2016 Albania     Import          World       4669289913 
##  4  2016 Albania     Export          World       1962117416 
##  5  2016 Algeria     Import          World      47090683586 
##  6  2016 Algeria     Export          World      29992101470 
##  7  2016 Andorra     Import          World       1354095937.
##  8  2016 Andorra     Export          World         96912959.
##  9  2016 Angola      Import          World      14347710498.
## 10  2016 Angola      Export          World      28057499522.
## # … with 1,632 more rows

1.2 Trade Dependence Index

What does it tell us? The trade dependence index (also often called the openness index) is a measure of the importance of international trade in the overall economy. It can give an indication of the degree to which an economy is open to trade (subject to some serious limitations).

Definition: The value of total trade (imports plus exports) as a percentage of GDP.

Mathematical definition:

\(TDI = \frac{\sum_{s} X_{ds} + \sum^{}_{s} M_{sd}}{GDP_{d}}\times100\)

where d is the country under study, s is the set of all other countries, X is total bilateral exports, M is total bilateral imports and GDP is gross domestic product (of country d). In words, the numerator is total exports from d plus total imports to d, and the denominator is the GDP of d.

Range of values: Takes values between 0 and +∞.

Limitations: Openness of an economy is determined by a large number of factors, most importantly by trade restrictions like tariffs, non-tariff barriers, foreign exchange regimes, non-trade policies and the structure of national economies. The share of trade transactions in a country’s value added is a result of all these factors. It is possible that an open and liberalized economy has a relatively small TDI, if a large proportion of its GDP is created by non-traded activities supported by the domestic market. Low trade dependence may indicate high trade restrictions either in that country or toward that country in the overseas markets, or both.

Let’s calculate the TDI for Indonesia in the time period from 2016 through to 2020. Note that we don’t need to specify the years since the data file already spans this period.

TDI <- data %>% 
  # keep data on Indonesia only and select columns of interest
  filter(reporter == "Indonesia") %>% 
  select(year, trade_direction, trade_value_usd)

We use spread() to separate Import and Export data into separate columns. After that, we will make an API call to the WDI database to get annual GDP data for the country we are interested in. We do that using WDI() function. We then merge this additional dataset with our original TDI data frame, and we deselect columns country and iso2c, as they are redundant.

# reshape the dataset
TDI <- spread(TDI, trade_direction, trade_value_usd)

# make a call to WDI database to obtain GDP data on Indonesia over 2016-2020 period
gdp <- WDI(indicator= c("GDP" = "NY.GDP.MKTP.CD"), country="IDN", start=2016, end=2020)

# merge the additional dataset with TDI data frame and deselect redundant columns
TDI <- merge(TDI, gdp, by = "year") %>% select(year, Export, Import, GDP)

TDI
##   year       Export       Import           GDP
## 1 2016 144489796418 135652799792  931877364178
## 2 2017 168827554043 156985501122 1015618742566
## 3 2018 180215034095 188711171618 1042271531012
## 4 2019 167682995133 171275708684 1119099868265
## 5 2020 163191837311 141568761235 1058688935455

We can now calculate the Trade Dependence Index by following the operations shown in the formula and store the results into a new variable named tdi. Finally, we round the TDI value to have a clearer view of the result.

# calculate TDI for Indonesia in each examined year
TDI$tdi <- (TDI$Export + TDI$Import)/TDI$GDP*100 
 
# only keep values of the TDI index 
TDI <- TDI %>% select(year, tdi)

# round the TDI values
TDI$tdi <- round(TDI$tdi, 0)

TDI
##   year tdi
## 1 2016  30
## 2 2017  32
## 3 2018  35
## 4 2019  30
## 5 2020  29

Now let’s calculate TDI for a set of 6 economies - ASEAN-6 - in 2020, and display the result in a bar plot. Again, we use spread() to separate Import and Export data into separate columns.

# create a vector with names of selected economies
ASEAN.6 <- c("Indonesia", "Malaysia", "Philippines", "Singapore", "Thailand", "Viet Nam")

x <- data %>% 
  # filter the dataset to only keep data on trade of selected countries in 2020
  filter(reporter %in% ASEAN.6 & year == 2020) %>% 
  # keep only columns of interest
  select(reporter, trade_direction, trade_value_usd)

# reshape the data
x <- spread(x, trade_direction, trade_value_usd)

Now we make another call to the WDI database to retrieve 2020 GDP data for the selected economies. Note that we also need to change Vietnam to Viet Nam to ensure consistency of economy names. And, finally, we deselect the redundant columns.

# make a call to WB database to obtain GDP data on six examined economies
gdp <- WDI(indicator= c("GDP" = "NY.GDP.MKTP.CD"), 
           country= c("IDN", "MYS", "PHL", "SGP", "THA", "VNM"), start=2020, end=2020)

# rename Viet Nam to ensure consistency
gdp$country[gdp$country=="Vietnam"] <- "Viet Nam"

# merge the additional dataset with x data frame and deselect redundant columns
x <- merge(x, gdp, by.x = "reporter", by.y = "country") %>% 
  select(reporter, Export, Import, GDP)

x
##      reporter       Export       Import           GDP
## 1   Indonesia 163191837311 141568761235 1058688935455
## 2    Malaysia 234050267453 190404531975  337337932675
## 3 Philippines  65214435072  95066801050  361751116293
## 4   Singapore 373683730056 328624455636  345295933899
## 5    Thailand 231387919677 207695675615  499681757031
## 6    Viet Nam 281441457237 261309451921  346615750664

As was previously done, we calculate TDI for all countries and store the values into tdi column. Let’s round the calculated values to zero decimals.

# calculate TDI indices for the six economies
x$tdi <- (x$Export + x$Import)/x$GDP*100

# round the values
x$tdi <- round(x$tdi, 0)

x[, c("reporter", "tdi")]
##      reporter tdi
## 1   Indonesia  29
## 2    Malaysia 126
## 3 Philippines  44
## 4   Singapore 203
## 5    Thailand  88
## 6    Viet Nam 157

Finally, we will create a bar plot by using ggplot() function from ggplot2 package (which is part of the tidyverse package) and by adjusting it’s various arguments to shape up our plot as needed.

We will use geom_col() to display our data as a bar chart and specify the width and fill color for the columns. With labs() function we will set up the title and remove labels of the axes. We will use theme_minimal() function to apply a standard, nice-looking theme to the whole graph. Finally, by default, the plot will display countries in reverse alphabetical order. To fix this, we will use functions reorder() and desc() applied to appropriate variables within the aes() function nested within ggplot().

If you would like to better understand all arguments that are available within each specific function, you can run ?function_name().

Now let’s visualize our calculated TDI values.

# create a bar chart by selecting your variables and reordering the economies by values of TDI
TDI_plot <- ggplot(x, aes(x = tdi, y = reorder(reporter, desc(tdi)))) +
  
  # adding the bar plot to the chart area, and adjusting the width of the bars, and
  # and inactivating the legend
  # setting bar color based on value of TDI
  geom_col(aes(fill = tdi), width = 0.6, show.legend = FALSE) +
  
  # adding the chart title and removing the axis labels
  labs(title = "Trade Dependence Index for ASEAN-6 Economies (2020)", x = NULL, y = NULL) +
  
  # applying the minimal theme
  theme_minimal()

TDI_plot

As can be seen from the chart, there was a considerable variation in the degree of trade openness between these economies in 2020, with Singapore, Viet Nam and Malaysia much more dependent on trade than the other examined economies. Note that it is possible for the value of trade to exceed the value of production, hence there are index values of over 100 per cent for several economies.

1.3 Import Penetration

What does it tell us? The import penetration rate shows to what degree domestic demand (the difference between GDP and net exports) is satisfied by imports. Calculated at the sectoral level it is termed as the self-sufficiency ratio. The index may be used as the basis of specific policy objectives targeting self-sufficiency. It may provide an indication of the degree of vulnerability to certain types of external shocks.

Definition: The ratio of total imports to domestic demand, as a percentage. Range of values: Ranges from 0 (with no imports) to 100 per cent when all domestic demand is satisfied by imports only (no domestic production).

Mathematical definition:

\(IP = \frac{\sum_{s} M_{sd}}{GDP_{d}-\sum^{}_{s} X_{ds} + \sum^{}_{s} M_{sd}}\times100\)

where d is the country under study, s is the set of all other countries, X is total bilateral exports, M is total bilateral imports and GDP is gross domestic product (of country d). In words, the numerator is total imports to d, the denominator is the GDP of d less total exports and plus total imports (i.e., total domestic demand).

Limitations: The import penetration index is biased upward by re-exports (can be corrected for in principle). Will tend to be negatively correlated with economic size. A low import penetration ratio (complete self-sufficiency) may be an inappropriate policy target from an efficiency perspective.

We start again with the data object containing data from UN Comtrade. This time we’ll be calculating Import Penetration for the Philippines in years 2016-2020. The process to build our data frame is the same as above (we only select a different economy). After that we can calculate Import Penetration index for the Philippines following the operations as shown in the formula, and we round the index values.

IP <- data %>% 
  # keep trade data on the Philippines and select columns of interest
  filter(reporter == "Philippines") %>% 
    select(year, trade_direction, trade_value_usd) %>%
  # reshape the data to have separate columns for exports and imports
  spread(trade_direction, trade_value_usd)

# obtain GDP data for the Philippines over 2016-2020
gdp <- WDI(indicator= c("GDP" = "NY.GDP.MKTP.CD"), country="PHL", start=2016, end=2020)

# merge the new dataset with IP data frame
IP <- merge(IP, gdp, by = "year") %>% select(year, Export, Import, GDP)

# calculate the value for domestic demand
IP$Domestic_Demand <- IP$GDP - IP$Export + IP$Import

# calculate the IP index
IP$ip <- (IP$Import / IP$Domestic_Demand) * 100

# only keep values of the IP index, and round the values
IP <- IP %>% select(year, ip) %>% mutate(ip = round(ip, 0))
  
IP
##   year ip
## 1 2016 25
## 2 2017 28
## 3 2018 29
## 4 2019 28
## 5 2020 24

In a second example let’s calculate Import Penetration index for the same set of ASEAN-6 economies in 2020, and display the values in a tidy bar chart.

In this case we can reuse x data frame from the previous section, since it already includes the variables we need (Import, Exports and GDP). We only need to remove tdi column, as it is redundant, and create ip column to save values of the new indicator.

# remove TDI index values from the data frame
x <- x %>% select(-tdi)

# calculate domestic demand for the 6 economies
x$Domestic_Demand <- x$GDP - x$Export + x$Import

# calculate IP indices for the 6 economies, and round the values
x$ip <- (x$Import / x$Domestic_Demand) * 100
x$ip <- round(x$ip, 0)

x[, c("reporter", "ip")]
##      reporter  ip
## 1   Indonesia  14
## 2    Malaysia  65
## 3 Philippines  24
## 4   Singapore 109
## 5    Thailand  44
## 6    Viet Nam  80

Again let’s create the bar chart.

# create a bar chart by selecting your variables, and reordering economies by values of IP
IP_plot <- ggplot(x, aes(x = ip, y = reorder(reporter, desc(ip)))) +
  # adding the bar plot to the chart area, adjusting bar width, and removing the legend
  # setting bar color based on value of IP
  geom_col(aes(fill = ip), width = 0.6, show.legend = FALSE) +
  # adding the chart title, and removing axis labels
  labs(title = "Import Penetration Index for ASEAN-6 Economies (2020)", x = NULL, y = NULL) +
  # applying minimal theme to the chart
  theme_minimal()

IP_plot

Import penetration indices are revealed to be very high for Singapore, Viet Nam and Malaysia in 2020. The value higher than 100% for Singapore is most likely the reflection of re-exports’ presence of in the raw data.

1.4 Export Propensity

What does it tell us? The index shows the overall degree of reliance of domestic producers on foreign markets. It is similar to the trade dependence index, but may provide a better indicator of vulnerability to certain types of external shocks (e.g., falls in export prices or changes in exchange rates). It may be a policy target.

Definition: The ratio of exports to GDP, defined as a percentage.

Mathematical definition:

\(EP = \frac{\sum^{}_{s} X_{ds}}{GDP_{d}}\times100\)

where d is the country under study, s is the set of all other countries, X is total bilateral exports, and GDP is gross domestic product (of country d). In words, the numerator is total exports from d, and the denominator is the GDP of d.

Range of values: The ratio is expressed as a percentage and it ranges from zero (with no exports) to 100 (with all domestic production exported).

Limitations: The export propensity index is biased upward by re-exports (can be corrected for in principle). Will tend to be negatively correlated with economic size. A high export propensity may be an inappropriate policy target from an efficiency perspective.

We’ll now calculate Export Propensity index for Thailand in 2016-2020. We start with subsetting the data object to extract data on Thailand exports and selecting relevant variables. After that we rename the variable trade_value_usd as Export.

# extract data on Thailand exports, and keep only columns of interest
# rename the column
EP <- data %>% 
  filter(reporter == "Thailand" & trade_direction == "Export") %>% 
  select(year, trade_value_usd) %>%
  rename(Export=trade_value_usd)

After that, we make an API call to WDI database to obtain GDP values for Thailand over the same time period. Finally, we can calculate EP indices.

# obtain GDP data for Thailand over 2016-2020
gdp <- WDI(indicator= c("GDP" = "NY.GDP.MKTP.CD"), country="THA", start=2016, end=2020)

# merge the two datasets and keep only columns of interest
EP <- merge(EP, gdp, by = "year")  %>% select(year, Export, GDP)

# calculate and round the EP indices
EP$ep <- (EP$Export / EP$GDP) *100
EP$ep <- round(EP$ep, 0)

EP %>% select(year, ep)
##   year ep
## 1 2016 52
## 2 2017 52
## 3 2018 50
## 4 2019 43
## 5 2020 46

Now let’s calculate EP indices for ASEAN-6 economies in 2020. First, we go back to x data frame and restore it to its original form, so that we only have reporter, Export and GDP columns.

Now we can easily calculate EP index and store the values in ep column. After that we run ggplot() with geom_col() to make another bar chart.

# clean x data frame
x <- x %>% select(reporter, Export, GDP)

# calculate EP index for each economy in 2020, and round the values
x$ep <- (x$Export / x$GDP) * 100
x$ep <- round(x$ep, 0)

x %>% select(reporter, ep)
##      reporter  ep
## 1   Indonesia  15
## 2    Malaysia  69
## 3 Philippines  18
## 4   Singapore 108
## 5    Thailand  46
## 6    Viet Nam  81
# create a bar chart by selecting your variables, and reordering economies by value of EP
EP_plot <- ggplot(x, aes(x = ep, y = reorder(reporter, desc(ep)))) +
  # adding the bar plot to the chart area, adjusting bar width, and removing the legend
  # setting bar color based on value of EP
  geom_col(aes(fill = ep), width = 0.6, show.legend = FALSE) +
  # adding the chart title, and removing axis labels
  labs(title = "Export Propensity Index for ASEAN-6 Economies (2020)", x = NULL, y = NULL) + 
  # applying minimal theme to the chart
  theme_minimal()

EP_plot

It is revealed that once again Singapore, Viet Nam and Malaysia have the highest values for Export Propensity Index. The value higher than 100% for Singapore is most likely the reflection of re-exports’ presence of in the raw data.

1.5 Marginal Propensity to Import

What does it tell us? The marginal propensity to import (MPM) is a measure of the extent to which imports are induced by a change in incomes. The relevance for policymakers depends on the cycle of the economy. With higher MPM, in an economic downturn with a fall in GDP, there will also be a significant fall in imports as compared with lower a MPM. More generally, a higher MPM reduces the multiplier effect of an increase in GDP.

Definition: The ratio of the change in total imports to the change in GDP over a defined period (typically one year).

Mathematical definition:

\(MPM = \frac{\Delta\sum_{s} M_{sd}}{\Delta GDP_{d}}\)

where d is the country under study, s is the set of all other countries, is the change operator, M is total bilateral imports and GDP is gross domestic product (of country d). In words, the numerator is the change in total imports to d over a given period (usually one year), and the denominator is the change in the GDP of d over the same period.

Range of values: In macroeconomic theory ranges between 0 (with no part of extra GDP spent on additional imports) and 1 when the whole extra GDP created is spent on imports.

Limitations: The MPM is not a constant and can vary over time, so care should be taken in using it as an input to policy decisions. Calculations based on annual data only approximate the true value, and may lie outside of the theoretically sensible range.

This time we’ll find Marginal Propensity Import indices for Malaysia over 2016-2020.

# extract the data on Malaysia's imports
# select and rename columns as necessary
MPM <- data %>% 
  filter(reporter == "Malaysia" & trade_direction == "Import") %>% 
  select(year, trade_value_usd) %>%
  rename(Import=trade_value_usd)

# obtain GDP data for Malaysia over the same period
gdp <- WDI(indicator= c("GDP" = "NY.GDP.MKTP.CD"), country="MYS", start=2016, end=2020)

# merge the two data frames
MPM <- merge(MPM, gdp, by = "year")  %>% select(year, Import, GDP)

MPM
##   year       Import          GDP
## 1 2016 168375228233 301255454041
## 2 2017 194720187073 319112175612
## 3 2018 218018425151 358791603678
## 4 2019 205030495401 365175135788
## 5 2020 190404531975 337337932675

At this point we need to calculate values for change in total imports and GDP over the examined period of time. We do that by running lag() in our subtraction. This function will automatically select the Import value related to the preceding year.

# calculate change in total imports each year, and store values in dImport column
# Note `NA` in dImport cell for 2016:
# this is because there is no previous year to compare with
MPM$dImport <- MPM$Import - lag(MPM$Import)

# calculate change in GDP each year
MPM$dGDP <- MPM$GDP - lag(MPM$GDP)

# calculate and round the index values
MPM$mpm <- MPM$dImport / MPM$dGDP
MPM$mpm <- round(MPM$mpm, 2)

MPM %>% select(year, mpm)
##   year   mpm
## 1 2016    NA
## 2 2017  1.48
## 3 2018  0.59
## 4 2019 -2.03
## 5 2020  0.53

Now let’s calculate MPM indices for ASEAN-6 economies over 2016-2020 period and display results in a line chart. First, let’s compile necessary data in a new x data frame.

# retrieve Import data for ASEAN-6 economies
# select columns of interest
x <- data %>% 
  filter(reporter %in% ASEAN.6 & trade_direction == "Import") %>% 
  select(reporter, year, trade_value_usd)

# obtain GDP data for ASEAN-6 economies in 2016-2020
gdp <- WDI(indicator= c("GDP" = "NY.GDP.MKTP.CD"), 
           country=c("IDN", "MYS", "PHL", "SGP", "THA", "VNM"), start=2016, end=2020)

# rename Viet Nam for consistency
gdp$country <- ifelse(gdp$country == "Vietnam", "Viet Nam", gdp$country)

# merge the two datasets
x <- merge(x, gdp, by.x = c("year", "reporter"), 
           by.y = c("year", "country")) %>% 
  # select columns of interest
  select(year, reporter, trade_value_usd, GDP)

Now let’s make our calculations and then reshape the data to have MPM index values in separate columns for each economy. We will also remove the first line of our final data frame since it’s populated with NAs only.

Note use of the function mutate() below, which adds new variables to a data frame and preserve the existing ones. It preserves the number of rows in the dataset. If calculation is made for grouped variables, than one same value will be saved into the specified column across all the rows corresponding to a given combination of grouping variables. This should be paid attention to when you conduct further calculations, and it is a good practice to use ungroup() function on a dataset after a given calculation is finalized and the grouping becomes redundant.

x <- x %>% 
  # group data by reporter
  group_by(reporter) %>%
  # calculate change in total imports each year for each economy
  mutate(dImports = trade_value_usd - lag(trade_value_usd, n = 1, default = NA)) %>% 
  # calculate change in GDP each year for each economy
  mutate(dGDP = GDP - lag(GDP, n = 1, default = NA)) %>%
  # calculate MPM indices for each economy for each year
  mutate(MPM = dImports / dGDP) %>% 
  ungroup() 

x <- x %>% 
  # remove redundant columns
  select(year, reporter, MPM) %>%
  # reshape the data
  spread(reporter, MPM)

# drop the first row currently containing NAs
x <- x[-1,]

x
## # A tibble: 4 × 7
##    year Indonesia Malaysia Philippines Singapore Thailand `Viet Nam`
##   <dbl>     <dbl>    <dbl>       <dbl>     <dbl>    <dbl>      <dbl>
## 1  2017     0.255    1.48       1.62        1.47    0.636      1.58 
## 2  2018     1.19     0.587      0.716       1.27    0.549      0.823
## 3  2019    -0.227   -2.03       0.0737      7.56   -0.867      0.683
## 4  2020     0.492    0.525      1.47        1.01    0.205      0.642

As the last step, we will run ggplot() to visualize our time series.

Typically, it’s easier to graph data with ggplot() if the data is in a long format, so we will reshape the data frame back to longer format so there is one observation for each country-year grouping. Then we will add a geom_line() function to specify that we want a time-series (line) graph. We will also color code line charts for each economy.

# pivot the dataset to long format
x <- x %>% 
  pivot_longer(cols = Indonesia:`Viet Nam`, names_to = "country", values_to = "mpm")

# create a time-series chart by selecting your variables, and by specifying the color coding rule
MPM_plot <-  ggplot(x, aes(x = year, y = mpm, color = country)) + 
  # adding the line charts
      geom_line() + 
  # specifying the color palette
      scale_color_brewer(palette = "Set1") +
  # providing the chart title, deactivating axis labels, and creating a color coding legend
      labs(title = "MPM for ASEAN-6 economies (2016-2020)", x = NULL, y = NULL, color = "Economies") +
  # applying the minimal theme
      theme_minimal()

MPM_plot

The MPM indices were calculated year-on-year. The resulting chart reveals that in 2019 4 out of 6 economies experienced a significant decrease in MPM indices, that rebounded in 2020. As opposed to the rest of economies, year 2019 was characterized by a significant increase in MPM for Singapore, which by 2020 has decreased to a level comparable to year 2018. Note that in practice the calculated values can in fact be negative or exceed unity.