Economic Complexity of Australian Regions

Hamish Gamble; Sharif Rasel

1 Introduction

2 Literature Review

César A. Hidalgo and Hausmann (2009) introduced the concept of economic complexity as a means of quantifying and explaining differences in the economic development trajectory of different countries. Their method used bilateral trade data to identify the network structure of countries and the products they export and built on the concept of relatedness introduced in C. A. Hidalgo et al. (2007). Economic complexity has been shown to be a positive predictor of Gross Domestic Product (GDP), and GDP growth. Increasing economic complexity has also been shown to decrease unemployment and increase employment Adam et al. (2023), reduce green house gas emissions Romero and Gramkow (2021) and reduce income inequality Hartmann et al. (2017).

Relatedness has since been applied across industry (Neffke and Henning 2012), research areas (Guevara et al. 2016), occupation (Muneepeerakul et al. 2013) and technology (patents) (Kogler, Rigby, and Tucker 2013).

The relatedness approach has also been used to quantify economic complexity across cities, states, and regions, using employment dataChávez, Mosqueda, and Gómez-Zaldívar (2017), business counts(Gao and Zhou 2018), patent classifications (Balland and Boschma 2021), and interstate and international trade data (Reynolds et al. 2018).

Despite differences in data sources, the method for calculating economic complexity in the literature is relatively standard. The presence of an activity in a region is often identified using a location quotient method, such that an activity is said to be present in a region if:

\[\frac{X_a^r/\sum_{a}X_a^r}{\sum_{r}X_a^r/\sum_{r,a}X_a^r} \geq 1\]

Where \(X\) is the measure of an activity \(a\) in region \(r\) - such as the level of employment in an occupation in a city, or the number of businesses classified in an industry in a province, or the value of exports of a product from a country. The location quotient method creates a binary matrix \(M\) with \(a\) rows and \(r\) columns.

2.1 Regional economic complexity of small areas

The location quotient method can be unreliable due to the discontinuity at 1. This is especially relevant when economic complexity is calculated in regional areas where either \(X_a^r\) or \(\frac{\sum_{r}X_a^r}{\sum_{r,a}X_a^r}\) are small. In these cases, small changes, or measurement error in \(X_a^r\) can significantly change the location quotient.

The choice of region size and activity classification is important. In a study of the economic complexity of US regions, Fritz and Manduca (2021) use metropolitan areas as the basis for calculations. Metropolitan areas in the United States are defined such that jobs within a given area are held by residents who live in that area.Metropolitan areas have a population of at least 50,000 people. The smallest MSA was estimated to have a 2023 population of 57,700 (about 0.015% of US population). They find a poor correlation between ECI calculated at higher level aggregated industry classifications indicating the importance of a high degree of disaggregation to provide as much information to the model as possible Fritz and Manduca (2021).

In New Zealand, Davies and Maré (2021) use weighted correlations of local employment shares. Regions range from a population of 1,434 to 573,150 with a mean population of 29,947 and median population of 6,952. Employment is measured as an industry-occupation pair.

Differences in relationship between complexity and relatedness on indicators may be entirely context dependent.

3 Data & Methods

3.1 Data

Calculate economic complexity indicators for Australian regions using employment data from the 2021 Census.
Regions classified by Statistical Areas Level 3 (SA3)
Economic activity classified by ANZSIC industry division and ANZSCO major group

In [1]:

library(tidyverse)

Warning: package 'tidyverse' was built under R version 4.4.1

Warning: package 'ggplot2' was built under R version 4.4.1

Warning: package 'tibble' was built under R version 4.4.1

Warning: package 'tidyr' was built under R version 4.4.1

Warning: package 'readr' was built under R version 4.4.1

Warning: package 'purrr' was built under R version 4.4.1

Warning: package 'dplyr' was built under R version 4.4.1

Warning: package 'stringr' was built under R version 4.4.1

Warning: package 'forcats' was built under R version 4.4.1

Warning: package 'lubridate' was built under R version 4.4.1

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(strayr)
library(ecomplexity)
library(sf)

Warning: package 'sf' was built under R version 4.4.1

Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE

library(tmap)


Attaching package: 'tmap'

The following object is masked from 'package:datasets':

    rivers

library(spdep)

Warning: package 'spdep' was built under R version 4.4.1

Loading required package: spData

Warning: package 'spData' was built under R version 4.4.1

To access larger datasets in this package, install the spDataLarge
package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`

library(spatialreg)

Warning: package 'spatialreg' was built under R version 4.4.1

Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'spatialreg'

The following objects are masked from 'package:spdep':

    get.ClusterOption, get.coresOption, get.mcOption,
    get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
    set.coresOption, set.mcOption, set.VerboseOption,
    set.ZeroPolicyOption

library(huxtable)

Warning: package 'huxtable' was built under R version 4.4.1


Attaching package: 'huxtable'

The following object is masked from 'package:dplyr':

    add_rownames

The following object is masked from 'package:ggplot2':

    theme_grey

library(flextable)

Warning: package 'flextable' was built under R version 4.4.1


Attaching package: 'flextable'

The following objects are masked from 'package:huxtable':

    align, as_flextable, bold, font, height, italic, set_caption,
    valign, width

The following object is masked from 'package:purrr':

    compose

industry_occupation_sa3 <- read_complexitydata("sa3_indp1_occp1", export_dir = "data")

Reading sa3_indp1_occp1 file found in data

regions <- industry_occupation_sa3 |> 
  group_by(x = str_detect(sa3, "POW|Migratory")) |>  
  summarise(count = sum(count)) |> 
  mutate(p = count/sum(count),
         x = ifelse(!x, "included", "excluded")) |> 
  split(f = ~x)

sa3 <- read_absmap("sa32021", export_dir = "data")

Reading sa32021 file found in data

industry_occupation_sa3 <- industry_occupation_sa3 |> 
  filter(!str_detect(sa3, "POW|Migratory")) |> 
  mutate(anzsic_division = fct_inorder(anzsic_division),
         anzsco_major = fct_inorder(anzsco_major), 
         industry_occupation = paste0(anzsic_division, " (", anzsco_major,")"),
         industry_occupation = fct_inorder(industry_occupation)) |> 
  group_by(sa3, industry_occupation) |> 
  summarise(count = sum(count),
            .groups = "drop")

We exclude individuals who identify their place of work as a Migratory - Offshore - Shipping region or as No Fixed Address. Employment in these regions totals 497,913 or about 4% of the total sample.
Following Davies and Maré (2021), employment is aggregated into industry-occupation pairs, allowing for differentiation between, for example, managers working in agriculture, forestry, and fishing, and managers working in retail trade.
Dataset covers 340 regions and 152 industry-occupations. Figure 1 shows the presence of any level of employment within a region and industry-occupation. As can be seen, there is a high level of employment density across our data, with 88.5% of all combinations of region, industry, and occupation.

In [2]:

employment_density <- function(data, region, activity, flip = FALSE) {
  region_size = paste0(region, "_size")
  activity_size = paste0(activity, "_size")
  
  data |> 
    group_by(.data[[region]])  |> 
    mutate({{region_size}} := sum(count)) |> 
    group_by(.data[[activity]]) |> 
    mutate({{activity_size}} := sum(count)) |> 
    ungroup() |> 
    mutate(presence = count > 0,
           {{region}} := fct_infreq(.data[[region]], w = .data[[region_size]]),
           {{activity}} := fct_infreq(.data[[activity]], w = .data[[activity_size]])) |> 
    ggplot(aes(x = if(!flip) .data[[region]] else .data[[activity]],
               y = if(!flip) .data[[activity]] else .data[[region]],
               fill = presence)) + 
    geom_raster() + 
    scale_fill_manual(breaks = c(TRUE, FALSE),
                      values = c("#3182db", "white")) + 
  theme(axis.text = element_blank(),
        axis.ticks = element_blank(),
        panel.border = element_rect(fill = NA),
        legend.position = "none") + 
  coord_equal()
}

employment_density(industry_occupation_sa3, "sa3", "industry_occupation") + 
  labs(x = "Region",
       y = "Industry-Occupation")

Figure 1: Presence of employment across regions and industry-occupations.

3.2 Method

This section follows the method of Davies and Maré (2021) using correlations of employment shares rather than a location quotient method.

3.2.1 Relatedness

Activities are related based on the weighted correlation between the local activity share of employment, weighted by each regions share of total employment.

First calculate the weighted covariance

\[ cov_{aa} = \sum_{c \in C} (\frac{E_c^{a_i}}{E_c}-\frac{E^{a_i}}{E})(\frac{E_c^{a_j}}{E_c}-\frac{E^{a_j}}{E}) \]

Divide the weighted covariance by the city share-weighted standard deviations of the local activity shares to get the weighted correlation.
Map the correlation to the interval \([0,1]\) such that:

\[ r_{aa} = \frac{1}{2}(cor(a_i, a_j) + 1) \]

City relatedness is calculated symmetrically such that:

\[ r_{cc} = \frac{1}{2}(cor(c_i, c_j) + 1) \]

3.2.2 Complexity

Activity complexity is defined by the second eigenvector of the matrix \(r_{aa}\) and city complexity is defined by the second eigenvector of the matrix \(r_{cc}\). The sign of activity complexity is set such that it is positively correlated with the weighted mean size of cities that contain activity \(a\), and the sign of city complexity is set such that it is positively correlated with the local share-weighted mean complexity of activities in city \(c\)

In [3]:

# function to calculate complexity from a region * activity matrix m

complexity_nz <- function(m, base_data) {
  
  activity_share <- m / rowSums(m)
  national_share_employment <- rowSums(m) / sum(m)
  
  region_name <- colnames(base_data)[1]
  activity_name <- colnames(base_data)[2]
  
  # Relatedness of activities is the weighted covariance between the local share vectors for activities i and j, 
  # weighted by each regions share of total employment.
  
  
  r_aa <- 0.5*(cov.wt(x = activity_share, wt = national_share_employment, cor = TRUE)$cor + 1)
  
  
  #Complexity of activity a is the element a of the standardized second eigenvector of the row-standardized relatedness matrix r.
  complexity <- list()
  complexity$activity <- Re(eigen(r_aa/rowSums(r_aa))$vector[,2])
  complexity$activity  <- (complexity$activity  - mean(complexity$activity ))/sd(complexity$activity)
  names(complexity$activity) <- colnames(m)
  
  #Complexity is positively correlated with the weighted mean size of cities that contain activity a
  wmsc <- colSums(m/colSums(m)*rowSums(m))
  
  if (cor(complexity$activity , wmsc) < 0) {
    complexity$activity  = -1*complexity$activity
  } else complexity$activity  = complexity$activity 
  
  message(glue::glue("most complex activity: {names(complexity$activity[complexity$activity == max(complexity$activity)])}")) 
  
  
  # Relatedness of cities is symmetric to activities.
  city_share <- t(m / rowSums(m))
  national_share_activity <- colSums(m)/sum(m)
  r_cc <- 0.5*(cov.wt(x = city_share, wt = national_share_activity, cor = TRUE)$cor + 1)
  
  complexity$city <- Re(eigen(r_cc/rowSums(r_cc))$vector[,2])
  complexity$city <- (complexity$city- mean(complexity$city))/sd(complexity$city)
  names(complexity$city) <- rownames(m)
  
  #City complexity is positively correlated with the local share weighted mean complexity of activities in city c.
  wmc <- rowSums(m/colSums(m)*complexity$activity)
  
  if (cor(complexity$city, wmc) < 0 || complexity$city["Brisbane Inner"] < 0) {
    complexity$city = -1 * complexity$city
    } else complexity$city = complexity$city
  
  message(glue::glue("most complex city: {names(complexity$city[complexity$city == max(complexity$city)])}")) 
  
  df.complexity <- inner_join(base_data, enframe(complexity$city, name = region_name, value = "city_complexity")) |> 
    inner_join(enframe(complexity$activity, name = activity_name, value = "activity_complexity"))
  return(df.complexity)
}

4 Results

In [4]:

#|
m <- industry_occupation_sa3 |> 
  pivot_wider(id_cols = sa3,
              names_from = industry_occupation,
              values_from = count) |> 
  column_to_rownames(var = "sa3") |> 
  as.matrix()

complexity <- list()

complexity$indp_occp <- complexity_nz(m, base_data = industry_occupation_sa3)

most complex activity: Professional, Scientific and Technical Services (Professionals)

most complex city: Brisbane Inner

Joining with `by = join_by(sa3)`
Joining with `by = join_by(industry_occupation)`

complexity$indp_occp_rca <- calculate_complexity(industry_occupation_sa3, 
                                                 region = "sa3",
                                                 product = "industry_occupation",
                                                 value = "count", 
                                                 year = 2021) |> 
  select(sa3, industry_occupation, count, city_complexity = country_complexity_index, activity_complexity = product_complexity_index)

Figure 2 shows the regional complexity of SA3 regions in Australian Greater Capital City Areas based on 2021 Census data. Complexity is highest in capital cities and surrounding regions.

In [5]:

complexity_map <- function(data, layer) {
  data <- data |> 
    pluck(layer) |> 
    distinct(sa3, city_complexity) |> 
    left_join(sa3, by = c("sa3" = "sa3_name_2021")) |> 
  st_as_sf() 
  
  data |> 
  filter(str_detect(gcc_name_2021, "Australian Capital Territory|Greater")) |> 
  tm_shape() +
  tm_polygons(fill = "city_complexity",
              fill.scale = tm_scale_continuous(values = "greek", ),
              fill.legend = tm_legend(title = "City Complexity Index",
                                      orientation = "landscape")) +
  tm_facets("gcc_name_2021") +
  tm_place_legends_bottom()
  
  }


complexity_map(complexity, "indp_occp")

Figure 2: Complexity of Australian Greater Capital City Areas

4.1 Regression

\(ECI_r = CC + log(employment) + log(population) + log(income_{hh}) + log(business_{entries}) + business_{share}\)

In [6]:

load("data/data_by_region.rda")


complexity.reg <- complexity$indp_occp |> 
  distinct(sa3, city_complexity) |> 
  inner_join(data_by_region, by = c("sa3" = "region")) |> 
  inner_join(sa3, by = c("sa3" = "sa3_name_2021")) |> 
  group_by(state_name_2021) |> 
  mutate(business_share = total_businesses_no/sum(total_businesses_no, na.rm = TRUE)) |>
  ungroup() |> 
  mutate(turnover_greater_2m = total_businesses_no - (businesses_with_turnover_of_10m_or_more_no + businesses_with_turnover_of_5m_to_less_than_10m_no + businesses_with_turnover_of_2m_to_less_than_5m_no),
         turnover_greater_2m_share = turnover_greater_2m/total_businesses_no,
         across(c(total_businesses_no, 
                  total_business_entries_no,
                  total_business_exits_no,
                  total_persons_employed_aged_15_years_and_over_no,
                  estimated_resident_population_persons_no,
                  median_equivalised_total_household_income_weekly), 
                .fn = ~log(.x),
                .names = "log_{.col}")) |> 
  drop_na(city_complexity,
          log_total_businesses_no,
          log_total_business_entries_no,
          log_total_business_exits_no,
          turnover_greater_2m_share,
          log_estimated_resident_population_persons_no,
          business_share,
          log_median_equivalised_total_household_income_weekly) |> 
  st_as_sf() 

regression_spec <- parse(text = "city_complexity ~ log_total_businesses_no + log_total_business_entries_no + log_total_business_exits_no + turnover_greater_2m_share + log_total_persons_employed_aged_15_years_and_over_no + log_estimated_resident_population_persons_no + business_share + log_median_equivalised_total_household_income_weekly")[[1]]

fit.ols <- lm(regression_spec,
              data = complexity.reg)

4.2 Spatial Correlation

In [7]:

nb <- poly2nb(complexity.reg, queen = TRUE)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

In [8]:

moranI <- moran(complexity.reg$city_complexity, listw = lw, n = length(nb), S0 = Szero(lw))

moranIp <- moran.mc(complexity.reg$city_complexity, lw, nsim = 1000)

Based on Figure 2, it looks like there are clusters of complexity, centred around capital cities.

In [9]:

complexity.reg <- complexity.reg |>
  mutate(olsresid = resid(fit.ols)) 

tm_shape(complexity.reg) + 
  tm_polygons(fill = "olsresid",
              fill.scale = tm_scale_continuous(values = "reds"),
              fill.legend = tm_legend(outside = TRUE,
                                      title = "OLS Residuals",
                                      text.size = 0.8),
              col = "grey80",
              lwd = 0.5) +
  tm_layout(frame = FALSE)

[plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. Set the tmap option 'component.autoscale' to FALSE to disable rescaling.

Figure 3: Residuals from linear regression

The residuals from the linear regression are shown in Figure 3 which also shows that the distribution of the residuals appears non-random.

In [10]:

moran.plot(as.numeric(scale(complexity.reg$city_complexity)),
           listw = lw,
           xlab = "City Complexity",
           ylab = "Neighbours City Complexity")

Figure 4: Moran Scatterplot for City Complexity in Australia

The correlation between complexity and lagged complexity is shown in Figure 4 which also shows a dependency. Finally, we observe a global Moran’s I of 0.498499 with a p.value of 0. As such, the data appear to be spatially autocorrelated, so a lagged AR or lagged error model should be estimated instead.

In [11]:

fit.lag <- lagsarlm(regression_spec,
                    data = complexity.reg,
                    listw = lw, 
                    zero.policy = TRUE)

fit.err <- errorsarlm(regression_spec,
                      data = complexity.reg,
                      listw = lw, 
                      zero.policy = TRUE)

In [12]:

huxreg("OLS" = fit.ols, "Spatial AR" = fit.lag, "Spatial Error" = fit.err)

	OLS	Spatial AR	Spatial Error
(Intercept)	-17.224 ***	-13.830 ***	-16.581 ***
	(3.417)	(3.351)	(3.561)
log_total_businesses_no	-0.102	0.063	-0.114
	(0.230)	(0.221)	(0.254)
log_total_business_entries_no	0.028	-0.060	-0.037
	(0.409)	(0.389)	(0.395)
log_total_business_exits_no	0.937 *	0.650	0.939 *
	(0.431)	(0.416)	(0.420)
turnover_greater_2m_share	0.491	-0.083	-0.220
	(1.748)	(1.668)	(1.721)
log_total_persons_employed_aged_15_years_and_over_no	-1.691 **	-1.382 **	-1.793 **
	(0.517)	(0.500)	(0.565)
log_estimated_resident_population_persons_no	1.208 *	1.096 *	1.430 *
	(0.511)	(0.490)	(0.565)
business_share	4.280 ***	4.457 ***	4.371 **
	(1.206)	(1.149)	(1.415)
log_median_equivalised_total_household_income_weekly	2.213 ***	1.669 ***	2.097 ***
	(0.283)	(0.294)	(0.321)
rho		0.273 ***
		(0.060)
lambda			0.360 ***
			(0.066)
N	330	330	330
R2	0.528	0.560	0.574
logLik	-346.835	-338.084	-335.202
AIC	713.669	698.168	692.403
* p < 0.001; p < 0.01; * p < 0.05.

5 Hot spots

In [13]:

MC.i <- localmoran_perm(complexity.reg$city_complexity, lw, nsim = 9999) 

MC.i.df <- MC.i |> 
  as.data.frame() 

complexity.reg$p <- MC.i.df$`Pr(folded) Sim`
complexity.reg$Ii <- hotspot(MC.i, 
                              Prname = "Pr(folded) Sim", 
                              cutoff = 0.05, 
                              p.adjust = "fdr")
complexity.reg$Ii <- factor(complexity.reg$Ii, 
                             levels = c("High-High",
                                        "Low-Low",
                                        "Low-High", 
                                        "High-Low",
                                        ">0.05"))
complexity.reg$Ii[is.na(complexity.reg$Ii)] <- ">0.05"

pal2 <- c( "#FF0000", "#0000FF", "#a7adf9", "#f4ada8","#ededed")

tm_shape(complexity.reg) + 
  tm_polygons(fill = "Ii",
              fill.scale = tm_scale_categorical(values = pal2),
              fill.legend = tm_legend(outside = TRUE, 
                                      text.size = 0.8),
              col = "grey80",
              lwd = 0.5) +
  tm_layout(frame = FALSE) + 
  tm_facets(by = "gcc_name_2021")

Figure 5: City Complexity hot spots (based on local Moran’s I p.values)

6 Conclusion

7 Appendix

Regional economic complexity can be calculated using other data, including employment by industry and employment by occupation.

7.1 Other Employment Indicators

In [14]:

industry_sa3 <- read_complexitydata("sa3_indp4", export_dir = "data") |> 
  mutate(year = 2021) |> 
  filter(!str_detect(indp, "[A-S]|&&&&|@@@@"))

Reading sa3_indp4 file found in data

occupation_sa3 <- read_complexitydata("sa3_occp4", export_dir = "data") |> mutate(year = 2021)

Reading sa3_occp4 file found in data

complexity$indp4 <- calculate_complexity(industry_sa3, region = "sa3", product = "indp", value = "count", year = 2021)  |> mutate(product = "indp4")
complexity$occp4 <- calculate_complexity(occupation_sa3, region = "sa3", product = "occp", value = "count", year = 2021) |> mutate(product = "occp4")

Employment density is much sparser when using 4-digit industry of employment data. Only 48 of the industry class-region combinations contain any level of employment.

In [15]:

library(patchwork)

Warning: package 'patchwork' was built under R version 4.4.1

p1 <- employment_density(industry_sa3, "sa3", "indp") +
  labs(x = "Region",
       y = "Industry Class") 
p2 <- employment_density(occupation_sa3, "sa3", "occp") +
  labs(x = "Region",
       y = "Occupation Unit")

wrap_plots(p1, p2) + 
  plot_annotation(tag_levels = "A")

Figure 6: Employment density, SA3 regions: (A) ANZSIC industry class, (B) ANZSCO occupation unit

In [16]:

#|label: city-complexity-industry

industry_sa3 <- industry_sa3 |> 
  group_by(indp) |> 
  mutate(indp_size = sum(count)) |> 
  filter(indp_size > 0) |> 
  ungroup()

m <- industry_sa3 |> 
  pivot_wider(id_cols = sa3,
              names_from = indp,
              values_from = count) |> 
  column_to_rownames(var = "sa3") |> 
  as.matrix()

complexity$indp4 <- complexity_nz(m, industry_sa3)

most complex activity: 6931

most complex city: Cottesloe - Claremont

Joining with `by = join_by(sa3)`
Joining with `by = join_by(indp)`

m <- occupation_sa3 |> 
  pivot_wider(id_cols = sa3,
              names_from = occp,
              values_from = count) |> 
  column_to_rownames(var = "sa3") |> 
  as.matrix()

complexity$occp4 <- complexity_nz(m, occupation_sa3)

most complex activity: Human Resource Managers
most complex city: North Sydney - Mosman
Joining with `by = join_by(sa3)`Joining with `by = join_by(occp)`

full_join(distinct(complexity$indp4, sa3, city_complexity),
          distinct(complexity$indp_occp, sa3, city_complexity), by = "sa3") |>
  ggplot(aes(x = city_complexity.x, y = city_complexity.y)) + 
  geom_point() + labs(x = "City Complexity (Industry Division)", y = "Economic Complexity (Industry-Occupation)")

In [17]:

complexity.reg.all <- complexity.reg |> 
  inner_join(distinct(complexity$indp4, sa3, indp_complexity = city_complexity), by = "sa3") |> 
  inner_join(distinct(complexity$occp4, sa3, occp_complexity = city_complexity))

Joining with `by = join_by(sa3)`

reg_spec <- parse(text = "indp_complexity ~ log_total_businesses_no + log_total_business_entries_no + log_total_business_exits_no + turnover_greater_2m_share + log_total_persons_employed_aged_15_years_and_over_no + log_estimated_resident_population_persons_no + business_share + log_median_equivalised_total_household_income_weekly")[[1]]

fit.ols <- lm(reg_spec, data = complexity.reg.all)

7.2 Specific Industries

Following Mealy and Teytelboym (2022) analysis of green economic complexity, we can apply an advanced manufacturing lens to determine advanced manufacturing complexities.

In [18]:

#|
am <-  tibble::tribble(
   ~indp,                                                           ~description,
                    1811L,                                         "Industrial gas manufacturing",
                    1812L,                                 "Basic organic chemical manufacturing",
                    1813L,                               "Basic inorganic chemical manufacturing",
                    1821L,                   "Synthetic resin and synthetic rubber manufacturing",
                    1829L,                                    "Other basic polymer manufacturing",
                    1831L,                                             "Fertiliser manufacturing",
                    1832L,                                              "Pesticide manufacturing",
                    1841L,             "Human pharmaceutical and medicinal product manufacturing",
                    1842L,        "Veterinary pharmaceutical and medicinal product manufacturing",
                    1851L,                                      "Cleaning compound manufacturing",
                    1852L,                      "Cosmetic and toiletry preparation manufacturing",
                    1891L,                          "Photographic chemical product manufacturing",
                    1892L,                                              "Explosive manufacturing",
                    1899L,                    "Other basic chemical product manufacturing n.e.c.",
                    2311L,                                          "Motor vehicle manufacturing",
                    2312L,                         "Motor vehicle body and trailer manufacturing",
                    2313L,                        "Automotive electrical component manufacturing",
                    2319L,                              "Other motor vehicle parts manufacturing",
                    2391L,                                     "Shipbuilding and repair services",
                    2392L,                                     "Boatbuilding and repair services",
                    2393L,              "Railway rolling stock manufacturing and repair services",
                    2394L,                           "Aircraft manufacturing and repair services",
                    2399L,                       "Other transport equipment manufacturing n.e.c.",
                    2411L,         "Photographic, optical and ophthalmic equipment manufacturing",
                    2412L,                         "Medical and surgical equipment manufacturing",
                    2419L,            "Other professional and scientific equipment manufacturing",
                    2421L,               "Computer and electronic office equipment manufacturing",
                    2422L,                                "Communication equipment manufacturing",
                    2429L,                             "Other electronic equipment manufacturing",
                    2431L,                                "Electric cable and wire manufacturing",
                    2432L,                            "Electric lighting equipment manufacturing",
                    2439L,                             "Other electrical equipment manufacturing",
                    2441L,                                    "Whiteware appliance manufacturing",
                    2449L,                               "Other domestic appliance manufacturing",
                    2451L,                                    "Pump and compressor manufacturing",
                    2452L, "Fixed space heating, cooling and ventilation equipment manufacturing",
                    2461L,                   "Agricultural machinery and equipment manufacturing",
                    2462L,                      "Mining and construction machinery manufacturing",
                    2463L,                           "Machine tool parts and parts manufacturing",
                    2469L,              "Other specialised machinery and equipment manufacturing",
                    2491L,                         "Lifting and handling equipment manufacturing",
                    2499L,                          "Other machinery and equipment manufacturing"
   ) |>
  mutate(indp = as.character(indp))

In [19]:

amci <- calculate_complexity(industry_sa3, region = "sa3", product = "indp", value = "count", year = 2021) |>
  mutate(m = as.numeric(rca >= 1),
         is_am = as.numeric(indp %in% am$indp),
         pci = scales::rescale(product_complexity_index, to = c(0, 1)),
         p = m * is_am) |>
  group_by(sa3) |>
  summarise(amci = sum(pci * p),
            amp = sum((1-p)*density*pci)/sum(1-p),
            amce = sum(count*is_am)) |>
  mutate(across(c(amci, amp), ~ (.x - mean(.x))/sd(.x)))

complexity.reg <- complexity.reg |>
  inner_join(amci)

Joining with `by = join_by(sa3)`

lm.amci <- lm(city_complexity ~ median_age_persons_years + estimated_resident_population_persons_no + total_business_entries_no + total_business_exits_no + business_share + amci + amp, data = complexity.reg)

7.3 Smaller Areas

In [20]:

industry_occupation_sa2 <- read_complexitydata("sa2_indp1_occp1", export_dir = "data")|>
  filter(!str_detect(sa2, "POW|Migratory")) |>
  mutate(anzsic_division = fct_inorder(anzsic_division),
         anzsco_major = fct_inorder(anzsco_major),
         industry_occupation = paste0(anzsic_division, " (", anzsco_major, ")"),
         industry_occupation = fct_inorder(industry_occupation)) |>
  group_by(sa2, industry_occupation) |>
  summarise(count = sum(count), .groups = "drop") |>
  group_by(sa2) |>
  mutate(sa2_size = sum(count)) |>
  group_by(industry_occupation) |>
  mutate(industry_occupation_size = sum(count)) |>
  ungroup() |>
  filter(sa2_size > 0,
         industry_occupation_size > 0)

Reading sa2_indp1_occp1 file found in data

employment_density(industry_occupation_sa2, "sa2", "industry_occupation", flip = TRUE) +
  labs(x = "Industry-Occupation",
       y = "Region") +
  coord_equal(ratio = 0.25)

Coordinate system already present. Adding new coordinate system, which will
replace the existing one.

Figure 7: Employment density, Industry-occupation, SA2 regions

In [21]:

m <- industry_occupation_sa2 |>
  pivot_wider(id_cols = sa2,
              names_from = industry_occupation,
              values_from = count) |>
  column_to_rownames(var = "sa2") |>
  as.matrix()

complexity_small <- complexity_nz(m, industry_occupation_sa2)

most complex activity: Professional, Scientific and Technical Services (Professionals)

most complex city: Ashburton (Vic.)

Joining with `by = join_by(sa2)`
Joining with `by = join_by(industry_occupation)`

complexity_small |>
  distinct(sa2, city_complexity) |>
  inner_join(read_absmap("sa22021", export_dir = "data"),
             by = c("sa2" = "sa2_name_2021")) |>
  st_as_sf() |>
  filter(str_detect(gcc_name_2021, "Australian Capital Territory|Greater")) |>
  tm_shape() +
  tm_polygons(fill = "city_complexity",
              lwd = 0.05,
              fill.scale = tm_scale_continuous(values = 'greek'),
              fill.legend = tm_legend(title = "City Complexity Index",
                                      orientation = "landscape"))+
  tm_facets("gcc_name_2021") +
  tm_place_legends_bottom()

Reading sa22021 file found in data

References

Adam, Antonis, Antonios Garas, Marina-Selini Katsaiti, and Athanasios Lapatinas. 2023. “Economic Complexity and Jobs: An Empirical Analysis.” Economics of Innovation and New Technology 32 (1): 25–52. https://doi.org/10.1080/10438599.2020.1859751.

Balland, Pierre-Alexandre, and Ron Boschma. 2021. “Mapping the Potentials of Regions in Europe to Contribute to New Knowledge Production in Industry 4.0 Technologies.” Regional Studies 55 (10-11): 1652–66. https://doi.org/10.1080/00343404.2021.1900557.

Chávez, Juan Carlos, Marco T. Mosqueda, and Manuel Gómez-Zaldívar. 2017. “Economic Complexity and Regional Growth Performance: Evidence from the Mexican Economy.” Review of Regional Studies 47 (2): 201–19. https://doi.org/10.52324/001c.8023.

Davies, Benjamin, and David C. Maré. 2021. “Relatedness, Complexity and Local Growth.” Regional Studies 55 (3): 479–94. https://doi.org/10.1080/00343404.2020.1802418.

Fritz, Benedikt S. L., and Robert A. Manduca. 2021. “The Economic Complexity of US Metropolitan Areas.” Regional Studies 55 (7): 1299–1310. https://doi.org/10.1080/00343404.2021.1884215.

Gao, Jian, and Tao Zhou. 2018. “Quantifying China’s Regional Economic Complexity.” Physica A: Statistical Mechanics and Its Applications 492: 1591–1603. https://doi.org/https://doi.org/10.1016/j.physa.2017.11.084.

Guevara, Miguel R., Dominik Hartmann, Manuel Aristarán, Marcelo Mendoza, and César A. Hidalgo. 2016. “The Research Space: Using Career Paths to Predict the Evolution of the Research Output of Individuals, Institutions, and Nations.” Scientometrics 109 (3): 1695–1709. https://doi.org/10.1007/s11192-016-2125-9.

Hartmann, Dominik, Miguel R. Guevara, Cristian Jara-Figueroa, Manuel Aristarán, and César A. Hidalgo. 2017. “Linking Economic Complexity, Institutions, and Income Inequality.” World Development 93 (May): 75–93. https://doi.org/10.1016/j.worlddev.2016.12.020.

Hidalgo, C. A., B. Klinger, A.-L. Barabaśi, and R. Hausmann. 2007. “The Product Space Conditions the Development of Nations.” Science 317 (5837): 482–87. https://doi.org/10.1126/science.1144581.

Hidalgo, César A., and Ricardo Hausmann. 2009. “The Building Blocks of Economic Complexity.” Proceedings of the National Academy of Sciences 106 (26): 10570–75. https://doi.org/10.1073/pnas.0900943106.

Kogler, Dieter F., David L. Rigby, and Isaac Tucker. 2013. “Mapping Knowledge Space and Technological Relatedness in US Cities.” European Planning Studies 21 (9): 1374–91. https://doi.org/10.1080/09654313.2012.755832.

Mealy, Penny, and Alexander Teytelboym. 2022. “Economic Complexity and the Green Economy.” Research Policy 51 (8): 103948. https://doi.org/10.1016/j.respol.2020.103948.

Muneepeerakul, Rachata, José Lobo, Shade T. Shutters, Andrés Goméz-Liévano, and Murad R. Qubbaj. 2013. “Urban Economies and Occupation Space: Can They Get “There” from “Here”?” Edited by César A. Hidalgo. PLoS ONE 8 (9): e73676. https://doi.org/10.1371/journal.pone.0073676.

Neffke, Frank, and Martin Henning. 2012. “Skill Relatedness and Firm Diversification.” Strategic Management Journal 34 (3): 297–316. https://doi.org/10.1002/smj.2014.

Reynolds, Christian, Manju Agrawal, Ivan Lee, Chen Zhan, Jiuyong Li, Phillip Taylor, Tim Mares, Julian Morison, Nicholas Angelakis, and Göran Roos. 2018. “A Sub-National Economic Complexity Analysis of Australia’s States and Territories.” Regional Studies 52 (5): 715–26. https://doi.org/10.1080/00343404.2017.1283012.

Romero, João P., and Camila Gramkow. 2021. “Economic Complexity and Greenhouse Gas Emissions.” World Development 139 (March): 105317. https://doi.org/10.1016/j.worlddev.2020.105317.

Article Notebook