Skip to contents
library(conmat)
library(socialmixr)
#> 
#> Attaching package: 'socialmixr'
#> The following object is masked from 'package:utils':
#> 
#>     cite
library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(mgcv)
#> Loading required package: nlme
#> 
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#> 
#>     collapse
#> This is mgcv 1.9-0. For overview type 'help("mgcv-package")'.
library(patchwork)

The goal of conmat is to simplify the process of generating synthetic contact matrices for a given age population.

What is a contact matrix?

Contact matrices describe the degree of contact between individuals of given age groups. For example, this matrix describes the number of contacts between individuals.

#>       0-4 5-9 10-14
#> 0-4    10   3     4
#> 5-9     3  11     5
#> 10-14   4   5    13

The rows and columns represent the age groups of the people. On the main diagonal we see that we have a higher number of contacts - showing that people of similar ages tend to interact more with one another. We can use the information in these matrices to model how diseases such as COVID-19 spread in a population through social contact.

Why do we need synthetic contact matrices?

Contact matrices are produced from empirical data resulting from a contact survey, which requires individuals to diary the amount and manner of contact a person has in a day. However, these surveys are highly time-consuming and expensive to run, meaning that only a handful of these empirical datasets exist globally.

We can use statistical methods to create synthetic contact matrices, which are new contact matrices that have been generalised to new countries based on existing surveys.

Why do we need conmat?

Existing methods only provide outputs of the contact matrices for each country, or at best, for urban and rural areas for a given country. We need methods that allow for flexibly creating synthetic contact matrices for a specified age population. This is because the age population distribution of many countries (e.g., Australia), are quite heterogeneous, and assuming it is homogeneous would result in inaccurate representation of community infection in many regions.

Quick example using Australian data

Suppose we want to get a contact matrix for a given region in Australia, let’s say the city of Perth. We can get that from a helper function, abs_age_lga.

perth <- abs_age_lga("Perth (C)")
perth
#> # A tibble: 18 × 4 (conmat_population)
#>  - age: lower.age.limit
#>  - population: population
#>    lga       lower.age.limit  year population
#>    <chr>               <dbl> <dbl>      <dbl>
#>  1 Perth (C)               0  2020       1331
#>  2 Perth (C)               5  2020        834
#>  3 Perth (C)              10  2020        529
#>  4 Perth (C)              15  2020        794
#>  5 Perth (C)              20  2020       3615
#>  6 Perth (C)              25  2020       5324
#>  7 Perth (C)              30  2020       4667
#>  8 Perth (C)              35  2020       3110
#>  9 Perth (C)              40  2020       1650
#> 10 Perth (C)              45  2020       1445
#> 11 Perth (C)              50  2020       1299
#> 12 Perth (C)              55  2020       1344
#> 13 Perth (C)              60  2020       1359
#> 14 Perth (C)              65  2020       1145
#> 15 Perth (C)              70  2020       1004
#> 16 Perth (C)              75  2020        673
#> 17 Perth (C)              80  2020        481
#> 18 Perth (C)              85  2020        367

(You can learn more about the data sources we provide in data-sources.Rmd)

We can get a contact matrix made for perth using the extrapolate_polymod function:

perth_contact <- extrapolate_polymod(
  population = perth
)

perth_contact
#> 
#> ── Setting Prediction Matrices ─────────────────────────────────────────────────
#> A list of matrices containing the model predicted contact rate between ages in
#> each setting.
#> There are 16 age breaks, ranging 0-75+ years, with a regular 5 year interval
#> • home: a 16x16 <matrix>
#> • work: a 16x16 <matrix>
#> • school: a 16x16 <matrix>
#> • other: a 16x16 <matrix>
#> • all: a 16x16 <matrix>
#>  Access each <matrix> with `x$name`
#>  e.g., `x$home`

We can plot this with autoplot

autoplot(perth_contact)

And you can see each contact matrix in a setting by referring to its name - for example, the “home” setting contact matrix:

perth_contact$home
#>               [0,5)     [5,10)    [10,15)    [15,20)    [20,25)    [25,30)
#> [0,5)    0.40076181 0.33477812 0.16555012 0.11608014 0.17617098 0.32079759
#> [5,10)   0.19885124 0.32216319 0.20765643 0.07226656 0.05193349 0.09275907
#> [10,15)  0.07470057 0.15774971 0.26654020 0.12883717 0.04419741 0.03535209
#> [15,20)  0.08170674 0.08563793 0.20097718 0.34425132 0.17025978 0.06763197
#> [20,25)  0.33292330 0.16522893 0.18510249 0.45711126 0.73008179 0.41491477
#> [25,30)  1.11525912 0.54291253 0.27237380 0.33403867 0.76329736 1.15720681
#> [30,35)  1.45860486 1.15187002 0.52666015 0.27078533 0.33556433 0.71610024
#> [35,40)  0.74668485 1.02027190 0.74962874 0.32720560 0.17284734 0.20851823
#> [40,45)  0.24541744 0.41797497 0.55192548 0.38836042 0.16874222 0.08987479
#> [45,50)  0.12179342 0.15376654 0.25912643 0.33988507 0.23163274 0.10557474
#> [50,55)  0.11563694 0.09409502 0.11753342 0.20020602 0.24542407 0.17519639
#> [55,60)  0.13503961 0.09876405 0.07843632 0.09845190 0.15621575 0.19574806
#> [60,65)  0.12361018 0.11014329 0.07653905 0.05845473 0.07100810 0.11555725
#> [65,70)  0.08050583 0.09083407 0.07498176 0.04709679 0.03585191 0.04657126
#> [70,75)  0.04571279 0.05636194 0.05763523 0.04176239 0.02672337 0.02309248
#> [75,Inf) 0.04433925 0.05437000 0.05768114 0.05328648 0.04628235 0.04056598
#>             [30,35)    [35,40)    [40,45)    [45,50)    [50,55)    [55,60)
#> [0,5)    0.41986733 0.32009689 0.17360957 0.11569004 0.12037866 0.13963653
#> [5,10)   0.19694689 0.25979529 0.17562638 0.08675713 0.05818228 0.06066078
#> [10,15)  0.06840680 0.14500559 0.17617452 0.11106526 0.05520885 0.03659734
#> [15,20)  0.05486552 0.09873357 0.19337627 0.22725016 0.14669997 0.07165755
#> [20,25)  0.18254084 0.14002857 0.22558064 0.41579752 0.48281403 0.30526183
#> [25,30)  0.71662662 0.31076553 0.22102981 0.34863971 0.63404890 0.70368779
#> [30,35)  1.09947651 0.66335662 0.25976689 0.18402632 0.30705923 0.55689293
#> [35,40)  0.44542786 0.69615807 0.38905818 0.14785069 0.11414398 0.19791347
#> [40,45)  0.10570365 0.23577102 0.38790919 0.21832411 0.08906823 0.07049240
#> [45,50)  0.05576763 0.06672604 0.16259149 0.29222377 0.17596322 0.07176800
#> [50,55)  0.08490703 0.04700502 0.06052547 0.16056139 0.29887960 0.17643556
#> [55,60)  0.15502733 0.08205057 0.04822505 0.06592728 0.17762380 0.32453421
#> [60,65)  0.16323144 0.14348475 0.07851056 0.04681736 0.06587880 0.18132674
#> [65,70)  0.08575999 0.13412044 0.11910768 0.06317103 0.03886021 0.05882613
#> [70,75)  0.03362083 0.06630028 0.10307451 0.08846877 0.04884175 0.03304499
#> [75,Inf) 0.03722015 0.04251769 0.06915906 0.11440610 0.14081425 0.12619944
#>             [60,65)    [65,70)    [70,75)   [75,Inf)
#> [0,5)    0.12872383 0.09279565 0.06409376 0.03956186
#> [5,10)   0.06812931 0.06218996 0.04693915 0.02881503
#> [10,15)  0.03596518 0.03899872 0.03646368 0.02322291
#> [15,20)  0.04284739 0.03821128 0.04121576 0.03346612
#> [20,25)  0.13974053 0.07809486 0.07080747 0.07803935
#> [25,30)  0.41835658 0.18662205 0.11256237 0.12583317
#> [30,35)  0.59051931 0.34340814 0.16376160 0.11536980
#> [35,40)  0.34855102 0.36062108 0.21684493 0.08849404
#> [40,45)  0.11557517 0.19407598 0.20429657 0.08723069
#> [45,50)  0.05132625 0.07665599 0.13058578 0.10746461
#> [50,55)  0.06590183 0.04302814 0.06578340 0.12069301
#> [55,60)  0.18261175 0.06557415 0.04480699 0.10889502
#> [60,65)  0.33689152 0.19066624 0.07151884 0.07706387
#> [65,70)  0.17225754 0.32955054 0.18796283 0.06150905
#> [70,75)  0.05311866 0.15452338 0.28435950 0.09695813
#> [75,Inf) 0.08994295 0.07946041 0.15236100 0.24375399

These contact matrices could then be used in subsequent modelling, such as the input in a SIR (Susceptible, Infected, Recovered) model.

Quick example using world data

Similarly to above, we can access some world data from another data source - we have some helpers to pull data from the world population:

world_data <- socialmixr::wpp_age()

head(world_data)
#>   country lower.age.limit year population
#> 1  AFRICA               0 1950   38705049
#> 2  AFRICA               0 1955   44304214
#> 3  AFRICA               0 1960   50491493
#> 4  AFRICA               0 1965   57690110
#> 5  AFRICA               0 1970   65452837
#> 6  AFRICA               0 1975   75017430

We can tidy the data up, filtering down to a specified location and year with the age_population function:

nz_2015 <- age_population(
  data = world_data,
  location_col = country,
  location = "New Zealand",
  age_col = lower.age.limit,
  year_col = year,
  year = 2015
)

nz_2015
#> # A tibble: 21 × 5 (conmat_population)
#>  - age: lower.age.limit
#>  - population: population
#>    country      year population lower.age.limit upper.age.limit
#>    <chr>       <int>      <dbl>           <dbl>           <dbl>
#>  1 New Zealand  2015     308681               0               4
#>  2 New Zealand  2015     315632               5               9
#>  3 New Zealand  2015     297247              10              14
#>  4 New Zealand  2015     318932              15              19
#>  5 New Zealand  2015     339362              20              24
#>  6 New Zealand  2015     315914              25              29
#>  7 New Zealand  2015     287537              30              34
#>  8 New Zealand  2015     273539              35              39
#>  9 New Zealand  2015     312245              40              44
#> 10 New Zealand  2015     313312              45              49
#> # ℹ 11 more rows

Then we could create a contact matrix for NZ population data for 2015 like so:

nz_contact <- extrapolate_polymod(
  population = nz_2015
)
autoplot(nz_contact)

nz_contact$home
#>               [0,5)     [5,10)    [10,15)    [15,20)    [20,25)    [25,30)
#> [0,5)    0.59904546 0.49663687 0.24783852 0.16685575 0.23779435 0.45167131
#> [5,10)   0.49590450 0.80198425 0.52807734 0.19165851 0.12507770 0.21662190
#> [10,15)  0.24801027 0.52922371 0.86792789 0.44837029 0.15277570 0.11386007
#> [15,20)  0.17170557 0.19752052 0.46108309 0.74210402 0.37437407 0.14189402
#> [20,25)  0.24970734 0.13153780 0.16031834 0.38202547 0.60099118 0.32615896
#> [25,30)  0.45823995 0.22009679 0.11543594 0.13989150 0.31511566 0.49499779
#> [30,35)  0.58789796 0.45791877 0.21460518 0.11138674 0.12795622 0.27752805
#> [35,40)  0.43715208 0.59558222 0.44877557 0.20875654 0.10483077 0.11678179
#> [40,45)  0.24637719 0.41784060 0.55173061 0.41392260 0.18686739 0.09189737
#> [45,50)  0.17121478 0.21826147 0.36146201 0.47771534 0.34585427 0.15423717
#> [50,55)  0.17561659 0.14494800 0.18110658 0.29956710 0.37925607 0.27470535
#> [55,60)  0.19091227 0.13941710 0.11237964 0.13809560 0.21781394 0.28043336
#> [60,65)  0.15865533 0.13968437 0.09795321 0.07534281 0.08879449 0.14665213
#> [65,70)  0.09795471 0.10943657 0.09028519 0.05832945 0.04342008 0.05545301
#> [70,75)  0.05312051 0.06525155 0.06632928 0.04936708 0.03166200 0.02651409
#> [75,Inf) 0.05332440 0.06568208 0.06959281 0.06485754 0.05705246 0.05063730
#>             [30,35)    [35,40)    [40,45)    [45,50)    [50,55)    [55,60)
#> [0,5)    0.61686214 0.46910228 0.25365446 0.16899792 0.17376930 0.20206763
#> [5,10)   0.47977067 0.63816915 0.42954802 0.21511775 0.14321182 0.14734590
#> [10,15)  0.22533425 0.48190902 0.56842076 0.35702910 0.17932573 0.11902862
#> [15,20)  0.12027154 0.23052514 0.43853510 0.48523543 0.30503162 0.15041321
#> [20,25)  0.14098644 0.11812819 0.20202509 0.35847842 0.39406680 0.24209084
#> [25,30)  0.29543605 0.12713952 0.09598769 0.15445415 0.27576878 0.30113624
#> [30,35)  0.44075564 0.26007699 0.10241976 0.07378624 0.12176811 0.22102166
#> [35,40)  0.25430395 0.40484109 0.22741314 0.08797632 0.06614672 0.11166469
#> [40,45)  0.10438256 0.23703285 0.38946352 0.22464248 0.09222914 0.07033241
#> [45,50)  0.07843709 0.09564466 0.23431160 0.41266991 0.25196526 0.10302438
#> [50,55)  0.12912555 0.07173578 0.09596274 0.25134673 0.46077162 0.27413423
#> [55,60)  0.21910793 0.11321083 0.06841237 0.09607651 0.25627598 0.46419737
#> [60,65)  0.20993522 0.17991857 0.09746956 0.05977965 0.08481979 0.23166041
#> [65,70)  0.10469289 0.16296452 0.14069818 0.07505802 0.04677449 0.07048263
#> [70,75)  0.03913179 0.07853000 0.11932926 0.10096504 0.05612741 0.03800797
#> [75,Inf) 0.04714828 0.05382800 0.08528198 0.13688872 0.16841350 0.15605344
#>             [60,65)    [65,70)    [70,75)   [75,Inf)
#> [0,5)    0.18890661 0.13733131 0.09484177 0.05470230
#> [5,10)   0.16607313 0.15320249 0.11632882 0.06727994
#> [10,15)  0.11671106 0.12666645 0.11850689 0.07144055
#> [15,20)  0.09231611 0.08415411 0.09070226 0.06846732
#> [20,25)  0.11102181 0.06392410 0.05936161 0.06145876
#> [25,30)  0.17715415 0.07887508 0.04802691 0.05270121
#> [30,35)  0.23822736 0.13988632 0.06658570 0.04609558
#> [35,40)  0.19963353 0.21291306 0.13065862 0.05145799
#> [40,45)  0.11272481 0.19159789 0.20693902 0.08497573
#> [45,50)  0.07211171 0.10661081 0.18262842 0.14226798
#> [50,55)  0.10206625 0.06627440 0.10127563 0.17460191
#> [55,60)  0.26060420 0.09336056 0.06411346 0.15124814
#> [60,65)  0.43126546 0.24334924 0.09119461 0.10262151
#> [65,70)  0.20667038 0.39677196 0.22579696 0.07733600
#> [70,75)  0.06081689 0.17730659 0.32699747 0.10599402
#> [75,Inf) 0.11911085 0.10569303 0.18447573 0.27108627

What next?

From here you might want to:

  • Create a next generation matrix (NGM)
  • Apply vaccination to an NGM

See the vignette, “example pipeline” for more details.

A More in depth example

The above example showed how we might extract a contact matrix based on the polymod data - this example now shows all the steps that can be taken for full flexibility, and provides more detail on the initial datasets that could be used.

First we want to fit the model to the POLYMOD data, which contains various survey and population data.

library(conmat)
polymod_contact_data_home <- get_polymod_contact_data(setting = "home")
polymod_survey_data <- get_polymod_population()

The contact data is a data frame containing the age from and to, and the number of contacts for each of the specified settings, “home”, “work”, “school”, “other”, or “all” as well as the number of participants. By default, polymod_contact_data contains data from “all”, but we’re going to use the “work” set of data, as it produces an interesting looking dataset. Each row contains survey information of the number of contacts. Specifically, the number of contacts from one age group to another age group, and then the number of participants in that age group.

The survey data, polymod_survey_data contains the lower age limit and the population in that age group.

polymod_survey_data
#> # A tibble: 21 × 2 (conmat_population)
#>  - age: lower.age.limit
#>  - population: population
#>    lower.age.limit population
#>              <int>      <dbl>
#>  1               0   1898966.
#>  2               5   2017632.
#>  3              10   2192410.
#>  4              15   2369985.
#>  5              20   2467873.
#>  6              25   2484327.
#>  7              30   2649826.
#>  8              35   3043704.
#>  9              40   3117812.
#> 10              45   2879510.
#> # ℹ 11 more rows

We also provide control over the POLYMOD data retrieved from get_polymod_contact_data() via the arguments, setting, country, and ages. These allow you to specify the data to be only from certain settings or countries or ages. See ?get_polymod_contact_data for more details. Below is a brief example of this:

polymod_contact_data_belgium_0_10 <- get_polymod_contact_data(
  setting = "work",
  countries = "Belgium",
  ages = c(0, 5, 10)
)

polymod_contact_data_belgium_0_10
#> # A tibble: 3,443 × 5
#>    setting age_from age_to contacts participants
#>    <chr>      <int>  <dbl>    <int>        <int>
#>  1 work           0      0        0            5
#>  2 work           0      1        0            3
#>  3 work           0      2        0            2
#>  4 work           0      3        0            2
#>  5 work           0      4        0            1
#>  6 work           0      5        0            5
#>  7 work           0      7        0            1
#>  8 work           0     10        0            5
#>  9 work           0     11        0            1
#> 10 work           0     12        0            1
#> # ℹ 3,433 more rows

Similarly, you can control the population data, retrieving it only for certain countries:

get_polymod_population(countries = "Belgium")
#> # A tibble: 21 × 2 (conmat_population)
#>  - age: lower.age.limit
#>  - population: population
#>    lower.age.limit population
#>              <int>      <dbl>
#>  1               0     583492
#>  2               5     593148
#>  3              10     632157
#>  4              15     626921
#>  5              20     649588
#>  6              25     663176
#>  7              30     705878
#>  8              35     773177
#>  9              40     823305
#> 10              45     779436
#> # ℹ 11 more rows
get_polymod_population(countries = "Finland")
#> # A tibble: 21 × 2 (conmat_population)
#>  - age: lower.age.limit
#>  - population: population
#>    lower.age.limit population
#>              <int>      <dbl>
#>  1               0     284683
#>  2               5     296753
#>  3              10     330425
#>  4              15     319947
#>  5              20     334180
#>  6              25     331510
#>  7              30     308108
#>  8              35     352962
#>  9              40     379090
#> 10              45     381713
#> # ℹ 11 more rows

You can see the available countries in the helpfile with ?get_polymod_population.

Predicting the contact rate

We can create a model of the contact rate with the function fit_single_contact_model. We’re first going to use some smaller sets of the data, to save on computation time.

set.seed(2022 - 10 - 04)
polymod_contact_data_home_small <- polymod_contact_data_home %>%
  filter(
    age_from <= 30,
    age_to <= 30
  )

polymod_survey_data_small <- polymod_survey_data %>%
  filter(lower.age.limit <= 30)

contact_model <- fit_single_contact_model(
  contact_data = polymod_contact_data_home_small,
  population = polymod_survey_data_small
)

This fits a generalised additive model (GAM), predicting the contact rate, based on a series of prediction terms that describe various features of the contact rates.

contact_model
#> 
#> Family: poisson 
#> Link function: log 
#> 
#> Formula:
#> contacts ~ s(gam_age_offdiag) + s(gam_age_offdiag_2) + s(gam_age_diag_prod) + 
#>     s(gam_age_diag_sum) + s(gam_age_pmax) + s(gam_age_pmin) + 
#>     school_probability + work_probability + offset(log_contactable_population)
#> 
#> Estimated degrees of freedom:
#> 7.13 1.00 4.60 1.00 5.70 3.95  total = 26.37 
#> 
#> fREML score: 1657.92     rank: 55/57

We can use this contact model to then predict the contact rate in a new population.

As a demonstration, let’s take an age population from a given LGA in Australia (this was the initial motivation for the package, so there are some helper functions for Australian specific data).

fairfield <- abs_age_lga("Fairfield (C)")
fairfield
#> # A tibble: 18 × 4 (conmat_population)
#>  - age: lower.age.limit
#>  - population: population
#>    lga           lower.age.limit  year population
#>    <chr>                   <dbl> <dbl>      <dbl>
#>  1 Fairfield (C)               0  2020      12261
#>  2 Fairfield (C)               5  2020      13093
#>  3 Fairfield (C)              10  2020      13602
#>  4 Fairfield (C)              15  2020      14323
#>  5 Fairfield (C)              20  2020      15932
#>  6 Fairfield (C)              25  2020      16190
#>  7 Fairfield (C)              30  2020      14134
#>  8 Fairfield (C)              35  2020      13034
#>  9 Fairfield (C)              40  2020      12217
#> 10 Fairfield (C)              45  2020      13449
#> 11 Fairfield (C)              50  2020      13419
#> 12 Fairfield (C)              55  2020      13652
#> 13 Fairfield (C)              60  2020      12907
#> 14 Fairfield (C)              65  2020      10541
#> 15 Fairfield (C)              70  2020       8227
#> 16 Fairfield (C)              75  2020       5598
#> 17 Fairfield (C)              80  2020       4006
#> 18 Fairfield (C)              85  2020       4240

We can then pass the contact model through to predict_contacts, along with the fairfield age population data, and some age breaks that we want to predict to. Note that these age breaks could be any size, we just have them set to 5 year age brackets in most of the examples, but these could be 1 year, 2 year, or even sub year.

set.seed(2022 - 10 - 04)
synthetic_contact_fairfield <- predict_contacts(
  model = contact_model,
  population = fairfield,
  age_breaks = c(seq(0, 30, by = 5), Inf)
)

synthetic_contact_fairfield
#> # A tibble: 49 × 3
#>    age_group_from age_group_to contacts
#>    <fct>          <fct>           <dbl>
#>  1 [0,5)          [0,5)        1.79e- 1
#>  2 [0,5)          [5,10)       1.61e- 1
#>  3 [0,5)          [10,15)      8.53e- 2
#>  4 [0,5)          [15,20)      5.66e- 2
#>  5 [0,5)          [20,25)      8.57e- 2
#>  6 [0,5)          [25,30)      1.80e- 1
#>  7 [0,5)          [30,Inf)     1.75e+19
#>  8 [5,10)         [0,5)        1.53e- 1
#>  9 [5,10)         [5,10)       2.52e- 1
#> 10 [5,10)         [10,15)      1.89e- 1
#> # ℹ 39 more rows

Plotting

Let’s visualise the matrix to get a sense of the predictions with autoplot. First we need to transform the predictions to a matrix:

synthetic_contact_fairfield %>%
  predictions_to_matrix() %>%
  autoplot()

Note

It is worth noting that the contact matrices created using this package are transposed when compared to the contact matrices discussed by Prem and Mossong. That is, the rows are “age group to”, and the columns are “age group from”.

Applying the model across all settings.

Our experience has been that we would be fitting the same models to each setting when doing data analysis when using conmat. Accordingly, you can also fit a model for all of the settings all at once with the functions, fit_setting_contacts(), and predict_setting_contacts(). This means we can do the above, but for each setting, “home”, “work”, “school”, “other”, and “all”, at once. If we want to use all of the POLYMOD data, we can also use the extrapolate_polymod() function.

Fit to all settings

We can create a model for each of the settings with fit_setting_contacts().

set.seed(2021 - 09 - 24)

polymod_setting_data <- get_polymod_setting_data()

polymod_setting_data_small <- polymod_setting_data %>%
  lapply(FUN = function(x) x %>% filter(age_from <= 20, age_to <= 20))

setting_models <- fit_setting_contacts(
  contact_data_list = polymod_setting_data_small,
  population = polymod_survey_data
)

This contains a list of models, one for each setting. We can look at one, and get summary information out:

names(setting_models)
#> [1] "home"   "work"   "school" "other"
setting_models$home
#> 
#> Family: poisson 
#> Link function: log 
#> 
#> Formula:
#> contacts ~ s(gam_age_offdiag) + s(gam_age_offdiag_2) + s(gam_age_diag_prod) + 
#>     s(gam_age_diag_sum) + s(gam_age_pmax) + s(gam_age_pmin) + 
#>     school_probability + work_probability + offset(log_contactable_population)
#> 
#> Estimated degrees of freedom:
#> 6.17 1.00 1.00 1.00 2.80 4.56  total = 19.53 
#> 
#> fREML score: 790.1204     rank: 55/57

So this gives us our baseline model of a contact model. We have fit this model to the existing contact survey data. We can now predict to another age population, to create our “synthetic” contact matrix.

Predict to all settings

Then we take the model we had earlier, and extrapolate to the fairfield data with predict_setting_contacts, also providing some age breaks we want to predict to

set.seed(2021 - 10 - 04)
synthetic_settings_5y_fairfield <- predict_setting_contacts(
  population = fairfield,
  contact_model = setting_models,
  age_breaks = c(seq(0, 20, by = 5), Inf)
)

This then returns a list of synthetic matrices, “home”, “work”, “school”, “other”, and “all”, which is the sum of all matrices.

str(synthetic_settings_5y_fairfield)
#> List of 5
#>  $ home  : 'conmat_age_matrix' num [1:5, 1:5] 1.15e-01 1.04e-01 5.77e-02 3.50e-02 8.76e+30 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#>  $ work  : 'conmat_age_matrix' num [1:5, 1:5] 0.000233 0.00014 0.000169 0.002111 0.000361 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#>  $ school: 'conmat_age_matrix' num [1:5, 1:5] 0.66797 0.13932 0.00889 0.00804 Inf ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#>  $ other : 'conmat_age_matrix' num [1:5, 1:5] 1.65e-01 9.82e-02 3.90e-02 2.62e-02 8.52e+37 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#>  $ all   : 'conmat_age_matrix' num [1:5, 1:5] 0.9473 0.3417 0.1057 0.0713 Inf ...
#>   ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>   .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#>  - attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#>  - attr(*, "class")= chr [1:2] "conmat_setting_prediction_matrix" "list"
synthetic_settings_5y_fairfield$home
#>                 [0,5)       [5,10)      [10,15)      [15,20)     [20,Inf)
#> [0,5)    1.145796e-01 9.841129e-02 5.180681e-02 2.942438e-02 6.813871e+29
#> [5,10)   1.040635e-01 1.593116e-01 1.124681e-01 4.160005e-02 7.071628e+23
#> [10,15)  5.767858e-02 1.184141e-01 1.884676e-01 9.906429e-02 2.318631e+18
#> [15,20)  3.495467e-02 4.673458e-02 1.057030e-01 1.652793e-01 1.481966e+13
#> [20,Inf) 8.760961e+30 8.598521e+24 2.677699e+19 1.603977e+14 1.033909e+09
synthetic_settings_5y_fairfield$all
#>               [0,5)    [5,10)    [10,15)    [15,20) [20,Inf)
#> [0,5)    0.94731881 0.3241321 0.09496626 0.06005841      Inf
#> [5,10)   0.34173831 4.6139213 0.51481268 0.09161578      Inf
#> [10,15)  0.10572973 0.5400433 6.64997415 0.58019783      Inf
#> [15,20)  0.07134634 0.1029235 0.61571645 5.29546771      Inf
#> [20,Inf)        Inf       Inf        Inf        Inf      Inf

We can use autoplot to plot all at once

# this code is erroring for the moment - something to do with rendering a large plot I think.
autoplot(
  synthetic_settings_5y_fairfield,
  title = "Setting-specific synthetic contact matrices (fairfield 2020 projected)"
)