Getting Started
getting-started.Rmd
library(conmat)
library(socialmixr)
#>
#> Attaching package: 'socialmixr'
#> The following object is masked from 'package:utils':
#>
#> cite
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(mgcv)
#> Loading required package: nlme
#>
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
#> This is mgcv 1.9-0. For overview type 'help("mgcv-package")'.
library(patchwork)
The goal of conmat is to simplify the process of generating synthetic contact matrices for a given age population.
What is a contact matrix?
Contact matrices describe the degree of contact between individuals of given age groups. For example, this matrix describes the number of contacts between individuals.
#> 0-4 5-9 10-14
#> 0-4 10 3 4
#> 5-9 3 11 5
#> 10-14 4 5 13
The rows and columns represent the age groups of the people. On the main diagonal we see that we have a higher number of contacts - showing that people of similar ages tend to interact more with one another. We can use the information in these matrices to model how diseases such as COVID-19 spread in a population through social contact.
Why do we need synthetic contact matrices?
Contact matrices are produced from empirical data resulting from a contact survey, which requires individuals to diary the amount and manner of contact a person has in a day. However, these surveys are highly time-consuming and expensive to run, meaning that only a handful of these empirical datasets exist globally.
We can use statistical methods to create synthetic contact matrices, which are new contact matrices that have been generalised to new countries based on existing surveys.
Why do we need conmat
?
Existing methods only provide outputs of the contact matrices for each country, or at best, for urban and rural areas for a given country. We need methods that allow for flexibly creating synthetic contact matrices for a specified age population. This is because the age population distribution of many countries (e.g., Australia), are quite heterogeneous, and assuming it is homogeneous would result in inaccurate representation of community infection in many regions.
Quick example using Australian data
Suppose we want to get a contact matrix for a given region in
Australia, let’s say the city of Perth. We can get that from a helper
function, abs_age_lga
.
perth <- abs_age_lga("Perth (C)")
perth
#> # A tibble: 18 × 4 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lga lower.age.limit year population
#> <chr> <dbl> <dbl> <dbl>
#> 1 Perth (C) 0 2020 1331
#> 2 Perth (C) 5 2020 834
#> 3 Perth (C) 10 2020 529
#> 4 Perth (C) 15 2020 794
#> 5 Perth (C) 20 2020 3615
#> 6 Perth (C) 25 2020 5324
#> 7 Perth (C) 30 2020 4667
#> 8 Perth (C) 35 2020 3110
#> 9 Perth (C) 40 2020 1650
#> 10 Perth (C) 45 2020 1445
#> 11 Perth (C) 50 2020 1299
#> 12 Perth (C) 55 2020 1344
#> 13 Perth (C) 60 2020 1359
#> 14 Perth (C) 65 2020 1145
#> 15 Perth (C) 70 2020 1004
#> 16 Perth (C) 75 2020 673
#> 17 Perth (C) 80 2020 481
#> 18 Perth (C) 85 2020 367
(You can learn more about the data sources we provide in
data-sources.Rmd
)
We can get a contact matrix made for perth
using the
extrapolate_polymod
function:
perth_contact <- extrapolate_polymod(
population = perth
)
perth_contact
#>
#> ── Setting Prediction Matrices ─────────────────────────────────────────────────
#> A list of matrices containing the model predicted contact rate between ages in
#> each setting.
#> There are 16 age breaks, ranging 0-75+ years, with a regular 5 year interval
#> • home: a 16x16 <matrix>
#> • work: a 16x16 <matrix>
#> • school: a 16x16 <matrix>
#> • other: a 16x16 <matrix>
#> • all: a 16x16 <matrix>
#> ℹ Access each <matrix> with `x$name`
#> ℹ e.g., `x$home`
We can plot this with autoplot
autoplot(perth_contact)
And you can see each contact matrix in a setting by referring to its name - for example, the “home” setting contact matrix:
perth_contact$home
#> [0,5) [5,10) [10,15) [15,20) [20,25) [25,30)
#> [0,5) 0.40076181 0.33477812 0.16555012 0.11608014 0.17617098 0.32079759
#> [5,10) 0.19885124 0.32216319 0.20765643 0.07226656 0.05193349 0.09275907
#> [10,15) 0.07470057 0.15774971 0.26654020 0.12883717 0.04419741 0.03535209
#> [15,20) 0.08170674 0.08563793 0.20097718 0.34425132 0.17025978 0.06763197
#> [20,25) 0.33292330 0.16522893 0.18510249 0.45711126 0.73008179 0.41491477
#> [25,30) 1.11525912 0.54291253 0.27237380 0.33403867 0.76329736 1.15720681
#> [30,35) 1.45860486 1.15187002 0.52666015 0.27078533 0.33556433 0.71610024
#> [35,40) 0.74668485 1.02027190 0.74962874 0.32720560 0.17284734 0.20851823
#> [40,45) 0.24541744 0.41797497 0.55192548 0.38836042 0.16874222 0.08987479
#> [45,50) 0.12179342 0.15376654 0.25912643 0.33988507 0.23163274 0.10557474
#> [50,55) 0.11563694 0.09409502 0.11753342 0.20020602 0.24542407 0.17519639
#> [55,60) 0.13503961 0.09876405 0.07843632 0.09845190 0.15621575 0.19574806
#> [60,65) 0.12361018 0.11014329 0.07653905 0.05845473 0.07100810 0.11555725
#> [65,70) 0.08050583 0.09083407 0.07498176 0.04709679 0.03585191 0.04657126
#> [70,75) 0.04571279 0.05636194 0.05763523 0.04176239 0.02672337 0.02309248
#> [75,Inf) 0.04433925 0.05437000 0.05768114 0.05328648 0.04628235 0.04056598
#> [30,35) [35,40) [40,45) [45,50) [50,55) [55,60)
#> [0,5) 0.41986733 0.32009689 0.17360957 0.11569004 0.12037866 0.13963653
#> [5,10) 0.19694689 0.25979529 0.17562638 0.08675713 0.05818228 0.06066078
#> [10,15) 0.06840680 0.14500559 0.17617452 0.11106526 0.05520885 0.03659734
#> [15,20) 0.05486552 0.09873357 0.19337627 0.22725016 0.14669997 0.07165755
#> [20,25) 0.18254084 0.14002857 0.22558064 0.41579752 0.48281403 0.30526183
#> [25,30) 0.71662662 0.31076553 0.22102981 0.34863971 0.63404890 0.70368779
#> [30,35) 1.09947651 0.66335662 0.25976689 0.18402632 0.30705923 0.55689293
#> [35,40) 0.44542786 0.69615807 0.38905818 0.14785069 0.11414398 0.19791347
#> [40,45) 0.10570365 0.23577102 0.38790919 0.21832411 0.08906823 0.07049240
#> [45,50) 0.05576763 0.06672604 0.16259149 0.29222377 0.17596322 0.07176800
#> [50,55) 0.08490703 0.04700502 0.06052547 0.16056139 0.29887960 0.17643556
#> [55,60) 0.15502733 0.08205057 0.04822505 0.06592728 0.17762380 0.32453421
#> [60,65) 0.16323144 0.14348475 0.07851056 0.04681736 0.06587880 0.18132674
#> [65,70) 0.08575999 0.13412044 0.11910768 0.06317103 0.03886021 0.05882613
#> [70,75) 0.03362083 0.06630028 0.10307451 0.08846877 0.04884175 0.03304499
#> [75,Inf) 0.03722015 0.04251769 0.06915906 0.11440610 0.14081425 0.12619944
#> [60,65) [65,70) [70,75) [75,Inf)
#> [0,5) 0.12872383 0.09279565 0.06409376 0.03956186
#> [5,10) 0.06812931 0.06218996 0.04693915 0.02881503
#> [10,15) 0.03596518 0.03899872 0.03646368 0.02322291
#> [15,20) 0.04284739 0.03821128 0.04121576 0.03346612
#> [20,25) 0.13974053 0.07809486 0.07080747 0.07803935
#> [25,30) 0.41835658 0.18662205 0.11256237 0.12583317
#> [30,35) 0.59051931 0.34340814 0.16376160 0.11536980
#> [35,40) 0.34855102 0.36062108 0.21684493 0.08849404
#> [40,45) 0.11557517 0.19407598 0.20429657 0.08723069
#> [45,50) 0.05132625 0.07665599 0.13058578 0.10746461
#> [50,55) 0.06590183 0.04302814 0.06578340 0.12069301
#> [55,60) 0.18261175 0.06557415 0.04480699 0.10889502
#> [60,65) 0.33689152 0.19066624 0.07151884 0.07706387
#> [65,70) 0.17225754 0.32955054 0.18796283 0.06150905
#> [70,75) 0.05311866 0.15452338 0.28435950 0.09695813
#> [75,Inf) 0.08994295 0.07946041 0.15236100 0.24375399
These contact matrices could then be used in subsequent modelling, such as the input in a SIR (Susceptible, Infected, Recovered) model.
Quick example using world data
Similarly to above, we can access some world data from another data source - we have some helpers to pull data from the world population:
world_data <- socialmixr::wpp_age()
head(world_data)
#> country lower.age.limit year population
#> 1 AFRICA 0 1950 38705049
#> 2 AFRICA 0 1955 44304214
#> 3 AFRICA 0 1960 50491493
#> 4 AFRICA 0 1965 57690110
#> 5 AFRICA 0 1970 65452837
#> 6 AFRICA 0 1975 75017430
We can tidy the data up, filtering down to a specified location and
year with the age_population
function:
nz_2015 <- age_population(
data = world_data,
location_col = country,
location = "New Zealand",
age_col = lower.age.limit,
year_col = year,
year = 2015
)
nz_2015
#> # A tibble: 21 × 5 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> country year population lower.age.limit upper.age.limit
#> <chr> <int> <dbl> <dbl> <dbl>
#> 1 New Zealand 2015 308681 0 4
#> 2 New Zealand 2015 315632 5 9
#> 3 New Zealand 2015 297247 10 14
#> 4 New Zealand 2015 318932 15 19
#> 5 New Zealand 2015 339362 20 24
#> 6 New Zealand 2015 315914 25 29
#> 7 New Zealand 2015 287537 30 34
#> 8 New Zealand 2015 273539 35 39
#> 9 New Zealand 2015 312245 40 44
#> 10 New Zealand 2015 313312 45 49
#> # ℹ 11 more rows
Then we could create a contact matrix for NZ population data for 2015 like so:
nz_contact <- extrapolate_polymod(
population = nz_2015
)
autoplot(nz_contact)
nz_contact$home
#> [0,5) [5,10) [10,15) [15,20) [20,25) [25,30)
#> [0,5) 0.59904546 0.49663687 0.24783852 0.16685575 0.23779435 0.45167131
#> [5,10) 0.49590450 0.80198425 0.52807734 0.19165851 0.12507770 0.21662190
#> [10,15) 0.24801027 0.52922371 0.86792789 0.44837029 0.15277570 0.11386007
#> [15,20) 0.17170557 0.19752052 0.46108309 0.74210402 0.37437407 0.14189402
#> [20,25) 0.24970734 0.13153780 0.16031834 0.38202547 0.60099118 0.32615896
#> [25,30) 0.45823995 0.22009679 0.11543594 0.13989150 0.31511566 0.49499779
#> [30,35) 0.58789796 0.45791877 0.21460518 0.11138674 0.12795622 0.27752805
#> [35,40) 0.43715208 0.59558222 0.44877557 0.20875654 0.10483077 0.11678179
#> [40,45) 0.24637719 0.41784060 0.55173061 0.41392260 0.18686739 0.09189737
#> [45,50) 0.17121478 0.21826147 0.36146201 0.47771534 0.34585427 0.15423717
#> [50,55) 0.17561659 0.14494800 0.18110658 0.29956710 0.37925607 0.27470535
#> [55,60) 0.19091227 0.13941710 0.11237964 0.13809560 0.21781394 0.28043336
#> [60,65) 0.15865533 0.13968437 0.09795321 0.07534281 0.08879449 0.14665213
#> [65,70) 0.09795471 0.10943657 0.09028519 0.05832945 0.04342008 0.05545301
#> [70,75) 0.05312051 0.06525155 0.06632928 0.04936708 0.03166200 0.02651409
#> [75,Inf) 0.05332440 0.06568208 0.06959281 0.06485754 0.05705246 0.05063730
#> [30,35) [35,40) [40,45) [45,50) [50,55) [55,60)
#> [0,5) 0.61686214 0.46910228 0.25365446 0.16899792 0.17376930 0.20206763
#> [5,10) 0.47977067 0.63816915 0.42954802 0.21511775 0.14321182 0.14734590
#> [10,15) 0.22533425 0.48190902 0.56842076 0.35702910 0.17932573 0.11902862
#> [15,20) 0.12027154 0.23052514 0.43853510 0.48523543 0.30503162 0.15041321
#> [20,25) 0.14098644 0.11812819 0.20202509 0.35847842 0.39406680 0.24209084
#> [25,30) 0.29543605 0.12713952 0.09598769 0.15445415 0.27576878 0.30113624
#> [30,35) 0.44075564 0.26007699 0.10241976 0.07378624 0.12176811 0.22102166
#> [35,40) 0.25430395 0.40484109 0.22741314 0.08797632 0.06614672 0.11166469
#> [40,45) 0.10438256 0.23703285 0.38946352 0.22464248 0.09222914 0.07033241
#> [45,50) 0.07843709 0.09564466 0.23431160 0.41266991 0.25196526 0.10302438
#> [50,55) 0.12912555 0.07173578 0.09596274 0.25134673 0.46077162 0.27413423
#> [55,60) 0.21910793 0.11321083 0.06841237 0.09607651 0.25627598 0.46419737
#> [60,65) 0.20993522 0.17991857 0.09746956 0.05977965 0.08481979 0.23166041
#> [65,70) 0.10469289 0.16296452 0.14069818 0.07505802 0.04677449 0.07048263
#> [70,75) 0.03913179 0.07853000 0.11932926 0.10096504 0.05612741 0.03800797
#> [75,Inf) 0.04714828 0.05382800 0.08528198 0.13688872 0.16841350 0.15605344
#> [60,65) [65,70) [70,75) [75,Inf)
#> [0,5) 0.18890661 0.13733131 0.09484177 0.05470230
#> [5,10) 0.16607313 0.15320249 0.11632882 0.06727994
#> [10,15) 0.11671106 0.12666645 0.11850689 0.07144055
#> [15,20) 0.09231611 0.08415411 0.09070226 0.06846732
#> [20,25) 0.11102181 0.06392410 0.05936161 0.06145876
#> [25,30) 0.17715415 0.07887508 0.04802691 0.05270121
#> [30,35) 0.23822736 0.13988632 0.06658570 0.04609558
#> [35,40) 0.19963353 0.21291306 0.13065862 0.05145799
#> [40,45) 0.11272481 0.19159789 0.20693902 0.08497573
#> [45,50) 0.07211171 0.10661081 0.18262842 0.14226798
#> [50,55) 0.10206625 0.06627440 0.10127563 0.17460191
#> [55,60) 0.26060420 0.09336056 0.06411346 0.15124814
#> [60,65) 0.43126546 0.24334924 0.09119461 0.10262151
#> [65,70) 0.20667038 0.39677196 0.22579696 0.07733600
#> [70,75) 0.06081689 0.17730659 0.32699747 0.10599402
#> [75,Inf) 0.11911085 0.10569303 0.18447573 0.27108627
What next?
From here you might want to:
- Create a next generation matrix (NGM)
- Apply vaccination to an NGM
See the vignette, “example pipeline” for more details.
A More in depth example
The above example showed how we might extract a contact matrix based on the polymod data - this example now shows all the steps that can be taken for full flexibility, and provides more detail on the initial datasets that could be used.
First we want to fit the model to the POLYMOD data, which contains various survey and population data.
library(conmat)
polymod_contact_data_home <- get_polymod_contact_data(setting = "home")
polymod_survey_data <- get_polymod_population()
The contact data is a data frame containing the age from and to, and
the number of contacts for each of the specified settings, “home”,
“work”, “school”, “other”, or “all” as well as the number of
participants. By default, polymod_contact_data
contains
data from “all”, but we’re going to use the “work” set of data, as it
produces an interesting looking dataset. Each row contains survey
information of the number of contacts. Specifically, the number of
contacts from one age group to another age group, and then the number of
participants in that age group.
The survey data, polymod_survey_data
contains the lower
age limit and the population in that age group.
polymod_survey_data
#> # A tibble: 21 × 2 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lower.age.limit population
#> <int> <dbl>
#> 1 0 1898966.
#> 2 5 2017632.
#> 3 10 2192410.
#> 4 15 2369985.
#> 5 20 2467873.
#> 6 25 2484327.
#> 7 30 2649826.
#> 8 35 3043704.
#> 9 40 3117812.
#> 10 45 2879510.
#> # ℹ 11 more rows
We also provide control over the POLYMOD data retrieved from
get_polymod_contact_data()
via the arguments,
setting
, country
, and ages
. These
allow you to specify the data to be only from certain settings or
countries or ages. See ?get_polymod_contact_data
for more
details. Below is a brief example of this:
polymod_contact_data_belgium_0_10 <- get_polymod_contact_data(
setting = "work",
countries = "Belgium",
ages = c(0, 5, 10)
)
polymod_contact_data_belgium_0_10
#> # A tibble: 3,443 × 5
#> setting age_from age_to contacts participants
#> <chr> <int> <dbl> <int> <int>
#> 1 work 0 0 0 5
#> 2 work 0 1 0 3
#> 3 work 0 2 0 2
#> 4 work 0 3 0 2
#> 5 work 0 4 0 1
#> 6 work 0 5 0 5
#> 7 work 0 7 0 1
#> 8 work 0 10 0 5
#> 9 work 0 11 0 1
#> 10 work 0 12 0 1
#> # ℹ 3,433 more rows
Similarly, you can control the population data, retrieving it only for certain countries:
get_polymod_population(countries = "Belgium")
#> # A tibble: 21 × 2 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lower.age.limit population
#> <int> <dbl>
#> 1 0 583492
#> 2 5 593148
#> 3 10 632157
#> 4 15 626921
#> 5 20 649588
#> 6 25 663176
#> 7 30 705878
#> 8 35 773177
#> 9 40 823305
#> 10 45 779436
#> # ℹ 11 more rows
get_polymod_population(countries = "Finland")
#> # A tibble: 21 × 2 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lower.age.limit population
#> <int> <dbl>
#> 1 0 284683
#> 2 5 296753
#> 3 10 330425
#> 4 15 319947
#> 5 20 334180
#> 6 25 331510
#> 7 30 308108
#> 8 35 352962
#> 9 40 379090
#> 10 45 381713
#> # ℹ 11 more rows
You can see the available countries in the helpfile with
?get_polymod_population
.
Predicting the contact rate
We can create a model of the contact rate with the function
fit_single_contact_model
. We’re first going to use some
smaller sets of the data, to save on computation time.
set.seed(2022 - 10 - 04)
polymod_contact_data_home_small <- polymod_contact_data_home %>%
filter(
age_from <= 30,
age_to <= 30
)
polymod_survey_data_small <- polymod_survey_data %>%
filter(lower.age.limit <= 30)
contact_model <- fit_single_contact_model(
contact_data = polymod_contact_data_home_small,
population = polymod_survey_data_small
)
This fits a generalised additive model (GAM), predicting the contact rate, based on a series of prediction terms that describe various features of the contact rates.
contact_model
#>
#> Family: poisson
#> Link function: log
#>
#> Formula:
#> contacts ~ s(gam_age_offdiag) + s(gam_age_offdiag_2) + s(gam_age_diag_prod) +
#> s(gam_age_diag_sum) + s(gam_age_pmax) + s(gam_age_pmin) +
#> school_probability + work_probability + offset(log_contactable_population)
#>
#> Estimated degrees of freedom:
#> 7.13 1.00 4.60 1.00 5.70 3.95 total = 26.37
#>
#> fREML score: 1657.92 rank: 55/57
We can use this contact model to then predict the contact rate in a new population.
As a demonstration, let’s take an age population from a given LGA in Australia (this was the initial motivation for the package, so there are some helper functions for Australian specific data).
fairfield <- abs_age_lga("Fairfield (C)")
fairfield
#> # A tibble: 18 × 4 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lga lower.age.limit year population
#> <chr> <dbl> <dbl> <dbl>
#> 1 Fairfield (C) 0 2020 12261
#> 2 Fairfield (C) 5 2020 13093
#> 3 Fairfield (C) 10 2020 13602
#> 4 Fairfield (C) 15 2020 14323
#> 5 Fairfield (C) 20 2020 15932
#> 6 Fairfield (C) 25 2020 16190
#> 7 Fairfield (C) 30 2020 14134
#> 8 Fairfield (C) 35 2020 13034
#> 9 Fairfield (C) 40 2020 12217
#> 10 Fairfield (C) 45 2020 13449
#> 11 Fairfield (C) 50 2020 13419
#> 12 Fairfield (C) 55 2020 13652
#> 13 Fairfield (C) 60 2020 12907
#> 14 Fairfield (C) 65 2020 10541
#> 15 Fairfield (C) 70 2020 8227
#> 16 Fairfield (C) 75 2020 5598
#> 17 Fairfield (C) 80 2020 4006
#> 18 Fairfield (C) 85 2020 4240
We can then pass the contact model through to
predict_contacts
, along with the fairfield age population
data, and some age breaks that we want to predict to. Note that these
age breaks could be any size, we just have them set to 5 year age
brackets in most of the examples, but these could be 1 year, 2 year, or
even sub year.
set.seed(2022 - 10 - 04)
synthetic_contact_fairfield <- predict_contacts(
model = contact_model,
population = fairfield,
age_breaks = c(seq(0, 30, by = 5), Inf)
)
synthetic_contact_fairfield
#> # A tibble: 49 × 3
#> age_group_from age_group_to contacts
#> <fct> <fct> <dbl>
#> 1 [0,5) [0,5) 1.79e- 1
#> 2 [0,5) [5,10) 1.61e- 1
#> 3 [0,5) [10,15) 8.53e- 2
#> 4 [0,5) [15,20) 5.66e- 2
#> 5 [0,5) [20,25) 8.57e- 2
#> 6 [0,5) [25,30) 1.80e- 1
#> 7 [0,5) [30,Inf) 1.75e+19
#> 8 [5,10) [0,5) 1.53e- 1
#> 9 [5,10) [5,10) 2.52e- 1
#> 10 [5,10) [10,15) 1.89e- 1
#> # ℹ 39 more rows
Plotting
Let’s visualise the matrix to get a sense of the predictions with
autoplot
. First we need to transform the predictions to a
matrix:
synthetic_contact_fairfield %>%
predictions_to_matrix() %>%
autoplot()
Note
It is worth noting that the contact matrices created using this package are transposed when compared to the contact matrices discussed by Prem and Mossong. That is, the rows are “age group to”, and the columns are “age group from”.
Applying the model across all settings.
Our experience has been that we would be fitting the same models to
each setting when doing data analysis when using conmat. Accordingly,
you can also fit a model for all of the settings all at once with the
functions, fit_setting_contacts()
, and
predict_setting_contacts()
. This means we can do the above,
but for each setting, “home”, “work”, “school”, “other”, and “all”, at
once. If we want to use all of the POLYMOD data, we can also use the
extrapolate_polymod()
function.
Fit to all settings
We can create a model for each of the settings with
fit_setting_contacts()
.
set.seed(2021 - 09 - 24)
polymod_setting_data <- get_polymod_setting_data()
polymod_setting_data_small <- polymod_setting_data %>%
lapply(FUN = function(x) x %>% filter(age_from <= 20, age_to <= 20))
setting_models <- fit_setting_contacts(
contact_data_list = polymod_setting_data_small,
population = polymod_survey_data
)
This contains a list of models, one for each setting. We can look at one, and get summary information out:
names(setting_models)
#> [1] "home" "work" "school" "other"
setting_models$home
#>
#> Family: poisson
#> Link function: log
#>
#> Formula:
#> contacts ~ s(gam_age_offdiag) + s(gam_age_offdiag_2) + s(gam_age_diag_prod) +
#> s(gam_age_diag_sum) + s(gam_age_pmax) + s(gam_age_pmin) +
#> school_probability + work_probability + offset(log_contactable_population)
#>
#> Estimated degrees of freedom:
#> 6.17 1.00 1.00 1.00 2.80 4.56 total = 19.53
#>
#> fREML score: 790.1204 rank: 55/57
So this gives us our baseline model of a contact model. We have fit this model to the existing contact survey data. We can now predict to another age population, to create our “synthetic” contact matrix.
Predict to all settings
Then we take the model we had earlier, and extrapolate to the
fairfield data with predict_setting_contacts
, also
providing some age breaks we want to predict to
set.seed(2021 - 10 - 04)
synthetic_settings_5y_fairfield <- predict_setting_contacts(
population = fairfield,
contact_model = setting_models,
age_breaks = c(seq(0, 20, by = 5), Inf)
)
This then returns a list of synthetic matrices, “home”, “work”, “school”, “other”, and “all”, which is the sum of all matrices.
str(synthetic_settings_5y_fairfield)
#> List of 5
#> $ home : 'conmat_age_matrix' num [1:5, 1:5] 1.15e-01 1.04e-01 5.77e-02 3.50e-02 8.76e+30 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#> $ work : 'conmat_age_matrix' num [1:5, 1:5] 0.000233 0.00014 0.000169 0.002111 0.000361 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#> $ school: 'conmat_age_matrix' num [1:5, 1:5] 0.66797 0.13932 0.00889 0.00804 Inf ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#> $ other : 'conmat_age_matrix' num [1:5, 1:5] 1.65e-01 9.82e-02 3.90e-02 2.62e-02 8.52e+37 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#> $ all : 'conmat_age_matrix' num [1:5, 1:5] 0.9473 0.3417 0.1057 0.0713 Inf ...
#> ..- attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> .. ..$ : chr [1:5] "[0,5)" "[5,10)" "[10,15)" "[15,20)" ...
#> - attr(*, "age_breaks")= num [1:6] 0 5 10 15 20 ...
#> - attr(*, "class")= chr [1:2] "conmat_setting_prediction_matrix" "list"
synthetic_settings_5y_fairfield$home
#> [0,5) [5,10) [10,15) [15,20) [20,Inf)
#> [0,5) 1.145796e-01 9.841129e-02 5.180681e-02 2.942438e-02 6.813871e+29
#> [5,10) 1.040635e-01 1.593116e-01 1.124681e-01 4.160005e-02 7.071628e+23
#> [10,15) 5.767858e-02 1.184141e-01 1.884676e-01 9.906429e-02 2.318631e+18
#> [15,20) 3.495467e-02 4.673458e-02 1.057030e-01 1.652793e-01 1.481966e+13
#> [20,Inf) 8.760961e+30 8.598521e+24 2.677699e+19 1.603977e+14 1.033909e+09
synthetic_settings_5y_fairfield$all
#> [0,5) [5,10) [10,15) [15,20) [20,Inf)
#> [0,5) 0.94731881 0.3241321 0.09496626 0.06005841 Inf
#> [5,10) 0.34173831 4.6139213 0.51481268 0.09161578 Inf
#> [10,15) 0.10572973 0.5400433 6.64997415 0.58019783 Inf
#> [15,20) 0.07134634 0.1029235 0.61571645 5.29546771 Inf
#> [20,Inf) Inf Inf Inf Inf Inf
We can use autoplot
to plot all at once
# this code is erroring for the moment - something to do with rendering a large plot I think.
autoplot(
synthetic_settings_5y_fairfield,
title = "Setting-specific synthetic contact matrices (fairfield 2020 projected)"
)