Adds offset variables
add_offset.Rd
Mostly used internally in add_modelling_features(). Adds two offset variables to be used in fit_single_contact_model(): log_contactable_population_school and log_contactable_population. These require the variables school_weighted_pop_fraction (from add_school_work_participation()) and pop_age_to (from add_population_age_to()). The school setting therefore gets its own offset, separate from the one used for the other settings (home, work and other). The school offset captures the cohorting of students within schools: it is the logarithm of the weighted combination of the contact population age distribution and the school year probability calculated in add_school_work_participation(). See "Details" for more information.
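As a rough illustration of what the two columns contain, the sketch below computes them by hand in base R on a toy data frame. The numbers are made up, and the sketch assumes each offset is simply the natural logarithm of its source column. That log_contactable_population comes from pop_age_to is inferred from the column names and is not confirmed by this page.

```r
# Toy contact data; all values are hypothetical and chosen only for illustration.
toy <- data.frame(
  age_from = c(10, 10, 11),
  age_to = c(10, 11, 10),
  pop_age_to = c(0.161, 0.163, 0.161),
  school_weighted_pop_fraction = c(0.50, 0.25, 0.25)
)

# Assumption: each offset is the natural log of the corresponding column.
toy$log_contactable_population_school <- log(toy$school_weighted_pop_fraction)
toy$log_contactable_population <- log(toy$pop_age_to)

toy
```

Because the offsets are on the log scale, they suit a model with a log link, where adding an offset to the linear predictor scales the expected count by the original quantity.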
Arguments
- contact_data
contact data. Must contain the columns age_to, age_from, pop_age_to (from add_population_age_to()) and school_weighted_pop_fraction (from add_school_work_participation()).
Value
data.frame of contact_data with two extra columns: log_contactable_population_school and log_contactable_population.
Details
Why two offsets? Two offsets are specified: one in the model formula, and one in the "offset" argument of mgcv::bam. The two offsets are added together when the model is first fit. The setting-specific offset from offset_variable, which is included in the GAM formula as ... + offset(log_contactable_population), is also used in prediction, whereas the other offset, passed to the GAM as the argument offset = log(participants), is only used when the model is initially fit. See fit_single_contact_model() for more detail.
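The difference between the two kinds of offset can be seen in a small simulation. This is a minimal sketch and not the package's own model: the data, the column x, and the smooth s(x) are hypothetical stand-ins. Only the pattern of one offset in the formula and one in the offset argument of mgcv::bam follows the description above.

```r
library(mgcv)

set.seed(2024)
n <- 300

# Simulated stand-in for contact data; all values here are made up.
sim <- data.frame(
  x = runif(n),
  pop_age_to = runif(n, 0.1, 1),
  participants = sample(1:20, n, replace = TRUE)
)
sim$log_contactable_population <- log(sim$pop_age_to)
sim$contacts <- rpois(
  n,
  lambda = sim$participants * sim$pop_age_to * exp(sin(2 * pi * sim$x))
)

# One offset in the formula, one in the `offset` argument.
fit <- bam(
  contacts ~ s(x) + offset(log_contactable_population),
  family = poisson,
  offset = log(participants),
  data = sim
)

# Two new rows with the same x but different offsets and participant counts.
new_rows <- data.frame(
  x = c(0.5, 0.5),
  log_contactable_population = log(c(0.2, 0.4)),
  participants = c(1, 50)
)
pred_link <- as.numeric(predict(fit, newdata = new_rows, type = "link"))

# Difference between the two predictions on the link scale.
offset_effect <- diff(pred_link)
offset_effect
```

If prediction uses only the formula offset, as the mgcv documentation for the offset argument states, offset_effect should equal log(0.4) - log(0.2), and the differing participants values should have no effect.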
Examples
age_min <- 10
age_max <- 15
all_ages <- age_min:age_max
library(tidyr)
example_df <- expand_grid(
  age_from = all_ages,
  age_to = all_ages
)
example_df %>%
  add_population_age_to() %>%
  add_school_work_participation() %>%
  add_offset()
#> # A tibble: 36 × 14
#> age_from age_to pop_age_to intergen…¹ schoo…² work_…³ schoo…⁴ work_…⁵ schoo…⁶
#> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 10 10 0.161 0 1 0.05 1 0.05 1
#> 2 10 11 0.163 1 1 0.05 1 0.05 1
#> 3 10 12 0.165 2 1 0.05 1 0.2 1
#> 4 10 13 0.168 3 1 0.05 1 0.2 1
#> 5 10 14 0.170 4 1 0.05 1 0.2 1
#> 6 10 15 0.173 5 1 0.05 1 0.2 1
#> 7 11 10 0.161 1 1 0.05 1 0.05 1
#> 8 11 11 0.163 0 1 0.05 1 0.05 1
#> 9 11 12 0.165 1 1 0.05 1 0.2 1
#> 10 11 13 0.168 2 1 0.05 1 0.2 1
#> # … with 26 more rows, 5 more variables: work_probability <dbl>,
#> # school_year_probability <dbl>, school_weighted_pop_fraction <dbl>,
#> # log_contactable_population_school <dbl>, log_contactable_population <dbl>,
#> # and abbreviated variable names ¹intergenerational,
#> # ²school_fraction_age_from, ³work_fraction_age_from,
#> # ⁴school_fraction_age_to, ⁵work_fraction_age_to, ⁶school_probability