Add features required for modelling to the dataset
add_modelling_features.Rd
This function adds three main groups of features to the data. It is used internally in fit_single_contact_model() and predict_contacts_1y(). It requires columns named age_to and age_from. The three types of features it adds are described below:
- Population distribution of contact ages, from the function add_population_age_to(), which requires a column called "age_to" representing the age of the person who had contact. It creates a column called pop_age_to. add_population_age_to() takes an extra argument, population, which defaults to get_polymod_population(). It must be either a conmat_population object, which specifies the age and population characteristics, or a data frame with columns lower.age.limit and population.
- School and work participation, from the function add_school_work_participation(). This requires columns age_to and age_from, but will operate on any column starting with age, and adds columns: school_probability, work_probability, school_year_probability, and school_weighted_pop_fraction.
- Offset, added to the data using add_offset(). This requires variables school_weighted_pop_fraction (from add_school_work_participation()) and pop_age_to (from add_population_age_to()). It adds two columns, log_contactable_population_school and log_contactable_population.
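The three steps above can be sketched as a chain of the helper functions. This is a rough illustration, not the package's actual implementation: it assumes conmat (and tidyr) are installed, that the helpers are exported, and that they are applied in the order listed. The block does nothing if conmat is unavailable.

```r
# Sketch only: applies the three helpers named above, one after another.
# Assumption: the helpers are exported and run in the order listed on this page.
if (requireNamespace("conmat", quietly = TRUE)) {
  example_df <- tidyr::expand_grid(age_from = 10:15, age_to = 10:15)

  # Step 1: population distribution of contact ages (adds pop_age_to)
  with_population <- conmat::add_population_age_to(
    example_df,
    population = conmat::get_polymod_population()
  )

  # Step 2: school and work participation (adds the *_probability columns
  # and school_weighted_pop_fraction)
  with_participation <- conmat::add_school_work_participation(with_population)

  # Step 3: offsets (adds the two log_contactable_population* columns)
  features <- conmat::add_offset(with_participation)

  # Inspect which columns are now present
  print(names(features))
}
```

Running the steps separately like this can help when checking which helper is responsible for a given column.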
Usage
add_modelling_features(
  contact_data,
  school_demographics = NULL,
  work_demographics = NULL,
  population = get_polymod_population()
)
Arguments
- contact_data
contact data with columns age_to and age_from
- school_demographics
(optional) defaults to census average proportion at school. You can provide a dataset with columns "age" (numeric) and "school_fraction" (0-1) if you would like to specify these details. See abs_avg_school for the default values. If you would like to use the original school demographics used in conmat, these are provided in the dataset conmat_original_school_demographics.
- work_demographics
(optional) defaults to census average proportion employed. You can provide a dataset with columns "age" (numeric) and "work_fraction" if you would like to specify these details. See abs_avg_work for the default values. If you would like to use the original work demographics used in conmat, these are provided in the dataset conmat_original_work_demographics.
- population
the population argument of add_population_age_to()
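As an illustration of the format described for school_demographics and work_demographics, the sketch below builds two small data frames in base R. The names custom_school and custom_work, the age range, and the fractions are all hypothetical values invented for this example, not census figures. The call to add_modelling_features() is only attempted when conmat is installed.

```r
# Hypothetical demographics: the age cut-offs and fractions are made up
# purely to show the expected column layout.
ages <- 0:100

custom_school <- data.frame(
  age = ages,
  school_fraction = ifelse(ages >= 5 & ages <= 17, 0.95, 0)
)

custom_work <- data.frame(
  age = ages,
  work_fraction = ifelse(ages >= 18 & ages <= 64, 0.7, 0)
)

# Sanity checks on the layout described in the arguments above
stopifnot(
  is.numeric(custom_school$age),
  all(custom_school$school_fraction >= 0 & custom_school$school_fraction <= 1),
  is.numeric(custom_work$age),
  all(custom_work$work_fraction >= 0 & custom_work$work_fraction <= 1)
)

# Only runs if conmat is available
if (requireNamespace("conmat", quietly = TRUE)) {
  example_df <- tidyr::expand_grid(age_from = 10:15, age_to = 10:15)
  custom_features <- conmat::add_modelling_features(
    example_df,
    school_demographics = custom_school,
    work_demographics = custom_work
  )
}
```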
Value
A data frame with the contents of contact_data plus 11 extra columns: pop_age_to, school_fraction_age_from, work_fraction_age_from, school_fraction_age_to, work_fraction_age_to, school_probability, work_probability, school_year_probability, school_weighted_pop_fraction, log_contactable_population_school, and log_contactable_population.
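One way to confirm the returned columns is to compare the names of the result against the input. The sketch below assumes conmat and tidyr are installed and simply reports any of the 11 documented columns that are missing. The variable names are illustrative.

```r
# The 11 extra columns listed in the Value section above
expected_new_columns <- c(
  "pop_age_to",
  "school_fraction_age_from", "work_fraction_age_from",
  "school_fraction_age_to", "work_fraction_age_to",
  "school_probability", "work_probability",
  "school_year_probability", "school_weighted_pop_fraction",
  "log_contactable_population_school", "log_contactable_population"
)

# Only runs if conmat is available
if (requireNamespace("conmat", quietly = TRUE)) {
  example_df <- tidyr::expand_grid(age_from = 10:15, age_to = 10:15)
  result <- conmat::add_modelling_features(example_df)

  # Columns added by the function, and any documented ones that are absent
  added_columns <- setdiff(names(result), names(example_df))
  missing_columns <- setdiff(expected_new_columns, added_columns)
  print(missing_columns)
}
```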
Examples
age_min <- 10
age_max <- 15
all_ages <- age_min:age_max
library(tidyr)
example_df <- expand_grid(
  age_from = all_ages,
  age_to = all_ages,
)
add_modelling_features(example_df)
add_modelling_features(
  example_df,
  school_demographics = conmat_original_school_demographics,
  work_demographics = conmat_original_work_demographics
)