Return an interpolating function for populations in 1y age increments
This function returns an interpolating function that gives populations in 1-year age increments from the coarser age distributions produced by socialmixr::wpp_age().
Usage
get_age_population_function(data, ...)
# S3 method for conmat_population
get_age_population_function(data = population, ...)
# S3 method for data.frame
get_age_population_function(
data = population,
age_col = lower.age.limit,
pop_col = population,
...
)
Arguments
- data
dataset containing information on population of a given age/age group
- ...
extra arguments
- age_col
bare variable name for the column with age information
- pop_col
bare variable name for the column with population information
Details
The function first prepares the data so that a smoothing spline can be fitted to the ages below the maximum age. It arranges the data by the lower limit of each age group and takes the differences between successive lower age limits as the bin widths. Half of the bin width is then added to each lower age limit to give the bin midpoint, and the population is scaled by the bin width.
The maximum age is then obtained, the populations at and below the maximum age are separated, and the total population is computed both with and without the maximum age group. A cubic smoothing spline is fitted to the data for ages below the maximum, with the bin midpoints as the predictor variable and the log-scaled population as the response variable.
Using the spline fit, the population is predicted for ages 0 to 200. The predicted population is then adjusted by the ratio of the total population in the data to the total predicted population. The ratio depends on whether an age is under the maximum age, because the total population differs for ages above and below the maximum age.
The population at the maximum age is adjusted further so that it drops off smoothly, based on weights. The final population is then linearly extrapolated over the years past the upper bound of the data. For ages above the maximum age in the data, the population is calculated as a weighted share of the maximum-age population, where the weight depends on the number of years past the upper bound. Older ages have lower weights and therefore lower populations.
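The steps above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: the data frame `pop` is made up, `taper_years` is a hypothetical cut-off for the weights, and the linear weights and their normalisation are one possible reading of "drops off smoothly". Only `stats::smooth.spline()` and `predict()` are real library calls.

```r
# Made-up population in 5-year bins; the last bin (75+) is open-ended.
pop <- data.frame(
  lower.age.limit = seq(0, 75, by = 5),
  population = c(1850, 1970, 2140, 2310, 2410, 2420, 2590, 2970,
                 3040, 2810, 2400, 2100, 1800, 1500, 1200, 2500) * 1000
)

# Arrange by lower age limit and separate the maximum age group.
pop <- pop[order(pop$lower.age.limit), ]
max_age <- max(pop$lower.age.limit)
below <- pop[pop$lower.age.limit < max_age, ]

# Bin widths are the differences between successive lower limits.
bin_width <- diff(pop$lower.age.limit)
below$midpoint <- below$lower.age.limit + bin_width / 2
# Scale the population by bin width, then take logs.
below$log_pop <- log(below$population / bin_width)

# Cubic smoothing spline: midpoint as predictor, log population as response.
fit <- stats::smooth.spline(x = below$midpoint, y = below$log_pop)

# Predict for ages 0 to 200 and return to the population scale.
ages <- as.numeric(0:200)
pred <- exp(predict(fit, x = ages)$y)

# Below the maximum age: rescale so predictions sum to the observed total.
is_below <- ages < max_age
pred[is_below] <- pred[is_below] * sum(below$population) / sum(pred[is_below])

# At and above the maximum age: share out the open-ended bin using weights
# that fall linearly with years past the bound (assumed form and cut-off).
taper_years <- 25 # hypothetical
years_past <- ages[!is_below] - max_age
w <- pmax(0, 1 - years_past / taper_years)
max_age_pop <- pop$population[pop$lower.age.limit == max_age]
pred[!is_below] <- max_age_pop * w / sum(w)

# Look up the estimated population for integer ages between 0 and 200.
age_pop_sketch <- function(age) pred[age + 1]
```

Calling `age_pop_sketch(0:4)` then returns one estimate per single year of age, and by construction the estimates below the maximum age sum to the observed total for those bins.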
Examples
polymod_pop <- get_polymod_population()
polymod_pop
#> # A tibble: 21 × 2 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lower.age.limit population
#> <int> <dbl>
#> 1 0 1852682.
#> 2 5 1968449.
#> 3 10 2138897.
#> 4 15 2312032.
#> 5 20 2407486.
#> 6 25 2423602.
#> 7 30 2585137.
#> 8 35 2969393.
#> 9 40 3041663.
#> 10 45 2809154.
#> # … with 11 more rows
# But these ages and populations are binned every 5 years. We can instead
# provide a specific age and get the estimated population for that 1-year
# age group. First we create the new function like so
age_pop_function <- get_age_population_function(
data = polymod_pop
)
# Then we pass it an age to get the estimated population for that age
age_pop_function(4)
#> [1] 375940
# Or a vector of ages, to get the estimated population for each age in a
# range
age_pop_function(1:4)
#> [1] 360379.5 365489.5 370672.4 375940.0
# Notice that if we sum the estimates for ages 0 to 4, we get a number
# _pretty similar_ to the first row of the table:
head(polymod_pop, 1)
#> # A tibble: 1 × 2 (conmat_population)
#> - age: lower.age.limit
#> - population: population
#> lower.age.limit population
#> <int> <dbl>
#> 1 0 1852682.
sum(age_pop_function(age = 0:4))
#> [1] 1827822
# Usage in dplyr
library(dplyr)
example_df <- slice_head(abs_education_state, n = 5)
example_df %>%
mutate(population_est = age_pop_function(age))
#> # A tibble: 5 × 6
#> year state aboriginal_and_torres_strait_islander_status age n_ful…¹ popul…²
#> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 2006 ACT Aboriginal and Torres Strait Islander 4 5 375940.
#> 2 2006 ACT Non-Indigenous 4 109 375940.
#> 3 2006 NSW Aboriginal and Torres Strait Islander 4 104 375940.
#> 4 2006 NSW Non-Indigenous 4 1870 375940.
#> 5 2006 NT Aboriginal and Torres Strait Islander 4 102 375940.
#> # … with abbreviated variable names ¹n_full_and_part_time, ²population_est