Most price indexes are made with a two-step procedure, where period-over-period elemental indexes are calculated for a collection of elemental aggregates at each point in time, and then aggregated according to a price index aggregation structure. These indexes can then be chained together to form a time series that gives the evolution of prices with respect to a fixed base period. The piar package contains a collection of functions that revolve around this work flow, making it easy to build standard price indexes in R.
The purpose of this vignette is to give several extended examples of how to use the functions in this package to make different types of price indexes. This should serve both as an introduction to the functionality in piar, and a reference for solving specific index-number problems.
The first example covers calculating a matched-sample index, where a fixed set of businesses each provide prices for a collection of products over time. The products reported by a business can change over time, but the set of businesses is fixed for the duration of the sample. Each business has a weight that is established when the sample is drawn, and represents a particular segment of the economy.
The usual approach for calculating a matched-sample index starts by computing the elemental index for each business as an equally-weighted geometric mean of price relatives (i.e., a Jevons index). From there, index values for different segments of the economy are calculated as an arithmetic mean of the elemental indexes, using the business-level weights (either a Young or Lowe index, depending on how the weights are constructed).
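The two-step calculation can be sketched in base R without any piar functions. The price relatives and weights below are made up for illustration, and the helper jevons() is a hypothetical name, not part of the package.

```r
# Hypothetical price relatives for the products reported by two businesses
relatives <- list(
  B1 = c(1.10, 0.95, 1.02),
  B2 = c(1.20, 1.05)
)

# Step 1: Jevons elemental index, an equally-weighted geometric mean
jevons <- function(x) exp(mean(log(x)))
elementals <- vapply(relatives, jevons, numeric(1))

# Hypothetical business-level weights
weights <- c(B1 = 60, B2 = 40)

# Step 2: higher-level index as a weighted arithmetic mean of the elementals
aggregate_index <- weighted.mean(elementals, weights)
aggregate_index
```

The rest of the example does the same thing with elemental_index() and aggregate(), which also take care of time periods, hierarchies, and missing values.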
The ms_prices dataset has price data for five businesses over four quarters, and the ms_weights dataset has the weight data. Note that these data have fairly realistic patterns of missing data.
library(piar)
head(ms_prices)
#> period business product price
#> 1 202001 B1 1 1.14
#> 2 202001 B1 2 NA
#> 3 202001 B1 3 6.09
#> 4 202001 B2 4 6.23
#> 5 202001 B2 5 8.61
#> 6 202001 B2 6 6.40
ms_weights
#> business classification weight
#> 1 B1 11 553
#> 2 B2 11 646
#> 3 B3 11 312
#> 4 B4 12 622
#> 5 B5 12 330
The elemental_index() function makes, well, elemental indexes, using information on price relatives, elemental aggregates (businesses), and time periods (quarters). By default it makes a Jevons index, but any bilateral generalized-mean index is possible. The only wrinkle is that price data here are in levels, and not relatives, but the price_relative() function can make the necessary conversion.
relatives <- with(
ms_prices,
price_relative(price, period = period, product = product)
)
ms_elemental <- with(
ms_prices,
elemental_index(relatives, period = period, ea = business, na.rm = TRUE)
)
ms_elemental
#> Period-over-period price index for 4 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
#> B2 1 NaN NaN 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 NaN NaN NaN 4.576286
(Homogeneous elemental aggregates often lead to unit-value elemental indexes that are not based on price relatives. These cases can be dealt with by first aggregating prices for each elemental aggregate at each point in time with an arithmetic mean, as in aggregate(price ~ period + product, ms_prices, mean), then forming price relatives to feed into elemental_index().)
As with most functions in R, missing values are contagious by default in piar. Setting na.rm = TRUE in elemental_index() means that missing price relatives are ignored, which is equivalent to imputing these missing relatives with the value of the elemental index for the respective business (i.e., parental or overall mean imputation). Other types of imputation are possible, and are the topic of a subsequent example.
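The equivalence between ignoring a missing relative and imputing it with the elemental index can be checked directly in base R. This is a small sketch with made-up relatives; jevons() is a hypothetical helper rather than a piar function.

```r
# Hypothetical price relatives for one business, one of them missing
r <- c(1.10, NA, 0.90, 1.25)

# Equally-weighted geometric mean, optionally dropping missing values
jevons <- function(x, na.rm = FALSE) exp(mean(log(x), na.rm = na.rm))

# Index that ignores the missing relative
ignored <- jevons(r, na.rm = TRUE)

# Impute the missing relative with that index value, then recompute
imputed <- replace(r, is.na(r), ignored)
all.equal(jevons(imputed), ignored)
```

Adding a value equal to the geometric mean leaves the geometric mean unchanged, which is why the two approaches agree.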
The elemental_index() function returns a special index object, and there are a number of methods for working with these objects. Probably the most useful of these methods allows the resulting elemental indexes to be extracted like a matrix, even though it’s not a matrix. (Note that there are only indexes for four businesses, not five, because the fifth business never reports any prices; as will be seen in another example, an elemental index can be made for this business with a small change to the call to elemental_index().)
ms_elemental[, "202004"]
#> Period-over-period price index for 4 levels over 1 time periods
#> 202004
#> B1 NaN
#> B2 2.770456
#> B3 0.537996
#> B4 4.576286
ms_elemental["B1", ]
#> Period-over-period price index for 1 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
With the elemental indexes out of the way, it’s time to make a price-index aggregation structure that maps each business to its position in the aggregation hierarchy. The only hiccup is unpacking the digit-wise classification for each business that defines the hierarchy. That’s the job of the expand_classification() function.
hierarchy <- with(
ms_weights,
c(expand_classification(classification), list(business))
)
pias <- aggregation_structure(hierarchy, weights = ms_weights$weight)
It is now simple to aggregate the elemental indexes according to this aggregation structure with the aggregate() function. As with the elemental indexes, missing values are ignored by setting na.rm = TRUE, which is equivalent to parentally imputing missing values. Note that, unlike the elemental indexes, missing values are filled in to ensure the index can be chained over time.
ms_index <- aggregate(ms_elemental, pias, na.rm = TRUE)
ms_index
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.0630743 2.734761
#> 11 1 1.3007239 1.0630743 1.574515
#> 12 1 1.3007239 1.0630743 4.576286
#> B1 1 0.8949097 0.3342939 1.574515
#> B2 1 1.3007239 1.0630743 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 1 1.3007239 1.0630743 4.576286
#> B5 1 1.3007239 1.0630743 4.576286
Although simple, this example covers the core functionality of piar. The remaining examples in the vignette build on this one by adding complexities that often arise in practice.
The elemental_index() function makes period-over-period elemental indexes by default, which can then be aggregated to make a period-over-period index. Chaining an index is the process of taking the cumulative product of each of these period-over-period indexes to make a time series that compares prices to a fixed base period.
The chain() function can be used to chain the values in an index object.
ms_index_chained <- chain(ms_index)
ms_index_chained
#> Fixed-base price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.3827662 3.7815355
#> 11 1 1.3007239 1.3827662 2.1771866
#> 12 1 1.3007239 1.3827662 6.3279338
#> B1 1 0.8949097 0.2991629 0.4710366
#> B2 1 1.3007239 1.3827662 3.8308934
#> B3 1 2.0200036 3.3033836 1.7772072
#> B4 1 1.3007239 1.3827662 6.3279338
#> B5 1 1.3007239 1.3827662 6.3279338
This gives almost the same result as directly manipulating the index as a matrix, except that the former returns an index object (not a matrix).
t(apply(as.matrix(ms_index), 1, cumprod))
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.3827662 3.7815355
#> 11 1 1.3007239 1.3827662 2.1771866
#> 12 1 1.3007239 1.3827662 6.3279338
#> B1 1 0.8949097 0.2991629 0.4710366
#> B2 1 1.3007239 1.3827662 3.8308934
#> B3 1 2.0200036 3.3033836 1.7772072
#> B4 1 1.3007239 1.3827662 6.3279338
#> B5 1 1.3007239 1.3827662 6.3279338
Chained indexes often need to be rebased, and this can be done with the rebase() function. For example, rebasing the index so that 202004 is the base period just requires dividing the chained index by the slice for 202004.
rebase(ms_index_chained, ms_index_chained[, "202004"])
#> Fixed-base price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 0.2644428 0.3439671 0.3656626 1
#> 11 0.4593084 0.5974334 0.6351161 1
#> 12 0.1580295 0.2055527 0.2185178 1
#> B1 2.1229774 1.8998731 0.6351161 1
#> B2 0.2610357 0.3395354 0.3609514 1
#> B3 0.5626806 1.1366169 1.8587499 1
#> B4 0.1580295 0.2055527 0.2185178 1
#> B5 0.1580295 0.2055527 0.2185178 1
In some cases the base period is the average of several periods; setting the base period to the second half of 2020 just requires averaging the index over subperiods before rebasing.
rebase(
ms_index_chained,
mean(window(ms_index_chained, "202003"))
)
#> Fixed-base price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 0.3872740 0.5037366 0.5355095 1.4644905
#> 11 0.5618052 0.7307535 0.7768452 1.2231548
#> 12 0.2593798 0.3373815 0.3586616 1.6413384
#> B1 2.5967299 2.3238388 0.7768452 1.2231548
#> B2 0.3836077 0.4989677 0.5304398 1.4695602
#> B3 0.3936550 0.7951845 1.3003935 0.6996065
#> B4 0.2593798 0.3373815 0.3586616 1.6413384
#> B5 0.2593798 0.3373815 0.3586616 1.6413384
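Both rebasing calculations amount to dividing each row of the chained index by a row-specific constant. The sketch below reproduces this in base R for the first two levels, using the chained values printed above; it illustrates the arithmetic and is not how rebase() is implemented.

```r
# Chained index values for levels 1 and 11, copied from the output above
chained <- rbind(
  "1"  = c(1, 1.3007239, 1.3827662, 3.7815355),
  "11" = c(1, 1.3007239, 1.3827662, 2.1771866)
)
colnames(chained) <- c("202001", "202002", "202003", "202004")

# Rebase to 202004: divide each row by its value in the new base period
rebased <- chained / chained[, "202004"]

# Rebase to the second half of 2020: divide each row by its average
# over 202003 and 202004
rebased_h2 <- chained / rowMeans(chained[, c("202003", "202004")])

round(rebased, 7)
round(rebased_h2, 7)
```

Dividing a matrix by a vector with one element per row recycles the vector down the columns, so every value in a row is scaled by the same base-period value.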
Price indexes are often aggregated over multiple dimensions. Matched sample indexes that use sequential Poisson sampling are a good example, as there are usually take-all and take-some strata in addition to, say, an industry classification.
ms_weights$stratum <- c("TS", "TA", "TS", "TS", "TS")
ms_weights
#> business classification weight stratum
#> 1 B1 11 553 TS
#> 2 B2 11 646 TA
#> 3 B3 11 312 TS
#> 4 B4 12 622 TS
#> 5 B5 12 330 TS
The easiest way to deal with multiple digit-wise classifications is to turn them into one classification. In this example the “stratum” dimension comes before the “classification” dimension for the purposes of parental imputation.
classification_sps <- with(ms_weights, paste0(classification, stratum))
classification_sps
#> [1] "11TS" "11TA" "11TS" "12TS" "12TS"
This classification can be expanded with the expand_classification() function as before, just with an extra instruction to say that the last “digit” in the classification is two characters wide, not one.
classification_sps <- expand_classification(
classification_sps,
width = c(1, 1, 2)
)
pias_sps <- with(
ms_weights,
aggregation_structure(c(classification_sps, list(business)), weight)
)
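The digit-wise expansion itself is easy to picture in base R: each level of the hierarchy is the classification truncated to a cumulative number of characters. The helper expand_digits() below is a hypothetical stand-in written for illustration; it is not the piar function and makes no claim about how expand_classification() works internally.

```r
# Hypothetical helper: one element per level of the hierarchy, each holding
# the classification truncated to the cumulative width of that level
expand_digits <- function(x, width = rep(1, max(nchar(x)))) {
  ends <- cumsum(width)
  lapply(ends, function(end) substr(x, 1, end))
}

# One character per digit, as in the first example
expand_digits(c("11", "11", "11", "12", "12"))

# Last digit is two characters wide, as with the stratum codes
expand_digits(c("11TS", "11TA", "11TS", "12TS", "12TS"), width = c(1, 1, 2))
```

Each element of the result gives the label of every business at one level of the hierarchy, from the top level down.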
The elemental indexes can now be aggregated according to this new aggregation structure.
index_sps <- aggregate(ms_elemental, pias_sps, na.rm = TRUE)
index_sps
#> Period-over-period price index for 11 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.0630743 2.684412
#> 11 1 1.3007239 1.0630743 1.492443
#> 12 1 1.3007239 1.0630743 4.576286
#> 11TS 1 1.3007239 1.0630743 0.537996
#> 11TA 1 1.3007239 1.0630743 2.770456
#> 12TS 1 1.3007239 1.0630743 4.576286
#> B1 1 0.8949097 0.3342939 0.537996
#> B2 1 1.3007239 1.0630743 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 1 1.3007239 1.0630743 4.576286
#> B5 1 1.3007239 1.0630743 4.576286
When a price index has many dimensions (e.g., industry, sampling stratum, region), it can be useful to interact the classifications for these different dimensions to get all possible aggregation structures. The aggregated index can then be re-aggregated to get index values for all dimensions.
Continuing with the example, the industry and strata structures can be interacted to get two aggregation structures that can be used to re-aggregate index_sps.
interacted_hierarchy <- with(
ms_weights,
interact_classifications(
expand_classification(classification),
expand_classification(stratum)
)
)
pias_sps2 <- lapply(
interacted_hierarchy,
\(x) aggregation_structure(c(x, ms_weights["business"]), ms_weights$weight)
)
index_sps2 <- lapply(pias_sps2, \(x) aggregate(index_sps, x, include_ea = FALSE))
The resulting indexes can be merged together to give an index that includes all combinations of industry and sampling stratum.
Reduce(merge, index_sps2)
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1:T 1 1.300724 1.063074 2.684412
#> 1:TS 1 1.300724 1.063074 2.653820
#> 1:TA 1 1.300724 1.063074 2.770456
#> 11:T 1 1.300724 1.063074 1.492443
#> 12:T 1 1.300724 1.063074 4.576286
#> 11:TS 1 1.300724 1.063074 0.537996
#> 11:TA 1 1.300724 1.063074 2.770456
#> 12:TS 1 1.300724 1.063074 4.576286
Aggregating a price index can be done as a matrix operation. Although this approach is less flexible than the aggregate() method, it can be considerably faster for larger indexes. The key is to turn the aggregation structure into an aggregation matrix.
pias_matrix <- as.matrix(pias)
pias_matrix
#> B1 B2 B3 B4 B5
#> 1 0.2245229 0.2622818 0.1266748 0.2525376 0.1339829
#> 11 0.3659828 0.4275314 0.2064858 0.0000000 0.0000000
#> 12 0.0000000 0.0000000 0.0000000 0.6533613 0.3466387
Multiplying this matrix by a matrix of fixed-base elemental indexes now computes the aggregate index in each time period.
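As a rough check of this claim, the aggregation matrix can be rebuilt by hand from the business weights and multiplied by the chained business-level values printed in the chaining example. Everything below is base R; the object names are made up for the sketch.

```r
# Business weights and industry groups, copied from ms_weights
weights <- c(B1 = 553, B2 = 646, B3 = 312, B4 = 622, B5 = 330)
group <- c(B1 = "11", B2 = "11", B3 = "11", B4 = "12", B5 = "12")

# One row per higher-level index: each business's share of that level
agg <- rbind(
  "1"  = weights / sum(weights),
  "11" = ifelse(group == "11", weights, 0) / sum(weights[group == "11"]),
  "12" = ifelse(group == "12", weights, 0) / sum(weights[group == "12"])
)

# Fixed-base (chained) business-level values from the chaining example
fixed_base <- rbind(
  B1 = c(1, 0.8949097, 0.2991629, 0.4710366),
  B2 = c(1, 1.3007239, 1.3827662, 3.8308934),
  B3 = c(1, 2.0200036, 3.3033836, 1.7772072),
  B4 = c(1, 1.3007239, 1.3827662, 6.3279338),
  B5 = c(1, 1.3007239, 1.3827662, 6.3279338)
)
colnames(fixed_base) <- c("202001", "202002", "202003", "202004")

# Higher-level fixed-base index values in each period
aggregated <- agg %*% fixed_base
aggregated
```

Up to rounding in the copied inputs, the rows of the result match the chained values for levels 1, 11, and 12 shown earlier.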
It’s often useful to determine which higher-level index values are missing, and subsequently get imputed during aggregation (i.e., compute the shadow of an index). This is simple to do if there’s an elemental index for each elemental aggregate in the aggregation structure.
ms_elemental2 <- elemental_index(
ms_prices,
relatives ~ period + factor(business, ms_weights$business),
na.rm = TRUE
)
The idea is to simply aggregate an indicator for missingness to get a matrix that gives the share of missing elemental indexes for each higher-level index.
pias_matrix <- as.matrix(pias) > 0
pias_matrix %*% is.na(ms_elemental2) / rowSums(pias_matrix)
#> 202001 202002 202003 202004
#> 1 0.4 0.6000000 0.6000000 0.4000000
#> 11 0.0 0.3333333 0.3333333 0.3333333
#> 12 1.0 1.0000000 1.0000000 0.5000000
A value of 1 means that there are no non-missing elemental indexes, and that the value for this level of the index is imputed from its parent in the aggregation structure. A value below 1 but above zero means that some but not all elemental indexes are missing, and the index value for this level is based on the non-missing elemental indexes. A value of zero means there’s no imputation for this level of the index.
Parental imputation is the usual way to impute missing index values during aggregation, and it is simple to do with aggregate(). In some cases, however, a business-level index may get imputed with the value for, say, another business, rather than for an entire group of businesses. The simplest way to do this sort of imputation is to alter the elemental indexes prior to aggregation. It is also possible to augment the aggregation structure with an imputation layer, but this is more complex.
Suppose that missing index values for business B2 should be imputed as 1, rather than the value for group 11. This replacement can be done as if the index were a matrix.
ms_elemental2 <- ms_elemental
ms_elemental2["B2", 2:3] <- 1
ms_elemental2
#> Period-over-period price index for 4 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
#> B2 1 1.0000000 1.0000000 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 NaN NaN NaN 4.576286
The index can now be aggregated as usual.
aggregate(ms_elemental2, pias, na.rm = TRUE)
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.1721550 1.0400686 2.626560
#> 11 1 1.1721550 1.0400686 1.398142
#> 12 1 1.1721550 1.0400686 4.576286
#> B1 1 0.8949097 0.3342939 1.398142
#> B2 1 1.0000000 1.0000000 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 1 1.1721550 1.0400686 4.576286
#> B5 1 1.1721550 1.0400686 4.576286
By default, the elemental_index() function calculates a Jevons index. Although this is the standard index-number formula for making elemental indexes, many other types of index numbers are possible. The Carli index (equally-weighted arithmetic mean of price relatives) is the main competitor to the Jevons, and requires specifying the order of the index r when calling elemental_index(). An order of 1 corresponds to an arithmetic mean.
elemental_index(ms_prices, relatives ~ period + business, na.rm = TRUE, r = 1)
#> Period-over-period price index for 4 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
#> B2 1 NaN NaN 5.155942
#> B3 1 23.7480455 2.4900997 0.607197
#> B4 NaN NaN NaN 9.368610
The Coggeshall index (equally-weighted harmonic mean of price relatives) is another competitor to the Jevons, but is seldom used in practice. Despite it being more exotic, it is just as easy to make by specifying an order r of -1.
elemental_index(ms_prices, relatives ~ period + business, na.rm = TRUE, r = -1)
#> Period-over-period price index for 4 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
#> B2 1 NaN NaN 1.7205750
#> B3 1 0.6591433 0.8185743 0.4746769
#> B4 NaN NaN NaN 2.2353790
The type of mean used to aggregate elemental indexes can be controlled in the same way in the call to aggregate(). The default makes an arithmetic index, but any type of generalized-mean index is possible.
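The role of the order r is easiest to see by writing out a generalized mean in base R. The function generalized_mean() below is a hypothetical helper for illustration, and the relatives are made up.

```r
# Weighted generalized mean of order r; r = 0 is the geometric-mean limit
generalized_mean <- function(x, r, w = rep(1, length(x))) {
  w <- w / sum(w)
  if (r == 0) exp(sum(w * log(x))) else sum(w * x^r)^(1 / r)
}

x <- c(0.8, 1.0, 1.5)  # hypothetical price relatives

carli      <- generalized_mean(x, 1)   # arithmetic mean
jevons     <- generalized_mean(x, 0)   # geometric mean
coggeshall <- generalized_mean(x, -1)  # harmonic mean

c(carli = carli, jevons = jevons, coggeshall = coggeshall)
```

For the same relatives the Carli index is never smaller than the Jevons, which in turn is never smaller than the Coggeshall, consistent with the pattern in the outputs above.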
Many superlative indexes can be made by supplying unequal and (usually) time-varying weights when making the elemental indexes. These weights often come from information on quantities.
The Tornqvist index is a popular superlative index-number formula, using average period-over-period value shares as the weights in a geometric mean. The only tricky part is making the weights from data on prices and quantities.
library(gpindex)
# Assumes ms_prices2 is a copy of ms_prices with an added quantity column
tw <- grouped(index_weights("Tornqvist"))
ms_prices2[c("back_price", "back_quantity")] <-
ms_prices2[back_period(ms_prices2$period, ms_prices2$product),
c("price", "quantity")]
ms_prices2 <- na.omit(ms_prices2) # can't have NAs for Tornqvist weights
ms_prices2$weight <- with(
ms_prices2,
tw(
price, back_price, quantity, back_quantity,
group = interaction(period, business)
)
)
As elemental_index() makes a geometric index by default, all that is needed to make a Tornqvist index is to provide the weights.
elemental_index(
ms_prices2,
price / back_price ~ period + business,
weights = weight
)
#> Period-over-period price index for 4 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
#> B2 1 NaN NaN 2.165152
#> B3 1 0.9520982 1.5913929 0.542372
#> B4 NaN NaN NaN 5.904237
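For a single elemental aggregate, the Tornqvist calculation can be written out by hand in base R. The prices and quantities below are made up, and the sketch does not use gpindex or piar.

```r
# Hypothetical prices and quantities for three products in two periods
p0 <- c(2.0, 5.0, 1.0)
p1 <- c(2.2, 4.5, 1.1)
q0 <- c(10, 4, 20)
q1 <- c(9, 5, 19)

# Value shares in the previous and current period
s0 <- p0 * q0 / sum(p0 * q0)
s1 <- p1 * q1 / sum(p1 * q1)

# Tornqvist weights: average of the two sets of value shares
w <- (s0 + s1) / 2

# Tornqvist index: weighted geometric mean of the price relatives
tornqvist <- exp(sum(w * log(p1 / p0)))
tornqvist
```

The weights sum to one by construction, so the index always lies between the smallest and largest price relative.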
It’s often convenient to decompose an index into the (additive) contribution of each price relative, also known as the percent-change contributions. This can be done with the same workflow as in the previous examples, specifying contrib = TRUE when calling elemental_index().
ms_elemental <- elemental_index(
ms_prices,
relatives ~ period + business,
contrib = TRUE, na.rm = TRUE
)
As with index values, percent-change contributions for a given level of the index can be extracted as a matrix.
contrib(ms_elemental)
#> 202001 202002 202003 202004
#> 1 0 0.0000000 0.0000000 0
#> 2 NA NA -0.6657061 0
#> 3 0 -0.1050903 NA NA
Aggregating the elemental indexes automatically aggregates percent-change contributions, so no extra steps are needed after the elemental indexes are made.
contrib(aggregate(ms_elemental, pias, na.rm = TRUE))
#> 202001 202002 202003 202004
#> 1 0 0.00000000 0.0000000 0.000000000
#> 10 0 -0.08782076 0.2731949 -0.078173579
#> 11 0 0.00000000 NA 0.059392635
#> 12 0 0.00000000 NA 1.322915301
#> 2 NA NA -0.2928098 0.000000000
#> 3 0 -0.06718490 NA NA
#> 4 0 NA NA -0.018209690
#> 5 0 NA NA 0.094562963
#> 6 0 NA NA 0.427935081
#> 7 0 0.51646606 -0.2054665 -0.011177530
#> 8 0 0.01906845 0.1755868 -0.003784845
#> 9 0 -0.07980493 0.1125689 -0.058699008
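One common way to get additive contributions from a geometric index is to convert the geometric weights into arithmetic weights with the logarithmic mean. The base-R sketch below uses made-up relatives and hypothetical helper names; it illustrates the idea, and it is an assumption that this matches the decomposition piar computes.

```r
# Hypothetical price relatives with equal weights (Jevons index)
r <- c(1.10, 0.95, 1.30)
w <- rep(1 / 3, 3)
G <- exp(sum(w * log(r)))

# Logarithmic mean of a and b (equal to a when a == b)
logmean <- function(a, b) ifelse(a == b, a, (a - b) / log(a / b))

# Arithmetic weights that reproduce the geometric mean:
# G equals sum(v * r) when v is proportional to w / logmean(r, G)
v <- w / logmean(r, G)
v <- v / sum(v)

# Percent-change contribution of each relative
contributions <- v * (r - 1)

c(index_change = G - 1, sum_of_contributions = sum(contributions))
```

The contributions add up to the percent change in the index, which is what makes the decomposition useful for explaining movements in the index.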
The functions in piar are all designed to work within a “basket”, which is a fancy way of saying within a given aggregation structure. Over time, however, aggregation structures change as the weights used to aggregate an index get updated, and new samples of businesses are drawn. The general approach to keep the time series going is to “chain” the index across baskets.
It is easier to see how to chain an index over time with a simple example that just splits the ms_prices data in two.
ms_prices1 <- subset(ms_prices, period <= "202003")
ms_prices2 <- subset(ms_prices, period >= "202003")
The index for the first basket can be calculated as usual.
ms_elemental1 <- elemental_index(
ms_prices1,
price_relative(price, period = period, product = product) ~ period + business,
na.rm = TRUE
)
ms_index1 <- aggregate(ms_elemental1, pias, na.rm = TRUE)
ms_index1
#> Period-over-period price index for 8 levels over 3 time periods
#> 202001 202002 202003
#> 1 1 1.3007239 1.0630743
#> 11 1 1.3007239 1.0630743
#> 12 1 1.3007239 1.0630743
#> B1 1 0.8949097 0.3342939
#> B2 1 1.3007239 1.0630743
#> B3 1 2.0200036 1.6353355
#> B4 1 1.3007239 1.0630743
#> B5 1 1.3007239 1.0630743
Nothing special needs to be done to make the elemental indexes for the new basket, but it’s easier to remove the index values of 1 for quarter 3 2020.
ms_elemental2 <- ms_prices2 |>
transform(rel = price_relative(price, period = period, product = product)) |>
subset(period > "202003") |>
elemental_index(rel ~ period + business, na.rm = TRUE)
Aggregating these elemental indexes, however, requires an aggregation structure. The results of the first example can be reproduced by simply “price updating” the original weights, then building the aggregation structure as usual.
ms_index2 <- aggregate(ms_elemental2, update(pias, ms_index1), na.rm = TRUE)
ms_index2
#> Period-over-period price index for 8 levels over 1 time periods
#> 202004
#> 1 2.734761
#> 11 1.574515
#> 12 4.576286
#> B1 1.574515
#> B2 2.770456
#> B3 0.537996
#> B4 4.576286
#> B5 4.576286
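Price updating can be read as multiplying each business's weight by its chained index value at the end of the first basket. The base-R sketch below checks that reading against the top-level value for 202004, using numbers copied from the outputs above; it is an illustration of the arithmetic, not a description of how update() is implemented.

```r
# Original weights from ms_weights
weights <- c(B1 = 553, B2 = 646, B3 = 312, B4 = 622, B5 = 330)

# Chained business-level index values in 202003 (end of the first basket)
chained_202003 <- c(
  B1 = 0.2991629, B2 = 1.3827662, B3 = 3.3033836,
  B4 = 1.3827662, B5 = 1.3827662
)

# Price-updated weights for the second basket
updated <- weights * chained_202003

# Business-level period-over-period index values for 202004
index_202004 <- c(
  B1 = 1.574515, B2 = 2.770456, B3 = 0.537996,
  B4 = 4.576286, B5 = 4.576286
)

# Top-level index for 202004 with the price-updated weights
top_level <- weighted.mean(index_202004, updated)
top_level
```

Up to rounding in the copied inputs, this agrees with the value for level 1 in 202004 from the first example.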
This produces two sets of period-over-period indexes that can be stacked together and then chained.
chain(stack(ms_index1, ms_index2))
#> Fixed-base price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.3827662 3.7815355
#> 11 1 1.3007239 1.3827662 2.1771866
#> 12 1 1.3007239 1.3827662 6.3279338
#> B1 1 0.8949097 0.2991629 0.4710366
#> B2 1 1.3007239 1.3827662 3.8308934
#> B3 1 2.0200036 3.3033836 1.7772072
#> B4 1 1.3007239 1.3827662 6.3279338
#> B5 1 1.3007239 1.3827662 6.3279338
The previous examples used parental imputation to both impute missing price relatives when calculating the elemental indexes, and to impute missing elemental indexes during aggregation. Another common imputation strategy when making elemental indexes is to carry forward the previous price to impute for missing prices, and parentally impute missing elemental indexes during aggregation. As the elemental_index() function accepts price relatives as its input, other types of imputations can be done prior to passing price relatives to this function.
ms_elemental2 <- ms_prices |>
transform(imputed_price = carry_forward(price, period = period, product = product)) |>
elemental_index(
price_relative(imputed_price, period = period, product = product) ~
period + business,
na.rm = TRUE
)
ms_elemental2
#> Period-over-period price index for 4 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.5781816 1.000000
#> B2 1 1.0000000 0.1777227 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 NaN NaN NaN 4.576286
Aggregation follows the same steps as in the previous examples, with missing values set to be ignored in order to parentally impute missing elemental indexes.
ms_index <- aggregate(ms_elemental2, pias, na.rm = TRUE)
ms_index
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.1721550 0.8082981 2.2653614
#> 11 1 1.1721550 0.8082981 0.8093718
#> 12 1 1.1721550 0.8082981 4.5762862
#> B1 1 0.8949097 0.5781816 1.0000000
#> B2 1 1.0000000 0.1777227 2.7704563
#> B3 1 2.0200036 1.6353355 0.5379960
#> B4 1 1.1721550 0.8082981 4.5762862
#> B5 1 1.1721550 0.8082981 4.5762862
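The carry-forward step on its own can be sketched in base R with a toy data frame. The data and the helper carry() are made up for illustration, and this is not the implementation of carry_forward().

```r
# Hypothetical prices for two products over three periods
prices <- data.frame(
  period  = rep(c("202001", "202002", "202003"), each = 2),
  product = rep(c("a", "b"), times = 3),
  price   = c(1.0, 2.0, NA, 2.2, NA, NA)
)

# Replace each missing value with the most recent value before it
carry <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  x
}

# Apply the helper within each product, with rows ordered by period
prices <- prices[order(prices$period), ]
prices$imputed <- ave(prices$price, prices$product, FUN = carry)
prices
```

A carried-forward price gives a price relative of exactly 1 in the period where it is imputed, which is why B1 has an index value of 1 in 202004 above.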
All of the examples so far have built an index from a single source of price data. In many cases the elemental indexes are built from multiple sources of data, either because no single source of data has the necessary coverage, or different index-number formulas are employed for different elemental aggregates.
It is straightforward to merge index objects together, provided they’re for the same time periods. To keep the example simple, suppose that ms_prices is split in two.
ms_prices1 <- subset(ms_prices, business %in% c("B1", "B2", "B3"))
ms_prices2 <- subset(ms_prices, business == "B4")
Elemental indexes can be made for both groups separately with the usual recipe. Note that there is no data for business B4 in the first two periods, so the time periods need to be made explicit.
ms_elemental1 <- elemental_index(
ms_prices1,
price_relative(price, period = period, product = product) ~ period + business,
na.rm = TRUE
)
ms_elemental1
#> Period-over-period price index for 3 levels over 4 time periods
#> 202001 202002 202003 202004
#> B1 1 0.8949097 0.3342939 NaN
#> B2 1 NaN NaN 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
ms_elemental2 <- ms_prices2 |>
transform(period = factor(period, levels = time(ms_elemental1))) |>
elemental_index(
price_relative(price, period = period, product = product) ~ period + business,
na.rm = TRUE
)
ms_elemental2
#> Period-over-period price index for 1 levels over 4 time periods
#> 202001 202002 202003 202004
#> B4 NaN NaN NaN 4.576286
Once the elemental indexes are made, they can be merged together and then aggregated.
aggregate(merge(ms_elemental1, ms_elemental2), pias, na.rm = TRUE)
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.0630743 2.734761
#> 11 1 1.3007239 1.0630743 1.574515
#> 12 1 1.3007239 1.0630743 4.576286
#> B1 1 0.8949097 0.3342939 1.574515
#> B2 1 1.3007239 1.0630743 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 1 1.3007239 1.0630743 4.576286
#> B5 1 1.3007239 1.0630743 4.576286
A slightly more complex case is when some of the input data are already a price index. For example, suppose the index values for businesses B4 and B5 come from some outside process, and are taken as inputs.
ms_prices2 <- subset(
as.data.frame(aggregate(ms_elemental, pias, na.rm = TRUE)),
level %in% c("B4", "B5")
)
ms_prices2
#> period level value
#> 7 202001 B4 1.000000
#> 8 202001 B5 1.000000
#> 15 202002 B4 1.300724
#> 16 202002 B5 1.300724
#> 23 202003 B4 1.063074
#> 24 202003 B5 1.063074
#> 31 202004 B4 4.576286
#> 32 202004 B5 4.576286
All that is required is to pass the pre-existing indexes to as_index() to cast them into the correct form. This won’t affect their values, but will allow them to be merged with the other elemental indexes, and aggregated.
ms_elemental2 <- as_index(ms_prices2)
aggregate(merge(ms_elemental1, ms_elemental2), pias, na.rm = TRUE)
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3007239 1.0630743 2.734761
#> 11 1 1.3007239 1.0630743 1.574515
#> 12 1 1.3007239 1.0630743 4.576286
#> B1 1 0.8949097 0.3342939 1.574515
#> B2 1 1.3007239 1.0630743 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 1 1.3007239 1.0630743 4.576286
#> B5 1 1.3007239 1.0630743 4.576286
All of the examples so far have used a single set of weights to aggregate an index. Although this is by far the most common case, there are situations where the aggregation weights change every period. The Paasche index is the notable example, as the weights for aggregation are the current-period revenue shares in each period.
weights <- data.frame(
period = rep(c("202001", "202002", "202003", "202004"), each = 5),
classification = ms_weights$classification,
weight = 1:20
)
head(weights)
#> period classification weight
#> 1 202001 11 1
#> 2 202001 11 2
#> 3 202001 11 3
#> 4 202001 12 4
#> 5 202001 12 5
#> 6 202002 11 6
The only new tools needed to deal with time-varying weights are the stack() and unstack() functions. stack() appends a later index series onto an earlier one for the same levels, whereas unstack() pulls apart an index series for many periods into a collection of one-period indexes.
The first step to making a Paasche index is to unstack the elemental indexes into a list of elemental indexes for each period. (Trying to make the elemental indexes period-by-period can be dangerous when there are missing values.)
ms_elemental <- unstack(ms_elemental)
ms_elemental
#> $`202001`
#> Period-over-period price index for 4 levels over 1 time periods
#> 202001
#> B1 1
#> B2 1
#> B3 1
#> B4 NaN
#>
#> $`202002`
#> Period-over-period price index for 4 levels over 1 time periods
#> 202002
#> B1 0.8949097
#> B2 NaN
#> B3 2.0200036
#> B4 NaN
#>
#> $`202003`
#> Period-over-period price index for 4 levels over 1 time periods
#> 202003
#> B1 0.3342939
#> B2 NaN
#> B3 1.6353355
#> B4 NaN
#>
#> $`202004`
#> Period-over-period price index for 4 levels over 1 time periods
#> 202004
#> B1 NaN
#> B2 2.770456
#> B3 0.537996
#> B4 4.576286
The second step is to make a sequence of aggregation structures, one for each period’s set of weights, so that pias becomes a list with one aggregation structure per period.
Making the Paasche index for each period is now just a case of mapping the aggregate() function to each elemental index and aggregation structure, and then reducing the result with the stack() function.
paasche <- Reduce(
stack,
Map(aggregate, ms_elemental, pias, na.rm = TRUE, r = -1)
)
paasche
#> Period-over-period price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.3127080 0.5874490 1.3591916
#> 11 1 1.3127080 0.5874490 0.8839797
#> 12 1 1.3127080 0.5874490 4.5762862
#> B1 1 0.8949097 0.3342939 0.8839797
#> B2 1 1.3127080 0.5874490 2.7704563
#> B3 1 2.0200036 1.6353355 0.5379960
#> B4 1 1.3127080 0.5874490 4.5762862
#> B5 1 1.3127080 0.5874490 4.5762862
With a Paasche index in hand, it is now trivial to make a Fisher index by first making the period-over-period Laspeyres index, and then doing a simple matrix operation.
laspeyres <- Reduce(
stack,
Map(aggregate, ms_elemental, pias[c(1, 1, 2, 3)], na.rm = TRUE)
)
fisher <- sqrt(as.matrix(laspeyres) * as.matrix(paasche))
fisher
#> 202001 202002 202003 202004
#> 1 1 1.5107763 0.7956890 1.996688
#> 11 1 1.5107763 0.7956890 1.192826
#> 12 1 1.5107763 0.7956890 4.576286
#> B1 1 0.8949097 0.3342939 1.192826
#> B2 1 1.5107763 0.7956890 2.770456
#> B3 1 2.0200036 1.6353355 0.537996
#> B4 1 1.5107763 0.7956890 4.576286
#> B5 1 1.5107763 0.7956890 4.576286
Percent-change contributions can similarly be computed with a matrix operation.
geometric_weights <- transmute_weights(0, 1)
w <- mapply(
\(x, y) scale_weights(geometric_weights(c(x, y))),
as.numeric(laspeyres[1]),
as.numeric(paasche[1])
)
laspeyres_contrib <- contrib(laspeyres)
paasche_contrib <- contrib(paasche)
fisher_contrib <- w[1, col(laspeyres_contrib)] * laspeyres_contrib +
w[2, col(paasche_contrib)] * paasche_contrib
fisher_contrib
#> 202001 202002 202003 202004
#> 1 0 0.00000000 0.00000000 0.000000000
#> 10 0 -0.13327742 0.17296129 -0.131946938
#> 11 0 0.00000000 NA 0.039533034
#> 12 0 0.00000000 NA 0.880561300
#> 2 NA NA -0.42962331 0.000000000
#> 3 0 -0.04756479 NA NA
#> 4 0 NA NA -0.012019870
#> 5 0 NA NA 0.062419211
#> 6 0 NA NA 0.282471798
#> 7 0 0.78379264 -0.13008203 -0.018866231
#> 8 0 0.02893842 0.11116498 -0.006388331
#> 9 0 -0.12111255 0.07126804 -0.099076370
Despite being a matrix, the resulting Fisher index can be chained just like any other index.
chain(fisher)
#> Fixed-base price index for 8 levels over 4 time periods
#> 202001 202002 202003 202004
#> 1 1 1.5107763 1.2021080 2.4002342
#> 11 1 1.5107763 1.2021080 1.4339054
#> 12 1 1.5107763 1.2021080 5.5011904
#> B1 1 0.8949097 0.2991629 0.3568492
#> B2 1 1.5107763 1.2021080 3.3303878
#> B3 1 2.0200036 3.3033836 1.7772072
#> B4 1 1.5107763 1.2021080 5.5011904
#> B5 1 1.5107763 1.2021080 5.5011904
A chained Fisher index can also be made by first chaining the Laspeyres and Paasche indexes, then taking the geometric mean.
sqrt(as.matrix(chain(laspeyres)) * as.matrix(chain(paasche)))
#> 202001 202002 202003 202004
#> 1 1 1.5107763 1.2021080 2.4002342
#> 11 1 1.5107763 1.2021080 1.4339054
#> 12 1 1.5107763 1.2021080 5.5011904
#> B1 1 0.8949097 0.2991629 0.3568492
#> B2 1 1.5107763 1.2021080 3.3303878
#> B3 1 2.0200036 3.3033836 1.7772072
#> B4 1 1.5107763 1.2021080 5.5011904
#> B5 1 1.5107763 1.2021080 5.5011904