The linear phenotypic selection index (LPSI), the null restricted LPSI (RLPSI), and the predetermined proportional gains LPSI (PPG-LPSI) are the main phenotypic selection indices used to predict the net genetic merit and select parents for the next selection cycle. The LPSI is an unrestricted index, whereas the RLPSI and the PPG-LPSI allow restrictions equal to zero and predetermined proportional gain restrictions respectively to be imposed on the expected genetic gain values of the trait.
One additional restricted index is the desired gains LPSI (DG-LPSI), which does not require economic weights and, in a similar manner to the PPG-LPSI, allows restrictions to be imposed on the expected genetic gain values of the trait based on a predetermined level.
In this vignette, we demonstrate the theory and practical application
of RLPSI, PPG-LPSI, and DG-LPSI using the selection.index
package.
We use the synthetic maize_pheno dataset built into the
package to calculate the phenotypic and genotypic variance-covariance
matrices.
library(selection.index)
# Load the built-in maize phenotype dataset
data("maize_pheno")
# Extract the traits of interest
traits <- c("Yield", "PlantHeight", "DaysToMaturity")
# Calculate Genotypic (gmat) and Phenotypic (pmat) covariance matrices
gmat <- gen_varcov(maize_pheno[, traits], maize_pheno$Genotype, maize_pheno$Block)
pmat <- phen_varcov(maize_pheno[, traits], maize_pheno$Genotype, maize_pheno$Block)
# Define economic weights for the three traits
wmat <- weight_mat(data.frame(Trait = traits, Weight = c(1, -1, 1)))The main objective of the RLPSI is to optimize, under some null restrictions, the selection response, to predict the net genetic merit \(H = \mathbf{w}'\mathbf{g}\), and to select the individuals with the highest net genetic merit values as parents of the next generation. The RLPSI allows restrictions equal to zero to be imposed on the expected genetic gains of some traits.
Assuming we want to minimize the mean squared difference between the index \(I\) and merit \(H\) (\(E[(H - I)^2]\)) under the restriction \(\mathbf{C}'\mathbf{b} = \mathbf{0}\), we minimize the following function:
\[ \Psi \left(\mathbf{b},\mathbf{v}\right)={\mathbf{b}}^{\prime}\mathbf{Pb}+{\mathbf{w}}^{\prime}\mathbf{Gw}-2{\mathbf{w}}^{\prime}\mathbf{Gb}+2{\mathbf{v}}^{\prime }{\mathbf{C}}^{\prime}\mathbf{b} \]
where \(\mathbf{C}' = \mathbf{U}'\mathbf{G}\), \(\mathbf{U}'\) is the matrix of null restrictions, and \(\mathbf{v}\) is a vector of Lagrange multipliers. The solution yields the RLPSI vector of coefficients:
\[ {\mathbf{b}}_R=\mathbf{Kb} \]
where \(\mathbf{K} = [\mathbf{I} - \mathbf{Q}]\), \(\mathbf{Q} = \mathbf{P}^{-1}\mathbf{C}(\mathbf{C}'\mathbf{P}^{-1}\mathbf{C})^{-1}\mathbf{C}'\), and \(\mathbf{b} = \mathbf{P}^{-1}\mathbf{Gw}\) is the LPSI vector of coefficients.
Using the rlpsi function, we can constrain the genetic
gain of certain traits to be strictly zero. For instance, we may want to
maximize overall gain while ensuring that Plant Height (trait #2) does
not change.
# Restrict trait 2 (PHT) to have ZERO expected genetic gain
rlpsi_res <- rlpsi(
pmat = pmat,
gmat = gmat,
wmat = wmat,
restricted_traits = c(2)
)
# View the summary and coefficients
print(rlpsi_res$summary)
#> b.1 b.2 b.3 GA PRE Delta_G rHI hI2
#> 1 0.0551 -3.3075 2.1743 108.2417 10824.17 108.2417 0.3153 0.0994
# View the expected genetic gains (Delta_G)
print(rlpsi_res$Delta_G)
#> Yield PlantHeight DaysToMaturity
#> 1.059783e+02 -1.709497e-08 2.263426e+00Notice that the expected genetic gain (Delta_G) for the
2nd trait is effectively zero.
Unlike the RLPSI, the predetermined proportional gains phenotypic selection index (PPG-LPSI) allows restrictions different from zero to be imposed, ensuring that traits gain in strictly predefined proportions to each other.
Minimizing \(E[(H - I)^2]\) under the proportional restriction \(\mathbf{M}'\mathbf{b} = 0\), where \(\mathbf{M}' = \mathbf{D}'\mathbf{C}'\) and \(\mathbf{D}'\) is a Mallard (1972) matrix, leads to:
\[ \Phi \left(\mathbf{b},\mathbf{v}\right)={\mathbf{b}}^{\prime}\mathbf{Pb}+{\mathbf{w}}^{\prime}\mathbf{Gw}-2{\mathbf{w}}^{\prime}\mathbf{Gb}+2{\mathbf{v}}^{\prime }{\mathbf{M}}^{\prime}\mathbf{b} \]
The vector that minimizes the mean squared error under this restriction is:
\[ {\mathbf{b}}_M={\mathbf{K}}_M\mathbf{b} \]
where \(\mathbf{K}_M = [\mathbf{I} - \mathbf{Q}_M]\) and \(\mathbf{Q}_M = \mathbf{P}^{-1}\mathbf{M}(\mathbf{M}'\mathbf{P}^{-1}\mathbf{M})^{-1}\mathbf{M}'\). Alternatively, Tallis (1985) formulated it using a proportionality constant \(\theta\):
\[ {\mathbf{b}}_{\mathrm{T}}={\mathbf{b}}_R+\uptheta \boldsymbol{\updelta} \]
The ppg_lpsi function enforces these desired
proportional gains. Suppose we want the gains across the three traits to
follow the proportion 2 : 1 : 1.
# Specify the desired proportions
k_proportions <- c(2, 1, 1)
# Calculate the PPG-LPSI
ppg_res <- ppg_lpsi(pmat = pmat, gmat = gmat, k = k_proportions, wmat = wmat)
# View the expected genetic gains
print(ppg_res$Delta_G)
#> Yield PlantHeight DaysToMaturity
#> 700636.74 18363.40 10206.29If we observe the resulting Delta_G values, their
relative proportions should approximate the 2 : 1 : 1 ratio
dictated by k.
The desired gains linear phenotypic selection index (DG-LPSI) is unique in that it does not require economic weights \(\mathbf{w}\). Instead, the breeder specifies the exact desired target genetic gains \(\mathbf{d}\) directly.
Because the expected genetic gain is \(\mathbf{E} = k_I \frac{\mathbf{Gb}}{\sigma_I}\), if we set \(\mathbf{Gb} = \mathbf{d}\), then we want to minimize the variance of the index \(\sigma_I\) subject to \(\mathbf{Gb} = \mathbf{d}\) to maximize \(\mathbf{E}\):
\[ {\Phi}_{DG}\left(\mathbf{b},\mathbf{v}\right)=\mathbf{0.5}\left({\mathbf{b}}^{\prime}\mathbf{Pb}\right)+{\mathbf{v}}^{\prime}\left(\mathbf{Gb}-\mathbf{d}\right) \]
Solving this yields the DG-LPSI vector of coefficients:
\[ {\mathbf{b}}_{DG}={\mathbf{P}}^{-1}\mathbf{G}{\left({\mathbf{GP}}^{-1}\mathbf{G}\right)}^{-1}\mathbf{d} \]
We can run the dg_lpsi function solely based on desired
absolute gains. Let’s aim to increase Yield by 5 units, decrease Plant
Height by 2 units, and increase Days to Maturity by 1 unit.
# Explicit vector of desired absolute genetic gains
desired_gains <- c(5, -2, 1)
# Calculate DG-LPSI
dg_res <- dg_lpsi(pmat = pmat, gmat = gmat, d = desired_gains)
# Check the achieved proportional genetic gains
print(dg_res$Delta_G)
#> Yield PlantHeight DaysToMaturity
#> 3.3431312 -1.3372525 0.6686262
# The DG-LPSI also calculates implied Smith-Hazel economic weights
print(dg_res$implied_weights_normalized)
#> Yield PlantHeight DaysToMaturity
#> -0.01969999 0.19583750 1.00000000The output gives us the selection index coefficients \(\mathbf{b}\) that would achieve the target gains, as well as the implied economic weights showing the “absolute cost” or relative importance required to achieve those gains.
Imposing restrictions on an index inherently limits its potential to
maximize the overall selection response \(R_R\). The selection.index
package allows you to evaluate this “cost of restriction” by calculating
both the relative selection efficiency (PRE - Percentage of
Response Efficiency) and the heritability (hI2) of the
constrained index versus the unconstrained equivalent.
When reviewing the summary data frame output of our
indices above: - rHI: The correlation
between the constrained index and the true net genetic merit (\(H\)). As more restrictions are added, this
correlation decreases. - PRE: Shows the
expected genetic advance relative to a base index. Constrained indices
will universally show lower PRE than their unconstrained
LPSI counterpart.
By default, rlpsi() and ppg_lpsi() accept
user-friendly arguments like restricted_traits = c(2) or
k = c(2, 1, 1) to automatically construct the mathematical
constraint matrices behind the scenes.
However, advanced users may wish to manually define these matrices, particularly for complex scenarios.
For the RLPSI, the constraint matrix \(\mathbf{C}\) is a \(t \times r\) matrix (where \(t\) is the number of traits and \(r\) is the number of restrictions). Each
column represents a restricted trait with a 1 at the
restricted index and 0s elsewhere:
# Manually restrict traits #1 and #3:
# Create a 3x2 constraint matrix
C_matrix <- matrix(
c(
1, 0, 0, # Restrict trait 1
0, 0, 1
), # Restrict trait 3
nrow = 3, ncol = 2
)
# Pass directly to RLPSI
rlpsi_manual <- rlpsi(pmat = pmat, gmat = gmat, wmat = wmat, C = C_matrix)
print(rlpsi_manual$Delta_G)
#> Yield PlantHeight DaysToMaturity
#> -0.01072495 -2.76848253 0.74035593The ppg_lpsi() function automatically manages the
complex proportional constraint matrix based on your inputs \(\mathbf{k}\), guaranteeing the proportional
relationship \(\Delta \mathbf{G} =
\phi\mathbf{k}\).
While these indices are powerful predictive models, their real-world applicability operates under the limitations established by classical breeding theory (Hazel, 1943):
The selection.index package provides highly flexible
options beyond the simple unrestricted index. The RLPSI
is useful when strictly zero-change bounds are needed on specific
traits. The PPG-LPSI caters to breeders focusing on a
strict multi-trait improvement ratio. Finally, the
DG-LPSI provides a powerful alternative for scenarios
where economic weights are notoriously difficult to estimate, but ideal
target gains are known. When using these constrained models, breeders
must continuously balance their specific phenotypic goals against the
intrinsic statistical costs to overall genetic efficiency.