How to use statisR

statisR Package version 1.0

Oldemar Rodríguez R.

Installing the package

CRAN

install.packages("statisR", dependencies=TRUE)

STATIS Method

How to read a Table from a CSV file?

DT1 <- read.table("STATIS_TABLE1.csv", header = TRUE, sep = ",", dec = ".", row.names = 1)
DT1
#>           NIT  FOS   CAL    STO   PH   MN   ZN   SS    ALC    CL   CAU   DBO
#> ALAJUELA 2.36 0.43 69.00 156.50 8.41 0.06 0.03 0.20 150.75  7.85  0.45 16.70
#> CIRUELAS 3.60 0.53 72.25 184.75 8.44 0.04 0.01 0.42 149.50  8.12  1.37  4.78
#> DESFO    0.89 0.70 61.00 172.25 7.31 0.19 0.02 0.55 129.00  3.75 19.90 11.07
#> EMBC     1.06 0.69 55.50 177.25 7.21 0.18 0.02 0.72 127.75  3.70 12.00 13.55
#> EMBO     0.83 0.63 53.25 169.00 7.70 0.18 0.26 0.70 136.75  3.60 12.00 22.18
#> EMBS     0.92 0.66 59.00 160.25 7.25 0.18 0.03 0.49 125.75  3.37 12.00 14.02
#> PRES     1.44 0.78 58.25 167.50 7.29 0.21 0.03 0.66 120.50  3.92 17.85 25.00
#> QSOT     2.14 0.55 71.25 188.25 8.36 0.03 0.02 0.20 165.00  3.60  0.14  3.27
#> VIRI     7.65 1.19 58.50 335.50 8.05 0.28 0.06 1.85 182.25 12.90 16.68 21.50
#>            POR
#> ALAJUELA 87.25
#> CIRUELAS 88.25
#> DESFO    48.50
#> EMBC     58.25
#> EMBO     76.50
#> EMBS     43.25
#> PRES     53.25
#> QSOT     66.50
#> VIRI     83.50
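As a minimal sketch of what `row.names = 1` does, the following writes a two-site excerpt of the table above to a temporary file and reads it back with the same arguments. The temporary path and the `SITE` header name are assumptions made only for this illustration; the real CSV header may differ.

```r
# Write a tiny CSV to a temporary file; the first column holds the site names.
csv.path <- tempfile(fileext = ".csv")
writeLines(c("SITE,NIT,FOS,PH",
             "ALAJUELA,2.36,0.43,8.41",
             "CIRUELAS,3.60,0.53,8.44"), csv.path)

# row.names = 1 turns the first column into row names instead of a variable.
DT.small <- read.table(csv.path, header = TRUE, sep = ",", dec = ".", row.names = 1)
DT.small
dim(DT.small)       # 2 rows, 3 numeric columns
rownames(DT.small)  # "ALAJUELA" "CIRUELAS"
```

Keeping the site names as row names matters because the tables passed to the analysis are matched by their rows.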

Principal Functions

statis

Applies the STATIS method to a set of matrices (data tables) with the same rows. STATIS is a multivariate analysis technique that allows studying the common structure and the evolution of individuals and variables across multiple tables.

plot.statis.circle

This function generates a correlation circle plot from two-dimensional coordinates, commonly used in principal component analysis (PCA) or other multivariate methods.

plot.statis.plane

This function generates a two-dimensional scatter plot with centered axes, useful for representing the results of multivariate analyses.

Example 1: Article on Wine Evaluation by Experts

rows <- paste0("Wine", 1:6)

expert1 <- data.frame(
  fruity = c(1, 5, 6, 7, 2, 3),
  woody  = c(6, 3, 1, 1, 5, 4),
  coffee = c(7, 2, 1, 2, 4, 4),
  row.names = rows
)

expert2 <- data.frame(
  red_fruit = c(2, 4, 5, 7, 3, 3),
  roasted   = c(5, 4, 2, 2, 5, 5),
  vanillin  = c(7, 4, 1, 1, 6, 4),
  woody     = c(6, 2, 1, 2, 5, 5),
  row.names = rows
)

expert3 <- data.frame(
  fruity = c(3, 4, 7, 2, 2, 1),
  butter = c(6, 4, 1, 2, 6, 7),
  woody  = c(7, 3, 1, 2, 6, 5),
  row.names = rows
)

labels <- c("Expert 1", "Expert 2", "Expert 3")

Apply statis without any selection and save the result

res <- statis(list(expert1, expert2, expert3), table.labels = labels)

Plot Correlation Circle of all the tables

inter <- res$circle.inter
inter.graph <- plot.statis.circle(inter$points, inertia = inter$inertia, labels = inter$labels, title = inter$title)
inter.graph

Graphic Interpretation

  • Each expert (1, 2, and 3) appears as a point.
  • All three are located relatively close to each other, with an explained inertia of 99%, which indicates that their evaluations are highly consistent with one another; Expert 3 lies slightly apart from the other two.
  • This means that, despite using different sensory variables (fruity, woody, coffee, etc.), the experts agree on the general structure of their evaluations.
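The agreement between experts can also be checked numerically. One common measure in the STATIS literature is the RV coefficient between the tables' cross-product matrices. The sketch below uses base R only; it is not necessarily how statis computes its interstructure, and the helper names `cross.product` and `rv.coefficient` are introduced here for illustration.

```r
# Expert tables from the example above (same six wines in the rows).
expert1 <- data.frame(fruity = c(1, 5, 6, 7, 2, 3), woody = c(6, 3, 1, 1, 5, 4),
                      coffee = c(7, 2, 1, 2, 4, 4))
expert2 <- data.frame(red_fruit = c(2, 4, 5, 7, 3, 3), roasted = c(5, 4, 2, 2, 5, 5),
                      vanillin = c(7, 4, 1, 1, 6, 4), woody = c(6, 2, 1, 2, 5, 5))
expert3 <- data.frame(fruity = c(3, 4, 7, 2, 2, 1), butter = c(6, 4, 1, 2, 6, 7),
                      woody = c(7, 3, 1, 2, 6, 5))

# Cross-product matrix between wines for one centered and scaled table.
cross.product <- function(tab) {
  X <- scale(as.matrix(tab))
  X %*% t(X)
}

# RV coefficient: cosine between two cross-product matrices.
rv.coefficient <- function(W1, W2) {
  sum(W1 * W2) / sqrt(sum(W1 * W1) * sum(W2 * W2))
}

W <- lapply(list(expert1, expert2, expert3), cross.product)
RV <- outer(1:3, 1:3, Vectorize(function(i, j) rv.coefficient(W[[i]], W[[j]])))
dimnames(RV) <- list(c("Expert 1", "Expert 2", "Expert 3"),
                     c("Expert 1", "Expert 2", "Expert 3"))
round(RV, 3)
```

Values close to 1 off the diagonal correspond to experts who arrange the wines in a similar way, even when their variables differ.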

Plot Correlation Circle of all variables evolution

intra <- res$circle.intra
intra.graph <- plot.statis.circle(intra$points, inertia = intra$inertia, labels = intra$labels, title = intra$title)
intra.graph

Graphic Interpretation

All the variables from the three experts are projected here: for example, Expert 1: fruity, Expert 2: red_fruit, Expert 3: butter, etc.

  • The inertia (95.02%) shows that the first two dimensions summarize almost all of the variability.
  • Some variables form small angles, which implies a strong and positive correlation (e.g., fruity from Expert 1 and red_fruit from Expert 2), reflecting that they capture similar sensations.
  • Others point in opposite directions, forming angles close to 180°, which implies a strong negative correlation (e.g., butter from Expert 3 against the fruity variables). If the angle is close to 90°, this suggests that certain attributes are more specific to one evaluator and are not correlated with those of the others.
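The link between angles and correlations can be verified directly on the raw scores: the cosine of the angle between two centered variables is their Pearson correlation. This is a minimal base R sketch (the helper `cos.angle` is introduced here); on the plot itself the angle is read between two-dimensional projections, so the match is only close for variables that lie near the unit circle.

```r
fruity.e1    <- c(1, 5, 6, 7, 2, 3)  # Expert 1: fruity
red.fruit.e2 <- c(2, 4, 5, 7, 3, 3)  # Expert 2: red_fruit
butter.e3    <- c(6, 4, 1, 2, 6, 7)  # Expert 3: butter

# Cosine of the angle between two centered vectors.
cos.angle <- function(x, y) {
  x <- x - mean(x)
  y <- y - mean(y)
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

cos.angle(fruity.e1, red.fruit.e2)  # positive: small angle
cos.angle(fruity.e1, butter.e3)     # negative: large angle

# The cosine coincides with the Pearson correlation.
all.equal(cos.angle(fruity.e1, red.fruit.e2), cor(fruity.e1, red.fruit.e2))
```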

Plot Principal Plane of Average Individuals

individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Graphic Interpretation

  • The wines (Wine1…Wine6) are represented in the average space of the experts.
  • Groupings can be observed: for example, Wine1 and Wine5 are located close to each other and in the same quadrant, which implies similar sensory profiles.
  • In contrast, Wine3 and Wine4 are projected in the opposite area, indicating that they have distinct sensory characteristics.
  • The inertia (95%) indicates that the plane retains almost all of the variability, so the distances between wines on the plot can be read with confidence.
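The percentage of inertia shown on a principal plane is typically the share of the total eigenvalue sum carried by the two plotted axes. The sketch below illustrates that computation with an ordinary PCA on Expert 2's table, not with the STATIS compromise, so its numbers are not expected to match the 95% reported above.

```r
expert2 <- data.frame(red_fruit = c(2, 4, 5, 7, 3, 3), roasted = c(5, 4, 2, 2, 5, 5),
                      vanillin = c(7, 4, 1, 1, 6, 4), woody = c(6, 2, 1, 2, 5, 5))

# PCA on standardized variables; the eigenvalues are the squared sdev values.
pca <- prcomp(expert2, center = TRUE, scale. = TRUE)
eigenvalues <- pca$sdev^2

# Percentage of inertia per axis, and the share kept by the first plane.
inertia.pct <- 100 * eigenvalues / sum(eigenvalues)
round(inertia.pct, 2)
sum(inertia.pct[1:2])
```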

Plot Principal Plane of the Evolution of Individuals

evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

Graphic Interpretation

Here we can see how each wine is evaluated separately by each expert (e.g., “Wine1·Expert1,” “Wine1·Expert2,” etc.).

  • The trajectories of each wine (three connected points) allow us to analyze the consistency among experts.
  • If the three points of a wine are very close to each other, it means that the experts perceive it similarly.
  • If they are dispersed, it shows disagreement among them. For example, Wine1 and Wine5 show greater coherence in evaluations among the experts. Wine3 and Wine4 show more dispersion, suggesting that they generate divergent perceptions depending on the evaluator.

Apply statis with specific selections and save the result

Selecting tables 1 and 3

res <- statis(list(expert1, expert2, expert3), selected.tables = c(1, 3), table.labels = labels)

Plot Correlation Circle of all the tables

inter <- res$circle.inter
inter.graph <- plot.statis.circle(inter$points, inertia = inter$inertia, labels = inter$labels, title = inter$title)
inter.graph

Plot Correlation Circle of all variables evolution
intra <- res$circle.intra
intra.graph <- plot.statis.circle(intra$points, inertia = intra$inertia, labels = intra$labels, title = intra$title)
intra.graph

Selecting rows 1 and 5

res <- statis(list(expert1, expert2, expert3), selected.rows = c(1, 5), table.labels = labels)

Plot Principal Plane of Average Individuals

individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Plot Principal Plane of the Evolution of Individuals
evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

Selecting rows 3 and 4

res <- statis(list(expert1, expert2, expert3), selected.rows = c(3, 4), table.labels = labels)

Plot Principal Plane of Average Individuals

individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Plot Principal Plane of the Evolution of Individuals
evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

Selecting tables (1,3) and rows (1,4)

res <- statis(list(expert1, expert2, expert3), selected.tables = c(1, 3), selected.rows = c(1, 4), table.labels = labels)

Plot Correlation Circle of all the tables

inter <- res$circle.inter
inter.graph <- plot.statis.circle(inter$points, inertia = inter$inertia, labels = inter$labels, title = inter$title)
inter.graph

Plot Correlation Circle of all variables evolution
intra <- res$circle.intra
intra.graph <- plot.statis.circle(intra$points, inertia = intra$inertia, labels = intra$labels, title = intra$title)
intra.graph

Plot Principal Plane of Average Individuals
individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Plot Principal Plane of the Evolution of Individuals
evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

Example 2: Tarcoles River Basin of Costa Rica

Data from the project “Development and Application of Low-Cost Effective Methods for Biological Monitoring of Costa Rican Rivers” of the National University, with 4 measurements taken over time.

Read csv files and load data

M1 <- read.table("STATIS_TABLE1.csv", header = TRUE, sep = ",", dec = ".", row.names = 1)
M2 <- read.table("STATIS_TABLE2.csv", header = TRUE, sep = ",", dec = ".", row.names = 1)
M3 <- read.table("STATIS_TABLE3.csv", header = TRUE, sep = ",", dec = ".", row.names = 1)
M4 <- read.table("STATIS_TABLE4.csv", header = TRUE, sep = ",", dec = ".", row.names = 1)

M1
#>           NIT  FOS   CAL    STO   PH   MN   ZN   SS    ALC    CL   CAU   DBO
#> ALAJUELA 2.36 0.43 69.00 156.50 8.41 0.06 0.03 0.20 150.75  7.85  0.45 16.70
#> CIRUELAS 3.60 0.53 72.25 184.75 8.44 0.04 0.01 0.42 149.50  8.12  1.37  4.78
#> DESFO    0.89 0.70 61.00 172.25 7.31 0.19 0.02 0.55 129.00  3.75 19.90 11.07
#> EMBC     1.06 0.69 55.50 177.25 7.21 0.18 0.02 0.72 127.75  3.70 12.00 13.55
#> EMBO     0.83 0.63 53.25 169.00 7.70 0.18 0.26 0.70 136.75  3.60 12.00 22.18
#> EMBS     0.92 0.66 59.00 160.25 7.25 0.18 0.03 0.49 125.75  3.37 12.00 14.02
#> PRES     1.44 0.78 58.25 167.50 7.29 0.21 0.03 0.66 120.50  3.92 17.85 25.00
#> QSOT     2.14 0.55 71.25 188.25 8.36 0.03 0.02 0.20 165.00  3.60  0.14  3.27
#> VIRI     7.65 1.19 58.50 335.50 8.05 0.28 0.06 1.85 182.25 12.90 16.68 21.50
#>            POR
#> ALAJUELA 87.25
#> CIRUELAS 88.25
#> DESFO    48.50
#> EMBC     58.25
#> EMBO     76.50
#> EMBS     43.25
#> PRES     53.25
#> QSOT     66.50
#> VIRI     83.50
M2
#>           NIT  FOS  CAL   STO   PH   MN   ZN   SS   ALC    CL   CAU    DBO  POR
#> ALAJUELA 1.89 0.96 68.5 230.0 8.10 0.16 0.16 0.95 116.5 17.70  0.90  46.35 84.0
#> CIRUELAS 3.74 1.39 63.5 232.5 8.05 0.12 0.04 0.75 128.5  9.30  3.05  46.00 70.0
#> DESFO    2.32 0.90 64.0 227.5 7.40 0.26 0.16 0.95 101.0  4.10 37.10  54.15 68.5
#> EMBC     2.58 1.21 58.5 288.0 7.25 0.42 0.05 0.65  95.0  4.25 12.00  22.45 57.0
#> EMBO     2.31 1.21 60.5 224.5 7.28 0.16 0.06 0.95  97.0  4.10 12.00  24.25 61.0
#> EMBS     2.52 1.10 61.5 217.0 7.35 0.21 0.10 0.85  98.0  4.25 12.00  21.65 64.0
#> PRES     2.71 1.13 62.5 195.0 7.50 0.20 0.02 1.15 104.5  4.10 28.20  90.70 77.0
#> QSOT     2.42 1.58 68.5 209.0 8.20 0.04 0.03 0.35 153.0  4.55  0.35  20.10 85.5
#> VIRI     5.73 1.91 57.0 404.0 8.05 0.31 0.07 1.75 153.5 12.85 17.25 116.95 86.5
M3
#>           NIT  FOS   CAL    STO   PH   MN   ZN   SS    ALC   CL   CAU  DBO
#> ALAJUELA 2.13 0.24 80.25 176.25 7.74 0.25 0.07 0.62 104.75 5.90  1.64 4.47
#> CIRUELAS 3.71 0.37 80.00 190.25 7.75 0.22 0.05 0.46 113.25 7.08  4.68 1.85
#> DESFO    4.08 0.27 81.25 169.00 7.33 0.17 0.04 0.70  85.75 3.27 59.75 1.93
#> EMBC     3.86 0.27 81.50 168.00 7.16 0.17 0.48 0.46  78.75 3.35 20.00 1.35
#> EMBO     3.96 0.27 81.25 160.50 7.17 0.36 0.05 0.33 158.00 3.25 20.00 1.33
#> EMBS     4.14 0.25 82.75 152.25 7.26 0.20 0.09 0.37  79.50 3.35 20.00 1.45
#> PRES     3.90 0.35 79.25 176.50 7.41 0.37 0.28 0.62  80.25 3.35 53.27 3.77
#> QSOT     2.66 0.40 82.25 186.50 8.00 0.03 0.06 0.16 162.50 3.42  0.61 1.75
#> VIRI     6.09 0.87 67.00 321.25 7.80 0.28 0.11 1.27 136.00 9.05 50.05 9.35
#>            POR
#> ALAJUELA 86.00
#> CIRUELAS 83.00
#> DESFO    84.50
#> EMBC     81.75
#> EMBO     82.00
#> EMBS     84.50
#> PRES     89.75
#> QSOT     85.25
#> VIRI     90.50
M4
#>           NIT  FOS  CAL   STO   PH   MN   ZN   SS   ALC    CL   CAU   DBO  POR
#> ALAJUELA 1.06 0.46 73.0 180.5 8.05 0.12 0.25 0.45 135.0 10.55  1.21  8.90 79.0
#> CIRUELAS 4.77 0.84 72.5 159.0 8.05 0.08 0.15 0.40 105.0  6.65  3.58  3.50 84.5
#> DESFO    1.06 0.33 67.0 134.0 7.30 0.17 0.03 0.70 103.5  2.90 62.25  8.10 51.0
#> EMBC     1.11 0.46 60.5 157.5 7.25 0.16 0.07 1.00  99.5  2.75 20.00  9.65 41.5
#> EMBO     1.37 0.37 66.5 158.0 7.30 0.15 0.21 0.85  99.5  2.60 20.00  6.60 49.5
#> EMBS     1.06 0.37 64.5 156.5 7.30 0.15 0.16 0.45 101.5  2.75 20.00  8.10 47.0
#> PRES     1.87 0.48 68.0 171.0 7.50 0.18 0.66 0.75  97.5  3.05 46.75 26.60 72.5
#> QSOT     2.23 0.76 78.0 185.5 8.30 0.00 0.15 0.20 178.5  2.90  0.36  1.50 85.5
#> VIRI     4.65 0.76 73.0 254.0 8.05 0.21 0.32 0.95 144.5  7.80 42.20  6.50 88.0
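Because STATIS works on tables that share the same rows, it can help to confirm that the sites match across tables before running the analysis. The following is a minimal sketch with a hypothetical helper, `same.rows`, and toy tables built from three of the sites shown above; whether statis performs a similar check internally is not assumed here.

```r
# Hypothetical helper: TRUE when every table has the same row names, in the same order.
same.rows <- function(tables) {
  reference <- rownames(tables[[1]])
  all(vapply(tables, function(tab) identical(rownames(tab), reference), logical(1)))
}

# Toy tables with three of the sampling sites shown above.
sites <- c("ALAJUELA", "CIRUELAS", "DESFO")
T1 <- data.frame(NIT = c(2.36, 3.60, 0.89), PH = c(8.41, 8.44, 7.31), row.names = sites)
T2 <- data.frame(NIT = c(1.89, 3.74, 2.32), PH = c(8.10, 8.05, 7.40), row.names = sites)
T3 <- T2[c(2, 1, 3), ]  # same sites, different order

same.rows(list(T1, T2))  # TRUE
same.rows(list(T1, T3))  # FALSE
```

With the real data the call would be `same.rows(list(M1, M2, M3, M4))`.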

Statis without selections

labels <- c("Measurement 1", "Measurement 2", "Measurement 3", "Measurement 4")

res <- statis(list(M1, M2, M3, M4), table.labels = labels)

Plot Correlation Circle of all the tables

inter <- res$circle.inter
inter.graph <- plot.statis.circle(inter$points, inertia = inter$inertia, labels = inter$labels, title = inter$title)
inter.graph

Plot Correlation Circle of all variables evolution
intra <- res$circle.intra
intra.graph <- plot.statis.circle(intra$points, inertia = intra$inertia, labels = intra$labels, title = intra$title)
intra.graph

Plot Principal Plane of Average Individuals
individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Plot Principal Plane of the Evolution of Individuals
evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

Selecting only table 2

res <- statis(list(M1, M2, M3, M4), selected.tables = c(2), table.labels = labels)

Plot Correlation Circle of all the tables

inter <- res$circle.inter
inter.graph <- plot.statis.circle(inter$points, inertia = inter$inertia, labels = inter$labels, title = inter$title)
inter.graph

Plot Correlation Circle of all variables evolution
intra <- res$circle.intra
intra.graph <- plot.statis.circle(intra$points, inertia = intra$inertia, labels = intra$labels, title = intra$title)
intra.graph

Selecting only row 3

res <- statis(list(M1, M2, M3, M4), selected.rows = c(3), table.labels = labels)

Plot Principal Plane of Average Individuals

individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Plot Principal Plane of the Evolution of Individuals
evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

Selecting table 2 and row 3

res <- statis(list(M1, M2, M3, M4), selected.tables = c(2), selected.rows = c(3), table.labels = labels)

Plot Correlation Circle of all the tables

inter <- res$circle.inter
inter.graph <- plot.statis.circle(inter$points, inertia = inter$inertia, labels = inter$labels, title = inter$title)
inter.graph

Plot Correlation Circle of all variables evolution
intra <- res$circle.intra
intra.graph <- plot.statis.circle(intra$points, inertia = intra$inertia, labels = intra$labels, title = intra$title)
intra.graph

Plot Principal Plane of Average Individuals
individuals <- res$plane.individuals
ind.graph <- plot.statis.plane(individuals$points, inertia = individuals$inertia, labels = individuals$labels, title = individuals$title)
ind.graph

Plot Principal Plane of the Evolution of Individuals
evolution <- res$plane.evolution
evol.graph <- plot.statis.plane(evolution$points, inertia = evolution$inertia, labels = evolution$labels, title = evolution$title)
evol.graph

STATIS-DUAL Method

How to read a Table from a CSV file?

Tuis5_95 <- read.table("Tuis5_95.csv", header=TRUE, sep=';', dec=',')
Tuis5_95
#>      Ph Temp   Na   Ka    Ca   Mg Si02   OD  DBO  SD  ST  PO4   Cl  NO3 SO45
#> 1  8.10 23.8 4.34 1.68 11.60 3.67 32.3 8.18 2.00 120 152 0.57 1.00 0.82 1.46
#> 2  8.81 25.8 4.72 1.99 11.50 3.92 32.4 8.41 2.00  67  80 0.36 2.10 0.15 1.31
#> 3  7.00 23.5 3.51 1.44  9.08 2.95 28.5 7.79 2.00  85 108 0.26 1.31 0.63 0.83
#> 4  7.64 26.3 3.46 1.35  9.23 2.83 28.2 7.66 2.00  87  94 0.36 0.78 0.28 0.83
#> 5  7.03 22.3 3.91 1.46  9.48 3.34 25.6 7.63 2.00 138 160 0.63 1.09 1.01 1.15
#> 6  7.37 24.2 3.82 1.60  9.98 3.15 29.2 7.95 2.16  67  82 0.12 0.77 0.46 0.80
#> 7  7.00 23.7 3.34 2.29  9.20 2.91 28.4 7.09 4.12  86 122 0.33 1.36 1.43 0.90
#> 8  6.60 21.8 3.34 1.56  9.11 2.85 34.1 6.34 4.42  65  74 0.22 0.81 0.61 0.90
#> 9  6.79 22.3 3.64 2.05 10.80 3.32 30.6 6.35 7.02 119 124 0.36 1.00 0.00 0.82
#> 10 7.01 22.0 3.67 1.56 10.10 3.36 35.6 8.06 2.00  98 114 0.36 0.79 0.67 0.92
#>    HC03   DT   POD  Cal
#> 1  61.6 44.1 105.0 82.6
#> 2  69.3 44.9 112.0 81.1
#> 3  52.7 34.8  99.0 88.3
#> 4  58.5 34.7 103.0 87.4
#> 5  60.7 37.5  95.1 83.1
#> 6  63.4 37.9 102.0 90.2
#> 7  49.8 35.0  90.1 83.1
#> 8  57.7 34.5  78.3 74.4
#> 9  63.8 40.7  79.2 77.3
#> 10 60.2 39.1  99.6 87.5

Principal Functions

statis.dual

Implementation of the STATIS DUAL method for the joint analysis of multiple tables that share the same variables (same columns). This approach allows evaluating the common structure between tables (interstructure), building a compromise (weighted average of structures), and analyzing the trajectories of variables across the tables.

plot.statis.dual.circle

This function generates a 2D scatter plot with support for multiple groups, labels, arrows from the origin, reference circles, cross axes, and full style customization using ggplot2.

plot.statis.dual.trajectories

Visualizes the evolution of one or more variables across the different tables in a STATIS DUAL analysis. Each trajectory represents the sequence of positions of a variable in the compromise space.

select.super.variables

This function selects a predefined subset of variables from a supervision matrix, checks dimension consistency, verifies missing variables, and constructs a clean data frame containing the first two coordinates typically used for PCA or STATIS DUAL correlation plots.
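To make the idea of such a selection concrete, here is a base R sketch of the same kind of operation: subsetting a coordinate matrix by variable name, failing on unknown names, and returning the first two coordinates as a data frame. The helper `pick.variables` is hypothetical and is not the package's implementation; the coordinates come from an ordinary PCA on the built-in airquality data.

```r
AQ <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])
pca <- prcomp(AQ, scale. = TRUE)

# Correlation of each variable with the first two principal components.
coords <- cor(AQ, pca$x[, 1:2])

# Hypothetical helper: keep only the requested variables, stopping on unknown names.
pick.variables <- function(coords, selected) {
  missing <- setdiff(selected, rownames(coords))
  if (length(missing) > 0) {
    stop("Variables not found: ", paste(missing, collapse = ", "))
  }
  out <- as.data.frame(coords[selected, 1:2, drop = FALSE])
  names(out) <- c("x", "y")
  out
}

pick.variables(coords, c("Ozone", "Wind", "Temp"))
```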

Example 1: Sugarcane in Costa Rica

Read csv files and apply STATIS DUAL

Tuis5_95 <- read.table("Tuis5_95.csv", header=TRUE, sep=';', dec=',')
Tuis5_96 <- read.table("Tuis5_96.csv", header=TRUE, sep=';', dec=',')
Tuis5_97 <- read.table("Tuis5_97.csv", header=TRUE, sep=';', dec=',')
Tuis5_98 <- read.table("Tuis5_98.csv", header=TRUE, sep=';', dec=',')

labels <- c("95", "96", "97", "98")

res <- statis.dual(list(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98), labels.tables = labels)

Use plot.statis.dual.circle to get the Interstructure graph

plot.statis.dual.circle(points.list = list(res$interstructure), labels = res$labels.tables) + ggplot2::ggtitle("Interstructure")

Graphic Interpretation

Each point represents a data table (that is, a group of sugarcane plants with their chemical measurements).

  • The proximity between points indicates that those tables share a similar pattern of correlations among the chemical variables.
  • If a table appears farther away, it means its correlation profile is different, possibly because those plants were under different soil, water, or management conditions.

This allows us to compare which sets of plants are more “similar” in their chemistry and which ones differ.

Use plot.statis.dual.circle to get the Correlation Circle for all variables

plot.statis.dual.circle(list(res$supervariables), labels = row.names(res$supervariables)) + ggplot2::ggtitle("Correlation (all variables)")

Graphic Interpretation

Here we can see how the 19 chemical variables are related.

  • Variables with arrows pointing in the same direction are positively correlated: they tend to increase together in the plants.
  • Variables with arrows pointing in opposite directions are negatively correlated: when one increases, the other decreases.
  • Those forming an angle close to 90° are almost independent.

Typical example in sugarcane:

  • Ca, Mg, and HCO₃ often align (positive correlation) because they originate from calcareous soil conditions.
  • PO₄ and NO₃ (fertilization nutrients) may appear correlated.
  • DO and BOD (the OD and DBO columns) reflect water quality, with specific relationships that can oppose (negative correlation) salts such as Cl or SO₄.

Use plot.statis.dual.circle to get the Correlation Circle for the selected variables

Note that you must first use the select.super.variables function to store the selected variables in a data frame, and then pass this data frame to the plot.statis.dual.circle function.

selected.variables <- c("Ph", "Temp", "DBO", "ST", "PO4", "NO3", "POD", "Cal")

superv.sel.df <- select.super.variables(res$supervariables, res$vars.names, selected.variables)

plot.statis.dual.circle(list(superv.sel.df), labels = row.names(superv.sel.df)) + ggplot2::ggtitle("Correlation (selected variables)")

Graphic Interpretation

This subplot highlights only the key variables for interpreting sugarcane physiology.

  • pH and Temp tend to show a positive correlation: this suggests that water/soil temperature and pH are related under the cultivation conditions.
  • BOD and DO are associated with organic matter and oxygen consumption: they are positively correlated, indicating a strong link between microbial metabolism and nutrient dynamics.
  • TDS, PO₄, and NO₃ reflect the influence of fertilization and mineral load: if they cluster together (positive correlation), it suggests that the plants absorb these nutrients jointly.
  • Calcium (Ca) may oppose (negative correlation) mobile nutrients like NO₃, reflecting differences between more calcareous soils and more leached soils.

Use plot.statis.dual.trajectories to get the trajectories graph for the selected variables

vars.A <- c("Ph","ST","NO3")
plot.statis.dual.trajectories(vars = vars.A, trajectories = res$trajectories, labels.tables = res$labels.tables) + ggplot2::ggtitle(sprintf("Trajectories (%s)", paste(vars.A, collapse = ", ")))


vars.B <- c("OD","DBO","PO4")
plot.statis.dual.trajectories(vars = vars.B, trajectories = res$trajectories, labels.tables = res$labels.tables) + ggplot2::ggtitle(sprintf("Trajectories (%s)", paste(vars.B, collapse = ", ")))


# To select a single variable
vars.1 <- "Temp"
plot.statis.dual.trajectories(vars = vars.1, trajectories = res$trajectories, labels.tables = res$labels.tables) + ggplot2::ggtitle(sprintf("Trajectory (%s)", vars.1))

Use plot.statis.dual.trajectories to get the trajectories graph for all variables

plot.statis.dual.trajectories(vars = res$vars.names, trajectories = res$trajectories, labels.tables = res$labels.tables)

Graphic Interpretation

Each trajectory shows how the position of a variable changes in the correlation space across the different plant tables.

  • A short and compact trajectory (e.g., Temp) indicates that this variable maintains stable relationships with the others across all plant groups.
  • A long trajectory or one with changes in direction (possibly NO₃ or PO₄ here) indicates that the role of that variable changes between tables: under some conditions it is strongly associated with productivity, while in others it is more related to water or soil quality.

In sugarcane, this can reflect differences in fertilization, soil type, or irrigation water quality.

Example 2: airquality (base R), K = 5 months

New York air quality (1973) by day, with the variables Ozone, Solar.R, Wind, Temp (Temperature).

Z is the list of tables obtained by separating the data by month (May–September ⇒ K = 5). The group sizes may differ from one month to another.

Data

vars <- c("Ozone","Solar.R","Wind","Temp")
AQ <- na.omit(airquality[, c(vars, "Month")])
Z  <- split(AQ[ , vars], AQ$Month)        # list(Z5, Z6, Z7, Z8, Z9)
names(Z) <- paste0("M", names(Z))         # "M5","M6","M7","M8","M9"
Z <- lapply(Z, as.matrix)
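A quick way to confirm the structure expected by a dual analysis (same columns, possibly different numbers of rows) is to inspect the dimensions of each monthly table. This sketch rebuilds Z with base R only.

```r
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
AQ <- na.omit(airquality[, c(vars, "Month")])
Z <- lapply(split(AQ[, vars], AQ$Month), as.matrix)

# Rows (days with complete data) per month; the columns are the same 4 variables.
sapply(Z, nrow)
sapply(Z, ncol)
```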

Apply STATIS DUAL

labels <- c("May","June","July","August","September")
res  <- statis.dual(Z, labels.tables = labels)

Interstructure graph by Month

interstructure <- list(res$interstructure)
labels <- res$labels.tables

plot.statis.dual.circle(points.list = interstructure, labels = labels) + ggplot2::ggtitle("Airquality (NY): Interstructure by Month")

Graphic Interpretation

This plot shows the relationship between the tables (in this case, each month with its observations of Ozone, Solar.R, Wind, and Temp).

  • Months that appear close together in the circle share a similar correlation structure among the variables.
  • Months that are far apart or opposite reflect different patterns in the relationships between variables.

For example, July and August are close, which indicates that these months have similar correlations among Ozone, Solar Radiation, Wind, and Temperature. May appears isolated, meaning that its correlation pattern is different (possibly because it is a cooler month with lower solar radiation).
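The closeness of months can be cross-checked outside the package by comparing the monthly correlation matrices directly. The sketch below uses a cosine similarity between matrices (helper `hs.cosine`, introduced here); it is an assumption that this resembles what the interstructure measures, since the package may use a different normalization.

```r
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
AQ <- na.omit(airquality[, c(vars, "Month")])

# One correlation matrix per month (months 5 to 9, in that order).
R <- lapply(split(AQ[, vars], AQ$Month), cor)
names(R) <- c("May", "June", "July", "August", "September")

# Cosine similarity between two correlation matrices.
hs.cosine <- function(A, B) sum(A * B) / sqrt(sum(A * A) * sum(B * B))

S <- sapply(R, function(A) sapply(R, function(B) hs.cosine(A, B)))
round(S, 3)
```

Pairs of months with values near 1 have nearly the same correlation structure among the four variables.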

Correlation Circle for all variables

plot.statis.dual.circle(list(res$supervariables), labels = row.names(res$supervariables)) + ggplot2::ggtitle("Airquality (NY): Correlation (all variables)")

Graphic Interpretation

This plot allows us to see how the original variables (Ozone, Solar.R, Wind, Temp) correlate with the main axes of the compromise.

  • Arrows pointing in the same direction indicate positively correlated variables.
  • Opposite arrows (angle close to 180°) indicate negative correlation.
  • Arrows forming an angle of ~90° indicate that the variables are nearly independent.

Typically in this dataset:

  • Temp and Solar.R tend to align, meaning a positive correlation (more radiation on warmer days).
  • Wind appears opposite (negative correlation) to Temp and Ozone: when wind is stronger, ozone levels and temperature tend to be lower.
  • Ozone usually correlates with solar radiation and temperature.
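These typical relationships can be checked on the pooled data with a plain correlation matrix. This is a base R sketch on the complete cases, independent of the STATIS DUAL results.

```r
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
AQ <- na.omit(airquality[, vars])

# Pearson correlations over all complete days, May to September.
C <- cor(AQ)
round(C, 2)

# Signs of the relationships discussed above.
sign(C["Ozone", c("Solar.R", "Wind", "Temp")])
```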

Correlation Circle with Selected Variables

selected.variables <- c("Ozone","Wind","Temp")

superv.sel.df <- select.super.variables(res$supervariables, res$vars.names, selected.variables)

plot.statis.dual.circle(list(superv.sel.df), labels = row.names(superv.sel.df)) + ggplot2::ggtitle("Airquality (NY): Correlation (selected variables)")

Graphic Interpretation

Here we focus only on these three variables to highlight the contrast:

  • Ozone and Temp are usually strongly positively correlated → ozone tends to increase on warmer days.
  • Wind points in the opposite direction (negative correlation) → confirming that wind reduces ozone accumulation in the air.

This provides a clearer way to visualize the key relationship: heat and sun increase ozone, while wind disperses it.

Variable Trajectories by Month

vars.A <- c("Ozone","Temp")
p.tray.A <- plot.statis.dual.trajectories(vars = vars.A, trajectories = res$trajectories, labels.tables = labels) + ggplot2::ggtitle(sprintf("Airquality (NY): Trajectories (%s)", paste(vars.A, collapse = ", ")))
p.tray.A


vars.B <- c("Solar.R","Wind")
p.tray.B <- plot.statis.dual.trajectories(vars = vars.B, trajectories = res$trajectories, labels.tables = labels) + ggplot2::ggtitle(sprintf("Airquality (NY): Trajectories (%s)", paste(vars.B, collapse = ", ")))
p.tray.B


vars.1 <- "Ozone"
p.tray.1 <- plot.statis.dual.trajectories(vars = vars.1, trajectories = res$trajectories, labels.tables = labels) + ggplot2::ggtitle(sprintf("Airquality (NY): Trajectory (%s)", vars.1))
p.tray.1

Trajectory of All Variables

p.tray.all <- plot.statis.dual.trajectories(vars = res$vars.names, trajectories = res$trajectories, labels.tables = res$labels.tables)
p.tray.all

Graphic Interpretation

Each trajectory plot shows how each variable moves (its correlations with the main axes) across the months from May to September.

  • A stable trajectory (points close together) indicates that the variable behaves consistently from month to month.
  • A long trajectory or one that changes direction shows that the variable’s relationship with the others changes from one month to another.

Example:

  • Ozone: usually shows strong variation (more ozone in summer, less in May/September). The trajectory can elongate, indicating that ozone’s correlations with Temp and Solar.R change throughout the summer.
  • Temp: may have a more stable trajectory, because temperature follows a gradual seasonal pattern.
  • Wind: may show changing trajectories, reflecting that in some months wind plays a larger role in dispersing ozone.
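A rough numerical counterpart of a trajectory is the month-by-month correlation of one variable with the others. The sketch below does this for Ozone in base R; it only approximates what the trajectory plot displays, since the plot positions come from the compromise space rather than from raw correlations.

```r
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
AQ <- na.omit(airquality[, c(vars, "Month")])
by.month <- split(AQ[, vars], AQ$Month)

# Correlation of Ozone with each other variable, computed separately per month.
ozone.cor <- t(sapply(by.month, function(tab) {
  c(Solar.R = cor(tab$Ozone, tab$Solar.R),
    Wind    = cor(tab$Ozone, tab$Wind),
    Temp    = cor(tab$Ozone, tab$Temp))
}))
rownames(ozone.cor) <- c("May", "June", "July", "August", "September")
round(ozone.cor, 2)
```

Columns whose values change markedly from row to row correspond to relationships that are not stable over the season.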