bigPLSR: Partial Least Squares Regression Models with Big Matrices

Fast partial least squares (PLS) for dense and out-of-core data. Provides SIMPLS (straightforward implementation of a statistically inspired modification of the PLS method) and NIPALS (non-linear iterative partial least squares) solvers, plus kernel-style PLS variants ('kernelpls' and 'widekernelpls') with parity to the 'pls' package. Optimized for 'bigmemory'-backed matrices, with streamed cross-products computed through chunked BLAS (Basic Linear Algebra Subprograms) calls (XtX/XtY and XXt/YX), optional file-backed score sinks, and deterministic testing helpers. Includes an auto-selection strategy that chooses between XtX SIMPLS, XXt (wide) SIMPLS, and NIPALS based on (n, p) and a configurable memory budget. Also includes kernel logistic PLS with 'C++'-accelerated alternating iteratively reweighted least squares (IRLS) updates, streamed reproducing kernel Hilbert space (RKHS) solvers with reusable centering statistics, and bootstrap diagnostics with graphical summaries for coefficients, scores, and cross-validation workflows, alongside dedicated plotting utilities for individuals, variables, ellipses, and biplots.

For background on the package, see Bertrand and Maumy (2023) <https://hal.science/hal-05352069> and <https://hal.science/hal-05352061>, which highlight fitting and cross-validating PLS regression models on big data. For more details about some of the techniques featured in the package, see Dayal and MacGregor (1997) <doi:10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23>, Rosipal and Trejo (2001) <https://www.jmlr.org/papers/v2/rosipal01a.html>, Tenenhaus, Viennet, and Saporta (2007) <doi:10.1016/j.csda.2007.01.004>, Rosipal (2004) <doi:10.1007/978-3-540-45167-9_17>, Rosipal (2019) <https://ieeexplore.ieee.org/document/8616346>, and Song, Wang, and Bai (2024) <doi:10.1016/j.chemolab.2024.105238>.

The streaming backend uses far less memory than the dense in-memory solvers and keeps its memory use bounded as the data size grows. For PLS1 (a single response), streaming is often fast enough while preserving a small memory footprint; for PLS2 (multiple responses), it remains competitive with a bounded footprint. On small problems that fit comfortably in RAM (random-access memory), dense in-memory solvers are slightly faster; the crossover occurs as n or p grows and the Gram/cross-product cost dominates.
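
The idea behind streamed cross-products can be illustrated with a minimal sketch. It is written in plain Python rather than R or 'C++', and it is not the package's implementation: the function names `iter_row_chunks` and `streamed_crossprods` are hypothetical, and ordinary nested lists stand in for a 'bigmemory'-backed matrix. The sketch only shows that XtX and XtY can be accumulated one block of rows at a time, so the working memory depends on p and the number of responses q, not on n.

```python
from typing import Iterator, List, Tuple

Matrix = List[List[float]]


def iter_row_chunks(
    x: Matrix, y: Matrix, chunk_rows: int
) -> Iterator[Tuple[Matrix, Matrix]]:
    """Yield matching row blocks of X and Y (hypothetical helper).

    In an out-of-core setting each block would be read from disk;
    here the data are plain in-memory lists for illustration only.
    """
    for start in range(0, len(x), chunk_rows):
        yield x[start:start + chunk_rows], y[start:start + chunk_rows]


def streamed_crossprods(
    chunks: Iterator[Tuple[Matrix, Matrix]], p: int, q: int
) -> Tuple[Matrix, Matrix]:
    """Accumulate XtX (p x p) and XtY (p x q) over row blocks.

    Only the two accumulators and the current block are held at once.
    """
    xtx = [[0.0] * p for _ in range(p)]
    xty = [[0.0] * q for _ in range(p)]
    for x_chunk, y_chunk in chunks:
        for x_row, y_row in zip(x_chunk, y_chunk):
            for i in range(p):
                xi = x_row[i]
                for j in range(p):
                    xtx[i][j] += xi * x_row[j]
                for k in range(q):
                    xty[i][k] += xi * y_row[k]
    return xtx, xty


# Toy data: n = 3 observations, p = 2 predictors, q = 1 response.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = [[1.0], [0.0], [2.0]]

# Process two rows at a time; the result does not depend on the chunk size.
xtx, xty = streamed_crossprods(iter_row_chunks(X, Y, chunk_rows=2), p=2, q=1)
```

The accumulators have p x p and p x q entries regardless of how many rows are streamed, which is consistent with the bounded footprint described above. When p is much larger than n, the n x n product XXt is the smaller object, which is the situation the wide (XXt) variants are aimed at.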

Version: 0.7.2
Depends: R (≥ 4.0.0)
Imports: Rcpp, bigmemory
LinkingTo: Rcpp, RcppArmadillo, BH, bigmemory
Suggests: bench, dplyr, forcats, future, future.apply, ggplot2, knitr, pls, plsRglm, rmarkdown, RhpcBLASctl, svglite, testthat (≥ 3.0.0), tidyr, withr
Published: 2025-12-01
DOI: 10.32614/CRAN.package.bigPLSR (may not be active yet)
Author: Frederic Bertrand [cre, aut], Myriam Maumy [aut]
Maintainer: Frederic Bertrand <frederic.bertrand at lecnam.net>
BugReports: https://github.com/fbertran/bigPLSR/issues
License: GPL-3
URL: https://fbertran.github.io/bigPLSR/, https://github.com/fbertran/bigPLSR
NeedsCompilation: yes
SystemRequirements: C++17, Optional CBLAS (detected at compile time)
Classification/MSC: 62N01, 62N02, 62N03, 62N99
Citation: bigPLSR citation info
Materials: README, NEWS
CRAN checks: bigPLSR results

Documentation:

Reference manual: bigPLSR.html, bigPLSR.pdf
Vignettes: Automatic Algorithm Selection in bigPLSR (source, R code)
Streaming Kernel PLS in bigPLSR: XX^T and Column-Chunked Variants (source, R code)
Bootstrap strategies for bigPLSR (source, R code)
Cross-validation and Information Criteria in bigPLSR (source, R code)
Double RKHS PLS (rkhs_xy): Theory and Usage (source, R code)
External PLS benchmarks for bigPLSR: detailed analysis (source, R code)
Benchmarking bigPLSR against external PLS implementations (source, R code)
KF-PLS: Streaming PLS with Kalman-style updates (source, R code)
Kernel Logistic PLS (source, R code)
Kernel and Streaming PLS Methods in bigPLSR (source, R code)
Visualising PLS Fits with bigPLSR (source, R code)
Benchmarking PLS1 Implementations (source, R code)
Benchmarking PLS2 Implementations (source, R code)
RKHS-based Algorithms in bigPLSR (source, R code)

Downloads:

Package source: bigPLSR_0.7.2.tar.gz
Windows binaries: r-devel: not available, r-release: not available, r-oldrel: not available
macOS binaries: r-release (arm64): not available, r-oldrel (arm64): not available, r-release (x86_64): not available, r-oldrel (x86_64): not available

Linking:

Please use the canonical form https://CRAN.R-project.org/package=bigPLSR to link to this page.