Fast partial least squares (PLS) for dense and out-of-core data. Provides SIMPLS (straightforward implementation of a statistically inspired modification of the PLS method) and NIPALS (non-linear iterative partial least-squares) solvers, plus kernel-style PLS variants ('kernelpls' and 'widekernelpls') with parity to 'pls'. Optimized for 'bigmemory'-backed matrices with streamed cross-products and chunked BLAS (Basic Linear Algebra Subprograms) operations (XtX/XtY and XXt/YX), optional file-backed score sinks, and deterministic testing helpers. Includes an auto-selection strategy that chooses between XtX SIMPLS, XXt (wide) SIMPLS, and NIPALS based on (n, p) and a configurable memory budget. For an overview of the package, see Bertrand and Maumy (2023) <https://hal.science/hal-05352069> and <https://hal.science/hal-05352061>, which highlight fitting and cross-validating PLS regression models on big data. For more details about some of the techniques featured in the package, see Dayal and MacGregor (1997) <doi:10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23>, Rosipal & Trejo (2001) <https://www.jmlr.org/papers/v2/rosipal01a.html>, Tenenhaus, Viennet, and Saporta (2007) <doi:10.1016/j.csda.2007.01.004>, Rosipal (2004) <doi:10.1007/978-3-540-45167-9_17>, Rosipal (2019) <https://ieeexplore.ieee.org/document/8616346>, and Song, Wang, and Bai (2024) <doi:10.1016/j.chemolab.2024.105238>. Also includes kernel logistic PLS with 'C++'-accelerated alternating iteratively reweighted least squares (IRLS) updates, streamed reproducing kernel Hilbert space (RKHS) solvers with reusable centering statistics, and bootstrap diagnostics with graphical summaries for coefficients, scores, and cross-validation workflows, alongside dedicated plotting utilities for individuals, variables, ellipses, and biplots. The streaming backend uses far less memory than the dense in-memory solvers and keeps its memory footprint bounded as the data grow.
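The idea behind streamed cross-products can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the package's code or API: the names `streamed_crossprods` and `row_chunks` are hypothetical, and the package itself works on 'bigmemory'-backed matrices with chunked BLAS calls. The sketch accumulates XtX and XtY one block of rows at a time, so only one chunk plus the p x p and p x q accumulators need to be held in memory.

```python
from typing import Iterable, Iterator, List, Tuple

Matrix = List[List[float]]


def streamed_crossprods(
    chunks: Iterable[Tuple[Matrix, Matrix]], p: int, q: int
) -> Tuple[Matrix, Matrix]:
    """Accumulate X'X (p x p) and X'Y (p x q) over row chunks.

    Hypothetical helper: each element of `chunks` is a pair
    (rows of X, matching rows of Y).
    """
    xtx = [[0.0] * p for _ in range(p)]
    xty = [[0.0] * q for _ in range(p)]
    for x_chunk, y_chunk in chunks:
        for x_row, y_row in zip(x_chunk, y_chunk):
            for i in range(p):
                xi = x_row[i]
                for j in range(p):
                    xtx[i][j] += xi * x_row[j]  # rank-one update of X'X
                for k in range(q):
                    xty[i][k] += xi * y_row[k]  # rank-one update of X'Y
    return xtx, xty


def row_chunks(
    X: Matrix, Y: Matrix, chunk_size: int
) -> Iterator[Tuple[Matrix, Matrix]]:
    """Yield consecutive row blocks; stands in for reading from disk."""
    for start in range(0, len(X), chunk_size):
        yield X[start:start + chunk_size], Y[start:start + chunk_size]


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = [[1.0], [0.0], [2.0]]
xtx, xty = streamed_crossprods(row_chunks(X, Y, chunk_size=2), p=2, q=1)
```

The chunk size changes how many rows are read at once, not the accumulated sums, which is why the footprint can stay bounded while n grows.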
For PLS1 (a single response), the streaming backend is often fast enough while preserving a small memory footprint; for PLS2 (multiple responses), it remains competitive in speed with a bounded footprint. On small problems that fit comfortably in RAM (random-access memory), the dense in-memory solvers are slightly faster; streaming becomes the better choice as n or p grows and the Gram/cross-product cost comes to dominate.
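A selection rule based on (n, p) and a memory budget could look like the sketch below. This is an assumption made for illustration, not the package's actual heuristic: the function name `choose_solver`, the returned labels, and the rule itself (compare the size of the p x p or n x n Gram matrix in double precision against the budget) are all hypothetical.

```python
def choose_solver(n: int, p: int, budget_bytes: int,
                  bytes_per_value: int = 8) -> str:
    """Pick a solver label from problem shape and memory budget.

    Hypothetical rule: prefer the smaller Gram matrix when it fits
    in the budget, otherwise fall back to NIPALS.
    """
    xtx_bytes = p * p * bytes_per_value  # size of X'X (p x p)
    xxt_bytes = n * n * bytes_per_value  # size of XX' (n x n)
    if p <= n and xtx_bytes <= budget_bytes:
        return "simpls_xtx"   # tall data: work with X'X
    if n < p and xxt_bytes <= budget_bytes:
        return "simpls_xxt"   # wide data: work with XX'
    return "nipals"           # neither Gram matrix fits the budget


tall = choose_solver(n=100_000, p=50, budget_bytes=1_000_000)
wide = choose_solver(n=200, p=50_000, budget_bytes=1_000_000)
large = choose_solver(n=100_000, p=50_000, budget_bytes=1_000_000)
```

Under these assumptions, tall data maps to the XtX route, wide data to the XXt route, and a problem that is large in both dimensions falls back to NIPALS.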
| Version: | 0.7.2 |
| Depends: | R (≥ 4.0.0) |
| Imports: | Rcpp, bigmemory |
| LinkingTo: | Rcpp, RcppArmadillo, BH, bigmemory |
| Suggests: | bench, dplyr, forcats, future, future.apply, ggplot2, knitr, pls, plsRglm, rmarkdown, RhpcBLASctl, svglite, testthat (≥ 3.0.0), tidyr, withr |
| Published: | 2025-12-01 |
| DOI: | 10.32614/CRAN.package.bigPLSR (may not be active yet) |
| Author: | Frederic Bertrand |
| Maintainer: | Frederic Bertrand <frederic.bertrand at lecnam.net> |
| BugReports: | https://github.com/fbertran/bigPLSR/issues |
| License: | GPL-3 |
| URL: | https://fbertran.github.io/bigPLSR/, https://github.com/fbertran/bigPLSR |
| NeedsCompilation: | yes |
| SystemRequirements: | C++17, Optional CBLAS (detected at compile time) |
| Classification/MSC: | 62N01, 62N02, 62N03, 62N99 |
| Citation: | bigPLSR citation info |
| Materials: | README, NEWS |
| CRAN checks: | bigPLSR results |
| Package source: | bigPLSR_0.7.2.tar.gz |
| Windows binaries: | r-devel: not available, r-release: not available, r-oldrel: not available |
| macOS binaries: | r-release (arm64): not available, r-oldrel (arm64): not available, r-release (x86_64): not available, r-oldrel (x86_64): not available |
Please use the canonical form https://CRAN.R-project.org/package=bigPLSR to link to this page.