% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/plausible_check.R
\name{plausible_check}
\alias{plausible_check}
\title{Perform Plausibility Checks on Data Frame Columns}
\usage{
plausible_check(
  S_data,
  M_data,
  Result = FALSE,
  show_column = NULL,
  date_parser_fun = smart_to_gregorian_vec,
  var_select = "all",
  verbose = FALSE
)
}
\arguments{
\item{S_data}{data.frame. The source data in which rules will be evaluated. Each column may be referenced by the rules.}

\item{M_data}{data.frame. Metadata describing variables and their plausibility rules. Must include at least the columns \code{VARIABLE}, \code{Plausible_Rule}, \code{TYPE}, and \code{Plausible_Error_Type}.}

\item{Result}{logical (default: \code{FALSE}). If \code{TRUE}, returns row-by-row evaluation results for each rule. If \code{FALSE}, returns a summary table for each rule.}

\item{show_column}{character vector (default: \code{NULL}). Names of columns from \code{S_data} to include in the result when \code{Result = TRUE}. Ignored otherwise.}

\item{date_parser_fun}{function (default: \code{smart_to_gregorian_vec}). Function used to convert date values and date literals to \code{Date} class, including conversion of Persian-calendar dates to Gregorian. Must accept character vectors and return \code{Date} objects.}

\item{var_select}{character, numeric, or \code{"all"} (default: \code{"all"}). Subset of variables (rules) to check. Can be a character vector of variable names, a numeric vector of row indices in \code{M_data}, or \code{"all"} to run all rules.}

\item{verbose}{logical (default: \code{FALSE}). If \code{TRUE}, prints diagnostic messages during rule processing and evaluation.}
}
\value{
If \code{Result = FALSE}: a data.frame summary with columns:
\itemize{
  \item VARIABLE: Name of the variable/rule.
  \item Condition_Met: Number of rows where the rule is TRUE.
  \item Condition_Not_Met: Number of rows where the rule is FALSE.
  \item NA_Count: Number of rows with missing/indeterminate result.
  \item Total_Applicable: Number of non-NA rows.
  \item Total_Rows: Number of total rows.
  \item Percent_Met: Percentage of applicable rows meeting the condition.
  \item Percent_Not_Met: Percentage of applicable rows not meeting the condition.
  \item Plausible_Error_Type: Error type from metadata (if available).
}

If \code{Result = TRUE}: a data.frame with one column per rule/variable, each holding \code{TRUE}, \code{FALSE}, or \code{NA} for every row, plus any columns from \code{S_data} listed in \code{show_column}.
}
\description{
Evaluates plausibility rules on the columns of a data frame.
Logical and clinical rules are checked against the data according to metadata specifications.
Supports flexible rule definitions, range checks on variables, and customizable output.
}
\details{
The metadata data.frame (\code{M_data}) must contain at least the following columns:
\itemize{
  \item \strong{VARIABLE}: The name of the variable in \code{S_data} to which the rule applies.
  \item \strong{Plausible_Rule}: The logical rule (as a string) to be evaluated for each row.
  \item \strong{TYPE}: The expected type of the variable (e.g., "numeric", "date", "character").
  \item \strong{Plausible_Error_Type}: The error type of each rule, reported in the summary output. Depending on the importance and severity of the rule, it takes one of two values: "Warning" or "Error".
}

For each variable described in \code{M_data}, the function:
\itemize{
  \item Replaces any instance of the string "val" in the rule with the actual column name of the variable.
  \item Parses and detects any date literals in the rule and substitutes them with placeholders; these placeholders are converted to Date class using the provided \code{date_parser_fun}.
  \item Automatically converts any referenced data columns to the appropriate type (numeric, date, or character) based on the \code{TYPE} column in the metadata.
  \item Detects which columns from \code{S_data} are referenced in each rule and ensures they are available and correctly typed before evaluation.
  \item Evaluates the rule for each row of \code{S_data}, using vectorized evaluation for performance where possible, and falling back to row-wise evaluation if necessary (e.g., for rules that are not vectorizable, such as those using \code{ifelse} with NA logic).
}

The function supports flexible rule definitions, including conditions involving multiple columns, and custom logic using R expressions.

If \code{Result = FALSE}, the function returns a summary table for each rule, including counts and percentages of rows that meet or do not meet the condition, as well as the error type from the metadata if present.

If \code{Result = TRUE}, the function returns a data.frame with one column per rule/variable, each containing logical values (\code{TRUE}, \code{FALSE}, or \code{NA}) for every row, plus any extra columns from \code{S_data} listed in \code{show_column}.
}
\examples{
# Source data
S_data <- data.frame(
  National_code = c("123", "1456", "789", "545", "4454", "554"),
  LastName = c("Aliyar", "Johnson", "Williams", "Brown", "Jones", "Garcia"),
  VisitDate = c("2025-09-23", "2021-01-10", "2021-01-03",
                "1404-06-28", "1404-07-28", NA),
  Test_date = c("1404-07-01", "2021-01-09", "2021-01-14",
                "1404-06-29", "2025-09-19", NA),
  Certificate_validity = c("2025-09-21", "2025-01-12", "2025-02-11",
                           "1403-06-28", "2025-09-19", NA),
  DiastolicBP = c(110, NA, 145, 125, 114, NA),
  SystolicBP = c(125, 150, NA, 110, 100, NA),
  Prescription_drug = c("Atorvastatin", "Metformin", "Amlodipine",
                        "Omeprazole", "Aspirin", "Metoprolol"),
  Blood_type = c("A-", "B+", "AB", "A+", "O-", "O+"),
  stringsAsFactors = FALSE
)

# Metadata
M_data <- data.frame(
  VARIABLE = c("National_code", "Certificate_validity", "VisitDate",
               "Test_date","LastName","DiastolicBP","SystolicBP",
               "Prescription_drug","Blood_type"),
  Plausible_Rule = c(
    "val<=123",
    "",
    "",
    "",
    "",
    "val < 40 | val > 145",
    "val < 50 | val > 230",
    "",
    ""),
  TYPE = c("numeric", "date", "date", "date", "character",
           "numeric", "numeric", "character", "character"),
  Plausible_Error_Type = c("warning", NA, "Error", "warning", NA,
                           "warning", "warning", NA, "Error"),
  stringsAsFactors = FALSE
)

# Row-by-row results, with National_code shown alongside each rule's outcome
result <- plausible_check(
  S_data = S_data,
  M_data = M_data,
  Result = TRUE,
  show_column = c("National_code")
)

print(result)
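
# A minimal base-R sketch of how a rule such as "val < 40 | val > 145" could
# be evaluated for one column. It is an illustration under stated assumptions,
# not the package's implementation: all *_sketch names are hypothetical, and
# the substitution is a plain fixed-string replacement of "val" by the column
# name, followed by one vectorized evaluation inside the data frame.
df_sketch <- data.frame(DiastolicBP = c(110, NA, 145, 125, 114, NA))
rule_sketch <- "val < 40 | val > 145"
expr_sketch <- gsub("val", "DiastolicBP", rule_sketch, fixed = TRUE)
flag_sketch <- eval(parse(text = expr_sketch), envir = df_sketch)
# flag_sketch is FALSE, NA, FALSE, FALSE, FALSE, NA: rows with a missing
# value give NA rather than TRUE or FALSE.
print(flag_sketch)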

# Summary table restricted to the two blood-pressure rules
result <- plausible_check(
  S_data = S_data,
  M_data = M_data,
  Result = FALSE,
  var_select = c("DiastolicBP", "SystolicBP")
)

print(result)
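
# A minimal sketch of how the summary columns described in the Value section
# relate to one another, computed by hand from a logical vector of rule
# results. It only illustrates the documented column definitions: all
# *_sketch names are hypothetical and the package's own computation may differ.
res_sketch <- c(TRUE, FALSE, NA, TRUE, FALSE, FALSE)
met_sketch <- sum(res_sketch, na.rm = TRUE)        # rows where the rule is TRUE
not_met_sketch <- sum(!res_sketch, na.rm = TRUE)   # rows where the rule is FALSE
na_sketch <- sum(is.na(res_sketch))                # rows with an NA result
applicable_sketch <- met_sketch + not_met_sketch   # non-NA rows
summary_sketch <- data.frame(
  Condition_Met = met_sketch,
  Condition_Not_Met = not_met_sketch,
  NA_Count = na_sketch,
  Total_Applicable = applicable_sketch,
  Total_Rows = length(res_sketch),
  Percent_Met = 100 * met_sketch / applicable_sketch,
  Percent_Not_Met = 100 * not_met_sketch / applicable_sketch
)
print(summary_sketch)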

}
