| Title: | Enhanced Teaching Assistant (AI) for Statistical Analysis |
|---|---|
| Description: | An assistant built on large language models that helps interpret statistical model outputs in R by generating concise, audience-specific explanations. |
| Authors: | Sébastien Lê [aut, cre] (ORCID: <https://orcid.org/0000-0001-8814-6714>, Code and documentation assisted by ChatGPT.) |
| Maintainer: | Sébastien Lê <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-06-09 14:41:11 UTC |
| Source: | https://github.com/sebastien-le/entrainer |
Convert an EntraineR response to character
    ## S3 method for class 'entrainer_response'
    as.character(x, ...)

| x | Object returned by gemini_generate(). |
|---|---|
| ... | Unused. |
Generated text.
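As a rough illustration of how such a conversion method can work, the sketch below defines an S3 `as.character()` method for a stand-in class. The class name `toy_response` and its `text` field are hypothetical and only mimic the idea of a response object that stores generated text; this is not the package's own implementation.

```r
# Hypothetical stand-in for a response object: a list with a `text` field.
toy <- structure(
  list(text = "The slope is positive.", model = "demo"),
  class = "toy_response"
)

# S3 method: as.character() dispatches on the class attribute of `x`.
as.character.toy_response <- function(x, ...) {
  x$text
}

out <- as.character(toy)
print(out)
```

Returning a plain character vector keeps the result usable with `cat()`, `paste()`, or `writeLines()`.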
Monitoring data of water and air temperatures before and after riparian deforestation. Useful to illustrate linear regression with an interaction (Temp_air * Deforestation).
    data(deforestation)
A data frame with 56 rows and 3 variables:
Temp_water: numeric; water temperature (deg C).
Temp_air: numeric; air temperature (deg C).
Deforestation: factor with 2 levels, "BEFORE" and "AFTER"; 28 periods each.
Brief summary (indicative): Temp_water min ~ 0.55, median ~ 9.28, max ~ 18.89; Temp_air min ~ -3.04, median ~ 6.53, max ~ 15.75.
    data(deforestation)
    str(deforestation)
    table(deforestation$Deforestation)

    # Linear model with interaction (FactoMineR):
    fit <- FactoMineR::LinearModel(
      Temp_water ~ Temp_air * Deforestation,
      data = deforestation,
      selection = "none"
    )
    print(fit)
Extract the response from an EntraineR prompt/result object
    entrainer_response(x)

| x | Object returned by an EntraineR trainer. |
|---|---|
The stored LLM response, or NULL.
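The idea of a prompt that prints like a string yet carries a stored response can be sketched in base R with attributes. Everything below (`make_prompt()`, `get_response()`, the `toy_prompt` class, the attribute name `response`) is a hypothetical illustration, not the package's internal layout.

```r
# Hypothetical prompt object: a character string with an optional attribute.
make_prompt <- function(text, response = NULL) {
  x <- text
  class(x) <- c("toy_prompt", "character")
  attr(x, "response") <- response  # assigning NULL leaves the attribute unset
  x
}

# Extractor: returns the stored response, or NULL when none was generated.
get_response <- function(x) attr(x, "response", exact = TRUE)

p1 <- make_prompt("Interpret this output.")
p2 <- make_prompt("Interpret this output.",
                  response = "The effect is significant.")

r1 <- get_response(p1)
r2 <- get_response(p2)
```

Because the object is still a character vector underneath, `cat(p2)` prints only the prompt text, while the extractor gives access to the response.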
Minimal wrapper around the Generative Language API ':generateContent' endpoint for text prompts, with retries, exponential backoff, clearer errors, and optional output compilation (HTML/DOCX). Files are opened only when 'open = TRUE'.
    gemini_generate(
      prompt,
      model = "gemini-2.5-flash",
      api_key = Sys.getenv("GEMINI_API_KEY"),
      user_agent = NULL,
      base_url = "https://generativelanguage.googleapis.com/v1beta",
      temperature = NULL,
      top_p = NULL,
      top_k = NULL,
      max_output_tokens = NULL,
      stop_sequences = NULL,
      system_instruction = NULL,
      safety_settings = NULL,
      seed = NULL,
      timeout = 120,
      verbose = FALSE,
      max_tries = 5,
      backoff_base = 0.8,
      backoff_cap = 8,
      force_markdown = TRUE,
      compile_to = c("none", "html", "docx"),
      output_path = NULL,
      open = interactive()
    )

| prompt | Character scalar. The user prompt (plain text). |
|---|---|
| model | Character scalar. Gemini model id (e.g., "gemini-2.5-flash", "gemini-2.5-pro"). You may also pass "models/..." and it will be normalized. |
| api_key | Character scalar. API key. Defaults to env var 'GEMINI_API_KEY'. |
| user_agent | Character scalar. If NULL, a dynamic value is used. |
| base_url | Character scalar. API base URL. |
| temperature | Optional numeric in [0, 2]. |
| top_p | Optional numeric in (0, 1]. |
| top_k | Optional integer >= 1. |
| max_output_tokens | Optional integer > 0. |
| stop_sequences | Optional character vector. |
| system_instruction | Optional character scalar. |
| safety_settings | Optional list passed as-is to the API. |
| seed | Optional integer seed. |
| timeout | Numeric seconds for request timeout (default 120). |
| verbose | Logical; if TRUE, prints URL/retries. |
| max_tries | Integer. Max attempts (default 5). |
| backoff_base | Numeric. Initial backoff seconds (default 0.8). |
| backoff_cap | Numeric. Max backoff seconds (default 8). |
| force_markdown | Logical. If TRUE, instructs the model to answer in Markdown. |
| compile_to | Character scalar. One of c("none","html","docx"). |
| output_path | Optional character scalar. Destination file for HTML/DOCX output. If NULL, a temporary file is created. |
| open | Logical; if TRUE, open the generated HTML/DOCX file. Defaults to 'interactive()'. |
An object of class 'entrainer_response' with a stable structure. The generated text is available in '$text'/'$markdown'; 'html_path' or 'docx_path' are populated when 'compile_to' is '"html"' or '"docx"'.
This function sends 'prompt' to the Google Generative Language API. Do not include confidential data unless this is intended and allowed in your context.
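To make the retry arguments more concrete, here is a minimal base R sketch of retrying with capped exponential backoff. It assumes the delay doubles after each failed attempt, starting at `backoff_base` and never exceeding `backoff_cap`; the documentation only states the base and the cap, so the doubling schedule, the absence of jitter, and the helper names `backoff_delay()` and `with_retries()` are assumptions made for illustration.

```r
# Assumed schedule: base * 2^(attempt - 1), capped at `cap` seconds.
backoff_delay <- function(attempt, base = 0.8, cap = 8) {
  pmin(cap, base * 2^(attempt - 1))
}

# Hypothetical retry loop: call `fn` until it succeeds or tries run out.
with_retries <- function(fn, max_tries = 5, base = 0.8, cap = 8,
                         sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (attempt < max_tries) sleep(backoff_delay(attempt, base, cap))
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

delays <- backoff_delay(1:5)  # 0.8 1.6 3.2 6.4 8.0

# Demo: a function that fails twice, then succeeds (no real waiting).
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary failure")
  "ok"
}
res <- with_retries(flaky, sleep = function(seconds) invisible(NULL))
```

Capping the delay keeps the worst-case wait bounded: under these assumptions, five attempts with the defaults wait at most 0.8 + 1.6 + 3.2 + 6.4 = 12 seconds in total between tries.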
Sensory profile of hams (quantitative attributes) and an overall liking score. Useful to illustrate multiple regression and the joint reading of per-term F tests and coefficient T tests.
    data(ham)
A data frame with 21 rows (hams) and 15 variables:
14 sensory attributes, all numeric (the panel rated Juiciness, Crispy, Tenderness, Pasty, Fibrous, Salty, Sweet, Meaty, Seasoned, Metallic, Ammoniated, Fatty, Braised, and Lactic).
Overall liking: numeric; overall acceptability score.
Brief summary (indicative): median Juiciness ~ 3.0; median Tenderness ~ 6.0; mean Salty ~ 5.52; median Overall liking ~ 6.5.
    data(ham)
    summary(ham)

    # Multiple regression without selection (FactoMineR):
    fit <- FactoMineR::LinearModel(
      `Overall liking` ~ .,
      data = ham,
      selection = "none"
    )
    print(fit)
Chick weights measured under three brooding temperatures, with sex recorded. Useful for ANOVA and linear models with categorical factors.
    data(poussin)
A data frame with 45 rows and 3 variables:
Temperature: factor with 3 levels, "T1", "T2", "T3" (15 each).
Gender: factor with 2 levels, "Female" and "Male" (about 20 and 25).
Weight: numeric; weight (units as provided).
Brief summary (indicative): Weight min ~ 15, median ~ 23, max ~ 33.
    data(poussin)
    with(poussin, table(Temperature, Gender))
    boxplot(Weight ~ Temperature, data = poussin,
            main = "Poussin weight by temperature")

    # Two-factor ANOVA (base stats):
    fit <- stats::aov(Weight ~ Temperature * Gender, data = poussin)
    summary(fit)
Backward-compatible print method name
    ## S3 method for class 'entrainer_llm_result'
    print(x, ...)

| x | Object returned by older EntraineR generators. |
|---|---|
| ... | Unused. |
Invisibly returns x.
Print an EntraineR prompt/result compactly
    ## S3 method for class 'entrainer_prompt'
    print(x, ...)

| x | Object returned by an EntraineR trainer. |
|---|---|
| ... | Unused. |
Invisibly returns x.
Print an EntraineR response compactly
    ## S3 method for class 'entrainer_response'
    print(x, ...)

| x | Object returned by gemini_generate(). |
|---|---|
| ... | Unused. |
Invisibly returns x.
Builds an English-only, audience-tailored prompt to interpret an ANOVA produced by FactoMineR::AovSum. The function never invents numbers: it only passes verbatim excerpts to the LLM and instructs how to interpret deviations (sum-to-zero coding) as performance drivers.
    trainer_aovsum(
      x,
      introduction = NULL,
      alpha = 0.05,
      t_test = NULL,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

    trainer_AovSum(aovsum_obj, ...)

| x | An object whose printed output contains sections named "Ftest" and "Ttest" (e.g., the result of FactoMineR::AovSum()). |
|---|---|
| introduction | Optional character context paragraph for the analysis. Defaults to a generic description. |
| alpha | Numeric significance level used as an instruction for the LLM. Default 0.05. |
| t_test | Optional character vector to filter the T-test section by factor names and/or interactions (e.g. "Factor A" or "Factor A:Factor B"). |
| audience | Target audience, one of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a compact 3-bullet executive summary. |
| llm_model | Character model name for the generator (e.g., "llama3"). |
| generate | Logical; if TRUE, calls trainer_core_generate_or_return(). |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
| aovsum_obj | Deprecated name for 'x'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    ## Not run:
    # Example 1: SensoMineR chocolates (requires SensoMineR)
    if (requireNamespace("SensoMineR", quietly = TRUE)) {
      # Load data from SensoMineR
      data("chocolates", package = "SensoMineR")

      # ANOVA summary with Product and Panelist
      res <- FactoMineR::AovSum(Granular ~ Product * Panelist, data = sensochoc)

      intro <- "Six chocolates have been evaluated by a sensory panel, during two days,
      according to a sensory attribute: granular. The panel has been trained according
      to this attribute and panellists should be reproducible when rating this attribute."
      intro <- gsub("\n", " ", intro)
      intro <- gsub("\\s+", " ", intro)
      cat(intro)

      prompt <- trainer_aovsum(res, audience = "beginner",
                               t_test = c("Product", "Panelist"),
                               introduction = intro)
      cat(prompt)

      res <- gemini_generate(prompt, compile_to = "html")
    }

    # Example 2: Poussin dataset (shipped with this package)
    data(poussin)

    intro <- "For incubation, 45 chicken eggs were randomly assigned to three batches of 15.
    Three treatments (different incubation temperatures) were then applied to the batches.
    We assume that after hatching, all chicks were raised under identical conditions and
    then weighed at a standard reference age. At that time, the sex of the chicks - a factor
    known beforehand to cause significant weight differences - could also be observed.
    The objective is to choose the treatment that maximizes chick weight."
    intro <- gsub("\n", " ", intro)
    intro <- gsub("\\s+", " ", intro)
    cat(intro)

    res <- FactoMineR::AovSum(Weight ~ Gender * Temperature, data = poussin)

    prompt <- trainer_aovsum(res, audience = "beginner",
                             t_test = c("Gender", "Temperature"),
                             introduction = intro)
    cat(prompt)

    res <- gemini_generate(prompt, compile_to = "html")
    ## End(Not run)
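Since the prompt asks the LLM to read coefficients as deviations under sum-to-zero coding, a small base R sketch may help show what that coding means. It uses `stats::lm()` with `contr.sum` on the built-in `PlantGrowth` data as a stand-in; it does not call FactoMineR, and the choice of dataset is purely illustrative.

```r
data(PlantGrowth)

# One-way model with sum-to-zero (deviation) contrasts on the factor.
fit <- lm(weight ~ group, data = PlantGrowth,
          contrasts = list(group = "contr.sum"))
co <- coef(fit)

# Group means and their deviations from the mean of the group means.
group_means <- sapply(split(PlantGrowth$weight, PlantGrowth$group), mean)
grand_mean  <- mean(group_means)
deviations  <- group_means - grand_mean

# The intercept is the grand mean; group1 and group2 are the deviations of
# the first two levels; the last level's deviation is minus their sum.
last_dev <- -sum(co[c("group1", "group2")])

print(round(co, 3))
print(round(deviations, 3))
```

Under this coding a positive coefficient means the level sits above the overall average and a negative one below it, which is why the deviations can be read directly as "drivers" of the response.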
Builds a clear, audience-tailored prompt to interpret base R stats::chisq.test() results, handling both goodness-of-fit and contingency-table tests. Aligned with other EntraineR trainers: no invented numbers; audience-specific guidance.
    trainer_chisq_test(
      csq_obj,
      introduction = NULL,
      alpha = 0.05,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

| csq_obj | An htest object returned by stats::chisq.test(). |
|---|---|
| introduction | Optional character string giving the study context. |
| alpha | Numeric significance level (default 0.05). |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a 3-bullet executive summary regardless of audience depth (uses trainer_core_summary_only_block()). |
| llm_model | Character; model name for the generator (default "llama3"). |
| generate | Logical; if TRUE, call the generator and return prompt + response. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    # GOF
    set.seed(1); x <- c(18, 22, 20, 25, 15)
    csq1 <- chisq.test(x, p = rep(1/5, 5))
    cat(trainer_chisq_test(csq1, audience = "beginner"))

    # Contingency (independence)
    tbl <- matrix(c(12, 5, 7, 9), nrow = 2)
    csq2 <- chisq.test(tbl)  # Yates for 2x2 by default
    cat(trainer_chisq_test(csq2, audience = "applied"))
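One plausible way to tell the two kinds of chi-squared test apart is to inspect the `htest` object itself. The sketch below relies only on fields that `stats::chisq.test()` returns (`observed`, `method`, `parameter`); the helper `chisq_flavor()` is hypothetical, and how the trainer actually makes this distinction is not documented here.

```r
x <- c(18, 22, 20, 25, 15)
csq1 <- chisq.test(x, p = rep(1/5, 5))   # goodness of fit

tbl <- matrix(c(12, 5, 7, 9), nrow = 2)
csq2 <- chisq.test(tbl)                  # 2x2 contingency table

# Hypothetical helper: a matrix of observed counts signals a contingency
# table; a plain vector signals a goodness-of-fit test.
chisq_flavor <- function(csq) {
  if (is.matrix(csq$observed)) "contingency table" else "goodness of fit"
}

flavor1 <- chisq_flavor(csq1)
flavor2 <- chisq_flavor(csq2)

# The method string records whether Yates' correction was applied.
uses_yates <- grepl("Yates", csq2$method)
```

The degrees of freedom differ accordingly: 4 for five categories with given probabilities, and 1 for the 2x2 table.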
Builds a clear, audience-tailored prompt to interpret stats::cor.test() results for Pearson, Spearman, or Kendall correlation. Supports three audiences ("beginner", "applied", "advanced") and an optional summary_only mode.
    trainer_cor_test(
      ct_obj,
      introduction = NULL,
      alpha = 0.05,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

| ct_obj | An htest object returned by stats::cor.test(). |
|---|---|
| introduction | Optional character string giving the study context. |
| alpha | Numeric significance level (default 0.05). |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a 3-bullet executive summary regardless of audience depth (uses trainer_core_summary_only_block()). |
| llm_model | Character; model name for the generator (default "llama3"). |
| generate | Logical; if TRUE, call the generator and return prompt + response. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    set.seed(1)
    x <- rnorm(30); y <- 0.5*x + rnorm(30, sd = 0.8)
    ct <- cor.test(x, y, method = "pearson")
    cat(trainer_cor_test(ct, audience = "applied", summary_only = FALSE))
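The "no invented numbers" principle can be approximated in base R by capturing the printed test output verbatim and pasting it into the prompt text. The sketch below is only an illustration of that idea; the instruction wording and the variable names are assumptions, not the text the trainer actually produces.

```r
set.seed(1)
x <- rnorm(30); y <- 0.5 * x + rnorm(30, sd = 0.8)
ct <- cor.test(x, y, method = "pearson")

# Capture exactly what print(ct) would show, line by line.
verbatim <- utils::capture.output(print(ct))

# Hypothetical prompt: instructions followed by the untouched output.
prompt <- paste(
  c("Interpret the correlation test below for an applied audience.",
    "Use only the numbers printed below; do not compute new ones.",
    "",
    verbatim),
  collapse = "\n"
)

cat(prompt)
```

Because the numbers reach the model as printed text, any figure in the generated explanation can be checked against the original output.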
These aliases are kept for backward compatibility only. Prefer the explicit snake_case API: 'trainer_chisq_test()', 'trainer_cor_test()', 'trainer_prop_test()', and 'trainer_var_test()'.
    trainer_chisq(...)
    trainer_cor(...)
    trainer_prop(...)
    trainer_var(...)

| ... | Passed to the corresponding trainer function. |
|---|---|
Same return value as the corresponding trainer.
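A backward-compatible alias of this kind is typically just a thin wrapper that forwards its arguments. The sketch below shows the pattern with two hypothetical functions, `describe_test()` and `describe()`; whether the package's aliases also emit a deprecation message is not stated, so none is shown.

```r
# Hypothetical "new" function with an explicit signature.
describe_test <- function(x, audience = "beginner") {
  paste0("[", audience, "] ", x)
}

# Hypothetical legacy alias: forwards everything unchanged via `...`.
describe <- function(...) describe_test(...)

new_out <- describe_test("chi-squared result", audience = "applied")
old_out <- describe("chi-squared result", audience = "applied")
```

Forwarding through `...` means the alias never needs updating when the main function gains arguments.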
Builds an English-only, audience-tailored prompt to interpret a FactoMineR::LinearModel result. Handles model selection (AIC/BIC) and instructs how to interpret deviation contrasts (sum-to-zero) for factors. Works for ANOVA, ANCOVA, and multiple regression.
    trainer_linear_model(
      x,
      introduction = NULL,
      alpha = 0.05,
      t_test = NULL,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

    trainer_LinearModel(lm_obj, ...)

| x | An object returned by FactoMineR::LinearModel(...). |
|---|---|
| introduction | Optional character string giving the study context. |
| alpha | Numeric significance level (default 0.05). |
| t_test | Optional character vector to filter the T-test section by factor names and/or interactions (e.g. "FactorA" or "FactorA:FactorB"). |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a 3-bullet executive summary. |
| llm_model | Character model name for the generator (e.g., "llama3"). |
| generate | Logical; if TRUE, call the generator. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
| lm_obj | Deprecated name for 'x'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    # --- Example 1: multiple regression with selection (ham) -------------------
    data(ham)
    if (requireNamespace("FactoMineR", quietly = TRUE)) {
      intro_ham <- "A sensory analysis institute wants to know if it's possible to predict
      the overall liking of a ham from its sensory description. A trained panel used the
      following attributes to describe 21 hams: Juiciness, Crispy, Tenderness, Pasty,
      Fibrous, Salty, Sweet, Meaty, Seasoned, Metallic, Ammoniated, Fatty, Braised, Lactic.
      Afterward, an Overall Liking score was assigned to each of the hams."
      # collapse whitespace safely without extra packages
      intro_ham <- gsub("\n", " ", intro_ham)
      intro_ham <- gsub("\\s+", " ", intro_ham)

      res <- FactoMineR::LinearModel(`Overall liking` ~ ., data = ham, selection = "bic")
      pr <- trainer_linear_model(res, introduction = intro_ham,
                                 audience = "advanced", generate = FALSE)
      cat(pr)
    }

    # --- Example 2: interaction with a categorical factor (deforestation) ------
    data(deforestation)
    if (requireNamespace("FactoMineR", quietly = TRUE)) {
      intro_flume <- "The study's goal is to determine how river deforestation affects the
      relationship between water and air temperature. The dataset contains maximum air and
      water temperatures measured over 28 ten-day periods before deforestation and 28 periods
      after deforestation. The main objective is to understand if and how the link between
      air and water temperature changes after deforestation."
      intro_flume <- gsub("\n", " ", intro_flume)
      intro_flume <- gsub("\\s+", " ", intro_flume)

      res <- FactoMineR::LinearModel(Temp_water ~ Temp_air * Deforestation,
                                     data = deforestation, selection = "none")
      pr <- trainer_linear_model(res, introduction = intro_flume,
                                 audience = "advanced", generate = FALSE)
      cat(pr)
    }
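For readers unfamiliar with AIC/BIC selection, the following base R sketch shows the general idea with `stats::step()` on the built-in `mtcars` data. It is an analogue only: it does not reproduce how FactoMineR::LinearModel performs its selection, and the starting formula is an arbitrary choice for illustration.

```r
data(mtcars)

# Full model with four candidate predictors (arbitrary illustrative choice).
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
n <- nrow(mtcars)

# In step(), the penalty per parameter is `k`: 2 gives AIC, log(n) gives BIC.
sel_aic <- step(full, direction = "backward", k = 2,      trace = 0)
sel_bic <- step(full, direction = "backward", k = log(n), trace = 0)

kept_aic <- attr(terms(sel_aic), "term.labels")
kept_bic <- attr(terms(sel_bic), "term.labels")

print(kept_aic)
print(kept_bic)
```

Because log(n) exceeds 2 once n is larger than about 7, BIC penalizes each extra term more heavily than AIC and tends to favor smaller models.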
Builds an English-only, audience-tailored prompt to name and justify a
Multiple Correspondence Analysis (MCA) dimension from a FactoMineR::MCA
object. The function never invents numbers: it passes verbatim excerpts from
summary(mca_obj) and FactoMineR::dimdesc() filtered at a given
significance threshold proba, and instructs how to read and name the axis.
    trainer_mca(
      x,
      dimension = 1L,
      proba = 0.05,
      introduction = NULL,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

    trainer_MCA(mca_obj, ...)

| x | An MCA object returned by FactoMineR::MCA(). |
|---|---|
| dimension | Integer scalar; the dimension (component) to name (default 1). |
| proba | Numeric in (0,1]; significance threshold used by FactoMineR::dimdesc(). |
| introduction | Optional character string giving the study context. Defaults to a generic description. |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a compact 3-bullet executive summary. |
| llm_model | Character; model name for your generator backend (default "llama3"). |
| generate | Logical; if TRUE, calls the generator. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
| mca_obj | Deprecated name for 'x'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    ## Not run:
    # Example: tea (FactoMineR)
    if (requireNamespace("FactoMineR", quietly = TRUE)) {
      data(tea, package = "FactoMineR")
      res_mca <- FactoMineR::MCA(tea, quanti.sup = 19, quali.sup = 20:36, graph = FALSE)

      intro <- "A survey on tea consumption practices and contexts was summarized by MCA."
      intro <- gsub("\n", " ", intro); intro <- gsub("\\s+", " ", intro)

      # Applied audience
      prompt <- trainer_mca(res_mca, dimension = 1, proba = 0.01,
                            introduction = intro, audience = "applied",
                            generate = FALSE)
      cat(prompt)

      res <- gemini_generate(prompt, compile_to = "html")
    }
    ## End(Not run)
Builds an English-only, audience-tailored prompt to name and justify a principal component (dimension) from a FactoMineR::PCA object. The function never invents numbers: it passes verbatim excerpts from 'summary(pca_obj)' (Individuals/Variables) and 'FactoMineR::dimdesc()' filtered at a given significance threshold 'proba', and instructs how to read and name the axis.
    trainer_pca(
      x,
      dimension = 1L,
      proba = 0.05,
      introduction = NULL,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

    trainer_PCA(pca_obj, ...)

| x | A PCA object returned by FactoMineR::PCA(). |
|---|---|
| dimension | Integer scalar; the dimension (component) to name (default 1). |
| proba | Numeric in (0,1]; significance threshold used by FactoMineR::dimdesc(). |
| introduction | Optional character string giving the study context. Defaults to a generic description. |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a compact 3-bullet executive summary. |
| llm_model | Character; model name for your generator backend (default "llama3"). |
| generate | Logical; if TRUE, calls the generator. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
| pca_obj | Deprecated name for 'x'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    ## Not run:
    # Example: decathlon (FactoMineR)
    if (requireNamespace("FactoMineR", quietly = TRUE)) {
      data(decathlon, package = "FactoMineR")
      res_pca <- FactoMineR::PCA(decathlon, quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

      intro <- "A study was conducted on decathlon athletes. Performances on each event
      were measured and summarized by PCA."
      intro <- gsub("\n", " ", intro); intro <- gsub("\\s+", " ", intro)

      prompt <- trainer_pca(res_pca, dimension = 1, proba = 0.05,
                            introduction = intro, audience = "applied",
                            generate = FALSE)
      cat(prompt)

      res <- gemini_generate(prompt, compile_to = "html")
    }
    ## End(Not run)
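To give a feel for what filtering a dimension description at a threshold `proba` involves, here is a base R sketch in the spirit of `FactoMineR::dimdesc()` for quantitative variables. It uses `stats::prcomp()` and `stats::cor.test()` on the built-in `USArrests` data; it is an approximation for illustration, not the FactoMineR algorithm, and the sign of a principal component is arbitrary.

```r
data(USArrests)

# Standardized PCA; scores on the first component.
pca <- prcomp(USArrests, scale. = TRUE)
scores <- pca$x[, 1]
proba <- 0.05

# Correlate each variable with the component and record the p-value.
desc <- t(sapply(USArrests, function(v) {
  ct <- cor.test(v, scores)
  c(correlation = unname(ct$estimate), p.value = ct$p.value)
}))

# Keep variables significant at `proba`, sorted by correlation.
desc <- desc[desc[, "p.value"] < proba, , drop = FALSE]
desc <- desc[order(-desc[, "correlation"]), , drop = FALSE]

print(round(desc, 4))
```

Variables at the top and bottom of such a table are the ones that give an axis its meaning, which is the material a naming prompt works from.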
Builds a clear, audience-tailored prompt to interpret stats::prop.test() results (one-sample vs target p, two-sample equality, k-group equality, or k-group vs given p). Aligned with other EntraineR trainers: no invented numbers; audience-specific guidance.
    trainer_prop_test(
      pt_obj,
      introduction = NULL,
      alpha = 0.05,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

| pt_obj | An htest object returned by stats::prop.test(). |
|---|---|
| introduction | Optional character string giving the study context. |
| alpha | Numeric significance level (default 0.05). |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a 3-bullet executive summary regardless of audience depth (uses trainer_core_summary_only_block()). |
| llm_model | Character; model name for the generator (default "llama3"). |
| generate | Logical; if TRUE, call the generator and return prompt + response. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    # One-sample
    pt1 <- prop.test(x = 56, n = 100, p = 0.5)
    cat(trainer_prop_test(pt1, audience = "beginner"))

    # Two-sample
    pt2 <- prop.test(x = c(42, 35), n = c(100, 90))
    cat(trainer_prop_test(pt2, audience = "applied", summary_only = TRUE))
Builds a clear, audience-tailored prompt to interpret a base R stats::t.test() result.
Identifies the test flavor (One-sample, Two-sample, Paired, Welch) and instructs the LLM
to use ONLY printed values (p, t, df, CI, estimates) and avoid any new calculations.
    trainer_t_test(
      tt_obj,
      introduction = NULL,
      alpha = 0.05,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

| tt_obj | An htest object returned by stats::t.test(). |
|---|---|
| introduction | Optional character string giving the study context in plain English. |
| alpha | Numeric significance level used for interpretation (default 0.05). |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a 3-bullet executive summary. |
| llm_model | Character; model name passed to your generator (default "llama3"). |
| generate | Logical; if TRUE, call the generator. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    set.seed(1)
    tt1 <- t.test(rnorm(20, 0.1), mu = 0)  # one-sample
    cat(trainer_t_test(tt1, audience = "beginner"))

    x <- rnorm(18, 0); y <- rnorm(20, 0.3)
    tt2 <- t.test(x, y, var.equal = FALSE)  # two-sample Welch
    cat(trainer_t_test(tt2, audience = "applied", summary_only = TRUE))
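One way to identify the test flavor is to look at the `method` string that `stats::t.test()` stores in its result. The helper `t_flavor()` below is a hypothetical sketch of that approach; the trainer's actual detection logic is not documented here.

```r
# Hypothetical helper: classify a t.test() result from its method string.
t_flavor <- function(tt) {
  m <- tt$method
  if (grepl("Paired", m)) {
    "paired"
  } else if (grepl("One Sample", m)) {
    "one-sample"
  } else if (grepl("Welch", m)) {
    "two-sample (Welch)"
  } else {
    "two-sample (pooled variance)"
  }
}

set.seed(1)
a <- rnorm(20, 0.1); b <- rnorm(20, 0.3)

f_one    <- t_flavor(t.test(a, mu = 0))
f_welch  <- t_flavor(t.test(a, b))
f_pooled <- t_flavor(t.test(a, b, var.equal = TRUE))
f_paired <- t_flavor(t.test(a, b, paired = TRUE))
```

Knowing the flavor matters for interpretation: the Welch test reports fractional degrees of freedom, and the paired test describes a mean of differences rather than a difference of means.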
Builds a clear, audience-tailored prompt to interpret a base R stats::var.test() result.
    trainer_var_test(
      vt_obj,
      introduction = NULL,
      alpha = 0.05,
      audience = c("beginner", "applied", "advanced"),
      summary_only = FALSE,
      llm_model = "llama3",
      generate = FALSE,
      llm_engine = c("ollama", "gemini", "none"),
      ...
    )

| vt_obj | An htest object returned by stats::var.test(). |
|---|---|
| introduction | Optional character string giving the study context. |
| alpha | Numeric significance level (default 0.05). |
| audience | One of c("beginner","applied","advanced"). |
| summary_only | Logical; if TRUE, return a 3-bullet executive summary regardless of audience depth (uses trainer_core_summary_only_block()). |
| llm_model | Character; model name for the generator (default "llama3"). |
| generate | Logical; if TRUE, call the generator and return prompt + response. |
| llm_engine | Character; backend engine: "ollama", "gemini", or "none". |
| ... | Passed to the selected LLM backend when 'generate = TRUE'. |
An 'entrainer_prompt' object. It behaves like a character string for 'cat()'/printing and stores LLM metadata and response as attributes when 'generate = TRUE'.
If 'generate = TRUE' and 'llm_engine' is not '"none"', the prompt is sent to the selected LLM backend. With external providers such as Gemini, this may include excerpts of statistical outputs and user-provided context.
    set.seed(1)
    x <- rnorm(25, sd = 1.0); y <- rnorm(30, sd = 1.3)
    vt <- var.test(x, y)
    cat(trainer_var_test(vt, audience = "applied"))
    cat(trainer_var_test(vt, audience = "advanced", summary_only = TRUE))
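As a reminder of what the printed values of `stats::var.test()` represent, the sketch below recomputes the F statistic and its degrees of freedom by hand for the same simulated data. It is a worked check in base R only and does not involve the trainer.

```r
set.seed(1)
x <- rnorm(25, sd = 1.0); y <- rnorm(30, sd = 1.3)
vt <- var.test(x, y)

# With the default hypothesized ratio of 1, F is the ratio of sample variances.
f_by_hand <- var(x) / var(y)

# Degrees of freedom are the sample sizes minus one.
df_by_hand <- c(length(x) - 1, length(y) - 1)

print(c(reported = unname(vt$statistic), by_hand = f_by_hand))
print(vt$parameter)
```

An F value below 1 indicates that the first sample has the smaller variance; the confidence interval in the output applies to this same ratio.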