Unified normality testing across all branches.
The two-group case (t-test vs Wilcoxon) and the multi-group case (ANOVA
vs Kruskal-Wallis) now share the same normality test:
shapiro.test() applied to the internally studentized
residuals from lm(y ~ x) via
rstandard().
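A minimal base-R sketch of this shared normality check, using the built-in PlantGrowth data purely for illustration. It is not the package's internal code; it only shows that the same call covers both the two-group and the multi-group case.

```r
# Shapiro-Wilk test on internally studentized residuals of lm(y ~ x).
# PlantGrowth ships with R: weight is numeric, group is a 3-level factor.
fit <- lm(weight ~ group, data = PlantGrowth)

# rstandard() returns the internally studentized residuals of the fit
res <- rstandard(fit)

# One normality test for the whole model, regardless of the number of groups
norm_test <- shapiro.test(res)
print(norm_test$p.value)

# The identical call works when the factor has only two levels
two_groups <- droplevels(subset(PlantGrowth, group != "trt2"))
norm_test2 <- shapiro.test(rstandard(lm(weight ~ group, data = two_groups)))
print(norm_test2$p.value)
```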
Variance homogeneity test changed from
bartlett.test() to levene.test(). The
Levene-Brown-Forsythe test (using the median) is more robust to
non-normality than Bartlett’s test. The test now determines whether
Student’s t-test (t.test(var.equal = TRUE)) or Welch’s
t-test is used in the two-group case, and whether aov() or
oneway.test() is used in the multi-group case.
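The branching described above can be sketched with base R alone. The helper `brown_forsythe()` and the threshold `alpha` are hypothetical names introduced for illustration; the package's own levene.test() may differ in signature and return value.

```r
# Brown-Forsythe variant of Levene's test: one-way ANOVA on the absolute
# deviations from each group's median.
# brown_forsythe() is a hypothetical helper, not the package's levene.test().
brown_forsythe <- function(y, g) {
  g <- as.factor(g)
  z <- abs(y - ave(y, g, FUN = median))  # deviations from group medians
  tab <- anova(lm(z ~ g))                # F test on the deviations
  list(statistic = tab[["F value"]][1], p.value = tab[["Pr(>F)"]][1])
}

alpha <- 0.05  # assumed significance level, i.e. 1 - conf.level

# Two groups: Student's t-test if variances look equal, Welch's otherwise
bf2 <- brown_forsythe(ToothGrowth$len, ToothGrowth$supp)
t_res <- t.test(len ~ supp, data = ToothGrowth,
                var.equal = bf2$p.value > alpha)

# More than two groups: aov() if variances look equal, oneway.test() otherwise
bfk <- brown_forsythe(PlantGrowth$weight, PlantGrowth$group)
k_res <- if (bfk$p.value > alpha) {
  summary(aov(weight ~ group, data = PlantGrowth))
} else {
  oneway.test(weight ~ group, data = PlantGrowth, var.equal = FALSE)
}

print(t_res$method)
```

Using the median rather than the mean as the centre is what makes the statistic less sensitive to skewed or heavy-tailed groups.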
Large-sample threshold changed from 30 to 50 per group. When every group exceeds 50 observations, normality testing is skipped and parametric tests are applied directly (justified by the central limit theorem).
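The large-sample rule amounts to a per-group count check. A small sketch follows; `skip_normality()` is a hypothetical helper written for this illustration, not a function of the package.

```r
# skip_normality() is a hypothetical helper, not part of visStatistics.
# It returns TRUE only if every non-empty group has more than `threshold`
# observations.
skip_normality <- function(g, threshold = 50) {
  counts <- table(droplevels(as.factor(g)))
  all(counts > threshold)
}

small <- ToothGrowth$supp                      # 30 observations per group
large <- rep(c("a", "b"), times = c(51, 60))   # synthetic labels, both > 50

skip_normality(small)  # FALSE: normality is still tested
skip_normality(large)  # TRUE: parametric test applied directly
```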
Post-hoc test selection. When
oneway.test() (Welch’s ANOVA) is used for unequal
variances, games.howell() is the appropriate post-hoc test,
as it does not assume equal variances. TukeyHSD() remains
for aov() (Student’s ANOVA).
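To illustrate why the Games-Howell procedure suits the unequal-variance branch, here is a base-R sketch of its usual textbook form: pairwise Welch statistics with Welch-Satterthwaite degrees of freedom, referred to the studentized range distribution. `games_howell_sketch()` is a hypothetical name; the package's games.howell() may use a different interface and output layout.

```r
# Hypothetical illustration, not the package's games.howell().
games_howell_sketch <- function(y, g) {
  grp <- split(y, droplevels(as.factor(g)))
  n <- lengths(grp)
  m <- sapply(grp, mean)
  v <- sapply(grp, var)
  k <- length(grp)
  pairs <- combn(names(grp), 2)
  out <- apply(pairs, 2, function(p) {
    i <- p[1]; j <- p[2]
    a <- v[[i]] / n[[i]]
    b <- v[[j]] / n[[j]]
    d <- m[[j]] - m[[i]]
    t_stat <- d / sqrt(a + b)                      # Welch-type statistic
    df <- (a + b)^2 / (a^2 / (n[[i]] - 1) + b^2 / (n[[j]] - 1))
    # p-value from the studentized range distribution with k means
    p_adj <- ptukey(abs(t_stat) * sqrt(2), nmeans = k, df = df,
                    lower.tail = FALSE)
    c(diff = d, df = df, p.adj = p_adj)
  })
  colnames(out) <- paste(pairs[2, ], pairs[1, ], sep = "-")
  t(out)
}

# Unequal variances: Welch's ANOVA followed by Games-Howell
welch <- oneway.test(weight ~ group, data = PlantGrowth, var.equal = FALSE)
gh <- games_howell_sketch(PlantGrowth$weight, PlantGrowth$group)

# Equal variances: Student's ANOVA followed by Tukey's HSD
tuk <- TukeyHSD(aov(weight ~ group, data = PlantGrowth))

print(gh)
```

Both procedures report the same pairwise mean differences; they differ in the standard error and degrees of freedom used for each pair.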
Rank correlation
(correlation = TRUE). Selects the most appropriate
rank correlation for the data type: Spearman’s \(\rho\) for two numeric variables, Kendall’s
\(\tau_b\) when both variables are
ordered factors. This is the only test decision not made automatically;
it requires an explicit user choice.
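The two rank correlations can be reproduced with base R's cor.test(), shown here as a sketch rather than the package's internal code. The ordered factors `dose` and `resp` are invented toy data. Converting ordered factors to their integer level codes is an assumption about how such data would be passed to cor.test(). To the best of our understanding, the tie-corrected Kendall estimate R computes corresponds to \(\tau_b\); ?cor is the place to confirm this.

```r
# Two numeric variables: Spearman's rho
sp <- cor.test(mtcars$mpg, mtcars$wt, method = "spearman", exact = FALSE)

# Two ordered factors (invented toy data): Kendall's tau on the level codes
lev_dose <- c("low", "mid", "high")
lev_resp <- c("none", "mild", "severe")
dose <- factor(c("low", "low", "mid", "mid", "high", "high", "high", "mid"),
               levels = lev_dose, ordered = TRUE)
resp <- factor(c("none", "mild", "mild", "none", "severe", "mild",
                 "severe", "severe"),
               levels = lev_resp, ordered = TRUE)

# exact = FALSE requests the normal approximation, which tolerates ties
kd <- cor.test(as.integer(dose), as.integer(resp),
               method = "kendall", exact = FALSE)

print(sp$estimate)
print(kd$estimate)
```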
Formula interface:
visstat(y ~ x, data = df) is now supported alongside the
existing visstat(x, y) and
visstat(dataframe, "namey", "namex") forms.
Ordered factor responses: When the response is
of class ordered, visstat() converts it to
numeric ranks and applies Wilcoxon or Kruskal-Wallis. When both
variables are ordered and correlation = TRUE, Kendall’s
\(\tau_b\) is used instead.
Correlation analysis: New parameter
correlation in visstat(). When set to
TRUE, selects Spearman rank correlation for numeric
variables or Kendall’s \(\tau_b\) for
two ordered factors, instead of fitting a regression model.
New exported functions:
levene.test(): Levene-Brown-Forsythe test for homogeneity of variance (center = median), mimicking the default behaviour of leveneTest() in the car package.
bp.test(): Breusch-Pagan test for heteroscedasticity in linear regression models.
games.howell(): Games-Howell post-hoc test for pairwise comparisons following Welch’s ANOVA.
vis_numeric(): Visualisation of numeric-numeric relationships (regression or correlation).
vis_group_normality(): Diagnostic plots for the Welch t-test / Welch ANOVA branch.
vis_lm_assumptions(): Renamed from vis_anova_assumptions(), now provides unified assumption diagnostics for the general linear model (t-test, ANOVA, regression).
vis_anova_assumptions() has been removed and replaced by vis_lm_assumptions(). The new function handles both ANOVA and regression diagnostics (controlled by the regression parameter). For regression, it shows a Residuals vs Leverage plot (with Cook’s distance contours) and the Breusch-Pagan test instead of the Bartlett test.
vis_anova_assumptions() is provided as a deprecated wrapper for vis_lm_assumptions(). It will be removed in a future version.
plot.visstat() method added to the visstat class.
Vignette substantially revised:
rstandard() computes internally studentized residuals, with reference to Cook and Weisberg (1982).
DESCRIPTION rewritten to reflect the updated test selection algorithm.
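For readers unfamiliar with the Breusch-Pagan test mentioned among the regression diagnostics, the following base-R sketch shows the studentized (Koenker) form: regress the squared residuals on the model's regressors and compare n times R-squared with a chi-squared distribution. `bp_sketch()` is a hypothetical helper; whether the package's bp.test() uses this variant is an assumption, not something stated in these notes.

```r
# Hypothetical illustration, not the package's bp.test().
bp_sketch <- function(fit) {
  u2 <- residuals(fit)^2          # squared residuals of the original model
  X <- model.matrix(fit)          # regressors, including the intercept
  aux <- lm.fit(X, u2)            # auxiliary regression of u2 on X
  r2 <- 1 - sum(aux$residuals^2) / sum((u2 - mean(u2))^2)
  stat <- length(u2) * r2         # LM statistic: n * R^2
  df <- ncol(X) - 1               # number of regressors without intercept
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}

fit <- lm(dist ~ speed, data = cars)
bp <- bp_sketch(fit)
print(bp)
```

A small p-value suggests that the residual variance depends on the regressors, that is, heteroscedasticity.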
The visstat() function interface has been updated to
accept two vectors directly, enabling a more concise and idiomatic
usage. For example:
visstat(trees$Girth, trees$Height)
yields the same result as the original form:
visstat(trees, "Height", "Girth")
This change aligns with standard R conventions. Both calling styles remain supported for backwards compatibility.
The visstat() function now returns an object of class "visstat", enabling consistent method dispatch.
print.visstat(): shows a concise summary.
summary.visstat(): prints the full test and post hoc summaries.
get_samples_fact_inputfile() no longer exported to NAMESPACE.
visStatistics.Rmd documenting the statistical decision logic, with reproducible examples illustrating each test case.
README.html and the @details section of the main function visstat():
t.test()) is now applied when both groups have more than 30 observations (previous threshold was 100).
conf.level rather than defaulting to 0.95.
pairwise.wilcox.test()) now uses the specified conf.level.
fisher.test() now correctly follows the expected cell count thresholds.