| Type: | Package |
| Version: | 0.1.4 |
| Title: | Distributed Laplace Factor Model |
| Description: | Distributed estimation methods based on a Laplace factor model for estimating factor loadings and specific variances. The philosophy of the package is described in Guangbao Guo (2022) <doi:10.1007/s00180-022-01270-z>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Imports: | stats, MASS, LaplacesDemon, matrixcalc, relliptical, LFM |
| NeedsCompilation: | no |
| Language: | en-US |
| Author: | Guangbao Guo [aut, cre], Siqi Liu [aut] |
| Depends: | R (≥ 3.5.0) |
| BuildManual: | yes |
| Suggests: | testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Maintainer: | Guangbao Guo <ggb11111111@163.com> |
| Repository: | CRAN |
| Date/Publication: | 2025-12-06 12:20:06 UTC |
| Packaged: | 2025-12-05 11:47:49 UTC; R7000 |
Distributed general unilateral loading principal component
Description
Distributed general unilateral loading principal component
Usage
DGulPC(data, m, n1, K)
Arguments
data |
The full data set. |
m |
The number of principal components. |
n1 |
The length of each data subset. |
K |
The number of nodes. |
Value
AU1,AU2,DU3,Shat
Examples
library(LFM)
data_a <- Wine
DGulPC(data_a, m = 3, n1 = 128, K = 2)
Distributed Incremental Principal Component Analysis (DIPC)
Description
Apply IPC in a distributed manner across K nodes.
Usage
DIPC(data, m, eta, K)
Arguments
data |
Matrix of input data (n × p). |
m |
Number of principal components. |
eta |
Proportion of initial batch to total data within each node. |
K |
Number of nodes (distributed splits). |
Value
List with per-node results and aggregated averages.
Examples
library(LFM)
data_a <- Wine
m <- 5
eta <- 0.8
K <- 2
results <- DIPC(data_a, m, eta, K)
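To make the roles of `K` and `eta` concrete, the following base-R sketch splits row indices into `K` contiguous node blocks and then divides each block into an initial batch and a remainder to be processed incrementally. The helper name `split_nodes_batches` and the contiguous, near-equal splitting rule are assumptions made for illustration; this manual does not say how `DIPC` assigns rows to nodes.

```r
# Hypothetical helper, not part of the package.
split_nodes_batches <- function(n, K, eta) {
  # Assign row indices 1..n to K contiguous, near-equal node blocks (assumption).
  node_id <- rep(seq_len(K), each = ceiling(n / K), length.out = n)
  lapply(split(seq_len(n), node_id), function(idx) {
    n0 <- max(1L, floor(eta * length(idx)))  # size of the initial batch
    list(initial = idx[seq_len(n0)],         # rows used for the initial fit
         incremental = idx[-seq_len(n0)])    # rows fed in afterwards
  })
}

parts <- split_nodes_batches(n = 100, K = 2, eta = 0.8)
lengths(parts[[1]])  # sizes of the initial and incremental parts on node 1
```

With `eta = 0.8`, roughly 80% of the rows held by each node form the initial batch and the rest arrive incrementally.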
Distributed principal component
Description
Distributed principal component
Usage
DPC(data, m, n1, K)
Arguments
data |
The full data set. |
m |
The number of principal components. |
n1 |
The length of each data subset. |
K |
The number of nodes. |
Value
Ahat,Dhat,Sigmahathat
Examples
library(LFM)
data_a <- Wine
DPC(data_a, m = 3, n1 = 128, K = 2)
Distributed projection principal component
Description
Distributed projection principal component
Usage
DPPC(data, m, n1, K)
Arguments
data |
The full data set. |
m |
The number of principal components. |
n1 |
The length of each data subset. |
K |
The number of nodes. |
Value
Apro,pro,Sigmahathatpro
Examples
library(LFM)
data_a <- Wine
DPPC(data_a, m = 3, n1 = 128, K = 2)
The distributed stochastic approximation principal component for handling online data sets with highly correlated data across multiple nodes.
Description
The distributed stochastic approximation principal component for handling online data sets with highly correlated data across multiple nodes.
Usage
DSAPC(data, m, eta, n1, K)
Arguments
data |
A highly correlated online data set. |
m |
The number of principal components. |
eta |
The proportion of online data to total data. |
n1 |
The length of each data subset. |
K |
The number of nodes. |
Value
Asa, Dsa (lists containing results from each node)
Examples
library(LFM)
data_a <- Wine
DSAPC(data = data_a, m = 3, eta = 0.8, n1 = 128, K = 2)
Distributed Factor Model Testing with Wald, GRS, PY tests and FDR control
Description
Performs comprehensive factor model testing in distributed environment across multiple nodes, including joint tests (Wald, GRS, PY), individual asset t-tests, and False Discovery Rate control.
Usage
Dfactor.tests(ret, fac, n1, K, q.fdr = 0.05)
Arguments
ret |
A T × N matrix representing the excess returns of N assets at T time points. |
fac |
A T × k matrix representing the returns of k factors at T time points. Here k is the number of factors and is unrelated to the argument K. |
n1 |
The number of assets allocated to each node. |
K |
The number of nodes. |
q.fdr |
The significance level for FDR (False Discovery Rate) testing, defaulting to 5%. |
Value
A list containing the following components:
alpha_list |
List of alpha vectors from each node |
tstat_list |
List of t-statistics from each node |
pval_list |
List of p-values from each node |
Wald_list |
List of Wald test statistics from each node |
p_Wald_list |
List of p-values for Wald tests from each node |
GRS_list |
List of GRS test statistics from each node |
p_GRS_list |
List of p-values for GRS tests from each node |
PY_list |
List of Pesaran and Yamagata test statistics from each node |
p_PY_list |
List of p-values for PY tests from each node |
reject_fdr_list |
List of logical vectors indicating significant assets after FDR correction from each node |
power_proxy_list |
List of number of significant assets after FDR correction from each node |
combined_alpha |
Combined alpha vector from all nodes |
combined_pval |
Combined p-value vector from all nodes |
combined_reject_fdr |
Combined FDR rejection vector from all nodes |
total_power_proxy |
Total number of significant assets across all nodes after FDR correction |
Examples
set.seed(42)
T <- 120
N <- 100 # Larger dataset for distributed testing
K_factors <- 3
fac <- matrix(rnorm(T * K_factors), T, K_factors)
beta <- matrix(rnorm(N * K_factors), N, K_factors)
alpha <- rep(0, N)
alpha[1:10] <- 0.4 / 100 # 10 non-zero alphas
eps <- matrix(rnorm(T * N, sd = 0.02), T, N)
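# Add alpha[j] to every row of column j (a bare `alpha + ...` would recycle down the rows)
ret <- matrix(alpha, T, N, byrow = TRUE) + fac %*% t(beta) + eps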
# Distributed testing with 4 nodes, each handling 25 assets
results <- Dfactor.tests(ret, fac, n1 = 25, K = 4, q.fdr = 0.05)
# View combined results
cat("Total significant assets after FDR:", results$total_power_proxy, "\n")
cat("Combined results across all nodes:\n")
print(summary(results$combined_alpha))
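The arguments `n1` and `K` describe a partition of the N asset columns into `K` blocks of `n1` assets each. The base-R sketch below illustrates that bookkeeping under two assumptions that this manual does not confirm: nodes hold contiguous column blocks, and the combined rejection vector comes from a Benjamini-Hochberg adjustment of the pooled p-values. The variable names are illustrative and are not taken from the package.

```r
set.seed(42)
n_obs <- 120; n_assets <- 100; n_fac <- 3
n1 <- 25; K <- 4
fac <- matrix(rnorm(n_obs * n_fac), n_obs, n_fac)
ret <- fac %*% matrix(rnorm(n_fac * n_assets), n_fac, n_assets) +
  matrix(rnorm(n_obs * n_assets, sd = 0.02), n_obs, n_assets)

# Contiguous column blocks: node k holds assets ((k - 1) * n1 + 1):(k * n1).
blocks <- split(seq_len(n1 * K), rep(seq_len(K), each = n1))

# Per-node intercept (alpha) p-values from asset-by-asset OLS regressions.
node_pvals <- lapply(blocks, function(cols) {
  apply(ret[, cols, drop = FALSE], 2, function(y) {
    summary(lm(y ~ fac))$coefficients["(Intercept)", "Pr(>|t|)"]
  })
})

# Pool the p-values in the original asset order and apply BH at level 0.05.
combined_pval <- unlist(node_pvals, use.names = FALSE)
reject <- p.adjust(combined_pval, method = "BH") <= 0.05
sum(reject)  # number of assets flagged after FDR control
```

Because every simulated alpha is zero here, any flagged asset is a false discovery; the sketch only shows how per-node results can be stitched back together.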
Apply the FanPC method to the Laplace factor model
Description
This function performs Factor Analysis via Principal Component (FanPC) on a given data set. It calculates the estimated factor loading matrix (AF), specific variance matrix (DF), and the mean squared errors.
Usage
FanPC(data, m)
Arguments
data |
A matrix of input data. |
m |
The number of principal components. |
Value
AF,DF,SigmahatF
Examples
library(LaplacesDemon)
library(MASS)
n=1000
p=10
m=5
mu=t(matrix(rep(runif(p,0,1000),n),p,n))
mu0=as.matrix(runif(m,0))
sigma0=diag(runif(m,1))
F=matrix(mvrnorm(n,mu0,sigma0),nrow=n)
A=matrix(runif(p*m,-1,1),nrow=p)
lanor <- rlaplace(n*p,0,1)
epsilon=matrix(lanor,nrow=n)
D=diag(t(epsilon)%*%epsilon)
data=mu+F%*%t(A)+epsilon
results <- FanPC(data, m)
print(results)
Apply the Farmtest method to the Laplace factor model
Description
This function applies the FarmTest for multiple hypothesis testing to simulated or observed data from a Laplace factor model. It calculates the false discovery rate (FDR) and power of the test.
Usage
Ftest(
data,
p1,
alpha = 0.05,
K = -1,
alternative = c("two.sided", "less", "greater")
)
Arguments
data |
A matrix or data frame of simulated or observed data from a Laplace factor model. |
p1 |
The number or proportion of non-zero hypotheses. |
alpha |
The significance level for controlling the false discovery rate (default: 0.05). |
K |
The number of factors to estimate (default: -1, meaning auto-detect). |
alternative |
The alternative hypothesis: "two.sided", "less", or "greater" (default: "two.sided"). |
Value
A list containing the following elements:
FDR |
The false discovery rate, which is the proportion of false positives among all discoveries (rejected hypotheses). |
Power |
The statistical power of the test, which is the probability of correctly rejecting a false null hypothesis. |
PValues |
A vector of p-values associated with each hypothesis test. |
RejectedHypotheses |
The total number of hypotheses that were rejected by the FarmTest. |
reject |
Indices of rejected hypotheses. |
means |
Estimated means. |
Examples
library(LaplacesDemon)
library(MASS)
n=1000
p=10
m=5
mu=t(matrix(rep(runif(p,0,1000),n),p,n))
mu0=as.matrix(runif(m,0))
sigma0=diag(runif(m,1))
F=matrix(mvrnorm(n,mu0,sigma0),nrow=n)
A=matrix(runif(p*m,-1,1),nrow=p)
lanor <- rlaplace(n*p,0,1)
epsilon=matrix(lanor,nrow=n)
D=diag(t(epsilon)%*%epsilon)
data=mu+F%*%t(A)+epsilon
p1=40
results <- Ftest(data, p1)
print(results$FDR)
print(results$Power)
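The `FDR` and `Power` entries follow the usual definitions, which can be checked by hand in a simulation where the truly non-null hypotheses are known. The sketch below computes the empirical false discovery proportion and the empirical power from a set of rejected indices. The helper `fdr_power` is hypothetical, and the choice of which indices count as non-null is an assumption for illustration; this manual does not state how `Ftest` identifies them.

```r
# Hypothetical helper, not part of the package.
fdr_power <- function(reject, true_signal) {
  tp <- length(intersect(reject, true_signal))  # correct rejections
  fp <- length(setdiff(reject, true_signal))    # false rejections
  list(
    FDR = if (length(reject) > 0) fp / length(reject) else 0,
    Power = tp / length(true_signal)
  )
}

# Ten hypotheses, the first four truly non-null;
# a test rejects hypotheses 1, 2, 3 and 7.
res <- fdr_power(reject = c(1, 2, 3, 7), true_signal = 1:4)
res$FDR    # 1 false rejection out of 4 rejections
res$Power  # 3 of the 4 true signals detected
```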
General unilateral loading principal component
Description
General unilateral loading principal component
Usage
GulPC(data, m)
Arguments
data |
The full data set. |
m |
The number of first-layer principal components. |
Value
AU1,AU2,DU3,SigmaUhat
Examples
library(LFM)
data_a <- Wine
GulPC(data = data_a, m = 5)
Generate Laplace factor models
Description
Generates data from a Laplace factor model. The following distribution types are supported:
- 'truncated_laplace': Truncated Laplace distribution
- 'log_laplace': Univariate Symmetric Log-Laplace distribution
- 'Asymmetric Log_Laplace': Log-Laplace distribution
- 'Skew-Laplace': Skew-Laplace distribution
Usage
LFM(n, p, m, distribution_type)
Arguments
n |
An integer specifying the sample size. |
p |
An integer specifying the sample dimensionality or the number of variables. |
m |
An integer specifying the number of factors in the model. |
distribution_type |
A character string indicating the type of distribution to use for generating the data. |
Value
A list containing the following elements:
data |
A numeric matrix of the generated data. |
A |
A numeric matrix representing the factor loadings. |
D |
A numeric matrix representing the uniquenesses, which is a diagonal matrix. |
Examples
library(MASS)
library(matrixcalc)
library(relliptical)
n <- 1000
p <- 10
m <- 5
sigma1 <- 1
sigma2 <- matrix(c(1,0.7,0.7,1), 2, 2)
distribution_type <- "truncated_laplace"
results <- LFM(n, p, m, distribution_type)
print(results)
Principal component
Description
Principal component
Usage
PC(data, m)
Arguments
data |
The full data set. |
m |
The number of principal components. |
Value
Ahat, Dhat, Sigmahat
Examples
library(LFM)
data_a <- Wine
PC(data_a, m = 5)
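For orientation, the classical principal component estimator of a factor model can be written in a few lines of base R. This sketch assumes the textbook construction: loadings from the leading `m` eigenpairs of the sample correlation matrix, and specific variances from the residual diagonal. Whether `PC` works from the correlation or the covariance matrix, or applies further scaling, is not stated in this manual, so the helper `pc_sketch` is illustrative only.

```r
# Illustrative helper, not the package implementation.
pc_sketch <- function(X, m) {
  R <- cor(X)                                  # sample correlation matrix (assumption)
  eig <- eigen(R, symmetric = TRUE)
  lambda <- eig$values[seq_len(m)]             # leading m eigenvalues
  Q <- eig$vectors[, seq_len(m), drop = FALSE] # matching eigenvectors
  Ahat <- Q %*% diag(sqrt(lambda), nrow = m)   # p x m loading matrix
  Dhat <- diag(diag(R - Ahat %*% t(Ahat)))     # diagonal specific variances
  Sigmahat <- Ahat %*% t(Ahat) + Dhat          # implied correlation matrix
  list(Ahat = Ahat, Dhat = Dhat, Sigmahat = Sigmahat)
}

set.seed(1)
X <- matrix(rnorm(200 * 6), 200, 6)
fit <- pc_sketch(X, m = 2)
dim(fit$Ahat)
```

By construction the diagonal of `Sigmahat` reproduces the unit diagonal of the correlation matrix, which is a quick sanity check on the decomposition.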
Projection principal component
Description
Projection principal component
Usage
PPC(data, m)
Arguments
data |
The full data set. |
m |
The number of principal components. |
Value
Apro, Dpro, Sigmahatpro
Examples
library(LFM)
data_a <- Wine
PPC(data = data_a, m = 5)
The stochastic approximation principal component for online data sets with highly correlated data
Description
The stochastic approximation principal component can handle online data sets with highly correlated data.
Usage
SAPC(data, m, eta)
Arguments
data |
A highly correlated online data set. |
m |
The number of principal components. |
eta |
The proportion of online data to total data. |
Value
Asa,Dsa
Examples
library(LFM)
data_a <- Wine
SAPC(data = data_a, m = 3, eta = 0.8)
Factor Model Testing with Wald, GRS, PY tests and FDR control
Description
Performs comprehensive factor model testing including joint tests (Wald, GRS, PY), individual asset t-tests, and False Discovery Rate control.
Usage
factor.tests(ret, fac, q.fdr = 0.05)
Arguments
ret |
A T × N matrix representing the excess returns of N assets at T time points. |
fac |
A T × K matrix representing the returns of K factors at T time points. |
q.fdr |
The significance level for FDR (False Discovery Rate) testing, defaulting to 5%. |
Value
A list containing the following components:
alpha |
N-vector of estimated alphas for each asset |
tstat |
N-vector of t-statistics for testing individual alphas |
pval |
N-vector of p-values for individual alpha tests |
Wald |
Wald test statistic for joint alpha significance |
p_Wald |
p-value for Wald test |
GRS |
GRS test statistic (finite-sample F-test) |
p_GRS |
p-value for GRS test |
PY |
Pesaran and Yamagata test statistic |
p_PY |
p-value for PY test |
reject_fdr |
Logical vector indicating which assets have significant alphas after FDR correction |
fdr_p |
Adjusted p-values using Benjamini-Hochberg procedure |
power_proxy |
Number of significant assets after FDR correction |
Examples
set.seed(42)
T <- 120
N <- 25
K <- 3
fac <- matrix(rnorm(T * K), T, K)
beta <- matrix(rnorm(N * K), N, K)
alpha <- rep(0, N)
alpha[1:3] <- 0.4 / 100 # 3 non-zero alphas
eps <- matrix(rnorm(T * N, sd = 0.02), T, N)
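# Add alpha[j] to every row of column j (a bare `alpha + ...` would recycle down the rows)
ret <- matrix(alpha, T, N, byrow = TRUE) + fac %*% t(beta) + eps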
results <- factor.tests(ret, fac, q.fdr = 0.05)
# View results
cat("Wald test p-value:", results$p_Wald, "\n")
cat("GRS test p-value:", results$p_GRS, "\n")
cat("PY test p-value:", results$p_PY, "\n")
cat("Significant assets after FDR:", results$power_proxy, "\n")
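As a cross-check on the `GRS` and `p_GRS` entries, the Gibbons-Ross-Shanken statistic can be computed directly in base R. The sketch below uses the standard finite-sample form with maximum-likelihood covariance estimates (divisor T), which follows an F distribution with N and T - N - k degrees of freedom under the null of zero alphas. The helper name `grs_sketch` is hypothetical, and this manual does not state whether `factor.tests` uses exactly this scaling.

```r
# Hypothetical helper, not part of the package.
grs_sketch <- function(ret, fac) {
  n_obs <- nrow(ret); n_assets <- ncol(ret); n_fac <- ncol(fac)
  X <- cbind(1, fac)                              # intercept plus factors
  B <- solve(crossprod(X), crossprod(X, ret))     # OLS coefficients, one column per asset
  alpha <- B[1, ]                                 # estimated intercepts
  resid <- ret - X %*% B
  Sigma <- crossprod(resid) / n_obs               # ML residual covariance
  mu <- colMeans(fac)
  Omega <- crossprod(sweep(fac, 2, mu)) / n_obs   # ML factor covariance
  quad <- sum(alpha * solve(Sigma, alpha))        # alpha' Sigma^{-1} alpha
  scale <- 1 + sum(mu * solve(Omega, mu))         # 1 + mu' Omega^{-1} mu
  df2 <- n_obs - n_assets - n_fac
  stat <- df2 / n_assets * quad / scale
  list(GRS = stat, p_GRS = pf(stat, n_assets, df2, lower.tail = FALSE))
}

set.seed(42)
fac_demo <- matrix(rnorm(120 * 3), 120, 3)
ret_demo <- fac_demo %*% matrix(rnorm(3 * 25), 3, 25) +
  matrix(rnorm(120 * 25, sd = 0.02), 120, 25)
out <- grs_sketch(ret_demo, fac_demo)
out$p_GRS
```

The statistic requires T > N + k so that the residual covariance is invertible and the second degrees-of-freedom term is positive.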