HFR is a regularized regression estimator that decomposes a least squares regression along a supervised hierarchical graph, and shrinks the edges of the estimated graph to regularize parameters. The algorithm leads to group shrinkage in the regression parameters and a reduction in the effective model degrees of freedom.

Usage

cv.hfr(
  x,
  y,
  weights = NULL,
  kappa = seq(0, 1, by = 0.1),
  q = NULL,
  intercept = TRUE,
  standardize = TRUE,
  nfolds = 10,
  foldid = NULL,
  partial_method = c("pairwise", "shrinkage"),
  l2_penalty = 0,
  ...
)

Arguments

x

Input matrix or data.frame, of dimension \((N\times p)\); each row is an observation vector.

y

Response variable.

weights

An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used for the level-specific regressions.

kappa

A vector of target effective degrees of freedom of the regression. The default grid, seq(0, 1, by = 0.1), spans the interval from 0 to 1.

q

Thinning parameter representing the quantile cut-off (in terms of contributed variance) above which to consider levels in the hierarchy. This can be used to reduce the number of levels in high-dimensional problems. Default is no thinning.

intercept

Should an intercept be fitted? Default is intercept=TRUE.

standardize

Logical flag for x variable standardization prior to fitting the model. The coefficients are always returned on the original scale. Default is standardize=TRUE.

nfolds

The number of folds for k-fold cross validation. Default is nfolds=10.

foldid

An optional vector of values between 1 and nfolds identifying which fold each observation belongs to. If supplied, nfolds can be missing.

partial_method

Whether to use pairwise partial correlations ("pairwise") or shrinkage partial correlations ("shrinkage").

l2_penalty

Optional penalty for level-specific regressions (useful in the high-dimensional case).

...

Additional arguments passed to hclust.

Value

A 'cv.hfr' regression object.

Details

This function fits an HFR to a grid of kappa hyperparameter values. The result is a matrix of coefficients with one column for each value in the grid. Evaluating all values in a single function call speeds up the cross-validation procedure substantially, since the level-specific regressions are estimated only once.
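As a rough illustration of the grid fit described above, the sketch below fits a coarser kappa grid and inspects the shape of the coefficient matrix. The seed, the variable names (kappa_grid, b) and the requireNamespace guard are illustrative assumptions and not part of the package documentation; the cv.hfr call and coef method are the ones listed on this page.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
n <- 100
p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# A coarser grid than the default, to keep the output small
kappa_grid <- seq(0, 1, by = 0.25)

# Only run the fit when the hfr package is installed
if (requireNamespace("hfr", quietly = TRUE)) {
  fit <- hfr::cv.hfr(x, y, kappa = kappa_grid)
  b <- coef(fit)
  # Per the description above: one column per kappa value in the grid
  print(dim(b))
}
```

If the fit behaves as described, the number of columns of b should equal length(kappa_grid).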

When nfolds > 1, cross-validation is performed on shuffled data. Alternatively, test slices can be passed to the function using the foldid argument. The result of the cross-validation is given by best_kappa in the output object.
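One possible way to supply custom test slices is to build the foldid vector by hand, as in the sketch below. It draws balanced random fold labels with base R and passes them to cv.hfr. The seed, the choice of 5 folds, and accessing best_kappa with the $ operator are assumptions made for illustration; the documentation above only states that best_kappa is part of the output object.

```r
set.seed(42)  # assumption: seed added only to make the sketch reproducible
n <- 100
p <- 20
nfolds <- 5
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# Balanced random fold labels in 1..nfolds, one label per observation
foldid <- sample(rep(seq_len(nfolds), length.out = n))
print(table(foldid))

# Only run the fit when the hfr package is installed
if (requireNamespace("hfr", quietly = TRUE)) {
  fit <- hfr::cv.hfr(x, y, kappa = seq(0, 1, by = 0.1), foldid = foldid)
  # Assumption: the selected value is stored as a list element named best_kappa
  print(fit$best_kappa)
}
```

Building foldid explicitly makes the fold assignment reproducible and allows the same folds to be reused when comparing against other estimators.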

References

Pfitzinger, Johann (2024). Cluster Regularization via a Hierarchical Feature Regression. _Econometrics and Statistics_ (in press). URL https://doi.org/10.1016/j.ecosta.2024.01.003.

See also

hfr, coef, plot and predict methods

Author

Johann Pfitzinger

Examples

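# Simulate 100 observations of 20 predictors and an unrelated response
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
# Cross-validate over an 11-point kappa grid
fit = cv.hfr(x, y, kappa = seq(0, 1, by = 0.1))
# Coefficient matrix with one column (s1 to s11) per kappa value
coef(fit)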
#>                        s1           s2           s3            s4           s5
#> (Intercept) -6.513300e-03  0.004424144  0.001414333 -0.0009662082 -0.001769557
#> V1           4.105750e-09  0.025055671 -0.002506424 -0.0155201046 -0.018773187
#> V2          -3.599427e-09 -0.035760516 -0.044661249 -0.0517428983 -0.057890643
#> V3          -4.194673e-09 -0.042116781 -0.052599577 -0.0523164708 -0.046914950
#> V4          -4.290485e-09 -0.057353963 -0.091465608 -0.0978371857 -0.097473308
#> V5           4.290625e-09  0.025636044  0.024558659  0.0327080488  0.033026981
#> V6          -3.776804e-09 -0.037621347 -0.046985238 -0.0510931197 -0.054571958
#> V7          -4.072737e-09 -0.040609966 -0.050717719 -0.0545243406 -0.058876777
#> V8          -4.390128e-09 -0.026613412  0.002662252  0.0100788419  0.006776848
#> V9           4.024101e-09  0.071225939  0.103287508  0.1109869630  0.111927547
#> V10         -3.850645e-09 -0.038356886 -0.047903851 -0.0520920460 -0.053405652
#> V11         -3.708088e-09 -0.037231204 -0.046497988 -0.0462477223 -0.044455759
#> V12         -3.651565e-09 -0.036410390 -0.045472875 -0.0488858445 -0.049654564
#> V13         -4.462772e-09 -0.044499075 -0.055574821 -0.0597459923 -0.058590136
#> V14          4.175371e-09  0.073376452  0.106406051  0.1169081467  0.121215023
#> V15         -3.970051e-09 -0.052800086 -0.084203281 -0.0933701270 -0.097025477
#> V16         -4.239701e-09 -0.042120703 -0.052604475 -0.0530436555 -0.050943424
#> V17         -4.055030e-09 -0.024581236  0.002458965  0.0192377218  0.027660931
#> V18         -4.265462e-09 -0.025857673  0.002586652  0.0097926336  0.006584406
#> V19          4.212937e-09  0.025171725  0.024113853  0.0321156411  0.031820120
#> V20          4.254742e-09  0.075308029  0.109207106  0.1165667870  0.115978572
#>                       s6           s7           s8           s9          s10
#> (Intercept) -0.002149329 -0.002658990 -0.003168651 -0.003678312 -0.004187974
#> V1          -0.020054996 -0.020951491 -0.021847986 -0.022744480 -0.023640975
#> V2          -0.063050170 -0.067759024 -0.072467878 -0.077176732 -0.081885586
#> V3          -0.040303293 -0.034079987 -0.027856680 -0.021633373 -0.015410067
#> V4          -0.097094363 -0.096859832 -0.096625301 -0.096390770 -0.096156239
#> V5           0.034348144  0.036090127  0.037832109  0.039574092  0.041316074
#> V6          -0.058970601 -0.062894621 -0.066818641 -0.070742662 -0.074666682
#> V7          -0.065831174 -0.072374832 -0.078918489 -0.085462147 -0.092005804
#> V8           0.004506273  0.005269777  0.006033281  0.006796785  0.007560289
#> V9           0.112929444  0.114084110  0.115238776  0.116393442  0.117548108
#> V10         -0.053327597 -0.053043440 -0.052759282 -0.052475124 -0.052190966
#> V11         -0.044654621 -0.044444357 -0.044234094 -0.044023830 -0.043813566
#> V12         -0.049517851 -0.049629665 -0.049741479 -0.049853293 -0.049965107
#> V13         -0.054036262 -0.049253604 -0.044470947 -0.039688289 -0.034905631
#> V14          0.125150603  0.129024915  0.132899227  0.136773540  0.140647852
#> V15         -0.099760197 -0.102091384 -0.104422571 -0.106753759 -0.109084946
#> V16         -0.050006469 -0.049327889 -0.048649310 -0.047970731 -0.047292151
#> V17          0.033505917  0.038710534  0.043915152  0.049119770  0.054324387
#> V18          0.003607763 -0.002195856 -0.007999474 -0.013803093 -0.019606712
#> V19          0.031679941  0.030575928  0.029471915  0.028367902  0.027263889
#> V20          0.115008480  0.114442837  0.113877195  0.113311552  0.112745909
#>                      s11
#> (Intercept) -0.004697653
#> V1          -0.024537419
#> V2          -0.086594381
#> V3          -0.009186821
#> V4          -0.095921738
#> V5           0.043058117
#> V6          -0.078590641
#> V7          -0.098549408
#> V8           0.008324208
#> V9           0.118702805
#> V10         -0.051906786
#> V11         -0.043603251
#> V12         -0.050076960
#> V13         -0.030122951
#> V14          0.144522166
#> V15         -0.111416087
#> V16         -0.046613613
#> V17          0.059528917
#> V18         -0.025410714
#> V19          0.026159747
#> V20          0.112180333