HFR is a regularized regression estimator that decomposes a least squares regression along a supervised hierarchical graph, and shrinks the edges of the estimated graph to regularize parameters. The algorithm leads to group shrinkage in the regression parameters and a reduction in the effective model degrees of freedom.

Usage

hfr(
  x,
  y,
  weights = NULL,
  kappa = 1,
  q = NULL,
  intercept = TRUE,
  standardize = TRUE,
  partial_method = c("pairwise", "shrinkage"),
  l2_penalty = 0,
  ...
)

Arguments

x

Input matrix or data.frame, of dimension \((N\times p)\); each row is an observation vector.

y

Response variable.

weights

An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used for the level-specific regressions.

kappa

The target effective degrees of freedom of the regression as a percentage of \(p\).

q

Thinning parameter representing the quantile cut-off (in terms of contributed variance) above which to consider levels in the hierarchy. This can be used to reduce the number of levels in high-dimensional problems. Default is no thinning.

intercept

Should an intercept be fitted. Default is intercept=TRUE.

standardize

Logical flag for x variable standardization prior to fitting the model. The coefficients are always returned on the original scale. Default is standardize=TRUE.

partial_method

Indicates whether to use pairwise partial correlations or shrinkage partial correlations.

l2_penalty

Optional penalty on the l2 norm for the level-specific regressions (useful in the high-dimensional case).

...

Additional arguments passed to hclust.

Value

An 'hfr' regression object.

Details

Shrinkage can be imposed by targeting an explicit effective degrees of freedom. Setting the argument kappa to a value between 0 and 1 controls the effective degrees of freedom of the fitted object as a percentage of \(p\). When kappa is 1, the result is equivalent to an ordinary least squares regression (no shrinkage). Conversely, kappa set to 0 represents maximum shrinkage.
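
As a minimal sketch (simulated data; the kappa values are chosen purely for illustration):

x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)
fit_ols <- hfr(x, y, kappa = 1)    # no shrinkage, equivalent to OLS
fit_mid <- hfr(x, y, kappa = 0.5)  # target 50% of p effective degrees of freedom
fit_max <- hfr(x, y, kappa = 0)    # maximum shrinkage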

When \(p > N\), kappa is a percentage of \((N - 2)\).

If no kappa is set, a linear regression with kappa = 1 is estimated.
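
The same call applies in the high-dimensional setting, but the target scales with \((N - 2)\) rather than \(p\); for instance, with \(N = 50\) and \(p = 100\), kappa = 0.5 targets roughly 0.5 * (50 - 2) = 24 effective degrees of freedom (an illustrative sketch):

x_hd <- matrix(rnorm(50 * 100), 50, 100)  # p > N
y_hd <- rnorm(50)
fit_hd <- hfr(x_hd, y_hd, kappa = 0.5)    # kappa scales (N - 2), not p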

Hierarchical clustering is performed using hclust. The default is ward.D2 clustering, but this can be overridden by passing a method argument via ....
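
For instance, to cluster with complete linkage instead (the method argument is simply forwarded to hclust; reusing x and y from the sketch above):

fit_cl <- hfr(x, y, kappa = 0.5, method = "complete")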

For high-dimensional problems, the hierarchy becomes very large. Setting q to a value below 1 reduces the number of levels used in the hierarchy. q represents a quantile-cutoff of the amount of variation contributed by the levels. The default (q = NULL) considers all levels.
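
A sketch using the high-dimensional data from above, with an illustrative cut-off that retains only levels above the 75% quantile of contributed variance:

fit_thin <- hfr(x_hd, y_hd, kappa = 0.5, q = 0.75)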

When the data exhibit multicollinearity, it can be useful to include a penalty on the l2 norm in the level-specific regressions. This can be achieved by setting the l2_penalty parameter.
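
A sketch with an illustrative penalty value (the appropriate magnitude is problem-specific):

fit_l2 <- hfr(x, y, kappa = 0.5, l2_penalty = 0.1)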

References

Pfitzinger, Johann (2024). Cluster Regularization via a Hierarchical Feature Regression. _Econometrics and Statistics_ (in press). URL https://doi.org/10.1016/j.ecosta.2024.01.003.

See also

cv.hfr, se.avg, coef, plot and predict methods

Author

Johann Pfitzinger

Examples

x <- matrix(rnorm(100 * 20), 100, 20)  # simulated predictor matrix (N = 100, p = 20)
y <- rnorm(100)                        # simulated response
fit <- hfr(x, y, kappa = 0.5)          # target 50% of p effective degrees of freedom
coef(fit)
#>  (Intercept)           V1           V2           V3           V4           V5 
#> -0.045528849 -0.044402118 -0.047856870 -0.092421394 -0.126072592 -0.009288087 
#>           V6           V7           V8           V9          V10          V11 
#>  0.011728797 -0.144757423  0.167845451  0.055608271 -0.102802516 -0.124190165 
#>          V12          V13          V14          V15          V16          V17 
#>  0.012207753  0.009962975 -0.124733367  0.055228331  0.144381079 -0.135998132 
#>          V18          V19          V20 
#>  0.039950932  0.063773122  0.009338699
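
The fitted object also works with the methods listed under See also; for instance (a sketch with output omitted, assuming predict accepts new data via a newdata argument):

predict(fit, newdata = x[1:5, ])  # predictions for the first five observations
plot(fit)                         # visualize the fitted hierarchical graph
se.avg(fit)                       # approximate standard errors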