Fit a hierarchical feature regression

HFR is a regularized regression estimator that decomposes a least squares regression along a supervised hierarchical graph, and shrinks the edges of the estimated graph to regularize parameters. The algorithm leads to group shrinkage in the regression parameters and a reduction in the effective model degrees of freedom.

Usage

hfr(
  x,
  y,
  weights = NULL,
  kappa = 1,
  q = NULL,
  intercept = TRUE,
  standardize = TRUE,
  partial_method = c("pairwise", "shrinkage"),
  l2_penalty = 0,
  ...
)

Arguments

x: Input matrix or data.frame, of dimension \((N\times p)\); each row is an observation vector.
y: Response variable.
weights: An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used for the level-specific regressions.
kappa: The target effective degrees of freedom of the regression as a proportion of \(p\) (a value between 0 and 1). Default is kappa=1.
q: Thinning parameter representing the quantile cut-off (in terms of contributed variance) above which to consider levels in the hierarchy. This can be used to reduce the number of levels in high-dimensional problems. Default is no thinning.
intercept: Logical flag indicating whether an intercept should be fitted. Default is intercept=TRUE.
standardize: Logical flag for x variable standardization prior to fitting the model. The coefficients are always returned on the original scale. Default is standardize=TRUE.
partial_method: Indicate whether to use pairwise partial correlations, or shrinkage partial correlations.
l2_penalty: Optional penalty for the level-specific regressions (useful in the high-dimensional case). Default is l2_penalty=0.
...: Additional arguments passed to hclust.

Value

An 'hfr' regression object.

Details

Shrinkage is imposed by targeting an explicit number of effective degrees of freedom. Setting the argument kappa to a value between 0 and 1 controls the effective degrees of freedom of the fitted object as a proportion of \(p\). When kappa is 1, the result is equivalent to that of an ordinary least squares regression (no shrinkage). Conversely, kappa set to 0 represents maximum shrinkage.

When \(p > N\), kappa is instead a proportion of \((N - 2)\).
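The two rules above can be written out as a small piece of arithmetic. The helper target_df below is hypothetical (it is not part of the hfr package) and only restates the rule as described here; it makes no claim about how the package counts the intercept internally.

```r
# Hypothetical helper, not part of the hfr package: the effective degrees of
# freedom targeted by a given kappa, following the two rules stated above.
target_df = function(kappa, N, p) {
  stopifnot(kappa >= 0, kappa <= 1)
  if (p > N) kappa * (N - 2) else kappa * p
}

target_df(0.5, N = 100, p = 20)   # p <= N: 0.5 * 20
target_df(0.5, N = 50, p = 200)   # p > N:  0.5 * (50 - 2)
```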

If no kappa is set, a linear regression with kappa = 1 is estimated.
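As a rough illustration of the role of kappa, the sketch below fits the same simulated data with kappa = 1 and kappa = 0.5 and compares the unshrunk fit with lm. The simulated data and object names are assumptions made for the example; no particular numerical output is claimed.

```r
library(hfr)

set.seed(1)
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)

fit_full = hfr(x, y, kappa = 1)     # no shrinkage
fit_half = hfr(x, y, kappa = 0.5)   # target half the degrees of freedom
fit_ols  = lm(y ~ x)                # ordinary least squares benchmark

# Largest absolute gap between the kappa = 1 and OLS coefficients; per the
# description above, this is expected to be numerically negligible.
max(abs(unname(coef(fit_full)) - unname(coef(fit_ols))))

# Total absolute size of the slope coefficients with and without shrinkage.
sum(abs(coef(fit_half)[-1]))
sum(abs(coef(fit_full)[-1]))
```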

Hierarchical clustering is performed using hclust. The default is ward.D2 clustering, which can be overridden by passing a method argument via the ... argument.
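A minimal sketch of overriding the clustering method is given below. It assumes, as described above, that a method argument is forwarded to hclust; "complete" is one of the linkage methods hclust accepts. The data are simulated for the example.

```r
library(hfr)

set.seed(1)
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)

fit_ward     = hfr(x, y, kappa = 0.5)                       # default linkage (ward.D2)
fit_complete = hfr(x, y, kappa = 0.5, method = "complete")  # forwarded to hclust

# Side-by-side coefficients under the two hierarchies.
cbind(ward.D2 = coef(fit_ward), complete = coef(fit_complete))
```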

For high-dimensional problems, the hierarchy can become very large. Setting q to a value below 1 reduces the number of levels used in the hierarchy. q represents a quantile cut-off on the amount of variation contributed by the levels. The default (q = NULL) considers all levels.
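The following sketch shows how q might be set. The value q = 0.5 is arbitrary, and a moderately sized simulated problem is used so that the example runs quickly; it is meant as an illustration of the call rather than a recommended setting.

```r
library(hfr)

set.seed(1)
x = matrix(rnorm(100 * 40), 100, 40)
y = rnorm(100)

fit_all  = hfr(x, y, kappa = 0.5)            # q = NULL: all levels are considered
fit_thin = hfr(x, y, kappa = 0.5, q = 0.5)   # thinned hierarchy (q = 0.5 is arbitrary)

# Side-by-side coefficients with and without thinning.
cbind(all_levels = coef(fit_all), thinned = coef(fit_thin))
```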

When the data exhibit multicollinearity, it can be useful to include a penalty on the l2 norm in the level-specific regressions. This can be achieved by setting the l2_penalty parameter to a positive value.
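A small sketch with strongly correlated predictors is shown below. The simulated design and the penalty value of 1 are assumptions chosen purely for illustration; a suitable penalty depends on the data.

```r
library(hfr)

set.seed(1)
z = rnorm(100)
# Ten noisy copies of one latent variable, so the columns are highly correlated.
x = sapply(1:10, function(i) z + rnorm(100, sd = 0.1))
y = z + rnorm(100)

fit_plain     = hfr(x, y, kappa = 0.5)                   # no l2 penalty (default)
fit_penalized = hfr(x, y, kappa = 0.5, l2_penalty = 1)   # penalty value is arbitrary

# Side-by-side coefficients with and without the penalty.
cbind(plain = coef(fit_plain), penalized = coef(fit_penalized))
```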

References

Pfitzinger, Johann (2024). Cluster Regularization via a Hierarchical Feature Regression. _Econometrics and Statistics_ (in press). URL https://doi.org/10.1016/j.ecosta.2024.01.003.

Author

Johann Pfitzinger

Examples

x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit = hfr(x, y, kappa = 0.5)
coef(fit)
#>  (Intercept)           V1           V2           V3           V4           V5 
#> -0.045528849 -0.044402118 -0.047856870 -0.092421394 -0.126072592 -0.009288087 
#>           V6           V7           V8           V9          V10          V11 
#>  0.011728797 -0.144757423  0.167845451  0.055608271 -0.102802516 -0.124190165 
#>          V12          V13          V14          V15          V16          V17 
#>  0.012207753  0.009962975 -0.124733367  0.055228331  0.144381079 -0.135998132 
#>          V18          V19          V20 
#>  0.039950932  0.063773122  0.009338699