Bandwidth estimation for Geographically Weighted Lasso
Source: R/gwl_bw_estimation.R
This function performs a brute-force search for the optimal bandwidth of the selected kernel, for use in a geographically weighted lasso.
Be aware that this function can take a long time to run, depending on the settings.
We recommend starting with nbw = 5 and nfolds = 5 to check that the function runs properly and produces the desired output.
Usage
gwl_bw_estimation(
  x.var,
  y.var,
  dist.mat,
  adaptive = TRUE,
  adptbwd.thresh = 0.1,
  kernel = "bisquare",
  alpha = 1,
  progress = TRUE,
  nbw = 100,
  nfolds = 5
)
Arguments
- x.var
input matrix, of dimension nobs x nvars; each row is an observation vector. x.var should have 2 or more columns.
- y.var
response variable for the lasso.
- dist.mat
a distance matrix. Can be generated by compute_distance_matrix().
- adaptive
TRUE or FALSE. Whether to perform an adaptive bandwidth search or not. With a fixed bandwidth, samples are selected if they fall within a fixed radius around a location. With an adaptive bandwidth, the radius around a location varies so as to gather a fixed number of samples around the investigated location.
- adptbwd.thresh
the lowest fraction of samples to take into account for local regression. Must satisfy 0 < adptbwd.thresh < 1.
- kernel
the geographical kernel shape used to compute the weights. Passed to GWmodel::gw.weight(). Can be gaussian, exponential, bisquare, tricube or boxcar.
- alpha
the elastic net mixing parameter. Set to 1 for lasso, 0 for ridge. See glmnet::glmnet().
- progress
if TRUE, print a progress bar.
- nbw
the number of bandwidths to test.
- nfolds
the number of folds for the glmnet cross-validation.
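To illustrate the difference between a fixed and an adaptive bandwidth described for the adaptive argument, here is a minimal base R sketch. It assumes the textbook bisquare kernel and made-up distances; bisquare_weights() is a hypothetical helper written for this illustration, and the weights actually produced by GWmodel::gw.weight() may differ in detail.

```r
# Hypothetical helper: textbook bisquare kernel.
# Weights fall from 1 at distance 0 to 0 at the bandwidth and beyond.
bisquare_weights <- function(d, bw) {
  ifelse(d < bw, (1 - (d / bw)^2)^2, 0)
}

# Distances from one focal location to five samples (made-up values).
d <- c(0, 1, 2, 4, 8)

# Fixed bandwidth: every location uses the same radius.
w_fixed <- bisquare_weights(d, bw = 3)

# Adaptive bandwidth: the radius is the distance to the k-th nearest sample,
# so it widens where samples are sparse and shrinks where they are dense.
k <- 4
bw_adaptive <- sort(d)[k]
w_adaptive <- bisquare_weights(d, bw = bw_adaptive)

print(w_fixed)
print(bw_adaptive)
print(w_adaptive)
```

With the fixed radius, only samples closer than 3 units receive a non-zero weight; with the adaptive rule, the radius is set by the data around the focal location rather than chosen in advance.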
Value
a gwlest object. It is a list with rmspe (the RMSPE of the model with the associated bandwidth), NA (the number of NA in the dataset), bw (the optimal bandwidth) and bwd.vec (the vector of tested bandwidths).
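As a rough illustration of how the returned elements relate to each other, the sketch below picks the bandwidth with the lowest RMSPE from a vector of candidates. The values are made up, and the selection rule (lowest RMSPE wins) is an assumption about what a brute-force search would do; the function's actual internals are not described here.

```r
# Made-up candidate bandwidths and their cross-validated RMSPE.
bwd.vec <- c(10, 20, 30, 40, 50)
rmspe <- c(3.2, 2.7, 2.1, 2.4, 2.9)

# Assumed rule: keep the tested bandwidth with the smallest RMSPE.
best_bw <- bwd.vec[which.min(rmspe)]
print(best_bw)
```

Since the result is documented as a list, its elements can be read with the usual $ accessor, for example est$bw or est$bwd.vec on a returned object named est (a hypothetical name).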
References
A. Comber and P. Harris. Geographically weighted elastic net logistic regression (2018). Journal of Geographical Systems, vol. 20, no. 4, pages 317–341. doi:10.1007/s10109-018-0280-7.
Examples
predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)
# \donttest{
myst.est <- gwl_bw_estimation(x.var = predictors,
                              y.var = y_value,
                              dist.mat = distance_matrix,
                              adaptive = TRUE,
                              adptbwd.thresh = 0.5,
                              kernel = "bisquare",
                              alpha = 1,
                              progress = TRUE,
                              nbw = 10,  # spelled out in full; "n" only worked through partial matching
                              nfolds = 5)
myst.est
#> Optimalbw : 50
#> kernel : bisquare
#> adaptive : TRUE
# }