GWR and MGWR with Space-Time Kernels

Space-time kernel (Experimental)

One of the earliest extensions to incorporate the temporal dimension into Geographically Weighted Regression (GWR) is the Geographically and Temporally Weighted Regression (GTWR). GTWR enables regression coefficients to vary both spatially and temporally by employing spatiotemporal weights that account for spatial and temporal distances. This approach more effectively captures evolving local dynamics compared to traditional GWR (Huang, B., Wu, B., & Barry, M., 2010). The inclusion of temporal elements enhances the understanding of how relationships change over time, in addition to their spatial variations.

In this package, we have extended the functionality to incorporate spatiotemporal (separable) kernels for all available models in the mgwrsar package, significantly expanding the range of temporal kernel options. Existing GTWR packages typically offer only symmetric temporal kernels, which do not account for seasonality. However, when working with spatiotemporal data, it is often more appropriate to use asymmetric spatiotemporal kernels for the temporal dimension. The asymmetric aspect is particularly crucial in predictive frameworks, as it ensures that only past observations are considered in local samples. Additionally, unlike purely spatial models, certain spatiotemporal processes may involve cyclic temporal proximities (Du et al., 2018; Senanayake, O’Callaghan, & Ramos, 2016).

Multiplicative Separable Kernel

Here, we consider a multiplicative separable kernel composed of two components:

$K_s(d_s, h_s)$ for space, where $d_s$ represents the spatial distance and $h_s$ the spatial bandwidth.
$K_t(d_t, h_t, cycling, \theta)$ for time, where $d_t$ represents the temporal distance, $h_t$ the temporal bandwidth, $cycling$ defines the seasonal cycle, and $theta$ modulates the weights between seasons.

The space-time kernel is then computed as:
$K(h_s, h_t, \alpha, cycling, \theta) = \alpha * (K_s(d_s, h_s) * K_t(d_t, h_t, cycling, \theta)) + (1 - \alpha) * I_{t_i < t_j}$

Parameters:

$\alpha \in [0, 1]$ : This parameter distributes weights between observations that are close in both time and space, and all past observations.
- If $\alpha = 1$ , the regression coefficients $\beta$ at time $t$ are calculated using only past observations, resembling OLS-type regressions.
- If $\alpha = 0$ , the classic GTWR coefficients are obtained.
$cycling$ : If not null, the temporal distance is calculated modulo a cycle, making the weights periodic over time.
$\theta \in ]0, 1]$ : This parameter controls the rate of weight reduction between seasons.
- If $\theta = 1$ , there is no reduction in weights between seasons, and only the time difference within the season matters.
Temporal Kernel: Can be either asymmetric (considering only past observations for non-zero distances) or symmetric (including all observations).

Kernel Naming Convention

The name of the temporal kernel consists of two to three terms separated by underscores:

The first term specifies the kernel type (e.g., gauss, bisq, rectangle, triangle, epane).
The second term can be past for an asymmetric kernel or symmetric for a kernel that includes all observations.
The optional third term defines the cycling/seasonal weight in temporal units (e.g., gauss_past_365).

Implementation in mgwrsar

In the mgwrsar package, to use a spatiotemporal kernel, you must provide a complete parameterization for both the spatial and temporal kernels, including the kernel names, whether they are adaptive, and their corresponding bandwidth values.

For non-adaptive kernels, bandwidths are expressed in units of distance and time, respectively.
For adaptive kernels, bandwidths correspond to the number of neighbors.

A spatiotemporal kernel can be specified using the control parameter type=GDT. For example:

kernels = c('gauss', 'gauss_past_365'),
H = c(20000, 30),
control = list(type = 'GDT', adaptive = c(FALSE, FALSE), alpha = 0.85)

The following graph illustrates this kernel with two different parameterizations. The spatial dimension is constrained to a single axis (longitude) to facilitate a 2D representation. The grey areas represent null weights. GDT kernel 1

GDT kernel 2

Simulated GWR with spatiotemporal patterns (Beta1 and Beta2 depend on time)

library(mgwrsar)
#> Loading required package: Rcpp
#> Loading required package: sp
#> Loading required package: leaflet
#> Loading required package: Matrix
### Simulation of a spatiotemporal DGP
simu2=simu_multiscale(n=1000,myseed=1,type='GG2024',constant=NULL,nuls=NULL,config_beta='spatiotemp',config_snr=0.9,config_eps='normal',ratiotime=1.5)
mydata=simu2$mydata
head(mydata)
#>           Y         X1         X2         X3       Beta1     Beta2
#> 1 14.485825  1.2879704  0.8500435 -1.8054836  1.38895237 4.0797414
#> 2  7.275414  1.6184181 -0.9253130 -0.6780407  2.17421410 4.2422579
#> 3  8.723902  1.2710352  0.8935812 -0.4733581  1.87402111 4.8731258
#> 4  7.624326 -0.2157605 -0.9410097  1.0274171  4.61026118 0.3023768
#> 5  2.973350 -1.0671588  0.5389521 -0.5973876 -0.04012401 1.4197379
#> 6  5.773858 -1.4469746 -0.1819744  1.1598494  1.82607635 0.2902590
#>           Beta3      Beta4        eps         x          y Intercept
#> 1  1.091891e-01 -3.5080176  1.4158034 0.2655087 0.53080879         1
#> 2  9.714243e-01  0.4348605 -0.5708219 0.3721239 0.68486090         1
#> 3 -4.437261e-01  4.1313048  3.0080579 0.5728534 0.38328339         1
#> 4 -1.833294e+00  3.9273697 -2.6808888 0.9082078 0.95498800         1
#> 5  5.477149e-05 -4.5964812  1.7826499 0.2016819 0.11835658         1
#> 6 -2.724324e+00  4.5418611 -1.3958531 0.8983897 0.03910006         1
#>          time
#> 1 0.000000000
#> 2 0.001001001
#> 3 0.002002002
#> 4 0.003003003
#> 5 0.004004004
#> 6 0.005005005
coords=simu2$coords
head(coords)
#>              x          y
#> [1,] 0.2655087 0.53080879
#> [2,] 0.3721239 0.68486090
#> [3,] 0.5728534 0.38328339
#> [4,] 0.9082078 0.95498800
#> [5,] 0.2016819 0.11835658
#> [6,] 0.8983897 0.03910006
myformula=as.formula('Y~X1+X2+X3')
# storage of coefficients from the data generating process
TRUEBETA<-as.matrix(mydata[,c('Beta1','Beta2','Beta3','Beta4')],ncol=4)

ggplot(mydata[mydata$time<0.3,],aes(x,y))+geom_point(aes(color=Beta1))+ ggtitle('Space time Pattern of coefficient Beta1 for time<0.3')+scale_colour_gradientn(colours = terrain.colors(10),limits=c(0,13))


ggplot(mydata[mydata$time>0.6,],aes(x,y))+geom_point(aes(color=Beta1))+ ggtitle('Space time Pattern of coefficient Beta1 for time>0.6')+scale_colour_gradientn(colours = terrain.colors(10),limits=c(0,13))

Spatial and temporal bandwidths optimization

The golden_search_bandwidth function is employed to determine the optimal spatial bandwidth $h_s$ for a given temporal bandwidth $h_t$ . This process should be iterated across various values of $h_t$ to identify the best pair of bandwidths.

Note: In the function parameterizations, $h_t$ corresponds to $H_2$ , while $h_s$ corresponds to $H$ .

As anticipated, when evaluating the Root Mean Square Error (RMSE) of the coefficient estimates against the true values from the Data Generating Process (DGP), the use of a spatiotemporal kernel yields a significantly better fit compared to a Geographically Weighted Regression (GWR) that solely considers spatial weights.

For estimation and bandwidth search, it is essential to provide the in-sample time vector through the control argument. Here is an example:

    control = list(..., Z = mydata$time[index_in], ...)

##### Comparing GWR estimation using spatial kernel (type='GD') and  spatiotemproal kernel Type='GDT'

verbose=F
# function to compute rmse from a vector of residuals 
rmse<-function(x) sqrt(mean(x^2))

### GWR with spatial non-adaptive Gaussian kernel
res_gwr_GD=golden_search_bandwidth(formula=myformula,H2=NULL,data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'), Model = 'GWR',control=list(adaptive=F,verbose=verbose,criterion='AICc',doMC=F,ncore=1),lower.bound=0, upper.bound=1.4,tolerance=0.001)
model_GWR_GD<-res_gwr_GD$model
model_GWR_GD@H # 0.05698921
#> [1] 0.05698921
model_GWR_GD@AICc # 5645.197
#> [1] 5643.34
Beta_RMSE=apply((model_GWR_GD@Betav-TRUEBETA) %>% as.data.frame(),2,rmse) 
Beta_RMSE #  1.166121  1.403245  1.352598  1.756254 
#> Intercept        X1        X2        X3 
#>  1.166121  1.403245  1.352598  1.756254
Beta_RMSE%>% mean() # 1.419555
#> [1] 1.419555

### GWR model with a spatio-temporal non-adaptive kernel

# Repeating the following estimation (optimization of the spatial bandwidth) 
# with various temporal bandwidth (H2) 
# lead to an optimal value for H2 equal 0.55 regarding AICc criteria 

# here is the procedure for H2 = 0.55
# you need to provide the insample time vectorthrough the control argument} : control=list(...,Z=mydata$time)
res_gwr_GDT=golden_search_bandwidth(myformula,H2=0.55,data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss','gauss'),Model = 'GWR',control=list(doMC=F,ncore=1,Z=mydata$time,SE=F,Type='GDT',adaptive=c(F,F),criterion='AICc',get_ts=TRUE,verbose=verbose,NN=nrow(mydata)),lower.bound=0, upper.bound=1.4)

# results
model_GWR_GDT<-res_gwr_GDT$model
model_GWR_GDT@H  # 0.05892351
#> [1] 0.05885385
model_GWR_GDT@H2 # 0.55
#> [1] 0.55
model_GWR_GDT@AICc # 5632.56
#> [1] 5630.537
Beta_RMSE=apply((model_GWR_GDT@Betav-TRUEBETA) %>% as.data.frame(),2,rmse) 
Beta_RMSE # 1.027104  1.253701  1.368510  1.786674 
#> Intercept        X1        X2        X3 
#>  1.027352  1.253882  1.368100  1.785387
Beta_RMSE%>% mean()    # 1.35899
#> [1] 1.35868

Comparison of GWR Out-of-Sample Predictions for Spatiotemporal DGP

In this analysis, we compare Geographically Weighted Regression (GWR) out-of-sample predictions for a spatiotemporal Data Generating Process (DGP) with a larger sample size using two approaches:

A spatial kernel
A spatiotemporal kernel

The reduction in out-of-sample Root Mean Square Error (RMSE) for the dependent variable is of the same order of magnitude as the decrease in in-sample beta RMSE (mean), approximately 3%.

Several methods are available for out-of-sample predictions (see the Prediction section in the vignette: GWR-and-mixed-GWR-with-spatial-autocorrelation). For this comparison, we use the method method_pred='tWtp_model'. This method extrapolates the Beta coefficients using a weighting matrix derived from the expected weights of out-of-sample data, as if they were added to in-sample data to estimate the corresponding model.

For out-of-sample prediction, ensure the following:

Provide the in-sample time vector through the control argument:

control = list(…, Z = mydata$time[index_in], …)
Provide a matrix with three columns for the new data (longitude, latitude, time):

new_data=matrix(cbind(coords[index_out,], mydata$time[index_out]), ncol = 3)

## simulations
simu3=simu_multiscale(n=2000,myseed=1,type='GG2024',constant=NULL,
                      nuls=NULL,config_beta='spatiotemp',config_snr=0.9,
                      config_eps='normal',ratiotime=1.5)

mydata=simu3$mydata
head("mydata")
#> [1] "mydata"
coords=simu3$coords
head("coords")
#> [1] "coords"
myformula=as.formula('Y~X1+X2+X3')

# betas from the DGP
TRUEBETA<-as.matrix(mydata[,c('Beta1','Beta2','Beta3','Beta4')],ncol=4)

# build in sample and ousample folds
index_in=which(mydata$time<0.7)
index_out=(1:2000)[-index_in]

### model estimation, insample estimates comparison
## Comparing GWR estimation using spatial kernel (type='GD') and  spatiotemproal kernel Type='GDT' 
# (same as in the previous part, comparison of insample beta estimates)

verbose=F

## GWR model with a Gaussian non-adaptive kernel
# model estimation
res_gwr_GD2=golden_search_bandwidth(formula=myformula,H2=NULL,data = mydata[index_in,],coords=coords[index_in,], fixed_vars=NULL,kernels=c('gauss'), Model = 'GWR',control=list(adaptive=F,verbose=verbose,criterion='AICc',doMC=F,ncore=1),lower.bound=0, upper.bound=1.4,tolerance=0.001)
#results
model_GWR_GD<-res_gwr_GD2$model
model_GWR_GD@H # 0.0441879
#> [1] 0.0441879
model_GWR_GD@AICc # 7205.019
#> [1] 7205.019
Beta_RMSE=apply((model_GWR_GD@Betav-TRUEBETA[index_in,]) %>% as.data.frame(),2,rmse) 
# rmse for insample beta estimates with GWR-GD
Beta_RMSE # 00.8841014 1.1196300 1.0467956 1.2906708
#> Intercept        X1        X2        X3 
#> 0.8841014 1.1196300 1.0467956 1.2906708
Beta_RMSE%>% mean() # 1.085299
#> [1] 1.085299

## Repeating the following estimation with various H2 lead to an optimal value for H2 equal 0.45 regarding AICc criteria 

res_gwr_GDT2=golden_search_bandwidth(myformula,H2=0.45,data = mydata[index_in,],coords=coords[index_in,], fixed_vars=NULL,kernels=c('gauss','gauss'),Model = 'GWR',control=list(doMC=F,ncore=1,Z=mydata$time[index_in],SE=F,Type='GDT',adaptive=c(F,F),criterion='AICc',get_ts=TRUE,verbose=verbose),lower.bound=0, upper.bound=1.4)
model_GWR_GDT<-res_gwr_GDT2$model
model_GWR_GDT@H  #  0.04572969
#> [1] 0.04572969
model_GWR_GDT@H2 #  0.45
#> [1] 0.45
model_GWR_GDT@AICc # 7190.152
#> [1] 7190.152
Beta_RMSE=apply((model_GWR_GDT@Betav-TRUEBETA[index_in,]) %>% as.data.frame(),2,rmse) 
# rmse for insample beta estimates with GWR-GDT
Beta_RMSE # 0.8042042 1.0365272 1.0625390 1.3247835 
#> Intercept        X1        X2        X3 
#> 0.8042042 1.0365272 1.0625390 1.3247835
Beta_RMSE%>% mean()    #   1.057013
#> [1] 1.057013


#### outsample PREDICTIONS comparison
# outsample data
newdata<-mydata[index_out,]
newdata_coords=coords[index_out,]
# you need to make a matrix with space and time coordinates
newdata_coords_time=matrix(cbind(coords[index_out,],mydata$time[index_out]),ncol=3)

### GWR
Y_pred17=predict(model_GWR_GD, newdata=newdata, newdata_coords=newdata_coords,method_pred='tWtp_model')
head(Y_pred17)
#> [1]  2.019934  2.953993  7.316086  6.511827 -2.397841  9.925443
head(mydata$Y[index_out])
#> [1]  5.309515  3.799516 12.770301  9.447239 -5.857138  9.814067
# RMSE for outsample predictions (GWR)
sqrt(mean((mydata$Y[index_out]-Y_pred17)^2)) # RMSE 4.297125
#> [1] 4.297125

### GWR GDT
Y_pred18=predict(model_GWR_GDT, newdata=newdata, newdata_coords=newdata_coords_time,method_pred='tWtp_model')
head(Y_pred18)
#> [1]  2.218985  2.717337  7.233807  6.770594 -2.972868 10.288854
head(mydata$Y[index_out])
#> [1]  5.309515  3.799516 12.770301  9.447239 -5.857138  9.814067
# RMSE for outsample predictions (GWR GDT), method_pred='tWtp_model'
sqrt(mean((mydata$Y[index_out]-Y_pred18)^2)) # RMSE 4.122512
#> [1] 4.122512

Y_pred19=predict(model_GWR_GDT, newdata=newdata, newdata_coords=newdata_coords_time,method_pred='shepard')
head(Y_pred19)
#> [1]  2.221355  2.713575  7.245124  6.772069 -2.976258 10.290390
head(mydata$Y[index_out])
#> [1]  5.309515  3.799516 12.770301  9.447239 -5.857138  9.814067
# RMSE for outsample predictions (GWR GDT), method_pred='tWtp_model'
sqrt(mean((mydata$Y[index_out]-Y_pred19)^2)) # RMSE 4.121472
#> [1] 4.121472