Application of geographically weighted regression model in the estimation of surface air temperature lapse rate

  • QIN Yun 1,
  • REN Guoyu 1, 2, *,
  • HUANG Yunxin 3,
  • ZHANG Panfeng 1,
  • WEN Kangmin 1
  • 1. Department of Atmospheric Science, School of Environmental Studies, China University of Geosciences, Wuhan 430074, China
  • 2. Laboratory for Climate Studies, National Climate Center, China Meteorological Administration, Beijing 100081, China
  • 3. School of Resource and Environmental Science, Hubei University, Wuhan 430062, China
*Ren Guoyu, Professor, E-mail:

Qin Yun (1990‒), PhD Candidate, specialized in regional climatology and climate change. E-mail:

Received date: 2020-05-16

Accepted date: 2020-09-11

Online published: 2021-05-25

Supported by

The National Key R&D Program (2018YFA0605603)

Natural Science Foundation of China (41575003)

Copyright

Copyright reserved © 2021. Office of Journal of Geographical Sciences All articles published represent the opinions of the authors, and do not reflect the official policy of the Chinese Medical Association or the Editorial Board, unless this is clearly specified.

Abstract

The surface air temperature lapse rate (SATLR) plays a key role in hydrological, glacial and ecological modeling, in regional downscaling, and in the reconstruction of high-resolution surface air temperature. However, accurately estimating the SATLR in regions with complex terrain and climatic conditions remains a great challenge for researchers. In this paper, the geographically weighted regression (GWR) model was applied to estimate the SATLR in China's mainland, and the model was then assessed and validated. The spatial pattern of regression residuals, as identified by Moran's Index, indicated that the GWR model was broadly reasonable for the estimation of SATLR. The small mean absolute error (MAE) in all months indicated that the GWR model had a strong predictive ability for the surface air temperature. The comparison of the seasonal mean SATLR with previous studies further supported the accuracy of the estimation. Therefore, the GWR method has potential application for estimating the SATLR in a large region with complex terrain and climatic conditions.

Cite this article

QIN Yun, REN Guoyu, HUANG Yunxin, ZHANG Panfeng, WEN Kangmin. Application of geographically weighted regression model in the estimation of surface air temperature lapse rate[J]. Journal of Geographical Sciences, 2021, 31(3): 389-402. DOI: 10.1007/s11442-021-1849-5

1 Introduction

Generally, air temperature decreases with the increase of observation altitude, the change rate of which is called air temperature lapse rate (Barry and Chorley, 2003). The lapse rate along the terrestrial surface, called near-surface or surface air temperature lapse rate (SATLR), is a result of surface energy balance and determines atmospheric stability above the surface (Minder et al., 2010; Holden and Rose, 2011; He and Wang, 2020). It plays an important role in the hydrological, glacial and ecological modeling (Jóhannesson et al., 1995; Petersen et al., 2013; Wang et al., 2016a), because the gridding surface air temperatures which are interpolated based on the SATLR can significantly affect the melting surface (Gardner and Sharp, 2009; Sun et al., 2015). Moreover, it is an important parameter in the fields of regional downscaling and reconstruction of high-resolution surface air temperature due to the strong relationship between air temperature and altitude (Dodson and Marks, 1997; Marshall et al., 2007; Yang et al., 2007; Gardner et al., 2009), which has an impact on the detection and attribution of regional climate change (Wilby et al., 2002; Xie et al., 2018).
The SATLR was usually estimated based on the linear relationship between the surface air temperature and the altitude of surface climate observations (Harlow et al., 2004; Mokhov and Akperov, 2006; Gardner et al., 2009), and the surface air temperature records from meteorological observational networks were widely used due to their nature of normality and regularity (Li et al., 2013; Li et al., 2015; Guo et al., 2016; Ojha, 2017). In terms of estimation methods, the simple linear regression model solved with the ordinary least squares (OLS) method was frequently used to calculate the SATLR (Rolland, 2003; Blandford et al., 2008; Kattel et al., 2018). As the simple linear regression model, a kind of global model, could lead to many problems, including spatial non-stationarity, when there was complex terrain and climatic condition in a large region, observational stations were usually divided into several groups and the SATLRs of each group were estimated separately (Fotheringham et al., 2002; Zhai et al., 2016; Qin et al., 2018). The division criterion was not always the same among studies. Some were based on the significant gradients of climatic elements and the spatially continuous distribution of stations (Guo et al., 2016), some divided the mountain areas into several parts according to the topography and climatic conditions (Kattel et al., 2013; Shen et al., 2016), and some others took many factors into consideration, including the similar regional climate settings, the reasonable number of stations, and the grid division of longitude and latitude (Li et al., 2013; Du et al., 2018). Hence, the SATLR estimated in each group reflected the average state of SATLR in each sub-region, but it could not reflect the spatially continuous variation of the SATLR in the study area. Moreover, the simple linear regression model may not be statistically reasonable, as it relies on the assumptions of normality and homogeneity of variance (Rencher and Schaalje, 2008).
Compared with global model, the local model using subsets of observations centered on a focal calibration station could solve the above-mentioned problem (Lloyd, 2006). Moving window regression (MWR) model, a kind of local model, was increasingly used for the estimation of the SATLR in recent literatures. For instance, to estimate the SATLR at the station, Li et al. (2015) used the observation records from the calibration station and its 20 nearest neighboring stations, and He and Wang (2020) used 15 to 50 nearest neighboring stations inside a five-degree circle. Notwithstanding, the technique of MWR suffered from edge effects and the results were dependent on the size of the window (Fotheringham et al., 2002; Zhang et al., 2018). On the basis of MWR model, the geographically weighted regression (GWR) model had taken the kernel function into consideration which assigned more weight on the stations which were geographically closer to the calibration station than those which were more distant (Brunsdon et al., 1996). It thus reduced the influence of the stations located in the boundary of statistical area in the parameter estimation and solved the problem of spatial non-stationarity (Brunsdon et al., 1996; Fischer and Getis, 2010).
The objective of this study is to estimate the SATLR in a large region with complex terrain and climatic condition. Because of its advantages, the GWR model was applied to estimate the SATLR in this research. The monthly mean surface air temperatures and the altitudes of the meteorological stations in China's mainland were used to calculate the SATLR, and are described in Section 2. To avoid confusion, as in most previous studies, a negative SATLR was regarded as the temperature decreasing with the increase of altitude, and a positive SATLR as the temperature increasing with the increase of altitude (Pepin et al., 1999; Blandford et al., 2008; Kattel et al., 2015; Guo et al., 2016; Wang et al., 2018). The assessment of the model is made in Section 3.1, including the regression residuals and the coefficient standard errors of altitudes. The validation of the model and the comparison of the seasonal mean SATLR with previous studies are made in Sections 3.2 and 3.3, respectively. The discussion and conclusions are presented in Sections 4 and 5, respectively.

2 Materials and methods

2.1 Data and preprocessing

Two datasets were used in this research, which were provided by National Meteorological Information Center, China Meteorological Administration (CMA). One was the monthly mean surface air temperature and the recorded altitude of national meteorological stations (dataset-1), which was used to estimate SATLR with GWR model; the other was the hourly mean surface air temperature and the recorded altitude of automatic meteorological stations (dataset-2), which was used to validate the results of the model.
Quality control and homogeneity adjustment for the surface air temperature of dataset-1 had been made by Cao et al. (2016). The number of stations in the dataset was 2419, and they were dense in the eastern part of China and sparse in the western part (Figure 1). The time period in this research was from January 1961 to December 2015. The missing records (accounting for about 4%) of the surface air temperature were replaced by the values predicted with a recurrent neural network (RNN) algorithm (Bengio and Gingras, 1996; Kim, 2017), the Matlab code of which was obtained from the GitHub website (Atabay, 2016). The monthly mean, seasonal mean (spring: March, April, May; summer: June, July, August; autumn: September, October, November; winter: December, January, February), and annual mean temperatures were then calculated. The altitude records of stations were complete, ranging from -47.4 to 4801.2 m.
Figure 1 Topography of China and spatial distribution of meteorological stations in dataset-1
The sub-dataset of dataset-2 for the year 2013 was used for the validation. Quality control of the sub-dataset was made as follows:
(1) according to the boundary of China's mainland, the stations with wrong latitude or longitude were removed;
(2) the stations whose altitude records were missing or unreasonable were removed;
(3) the completeness of the time records was checked, and the temperatures at the missing time records were set as missing values;
(4) considering the limit of measurement, the hourly temperature records falling outside the bounds of -60 to 60 °C were taken as missing values;
(5) considering the case of instrument failure which could freeze the temperature records (Zhang et al., 2013), if three or more consecutive hourly temperature records shared the same value, the second and all later records in that run were taken as missing values;
(6) the stations with a missing-data rate > 10% in any one month and the stations duplicated in dataset-1 were removed.
Finally, 10,924 stations remained for use. Because the temperature varied relatively little within a month and the missing rate was small, the monthly temperature for each station was simply the mean of its recorded hourly temperatures.
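Steps (4) and (5) and the 10% missing-rate rule in step (6) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `qc_hourly` and `monthly_mean` and their parameters are hypothetical, and the input is assumed to be one station's hourly temperatures for one month, in time order, with gaps already stored as NaN.

```python
import numpy as np

def qc_hourly(temps, lower=-60.0, upper=60.0, min_run=3):
    """Apply steps (4) and (5) to one station's hourly series; returns a new array."""
    t = np.asarray(temps, dtype=float).copy()
    # Step (4): readings outside the measurement limits become missing.
    t[(t < lower) | (t > upper)] = np.nan
    out = t.copy()
    # Step (5): in every run of >= min_run identical readings, keep the first
    # reading and flag the rest as a suspected frozen sensor.
    run_start = 0
    for k in range(1, len(t) + 1):
        # NaN compares unequal to everything, so a missing value ends a run.
        run_ended = k == len(t) or t[k] != t[run_start]
        if run_ended:
            if k - run_start >= min_run:
                out[run_start + 1:k] = np.nan
            run_start = k
    return out

def monthly_mean(temps_qc, max_missing=0.10):
    """Mean of the valid hourly values, or None if too many are missing (step 6)."""
    temps_qc = np.asarray(temps_qc, dtype=float)
    if np.isnan(temps_qc).mean() > max_missing:
        return None
    return float(np.nanmean(temps_qc))
```

For example, in the series `[10, 10, 10, 10, 12, 75, 13, 13, 14]` the value 75 is out of range and the last three of the four repeated 10s are flagged, while the pair of 13s is kept because the run is shorter than three.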

2.2 Estimation of SATLR

As the SATLR was a parameter to measure the relationship between temperature and altitude, the altitude was considered as the only explanatory variable. For convenience, the temperature and altitude were denoted by yi and xi, respectively, at station i. The linear relationship between them could be expressed as
$y_i = \beta_{i0} + \beta_{i1} x_i + \varepsilon_i$ (1)
where βi0 was the intercept, βi1 was the regression coefficient for the xi, and εi was the random error at station i. Because the goodness of fit (R2) measured how well the regression model fitted, if the R2 was high, say ${{R}^{2}}\ge 0.7$ (Dodson and Marks, 1997), the relationship was considered to be meaningful. In that case, βi1 was considered as a valid SATLR; otherwise, βi1 was not a valid SATLR.
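As a point of reference, the validity rule can be sketched for the simplest case, in which one group of stations shares a single intercept and slope (the global OLS model, not yet the local GWR fit). The function name `ols_lapse_rate` is hypothetical, and altitude is assumed to be given in km so that the slope is in °C/km.

```python
import numpy as np

def ols_lapse_rate(alt_km, temp_c, r2_min=0.7):
    """Fit temp = b0 + b1 * alt by OLS; return (b1, R^2, is_valid)."""
    alt_km = np.asarray(alt_km, dtype=float)
    temp_c = np.asarray(temp_c, dtype=float)
    slope, intercept = np.polyfit(alt_km, temp_c, 1)
    fitted = intercept + slope * alt_km
    ss_res = np.sum((temp_c - fitted) ** 2)         # residual sum of squares
    ss_tot = np.sum((temp_c - temp_c.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    # The slope is accepted as a lapse rate only when the fit is good enough.
    return float(slope), float(r2), bool(r2 >= r2_min)
```

With temperatures that fall by exactly 6.5°C per km the function returns a slope of about -6.5 with R2 close to 1, whereas temperatures unrelated to altitude give a low R2 and the slope is rejected.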
In GWR model, a local regression equation was built for each station, and expression (1) was written as a vector expression with estimated regression coefficients at station i
$Y = X \hat{\beta}_i$ (2)
where $\hat{\beta}_i = (\beta_{i0}, \beta_{i1})^{\text{T}}$, $X = \left( \begin{matrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \\ \end{matrix} \right)$, and $Y = (y_1, y_2, \cdots, y_N)^{\text{T}}$. N was the number of stations, i.e. 2419 in this research. Taking the geographical weights into account, expression (2) was transformed into
$\hat{\beta}_i = (X^{\text{T}} W_i X)^{-1} X^{\text{T}} W_i Y$ (3)
where $W_i = \operatorname{diag}(W_{i1}, W_{i2}, \ldots, W_{iN})$. Because of the non-uniform spatial distribution of stations, an adaptive Gaussian kernel was utilized for each fit station. The kernels had larger bandwidths where the stations were sparse and smaller bandwidths where the stations were dense (Fotheringham et al., 2002).
$W_{ij} = \begin{cases} e^{-\left( d_{ij} / d_{i(n)} \right)^2}, & d_{ij} \le d_{i(n)} \\ 0, & d_{ij} > d_{i(n)} \end{cases}$ (4)
where Wij and dij were the weight value of observation at station j for estimating the coefficient at station i and the distance on the earth between them, respectively, and di(n) was the adaptive bandwidth size defined as the nth nearest neighboring distance. The corrected Akaike Information Criterion (AICc) was utilized to search the optimal di(n) for the fit station, which was expressed as (Akaike, 1974; Hurvich et al., 1998; Fotheringham et al., 2002)
$\text{AIC}_c = 2n \ln(\hat{\sigma}) + n \ln(2\pi) + n \left( \frac{n + \text{tr}(S)}{n - 2 - \text{tr}(S)} \right)$ (5)
where $\hat{\sigma }$ was the estimated standard deviation of the error εi, and tr(S) was the trace of the hat matrix S each row of which was defined by ${{S}_{i}}={{X}_{i}}{{({{X}^{\text{T}}}{{W}_{i}}X)}^{-1}}{{X}^{\text{T}}}{{W}_{i}}$. The optimal n and di(n) were determined when AICc came to the minimum.
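The weighted least-squares estimator, the truncated adaptive Gaussian kernel and the AICc above can be combined in a short NumPy sketch. It is an illustration under stated assumptions, not the ArcGIS implementation used in the study: distances are great-circle (haversine) distances with an assumed Earth radius of 6371 km; each fit station is itself included in the regression with distance zero, so the nth neighbour is the element at index n of the sorted distances; the n inside the AICc formula is taken to be the number of observations N; and $\hat{\sigma}^2$ is taken as RSS/N. All function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; inputs in degrees, NumPy-broadcastable."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def adaptive_gaussian_weights(d, n):
    """Truncated Gaussian kernel with bandwidth = distance to the n-th neighbour.

    Assumes d contains the zero self-distance of the fit station.
    """
    bandwidth = np.partition(d, n)[n]
    w = np.exp(-(d / bandwidth) ** 2)
    w[d > bandwidth] = 0.0
    return w

def gwr_fit(lon, lat, alt_km, temp_c, n):
    """Return (betas, tr_S, aicc); betas[i] = (intercept, slope) at station i."""
    N = len(temp_c)
    X = np.column_stack([np.ones(N), alt_km])
    betas = np.empty((N, 2))
    fitted = np.empty(N)
    tr_S = 0.0
    for i in range(N):
        d = haversine_km(lon[i], lat[i], lon, lat)
        w = adaptive_gaussian_weights(d, n)
        XtW = X.T * w                      # X^T W_i without building diag(w)
        A = np.linalg.solve(XtW @ X, XtW)  # (X^T W_i X)^-1 X^T W_i, shape (2, N)
        betas[i] = A @ temp_c
        fitted[i] = X[i] @ betas[i]
        tr_S += X[i] @ A[:, i]             # i-th diagonal element of the hat matrix
    rss = np.sum((temp_c - fitted) ** 2)
    sigma_hat = np.sqrt(rss / N)           # assumption: ML estimate of sigma
    aicc = (2 * N * np.log(sigma_hat) + N * np.log(2 * np.pi)
            + N * (N + tr_S) / (N - 2 - tr_S))
    return betas, tr_S, aicc
```

Calling `gwr_fit` for a range of n and keeping the n with the smallest AICc mirrors the bandwidth search described in the text; with altitude in km, `betas[:, 1]` is the local slope in °C/km.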
The AICc values with the number of nearest neighboring stations ranging from 4 to 100 were calculated with ArcGIS 10.5 software in this article. Although the results showed that the optimal n differed among the monthly mean temperatures (Appendix Figure 1), there was little difference in βi1 when local ${{R}^{2}}\ge 0.7$ (the standard deviations of βi1 were almost all less than 1°C/km) (Appendix Figure 2). In addition, because of local multicollinearity, the estimate at a fit station with a condition number larger than 30 was considered unreliable in the GWR model (Mitchell, 2005; ESRI, 2018). Therefore, we took n=17, the optimal n for the annual mean temperature, to carry out the GWR calculation, and took βi1 subject to both local ${{R}^{2}}\ge 0.7$ and a condition number $\le 30$ as the SATLR at the fit station i.

2.3 Assessment and validation

In order to check whether the GWR model was reasonable for the estimation of the SATLR, the spatial relationship of regression residuals was examined (Leung et al., 2000). If there was statistically significant spatial clustering, the model was considered to be improperly specified, which indicated that a key explanatory variable had not been included in the model; otherwise, the model was properly specified (Mitchell, 2005). Spatial autocorrelation statistics can detect this spatial relationship (Cliff and Ord, 1969, 1972; Páez et al., 2002), which was usually measured by Moran's Index (Moran, 1948). If the Moran's Index was significantly positive in statistics, the spatial distribution of residuals was considered as clustered. As a comparison, the MWR model with the same nearest neighboring stations as the GWR model was carried out, and the Moran's Index of the residuals was also calculated. Note that the residual in this article refers to the temperature fitted by the regression model minus the observed temperature.
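For readers unfamiliar with the statistic, a minimal sketch of the global Moran's Index is given here. The spatial weight matrix is not specified in the text, so the binary k-nearest-neighbour weights built by `knn_binary_weights` (on planar coordinates) are an assumption made only for illustration, as are both function names. The significance test reported in Table 1 is not reproduced.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I = (N / S0) * (z^T W z) / (z^T z), with z = values - mean."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    w = np.asarray(weights, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # a station is not its own neighbour
    return float(len(z) / w.sum() * (z @ w @ z) / (z @ z))

def knn_binary_weights(xy, k):
    """Assumed weighting scheme: w_ij = 1 if j is among the k nearest to i."""
    xy = np.asarray(xy, dtype=float)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nearest = np.argsort(d, axis=1)[:, :k]
    w = np.zeros_like(d)
    w[np.arange(len(xy))[:, None], nearest] = 1.0
    return w
```

Residuals that vary smoothly in space give a positive index (clustered), residuals that alternate in sign between neighbours give a negative index (dispersed), and an index near zero suggests a random pattern.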
The residual sum of squares (RSS) measures the discrepancy between the observed data and the data fitted by the GWR model, which was used to make the comparison between months. The smaller it was, the better the GWR model fitted to the observed temperature, in that month. Moreover, as the standard error of βi1 measures the reliability of βi1, the difference of the standard error and its percentage at each level in 12 months were compared (Wheeler and Tiefelsdorf, 2005). The smaller the standard error, the higher the confidence of estimated βi1 was (ESRI, 2018).
The monthly temperatures at the 10,924 automatic stations in the year of 2013 that were predicted with the GWR model and calibrated by the corresponding monthly series of dataset-1 were used for the validation. The differences between the predicted and the observed monthly temperatures and the mean absolute error (MAE) were used to measure the results of validation.
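The validation measures are simple to compute once predicted and observed monthly temperatures are paired by station. The sketch below is illustrative only; the function name `validation_summary` is hypothetical, and the 2.5th and 97.5th percentiles are included because they correspond to the whiskers used in the validation boxplot.

```python
import numpy as np

def validation_summary(predicted, observed):
    """Differences are predicted minus observed; NaN pairs are ignored."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    diff = diff[~np.isnan(diff)]
    return {
        "mae": float(np.mean(np.abs(diff))),      # mean absolute error
        "p2.5": float(np.percentile(diff, 2.5)),
        "median": float(np.median(diff)),
        "p97.5": float(np.percentile(diff, 97.5)),
    }
```

For predicted values of 1, 2 and 3°C against observed values of 1.5, 2 and 2°C, the differences are -0.5, 0 and 1°C, so the MAE is 0.5°C.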

2.4 Spatial distribution of SATLR

The raster surfaces of βi1 were created by the GWR model. The spatial interpolation of both the local R2 and the condition number was based on the inverse distance weighting (IDW) method (Philip and Watson, 1982; Watson and Philip, 1985), with 17 nearest neighboring stations and a power of 1 for the weight function. If the local R2<0.7 or the condition number >30, the corresponding βi1 was taken as an invalid SATLR, so the SATLR in that area was shown as blank.
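The IDW step and the masking rule can be sketched as follows. This is a simplified illustration, not the ArcGIS tool: coordinates are assumed to be planar (projected), the function names `idw` and `mask_invalid` are hypothetical, and a query point that coincides with a station simply takes that station's value.

```python
import numpy as np

def idw(xy, values, query_xy, k=17, power=1.0):
    """Inverse-distance-weighted estimate at query_xy from the k nearest stations."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    d = np.hypot(*(xy - np.asarray(query_xy, dtype=float)).T)
    idx = np.argsort(d)[:min(k, len(values))]
    if d[idx[0]] == 0.0:                 # query coincides with a station
        return float(values[idx[0]])
    w = 1.0 / d[idx] ** power
    return float(np.sum(w * values[idx]) / np.sum(w))

def mask_invalid(beta1, local_r2, cond_number, r2_min=0.7, cond_max=30.0):
    """Return beta1, or NaN (shown as blank on the map) where the fit is not valid."""
    return beta1 if (local_r2 >= r2_min and cond_number <= cond_max) else float("nan")
```

With two stations at (0, 0) and (2, 0) holding the values 1 and 3, a query at (0.5, 0) gets weights 2 and 2/3 and therefore the value 1.5.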

3 Results

3.1 Assessment for the GWR model

As can be seen in Table 1, the residuals returned by the MWR model showed statistically significant spatial clustering for all months, which indicated that it was unreasonable to give the distant stations the same weight as the close stations in the regression window when only one explanatory variable was included in the model. Compared with the MWR, the GWR performed much better, with the residuals non-clustered except in February and March. Moreover, none of the residuals returned by the GWR model for seasonal data showed statistically significant spatial clustering (not shown). Therefore, the GWR model was considered as broadly reasonable for the estimation of SATLR.
Table 1 Moran's Index of residuals returned by the MWR and the GWR model, respectively
Model  Jan.       Feb.       Mar.       Apr.       May        Jun.       Jul.       Aug.       Sep.       Oct.       Nov.       Dec.
MWR    0.0354**   0.0518**   0.0661**   0.0497**   0.0449**   0.0415**   0.0419**   0.0486**   0.0345**   0.0318**   0.0471**   0.0377**
GWR    0.0038     0.0102**   0.0082*    -0.0055    -0.0119**  -0.0175**  -0.0194**  -0.0186**  -0.0224**  -0.0224**  -0.0072    -0.0001

Note: * and ** indicated the value was significant at the 0.05 and 0.01 level, respectively. The significant positive and the significant negative values represent the clustered and the dispersed spatial relationship, respectively; otherwise, the spatial relationship was random.

For the annual mean series, 85.6% of the residuals ranged from -0.5 to 0.5°C, and most of the stations with relatively large residuals were located in the western and the northern parts of China, where the distribution of stations was sparse (Figure 2a). The statistically significant (at the 0.05 level) negative linear relationship between the residual and the station density also indicated that high station density (i.e., short distance between stations) made the GWR model more accurate (not shown). The large standard errors of βi1 were mostly found in the areas with flat terrain, especially in the North China Plain and the Middle-Lower Yangtze Plain, where the standard errors were extremely large (>7°C/km) (Figure 2b). This indicated that there was strong local multicollinearity in those areas, and the confidence of the estimated βi1 was low. The large standard errors of βi1 may be related to the spatial stratified heterogeneity (Wang et al., 2016b). As there was little variation of altitude in these areas, the spatial heterogeneity of surface air temperature was mainly caused by other factors, e.g. land cover/use, rather than altitude (Yang et al., 2009; Qu et al., 2013).
Figure 2 Residual (a) and standard error of βi1 (b) at each fit station for the annual mean series
For the monthly mean series, the spatial distribution of residuals was similar with the annual mean series, with larger percentage of small residuals in summer than in winter (not shown). The smaller RSS also showed that the temperatures fitted by the GWR model were closer to the observed temperatures in summer (Figure 3). The results showed that the linear relationship between temperature and altitude was generally stronger in summer than in winter, which may be related to the frequent temperature inversion in winter, especially in the northern part of China (Li et al., 2013; Du et al., 2018). Moreover, the percentage of stations with large standard errors of βi1 was smaller in summer than in winter (Figure 3), which indicated that in some areas the surface air temperature was strongly affected by the altitude in summer, but the influence degree became weaker in winter.
Figure 3 Cumulative percentage of stations with different levels of standard error of βi1 (left axis) and RSS (right axis) for the monthly mean series (The black line showed the monthly variation of RSS. The color bar for the standard error of βi1 was the same as that in Figure 2b.)

3.2 Validation for the GWR model

As is shown in Figure 4, the stations with differences between the predicted and the observed monthly temperatures ranging within about 1.0°C and 2.5°C accounted for 50% and 95%, respectively. The small MAE for all months (varied from 0.52 to 0.91°C) indicated that the GWR model had a strong predictive ability for the surface air temperature (Table 2).
Figure 4 Boxplot of the differences between the predicted and the observed monthly temperatures at the 10,924 automatic stations (The differences are the predicted values minus the observed values. The lower and upper edges of the boxes represent the lower quartile (25th percentile) and the upper quartile (75th percentile), respectively. The white lines across the boxes represent the medians. The whiskers extending from the boxes represent the 2.5th percentile and the 97.5th percentile, and the red “+” represents the outliers.)
Table 2 MAE in the assessment stage and the validation stage for each month
Stage       Jan.  Feb.  Mar.  Apr.  May   Jun.  Jul.  Aug.  Sep.  Oct.  Nov.  Dec.
Assessment  0.42  0.37  0.32  0.28  0.27  0.26  0.25  0.25  0.26  0.29  0.33  0.40
Validation  0.84  0.68  0.64  0.56  0.57  0.52  0.58  0.63  0.61  0.75  0.72  0.91

Note: The unit of MAE was °C.

Moreover, the MAE in warm months (summer, late spring, and early autumn) was smaller than that in cold months (winter, late autumn, and early spring), which indicated that the GWR model fitted better in warm months. As to the outliers, 34% of the stations (average percentage for all months) were located in the western part (<100°E) of China, and the statistically significant (at the 0.05 level) negative linear relationship in all months between the station density (interpolated from the station density of dataset-1 based on a kernel function) and the absolute differences suggested that the denser the stations were, the more reliable the GWR model was.

3.3 Comparison of estimated SATLR with previous studies

The seasonal mean SATLR estimated with GWR model in several main mountainous areas of China's mainland with different characteristics of SATLR were compared with the results of previous studies (Figure 5). It was noteworthy that the spatial scale of SATLR distribution was wider in summer when compared with that in winter, due to the stronger linear relationship in summer between temperature and altitude. In the eastern part of China, the large-scale blank areas were related to the extensive flat terrain, including the North China Plain, the Middle-Lower Yangtze Plain, and the Northeast China Plain, with low confidence of estimated values.
Figure 5 Spatial distribution of seasonal mean SATLR estimated with GWR model (The blank areas in the study area representing the SATLRs were invalid as described in Sections 2.2 and 2.4.)
(1) Tibetan Plateau. The seasonal variation pattern in the eastern part of the Tibetan Plateau was steepest-in-winter and shallowest-in-summer, which was in accordance with previous studies (Li et al., 2013; Kattel et al., 2015; Li et al., 2015). A winter mean SATLR steeper than -7°C/km was found on a large scale, and it was even steeper than -10°C/km in some areas. The steep seasonal mean SATLR in the Tibetan Plateau was also reported by Li et al. (2013), Li et al. (2015), and Zhang et al. (2018).
(2) Tianshan Mountains. A summer mean SATLR steeper than -8°C/km was found in the Tianshan Mountains, which was in accordance with previous studies (Li et al., 2015; Shen et al., 2016; Du et al., 2018). In winter, the mean SATLR was not stable, due to the poor linear relationship between the altitude and the surface air temperature, which was also evidenced by Shen et al. (2016).
(3) Qilian Mountains. The steepest seasonal mean SATLR occurred in summer (-7 to -6°C/km), followed by that in spring (-6 to -5°C/km) and autumn (-6 to -4°C/km), and the shallowest occurred in winter (-4 to -3°C/km); this seasonal variation was in accordance with previous studies (Li et al., 2015; Lin and Chang, 2018).
(4) Qinling Mountains. The distinctive characteristic of the seasonal mean SATLR in the Qinling Mountains was that the annual range was relatively small, and the mean SATLR remained at about -7 to -5°C/km in all seasons. This phenomenon has been found in the authors' previous studies related to land surface temperature lapse rate (Qin et al., 2018). Furthermore, the results shown by Li et al. (2015) and Wang (2015) indicated that the annual SATLR range was no more than about 2°C/km in the mountains.

4 Discussion

The outliers with large differences between the predicted and the observed monthly temperatures may be related to both the data quality of dataset-2 and the confidence of the GWR model, especially in the areas with sparse stations. On one hand, as the quality control and the calculation method for the monthly temperature of the dataset-2 used for validation were somewhat rough, the observed values may differ considerably from the real values at some stations. Moreover, the wrongly recorded altitudes in the dataset-2 at a few stations may also result in incorrect predicted values. On the other hand, as the weight function was a kernel with the bandwidth as its radius, two stations at the same distance from the fit station received the same weight regardless of direction, which may not be reasonable for the areas with sparse stations due to the different climatic characteristics between two distant stations. Furthermore, due to the complexity of the GWR model, overfitting may occur, which could decrease the generalization ability when a new dataset was used for validation (Harrell, 2015; Du et al., 2020), and could lead to higher MAE and a larger percentage of outliers when compared with those in the assessment stage (Table 2).
The linear relationship between the altitude and the surface air temperature in summer was generally stronger than in winter, which was also reported by Rolland (2003) and Gardner et al. (2009). This may be related to the well-mixed air induced by stronger turbulence and convection during summer, and the frequent occurrence of temperature inversion in plains and basins during winter. The well-mixed air will benefit the formation of normal air stratification under the influence of net surface radiation, but the temperature inversion in winter would disturb the normal relationship between the surface air temperature and altitude.
In this article, the distance to the 17th nearest neighboring station was chosen as the bandwidth to estimate the SATLR for all stations, i.e. a global optimal bandwidth. As the results of GWR were sensitive to the bandwidth of the weight function, this bandwidth might not always be optimal for every station (Fotheringham et al., 2002; Lu et al., 2017). For example, the bandwidth defined by the 17th nearest neighboring station might be optimal for one station, but that defined by the 18th nearest neighboring station might be optimal for another. Therefore, further studies are required to quantify how much the SATLR estimated with the global optimal bandwidth differs from that estimated with the local optimal bandwidth. In addition, due to the similar climatic characteristics on the same aspect of a specific mountain in the same climate zone, the weight function could be adjusted in future according to the digital elevation model (DEM) and the climate zones.

5 Conclusions

This research made a first attempt to use the GWR model to estimate the SATLR. The assessment and validation of the model and the comparison of the seasonal mean SATLR with previous studies were also made. Conclusions can be drawn as follows:
(1) The spatial relationship of regression residuals was broadly non-clustered, which indicated that using the GWR model to estimate the SATLR was reasonable.
(2) The small MAE in all months indicated that the GWR model had a strong predictive ability for the surface air temperature.
(3) The agreement with previous studies on the seasonal mean SATLR further supported the accuracy of the estimation by the new method.
Compared with the traditional linear regression methods, the GWR method has potential application for estimating the SATLR in a large region with complex terrain and climatic condition.

Acknowledgments

The authors thank Tianlin Zhai at the School of Resource and Environmental Sciences, Wuhan University, for the help on the use of ArcGIS software, and Huqiang Qin at the School of Electronics and Information Engineering, Hunan University of Science and Engineering, for the help on batch processing.

Appendix

Appendix Figure 1 AICc varied with the increase of nearest neighbor stations

Notes: The number of nearest neighbor stations was 17 for the year, and 17, 14, 13, 17, 17, 17, 19, 19, 19, 19, 17 and 17 for the 12 months from January to December, respectively. The x-value of the vertical dashed line was 17. The x-value of the gray band off the vertical dashed line ranged from 13 to 19.

Appendix Figure 2 Standard deviations of βi1 when n varied from 13 to 19

Notes: The x-value of a black dot represented the average local R2 when n varied from 13 to 19 at the fit station. The blue vertical line represented the critical value of local R2=0.7, and the red horizontal lines represented the standard deviation of βi1 equal to 1°C/km. The meanings of the parameters above-mentioned were described in Section 2.

[1]
Akaike H, 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716-723.

[2]
Atabay D, 2016. Pyrenn: First Release (Version v0.1) [CP]. Zenodo. doi: 10.5281/zenodo.45022.

[3]
Barry R G, Chorley R J, 2003. Atmosphere, Weather, and Climate. London and New York: Routledge.

[4]
Bengio Y, Gingras F, 1996. Recurrent neural networks for missing or asynchronous data. In: Advances in Neural Information Processing Systems. Cambridge, Massachusetts: MIT Press, 395-401.

[5]
Blandford T R, Humes K S, Harshburger B J et al., 2008. Seasonal and synoptic variations in near-surface air temperature lapse rates in a mountainous basin. Journal of Applied Meteorology and Climatology, 47(1):249-261.

[6]
Brunsdon C, Fotheringham A S, Charlton M E, 1996. Geographically weighted regression: A method for exploring spatial nonstationarity. Geographical Analysis, 28(4):281-298.

[7]
Cao L, Zhu Y, Tang G et al., 2016. Climatic warming in China according to a homogenized data set from 2419 stations. International Journal of Climatology, 36(13):4384-4392.

[8]
Cliff A D, Ord J K, 1969. The problem of spatial autocorrelation. In: Studies in Regional Science. London: Pion, 25-55.

[9]
Cliff A D, Ord J K, 1972. Testing for spatial autocorrelation among regression residuals. Geographical Analysis, 4(3):267-284.

[10]
Dodson R, Marks D, 1997. Daily air temperature interpolated at high spatial resolution over a large mountainous region. Climate Research, 8(1):1-20.

[11]
Du M, Zhang M, Wang S et al., 2018. Near-surface air temperature lapse rates in Xinjiang, northwestern China. Theoretical and Applied Climatology, 131(3/4):1221-1234.

[12]
Du Z, Wang Z, Wu S et al., 2020. Geographically neural network weighted regression for the accurate estimation of spatial non-stationarity. International Journal of Geographical Information Science, 34(7):1353-1377.

[13]
ESRI, 2018. How GWR works [OL]. https://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-statistics-toolbox/how-gwr-regression-works.htm

[14]
Fischer M M, Getis A, 2010. Handbook of Applied Spatial Analysis. Heidelberg: Springer.

[15]
Fotheringham A S, Brunsdon C, Charlton M, 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. Chichester: John Wiley & Sons.

[16]
Gardner A S, Sharp M, 2009. Sensitivity of net mass-balance estimates to near-surface temperature lapse rates when employing the degree-day method to estimate glacier melt. Annals of Glaciology, 50(50):80-86.

[17]
Gardner A S, Sharp M J, Koerner R M et al., 2009. Near-surface temperature lapse rates over Arctic glaciers and their implications for temperature downscaling. Journal of Climate, 22(16):4281-4298.

[18]
Guo X, Wang L, Tian L, 2016. Spatio-temporal variability of vertical gradients of major meteorological observations around the Tibetan Plateau. International Journal of Climatology, 36(4):1901-1916.

[19]
Harlow R C, Burke E J, Scott R L et al., 2004. Research note: Derivation of temperature lapse rates in semi-arid south-eastern Arizona. Hydrology and Earth System Sciences, 8(6):1179-1185.

[20]
Harrell Jr F E, 2015. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. New York: Springer.

[21]
He Y, Wang K, 2020. Contrast patterns and trends of lapse rates calculated from near-surface air and land surface temperatures in China from 1961 to 2014. Science Bulletin, 65(14):1217-1224.

[22]
Holden J, Rose R, 2011. Temperature and surface lapse rate change: A study of the UK’s longest upland instrumental record. International Journal of Climatology, 31(6):907-919.

[23]
Hurvich C M, Simonoff J S, Tsai C-L, 1998. Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60(2):271-293.

[24]
Jóhannesson T, Sigurdsson O, Laumann T et al., 1995. Degree-day glacier mass-balance modelling with applications to glaciers in Iceland, Norway and Greenland. Journal of Glaciology, 41(138):345-358.

[25]
Kattel D B, Yao T, Panday P K, 2018. Near-surface air temperature lapse rate in a humid mountainous terrain on the southern slopes of the eastern Himalayas. Theoretical and Applied Climatology, 132(3/4):1129-1141.

[26]
Kattel D B, Yao T, Yang K et al., 2013. Temperature lapse rate in complex mountain terrain on the southern slope of the central Himalayas. Theoretical and Applied Climatology, 113(3/4):671-682.

[27]
Kattel D B, Yao T, Yang W et al., 2015. Comparison of temperature lapse rates from the northern to the southern slopes of the Himalayas. International Journal of Climatology, 35(15):4431-4443.

[28]
Kim P, 2017. MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence. New York: Apress.

[29]
Leung Y, Mei C, Zhang W, 2000. Testing for spatial autocorrelation among the residuals of the geographically weighted regression. Environment and Planning A: Economy and Space, 32(5):871-890.

[30]
Li X, Wang L, Chen D et al., 2013. Near-surface air temperature lapse rates in the mainland of China during 1962-2011. Journal of Geophysical Research: Atmospheres, 118(14):7505-7515.

[31]
Li Y, Zeng Z, Zhao L et al., 2015. Spatial patterns of climatological temperature lapse rate in mainland of China: A multi-time scale investigation. Journal of Geophysical Research: Atmospheres, 120(7):2661-2675.

[32]
Lin C, Chang X, 2018. Spatio-temporal variations of surface temperature lapse rate on Qilian Mountains. Advances in Geosciences, 8(3):691-698. (in Chinese)

[33]
Lloyd C D, 2006. Local Models for Spatial Analysis. Boca Raton: CRC Press.

[34]
Lu B, Brunsdon C, Charlton M et al., 2017. Geographically weighted regression with parameter-specific distance metrics. International Journal of Geographical Information Science, 31(5):982-998.

[35]
Marshall S J, Sharp M J, Burgess D O et al., 2007. Near-surface-temperature lapse rates on the Prince of Wales Icefield, Ellesmere Island, Canada: Implications for regional downscaling of temperature. International Journal of Climatology, 27(3):385-398.

[36]
Minder J R, Mote P W, Lundquist J D, 2010. Surface temperature lapse rates over complex terrain: Lessons from the Cascade Mountains. Journal of Geophysical Research: Atmospheres, 115(D14):D14122.

[37]
Mitchell A, 2005. The ESRI Guide to GIS Analysis. New York: ESRI Press.

[38]
Mokhov I I, Akperov M G, 2006. Tropospheric lapse rate and its relation to surface temperature from reanalysis data. Izvestiya, Atmospheric and Oceanic Physics, 42(4):430-438.

[39]
Moran P A P, 1948. The interpretation of statistical maps. Journal of the Royal Statistical Society: Series B (Methodological), 10(2):243-251.

[40]
Ojha R, 2017. Assessing seasonal variation of near surface air temperature lapse rate across India. International Journal of Climatology, 37(8):3413-3426.

[41]
Páez A, Uchida T, Miyamoto K, 2002. A general framework for estimation and inference of geographically weighted regression models: 2. Spatial association and model specification tests. Environment and Planning A: Economy and Space, 34(5):883-904.

[42]
Pepin N, Benham D, Taylor K, 1999. Modeling lapse rates in the maritime uplands of Northern England: Implications for climate change. Arctic, Antarctic, and Alpine Research, 31(2):151-164.

[43]
Petersen L, Pellicciotti F, Juszak I et al., 2013. Suitability of a constant air temperature lapse rate over an Alpine glacier: Testing the Greuell and Böhm model as an alternative. Annals of Glaciology, 54(63):120-130.

[44]
Philip G M, Watson D F, 1982. A precise method for determining contoured surfaces. The APPEA Journal, 22(1):205-212.

[45]
Qin Y, Ren G, Zhai T et al., 2018. A new methodology for estimating the surface temperature lapse rate based on grid data and its application in China. Remote Sensing, 10(10):1617.

[46]
Qu R, Cui X, Yan H et al., 2013. Impacts of land cover change on the near-surface temperature in the North China Plain. Advances in Meteorology, 2013:1-12.

[47]
Rencher A C, Schaalje G B, 2008. Linear Models in Statistics. Hoboken, New Jersey: John Wiley & Sons.

[48]
Rolland C, 2003. Spatial and seasonal variations of air temperature lapse rates in Alpine regions. Journal of Climate, 16(7):1032-1046.

[49]
Shen Y J, Shen Y, Goetz J et al., 2016. Spatial-temporal variation of near-surface temperature lapse rates over the Tianshan Mountains, Central Asia. Journal of Geophysical Research: Atmospheres, 121(23):14006-14017.

[50]
Sun M, Yao X, Li Z et al., 2015. Hydrological processes of glacier and snow melting and runoff in the Urumqi River source region, eastern Tianshan Mountains, China. Journal of Geographical Sciences, 25(2):149-164.

[51]
Wang L, Sun L, Shrestha M et al., 2016a. Improving snow process modeling with satellite-based estimation of near-surface-air-temperature lapse rate. Journal of Geophysical Research: Atmospheres, 121(20):12005-12030.

[52]
Wang J, Zhang T, Fu B, 2016b. A measure of spatial stratified heterogeneity. Ecological Indicators, 67:250-256.

[53]
Wang X, 2015. GIS-based study on temperature lapse rate in mountain areas in China [D]. Nanjing: Nanjing University of Information Science and Technology. (in Chinese)

[54]
Wang Y, Wang L, Li X et al., 2018. Temporal and spatial changes in estimated near-surface air temperature lapse rates on Tibetan Plateau. International Journal of Climatology, 38(7):2907-2921.

[55]
Watson D F, Philip G M, 1985. A refinement of inverse distance weighted interpolation. Geoprocessing, 2(4):315-327.

[56]
Wheeler D, Tiefelsdorf M, 2005. Multicollinearity and correlation among local regression coefficients in geographically weighted regression. Journal of Geographical Systems, 7(2):161-187.

[57]
Wilby R L, Dawson C W, Barrow E M, 2002. SDSM: A decision support tool for the assessment of regional climate change impacts. Environmental Modelling & Software, 17(2):147-159.

[58]
Xie Y, Zhang Y, Lan H et al., 2018. Investigating long-term trends of climate change and their spatial variations caused by regional and local environments through data mining. Journal of Geographical Sciences, 28(6):802-818.

[59]
Yang X, Tang G, Xiao C et al., 2007. Terrain revised model for air temperature in mountainous area based on DEMs: A case study in Yaoxian county. Journal of Geographical Sciences, 17(4):399-408.

[60]
Yang X, Zhang Y, Liu L et al., 2009. Sensitivity of surface air temperature change to land use/cover types in China. Science in China Series D: Earth Sciences, 52(8):1207-1215.

[61]
Zhai D, Bai H, Qin J et al., 2016. Temporal and spatial variability of air temperature lapse rates in Mt. Taibai, Central Qinling Mountains. Acta Geographica Sinica, 71(9):1587-1595. (in Chinese)

[62]
Zhang H, Zhang F, Zhang G et al., 2018. How accurately can the air temperature lapse rate over the Tibetan Plateau be estimated from MODIS LSTs? Journal of Geophysical Research: Atmospheres, 123(8):3943-3960.

[63]
Zhang Z, Ren Z, Zhang Q et al., 2013. Analysis of quality control procedures for hourly air temperature data from automatic weather stations in China. Journal of Meteorology and Environment, 29(4):64-70. (in Chinese)
