Research Articles

Geographical big data and data mining: A new opportunity for “water-energy-food” nexus analysis

  • YANG Jie 1,
  • CAO Xiaoshu 1,*,
  • YAO Jun 2,
  • KANG Zhewen 1,
  • CHANG Jianxia 3,
  • WANG Yimin 3
  • 1. Northwest Land and Resources Research Center, Global Regional and Urban Research Institute, Institute of Transport Geography and Spatial Planning, Shaanxi Normal University, Xi’an 710119, China
  • 2. Hanjiang-to-Weihe River Valley Water Diversion Project Construction Co. Ltd., Shaanxi Province, Xi’an 710010, China
  • 3. State Key Laboratory of Eco-hydraulics in Northwest Arid Region, Xi’an University of Technology, Xi’an 710048, China
*Cao Xiaoshu (1969-), Professor, specialized in human-earth system analysis and territorial space planning. E-mail:

Yang Jie (1991-), Assistant Researcher, specialized in water-energy-food nexus and water resources management. E-mail:

Received date: 2023-05-05

  Accepted date: 2023-11-03

  Online published: 2024-02-06

Supported by

National Natural Science Foundation of China (52209030)

Fundamental Research Funds for the Central Universities (GK202207005)

China Postdoctoral Science Foundation (2023M732163)

Shaanxi Province Postdoctoral Science Foundation (2023BSHYDZZ110)

Abstract

Since the Bonn 2011 conference, the “water-energy-food” (WEF) nexus has aroused global concern to promote sustainable development. The WEF nexus is a complex, dynamic, and open system containing interrelated and interdependent elements. However, the nexus studies have mainly focused on natural elements based on massive earth observation data. Human elements (e.g., society, economy, politics, culture) are described insufficiently, because traditional earth observation technologies cannot effectively perceive socioeconomic characteristics, especially human feelings, emotions, and experiences. Thus, it is difficult to simulate the complex WEF nexus. With the development of earth observation sensor technologies and human activity perception methods, geographical big data covering both human activities and natural elements offers a new opportunity for in-depth WEF nexus analysis. This study proposes a five-step framework by leveraging geographical big data mining to dig for the hidden value in the data of various natural and human elements. This framework can enable a thorough and comprehensive analysis of the WEF nexus. Some application examples of the framework, major challenges, and possible solutions are discussed. Geographical big data mining is a promising approach to enhance the analysis of the WEF nexus, strengthen the coordinated management of resources and sectors, and facilitate the progress toward sustainable development.

Cite this article

YANG Jie, CAO Xiaoshu, YAO Jun, KANG Zhewen, CHANG Jianxia, WANG Yimin. Geographical big data and data mining: A new opportunity for “water-energy-food” nexus analysis[J]. Journal of Geographical Sciences, 2024, 34(2): 203-228. DOI: 10.1007/s11442-024-2202-6

1 Introduction

Water, energy, and food are essential and fundamental resources that relate to human survival, national security, and social stability (Bazilian et al., 2011). Amid rapid social and economic development, the global population is growing while dietary patterns shift and the climate changes. By 2030, the world’s demand for water, energy, and food is projected to increase by 40%, 50%, and 35%, respectively (United States National Intelligence Council, 2012). As water, energy, and food face mounting challenges, these three fundamental resources were listed as global risks in the Global Risks Report issued in January 2011 (World Economic Forum, 2011).
During the Bonn conference in November 2011, Hoff (2011) proposed that water, energy, and food security are critical not just for a single system. These three resources are interlinked with each other in highly complicated ways (Figure 1) (Bazilian et al., 2011). For example, water is a major input for food production, as well as for energy development and utilization (Carvalho et al., 2022). The production, transportation, treatment, and distribution of water and the production, harvest, processing, and transportation of food all depend on energy (Bazilian et al., 2011). The interlinkages among these interrelated resources and sectors are referred to as the “water-energy-food” (WEF) nexus (Hoff, 2011). The WEF nexus concept emphasizes the complex associations and tradeoffs among different resources (Bazilian et al., 2011; Kurian, 2017). Policymaking based on a single-sector approach may fail to achieve sustainable development and may even have a negative impact on other systems (Cai et al., 2018). Cross-sectoral approaches can better spur the harmonious development of resources (Hoff, 2011). The emergence of the WEF nexus has marked a paradigm shift in resource governance from a traditional single-sector strategy to a comprehensive cross-sector strategy (Giupponi and Gain, 2016). Research should break down the barriers and address the global water, energy, and food challenges from the perspective of interlinkages, rather than in isolation.
Figure 1 The interlinkages among water, energy, and food
With the popularization of the nexus concept, the extensibility of the nexus has been increasingly explored by in-depth analysis across various fields, perspectives, and even scales. In 2011, Stockholm Environment Institute stated that the nexus is a potentially effective vehicle to solve resources scarcity (Hoff, 2011). The WEF nexus should further consider urbanization, population growth, and climate change. WEF nexus analysis should take some incentive measures and cross-sectoral management in the social, economic, and environmental fields to improve the efficiency of resources use (Hoff, 2011). In 2013, the United Nations Economic and Social Commission for Asia and the Pacific (UNESCAP, 2013) stated that the nexus is closely linked in time and space to external factors. For example, climate change, rapid urbanization, financial crisis, and excessive consumption would all affect the nexus. On June 20, 2023, a dialogue was organized by the UNFCCC Technology Executive Committee (TEC) in collaboration with the Food and Agriculture Organization of the United Nations (FAO), with contributions from the United Nations Industrial Development Organization (UNIDO). This dialogue focused on the urgent need for a systemic solution that harnesses innovation and technology for WEF nexus analysis. As research continues, globalization, ecological environment, land use, water market, public health, political factors, and technology are constantly included in the WEF nexus framework (Moghadam et al., 2023). The WEF nexus not only underscores the feedback among water, energy, and food, but also takes into account their interaction mechanisms with external factors (Figure 1) (Conway et al., 2015).
The WEF nexus features multi-system (e.g., agriculture, water), multi-scale (e.g., annual, monthly, daily), multi-level (e.g., local, regional, global), multi-factor (e.g., food quality, price, production), holistic, systematic, uncertain, and highly complex characteristics. As it involves many aspects of natural and human sciences, the WEF nexus is within the scope of human and earth systems (i.e., the interaction or feedback between human and earth), and has become a key approach to solve interdisciplinary and complex system problems (Scanlon et al., 2017). In 2015, the 2030 Agenda for Sustainable Development was proposed to manage the earth’s environmental and natural resources through cooperation and to build a world where people and nature coexist in harmony under sustainable development (UN, 2015). The agenda consists of 17 Sustainable Development Goals (SDGs). Among them, WEF nexus analysis can directly help achieve SDG6 (Clean Water and Sanitation), SDG7 (Affordable and Clean Energy), and SDG2 (Zero Hunger). The WEF nexus also interacts with other SDGs to varying degrees. For example, future WEF optimization studies will inevitably involve, at least indirectly, the climate, economic, and environmental issues related to SDG1 (No Poverty), SDG8 (Decent Work and Economic Growth), SDG13 (Climate Action), SDG14 (Life Below Water), and SDG15 (Life on Land). WEF nexus analysis adopts a holistic, integrated, and systemic perspective, allowing it to support the realization of the SDGs, which are interrelated and interdependent across different dimensions of development.
For the WEF nexus, the interlinkages of its internal fundamental resources (i.e., water, energy, food) and external factors (e.g., land use, environment, population growth, human behavior) are expected to be more complicated under climate change and intense human activities (Ajjur and Al-Ghamdi, 2022). Elements associated with the WEF nexus can be divided into two categories: natural elements such as water, soil, atmosphere, and biology (Wang et al., 2018), and human elements such as society, economy, politics, and culture (Wang et al., 2018). Deciphering the interaction mechanisms among the three resources and those related to other factors is a key step towards realizing sustainable development (Giupponi and Gain, 2016; Kurian, 2017). Therefore, numerous studies have explored sustainable adaptation measures with different entry points, such as water resources allocation, sewage discharge, energy structure, carbon tax, human lifestyle, ecosystem service, crop planting structure, and fertilizer application. Quantitative models widely used in WEF nexus analysis include the climate, land-use, energy, and water strategies (CLEWs) (Howells et al., 2013), the multi-scale integrated analysis of societal and ecosystem metabolism (MuSIASEM) (Giampietro et al., 2014), the WEF Nexus Tool 2.0 (Daher and Mohtar, 2015), the water evaluation and planning - long-range energy alternatives planning system (WEAP-LEAP) (Karlberg et al., 2015), the water, energy, and food security nexus optimization model (WEFO) (Zhang and Vesselinov, 2016), the nexus simulation system (NexSym) (Martinez-Hernandez et al., 2017), and the Water-Energy-Food nexus simulation model (WEFSiM) (Wicaksono and Kang, 2018).
Despite outstanding research advances, further analysis of the WEF nexus still faces severe challenges (Figure 2). The first challenge is posed by the complex interaction mechanisms of nexus-related elements. The multi-source data indispensable for quantitative nexus research are hard to obtain in full, and the available data often differ in spatiotemporal scale and vary in quality. Thus, most nexus analyses and models are designed for a particular region and are not universal (Ernst and Preston, 2017). Additionally, previous studies have mainly looked at the interaction mechanisms between two elements (Larkin et al., 2020). The second challenge arises from the presence of multiple natural and human elements in the nexus. Elements considered in existing research are still not comprehensive (Carvalho et al., 2022). The third challenge is that different resources are mostly governed by different departments, and this condition is difficult to change in a short time (Ernst and Preston, 2017). Equity among departments has not been effectively considered, and coordinated planning has not been developed because of the mutually restrictive relationships among resources. This leads to a single-sector, fragmented management mode and hinders unified management of resources (Conway et al., 2015). Last but most significant, many nexus studies pay great attention to natural elements, supported by massive earth observation data. Human elements are less described because traditional earth observation technologies cannot effectively perceive socioeconomic characteristics, especially human feelings, emotions, and experiences (Larkin et al., 2020). Human elements, which affect the evolution and development of the earth system, have a non-negligible impact on the WEF nexus. For example, Momblanch et al. (2019) indicated that the socioeconomic impact on the WEF nexus is greater than that of climate change. Without considering human elements, it is difficult to move from theory to practice and to provide effective adaptation measures for government-led resources management (Biggs et al., 2015).
Figure 2 Challenges of the “water-energy-food” (WEF) nexus
To break down the barriers and coordinate the development among various departments, there is a need to strengthen data monitoring, collection, processing, and sharing. Data from different sources, regions, and scales should be integrated into a unified framework for better model construction. It is also imperative to integrate more human elements into the WEF nexus and build a bridge between natural and human elements. The next step is to establish government examination, dialogue, negotiation, and cooperation mechanisms among various departments and regions through legal or institutional constraints (Kurian, 2017). On this basis, great effort is needed to superimpose the impacts of human processes and climate change, identify the tradeoffs between natural and human elements, and unravel the complex interaction mechanisms among nexus-related elements. The outcomes would form the foundation for formulating collaborative management policies from an overall system integration perspective and promoting the efficient utilization and sustainable development of resources.
This study proposes a framework that leverages geographical big data mining to enhance WEF nexus analysis covering both natural and human elements. Examples are provided to demonstrate the applicability of the framework in addressing some key issues in WEF nexus analysis. The major challenges and possible solutions for WEF nexus analysis based on geographical big data mining are identified. The paper is organized as follows. Section 2 introduces the geographical big data mining technology. Sections 3 and 4 present the framework of geographical big data mining and its application examples in WEF nexus analysis. Section 5 discusses the challenges and possible solutions for applying the proposed framework. Section 6 concludes with some remarks.

2 Geographical big data mining

WEF nexus analysis is challenged by the fact that most studies have focused on natural elements, with less consideration of human elements. On the one hand, massive earth observation data allow analyzing and modeling the changes in geographical elements and assessing the impacts of sudden natural disaster events. On the other hand, traditional earth observation sensor technologies, such as remote sensing, use spectral signatures to acquire information about ground objects. These technologies cannot effectively perceive socioeconomic and environmental characteristics, especially human feelings, emotions, experiences, beliefs, and thoughts (Larkin et al., 2020).
In this study, geographical big data mining is adopted as a potential approach to better analyze the WEF nexus and support the achievement of the SDGs. It involves the application of data mining techniques to geographical big data for addressing various problems or questions. This approach could provide large-scale, high-resolution, and multi-dimensional data sources and analytical methods. It is not a new technique but a combination of geographical big data and data mining techniques that have wide application in various fields. Examples include the analyses of human behavior predictability (Song et al., 2010), population displacement affected by earthquake (Lu et al., 2012), global surface water change (Pekel et al., 2016), commuting time (Huang et al., 2018), and flood exposure risk (Tellman et al., 2021). However, there is still a scarcity of studies assessing the potential of geographical big data mining for application in WEF nexus analysis.

2.1 Geographical big data

The advent of big data allows people to automatically receive real-time, continuously updated, and long time series data through the Internet. As data collection has shifted from professionals to the general public, people are now not only data receivers but also data producers (Wang et al., 2022). The term geographical big data refers to extraordinarily large and complex datasets that contain geographical spatiotemporal information on location, time, space, and attribute (Guo et al., 2022). Geographical big data can be divided into earth observation big data and human behavior big data according to different sensors and objects (Figure 3) (Pei et al., 2020).
Figure 3 Classification of geographical big data
Earth observation big data describes natural elements and is mainly recorded in an active manner by earth observation sensor technologies, such as satellite remote sensing, surface monitoring sensors, aerial survey, and aerial photography (Huang et al., 2021). Human behavior big data is information on human behaviors, including movement, consumption, and social interaction, which is recorded passively, mainly through human activity perception methods. These methods are represented by social check-in, navigation travel, logistics and delivery, mobile positioning, mobile terminal, expenditure, chip card, and communication log (Huang et al., 2021). Geographical big data enables us to depict geographical features (e.g., atmosphere, land, ocean) and record socioeconomic activities (e.g., human behaviors, emotions, public opinions) in a holistic, detailed, diversified, and real-time manner (He et al., 2018; Guo et al., 2022). Importantly, it has created possibilities for further analysis of the WEF nexus state, evolution, and cascade effect, as well as the harmony between humans and nature and sustainable development.
In contrast with traditional purposeful sampling data (or small data), geographical big data is characterized by the typical 5Vs (volume, velocity, variety, value, veracity) of big data (Marr, 2015; Guo et al., 2022). Volume indicates the large data size. Velocity refers to the benefits of fast data generation and spread from many intelligent terminal devices and the Internet. Variety represents the diversity of data source types, including structured, semi-structured, and unstructured data forms (e.g., network logs, videos, pictures, location information). Value indicates the huge value of big data that allows people to measure and understand the world in multiple dimensions. Veracity means that compared to traditional sampling survey, big data analysis can better reveal the real world when the data quality is well controlled.
In addition to the 5V characteristics, geographical big data has the following advantages over small data. (1) Geographical big data is characterized by finer granularity, i.e., smaller size of geographical information unit (Li et al., 2016), which can enable us to observe geographical phenomena from a microscopic perspective. For example, the statistical data of previous population surveys in China are primarily recorded at the street or township level. Based on the Tencent location big data, Chen et al. (2022) analyzed urban vibrancy to reveal the intensity of human activities. (2) In the era of big data, data in a larger area (nationwide or even global) can be derived while maintaining finer granularity compared to small data mainly collected in a local area. For example, Pekel et al. (2016) quantified long-term global surface water change at a 30-m resolution using three million Landsat satellite images. (3) Geographical big data features higher density with finer granularity, which allows geographical phenomena to be observed in a more detailed and realistic manner. For example, He et al. (2018) indicated that the shortcomings of traditional questionnaire data include sparse density and narrow scopes compared to big data.

2.2 Data mining

The ultimate purpose of big data, including geographical big data, is to extract hidden, unknown, and potentially useful knowledge from data. It is essential to identify the rules within big data. Data mining is the process or technique of extracting useful information from large and complex datasets and is currently closely associated with computer science. Hidden valuable information and knowledge that are not evident or easily accessible can be mined by using statistics, online analytical processing, intelligence retrieval, machine learning, expert systems, visualization techniques, and pattern recognition. Thus, data mining is popular in the field of artificial intelligence (Gladju et al., 2022). Machine learning algorithms are the main techniques to analyze massive data and achieve the data mining purpose (Gladju et al., 2022). Commonly used machine learning algorithms include k-nearest neighbor, artificial neural network, support vector machine, Bayesian network, and random forest (Jin et al., 2020).
Data mining is an appropriate method to enhance WEF nexus analysis. It can pinpoint the anomalies deviating from normal patterns in a single sector, such as water quality or energy demand. Data mining also facilitates the clustering of WEF nexus data based on similarity, and identifies the association rules between two or among more independent variables, such as those between irrigation water and food production. Moreover, this method is able to demonstrate the impact of climate change on the WEF nexus (Yang et al., 2023).
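As a rough illustration of the association rules mentioned above, the following Python sketch computes support, confidence, and lift for a rule such as "high irrigation implies high yield". The records, their labels, and the function names are hypothetical and invented for this example; they are not taken from the studies cited here.

```python
# Hypothetical discretized records: each set lists the conditions observed
# in one region-year (labels are illustrative assumptions only).
records = [
    {"high_irrigation", "high_yield", "high_energy_use"},
    {"high_irrigation", "high_yield"},
    {"low_irrigation", "low_yield"},
    {"high_irrigation", "high_yield", "high_energy_use"},
    {"low_irrigation", "high_yield"},
    {"high_irrigation", "low_yield", "high_energy_use"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= r for r in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), records)
    s_ante = support(antecedent, records)
    s_cons = support(consequent, records)
    confidence = s_both / s_ante if s_ante else 0.0
    lift = confidence / s_cons if s_cons else 0.0
    return s_both, confidence, lift


rule_support, rule_confidence, rule_lift = rule_metrics(
    {"high_irrigation"}, {"high_yield"}, records
)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

A lift above 1 suggests that the two conditions co-occur more often than they would if independent. In practice, continuous variables such as irrigation volume would first need to be discretized, and that choice can strongly affect the rules found.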

3 Framework of geographical big data mining for WEF nexus analysis

Considering the great value and service capabilities of geographical big data mining, this study proposes a framework that leverages geographical big data mining technique to enhance WEF nexus analysis (Figure 4). The aim of the framework is to comprehensively reveal the interaction mechanisms and dynamic evolution trends of various natural and human elements in the WEF nexus.
Figure 4 Framework of geographical big data mining in WEF nexus analysis

3.1 Data collection, storage, and processing

The first step of the framework is data collection, storage, and processing, whereby diverse and heterogeneous data are collected from different sectors (including constraints and objectives) and integrated into the WEF nexus system. The WEF nexus is a highly complicated system involving various natural and human elements (Biggs et al., 2015). As the nexus is strongly affected by both climate change and human activities (Momblanch et al., 2019), WEF nexus analysis needs to be supported by large amounts of data. Therefore, the first priority is the integrated use of techniques such as surveying, mapping, remote sensing, spatial statistics, communication, and artificial intelligence to monitor natural and human elements dynamically and quantitatively. The data should be collected over different scales (global, regional, or local) due to complex application scenarios.
Integration of massive multi-source data is the key for subsequent discovery of potential knowledge and decision-making. To enhance WEF nexus analysis from both natural and human perspectives, the following data should be collected. (i) Remote sensing platforms and sensors provide spatially explicit and temporally consistent natural data, including land use, vegetation index, soil moisture, crop production, irrigation efficiency, precipitation, and water quality (Wang et al., 2023). These data can enable the assessment of the impact on the WEF nexus posed by climate change or human activities, such as drought, deforestation, or urbanization. (ii) Social media offer rich and timely human data, including human behaviors, emotions, or preferences (Campana and Delmastro, 2022). These data support the monitoring of the impact of water scarcity, energy crisis, and food insecurity on the WEF nexus. (iii) Mobile devices supply fine-grained and real-time data, including mobile user location, payment, and movement (Campana and Delmastro, 2022). These data facilitate the analysis of the impact of traffic jam, air pollution, and noise pollution on the WEF nexus.
The collected data need to be stored in a timely manner. Database selection should be based on the data characteristics. For example, a non-relational database is more suitable for storing web-based data than a relational database (Chandra, 2015). Furthermore, it is necessary to consider the interaction techniques and visualization methods of physical storage devices and unified storage platforms for integrated data storage. Notably, the collected and stored big data contain certain problems, such as noise, outliers, errors, biases, missing values, duplicates, or inconsistencies arising from errors in measurement, transmission, and processing (Abdul-Rahman et al., 2021). In this case, data processing, including cleaning, debiasing, validation, integration, reduction, conversion, rejection, and filtering, is required (Upadhyay, 2022). Detailed methods are exemplified by noise filtering, outlier detection, error correction, data resampling, data normalization, bias mitigation, range check, consistency check, and redundancy check (Tax et al., 2022). Data processing helps improve the data quality and the credibility of future decisions.
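A few of the processing steps listed above (redundancy check, range check, gap filling, and normalization) can be sketched in Python as follows. This is only a minimal illustration: the station records, the assumed valid range for soil moisture, and the choice of median filling are all hypothetical and not prescribed by the framework.

```python
from statistics import median

# Hypothetical raw records: (station_id, day, soil_moisture in m3/m3).
raw = [
    ("A", 1, 0.21), ("A", 2, 0.23), ("A", 2, 0.23),  # duplicate record
    ("A", 3, None),                                   # missing value
    ("A", 4, 0.25), ("A", 5, 1.70),                   # 1.70 fails the range check
    ("A", 6, 0.24),
]

VALID_RANGE = (0.0, 1.0)  # assumed physical bounds, for illustration only


def clean(records, valid_range=VALID_RANGE):
    """Drop duplicate records, mask out-of-range values, and fill gaps with the median."""
    seen, deduped = set(), []
    for station, day, value in records:
        if (station, day) in seen:  # redundancy check
            continue
        seen.add((station, day))
        if value is not None and not (valid_range[0] <= value <= valid_range[1]):
            value = None  # range check: treat the value as missing
        deduped.append((station, day, value))
    valid = [v for _, _, v in deduped if v is not None]
    fill = median(valid)
    return [(s, d, v if v is not None else fill) for s, d, v in deduped]


def min_max_normalize(values):
    """Rescale values linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


cleaned = clean(raw)
normalized = min_max_normalize([v for _, _, v in cleaned])
```

Median filling is only one simple option; interpolation in time or space may be more appropriate for strongly autocorrelated variables.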

3.2 Spatiotemporal distribution and outlier analysis

The second step of the framework is spatiotemporal distribution and outlier analysis. To understand the complex rules of natural factors and human activities, spatiotemporal distribution analysis identifies the potential and valuable spatiotemporal patterns and unravels the evolutionary trends over time, such as location, distance, direction, duration, frequency, or sequence from the monitored and collected multi-source data (Wang et al., 2022). It also involves finding the aggregation and distribution characteristics of geographical entities with similar thematic attributes (e.g., precipitation, water resources per capita, temperature) in time and space (Xiong et al., 2020). Commonly used data-driven mining methods for the distribution analysis include regression model, auto-covariance, goodness of fit, and variogram for temporal analysis, as well as K-means, K-medoids, and R-tree clustering or classification for spatial analysis.
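To make the clustering idea concrete, the sketch below implements a basic K-means (Lloyd's algorithm) in plain Python and applies it to a handful of hypothetical regions described by two standardized attributes, for instance precipitation and water resources per capita. The data values and the choice of initial centroids are assumptions made for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical standardized attributes for six regions.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids)
```

Fixed initial centroids keep the example deterministic. Real applications usually rely on randomized or k-means++ initialization and on standardizing attributes measured in different units.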
Spatiotemporal outlier analysis identifies a small number of entities deviating from the overall or local distribution patterns due to non-observational errors from the massive spatiotemporal database (Hawkins, 1980). While outlier detection is traditionally conducted from a single perspective, such as time, space, or attribute (Hodge and Austin, 2004), this concept is extended by the spatiotemporal outlier analysis of geographical big data. The deviation can be a “dynamic flow space” or “multi-dimensional scene space” outlier from the combination of time, space, and attribute to describe the variation in a geographical entity during evolution (Telang et al., 2014). Examples include outlier objects and abnormal movement behavior in trajectory big data, in addition to abnormal load of spatial interactive travel flow. The results of spatiotemporal outlier analysis could provide theoretical basis and practical guidance for thoroughly characterizing the unique distribution, variation, or potential development patterns of geographical phenomena or processes. Available data mining methods for the outlier analysis include long-range correlation, Lyapunov exponent, spatiotemporal scan, isolation forest, kernel density estimation, likelihood-ratio test, and consistency test, which detect the anomalies in terms of frequency and proximity.
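One of the methods named above, kernel density estimation, can be sketched as a simple outlier score: observations lying in sparsely populated parts of the combined space-time domain receive a low density. The sample points, the bandwidth, and the flagging threshold (10% of the median density) below are all illustrative assumptions.

```python
import math
from statistics import median


def kde_scores(points, bandwidth):
    """Unnormalized Gaussian kernel density at each point, leaving the point itself out.

    The normalizing constant is omitted because only the ranking matters here.
    """
    scores = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(p, q))
                total += math.exp(-d2 / (2 * bandwidth ** 2))
        scores.append(total / (len(points) - 1))
    return scores


# Hypothetical observations as (x, y, t), already rescaled to comparable units.
points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (0.0, 0.1, 0.2),
    (0.1, 0.1, 0.3), (0.05, 0.05, 0.4), (3.0, 3.0, 0.2),
]
scores = kde_scores(points, bandwidth=0.5)
threshold = 0.1 * median(scores)  # assumed cut-off, chosen for illustration
flagged = [i for i, s in enumerate(scores) if s < threshold]
print(flagged)
```

Treating time as a third coordinate is a simplification. Space and time usually need separate bandwidths, and the brute-force pairwise loop would have to be replaced by an indexed or approximate method for large datasets.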
The methods and purposes of data mining vary with data sources. To extract valuable data for WEF nexus analysis, classification, clustering, and anomaly detection are mainly applied to remote sensing data and sensor data (Nwaila et al., 2022). Social media data are processed by text mining, sentiment analysis, and topic modeling, whereas mobile device data are analyzed through trajectory mining, mobility pattern mining, activity recognition, and anomaly detection (Dong and Wu, 2015).

3.3 Spatiotemporal interlinkage analysis

The third step of the framework is spatiotemporal interlinkage analysis to reveal and quantify the interactions, feedbacks, drivers, and impacts around the WEF nexus. As an open and complex system, the WEF nexus contains human and natural elements that are closely associated with each other (Momblanch et al., 2019). The interlinkages among various elements are dynamic, complex, nonlinear, and unpredictable (Hoff, 2011). Spatiotemporal interlinkage analysis identifies specific frequent spatiotemporal correlations (e.g., relevance, dependence, heterogeneity, nonlinearity) from the dataset (Naidoo et al., 2021; Bruns et al., 2022).
In addition to the interlinkages among natural elements, the analysis should especially consider the natural responses to human activities across time and regions in the WEF nexus system. This means that the interlinkages should proceed from the combination of time, space, and attribute in a dynamic manner. An example is provided by Pei et al. (2014) who analyzed the interlinkages between mobile data and land use data in a certain area and period. Overall, the interlinkage analysis is of importance for revealing the complex essential characteristics, interdependence, interactions, and evolutionary trends of natural and human elements. The outcomes could contribute to a better understanding of the complicated human and earth systems.
Spatiotemporal statistics is a traditional method primarily used to describe the relationship between variables through statistical inference based on historical data (Hofman et al., 2022). Commonly used methods for dealing with the spatiotemporal dependence include geostatistics, regression, and principal component analysis. Due to the complex and nonlinear interlinkages within the WEF nexus, the traditional spatiotemporal statistical models seem not sufficient (Premsagar et al., 2022). Spatiotemporal statistical models coupled with artificial intelligence, such as machine learning algorithms, may be feasible to assess the qualitative and quantitative cascade effect of events and achieve internal multi-dimensional relationship mining.
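As a very simple stand-in for the traditional statistical methods mentioned above, the following sketch computes the Pearson correlation between a driver series and a response series at several time lags. It is not one of the specific methods named in the text (geostatistics, regression, principal component analysis), and the irrigation and yield series are synthetic, constructed so that yield responds to irrigation with a one-step delay.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlation(driver, response, max_lag):
    """Correlate driver[t] with response[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]
        y = response[lag:]
        result[lag] = pearson(x, y)
    return result


# Synthetic series: yield at step t equals 2 * irrigation at step t-1 plus 1.
irrigation = [10, 12, 9, 15, 11, 14, 8, 13]
crop_yield = [20, 21, 25, 19, 31, 23, 29, 17]

corr = lagged_correlation(irrigation, crop_yield, max_lag=2)
best_lag = max(corr, key=corr.get)
print(best_lag, corr)
```

A linear measure like this captures only part of the picture, which is consistent with the point made above that nonlinear interlinkages may call for machine learning models.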
Machine learning is a discipline that specializes in studying how computers simulate or realize human learning behaviors to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve their own performance (Azzam et al., 2022). In short, machine learning refers to smart programs that learn from samples. Trained machine learning models can often achieve faster simulation of association rules between variables without sacrificing too much accuracy (Arcomano et al., 2020). This method could improve the efficiency of sensitivity analysis and model parameter calibration. The combination of artificial intelligence and spatiotemporal statistics in data mining has provided a new powerful tool for nexus interlinkage analysis. For example, Li et al. (2021a) developed an integrated technology-environment-economics model based on an artificial neural network and statistical method to simulate the WEF nexus in central Illinois, USA.

3.4 Spatiotemporal collaborative optimization and prediction analysis

The fourth step of the framework is spatiotemporal collaborative optimization and prediction analysis. The purpose is to optimize and predict the allocation and utilization of resources in the WEF nexus under different scenarios, constraints, and objectives based on the co-evolution laws of multiple elements (Saray et al., 2022). The key to this analysis is implementing a complex and dynamic coupled WEF nexus model (Peña-Torres et al., 2022). The coupled model is composed of multiple internal modules, such as water, economy, environment, food, energy, and human behavior modules, which allows for better adaptive policymaking under the uncertainties of climate change and human activities (Karlberg et al., 2015).
The optimization analysis is used to optimize the tradeoffs and propose the overall optimal or near-optimal control scheme among different resources and sectors in the WEF nexus system under various constraints and objectives. For example, the objectives could be minimizing water consumption, maximizing energy production, ensuring food security, and mitigating greenhouse gas emissions. To address these objectives, various optimization and artificial intelligence methods, such as linear programming, genetic algorithms, and multi-objective optimization, are used. This analysis may provide feasible solutions for the management and allocation of the WEF nexus system considering tradeoffs, synergies, and co-benefits.
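A core notion in multi-objective optimization is the non-dominated (Pareto) set: options that cannot be improved on one objective without worsening another. The sketch below filters a handful of candidate allocation schemes by that criterion. The option names and numbers are invented for illustration, and a real study would generate candidates with a solver or a genetic algorithm rather than list them by hand.

```python
def dominates(a, b):
    """True if scheme a is no worse than b on every objective and strictly better on one.

    Objectives (assumed): water use and GHG emissions are minimised,
    energy output is maximised.
    """
    no_worse = (a["water"] <= b["water"] and a["energy"] >= b["energy"]
                and a["ghg"] <= b["ghg"])
    better = (a["water"] < b["water"] or a["energy"] > b["energy"]
              or a["ghg"] < b["ghg"])
    return no_worse and better

def pareto_front(options):
    """Keep only the schemes that no other scheme dominates."""
    return [o for o in options if not any(dominates(p, o) for p in options)]

# Hypothetical allocation schemes (arbitrary units).
options = [
    {"name": "A", "water": 120, "energy": 80, "ghg": 50},
    {"name": "B", "water": 100, "energy": 80, "ghg": 45},
    {"name": "C", "water": 90,  "energy": 60, "ghg": 40},
    {"name": "D", "water": 150, "energy": 95, "ghg": 70},
    {"name": "E", "water": 110, "energy": 70, "ghg": 60},
]

front = pareto_front(options)
print([o["name"] for o in front])  # → ['B', 'C', 'D']
```

Schemes A and E drop out because B is at least as good on all three objectives. The remaining schemes represent genuine tradeoffs that decision-makers must weigh.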
The prediction analysis is performed to predict the future trends and changes in different resources and sectors in the WEF nexus system under various scenarios and assumptions, such as climate change scenarios, socioeconomic scenarios, and policy scenarios. Various artificial intelligence mining and spatiotemporal statistical methods are used in this analysis. For example, Campana et al. (2018) adopted a crop yield regression model coupled with the genetic algorithm to manage agricultural drought from a WEF nexus perspective. This analysis aims to strengthen future adaptation capabilities and underpin the nexus security under future uncertainties through multivariate data fusion (Ren et al., 2022). It may provide effective and reliable forecasts for the planning and policy of the WEF nexus system.
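Scenario-based prediction can be illustrated with the simplest statistical model, an ordinary least-squares line fitted to historical data and then evaluated under assumed future conditions. The rainfall and yield figures and the scenario names below are invented, and this sketch is not the model of Campana et al. (2018), which additionally couples the regression with a genetic algorithm.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history: growing-season rainfall (mm) and crop yield (t/ha).
rainfall = [300, 350, 400, 450, 500]
crop_yield = [3.1, 3.6, 4.0, 4.6, 5.0]
slope, intercept = fit_line(rainfall, crop_yield)

# Hypothetical future rainfall under three scenarios.
scenarios = {"dry": 320, "baseline": 400, "wet": 480}
predictions = {name: slope * mm + intercept for name, mm in scenarios.items()}

for name, value in predictions.items():
    print(f"{name}: {value:.2f} t/ha")
```

A linear fit extrapolates poorly outside the observed range, which is one reason the text recommends combining statistical and machine learning methods with multivariate data.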
There is a progressive relationship among spatiotemporal distribution and outlier analysis, interlinkage analysis, and collaborative optimization and prediction analysis. Specifically, the spatiotemporal partitioning and variation structure of the WEF nexus is extracted by spatiotemporal distribution and outlier analysis, which could lead to the construction of a local prediction model. Then, the mechanisms of action of the factors influencing the WEF nexus are thoroughly explored using spatiotemporal interlinkage analysis to assist in the selection of co-variables of the prediction model. Further, based on the knowledge of spatiotemporal distributions and interlinkages, the local spatiotemporal prediction model with multi-variable adaptive coordination is established for future policy-making.

3.5 Visual decision support system construction

The fifth step of the framework is constructing a visual decision support system to visualize and communicate the WEF nexus results to stakeholders or policy-makers in an intuitive and interactive way. Various visual elements and components, such as maps, charts, dashboards, and indicators, are used. The WEF nexus is subject to great uncertainties regarding climate change, human behavior, the economy, and trade disputes. To better respond to emergencies and make reasonable decisions, it is essential to establish a computer-based visual decision support system, which refers to intelligent, visual, interactive, dynamically evolving, near-real-time, and big data-driven simulation on computers (Talari et al., 2022). This system combines the strengths of humans (perception, personalized experience, hypothesis, and reasoning) with the speed and accuracy of computers in processing massive data (McMeekin et al., 2006).
Through comprehensive panoramic display of big data on the visual interactive interface, decision-makers can accurately know the spatial change, temporal change, and dynamic evolution trend of multi-source elements, as well as their complex dynamic interlinkages (Brazález et al., 2022). More importantly, uncertainties should be taken into account for the visual decision support system. Decision-makers could intuitively judge the consequences, advantages, and disadvantages of different decisions under various uncertainties (Cremen et al., 2022). Thus, the visual decision support system serves adaptive policy-making corresponding to natural and human elements, such as environmental and socioeconomic changes.
For WEF nexus analysis, geographic information system combined with various data mining techniques, such as geocoding, georeferencing, geoprocessing, geovisualization, rule-based expert systems, or artificial neural networks, helps realize the nexus visualization (Palchevsky et al., 2023). This decision support system could capture, store, manipulate, analyze, manage, and present geographical big data around the WEF nexus as supported by geographic information system (Palchevsky et al., 2023). Additionally, it helps visualize and communicate the WEF nexus by providing interactive and intuitive maps or charts that show the spatial distribution or variation of WEF resources or their interlinkages. As such, actionable and adaptive recommendations or solutions for the allocation and management of WEF resources under different scenarios and objectives can be created.

4 Examples of geographical big data mining for WEF nexus analysis

A few examples are provided to demonstrate the applicability and usefulness of the framework of geographical big data mining in addressing some key issues for WEF nexus analysis. The WEF nexus is influenced by multiple external factors that can be classified into two categories: climate change and human activities (Ajjur and Al-Ghamdi, 2022). Climate change primarily affects the supply side of water, energy, and food, mainly through negative impacts. In contrast, human activities mainly affect the demand side of the WEF nexus, including demand level and demand structure. For example, population growth and economic development lead to increased demand for resources, whereas improvement of living standards changes the demand structure of food. The WEF nexus also has an impact on the environment (Biggs et al., 2015). Excessive utilization of water resources would lead to water shortage, water pollution, and ecosystem degradation. Harmful gases emitted from energy development negatively affect the atmosphere and human health. Food production activities are likely to cause the degradation of land resources, further resulting in grave consequences, such as desertification and sandstorms.
Urban population, industry, housing, supporting service facilities, and transportation infrastructure are highly dense. Cities have become the center of global resources convergence and consumption (Chen and Chen, 2016). However, the internal resources within a city cannot meet the needs of urban development. While urban water resources mainly rely on inter-regional allocation, energy and food are mainly produced outside the city. Thus, cities are highly vulnerable to the adverse effects of climate change (Bieber et al., 2018). For example, cities are prone to flood disasters and heat waves. This further has a negative impact on the environment and restricts the socioeconomic development of cities. In the process of urbanization, climate change and environment protection should be considered for the WEF nexus. How to adjust the urbanization patterns for adaptation to climate change while ensuring the security of the WEF nexus and environment will be the key to sustainable development in the future. Thus, the impact of external factors on the urban WEF nexus and the impact of the urban WEF nexus on the environment are introduced in the following sections.

4.1 Impact of external factors on the urban WEF nexus

(1) Data collection and preparation
The starting point of adjusting the urbanization patterns for adaptation to climate change is data collection and preparation. An integrated database covering basic resources such as water, energy, food, and land, as well as external factors such as population, industrial structure, and climate information should be established. These data may include:
Climate data: climate scenarios (e.g., RCP2.6, RCP4.5, RCP8.5), precipitation, temperature, and humidity;
Water data: runoff, groundwater, and water quality;
Energy and food data: energy or food production, consumption, and price;
Geographic information data: digital elevation model, digital raster graphic, land use, satellite image, and aerial photograph;
Mobile data: mobile signaling, bicycle sharing, and credit-card consumption;
Social media data: social network site, WeChat, review site, and forum;
Socioeconomic data: population, economy, society, and city construction;
Internet big data: web browsing information and search keyword;
Ground observation data: forest, geology, and environment;
Internet of things and real-time monitoring data: traffic, logistics, power system, and water network.
These data can be obtained from various sources, such as official statistics, surveys, remote sensing, internet platforms, and sensors. Data should be processed by cleaning, debiasing, and validation as described in Section 3.1.
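As a rough illustration of the cleaning and validation stage, the sketch below screens a small batch of rainfall records for duplicates, missing values, and physically implausible readings. The record layout, field names, and the 0 to 500 mm plausibility range are assumptions made for this example and do not reproduce the specific procedures of Section 3.1.

```python
def clean_records(records, valid_range=(0.0, 500.0)):
    """Split records into kept and rejected, with a reason for each rejection."""
    low, high = valid_range
    seen = set()
    kept, rejected = [], []
    for rec in records:
        key = (rec["station"], rec["date"])
        value = rec.get("precip_mm")
        if key in seen:
            rejected.append((rec, "duplicate"))
        elif value is None:
            rejected.append((rec, "missing"))
        elif not low <= value <= high:
            rejected.append((rec, "out_of_range"))
        else:
            kept.append(rec)
            seen.add(key)  # only accepted records count when detecting duplicates
    return kept, rejected

# Hypothetical daily precipitation records from several stations.
records = [
    {"station": "S1", "date": "2020-07-01", "precip_mm": 12.5},
    {"station": "S1", "date": "2020-07-01", "precip_mm": 12.5},   # repeated upload
    {"station": "S2", "date": "2020-07-01", "precip_mm": None},   # sensor gap
    {"station": "S3", "date": "2020-07-01", "precip_mm": -4.0},   # impossible value
    {"station": "S4", "date": "2020-07-01", "precip_mm": 30.0},
]

kept, rejected = clean_records(records)
print(len(kept), [reason for _, reason in rejected])
```

Keeping the rejected records with their reasons, rather than silently dropping them, makes it easier to audit how much of each source survives cleaning.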
(2) Single-factor characteristic analysis and multi-factor interlinkage analysis
The urban WEF nexus is a complex system characterized by multiple participants and influencing factors. WEF security risks arise not only from internal linkages but also from external factors, such as climate change. Owing to the comprehensive characteristics of urbanization, its multi-dimensional elements, including water, energy, food, population, land, economy, and society should maintain a synergistic relationship. This calls for a quantitative assessment of the evolution and synergies of these elements in the urbanization process, which may involve time-space lag relationship, causal relationship, spillover effect, and driving mechanisms.
From the perspective of time, it is necessary to first characterize single elements, such as clarifying water quality status and change trend, tracking water quality changes in a timely manner, and realizing a holistic judgment of the water environment. Then, the interlinkages among various elements should be analyzed by integrating traditional data with geographical big data mining techniques. For example, one can associate urban water consumption with water users (e.g., population growth, living conditions, water usage habits, water price, value concepts) for urban water supply system planning, or identify the characteristics and changing rules of land use and human activities for urban fine management. Other examples include assessing the influence of energy exploitation on water consumption and the effects of precipitation, temperature, evapotranspiration, and technology on crop water requirements.
Spatial characteristics should also be taken into account for synergy, that is, to gradually realize spatial equality and justice. In the urbanization process, it is crucial to consider the scale of actual acceptance and realization of population citizenship across regions, and promote the orderly flow and coordinated development of various elements in space. For example, with the support of methods such as clustering analysis, association rule mining, and gravity modeling, geographical big data collected in step (1) can be used to excavate the characteristics of urban natural resources distribution and socioeconomic activities, and realize the fine mapping of urban population distribution, function centers, and vitality. Further, this allows us to analyze the relationship between natural resources and population, economy, transportation; evaluate the quality of urban growth; coordinate the allocation relationship between population, industry, efficiency, and natural resources; reduce the environmental cost of socioeconomic operation; and improve the efficiency of government decision-making.
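Of the methods named above, gravity modeling is the easiest to show compactly. In its basic form, the interaction between two places is proportional to the product of their "masses" (here, population) and inversely proportional to a power of the distance between them. The district names, populations, distances, and the parameters `k` and `beta` below are all illustrative assumptions. In practice the parameters would be calibrated against observed flows such as mobile signaling data.

```python
def gravity_flow(mass_i, mass_j, distance_km, k=1.0, beta=2.0):
    """Basic gravity model: k * M_i * M_j / d_ij ** beta."""
    return k * mass_i * mass_j / distance_km ** beta

# Hypothetical districts: population in units of 10,000 persons.
population = {"core": 120.0, "suburb": 60.0, "periurban": 30.0}
# Hypothetical pairwise distances in km.
distance = {
    ("core", "suburb"): 10.0,
    ("core", "periurban"): 30.0,
    ("suburb", "periurban"): 20.0,
}

flows = {
    pair: gravity_flow(population[pair[0]], population[pair[1]], d)
    for pair, d in distance.items()
}
strongest = max(flows, key=flows.get)
print(flows)
print("strongest link:", strongest)
```

The relative sizes of the estimated interactions, rather than their absolute values, are what would inform the mapping of function centers and the coordination of resources across districts.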
(3) Optimization and prediction analysis to enhance urban resilience and adaptation
The dual impacts of climate change and human activities on the WEF nexus are increasing over time. Therefore, this necessitates an optimization analysis of the WEF nexus under the dual impacts to improve the resilience and adaptation of urban systems across spatiotemporal scales. With regard to climate change, climate models should be first introduced to analyze the changes in natural resources, industrial and agricultural production and consumption, social economy, and natural environment under various climate change scenarios, and assess the impact of corresponding decisions on the WEF nexus based on the climate change impact assessment. Then, optimization analysis and modeling methods should be used to optimize tradeoffs and synergies among various resources and sectors under different climate scenarios and objectives, such as minimizing water stress, maximizing water use efficiency, and ensuring water quality. This also provides decision support for future WEF analysis and city planning and management adaptation decision-making.
With regard to human activities, urban flows of traffic, people, logistics, information, technology, and capital are constantly intensifying. These “flows” are major driving forces that influence the WEF nexus. Such “flows” reflect various human activities and should be included in the optimization and prediction analysis coupled with the climate change impact assessment. Thus, more scenarios of human activities need to be considered in the optimization models under different climate scenarios. For example, information technologies, including mobile Internet, Internet of things, artificial intelligence, spatial perception, and cloud computing, can be fully applied to analyze the data received and develop corresponding models that can predict future population migration. Then, through the interlinkage analysis in Step (2), population migration scenarios can be integrated with climate change scenarios to assess the changing trends in other factors influenced by these dual impacts. Integrated analysis of climate change and human activities is conducive to providing a more comprehensive picture of future dynamic changes in the WEF nexus.
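Integrating climate scenarios with population migration scenarios amounts to evaluating every combination of the two. The sketch below does this for a hypothetical urban water balance. The supply and demand multipliers and the baseline volumes are invented placeholders, not outputs of any climate or migration model.

```python
from itertools import product

# Hypothetical multipliers: climate scenarios scale supply, migration scales demand.
climate_supply = {"RCP2.6": 0.98, "RCP4.5": 0.95, "RCP8.5": 0.90}
migration_demand = {"low": 1.02, "medium": 1.08, "high": 1.15}

base_supply = 1000.0   # assumed baseline annual water supply (10^6 m^3)
base_demand = 900.0    # assumed baseline annual water demand (10^6 m^3)

# Water balance (supply minus demand) for every climate x migration combination.
balance = {}
for (climate, s_mult), (migration, d_mult) in product(
        climate_supply.items(), migration_demand.items()):
    balance[(climate, migration)] = base_supply * s_mult - base_demand * d_mult

worst = min(balance, key=balance.get)
for combo, value in sorted(balance.items(), key=lambda item: item[1]):
    print(combo, round(value, 1))
print("most stressed combination:", worst)
```

Laying out the full scenario matrix makes it clear which combinations of the dual impacts push the system into deficit and therefore deserve closer modeling.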
(4) Emergency decision support system for disaster response
This step aims to provide support for emergency response to sudden natural disasters caused by climate change and short-duration high-intensity human activities. Environmental risk early-warning is mainly discussed in Section 4.2 and is not emphasized here. The emergency decision support system is constructed primarily based on scientific monitoring, dynamic assessment, and timely early-warning. First, the long-term monitoring data of climate change and human activities are combined with the abnormal values or fluctuations of multiple factors in various industries (especially abnormal “flow” data). As such, the sudden climate events and human disturbance events can be dynamically evaluated and timely warned based on the interlinkage analysis. Then, the results can be visualized to stakeholders and decision-makers using maps, charts, and dashboards. Finally, through infrastructure construction, disasters can be properly handled, and the adaptation and resilience of the WEF nexus to disaster disturbances can be improved (Zhang et al., 2019).
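Detecting abnormal values in “flow” data can be sketched with a robust outlier rule. The example below uses the modified z-score based on the median absolute deviation, which is less distorted by the outliers themselves than a mean-based score. The hourly flow counts are invented, and the threshold of 3.5 is a common convention adopted here as an assumption.

```python
from statistics import median

def robust_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    Modified z-score = 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation from the median.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to compare against
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical hourly counts of people passing a monitored location.
hourly_flow = [510, 495, 502, 498, 505, 1480, 500, 493, 507, 499]

alerts = robust_outliers(hourly_flow)
print("abnormal hours:", alerts)  # → abnormal hours: [5]
```

In an operational system, flagged hours would be cross-checked against other factors through the interlinkage analysis before a warning is issued.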

4.2 Impact of the urban WEF nexus on the environment

The complex relationship within the WEF nexus is often closely related to ecosystems. A range of factors such as population growth, climate change, urbanization, industrialization, and living standard improvement are responsible for increasingly scarce urban water resources, growing energy demand, fluctuating food supply, and seriously impaired ecological functions. Urban air pollution, water pollution, ecological degradation, and compound environmental pollution have become major factors restricting economic development and endangering human health and social stability. An eco-conservation perspective is the basis of realizing the goal of harmonious co-existence between humans and nature. Therefore, a pressing issue for regional sustainable development is to form a sustainable coupled optimization management scheme for “water-energy-food-environment”. The main research contents are as follows:
(1) Ecological environment evolution law and driving mechanism mining
With the big data collected in Section 4.1, multi-source data based on the mechanism model can be integrated with that based on the big data-driven model or using hybrid models. This makes it possible to identify the sources of environmental pollution and the evolution mechanisms of ecological environment. Then, the relationship between ecological environment and socioeconomic factors, as well as between environmental pollution and pollution sources, should be explored. This helps clarify the causes and contributions, thereby realizing the traceability analysis of environmental pollution and ecosystem damage. For example, in addition to exploring the spatiotemporal patterns of air pollution, geographical big data mining is also useful to identify the relationship between urbanization and air pollution, which contributes to the “Blue Sky Protection Campaign”. In terms of industrial pollution control, geographical big data mining can be used to assess the correlation between pollution load, pollutant concentration and emission intensity, density of polluting enterprises, and regional ecological environment quality. This could provide guidance for promoting effective control of industrial pollution and improving the precision of law enforcement.
(2) Ecological environment collaborative optimization and prediction analysis
Based on the idea of the WEF nexus and regional environmental carrying capacity, it is helpful to determine the scale and spatial agglomeration pattern for rational planning and optimization of urban territorial space. For example, based on a scientific evaluation of urbanization and environmental carrying capacity, machine learning and geographical big data mining algorithms should first be integrated to analyze the matching degree between urbanization level and environmental carrying capacity. The reasonable range and speed of urbanization are determined by considering the constraints of water, energy, and soil resources, as well as environmental carrying capacity. Then, the optimal solution is explored to maximize the beneficial impacts (e.g., ensuring water, energy, and food security, supporting economic and social development, increasing environmental carrying capacity) and minimize the harmful impacts (e.g., carbon emissions, destruction of the natural environment, regional conflicts).
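One widely used way to quantify a matching degree between two subsystems is the coupling coordination degree. It is named here as an illustrative choice and is not a method prescribed by this framework. For two normalized indices u and e in (0, 1], the coupling degree is C = 2√(u·e)/(u + e), the comprehensive index is T = αu + βe, and the coordination degree is D = √(C·T). The index values and the equal weights below are assumptions.

```python
import math

def coupling_coordination(u, e, alpha=0.5, beta=0.5):
    """Return (C, T, D) for two normalised indices u and e in (0, 1].

    C: coupling degree, equal to 1 when the two indices are equal.
    T: weighted comprehensive development index.
    D: coupling coordination degree, sqrt(C * T).
    """
    coupling = 2.0 * math.sqrt(u * e) / (u + e)
    comprehensive = alpha * u + beta * e
    return coupling, comprehensive, math.sqrt(coupling * comprehensive)

# Hypothetical normalised indices for one city.
urbanization_level = 0.64
carrying_capacity = 0.36

c, t, d = coupling_coordination(urbanization_level, carrying_capacity)
print(f"C={c:.2f}  T={t:.2f}  D={d:.3f}")
```

A low D for a city with a high urbanization index would suggest that urban growth is outpacing what the environment can support.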
Following the optimization analysis, it is vital to predict regional ecological environment quality across spatiotemporal scales for early pollution warning through a combination of geographical big data mining techniques, WEF nexus system, and professional model simulation of ecological environment (e.g., water quality, hydrological, air pollution model). For example, when using the data of historical PM2.5 concentration, urban points of interest, traffic, population, economy, energy, and meteorology, the prediction model constructed based on geographical big data mining can realize continuous prediction of air quality with high precision and high spatiotemporal resolution (km, hourly). This breaks through the spatial limitations of air pollution supervision, which is of value for pollution control and public health protection.
(3) Ecological environment risk early-warning and emergency decision support system
To provide early-warning and emergency response for environmental risks caused by climate change and human activities, the first step is to identify the sources, paths, and receptors of potential environmental risks based on geographical big data mining. Then, through rapid identification of the observed outliers, the risk warning of sudden environmental accidents is carried out, and the impact scope and severity of the accidents are judged in advance. Further, based on the spatial analysis of geographical big data, materials and human resources can be quickly and reasonably deployed to provide emergency decision support information after the occurrence of sudden pollution accidents. For example, in the prevention and control of water pollution, real-time monitoring and analysis of water intake, use, and drainage can be carried out by relying on the big data platform to find water pollution problems in time. Dynamic data tracking of river monitoring section can be coupled with other methods such as linear trend, cumulative anomaly, migration, and diffusion simulation to find out the spatiotemporal changes of river pollutants and trace back the pollution sources, which could promote pollution source control and water resources supervision and management.
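Two of the methods mentioned above, linear trend and cumulative anomaly, are simple enough to sketch directly. The cumulative anomaly is the running sum of deviations from the series mean, and its extreme point indicates where the series shifts from below-average to above-average values (or the reverse). The pollutant concentrations below are invented monitoring values.

```python
def cumulative_anomaly(series):
    """Running sum of deviations from the series mean."""
    mean = sum(series) / len(series)
    total, out = 0.0, []
    for value in series:
        total += value - mean
        out.append(total)
    return out

def linear_trend(series):
    """Least-squares slope of the series against its time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# Hypothetical pollutant concentration (mg/L) at one river monitoring section.
concentration = [18.0, 17.5, 18.2, 17.8, 18.1, 22.0, 23.5, 24.1, 23.8, 24.0]

anomaly = cumulative_anomaly(concentration)
turning_point = anomaly.index(min(anomaly))   # last below-average observation
slope = linear_trend(concentration)
print("turning point index:", turning_point)  # → turning point index: 4
print(f"trend: {slope:.2f} mg/L per step")
```

Here the cumulative anomaly bottoms out at index 4, so the rise in concentration begins with the following observation. That is the moment an investigation of upstream sources would target.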

5 Challenges and possible solutions

Despite having many advantages for WEF nexus analysis, geographical big data mining techniques still face challenges that need to be addressed. This section identifies two major challenges of applying geographical big data mining in WEF nexus analysis (Figure 5): (i) reliability of geographical big data and data mining results and (ii) multivariate model fusion considering multiple stakeholders of the WEF nexus. Further, this section discusses how these challenges can be solved.
Figure 5 Challenges and possible solutions of geographical big data mining in WEF nexus

5.1 Data reliability and spurious results

(1) Data reliability and representativeness challenge
Geographical big data has remarkable advantages over traditional small data in terms of granularity, scope, and density, but it also has universal limitations. In addition to professionals, a large number of non-professional institutions and individuals are also providers of geographical big data (Liu et al., 2015). Data provided by non-professional organizations is often a by-product of their business. As a result, the accuracy, representativeness, and reliability of the collected data are not guaranteed (Liu et al., 2015). These non-purposeful observations are accompanied by various types of noise and errors in time, space, and attributes (Zhao et al., 2022). For example, big data often fails to represent the whole population, especially in the case of network data, which is biased toward younger and more highly educated groups and carries higher uncertainties (Önder, 2017). Thus, the data analysis may not accurately reflect the real phenomena in the city. Unlike small data, whose accuracy, representativeness, and reliability are strictly controlled by professionals, geographical big data usually contains much noise, ultimately resulting in heavier skewness and worse precision (Zandbergen, 2008). Such errors may lead to cognitive biases and even fallacies, as demonstrated by Google’s success and failure in flu prediction (Lazer et al., 2014).
(2) Spurious results challenge
Because geographical big data has finer granularity and higher density, its search space of candidate patterns is far larger than that of small data (Marr, 2015). Whereas analyses of smaller datasets typically detect only strong spatiotemporal correlations, machine learning applied to big data with an emphasis on correlation is prone to discovering a large number of relationships (Fernandez-Basso et al., 2021). Those relationships need to be carefully screened for causality to avoid finding logically incorrect, misleading, or practically irrelevant relationships (Karpatne et al., 2017). Fan et al. (2014) and Lyu et al. (2022) have discussed the spurious correlation problems in big data, in which the chance of finding strong but meaningless correlations grows with data size. Spurious results may lead to wrong causal relationships or causal inversion conclusions. These results should therefore be verified by observation, experimentation, and simulation so that they can be better explained and achieve higher reliability.
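The way a large search space produces spurious relationships can be demonstrated with purely random data. In the sketch below, a target series and 2,000 candidate series are all independent noise, yet the strongest correlation found among the candidates is substantial. The sample size, number of candidates, and random seed are arbitrary choices for illustration.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)
n_samples = 20

# Everything below is independent Gaussian noise: no true relationship exists.
target = [rng.gauss(0, 1) for _ in range(n_samples)]
candidates = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(2000)]

abs_corr = [abs(pearson(c, target)) for c in candidates]
max_few = max(abs_corr[:5])    # best match when screening only 5 variables
max_many = max(abs_corr)       # best match when screening all 2,000

print(f"strongest |r| among 5 candidates:    {max_few:.2f}")
print(f"strongest |r| among 2000 candidates: {max_many:.2f}")
```

Because the data contain no real signal, any strong correlation reported here is an artifact of searching many candidates, which is why mined relationships need independent verification.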
To overcome these challenges, a few possible solutions are given below:
(1) Integration of big and small data to improve data reliability
The relatively heavy skewness and poor precision of geographical big data are weaknesses that cannot be ignored (Zandbergen, 2008). Therefore, geographical big data needs to be deeply aggregated and integrated with small data (e.g., sample surveys, in-depth interviews). Currently, big and small data cannot completely replace each other, as each has its own advantages and disadvantages. A combination of big and small data can give full play to their strengths while avoiding their weaknesses (Rengarajan et al., 2022). Small data can “correct” geographical big data to a certain extent, providing more representative and reliable results of big data mining (Faraway and Augustin, 2018).
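One concrete way in which small data can “correct” big data is post-stratification: group-level estimates from a skewed big-data sample are reweighted by population shares taken from a representative survey or census. This is offered as one possible technique, not as the method of the cited studies. The age groups, water-use figures, sample counts, and population shares are all invented.

```python
def poststratify(group_means, sample_counts, population_shares):
    """Return (raw_mean, reweighted_mean) for a skewed sample.

    raw_mean weights each group by its share of the big-data sample;
    reweighted_mean weights each group by its true population share.
    """
    total = sum(sample_counts.values())
    raw = sum(group_means[g] * sample_counts[g] for g in group_means) / total
    reweighted = sum(group_means[g] * population_shares[g] for g in group_means)
    return raw, reweighted

# Hypothetical app-derived water use (litres per person per day) by age group.
group_means = {"18-34": 95.0, "35-59": 120.0, "60+": 140.0}
# The app's users are heavily skewed toward the youngest group.
sample_counts = {"18-34": 7000, "35-59": 2500, "60+": 500}
# Hypothetical population shares from a census or sample survey (sum to 1).
population_shares = {"18-34": 0.30, "35-59": 0.45, "60+": 0.25}

raw, corrected = poststratify(group_means, sample_counts, population_shares)
print(f"raw mean: {raw:.1f}  corrected mean: {corrected:.1f}")
```

In this made-up case the uncorrected estimate understates average water use because older residents, who use more, are under-represented among app users.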
(2) Human domain knowledge and physical model for valuable knowledge discovery
The first point is the verification of knowledge discovery by human domain knowledge. With thousands of years of civilization, human beings have accumulated a large amount of important cognitive knowledge (Wu et al., 2022). Such knowledge is helpful to understand the laws contained in geographical big data, reduce the partial and incomplete cognition of big data, alleviate the influence of data errors, and improve the reliability and accuracy of big data mining models (Sarailidis et al., 2023).
The second potential way is to integrate machine learning with physical models. Physical models, also called “white box” models, are theory-driven models. These models are directly interpretable based on factual principles and can offer the extrapolation beyond observed conditions (Adilkhanova et al., 2022). Machine learning is more flexible and can find patterns beyond known knowledge. However, machine learning models, often referred to as “black boxes”, are data-driven models, whose main problem is related to the difficulty in interpreting the internal mechanisms (Fung et al., 2021). Machine learning and physical model integration could verify and minimize wrong findings. For example, Reichstein et al. (2019) proposed five coupling modes to link physical models and machine learning to better understand the nature’s laws. Zhao et al. (2019) indicated that a physics-constrained machine learning model can perform well in estimating ecosystem evapotranspiration.
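A very simple form of integration is residual correction: a theory-driven model supplies the baseline prediction and a data-driven model learns only what the theory misses. The sketch below is an assumption-laden toy and is not one of the coupling modes of Reichstein et al. (2019) or the model of Zhao et al. (2019). The “physical” relation, its coefficient, and all observations are invented, and a least-squares line stands in for the machine learning component.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def physical_et(net_radiation):
    """Hypothetical theory-driven model: evapotranspiration scales with radiation."""
    return 0.7 * net_radiation

def sse(predicted, observed):
    """Sum of squared errors."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

# Invented observations (illustrative units).
net_radiation = [100.0, 150.0, 200.0, 250.0]
soil_moisture = [0.10, 0.20, 0.30, 0.40]
observed_et = [62.5, 99.5, 138.5, 175.5]

# Step 1: baseline from the physical model.
physics = [physical_et(rn) for rn in net_radiation]
# Step 2: learn the residual (observation minus physics) from another variable.
residuals = [o - p for o, p in zip(observed_et, physics)]
slope, intercept = fit_line(soil_moisture, residuals)
# Step 3: hybrid prediction = physics + learned correction.
hybrid = [p + slope * sm + intercept for p, sm in zip(physics, soil_moisture)]

sse_physics = sse(physics, observed_et)
sse_hybrid = sse(hybrid, observed_et)
print(f"squared error: physics only {sse_physics:.1f}, hybrid {sse_hybrid:.1f}")
```

The physical baseline keeps predictions anchored to known principles when extrapolating, while the learned correction absorbs systematic bias that the simplified theory cannot represent.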

5.2 Multiple stakeholder engagement and model integration

(1) Multiple stakeholder engagement challenge
With research advances on the WEF nexus, a growing number of elements are considered in the nexus, and the analysis scale becomes increasingly larger (Carvalho et al., 2022). There is an increasing demand for entire space-time earth and human data, such as water, soil, atmosphere, biology, society, economy, politics, and culture (Momblanch et al., 2019). The WEF nexus involves multiple stakeholders who have an interest or stake. A unified department to coordinate the whole nexus system has not been available due to the unwillingness to cooperate (Ernst and Preston, 2017). Available data are basically controlled by single sectors, which lack effective information communication, sharing, and integration (Conway et al., 2015). The exchange and sharing mechanism of multi-source data is still immature, leading to the “fragmentation” of inter-departmental information transmission in the governance of the WEF nexus.
(2) Model integration challenge
Geographical big data contains rich “human” and “nature” information. How to integrate multi-source geographical big data and integrate various models to decipher the complex, diverse, nonlinear, uncertain, and multivariate relationships of the WEF nexus is still a major sticking point for policy-making. It is especially true under great uncertainties. Reasons for the dilemma are as follows.
Different types of geographical big data come from multiple sources with varied formats, structures, contents, spatiotemporal scales, resolutions, units, or semantics. Thus, multi-source data should be aligned, matched, transformed, displayed, conveyed, and harmonized by using appropriate methods, models, or tools (Balti et al., 2020). For example, data coordinate systems may differ; the spatiotemporal scale of site observation data, such as precipitation and temperature, is quite different from that of remote sensing data, such as global population and gross domestic product. Data processing also varies; text data can only be used in calculations after structured processing, and structuring text data relies on knowledge graph theory. These differences make it difficult to truly connect and coordinate big data from different fields and organizations. Inconsistent standards, diverse formats, differing spatiotemporal scales, data inconsistency, and the control of different types of data by different departments lead to difficulties in data integration and quantitative studies of the WEF nexus (Ernst and Preston, 2017).
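Aligning temporal scales is one of the simpler harmonization tasks and can be sketched as follows: daily station observations are aggregated to monthly totals so that they can be joined with a dataset that only exists at monthly resolution. The dates, values, and the monthly index are invented, and the `YYYY-MM-DD` string format is an assumption about how the dates are stored.

```python
from collections import defaultdict

def daily_to_monthly(daily):
    """Sum daily values keyed by 'YYYY-MM-DD' into totals keyed by 'YYYY-MM'."""
    monthly = defaultdict(float)
    for date, value in daily.items():
        monthly[date[:7]] += value
    return dict(monthly)

def harmonise(daily_precip, monthly_index):
    """Join the two sources on the months that both of them cover."""
    monthly_precip = daily_to_monthly(daily_precip)
    common = sorted(set(monthly_precip) & set(monthly_index))
    return [(m, monthly_precip[m], monthly_index[m]) for m in common]

# Hypothetical daily station precipitation (mm).
daily_precip = {
    "2020-06-29": 4.0, "2020-06-30": 6.0,
    "2020-07-01": 10.0, "2020-07-02": 0.0,
    "2020-08-01": 3.0,
}
# Hypothetical monthly socioeconomic index from another source.
monthly_index = {"2020-06": 101.2, "2020-07": 101.5}

table = harmonise(daily_precip, monthly_index)
print(table)
# → [('2020-06', 10.0, 101.2), ('2020-07', 10.0, 101.5)]
```

August is dropped because only one source covers it. Whether to drop, interpolate, or flag such gaps is a decision that should be made explicitly and recorded.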
More importantly, the WEF nexus consists of multiple internal modules (Li et al., 2021b; Shi et al., 2020). Different modules, such as crop, water quality, energy, society, climate, environment, and economy modules, are strongly interlinked with each other (Shi et al., 2020). The output of one module is usually used as the input of the next module in the process of integrated model construction (Li et al., 2021a), which requires the consistency of the model. However, the accuracy requirements of various modules are not completely uniform, and the data statistical caliber and spatiotemporal scale are not consistent (Balti et al., 2020). Due to the inconsistency, there is still a lack of integrated models and methods to fully support the collaborative mining of multi-source geographical big data (Fernandes Torres et al., 2019). This hinders the transition from qualitative analysis to dynamically quantitative simulation and optimization of the WEF nexus.
To overcome this challenge, several possible solutions are proposed:
(1) Department collaboration mechanism and regional connection enhancement
From a regional perspective, a consultation mechanism between relevant departments should be established and improved (Basheer et al., 2018). It is necessary to strengthen the collaboration between agricultural, water conservancy, energy, and other departments, thereby ensuring the coordinated development of water, land, energy, and food resources.
Additionally, the coupling synergies between various resources and economic and environmental systems should be coordinated for regions, especially those in the upstream and downstream (Basheer et al., 2018). The establishment of cross-regional and cross-departmental mechanisms for coordinated development of the upstream and downstream, main stream and tributaries, as well as left and right banks should be speeded up. Effective implementation of transboundary agreements requires a combination of political will and technical cooperation involving all stakeholders (Bazilian et al., 2011). For example, in the Yellow River Basin, regional cooperation (e.g., joint pollution prevention and control of ecological environment) and upstream and downstream cooperation (e.g., major trans-provincial infrastructure construction) should be strengthened to enhance WEF nexus analysis and further promote ecological conservation and high-quality development.
The collaboration enhancement also includes data sharing (Basheer et al., 2018). Effort should be made to promote diverse and heterogeneous data sharing among stakeholders through various platforms or mechanisms, such as open data, cloud computing, and blockchain. This could increase the accessibility, availability, and quality of geographical big data for WEF nexus analysis.
(2) Big data platform construction considering data integration and fusion
While strengthening data monitoring, collection, and sharing, it is indispensable to comprehensively integrate all elements and build a big data platform of the earth-human system (Radini et al., 2021). A unified big data platform can integrate and harmonize data from different sources and scales under a unified data framework. Additionally, the geographical big data platform should adopt a unified geodetic datum, a unified coordinate system, and a unified projection method. The unified data framework facilitates discovering the consistency and complementarity of geographical big data from different sources. It could also reduce the limitations of model construction for in-depth quantitative WEF nexus analysis (Mannschatz and Hülsmann, 2016).
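Adopting a unified projection means converting every dataset’s coordinates into the same planar system. As an illustration only, the sketch below converts longitude and latitude to spherical Web Mercator coordinates using the standard closed-form formula. The choice of Web Mercator is an assumption made for this example, the input is assumed to be on the WGS 84 datum already, and the station coordinates are arbitrary. Datum transformations between different reference systems need a dedicated geodesy library and are not shown.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x, y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y

# Arbitrary example stations given as (longitude, latitude).
stations = {"station_a": (108.9, 34.3), "station_b": (109.5, 34.5)}

projected = {name: lonlat_to_web_mercator(lon, lat)
             for name, (lon, lat) in stations.items()}
for name, (x, y) in projected.items():
    print(f"{name}: x={x:.0f} m, y={y:.0f} m")
```

Once all layers share one planar system, distances and overlays between station data, remote sensing rasters, and socioeconomic grids can be computed consistently.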
(3) Department collaboration and cloud computing infrastructure construction for quantitative simulation and optimization
Because data are inconsistent and managed in fragmented, department-specific ways, it is difficult to construct integrated quantitative models. Building on department collaboration and the big data platform, the first step is to explore a standardized model structure and establish the relationships among the modules of the WEF nexus. In this way, all departments can make long-term plans for inter-departmental collaboration under a unified scientific framework. This may enhance the feasibility of simulating or modeling the WEF nexus system.
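One way to picture a standardized model structure is as a set of modules that share a single interface and exchange named resource flows. The Python sketch below is illustrative only: the module names, the `step` interface, the state keys, and every coefficient (for example, the energy needed per cubic metre of irrigation water) are hypothetical placeholders rather than structures or values taken from this study.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Module:
    """A model component with declared inputs and outputs."""
    name: str
    inputs: List[str]
    outputs: List[str]
    step: Callable[[State], State]


def water_step(state: State) -> State:
    # Irrigation delivered is capped by the water that is available.
    delivered = min(state["water_available_m3"], state["irrigation_demand_m3"])
    return {"irrigation_m3": delivered}


def energy_step(state: State) -> State:
    # Hypothetical coefficient: 0.5 kWh to pump one cubic metre.
    return {"pumping_energy_kwh": 0.5 * state["irrigation_m3"]}


def food_step(state: State) -> State:
    # Hypothetical coefficient: 0.002 t of crop per cubic metre of irrigation.
    return {"crop_yield_t": 0.002 * state["irrigation_m3"]}


MODULES = [
    Module("water", ["water_available_m3", "irrigation_demand_m3"], ["irrigation_m3"], water_step),
    Module("energy", ["irrigation_m3"], ["pumping_energy_kwh"], energy_step),
    Module("food", ["irrigation_m3"], ["crop_yield_t"], food_step),
]


def run_chain(modules: List[Module], initial: State) -> State:
    """Run modules in order, checking each one against its declared interface."""
    state = dict(initial)
    for module in modules:
        missing = [key for key in module.inputs if key not in state]
        if missing:
            raise KeyError(f"{module.name} is missing inputs: {missing}")
        produced = module.step(state)
        unexpected = set(produced) - set(module.outputs)
        if unexpected:
            raise ValueError(f"{module.name} produced undeclared outputs: {unexpected}")
        state.update(produced)
    return state


result = run_chain(MODULES, {"water_available_m3": 1000.0, "irrigation_demand_m3": 1200.0})
print(result)
```

Because each module declares what it consumes and produces, a department could replace its own module (for example, a more detailed water allocation model) without changing the others, provided the declared interface is kept.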
Then, an agent-based integrated model should be built. Agent-based models can simulate the stochastic and heterogeneous behaviors of individual stakeholders (e.g., households, firms, governments) and their interactions within a complex system under various scenarios and objectives (Bieber et al., 2018). Furthermore, new universal models, algorithms, and cloud computing infrastructures should be built to deal with challenging and time-consuming big data problems, thereby improving the simulation and performance of complex systems (David et al., 2022). This may increase the flexibility and robustness of optimizing or balancing the WEF nexus system (Guo et al., 2022).
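The idea of stochastic, heterogeneous agents interacting through a shared resource can be sketched in a few lines of Python. The example below is a toy illustration under stated assumptions, not the integrated model envisaged here: the agent types, the Gaussian demand, the proportional rationing rule, and all numbers are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    kind: str            # e.g. "household", "firm" or "government"
    base_demand: float   # mean water demand per step (arbitrary units)
    variability: float   # relative spread of the stochastic demand
    received: float = 0.0

    def demand(self, rng: random.Random) -> float:
        # Stochastic, non-negative demand around the agent's own mean.
        draw = rng.gauss(self.base_demand, self.variability * self.base_demand)
        return max(0.0, draw)


def simulate(agents, supply_per_step, steps, seed=0):
    """Return the unmet demand (shortage) recorded at each time step."""
    rng = random.Random(seed)
    shortages = []
    for _ in range(steps):
        demands = [agent.demand(rng) for agent in agents]
        total = sum(demands)
        # Interaction: agents compete for one supply; ration proportionally if short.
        share = 1.0 if total <= supply_per_step else supply_per_step / total
        for agent, requested in zip(agents, demands):
            agent.received += share * requested
        shortages.append(max(0.0, total - supply_per_step))
    return shortages


def make_agents():
    # Heterogeneous population: many small users, a few large ones.
    return ([Agent("household", 1.0, 0.2) for _ in range(50)]
            + [Agent("firm", 10.0, 0.5) for _ in range(5)]
            + [Agent("government", 5.0, 0.1)])


agents = make_agents()
shortages = simulate(agents, supply_per_step=100.0, steps=20, seed=42)
print(f"steps with shortage: {sum(s > 0 for s in shortages)} of {len(shortages)}")
```

Scenarios can be explored by changing the supply, the agent mix, or the rationing rule and rerunning with the same seed, which keeps the comparison reproducible.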

6 Conclusions

Water, energy, and food are indispensable basic resources for human survival and development, and they are interlinked in complex ways. These resources are also influenced by multiple external factors, such as population, economy, and climate change. The concept of the “water-energy-food” (WEF) nexus was proposed at the Bonn 2011 conference to optimize the use of multiple resources, reduce tradeoffs among them, improve resource use efficiency, and promote sustainable development. Since then, WEF nexus research has attracted global attention.
The WEF nexus involves many dynamic elements of the earth and human systems, such as water, land, atmosphere, society, and economy. Previous WEF nexus studies have focused on natural elements and given less consideration to human elements, which hinders the simulation of human activities and the representation of the complex real world. Compared with traditional small data, geographical big data covering natural and human elements can better support WEF nexus analysis, yet its potential for application has rarely been investigated. Therefore, this study proposes a five-step framework of WEF nexus analysis supported by geographical big data mining. Application examples of the framework are presented, including the impact of climate change and human activities on the WEF nexus and the impact of the nexus on the environment. Furthermore, the potential challenges of geographical big data mining in WEF nexus analysis are discussed, and possible solutions are provided.
This paper argues that geographical big data mining is a promising approach to enhance WEF nexus analysis, disentangle the relationship between humans and nature, and promote sustainable development. This study is expected to encourage the application of geographical big data mining in WEF nexus analysis, foster collaboration among sectors and across resources, and contribute to the advancement of this interdisciplinary field.

References

[1]
Abdul-Rahman M, Chan E H W, Wong M S et al., 2021. A framework to simplify pre-processing location-based social media big data for sustainable urban planning and management. Cities, 109: 102986.

[2]
Adilkhanova I, Ngarambe J, Yun G Y, 2022. Recent advances in black box and white-box models for urban heat island prediction: Implications of fusing the two methods. Renewable and Sustainable Energy Reviews, 165: 112520.

[3]
Ajjur S B, Al-Ghamdi S G, 2022. Towards sustainable energy, water and food security in Qatar under climate change and anthropogenic stresses. Energy Reports, 8: 514-518.

[4]
Arcomano T, Szunyogh I, Pathak J et al., 2020. A machine learning-based global atmospheric forecast model. Geophysical Research Letters, 47(9): e2020GL087776.

[5]
Azzam A, Zhang W, Akhtar F et al., 2022. Estimation of green and blue water evapotranspiration using machine learning algorithms with limited meteorological data: A case study in Amu Darya River Basin, Central Asia. Computers and Electronics in Agriculture, 202: 107403.

[6]
Balti H, Abbes A B, Mellouli N et al., 2020. A review of drought monitoring with big data: Issues, methods, challenges and research directions. Ecological Informatics, 60: 101136.

[7]
Basheer M, Wheeler K G, Ribbe L et al., 2018. Quantifying and evaluating the impacts of cooperation in transboundary river basins on the water-energy-food nexus: The Blue Nile Basin. Science of The Total Environment, 630(15): 1309-1323.

[8]
Bazilian M, Rogner H, Howells M et al., 2011. Considering the energy, water and food nexus: Towards an integrated modelling approach. Energy Policy, 39(12): 7896-7906.

[9]
Bieber N, Ker J H, Wang X et al., 2018. Sustainable planning of the energy-water-food nexus using decision making tools. Energy Policy, 113: 584-607.

[10]
Biggs E M, Bruce E, Boruff B et al., 2015. Sustainable development and the water-energy-food nexus: A perspective on livelihoods. Environmental Science & Policy, 54: 389-397.

[11]
Brazález E, Macià H, Díaz G et al., 2022. FUME: An air quality decision support system for cities based on CEP technology and fuzzy logic. Applied Soft Computing, 129: 109536.

[12]
Bruns A, Meisch S, Ahmed A et al., 2022. Nexus disrupted: Lived realities and the water-energy-food nexus from an infrastructure perspective. Geoforum, 133: 79-88.

[13]
Cai X, Wallington K, Shafiee-Jood M et al., 2018. Understanding and managing the food-energy-water nexus: Opportunities for water resources research. Advances in Water Resources, 111: 259-273.

[14]
Campana M G, Delmastro F, 2022. On-device modeling of user’s social context and familiar places from smartphone-embedded sensor data. Journal of Network and Computer Applications, 205: 103438.

[15]
Campana P E, Zhang J, Yao T et al., 2018. Managing agricultural drought in Sweden using a novel spatially-explicit model from the perspective of water-food-energy nexus. Journal of Cleaner Production, 197: 1382-1393.

[16]
Carvalho P N, Finger D C, Masi F et al., 2022. Nature-based solutions addressing the water-energy-food nexus: Review of theoretical concepts and urban case studies. Journal of Cleaner Production, 338(1): 130652.

[17]
Chandra D G, 2015. BASE analysis of NoSQL database. Future Generation Computer Systems, 52: 13-21.

[18]
Chen L, Zhao L, Xiao Y et al., 2022. Investigating the spatiotemporal pattern between the built environment and urban vibrancy using big data in Shenzhen, China. Computers, Environment and Urban Systems, 95: 101827.

[19]
Chen S, Chen B, 2016. Urban energy-water nexus: A network perspective. Applied Energy, 142: 215-224.

[20]
Conway D, van Garderen E A, Deryng D et al., 2015. Climate and southern Africa’s water-energy-food nexus. Nature Climate Change, 5(9): 837-846.

[21]
Cremen G, Bozzoni F, Pistorio S et al., 2022. Developing a risk-informed decision-support system for earthquake early warning at a critical seaport. Reliability Engineering & System Safety, 218: 108035.

[22]
Daher B T, Mohtar R H, 2015. Water-energy-food (WEF) Nexus Tool 2.0: Guiding integrative resource planning and decision-making. Water International, 40(5/6): 748-771.

[23]
David L O, Nwulu N I, Aigbavboa C O et al., 2022. Integrating fourth industrial revolution (4IR) technologies into the water, energy & food nexus for sustainable security: A bibliometric analysis. Journal of Cleaner Production, 363(20): 132522.

[24]
Dong J D, Wu W, 2015. Business value of social media technologies: Evidence from online user innovation communities. The Journal of Strategic Information Systems, 24(2): 113-127.

[25]
Ernst K M, Preston B L, 2017. Adaptation opportunities and constraints in coupled systems: Evidence from the U.S. energy-water nexus. Environmental Science & Policy, 70: 38-45.

[26]
Fan J, Han F, Liu H, 2014. Challenges of big data analysis. National Science Review, 1(2): 293-314.

[27]
Faraway J J, Augustin N H, 2018. When small data beats big data. Statistics & Probability Letters, 136: 142-145.

[28]
Fernandes Torres C J, Peixoto de Lima C H, Suzart de Almeida Goodwin B et al., 2019. A literature review to propose a systematic procedure to develop “nexus thinking” considering the water-energy-food nexus. Sustainability, 11: 7205.

[29]
Fernandez-Basso C, Ruiz M D, Martin-Bautista M J, 2021. Spark solutions for discovering fuzzy association rules in big data. International Journal of Approximate Reasoning, 137: 94-112.

[30]
Fung P L, Zaidan M A, Timonen H et al., 2021. Evaluation of white-box versus black-box machine learning models in estimating ambient black carbon concentration. Journal of Aerosol Science, 152: 105694.

[31]
Giampietro M, Aspinall R J, Ramos-Martin J et al., 2014. Resource accounting for sustainability assessment. In: The Nexus Between Energy, Food, Water and Land Use. London: Routledge.

[32]
Giupponi C, Gain A K, 2016. Integrated spatial assessment of the water, energy and food dimensions of the Sustainable Development Goals. Regional Environmental Change, 17(7): 1881-1893.

[33]
Gladju J, Kamalam B S, Kanagaraj A, 2022. Applications of data mining and machine learning framework in aquaculture and fisheries: A review. Smart Agricultural Technology, 2: 100061.

[34]
Guo H, Liang D, Sun Z et al., 2022. Measuring and evaluating SDG indicators with Big Earth Data. Science Bulletin, 67(17): 1792-1801.

[35]
Hawkins D, 1980. Identification of Outliers. London: Chapman and Hall.

[36]
He Q, He W, Song Y et al., 2018. The impact of urban growth patterns on urban vitality in newly built-up areas based on an association rules analysis using geographical ‘big data’. Land Use Policy, 78: 726-738.

[37]
Hodge V J, Austin J, 2004. A survey of outlier detection methodologies. Artificial Intelligence Review, 22: 85-126.

[38]
Hoff H, 2011. Understanding the Nexus: Background Paper for the Bonn 2011 Conference: The Water Energy and Food Security Nexus. Stockholm: Stockholm Environment Institute.

[39]
Hofman J, Do T H, Qin X et al., 2022. Spatiotemporal air quality inference of low-cost sensor data: Evidence from multiple sensor testbeds. Environmental Modelling & Software, 149: 105306.

[40]
Howells M, Hermann S, Welsch M et al., 2013. Integrated analysis of climate change, land-use, energy and water strategies. Nature Climate Change, 3(7): 621-626.

[41]
Huang H, Yao X A, Krisp J M et al., 2021. Analytics of location-based big data for smart cities: Opportunities, challenges, and future directions. Computers, Environment and Urban Systems, 90: 101712.

[42]
Huang J, Levinson D, Wang J et al., 2018. Tracking job and housing dynamics with smartcard data. Proceedings of the National Academy of Sciences of the United States of America, 115(50): 12710-12715.

[43]
Jin C, Bouzembrak Y, Zhou J et al., 2020. Big data in food safety: A review. Current Opinion in Food Science, 36: 24-32.

[44]
Karlberg L, Hoff H, Amsalu T et al., 2015. Tackling complexity: Understanding the food-energy-environment nexus in Ethiopia’s Lake Tana Sub-basin. Water Alternatives: An Interdisciplinary Journal on Water Politics and Development, 8(1): 710-734.

[45]
Karpatne A, Atluri G, Faghmous J H et al., 2017. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Transactions on Knowledge and Data Engineering, 29: 2318-2331.

[46]
Kurian M, 2017. The water-energy-food nexus: Trade-offs, thresholds and transdisciplinary approaches to sustainable development. Environmental Science & Policy, 68: 97-106.

[47]
Larkin A, Hoolohan C, McLachlan C, 2020. Embracing context and complexity to address environmental challenges in the water-energy-food nexus. Futures, 123: 102612.

[48]
Lazer D, Kennedy R, King G et al., 2014. The parable of Google Flu: Traps in big data analysis. Science, 343(6176): 1203-1205.

[49]
Li S, Cai X, Emaminejad S A et al., 2021a. Developing an integrated technology-environment-economics model to simulate food-energy-water systems in Corn Belt watersheds. Environmental Modelling & Software, 143: 105083.

[50]
Li S, Dragicevic S, Castro F A et al., 2016. Geospatial big data handling theory and methods: A review and research challenges. ISPRS Journal of Photogrammetry and Remote Sensing, 115: 119-133.

[51]
Li X, Zhang L, Zhang P et al., 2021b. Urban food-energy-water nexus: A case study in Beijing. Chinese Journal of Population, Resources and Environment, 31(5): 174-184. (in Chinese)

[52]
Liu J, Li J, Li W et al., 2015. Rethinking big data: A review on the data quality and usage issues. ISPRS Journal of Photogrammetry and Remote Sensing, 115: 134-142.

[53]
Lu X, Bengtsson L, Holme P, 2012. Predictability of population displacement after the 2010 Haiti earthquake. Proceedings of the National Academy of Sciences of the United States of America, 109(29): 11576-11581.

[54]
Lyu J, Khan A, Bibi S et al., 2022. Big data in action: An overview of big data studies in tourism and hospitality literature. Journal of Hospitality and Tourism Management, 51: 346-360.

[55]
Mannschatz T, Hülsmann T W S, 2016. Nexus Tools Platform: Web-based comparison of modelling tools for analysis of water-soil-waste nexus. Environmental Modelling & Software, 76: 137-153.

[56]
Marr B, 2015. Big Data: Using SMART Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance. Chichester, UK: John Wiley & Sons.

[57]
Martinez-Hernandez E, Leach M, Yang A, 2017. Understanding water-energy-food and ecosystem interactions using the nexus simulation tool NexSym. Applied Energy, 106(15): 1009-1021.

[58]
McMeekin T A, Baranyi J, Bowman J et al., 2006. Information systems in food safety management. International Journal of Food Microbiology, 112(3): 181-194.

[59]
Moghadam E S, Sadeghi S H, Zarghami M et al., 2023. Developing sustainable land-use patterns at watershed scale using nexus of soil, water, energy, and food. Science of The Total Environment, 856: 158935.

[60]
Momblanch A, Papadimitriou L, Jain S K et al., 2019. Untangling the water-food-energy-environment nexus for global change adaptation in a complex Himalayan water resource system. Science of The Total Environment, 655: 35-47.

[61]
Naidoo D, Nhamo L, Mpandeli S et al., 2021. Operationalising the water-energy-food nexus through the theory of change. Renewable and Sustainable Energy Reviews, 149: 111416.

[62]
Nwaila G T, Zhang S E, Bourdeau J E et al., 2022. Artificial intelligence-based anomaly detection of the Assen iron deposit in South Africa using remote sensing data from the Landsat-8 Operational Land Imager. Artificial Intelligence in Geosciences, 3: 71-85.

[63]
Önder I, 2017. Classifying multi-destination trips in Austria with big data. Tourism Management Perspectives, 21: 54-58.

[64]
Palchevsky E, Antonov V, Enikeev R R et al., 2023. A system based on an artificial neural network of the second generation for decision support in especially significant situations. Journal of Hydrology, 616: 128844.

[65]
Pei T, Sobolevsky S, Ratti C et al., 2014. A new insight into land use classification based on aggregated mobile phone data. International Journal of Geographical Information Science, 28(9): 1988-2007.

[66]
Pei T, Song C, Guo S et al., 2020. Big geodata mining: Objective, connotations and research issues. Journal of Geographical Sciences, 30: 251-266.

[67]
Pekel J F, Cottam A, Gorelick N et al., 2016. High-resolution mapping of global surface water and its long-term changes. Nature, 540: 418-422.

[68]
Peña-Torres D, Boix M, Montastruc L, 2022. Optimization approaches to design water-energy-food nexus: A literature review. Computers & Chemical Engineering, 167: 108025.

[69]
Premsagar P, Aldous C, Esterhuizen T M et al., 2022. Comparing conventional statistical models and machine learning in a small cohort of South African cardiac patients. Informatics in Medicine Unlocked, 34: 101103.

[70]
Radini S, Marinelli E, Akyol C et al., 2021. Urban water-energy-food-climate nexus in integrated wastewater and reuse systems: Cyber-physical framework and innovations. Applied Energy, 298(15): 117268.

[71]
Reichstein M, Camps-Valls G, Stevens B et al., 2019. Deep learning and process understanding for data-driven Earth system science. Nature, 566: 195-204.

[72]
Ren H, Liu B, Zhang Z et al., 2022. A water-energy-food-carbon nexus optimization model for sustainable agricultural development in the Yellow River Basin under uncertainty. Applied Energy, 326(15): 120008.

[73]
Rengarajan S, Narayanamurthy G, Moser R et al., 2022. Data strategies for global value chains: Hybridization of small and big data in the aftermath of COVID-19. Journal of Business Research, 144: 776-787.

[74]
Sarailidis G, Wagener T, Pianosi F, 2023. Integrating scientific knowledge into machine learning using interactive decision trees. Computers & Geosciences, 170: 105248.

[75]
Saray M H, Baubekova A, Gohari A et al., 2022. Optimization of water-energy-food nexus considering CO2 emissions from cropland: A case study in northwest Iran. Applied Energy, 307(1): 118236.

[76]
Scanlon B R, Ruddell B L, Reed P M et al., 2017. The food-energy-water nexus: Transforming science for society. Water Resources Research, 53(5): 3550-3556.

[77]
Shi H, Luo G, Zheng H et al., 2020. Coupling the water-energy-food-ecology nexus into a Bayesian network for water resources analysis and management in the Syr Darya River basin. Journal of Hydrology, 581: 124387.

[78]
Song C, Qu Z, Blumm N et al., 2010. Limits of predictability in human mobility. Science, 327(5968): 1018-1021.

[79]
Talari G, Cummins E, McNamara C et al., 2022. State of the art review of big data and web-based Decision Support Systems (DSS) for food safety risk assessment with respect to climate change. Trends in Food Science & Technology, 126: 192-204.

[80]
Tax C M W, Bastiani M, Veraart J et al., 2022. What’s new and what’s next in diffusion MRI preprocessing. NeuroImage, 249(1): 118830.

[81]
Telang A, Deepak P, Joshi S et al., 2014. Detecting localized homogeneous anomalies over spatio-temporal data. Data Mining and Knowledge Discovery, 28: 1480-1502.

[82]
Tellman B, Sullivan J A, Kuhn C et al., 2021. Satellite imaging reveals increased proportion of population exposed to floods. Nature, 596: 80-86.

[83]
UNESCAP, 2013. ESCAP Status Report on the Water-Energy-Food Security Nexus in the Asia Pacific Region. https://www.unescap.org/sites/default/files/UNESCAP-WEF-Nexus-AP-Bangkok-Hezri.pdf.

[84]
United Nations (UN), 2015. Transforming Our World: The 2030 Agenda for Sustainable Development. Outcome Document for the UN Summit to Adopt the Post-2015 Development Agenda: Draft for Adoption. New York.

[85]
United States National Intelligence Council, 2012. Global Trends 2030: Alternative Worlds. US NIC, Washington DC, USA, pp. 137.

[86]
Upadhyay E, 2022. A critical evaluation of handling uncertainty in big data processing. Advances in Engineering Software, 173: 103246.

[87]
Wang J, Zhang F, Tan M L et al., 2023. Remote sensing evaluation of Chinese mainland’s comprehensive natural resources carrying capacity and its spatial-temporal variation characteristics. Environmental Impact Assessment Review, 101: 107104.

[88]
Wang S, Fu B, Zhao W et al., 2018. Structure, function, and dynamic mechanisms of coupled human-natural systems. Current Opinion in Environmental Sustainability, 33: 87-91.

[89]
Wang X, Zhang Y, Yu D et al., 2022. Investigating the spatiotemporal pattern of urban vibrancy and its determinants: Spatial big data analyses in Beijing, China. Land Use Policy, 119: 106162.

[90]
Wicaksono A, Kang D, 2018. Nationwide simulation of water, energy, and food nexus: Case study in South Korea and Indonesia. Journal of Hydro-environment Research, 22: 70-87.

[91]
World Economic Forum, 2011. Global Risks Report 2011. 6th ed. Cologne: World Economic Forum.

[92]
Wu X, Xiao L, Sun Y et al., 2022. A survey of human-in-the-loop for machine learning. Future Generation Computer Systems, 135: 364-381.

[93]
Xiong X, Liu S, Li D et al., 2020. Real-time and private spatio-temporal data aggregation with local differential privacy. Journal of Information Security and Applications, 55: 102633.

[94]
Yang J, Chang J, Konar M et al., 2023. The grain food-energy-water nexus in China: Benchmarking sustainability with generalized data envelopment analysis. Science of The Total Environment, 887(20): 164128.

[95]
Zandbergen P A, 2008. Positional accuracy of spatial data: Non-normal distributions and a critique of the National Standard for Spatial Data Accuracy. Transactions in GIS, 12(1): 103-130.

[96]
Zhang P, Zhang L, Chang Y et al., 2019. Food-energy-water (FEW) nexus for urban sustainability: A comprehensive review. Resources, Conservation & Recycling, 142: 215-224.

[97]
Zhang X, Vesselinov V V, 2016. Integrated modeling approach for optimal management of water, energy and food security nexus. Advances in Water Resources, 101: 1-10.

[98]
Zhao E, Sun S, Wang S, 2022. New developments in wind energy forecasting with artificial intelligence and big data: A scientometric insight. Data Science and Management, 5(2): 84-95.

[99]
Zhao W, Gentine P, Reichstein M et al., 2019. Physics-constrained machine learning of evapotranspiration. Geophysical Research Letters, 46(4): 14496-14507.
