Review Article

Methodology, progress and challenges of geoscience knowledge graph in International Big Science Program of Deep-Time Digital Earth

  • ZHU Yunqiang 1, 2 ,
  • WANG Qiang 1, 3 ,
  • WANG Shu 1 ,
  • SUN Kai 1 ,
  • WANG Xinbing 4 ,
  • LV Hairong 5 ,
  • HU Xiumian 6 ,
  • ZHANG Jie 7 ,
  • WANG Bin 8, * ,
  • QIU Qinjun 9 ,
  • YANG Jie 1 ,
  • ZHOU Chenghu 1
  • 1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  • 2. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  • 3. University of Chinese Academy of Sciences, Beijing 100049, China
  • 4. School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • 5. Department of Automation, Tsinghua University, Beijing 100084, China
  • 6. State Key Laboratory for Mineral Deposits Research, School of Earth Sciences and Engineering, Nanjing University, Nanjing 210023, China
  • 7. State Key Laboratory of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • 8. Command Center of Natural Resources Comprehensive Survey, China Geological Survey, Beijing 100055, China
  • 9. School of Computer Science, China University of Geosciences, Wuhan 430074, China
*Wang Bin (1984-), PhD and Senior Engineer, specialized in mineral exploration and geological informatization. E-mail:

Zhu Yunqiang (1977-), PhD and Professor, specialized in geographic knowledge graphs and geospatial data sharing. E-mail:

Received date: 2024-08-04

Accepted date: 2025-02-20

Online published: 2025-09-05

Supported by

Strategic Priority Research Program of the Chinese Academy of Sciences(XDB0740000)

National Key Research and Development Program of China(2022YFB3904200)

National Key Research and Development Program of China(2022YFF0711601)

Key Project of Innovation LREIS(PI009)

National Natural Science Foundation of China(42471503)

Abstract

Deep-time Earth research plays a pivotal role in deciphering the rates, patterns, and mechanisms of Earth’s evolutionary processes throughout geological history, providing essential scientific foundations for climate prediction, natural resource exploration, and sustainable planetary stewardship. To advance Deep-time Earth research in the era of big data and artificial intelligence, the International Union of Geological Sciences initiated the “Deep-time Digital Earth International Big Science Program” (DDE) in 2019. At the core of this ambitious program lies the development of geoscience knowledge graphs, serving as a transformative knowledge infrastructure that enables the integration, sharing, mining, and analysis of heterogeneous geoscience big data. The DDE knowledge graph initiative has made significant strides in three critical dimensions: (1) establishing a unified knowledge structure across geoscience disciplines that ensures consistent representation of geological entities and their interrelationships through standardized ontologies and semantic frameworks; (2) developing a robust and scalable software infrastructure capable of supporting both expert-driven and machine-assisted knowledge engineering for large-scale graph construction and management; (3) implementing a comprehensive three-tiered architecture encompassing basic, discipline-specific, and application-oriented knowledge graphs, spanning approximately 20 geoscience disciplines. Through its open knowledge framework and international collaborative network, this initiative has fostered multinational research collaborations, establishing a robust foundation for next-generation geoscience research while propelling the discipline toward FAIR (Findable, Accessible, Interoperable, Reusable) data practices in deep-time Earth systems research.

Cite this article

ZHU Yunqiang, WANG Qiang, WANG Shu, SUN Kai, WANG Xinbing, LV Hairong, HU Xiumian, ZHANG Jie, WANG Bin, QIU Qinjun, YANG Jie, ZHOU Chenghu. Methodology, progress and challenges of geoscience knowledge graph in International Big Science Program of Deep-Time Digital Earth[J]. Journal of Geographical Sciences, 2025, 35(5): 1132-1156. DOI: 10.1007/s11442-025-2361-0

1 Introduction

The growing volume and complexity of geological data present significant challenges for Earth science, especially in studying deep-time processes spanning millions to billions of years. As the generation of new data accelerates, there is an urgent need for more robust strategies to manage, analyze, and interpret vast datasets. In response to these challenges, the International Union of Geological Sciences (IUGS) launched the Deep-time Digital Earth International Big Science Program (DDE) in 2019 (Normile, 2019). Unlike the U.S. government’s Digital Earth initiative, which primarily focuses on contemporary data, DDE is dedicated to deep-time Earth, encompassing data on the evolution of life and climate, tectonic plate movements, and the transformation of Earth’s geography (IUGS, 2019; Wang et al., 2021). The DDE initiative seeks to harmonize global geoscience data, facilitate knowledge sharing, and develop advanced analytical and visualization methods to advance research in Earth science.
With the accumulation of enormous volumes of deep-time Earth data, coupled with rapid advancements in space-based and ground-based integrated Earth observation systems, the explosion of Earth science data in 4D (horizontal space, vertical space, modern time, and deep time dimensions) has propelled Earth science research into a new data-driven stage (Lin et al., 2018; Wang et al., 2021; Yang et al., 2024). This data explosion has significantly enhanced our understanding of Earth’s past, present, and future. However, the lack of a unified knowledge framework has left geoscience data fragmented and heterogeneous. Currently, geoscience data exist in various formats (texts, tables, maps), and are often represented using diverse terminologies for the same concepts. Furthermore, scientific data management often falls short of the FAIR principles (findable, accessible, interoperable, and reusable), limiting data integration, analysis, and discovery (Stall et al., 2019). These challenges impede the effective use of Earth data, particularly in discovering and understanding complex Earth science phenomena (Wang et al., 2019, 2021).
Addressing the above-mentioned issues is essential to advancing deep-time Earth research. One promising solution lies in the application of knowledge graph (KG) technology. Knowledge graphs provide an efficient way to organize, manage, and link complex data in a semantically rich framework. Since Google first introduced the concept of knowledge graphs in 2012, they have become pivotal in diverse domains, such as semantic search, machine translation, and recommendation systems. Notable examples include DBpedia (Mendes et al., 2012), YAGO (Tanon et al., 2020), and Wikidata (Vrandečić and Krötzsch, 2014), which have demonstrated the immense potential of KGs for representing and managing knowledge at scale. As such, the DDE program leverages knowledge graph technology to support the organization, representation, and management of geoscientific knowledge.
Within geoscience, the application of knowledge graphs is rapidly evolving. Early research has focused on key areas, such as entity and relationship extraction (Lozano et al., 2017; Wang et al., 2020), knowledge fusion (Yu et al., 2018; Trisedya et al., 2019), and knowledge representation (Qiu et al., 2019; Mai et al., 2020). In addition, scholars have explored various applications, including literature mining (Peters et al., 2017), disaster and resource analysis (Wang and Stewart, 2015; Hazen et al., 2019), intelligent geological mapping (Laxton, 2017; Mantovani et al., 2020), and object-based image analysis (Belgiu et al., 2014; Arvor et al., 2019). These advancements have laid a solid foundation for the DDE Geoscience Knowledge Graph (GKG) initiative, which draws from successful geoscience projects like GeoSciML, GeoNames Ontology, LinkedEarth Ontology, and GeoDeepDive (Simons et al., 2006; Zhang et al., 2013; Khider et al., 2019). However, the DDE GKG faces unique challenges related to its multidisciplinary scope, the vast volume of data, and the integration of diverse research fields.
The DDE GKG program aims to address three critical challenges in deep-time Earth research: (1) the lack of a unified knowledge structure across Earth science disciplines, (2) the absence of a robust, scalable software infrastructure for building and managing large-scale knowledge graphs, and (3) the need for a comprehensive, discipline-specific knowledge graph that integrates both ontological structures and real-world instances. Over the past four years, the DDE GKG has made significant progress in overcoming these challenges. Notable achievements include the development of innovative knowledge graph construction methods (Deng et al., 2021; Zhu et al., 2022), the creation of specialized software tools (DDE, 2021a, 2021b, 2021c, 2021d, 2022, 2023a, 2023b), and the establishment of basic, discipline-specific, and application-oriented geoscience knowledge graphs covering approximately 20 geoscience disciplines.
This paper reviews the advancements in the development of the DDE GKG, focusing on three key areas: (1) unifying the geoscience knowledge framework across the project, (2) enhancing the DDE knowledge graph construction and management platform, and (3) building comprehensive GKG content. Additionally, we address remaining challenges and outline future directions for the project, emphasizing its potential to revolutionize Earth science research and foster new discoveries in deep-time Earth studies.

2 Geoscience knowledge representation

2.1 Cognition of geoscience knowledge

A comprehensive and rational understanding of geoscience knowledge is paramount to the construction and organization of the DDE GKG. To effectively capture the complexities of geoscience, a multidimensional cognitive approach is essential, encompassing the scope, content taxonomy, and hierarchical structure of geoscience knowledge. Geoscience knowledge spans a multidisciplinary expanse, covering the entire spectrum of Earth science domains. This repository of knowledge consists of fundamental conceptual constructs that encapsulate core entities within geoscientific discourse, as well as real-world data derived from objective entities, phenomena, processes, and scenarios from the past and present. This culminates in a dual-layered representation of geoscience knowledge: the ontology layer, which defines the structural framework of the knowledge domain, and the instance layer, which houses real data entities. A comprehensive exploration of these aspects empowers us to effectively represent and apply geoscience knowledge, thus driving continuous growth and innovation within the field of Earth science.

2.1.1 Scope of geoscience knowledge

The geoscience knowledge contained within a GKG extends across a broad scope, delineating three fundamental dimensions: basic or common knowledge, discipline knowledge, and application knowledge, as illustrated in Figure 1. At its core, cross-disciplinary basic knowledge forms the foundational bedrock of geoscientific understanding, encompassing essential concepts such as the Earth’s structure, elemental composition, rotational dynamics, atmospheric processes, and climate systems. This foundational layer not only facilitates the integration and interoperability of geoscientific information across domains but also serves as a critical basis for cultivating a holistic understanding of Earth’s complex systems. Building upon this foundation, discipline-specific knowledge delves into specialized areas within geoscience, such as geological stratigraphy, rock classification, climate modeling, and tectonic processes. This deeper layer of knowledge enhances the accuracy and granularity of geoscientific information within the GKG, enabling advanced research and exploration in specific geoscientific fields. Finally, application knowledge bridges theory and practice by leveraging geoscientific principles to address real-world challenges, including geological resource exploration, climate prediction, and marine resource conservation. Through the integration of applied knowledge, the GKG demonstrates its practical relevance and utility in tackling pressing environmental and societal issues, thereby connecting scientific understanding to actionable solutions.
Figure 1 Cognition of geoscience knowledge and its representation in graph structures. The nodes labeled as “Concept Node” and “Attribute Node” in the Ontology Level refer to conceptual geoscience knowledge and the respective attribute information associated with them. The nodes labeled as “Instance Node” and “Instance Property” in the Instance Level represent factual geoscience knowledge and their corresponding specific attribute values.

2.1.2 Types and content of geoscience knowledge

From the perspective of knowledge functionality, geoscience knowledge can be categorized into four main types: conceptual knowledge, factual knowledge, rule-based knowledge, and metaknowledge, as depicted in Figure 1. Conceptual knowledge forms the backbone of geoscientific understanding, encompassing fundamental concepts used to define and describe entities, phenomena, processes, and events. It provides the foundational semantics for the knowledge graph, ensuring consistency and enabling seamless interlinkage across various geoscientific domains. Building on this foundation, factual knowledge comprises specific information about real-world entities, phenomena, and events within the geoscience domain, including empirical data, measurements, and observations collected from diverse sources. This type of knowledge serves as critical evidence to support research, analysis, and decision-making processes. Further advancing the structure, rule-based knowledge integrates rules, theorems, axioms, and computational models derived from both conceptual and factual knowledge. These rules establish a logical framework for reasoning, inference, and prediction, empowering sophisticated analysis and hypothesis testing based on geoscientific data. Finally, metaknowledge provides essential context by documenting the creation, validation, and updates of other types of knowledge. It includes metadata such as authorship, creation time, and versioning, ensuring transparency, accountability, and traceability throughout the knowledge lifecycle. Together, these four types of knowledge form a comprehensive and interconnected system that supports the advancement and application of geoscientific understanding.
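The four knowledge types described above can be pictured as four small record types attached to one geoscientific entity. The following is a minimal sketch only: the class names (`Concept`, `Fact`, `Rule`, `Meta`), the field names, and the curator string are hypothetical and not part of the DDE GKG, and the boundary ages for the Cambrian are approximate values used purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Concept:
    """Conceptual knowledge: a named concept with its definition."""
    name: str
    definition: str


@dataclass
class Fact:
    """Factual knowledge: a real-world instance with observed values."""
    subject: str
    concept: str
    values: Dict[str, float]


@dataclass
class Rule:
    """Rule-based knowledge: a computation over facts."""
    name: str
    derive: Callable[[Dict[str, float]], float]


@dataclass
class Meta:
    """Metaknowledge: who created a piece of knowledge, when, and which version."""
    author: str
    created: str
    version: int


# Illustrative content only; ages are approximate, in millions of years (Ma).
period = Concept("GeologicPeriod", "A named interval of geologic time")
cambrian = Fact("Cambrian", "GeologicPeriod", {"start_ma": 538.8, "end_ma": 485.4})
duration = Rule("duration_myr", lambda v: v["start_ma"] - v["end_ma"])
provenance = Meta(author="hypothetical curator", created="2024-01-01", version=1)

# Apply the rule to the fact to derive a new value (rounded to one decimal).
derived = round(duration.derive(cambrian.values), 1)
print(cambrian.subject, duration.name, derived)
```

In this sketch the rule derives a value that is not stored anywhere, which is the sense in which rule-based knowledge builds on conceptual and factual knowledge, while the `Meta` record keeps the derivation traceable.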

2.1.3 Structural hierarchies of geoscience knowledge

Geoscience knowledge exhibits a distinct hierarchical structure, comprising two pivotal layers: the ontology layer and the instance layer, as shown in Figure 1. The ontology layer forms the foundational framework of the GKG, containing the core concepts, attributes, relationships, and rules that define the structure of geoscientific knowledge. It establishes the categories, relationships, and terminologies that govern the representation of knowledge across geoscientific disciplines, ensuring coherence and consistency throughout the graph. Complementing this foundational layer, the instance layer represents real-world data, including specific data points, observations, and facts related to particular geoscientific entities and phenomena. This layer is essential for practical applications, enabling analyses and decision-making by linking abstract knowledge to empirical data. The integration of these two layers culminates in a dynamic and comprehensive GKG, bridging the gap between abstract conceptualizations and concrete real-world instances. This fusion not only empowers researchers but also fosters innovation and transformation across the entire geoscience landscape, creating a robust system that supports both theoretical understanding and practical application.
Drawing upon a comprehensive understanding of geoscience knowledge, including its domain scope, content types, and hierarchical structure (Figure 1), we are poised to efficiently manage and apply knowledge within the GKG. Building on this foundational comprehension, the following section will provide an in-depth exploration of the formal representation of the GKG.

2.2 Formalized geoscience knowledge representation

As an application of knowledge graph technology in the field of geoscience, a GKG represents geoscience knowledge as a computer-understandable knowledge network. It provides a unified knowledge foundation for resolving the heterogeneity problems that arise in geoscience research and applications in the era of big data. Some scholars even suggest that the GKG could become the third scientific language of geoscience research, following natural language and maps (Zhang et al., 2022).
The GKG integrates intricate geoscience knowledge into a unified ontology, forming a structured and interconnected semantic network through nodes and edges (Zhu et al., 2022, 2023b; Zhang et al., 2024). Within the semantic network, geoscience knowledge is abstracted into entities, which include geoscience concepts and instances, among others, as well as the relationships between these entities. Nodes in the GKG represent entities, whereas edges connect the nodes to depict relationships. The formal definition of GKG is as follows:
GKG = (N,E)|O
where N represents the set of nodes, E represents the set of edges, and O (ontology) acts as the backbone of the graph. O provides a systematic framework for organizing and defining the geoscientific concepts, attributes, and relationships within the GKG.
More specifically, the nodes within the geoscience knowledge graph encompass various geoscience entities, such as geoscience objects, elements, phenomena, processes, and scenes, along with their corresponding real-world instances. On the other hand, as shown in Figure 1, edges include various types of relationships: attribute relationships among attributes and concepts, instance relationships among instances and concepts, and association relationships among concepts and concepts, as well as among instances and instances. These association relationships between instances and concepts are primarily expressed through concept attributes, which encompass semantic relationships, temporal relationships, spatial relationships, spatiotemporal process relationships, compositional relationships, functional relationships, and computational relationships (Zheng et al., 2022; Zhu et al., 2023b). Significantly, temporal, spatial, spatiotemporal process, and computational relationships are critical distinctions that set the GKG apart from general knowledge graphs, rendering it particularly valuable in the realm of geoscience research and applications. Temporal relationships delve into the intricate temporal aspects of geoscientific entities and events, enabling the examination of Earth’s evolution over extended time spans. In contrast, spatial relationships play a fundamental role in the representation of geospatial phenomena, and are employed to portray spatial orientation, distance, and topological relationships among geoscientific concepts or instances. Spatiotemporal process relationships elucidate the evolution processes of geoscientific instances over time. Lastly, computational relationships pertain to the capacity to derive one geoscientific entity from others. These relationships may involve simple logical rules, mathematical functions, or complex models and patterns.
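The definition GKG = (N,E)|O can be read as a node set and an edge set whose admissible edges are constrained by the ontology. A minimal sketch of that reading follows, assuming a deliberately simplified ontology that only records which node kinds each relation may connect. All class names, relation names, and node identifiers here are hypothetical and are not taken from the DDE software.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str   # "concept" (ontology layer) or "instance" (instance layer)
    label: str


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str


@dataclass
class Ontology:
    """O: maps each relation name to the (source kind, target kind) pairs it allows."""
    allowed: dict = field(default_factory=dict)

    def permits(self, relation, source_kind, target_kind):
        return (source_kind, target_kind) in self.allowed.get(relation, set())


class GKG:
    """(N, E) | O: nodes and edges, with every edge checked against the ontology."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.nodes = {}
        self.edges = set()

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, source, relation, target):
        s, t = self.nodes[source], self.nodes[target]
        if not self.ontology.permits(relation, s.kind, t.kind):
            raise ValueError(f"{relation} not allowed from {s.kind} to {t.kind}")
        self.edges.add(Edge(source, relation, target))

    def neighbors(self, node_id, relation):
        return sorted(e.target for e in self.edges
                      if e.source == node_id and e.relation == relation)


ontology = Ontology({
    "instance_of": {("instance", "concept")},
    "subclass_of": {("concept", "concept")},
    "temporally_precedes": {("concept", "concept"), ("instance", "instance")},
})
gkg = GKG(ontology)
gkg.add_node(Node("c:period", "concept", "Geologic period"))
gkg.add_node(Node("i:cambrian", "instance", "Cambrian"))
gkg.add_node(Node("i:ordovician", "instance", "Ordovician"))
gkg.add_edge("i:cambrian", "instance_of", "c:period")
gkg.add_edge("i:ordovician", "instance_of", "c:period")
gkg.add_edge("i:cambrian", "temporally_precedes", "i:ordovician")
print(gkg.neighbors("i:cambrian", "temporally_precedes"))
```

Spatial, spatiotemporal process, and computational relationships would be further entries in the same `allowed` table. A production ontology would also constrain attribute types and concept-level domains and ranges, which this sketch omits.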

3 Development of Deep-time Digital Earth International Big Science Program knowledge graph software system

3.1 Overall architecture of the knowledge graph software system

To support the one-stop integrated construction, management, sharing, and application services for large-scale DDE GKGs, we have designed the Geoscience Knowledge Graph Construction and Sharing Application Software Framework (GKG Software Framework). This framework comprises two major platforms: the Geoscience Knowledge Graph Construction and Management Platform (GKG Constructor) and the Shared Application Platform (GKG Server). The GKG Constructor has two subsystems: the Group Intelligence Collaborative Construction System (Editor) and the Knowledge Automatic Extraction System (Extractor). These two systems work in tandem to form a large-scale GKG using a combination of top-down and bottom-up approaches, multi-person (crowd intelligence) collaboration, and human-machine collaboration, providing knowledge support for the GKG Server. The GKG Server includes the Knowledge Sharing Service System (Provider) and the Knowledge Application Empowerment System (Enabler). These two systems allow for GKG sharing and application services at basic-sharing and deep-application levels. The overall architecture of the GKG Software Framework is shown in Figure 2.
Figure 2 The geoscience knowledge graph (GKG) construction and sharing application software framework
Guided by the conceptual design of the GKG Software Framework, the DDE project team has successfully developed and released sophisticated software systems for knowledge graph construction, management, sharing, and application. We will explore each system’s features and functionalities in detail in the following sections.

3.2 Software for construction and management of the knowledge graph

3.2.1 Geoscience Knowledge Graph Collaborative Editor

The Geoscience Knowledge Graph Collaborative Editor (GKG Editor), developed by the research team at Tsinghua University, is a powerful knowledge tool designed for the collection, editing, and organization of ontologies and instances in the field of geoscience. It plays a pivotal role in forming a comprehensive and professional knowledge graph. The Editor tool (Figure 3) caters to users with expertise and background in geoscience, such as scholars, experts, and scientists (Shi et al., 2020).
Figure 3 Online service of geoscience knowledge graph collaborative editor
The Editor tool is characterized by four notable features. Firstly, it follows strict ontology construction rules, ensuring normalization in the graph construction process. Secondly, it employs a specialized approach to graph construction, leveraging the collaborative efforts of both humans and machines, ensuring a high level of professionalism and accuracy. Thirdly, the tool offers expandability, allowing users to customize and adjust audit settings based on their specific requirements. Finally, it promotes sustainability by incorporating an incentive mechanism, motivating users to actively contribute and maintain the knowledge graph.

3.2.2 Geoscience knowledge tree auto-renew and completion system

The Geoscience knowledge tree auto-renew and completion system (ReCS) is a product of the research team at Shanghai Jiao Tong University, serving as a robust platform for efficient geoscience knowledge management. Leveraging machine learning algorithms, ReCS establishes connections between vast collections of geoscience documents and relevant knowledge. By employing text mining technology, it expands expert knowledge into a hierarchical knowledge tree, wherein each node represents a field keyword, enabling ReCS to dynamically extract valuable knowledge from unstructured data in real time. Furthermore, ReCS associates each document with nodes on the knowledge tree, facilitating swift and precise document searches. The overall architecture of ReCS is depicted in Figure 4.
Figure 4 Whole architecture of the geoscience knowledge tree auto-renew and completion system
ReCS not only encompasses knowledge tree expansion but also incorporates literature papers within the knowledge tree. By mining potential knowledge from extensive geoscience literature, ReCS identifies relationships between new knowledge points and the existing knowledge tree by using an expansion algorithm. As this automated process may lack accuracy, a human-in-the-loop correction mechanism is employed to verify the results. During verification, the system presents the newly added knowledge points along with recommendations for parent and child nodes on the knowledge tree. Experts assess the recommendations to determine their accuracy and select the appropriate results. Upon the correction of each added point, a new hierarchical knowledge tree is constructed.
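The recommend-then-verify loop described above can be sketched as follows. This is an illustration under stated assumptions, not the ReCS expansion algorithm: token-overlap (Jaccard) similarity stands in for whatever ranking ReCS actually uses, and every function name, the callback signature, and the example keywords are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    keyword: str
    children: list = field(default_factory=list)


def tokens(text):
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for the real ranking model."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def walk(node):
    """Yield every node of the tree, parents before children."""
    yield node
    for child in node.children:
        yield from walk(child)


def recommend_parents(root, candidate, top_k=2):
    """Rank existing nodes as possible parents for a newly mined keyword."""
    scored = [(jaccard(candidate, n.keyword), n.keyword) for n in walk(root)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [kw for score, kw in scored[:top_k] if score > 0]


def attach(root, candidate, parent_keyword):
    for node in walk(root):
        if node.keyword == parent_keyword:
            node.children.append(TreeNode(candidate))
            return True
    return False


def expand_with_review(root, candidate, expert_choice):
    """Machine proposes parents; the expert callback accepts one or returns None."""
    options = recommend_parents(root, candidate)
    chosen = expert_choice(candidate, options)
    if chosen is None or not attach(root, candidate, chosen):
        return None
    return chosen


root = TreeNode("rock", [TreeNode("igneous rock"), TreeNode("sedimentary rock")])
options = recommend_parents(root, "clastic sedimentary rock")
# The "expert" here simply accepts the top recommendation.
result = expand_with_review(root, "clastic sedimentary rock",
                            lambda cand, opts: opts[0] if opts else None)
print(options, result)
```

Keeping the expert decision behind a callback means the tree is modified only after a human has chosen among the machine's recommendations, and a rejected candidate (a `None` return) leaves the tree unchanged.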

3.2.3 Crowdsourced knowledge editing platform

The crowdsourced Knowledge Editing Platform (DDE WIKI) is a knowledge community developed by a research team at Beijing University of Posts and Telecommunications. The platform integrates DDE knowledge inquiry with collaborative, crowdsourced content creation. The design of DDE WIKI is illustrated in Figure 5. Within the DDE Salon Community, expert involvement combined with artificial intelligence (AI) technologies, including large language models (LLMs), drives the continuous iterative refinement and enrichment of the geoscience knowledge generated by users. The outcome is a robust knowledge query and response service that gives users access to high-quality information. Central to this is the DDE knowledge crowdsourcing editing platform, which enables academic professionals to independently create entries and initiate topical discussions. The resulting body of knowledge aligns with the information-seeking needs of the DDE Salon Community’s users.
Figure 5 Design philosophy of the crowdsourced knowledge editing platform
By combining AI capabilities for swift answer generation, expert intervention to ensure precision, a well-designed incentive framework, transactional systems, crowdsourced knowledge construction, and model optimization strategies, DDE WIKI has emerged as a cornerstone in the field of Earth science. It provides researchers with an efficient, high-quality platform for accumulating and disseminating knowledge, while nurturing vibrant scholarly interactions.

3.3 Software for sharing and applications of the knowledge graph

3.3.1 One-stop for geoscience knowledge graph

One-stop for geoscience knowledge graph (GeoOpenKG) is a cutting-edge platform meticulously crafted by the scientific team at the Institute of Geographic Sciences and Natural Resources Research, CAS, with a primary focus on facilitating the open sharing of DDE GKGs. Its primary objective is to provide a well-organized and easily accessible channel for sharing the wealth of knowledge accumulated through the DDE program with a wide range of users, including ordinary individuals, industrial applications, and scientific researchers within the geosciences field. By facilitating the open sharing of these knowledge graphs, the system not only fosters advancements in various geoscience domains, but also enables continuous refinement and enhancement of the knowledge graphs through valuable feedback and meticulous log analysis gathered during practical applications (Zhu et al., 2023a).
The core components of the GeoOpenKG platform provide knowledge sharing and services at three main levels. The first level, known as the Basic Ontology for Geoscience Knowledge Graph (BO4KG), serves as the foundation by providing essential and manual ontologies for all disciplines within the geoscience field, ensuring consistency in the semantic representation of basic concepts and attributes. The second level comprises the Discipline-specific Knowledge Graphs, also referred to as Geoscience Professional Knowledge Graphs (GPKGs), which are collaboratively created by geoscientists from various disciplines worldwide through the GKG Editor. The third level consists of application-specific knowledge graphs, built on top of the BO4KG and GPKGs to address specific geoscience applications. Furthermore, the GeoOpenKG platform includes a compilation of various open GKGs, forming the Geoscience Knowledge Graph Open Directory (GKGD). This directory integrates well-known ontology libraries, thesauri, and geographic dictionaries, such as the USGS Thesaurus, UNBIS Thesaurus, and National Land and Property Gazetteer (NLPG). The integration of the DDE GKG with the GKGD enhances the platform’s capabilities and provides a comprehensive one-stop system for opening and sharing geoscience knowledge graphs, sparing users complex search and identification operations. A conceptual diagram of GeoOpenKG is illustrated in Figure 6.
Figure 6 Overall architecture of one-stop for geoscience knowledge graph (GeoOpenKG). The Deep-time Digital Earth International Big Science Program geoscience knowledge graph (DDE GKG) comprises three hierarchical levels: (a) BO4KG, (b) GPKGs, and (c) application-specific knowledge graphs, which correspond to the root (as the foundational base), trunk (as the supporting pillar), and crown (as the fruits of applications) of the “DDE GKG Tree,” respectively. Additionally, Geoscience Knowledge Graph Open Directory (GKGD) constructs an efficient index to other open knowledge graphs, and, together with the DDE GKG, forms the GeoOpenKG, a comprehensive platform for the opening and sharing of GKGs.
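One way to picture the root, trunk, and crown layering is as a dependency rule: a concept in a given tier may only refer to parents defined in the same tier or a lower one. The sketch below checks that rule on a toy example. The tier keys, the concept names, and the representation of each tier as a concept-to-parent mapping are assumptions made for illustration and do not reflect the GeoOpenKG data model.

```python
# Tiers ordered from root to crown: BO4KG, GPKGs, application-specific graphs.
TIERS = ("basic", "discipline", "application")


def visible_concepts(graphs, tier):
    """Concepts a tier may reference: its own plus those of all lower tiers."""
    seen = set()
    for lower in TIERS[: TIERS.index(tier) + 1]:
        seen |= set(graphs.get(lower, {}))
    return seen


def dangling_parents(graphs):
    """Return (tier, concept, parent) triples whose parent is not visible."""
    problems = []
    for tier in TIERS:
        visible = visible_concepts(graphs, tier)
        for concept, parent in sorted(graphs.get(tier, {}).items()):
            if parent is not None and parent not in visible:
                problems.append((tier, concept, parent))
    return problems


# Hypothetical content: each tier maps a concept to its parent concept.
graphs = {
    "basic": {"Rock": None},
    "discipline": {"Sandstone": "Rock"},
    "application": {"ReservoirSandstone": "Sandstone", "SealRock": "Caprock"},
}
print(dangling_parents(graphs))
```

Here `SealRock` is reported because its parent `Caprock` is defined in no tier, while `ReservoirSandstone` passes because it builds on a discipline-level concept that in turn builds on the basic ontology.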
The platform’s functionalities cover a range of essential features, including knowledge retrieval, browsing, visualization, and download options. By providing users with these diverse functionalities, GeoOpenKG offers a holistic experience, facilitating efficient acquisition of geoscience knowledge. Researchers and practitioners can efficiently explore and utilize the vast wealth of geoscience knowledge available through this platform. The seamless aggregation of knowledge graphs from multiple sources enhances the system’s usability and effectiveness, positioning GeoOpenKG as a valuable tool for geoscientists seeking to broaden their understanding and engage in cutting-edge research within the field. The home page of GeoOpenKG, depicted in Figure 7, serves as the gateway to this comprehensive platform.
Figure 7 Home page of the one-stop for geoscience knowledge graph

3.3.2 Multimodal geoscience semantic search system

The Multimodal Geoscience Semantic Search System (Semantic Search), developed by the research team at Shanghai Jiao Tong University, is an efficient tool within the Multimodal Geoscience Academic Knowledge Graph (GAKG) platform. Unlike traditional academic search engines such as Google Scholar and Semantic Scholar, the Semantic Search system goes beyond keyword-based search and adopts a semantic-level correlation matching approach. It focuses on knowledge points and essential elements mentioned in geoscience articles within the GAKG, enabling users to retrieve relevant illustrations, tables, geological eras, and geographic location information contained within the papers (Figure 8). This unique capability compensates for the limitations of conventional keyword-based searches, which often overlook valuable semantic information in scientific literature. By leveraging semantic-level correlation matching, Semantic Search enhances the precision and depth of search results, providing geoscientists with a powerful and effective tool for accessing pertinent information in their research domain (Deng et al., 2021).
Figure 8 Multimodal Geoscience Semantic Search System of geoscientific literature
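The difference between keyword matching and semantic-level matching can be illustrated with a toy index. This sketch does not reproduce the GAKG implementation: it swaps in a simple technique, mapping the query to a knowledge graph concept and expanding to narrower concepts, and the concept identifiers, synonym table, and document records are all hypothetical.

```python
# Hypothetical synonym table: surface terms mapped to concept identifiers.
SYNONYMS = {
    "carbonate rock": "c:carbonate_rock",
    "limestone": "c:limestone",
    "dolostone": "c:dolostone",
    "dolomite rock": "c:dolostone",
}

# Hypothetical concept hierarchy: broader concept -> narrower concepts.
NARROWER = {"c:carbonate_rock": {"c:limestone", "c:dolostone"}}

# Hypothetical documents, each annotated with the concepts it mentions.
DOCS = {
    "paper-1": {"text": "Facies analysis of a limestone succession",
                "concepts": {"c:limestone"}},
    "paper-2": {"text": "Dolostone reservoir characterization",
                "concepts": {"c:dolostone"}},
    "paper-3": {"text": "Basalt geochemistry", "concepts": {"c:basalt"}},
}


def keyword_search(query):
    """Baseline: literal substring match on the document text."""
    q = query.lower()
    return sorted(doc for doc, d in DOCS.items() if q in d["text"].lower())


def expand(concept):
    """Collect the concept together with all of its narrower concepts."""
    wanted, stack = {concept}, [concept]
    while stack:
        for child in NARROWER.get(stack.pop(), ()):
            if child not in wanted:
                wanted.add(child)
                stack.append(child)
    return wanted


def semantic_search(query):
    """Match documents through concepts rather than through literal strings."""
    concept = SYNONYMS.get(query.lower())
    if concept is None:
        return []
    wanted = expand(concept)
    return sorted(doc for doc, d in DOCS.items() if d["concepts"] & wanted)


print(keyword_search("carbonate rock"))
print(semantic_search("carbonate rock"))
```

The literal query finds nothing because no document text contains the phrase, whereas the concept-level query returns the limestone and dolostone papers. The same concept annotations could index figures, tables, geological eras, and locations instead of whole papers.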

3.3.3 Academic Portrait and Report Auto-generation System

The Academic Portrait and Report Auto-generation System (DDE Scholar), developed by the research team at Shanghai Jiao Tong University, is a sophisticated system tailored specifically to meet the needs of geology scholars. Serving as a portrait system, DDE Scholar swiftly generates comprehensive literature reports by effectively extracting valuable insights from an extensive collection of geoscience literature (Figure 9). The system’s literature repository is derived from openly accessible abstracts and metadata provided by over 5 million researchers from 27,266 institutions, covering more than 6 million papers and 9000 books published in over 600 journals.
Figure 9 Homepage and portrait report of Academic Portrait and Report Auto-generation System
The system offers several practical functions for geoscience researchers. GeoRankings is a metric-based ranking service that informs scholars about leading research institutions worldwide and thereby supports collaboration within the geoscience community. At the core of DDE Scholar lies the GKG, which is constructed from the literature repository described above. By organizing this literature as a structured semantic network in graph form, DDE Scholar supports reasoning and knowledge discovery and facilitates the exploration of cutting-edge research within the discipline. The system also offers a similarity check tool, which allows researchers to evaluate the relevance of their work to existing studies across multiple dimensions. In addition, DDE Scholar provides multidimensional data analysis entry points for authors, institutions, and journals. Through statistics and analysis reports, researchers can examine citation patterns, author distributions, and research focus areas to inform decision-making and strategic planning in geoscience.

4 Construction of Deep-time Digital Earth International Big Science Program knowledge graph content

4.1 Overall content framework of the knowledge graph

The DDE GKG serves as the cornerstone of the DDE GKG product family, with its structure organized across three hierarchical levels. At the foundational level, Basic Geoscience Knowledge Graphs provide essential ontologies across all geoscience disciplines, ensuring semantic consistency of fundamental concepts and attributes. Building upon this base, Discipline-Specific Knowledge Graphs are developed through global collaboration among geoscientists using the GKG Editor, capturing the specialized knowledge of various geoscience fields. Finally, Application Knowledge Graphs are constructed atop the Basic and Discipline-Specific Knowledge Graphs, tailored for specific scientific applications to address challenges and advance knowledge in targeted areas. Together, these three levels form a cohesive and scalable framework that supports the integration, specialization, and practical application of geoscientific knowledge.
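The three-level organization can be pictured as a dependency structure in which each graph records the graphs it builds on. The following is a minimal sketch under that assumption; the class, its fields, and the specific pairing of a sedimentology graph with a carbonate microfacies application are illustrative choices, not a description of the DDE data model.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    name: str
    level: str  # "basic", "discipline", or "application"
    builds_on: list = field(default_factory=list)

    def foundations(self):
        """Names of all graphs this one depends on, directly or transitively."""
        seen = []
        stack = list(self.builds_on)
        while stack:
            graph = stack.pop()
            if graph.name not in seen:
                seen.append(graph.name)
                stack.extend(graph.builds_on)
        return seen


# Hypothetical example spanning the three levels.
bo4kg = KnowledgeGraph("BO4KG", "basic")
sedimentology = KnowledgeGraph("sedimentology", "discipline", [bo4kg])
microfacies = KnowledgeGraph(
    "carbonate microfacies identification", "application", [sedimentology, bo4kg]
)

if __name__ == "__main__":
    print(sorted(microfacies.foundations()))
```

Recording dependencies explicitly makes it possible to trace which basic ontologies an application graph ultimately relies on.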

4.2 Basic geoscience knowledge graphs

Basic geoscience knowledge graphs comprise two essential aspects: the Basic Ontology for Geoscience Knowledge Graph (BO4KG) and the Geoscience Academic Knowledge Graph (GAKG). The primary objective of BO4KG is to provide a foundational and interdisciplinary ontology for constructing DDE discipline-specific and application-specific knowledge graphs, along with their potential applications. This process ensures semantic consistency across diverse disciplinary fields in geoscience, establishing a robust foundation for integrated data resources, shared exchanges, mining and analysis, and computational and logical inference of geoscientific knowledge. BO4KG includes Time-Ontology (Wang et al., 2023b), Spatial-Ontology, Morphology-Ontology, and Provenance-Ontology (Sun et al., 2019), as well as a geologic chronology knowledge graph and a global geographical name knowledge graph based on the Basic Ontology, as depicted in Figure 10. The Time-Ontology formalizes fundamental time-related concepts for geoscientific objects, phenomena, and events, along with their attributes and temporal relationships (Hou et al., 2015). Spatial-Ontology formalizes essential spatial location concepts for geoscientific objects, phenomena, and events, along with their attributes and spatial relationships (Wang et al., 2016). Morphology-Ontology formalizes data structure and external shape concepts, including data reference, format, type, and scale (Sun et al., 2016). Provenance-Ontology formalizes data production, processing, and distribution concepts, supporting the unified evaluation of data reliability, including data source, acquisition and processing methods, tools, and data custodians (Li et al., 2017).
Figure 10 Exemplary ontologies within Basic Ontology for Geoscience Knowledge Graph
Currently, the scale of the constructed Basic Ontology is as follows: Spatial-Ontology includes 38 concepts, 51 relationships, and 73,382 instances; Time-Ontology includes 41 concepts, 42 object properties, 44 data properties, and 326 logical relationships and rules; Morphology-Ontology includes 42 concepts, 27 properties, 39 relationships, and 3995 instances; and Provenance-Ontology includes 69 concepts, 36 properties, 28 relationships, and 77 instances. The BO4KG also supports several key applications, such as the Unified Temporal Framework (UTF) for accurate temporal calculations across diverse geoscientific time references (Wang et al., 2023b), the Geologic Chronology Knowledge Graph based on Time and Spatial Ontologies containing over 60,000 triplets derived from 60 versions of geological time scales spanning from 1893 to 2022, and the integration of global geographic entities (5.96 million points and 578,735 areas) sourced from GeoNames and OpenStreetMap, providing comprehensive geospatial context for geoscientific analysis.
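To make the idea of formalized temporal relationships concrete, the sketch below represents geologic intervals by their boundary ages and derives a coarse relation between two intervals. It is a simplified illustration, not the Time-Ontology or the UTF: the class and function names are hypothetical, the relation set is a reduced form of interval algebra, and the boundary ages are approximate values that differ among time-scale versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoInterval:
    """A named interval of geologic time; ages are in Ma (millions of years ago)."""
    name: str
    start_ma: float  # older boundary (larger number)
    end_ma: float    # younger boundary (smaller number)

    def __post_init__(self):
        if self.start_ma < self.end_ma:
            raise ValueError("start_ma must be older (larger) than end_ma")


def relation(a: GeoInterval, b: GeoInterval) -> str:
    """Coarse temporal relation of interval a with respect to interval b."""
    if (a.start_ma, a.end_ma) == (b.start_ma, b.end_ma):
        return "equals"
    if a.end_ma > b.start_ma:
        return "before"      # a ended before b began
    if a.end_ma == b.start_ma:
        return "meets"       # a ends exactly where b begins
    if a.start_ma < b.end_ma:
        return "after"       # a began after b ended
    if a.start_ma == b.end_ma:
        return "met_by"
    if a.start_ma >= b.start_ma and a.end_ma <= b.end_ma:
        return "contains"    # a spans all of b
    if b.start_ma >= a.start_ma and b.end_ma <= a.end_ma:
        return "during"      # a lies within b
    return "overlaps"


# Illustrative, approximate boundary ages (assumed for this example only).
TRIASSIC = GeoInterval("Triassic", 251.9, 201.4)
JURASSIC = GeoInterval("Jurassic", 201.4, 145.0)
CRETACEOUS = GeoInterval("Cretaceous", 145.0, 66.0)
MESOZOIC = GeoInterval("Mesozoic", 251.9, 66.0)

if __name__ == "__main__":
    print(relation(JURASSIC, CRETACEOUS))
    print(relation(MESOZOIC, JURASSIC))
```

Because ages count backward from the present, "before" corresponds to a larger Ma value, which is why the comparisons run in the opposite direction to calendar time.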
As a multimodal academic knowledge graph of geoscience, GAKG compiles multi-dimensional graph-text-numerical data extracted from academic papers on geosciences (Deng et al., 2021). It treats academic papers as core entities, combines them with bibliometric information, and uses text mining methods to match knowledge entities with the articles. GAKG includes illustrations, tables, and related structured data, with knowledge entities automatically linked to open knowledge graphs (linked data) such as DBpedia. The current GAKG data comprise 150 million triples and more than 2 million entities, encompassing 11 concept types, 19 inter-entity relationships (object properties), and 39 data properties. Table 1 summarizes key statistics on the concepts and relations within the GAKG.
Table 1 The number of concepts and relations in the Geoscience Academic Knowledge Graph
Concept | Count | Relation | Count
paper | 3,711,981 | is_cited_by | 39,540,518
author | 3,924,701 | on_the_topic_of | 9,194,720
affiliation | 27,195 | is_written_by | 17,219,176
topic | 792,281 | is_published_in | 3,711,981
journal | 535 | is_last_known_in | 3,290,176
timescale | 1792 | has_illustration | 6,417,139
mention | 829,749 | has_table | 1,061,790
illustration | 6,417,139 | has_mention | 17,920,198
papertable | 1,061,790 | mention_location | 723,180
location | 412,108 | has_geohash | 2,161,150
geohash | 2,161,150 | mention_timescale | 2,420,151
- | - | parent_mention | 51,934
Total | 19,340,421 | Total | 86,492,937
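A paper-centred graph of this kind can be explored by following relations outward from paper entities. The sketch below stores a handful of invented triples that reuse relation names from Table 1 and answers one question: which illustrations belong to papers that mention a given time scale. The entity identifiers, the direction assumed for each relation, and the helper functions are hypothetical and do not reflect the GAKG query interface.

```python
# Invented triples (subject, relation, object); relation names follow Table 1,
# but the subject/object direction is an assumption made for this sketch.
TRIPLES = {
    ("paper:1", "is_written_by", "author:a"),
    ("paper:1", "has_illustration", "illustration:1"),
    ("paper:1", "has_mention", "mention:1"),
    ("mention:1", "mention_timescale", "timescale:Jurassic"),
    ("paper:2", "is_written_by", "author:a"),
    ("paper:2", "has_table", "papertable:1"),
    ("paper:2", "has_illustration", "illustration:2"),
}


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def illustrations_for_timescale(triples, timescale):
    """Illustrations of papers that contain a mention of the given time scale."""
    mentions = {s for s, _, _ in match(triples, p="mention_timescale", o=timescale)}
    papers = {s for s, _, o in match(triples, p="has_mention") if o in mentions}
    return sorted(
        o for s, _, o in match(triples, p="has_illustration") if s in papers
    )


if __name__ == "__main__":
    print(illustrations_for_timescale(TRIPLES, "timescale:Jurassic"))
```

The same pattern-matching helper can be chained along other relations, for example from authors to papers to tables.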

4.3 Discipline geoscience knowledge graphs

The DDE discipline geoscience knowledge graphs (Geoscience Professional Knowledge Graphs, GPKGs) are built by experts in various disciplinary fields around the world using the GKG Editor (Shi et al., 2020), whose functions are described in Section 3.2.1. To date, the GPKGs comprise the knowledge trees and ontologies of 20 disciplines, including stratigraphy, paleontology, geochronology, sedimentology, igneous petrology, metamorphic petrology, mineralogy, geomorphology, paleogeography, tectonics, paleomagnetism, mathematical geology, surficial geochemistry, geophysics, geological mapping, petroleum geology, hydrogeology, geothermics, and engineering geology, and together contain nearly 62,000 nodes (concepts/properties) and 63,000 relationships. This collaborative approach allows scientists worldwide to contribute continually to the knowledge base, ensuring that the GPKGs remain up-to-date and scientifically rigorous. The knowledge graphs can be browsed and edited online, supporting collaboration across global geoscientific communities. Figure 11 illustrates the disciplinary knowledge framework and the stratigraphy knowledge graph within the DDE GKG.
Figure 11 Geoscience professional knowledge graphs constructed by Geoscience Knowledge Graph Collaborative Editor

4.4 Application geoscience knowledge graphs

Application geoscience knowledge graphs are constructed on the basis of the Basic Ontology and Discipline knowledge graphs, focusing on solving specific scientific problems and supporting applications in various fields, such as porphyry copper prediction, standard carbonate microfacies identification, basic landform type identification, and petroliferous basin evaluation (Wang et al., 2023a). These application-oriented knowledge graphs are valuable for advancing scientific research and also demonstrate the practical utility of the DDE ontologies in real-world applications. Additional specialized knowledge graphs have been developed for fields such as carbonate rocks, paleobiogeography, and geothermal energy (Chen et al., 2023; Xu et al., 2023; Yu et al., 2023). These domain-specific knowledge graphs offer researchers in areas such as tectonic geomorphology and petroleum exploration an integrated approach to data-driven scientific discovery (Tang et al., 2023; Xi et al., 2023). Figure 12 presents examples of application geoscience knowledge graphs constructed by geoscientists.
Figure 12 Illustrative instances of application geoscience knowledge graphs

5 Discussion

5.1 Distinctive features of the Deep-time Digital Earth geoscience knowledge graph

Geoscience, as a data-intensive discipline, has long been at the forefront of knowledge sharing and utilization, with numerous well-established knowledge engineering projects such as GeoSciML, SWEET, GeoNames Ontology, LinkedEarth Ontology, and GeoDeepDive (Simons et al., 2006; Zhang et al., 2013; Khider et al., 2019). These projects have made significant strides in facilitating knowledge sharing and fostering interdisciplinary collaboration. Although the DDE GKG draws on these precedents for inspiration, it distinguishes itself in several key areas (Table 2).
Table 2 Comparison of key geoscience knowledge projects
Geoscience knowledge project | Focus and scope | Initiating organization/individual | Knowledge scale | Construction methodology
DDE GKG | Earth’s geological and evolutionary processes, integrating multiple Earth science disciplines | DDE International Big Science Program | 20 geoscience disciplines, 62,000 solid Earth concepts/properties; >150 million triples (literature graph) | Hybrid: top-down expert-driven ontology, bottom-up AI-driven instance extraction
GeoSciML | Data model for geoscience data sharing and exchange | IUGS | 1772 concepts | Expert-driven manual ontology creation
SWEET | Ontology library describing core concepts in Earth sciences | NASA | 4533 concepts, 359 properties | Expert, crowdsourcing, and domain dictionary-based curation
GeoNames Ontology | Global geographic name dictionary | Marc Wick (person) | ~12 million geographic entities, ~25 million place names | Expert and crowdsourcing-based curation
LinkedEarth Ontology | Semantic platform for paleoclimate data integration and archiving | National Science Foundation, United States | 6 sub-ontologies, 148 concepts, 55 relationships | Expert-driven manual ontology creation
GeoDeepDive | Geoscience literature processing | National Science Foundation, United States | ~13.4 million geoscience documents | Machine learning-based literature mining
The DDE GKG is uniquely oriented towards Earth’s geological and evolutionary processes, necessitating the integration of a broad array of Earth science disciplines. In contrast, other geoscience knowledge projects often focus on specific domains or serve as foundational repositories for geoscience data. For example, GeoSciML and SWEET provide basic knowledge services but are more limited in scope, often relying on expert-curated, smaller-scale datasets. On the other hand, projects like GeoNames focus on geographic names, and LinkedEarth primarily supports paleoclimatic research. These projects, although valuable in their respective domains, do not directly address the broad interdisciplinary challenges that the DDE GKG aims to solve.
In terms of scale and systematic construction, the DDE GKG spans 20 distinct geoscience disciplines, with a KG that encompasses more than 150 million triples—a scale that continues to expand. This extensive coverage is a significant departure from most other geoscience projects, which typically provide only limited data or specific thematic coverage. Furthermore, the DDE GKG emphasizes user involvement through crowdsourcing, engaging a wide network of international scientists, which contributes to both the depth and breadth of the knowledge graph.
A key differentiator of the DDE GKG lies in its collaborative approach, which integrates AI technologies with the expertise of domain specialists in a human-computer collaboration. This synergy supports the top-down creation of knowledge frameworks and ontologies as well as the dynamic, bottom-up extraction of large-scale knowledge instances. By combining crowdsourcing with AI, the DDE GKG maintains an active lifecycle of continuous iterations and updates. This collaborative model offers an advantage over projects such as GeoSciML and SWEET, which feature high-quality, well-structured knowledge systems but rely predominantly on expert-driven curation and therefore lack the flexibility for ongoing updates. Conversely, projects like GeoDeepDive, which rely on machine learning for literature-based knowledge extraction, often lack domain-specific crowdsourced input. This limits their ability to generate systematically comprehensive and contextually relevant knowledge, and in turn restricts their depth and practical utility.
Another notable distinction of the DDE GKG is its integrated research ecosystem, comprising a suite of software tools such as the GKG Editor, ReCS, DDE Wiki, GeoOpenKG, Semantic Search, and DDE Scholar. These tools enable both the construction and ongoing enhancement of the knowledge graph, as well as fostering community-driven knowledge sharing and collaboration. In contrast, other geoscience knowledge projects typically focus on static knowledge repositories or limited interactive features, without offering the same degree of support for continuous community-driven development. Together, these unique characteristics position the DDE GKG as a transformative platform for advancing geoscientific research and addressing complex interdisciplinary challenges.

5.2 Construction strategy of the geoscience knowledge graphs

The construction of the DDE GKG integrates top-down and bottom-up approaches, combining manual expertise with automated methods to ensure comprehensive and scalable knowledge representation, as shown in Figure 13. This hybrid methodology balances the accuracy of expert-driven content with the efficiency of machine-driven data extraction.
Figure 13 Overall technical scheme of Deep-time Digital Earth International Big Science Program knowledge graph construction
In the top-down phase, domain experts curate a coherent and unified geoscience knowledge framework. Drawing inspiration from the seven-step ontology construction methodology proposed by Stanford University (Noy and McGuinness, 2001), the DDE GKG employs an eight-step framework to ensure systematic ontology construction, which is shown in Figure 14 (Zhu et al., 2022). This process emphasizes clarity, coherence, and extensibility, ensuring that the knowledge graph meets the scientific standards and practical requirements of geoscientific research.
Figure 14 The steps for constructing ontologies of the geoscience knowledge graph
Concurrently, the bottom-up approach utilizes automated techniques, such as machine learning and natural language processing, to extract and populate the knowledge graph with real-world instances from scientific literature and other sources. This approach enables the rapid scaling of the knowledge graph, facilitating the incorporation of vast amounts of data while maintaining a high degree of accuracy through continuous refinement.
The iterative fusion of both approaches ensures that the knowledge graph remains up-to-date, accurate, and aligned with current research. The GKG Editor allows domain experts to periodically review and adjust the ontology as new concepts and relationships emerge, while the ReCS tool automatically updates the graph with fresh data. This dynamic interaction ensures that the DDE GKG evolves over time in response to new discoveries and shifting research priorities, making it a truly living knowledge resource. By combining the strengths of expert-driven curation and machine-driven scalability, the DDE GKG achieves a robust and adaptive framework that supports the ever-growing demands of geoscientific research.
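The interplay between the two phases can be sketched as a triage step: an expert-defined schema (top-down) constrains the candidate triples produced by automated extraction (bottom-up), and anything that fails the check is routed back to experts for review. This is only an illustration under stated assumptions; the schema, entity types, confidence scores, and threshold below are hypothetical and do not describe the actual behavior of the GKG Editor or ReCS.

```python
# Hypothetical expert-curated schema: relation -> (domain class, range class).
SCHEMA = {
    "hasAge": ("Formation", "GeologicTime"),
    "composedOf": ("Formation", "RockType"),
}

# Hypothetical entity typing, e.g. from the ontology or earlier curation.
ENTITY_TYPES = {
    "Morrison Formation": "Formation",
    "Late Jurassic": "GeologicTime",
    "sandstone": "RockType",
}

# Hypothetical extractor output: (subject, relation, object, confidence).
CANDIDATES = [
    ("Morrison Formation", "hasAge", "Late Jurassic", 0.93),
    ("Morrison Formation", "composedOf", "Late Jurassic", 0.88),  # wrong range
    ("Morrison Formation", "composedOf", "sandstone", 0.55),      # low confidence
    ("Morrison Formation", "locatedIn", "Colorado", 0.97),        # unknown relation
]


def triage(candidates, schema, entity_types, min_conf=0.8):
    """Split candidates into auto-accepted triples and triples needing expert review."""
    accepted, review = [], []
    for s, p, o, conf in candidates:
        signature = schema.get(p)
        conforms = (
            signature is not None
            and entity_types.get(s) == signature[0]
            and entity_types.get(o) == signature[1]
        )
        if conforms and conf >= min_conf:
            accepted.append((s, p, o))
        else:
            review.append((s, p, o))
    return accepted, review


if __name__ == "__main__":
    accepted, review = triage(CANDIDATES, SCHEMA, ENTITY_TYPES)
    print(len(accepted), "accepted;", len(review), "sent to expert review")
```

A rejected candidate with an unknown relation, such as the last one above, is also a signal that the ontology itself may need to be extended, which is where the periodic expert review closes the loop.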

5.3 Challenges and future directions

Although the DDE GKG has made remarkable progress, the scale and complexity of the project present ongoing challenges. The following areas represent the key challenges and opportunities for further development:
(1) Collaborative Construction and Global Knowledge Sharing
The sustained development of the DDE GKG, which integrates diverse Earth science disciplines, is foundational for the comprehensive understanding of geological and evolutionary processes. A truly effective knowledge graph must incorporate the interdisciplinary nature of geoscience and address the spatiotemporal variability inherent to Earth systems. While the current knowledge graph content covers 20 geoscience disciplines, its breadth and depth remain insufficient to fully support large-scale data integration, complex analyses, and cross-disciplinary knowledge discovery. Geoscience knowledge is inherently dynamic, shaped by both temporal and spatial factors. Therefore, ongoing collaborative contributions from the global geoscience community are essential for expanding the graph and ensuring that it remains relevant and accurate. The open sharing of geoscientific knowledge not only minimizes the risk of duplicative efforts but also promotes the establishment of a unified, global knowledge base that fosters collaborative research and policy formulation. Hence, the DDE GKG must further cultivate a robust and scalable platform for community-driven contributions. Ensuring accessibility, transparency, and inclusivity will be crucial for realizing its full potential as a tool for advancing geoscientific research and for informing sustainable development policies.
(2) Evaluation and Quality Assurance of the Geoscience Knowledge Graph
Ensuring the scientific validity and integrity of the DDE GKG is paramount to its success. Knowledge graphs, particularly those in complex scientific fields like geoscience, must meet stringent standards of accuracy, consistency, and reliability. These qualities are essential not only for supporting computational reasoning but also for enabling advanced geospatial analysis, data integration, and hypothesis generation. Currently, the quality control processes of the DDE GKG rely largely on expert review, which, while valuable, introduces inherent subjectivity into the evaluation process. To enhance the credibility and robustness of the knowledge graph, the project must establish a more formalized, systematic framework for quality assessment. This framework should be built on objective, quantitative criteria that address key aspects such as data completeness, temporal and spatial precision, conceptual coherence, and alignment with established geoscientific theories. Furthermore, the development of automated quality assurance tools tailored to the unique requirements of the DDE GKG would enable real-time quality evaluation and continuous improvement. Institutionalizing such quality control measures will significantly increase the trustworthiness of the knowledge graph and ensure its long-term applicability in both scientific research and policy contexts.
(3) Integration with Large Language Models (LLMs)
The integration of Large Language Models (LLMs) into the DDE GKG represents a promising avenue for enhancing the graph’s utility in geoscientific research. LLMs have demonstrated exceptional capabilities in natural language processing tasks such as text generation, sentiment analysis, and information retrieval. However, their performance in domain-specific reasoning—particularly in highly specialized fields like geoscience—can be limited by issues of accuracy, interpretability, and consistency. In contrast, knowledge graphs offer structured and semantically rich representations of domain-specific knowledge, providing a clear advantage in tasks requiring precise reasoning and domain-specific inference (Pan et al., 2024). By integrating LLMs with the DDE GKG, the project can leverage the complementary strengths of both approaches. LLMs can enhance the extraction of concepts, relations, and data from large corpora of geoscientific literature, while the knowledge graph can provide the necessary context and semantic grounding to improve the interpretability and reasoning of the model’s outputs. This synergistic combination would facilitate more effective knowledge discovery, automated reasoning, and conceptual modeling, ultimately advancing both AI-driven geoscientific analysis and the long-term development of the DDE knowledge graph.
(4) Practical Applications and Demonstration of Use Cases
Although the DDE GKG has made commendable progress in terms of disciplinary breadth and expert involvement, its long-term success will depend on its ability to address real-world geoscientific problems. The real-world impact of the knowledge graph will be measured by its application in solving concrete research and policy challenges. Future development should focus on leveraging the DDE knowledge graph for specific, high-priority applications in geoscience. This includes, but is not limited to, spatiotemporal alignment and conversion of geoscientific datasets, multimodal resource intelligent retrieval, and the application of the knowledge graph for automated geoscientific modeling and computation. These foundational applications will not only demonstrate the practical value of the DDE GKG but also validate its role in advancing both basic and applied geoscience. Additionally, by showcasing the graph’s capacity to drive advancements in solving complex geoscientific problems, these applications will play a pivotal role in further developing the global knowledge graph and ensuring its continued relevance and expansion.
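The objective, quantitative criteria called for under challenge (2) could take the form of simple computable indicators. The sketch below shows two such indicators, attribute completeness and the share of relations that point to unknown entities. Both metrics, their names, and the sample data are assumptions chosen for illustration; they are not an established DDE quality framework.

```python
def completeness(entities, required):
    """Fraction of required attributes that are filled across all entities."""
    total = len(entities) * len(required)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for props in entities.values()
        for key in required
        if props.get(key) not in (None, "")
    )
    return filled / total


def dangling_rate(relations, entities):
    """Fraction of relations whose subject or object is not a known entity."""
    if not relations:
        return 0.0
    dangling = sum(
        1 for s, _, o in relations if s not in entities or o not in entities
    )
    return dangling / len(relations)


# Hypothetical sample data.
ENTITIES = {
    "fm:1": {"name": "Formation A", "age": "Jurassic"},
    "fm:2": {"name": "Formation B", "age": None},
}
RELATIONS = [
    ("fm:1", "overlies", "fm:2"),
    ("fm:1", "overlies", "fm:3"),  # fm:3 is not defined above
]

if __name__ == "__main__":
    print("completeness:", completeness(ENTITIES, ("name", "age")))
    print("dangling rate:", dangling_rate(RELATIONS, ENTITIES))
```

Indicators of this kind can be recomputed after every update, which would complement rather than replace expert review.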

6 Conclusion

The DDE GKG represents a major step in the integration, mining, and analysis of global geoscientific data. It is crucial for linking diverse datasets, scientific literature, models, and computational resources, thus facilitating knowledge inference, evolution, and discovery across the geoscience community. As the cornerstone of the DDE Big Science Program, the DDE GKG has made significant strides in advancing interdisciplinary geoscientific research and knowledge sharing. After nearly four years of development, the DDE GKG has built a multi-tiered knowledge graph system, encompassing basic geoscience knowledge graphs, discipline-specific knowledge graphs, and application-oriented knowledge graphs. Spanning 20 geoscience disciplines, it has laid a robust foundation for future developments and applications. Furthermore, the project has implemented a flexible and scalable knowledge framework, offering tools that enhance the creation, management, and dissemination of geoscientific knowledge.
Looking ahead, the DDE GKG aims to continue expanding its reach, fostering global collaboration and knowledge sharing among geoscientists worldwide. By further refining its methodology and extending its network, DDE will drive new research avenues and applications, contributing to a more integrated and sustainable geoscientific ecosystem. The project is poised to play a key role in shaping the future of geoscientific research and knowledge management, positioning itself as a central resource for advancing scientific discovery and innovation.

Acknowledgements

We would like to extend our thanks to the principal investigators of DDE for their guidance and valuable comments, and all participants of DDE Knowledge Graph Working Group for their hard work.

Conflicts of Interest

All authors declare that no conflicts of interest exist.

References

[1]
Arvor D, Belgiu M, Falomir Z et al., 2019. Ontologies to interpret remote sensing images: Why do we need them? GIScience & Remote Sensing, 56(6): 911-939.

[2]
Belgiu M, Tomljenovic I, Lampoltshammer T J et al., 2014. Ontology-based classification of building types detected from airborne laser scanning data. Remote Sensing, 6(2): 1347-1366.

[3]
Chen Q, Yao H, Li S et al., 2023. Fact-condition statements and super relation extraction for geothermic knowledge graphs construction. Geoscience Frontiers, 14(5): 101412.

[4]
DDE, 2021a. Geoscience Knowledge Graph Collaborative Editor. Retrieved from https://editor.deep-time.org/#/.

[5]
DDE, 2021b. Geoscience Knowledge Tree Auto-Renew and Completion System. Retrieved from https://labeling.acemap.cn/#/.

[6]
DDE, 2021c. Knowledge Hub. Retrieved from https://deep-time.org/#/home-knowledge/hub.

[7]
DDE, 2021d. Multimodal Geoscience Academic Knowledge Graph. Retrieved from https://gakg.deep-time.org/#/svirql.

[8]
DDE, 2022. One-stop for GeoScience Knowledge Graph. Retrieved from https://geoopenkg.deep-time.org/.

[9]
DDE, 2023a. DDE WIKI. Retrieved from https://wiki.deep-time.org/wiki/Main_Page.

[10]
DDE, 2023b. Deep Literature. Retrieved from https://ddescholar.acemap.info/.

[11]
Deng C, Jia Y, Xu H et al., 2021. GAKG: A Multimodal Geoscience Academic Knowledge Graph. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Virtual Event, Queensland, Australia: Association for Computing Machinery.

[12]
Hazen R M, Downs R T, Eleish A et al., 2019. Data-driven discovery in mineralogy: Recent advances in data resources, analysis, and visualization. Engineering, 5(3): 397-405.

[13]
Hou Z, Zhu Y, Gao X et al., 2015. Time-ontology and its application in geodata retrieval. Journal of Geo-information Science, 17(4): 379-390. (in Chinese)

[14]
IUGS, 2019. Deep-Time Digital Earth (DDE). Retrieved from https://www.iugs.org/dde.

[15]
Khider D, Emile-Geay J, McKay N P et al., 2019. PaCTS 1.0: A crowdsourced reporting standard for paleoclimate data. Paleoceanography and Paleoclimatology, 34(10): 1570-1596.

[16]
Laxton J, 2017. Geological map fusion: OneGeology-Europe and INSPIRE. Geological Society of London Special Publications, 408(1): 147-160.

[17]
Li W, Zhu Y, Song J et al., 2017. Geospatial data provenance-ontology and its application in data linking. Journal of Geo-information Science, 19(10): 1261-1269. (in Chinese)

[18]
Lin H, You L, Hu C et al., 2018. Prospect of geo-knowledge engineering in the era of spatio-temporal big data. Geomatics and Information Science of Wuhan University, 43(12): 2205-2211. (in Chinese)

[19]
Lozano M G, Schreiber J, Brynielsson J, 2017. Tracking geographical locations using a geo-aware topic model for analyzing social media data. Decision Support Systems, 99: 18-29.

[20]
Mai G, Janowicz K, Cai L et al., 2020. SE-KGE: A location-aware knowledge graph embedding model for geographic question answering and spatial semantic lifting. Transactions in GIS, 24(3): 623-655.

[21]
Mantovani A, Piana F, Lombardo V, 2020. Ontology-driven representation of knowledge for geological maps. Computers & Geosciences, 139: 104446.

[22]
Mendes P N, Jakob M, Bizer C, 2012. DBpedia: A multilingual cross-domain knowledge base. In: International Conference on Language Resources and Evaluation.

[23]
Normile D, 2019. Earth scientists plan a ‘geological Google’. Science, 363(6430): 917.

[24]
Noy N F, McGuinness D L, 2001. Ontology development 101: A guide to creating your first ontology. In: Stanford knowledge systems laboratory technical report KSL-01-05 and ….

[25]
Pan S, Luo L, Wang Y et al., 2024. Unifying large language models and knowledge graphs: A roadmap. IEEE Transactions on Knowledge and Data Engineering, 36(7): 3580-3599.

[26]
Peters S, Ross I, Czaplewski J et al., 2017. A new tool for deep-down data mining. Eos, 98.

[27]
Qiu P, Gao J, Yu L et al., 2019. Knowledge embedding with geospatial distance restriction for geographic knowledge graph completion. ISPRS International Journal of Geo-Information, 8(6): 254.

[28]
Shi S, Lyu H, Dong S et al., 2020. An editing platform of geoscience knowledge system. Geological Journal of China Universities, 26(4): 384-394. (in Chinese)

[29]
Simons B, Boisvert E, Brodaric B et al., 2006. GeoSciML: Enabling the exchange of geological map data. ASEG Extended Abstracts, 2006(1): 1-4.

[30]
Stall S, Yarmey L, Cutcher-Gershenfeld J et al., 2019. Make scientific data FAIR. Nature, 570(7759): 27-29.

[31]
Sun K, Zhu Y, Pan P et al., 2016. Research on morphology-ontology and its application in geospatial data discovery. Journal of Geo-information Science, 18(8): 1011-1021. (in Chinese)

[32]
Sun K, Zhu Y, Song J, 2019. Progress and challenges on entity alignment of geographic knowledge bases. ISPRS International Journal of Geo-Information, 8(2): 77.

[33]
Tang X, Feng Z, Xiao Y et al., 2023. Construction and application of an ontology-based domain-specific knowledge graph for petroleum exploration and development. Geoscience Frontiers, 14(5): 101426.

[34]
Tanon T P, Weikum G, Suchanek F, 2020. YAGO 4: A Reason-able Knowledge Base. The Semantic Web: 17th International Conference, ESWC 2020, Heraklion, Crete, Greece, May 31-June 4, 2020, Proceedings, Heraklion, Crete, Greece: Springer-Verlag.

[35]
Trisedya B D, Qi J, Zhang R, 2019. Entity alignment between knowledge graphs using attribute embeddings. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence. Honolulu, Hawaii, USA: AAAI Press.

[36]
Vrandečić D, Krötzsch M, 2014. Wikidata: A free collaborative knowledgebase. Communications of the ACM, 57(10): 78-85.

[37]
Wang C, Hazen R M, Cheng Q et al., 2021. The Deep-Time Digital Earth program: Data-driven discovery in geosciences. National Science Review, 8(9): nwab027.

[38]
Wang D, Zhu Y, Pan P et al., 2016. Construction of geodata spatial ontology and its application in data retrieval. Journal of Geo-information Science, 18(4): 443-452. (in Chinese)

[39]
Wang H, Zhong H, Chen A et al., 2023a. A knowledge graph for standard carbonate microfacies and its application in the automatical reconstruction of the relative sea-level curve. Geoscience Frontiers, 14(5): 101535.

[40]
Wang J, Hu Y, Joseph K, 2020. NeuroTPR: A neuro-net toponym recognition model for extracting locations from social media messages. Transactions in GIS, 24(3): 719-735.

[41]
Wang S, Zhang X, Ye P et al., 2019. Geographic Knowledge Graph (GeoKG): A formalized geographic knowledge representation. ISPRS International Journal of Geo-Information, 8(4): 184.

[42]
Wang S, Zhu Y, Qi Y et al., 2023b. A unified framework of temporal information expression in geosciences knowledge system. Geoscience Frontiers, 14(5): 101465.

[43]
Wang W, Stewart K, 2015. Spatiotemporal and semantic information extraction from Web news reports about natural hazards. Computers, Environment and Urban Systems, 50: 30-40.

[44]
Xi J, Wu J, Wu M, 2023. Design and construction of lightweight domain ontology of tectonic geomorphology. Journal of Earth Science, 34(5): 1350-1357.

[45]
Xu Y, Hu X, Han Z, 2023. Carbonate ontology and its application for integrating microfacies data. Journal of Earth Science, 34(5): 1328-1338.

[46]
Yang J, Cao X, Yao J et al., 2024. Geographical big data and data mining: A new opportunity for “water-energy-food” nexus analysis. Journal of Geographical Sciences, 34(2): 203-228.

[47]
Yu C, Zhang L, Hou M et al., 2023. Climate paleogeography knowledge graph and deep time paleoclimate classifications. Geoscience Frontiers, 14(5): 101450.

[48]
Yu L, Qiu P, Liu X et al., 2018. A holistic approach to aligning geospatial data with multidimensional similarity measuring. International Journal of Digital Earth, 11(8): 845-862.

[49]
Zhang C, Govindaraju V, Borchardt J et al., 2013. GeoDeepDive: Statistical inference using familiar data-processing languages. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, New York, New York, USA: Association for Computing Machinery.

[50]
Zhang H, Tang G, Xiong L et al., 2024. Geomorphology-oriented theoretical framework and construction method for value-added DEM. Journal of Geographical Sciences, 34(1): 165-184.

[51]
Zhang X, Huang Y, Zhang C et al., 2022. Geoscience Knowledge Graph (GeoKG): Development, construction and challenges. Transactions in GIS, 26(6): 2480-2494.

[52]
Zheng K, Xie M H, Zhang J B et al., 2022. A knowledge representation model based on the geographic spatiotemporal process. International Journal of Geographical Information Science, 36(4): 674-691.

[53]
Zhu Y, Dai X, Yang J et al., 2023a. One-stop sharing and service system for geoscience knowledge graph. Geological Journal of China Universities, 29(3): 325-336. (in Chinese)

[54]
Zhu Y, Sun K, Hu X et al., 2022. Research and practice on the framework for the construction, sharing and application of large-scale geoscience knowledge graphs. Journal of Geo-information Science, 25(6): 1215-1227. (in Chinese)

[55]
Zhu Y, Sun K, Wang S et al., 2023b. An adaptive representation model for geoscience knowledge graphs considering complex spatiotemporal features and relationships. Science China Earth Sciences, 66(11): 2563-2578.
