Preprint
Article

Recognition of Derived Semantic Relationships in Geographic Entity Generic Names

This version is not peer-reviewed

Submitted:

19 June 2024

Posted:

26 June 2024

Abstract
Determining the general-name-derived semantic relationships between geographic names often requires consulting many geographic name documents for verification. In geographic name translation and geographic name knowledge graphs, this manual approach cannot meet the need for large-scale extraction of general-name-derived semantic relationships between geographic entities. To solve this problem, this paper proposes a general-name derivation relation recognition method based on prompt learning and Semantic Spatiotemporal Distribution Attention Feature Fusion (SSD-AFF). The method adopts a pipeline-based relationship recognition strategy. First, prompt learning is used to form a general-name derivation pattern, with the general-name-derived geographical name category and the original geographical name category as the prompt words; the derivation pattern is then combined with the geographical name as the model input, so that recognizing general-name-derived geographical names becomes a classification problem solved with a bidirectional language model. Then, by analyzing the semantic association and the geographical spatiotemporal distribution between the general-name-derived geographical name and the original geographical name, the semantic spatiotemporal distribution features of the general-name derivation relationship are constructed. Finally, a network model based on SSD-AFF is used to identify the general-name-derived semantic relationships. The experimental results show that the proposed method performs well. It can effectively mine general-name-derived semantic relations from the vector data of geographical entities and has important application value in geographical name translation and geographical name knowledge graphs.
Keywords: 
Subject: Computer Science and Mathematics  -   Artificial Intelligence and Machine Learning

Introduction

The general-name derivation relationship refers to the semantic relationship of geographical name derivation between geographic entities, such as that between the natural geographic entity "Peoria Lake" and the transportation geographic entity "East Lakeview Drive". General name derivation is a common naming method for geographic entities. When people name a new geographic entity, in order to express its geographic association with nearby geographic entities, they usually take the general name from the geographical name of a nearby entity, or a derivative of that general name, as the general-name-derived part of the new entity’s name. For example, in "East Lakeview Drive", "Lakeview" is a derivative of "Lake". The name of the new geographic entity is then called a general-name-derived geographical name, and the geographical name of the nearby geographic entity is called the original geographical name. General name derivation is mainly divided into direct derivation and indirect derivation: 1) Direct derivation. The general name of the original geographical name is used directly as the general-name-derived part of the derived geographical name. For example, in the general-name-derived geographical name "Goodman Lake Road", the derived general name "Lake" comes directly from the general name "Lake" in the original geographical name "High Rock Lake". 2) Indirect derivation. A derivative of the original geographical name’s general name is used as the general-name-derived part of the derived geographical name. For example, in the general-name-derived geographical name "North Riverview Drive", the general-name-derived part is "Riverview"; the original geographical name "Haw River" has the general name "River", and "Riverview" is a derivative of "River".
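The two definitions can be illustrated with a toy string check. This is only a sketch of the distinction between direct and indirect derivation, not the recognition method proposed in this paper; the function name `classify_derivation` and the prefix test for derivatives are assumptions made for the illustration.

```python
def classify_derivation(derived_name: str, original_generic: str) -> str:
    """Label how a derived name reuses the general name of an original name.

    Returns "direct" when the general name appears as a whole word,
    "indirect" when a longer word starts with it, and "none" otherwise.
    The prefix test is a simplification chosen for this illustration.
    """
    generic = original_generic.lower()
    words = derived_name.lower().split()
    if generic in words:
        return "direct"
    if any(w.startswith(generic) and len(w) > len(generic) for w in words):
        return "indirect"
    return "none"


# Examples taken from the text above.
print(classify_derivation("Goodman Lake Road", "Lake"))
print(classify_derivation("North Riverview Drive", "River"))
```

A check like this says nothing about whether the two entities are actually related, which is why the semantic and spatiotemporal features are needed.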
The recognition of general name derivation relationships in geographic entities mainly involves the field of geographic entity relationship recognition and derived geographical names. Many researchers have conducted substantial research work on this topic.
In the field of geographic entity relationship recognition, many researchers have focused on capturing semantic relationships between geographic entities from spatial textual data and remote sensing image data. In the field of remote sensing imagery, Cui et al. (2019) [1] proposed the MSRIN (Multi-scale Remote Sensing Image Interpretation Network) model to identify spatial relationships between entity objects in comprehensive remote sensing image interpretation. The model relies on a parallel deep neural network composed of a Fully Convolutional Network (FCN), U-Net, and Long Short-Term Memory (LSTM) network to achieve end-to-end recognition of remote sensing objects and spatial semantic relationships. Stuti et al. (2023) [2] utilized the YOLO5 model for the recognition of geospatial entities and proposed a method to calculate directional relationships based on the centroids and overall directions of spatial entities, thus enabling the identification of geospatial directional relationships in high spatial resolution Remote Sensing (RS) images. Jie et al. (2019) [3] presented a method to express topological relationships, orientation relationships, and distance relationships between entity objects in high-resolution remote sensing images. They utilized a neural network model based on an attention mechanism to identify geospatial relationships in high-resolution remote sensing images. In the domain of spatial textual data, Meng et al. (2023) [4] proposed a method for joint extraction of relationships in geospatial textual data. Using the ERNIE pre-trained language model and the BAB module to improve the CasRel model, they enhanced the extraction effectiveness of geospatial relationships. Li et al. (2016) [5] adopted the bootstrapping technique, constructing features such as part-of-speech, position, and distance of words to measure their weights in the text, thereby extracting keywords that express relationships between geographic entities and improving the extraction effectiveness of unknown relationship types in geographic entity relationships.
In the field of derived geographical names, researchers have primarily focused on the origins of derivational words, the characteristics of geographical names, and methods for the extraction of geographical names. In research on derivational words in geographical names, Joan et al. (2021) [6] conducted a study on protohistoric geographical names in Valencia, Spain. The research indicated that protohistoric town names in Valencia primarily derive from natural features of the landscape, while historic town names originate from cultural features of the landscape. Maixner et al. (2020) [7] investigated the meaning of the derivational word *Sæheimr, suggesting that it referred to peripheral landing points of central place complexes established during the Roman Iron Age and Migration Period, rather than a location on the sea, fjord, or lake. In research on the characteristics of geographical names, Hobo et al. (2016) [8] summarized and analyzed the characteristics of derived geographical names and provided corresponding suggestions on the translation of various derived geographical names. In research on extraction methods for derived geographical names, Liu et al. (2022) [9] analyzed the spatial and semantic characteristics of fully derived and general-name-derived geographical names. They employed spatial statistics and rule-based methods to extract fully derived and general-name-derived geographical names from geographical name point data. Liu et al. (2022) [10] used spatial statistical methods to construct a knowledge graph of geographical name derivation. This knowledge graph was then utilized to identify English derived geographical names.
The identification of the derivational relationships of the general names of geographical entities is important in the fields of toponym translation and toponym knowledge graphs. In toponym translation, identifying these derivational relationships aids in the accurate translation of derived general names. In toponym knowledge graphs, the derivational relationships can be used to trace the origins of geographical names. Currently, in the field of geographical entity relationship recognition, researchers mainly focus on extracting spatial semantic relationships of geographical entities from remote sensing images and spatial text data, with relatively few studies exploring the semantic relationships of geographical entities from vector data. In the area of derived toponym research, scholars primarily examine the causes, types, and characteristics of derived toponyms and often employ rule-based methods to identify derivational relationships of general names; these methods have limited generalization capabilities and cannot be widely applied to large-scale extraction of derivational relationships between geographical entities. In response, this paper integrates the spatiotemporal distribution and semantic features of the derivational relationships of general names and adopts deep learning methods to recognize the general-name-derived semantic relationships of geographical entities.

1. Materials and Methods

This paper employs a pipeline strategy, first identifying general-name-derived toponyms and then recognizing the general name derivational relationships of geographical entities. Hence, the identification of general name derivational relationships of geographical entities can be roughly divided into two parts: the identification of general-name-derived toponyms and the identification of the derivational relationships of general names.

1.1. Identification of General-Name-Derived Toponyms

In terms of toponym composition, compared to regular toponyms, general-name-derived toponyms include not only their own general name but also a general name or general name derivative that belongs to another category of geographical entity. These general names or general name derivatives do not share a semantic hierarchical relationship with the category of the derived toponym but do have a semantic relationship with the category of the original toponym. For instance, for the general-name-derived toponym "Lake Shore Road South," the geographical entity category is "Residential Road," while the original toponym "Lake Norman" falls under the category "Water." The derived general name "Lake" does not have a semantic relationship with "Residential Road" but does with "Water." Therefore, compared to regular toponyms, the general-name-derived part of a toponym has a weaker semantic association with the toponym’s own category but a stronger semantic association with the original toponym’s category. Furthermore, compared to the proper noun part of a toponym, the derived general name part has a stronger co-occurrence relationship with both the category of the toponym and its general name (as illustrated in Figure 1).
Based on the characteristics of the derived general name part, this paper transforms the identification of derived toponyms into a sentence classification problem and proposes a method for identifying derived toponyms based on prompt learning and bidirectional language models. This method uses the general-name derivation pattern, composed of both the derived toponym category and the original toponym category, as the prompt (as shown in Table 1). Using the bidirectional language model $P(C \mid O; D; P; \theta)$ (where $D$ represents the derived toponym category, $O$ the original toponym category, $P$ the sequence of toponym words, $C$ the sentence vector, and $\theta$ the model parameters), the semantic relationships among the original toponym category, derived toponym category, derived general name part, and toponym are learned, along with the embedded representation of the sentence vector $C$. This allows the discriminator to predict the probability that a given toponym is a derived toponym (as illustrated in Figure 2), where the sentence vector $C$ is the feature vector corresponding to the special token ’[CLS]’. Unlike traditional sentence classification methods based on pre-trained language models, combining prompt learning largely preserves the original knowledge of the pre-trained language model and shows superior performance in sentence embedding [11,12,13].
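As a minimal sketch of how a derivation pattern prompt might be combined with a toponym, the snippet below builds the text that would be passed to a tokenizer. The exact wording of the prompts in Table 1 is not reproduced here, so the template string and the function name `build_prompt` are assumptions; a BERT-style tokenizer would normally add the ’[CLS]’ and ’[SEP]’ tokens itself.

```python
def build_prompt(toponym: str, derived_category: str, original_category: str) -> str:
    """Combine a derivation pattern (two categories) with a toponym.

    The template is hypothetical; it only mirrors the idea that the
    derived category and the original category act as the prompt.
    """
    pattern = (
        f"the category of the derived place name is {derived_category}. "
        f"the category of the original place name is {original_category}."
    )
    return f"{pattern} the place name is {toponym}."


# Example categories and toponym taken from the text above.
text = build_prompt("Lake Shore Road South", "Residential Road", "Water")
print(text)
```

The resulting string is what a sentence classifier would score, using the ’[CLS]’ feature vector as the sentence representation.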

1.2. Identification of the Derivational Relationships of General Name

After identifying the general-name-derived geographical names in the geographical name database, it is necessary to find the corresponding original geographical name entities and determine whether there is a general name derivation relationship between the two geographical entities. This paper adopts a supervised learning approach, transforming the determination of general name derivation relationships into a binary classification problem. From the perspective of geographic spatial distribution, general-name-derived geographical name entities are usually distributed within the adjacent area of the original geographical name entities. Therefore, this paper proposes a method for identifying general name derivation relationships based on deep learning. The general idea is as follows. First, the derivation distance of each type of original geographical name entity is estimated using the sample estimation method [10]. Then, the general-name-derived geographical name entities within the adjacent area are retrieved from the geographic entity vector database. Next, the semantic relationship features and geographic spatiotemporal distribution features between the general-name-derived geographical name entities and the neighboring geographical name entities are constructed. Finally, a feature fusion module predicts the probability of a general name derivation semantic relationship between the two geographical entities (as shown in Figure 3).
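The retrieval step can be sketched as a distance filter. The sketch below assumes each entity is reduced to a single point with latitude and longitude, uses great-circle (haversine) distance, and takes the per-category derivation distance as given; the entity records, coordinates, and the 2000 m radius are made-up values for illustration, and real vector data would use line and polygon geometries with a spatial index.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidate_pairs(originals, derived, derivation_distance_m):
    """Pair each original entity with derived entities inside its derivation distance."""
    pairs = []
    for o in originals:
        radius = derivation_distance_m[o["category"]]
        for d in derived:
            if haversine_m(o["lat"], o["lon"], d["lat"], d["lon"]) <= radius:
                pairs.append((o["name"], d["name"]))
    return pairs


# Hypothetical records: one lake, one nearby road, one distant street.
originals = [{"name": "Example Lake", "category": "water", "lat": 35.0, "lon": -80.0}]
derived = [
    {"name": "Example Lake Road", "lat": 35.01, "lon": -80.0},
    {"name": "Far Street", "lat": 35.1, "lon": -80.0},
]
print(candidate_pairs(originals, derived, {"water": 2000.0}))
```

Each retained pair is then a candidate for the binary classifier; pairs outside the derivation distance are never scored.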
This paper analyzes the characteristics of general name derivation relationships in terms of geographical space, time, derivation patterns, and the semantic associations of geographical names, and constructs features for general name derivation relationships accordingly. Based on these semantic features and spatiotemporal distribution features, a neural network model for the general name derivation relationships of geographical entities is designed, consisting of a semantic feature extraction layer, a spatiotemporal distribution feature extraction layer, and a feature fusion layer.

1.2.1. Construct Feature Engineering

A place name is a proper name assigned to a geographical entity by people. Place names possess the three basic characteristics of general geographical phenomena: space, attributes, and time [14]. In this regard, this paper analyzes and constructs the generic derivational semantic relationship characteristics between geographical entities from the aspects of geographical spatial topological characteristics, geographical spatial metric characteristics, place name lifespan characteristics, derivational pattern characteristics, and place name semantic characteristics.
1) Spatial topological features: The general name derivative in a general-name-derived geographical name usually contains the spatial topological relationship between the derived place name entity and the original place name entity [10]. For example, in "Riverside Memorial Park", the derived part "Riverside" indicates the area along either side of a river. People can infer from the name "Riverside" that this geographic entity has a spatial topological adjacency relationship with a river-related geographic entity (as shown in Figure 4). Accordingly, this paper chooses to describe the spatial topological relationship between the derived place name entity and the original place name entity, using the nine-intersection model to define these spatial topological relationships.
2) Spatial metric relationship features: General-name-derived place name entities and original place name entities have a proximity relationship. Similar to ordinary human geographical phenomena, the toponymic attributes of geographical entities also exhibit spatial proximity effects [15]. Adjacent place names usually have higher semantic similarity [16]. In terms of spatial distribution, original toponymic entities are typically located within the vicinity of generic-name-derived toponymic entities. In daily life, people can infer that a place is near a certain type of geographical entity from the derived generic name or generic name derivative in a generic-name-derived place name. For example, in Washington County, USA, the generic-name-derived place name "River Walk Loop Trail" indicates that there is a river near this geographical entity, as suggested by the "River" in the derived generic name (as shown in Figure 5). Accordingly, this paper uses a relative nearest neighbor coding of 0 and 1 to describe the proximity between the general-name-derived place name entity and the original place name entity, where 1 indicates that the place name entity is the nearest among place name entities of the same category, and 0 indicates that it is not the nearest.
3) Place name lifespan features: In the temporal dimension, the lifespan interval of a general-name-derived place name is contained within the lifespan of the original place name. Geographical entities have temporal characteristics [17], and their toponymic attributes also exhibit temporal characteristics. The creation and disappearance of place names constitute the lifespan of general-name-derived place names [18]. During the naming phase, a general-name-derived place name, as a name derived from the original place name, has a naming time that is later than or equal to that of the original place name. During the decommissioning phase, to maintain the association between the original place name and the general-name-derived place name, in the process of place name management, when the spatial location status and name of the original place name entity change, the general-name-derived place name will enter a decommission or change status. Accordingly, this paper describes the features by selecting the positive or negative difference in the start time and the end time of the lifespan between the general-name-derived place name and the original place name.
4) Derivation pattern features: There is an asymmetric spatial dependency relationship between generic-name-derived place name entities and original place name entities. Semantically, place names usually contain the geographical environment characteristics of the geographical entities they are associated with [19], that is, the category information of neighboring geographical entities. Generic-name-derived place names also contain the category characteristics of the original place names. From the perspective of geographic spatiotemporal dependency, in the phenomenon of generic name derivation between place names, the generic-name-derived entity typically appears in conjunction with the original place name entity, for example a river and a riverside park, a lake and a lakeside hotel, or a park and a park boulevard. Based on this, this paper selects the categories of the generic-name-derived entity and the original place name entity as features and uses a neural network model to learn the spatial dependency relationship between the two.
5) Semantic association features: The derived generic name or the generic name derivative in a generic-name-derived place name typically originates from the generic name of the original place name, and the two have a certain level of semantic similarity. As a toponymic attribute of geographical entities, the generic name derivation relationship between geographical entities also conforms to the First Law of Geography [20]: nearby geographical entities are more likely to have semantic associations between their place names. For example, in "Mather Memorial Parkway," the derived generic name "Parkway" is semantically similar to the generic name "Park" in its original place name "Mount Rainier National Park." Additionally, within a place name there is a semantic inclusion relationship between the category of the place name and the generic name of the place name [21]; in particular, this holds between the generic name of the original place name and the category of the original place name. This paper selects the place name sequences of the generic-name-derived entity and the original place name entity as the semantic features for the generic name derivation relationship of geographical entities.
To sum up, the characteristics of the general-name derivation relationship between the general-name-derived geographical name entity and the original geographical name entity mainly include discrete numerical characteristics and continuous text vector characteristics, among which the discrete numerical characteristics mainly include spatial topological characteristics, spatial metric characteristics, and geographical name lifespan characteristics. The features of continuous text vectors mainly include derived pattern features and geographical name semantic association features (as shown in the following Table 2).
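Two of the discrete features above, the relative nearest neighbor coding and the lifespan difference signs, can be sketched in a few lines. The input layout (tuples of name, category, distance) and the use of years for lifespans are assumptions for this illustration; the paper does not specify the data schema.

```python
def nearest_flags(candidates):
    """Relative nearest neighbor coding.

    candidates: list of (name, category, distance_m) for the neighbors of one
    derived entity. Returns {name: 1 if nearest within its category else 0}.
    Ties are resolved in favor of the first candidate listed.
    """
    best = {}
    for name, category, dist in candidates:
        if category not in best or dist < best[category][1]:
            best[category] = (name, dist)
    return {name: int(best[category][0] == name) for name, category, _ in candidates}


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def lifespan_signs(derived_span, original_span):
    """Signs of (start, end) differences between derived and original lifespans."""
    return (
        sign(derived_span[0] - original_span[0]),
        sign(derived_span[1] - original_span[1]),
    )


# Hypothetical neighbors of one derived entity and hypothetical lifespans in years.
neighbors = [("A", "water", 120.0), ("B", "water", 40.0), ("C", "park", 300.0)]
print(nearest_flags(neighbors))
print(lifespan_signs((1990, 2020), (1950, 2020)))
```

A start sign of 1 means the derived name was created after the original name, which is consistent with the lifespan containment described in feature 3.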

1.2.2. Construct Model

To integrate the discrete numerical features and continuous text vector features constructed between the generic-name-derived geographical names and neighboring geographic entities, this paper designs three modules: semantic feature extraction layer, spatiotemporal distribution feature extraction layer, and semantic spatiotemporal distribution feature fusion layer.
1) Semantic feature extraction layer. This layer jointly captures the semantic association relations between the generic name derivative and the neighboring toponym’s generic name and category, and the asymmetric co-occurrence dependency relations between the generic-name-derived toponym category and the neighboring toponym category. This paper chooses a bidirectional language model for capturing the semantic relations of generic name derivation, and splices the generic-name-derived toponym, the generic-name-derived toponym category, the neighboring toponym, and the neighboring toponym category into two pieces of text as the model input of the semantic feature layer. The generic-name-derived toponym category and the neighboring toponym category form one sentence, which carries the derivation pattern information; the generic-name-derived toponym and the neighboring toponym form another sentence, which carries the semantic association information. The two sentences are separated by a special separator. For example, “[CLS] the category of the general name derived place name is a Residential Road. the category of the original place name is Water. [SEP] the general name derived place name is Lakeview Shores Loop. the original place name is Lake Norman.[SEP]". The two sentences are distinguished using segment codes 0 and 1, respectively.
2) Spatial and temporal distribution characteristic extraction layer. The spatiotemporal distribution characteristics in the derivation relationship of generic names mainly include discrete numerical features such as spatial topological relationship characteristics, spatial metric characteristics, and geographical name lifetime characteristics. For spatial topological relationship characteristics, this paper uses one-hot encoding to represent the spatial topological relationship sequence between the derived geographical name entity and the neighboring geographical name entity as a one-dimensional spatial topological feature vector. For spatial metric characteristics, this paper first clusters the neighboring geographical name entities within the vicinity of the derived geographical name entity according to geographical name categories, then sorts them by their distance to the derived geographical name entity. The nearest is encoded as 1 in the same kind of geographical name entity, and the other geographical name entities are encoded as 0, expressed as a one-dimensional spatial metric feature vector using one-hot encoding. For geographical name lifetime characteristics, this paper calculates the difference in the start and end times of the geographical name lifetime between the derived geographical name entity and the original geographical name entity and uses one-hot encoding to encode the signs of these differences, thus forming a one-dimensional geographical name lifetime feature vector. Finally, this paper concatenates the spatial topological feature vector, the spatial metric feature vector, and the geographical name lifetime feature vector into a one-dimensional spatiotemporal distribution feature vector.
3) Semantic spatiotemporal distribution feature fusion layer. To integrate the two-dimensional semantic feature vectors and the one-dimensional spatiotemporal distribution vectors of the general-name-derived semantic relation of geographical entities, a Semantic Spatiotemporal Distribution Attention Feature Fusion (SSD-AFF) model is proposed in this paper (as shown in Figure 6). The SSD-AFF model first extracts the local general-name-derived semantic features using two one-dimensional convolution modules and extracts the global one-dimensional general-name-derived semantic feature vector using a convolution kernel pooling layer. Then, the general-name-derived semantic features and the spatiotemporal distribution features are combined into the general-name-derived relation feature vector, and the attention weight matrix of the general-name-derived semantic feature vector and the spatiotemporal distribution feature vector is generated using the MSCAM (Multi-Scale Channel Attention Module) [22] feature fusion module. The attentional feature fusion method of AFF (Attentional Feature Fusion) [22] is then used to obtain the fused general-name-derived relation feature vector. Finally, a layer normalization module and a linear classifier module map the general-name-derived relation feature vector into a label vector. The AFF framework has shown good feature fusion performance [23,24] and is widely applicable.
In Figure 6, DS represents the two-dimensional feature vector matrix of general-name-derived semantics obtained by the semantic feature extraction layer, S represents the one-dimensional feature vector of general-name-derived semantics, and T represents the one-dimensional feature vector of the spatiotemporal distribution of general-name-derived semantics.
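The inputs to the first two layers can be sketched as follows. The sentence templates follow the example given for the semantic feature extraction layer; the topological label set `TOPOLOGY_CLASSES` is an assumed, reduced list (the paper derives its relations from the nine-intersection model but does not enumerate them here), and the function names are hypothetical.

```python
# Assumed label set for illustration; not the paper's enumeration.
TOPOLOGY_CLASSES = ["disjoint", "touches", "overlaps", "contains", "within"]


def one_hot(index, size):
    """Return a list of zeros of the given size with a 1 at position index."""
    return [1 if i == index else 0 for i in range(size)]


def build_sentence_pair(derived_name, derived_cat, original_name, original_cat):
    """Return (derivation pattern sentence, semantic association sentence)."""
    pattern = (
        f"the category of the general name derived place name is {derived_cat}. "
        f"the category of the original place name is {original_cat}."
    )
    association = (
        f"the general name derived place name is {derived_name}. "
        f"the original place name is {original_name}."
    )
    return pattern, association


def spatiotemporal_vector(topology, nearest_flag, start_sign, end_sign):
    """Concatenate one-hot codes for topology, nearest flag, and lifespan signs."""
    topo = one_hot(TOPOLOGY_CLASSES.index(topology), len(TOPOLOGY_CLASSES))
    near = one_hot(nearest_flag, 2)      # 0 = not nearest, 1 = nearest
    start = one_hot(start_sign + 1, 3)   # signs -1, 0, 1 map to indices 0, 1, 2
    end = one_hot(end_sign + 1, 3)
    return topo + near + start + end


pattern, association = build_sentence_pair(
    "Lakeview Shores Loop", "Residential Road", "Lake Norman", "Water"
)
print(pattern)
print(association)
print(spatiotemporal_vector("touches", 1, 1, 0))
```

A tokenizer that accepts a text pair would typically insert the ’[CLS]’ and ’[SEP]’ tokens and produce the segment codes 0 and 1 for the two sentences.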

2. Experiment and Analysis

The experimental data for this study is sourced from the Geofabrik website. The downloaded map vector data includes 33,874 vector data items from states within the United States such as Illinois, North Carolina, and Washington. This data comprises 13,267 geographical entities and 206 geographical entity categories, mainly including categories such as park, river, lake, school, residential road, water, building, reservoir, parking, and nature reserve. This study primarily investigates the recognition performance of geographical entity generic name derivation relationships under three derivation patterns: river->*, park->*, and lake->*, where * denotes other geographical name categories. The experiments are divided into two parts: generic-name-derived geographical name recognition experiments and geographical entity generic name derivation relationship recognition experiments. Both experimental environments are set up using Python 3.10.10 and PaddlePaddle 2.6.1. For both experiments, the Ernie model adopts the ’ernie-2.0-base-en’ pre-trained model weights, and the Bert model uses the ’bert-base-uncased’ pre-trained model weights.

2.1. Generic Name Derivation Place Name Recognition Experiments

The generic-name-derived geographical name recognition experiment focuses on the recognition method based on ’Prompt+BPLM’ (bidirectional pre-trained language model). To compare the effects of different toponym category information on generic-name-derived geographical name recognition, in addition to the generic name derivation pattern prompt (Prompt2), this paper also designed two other prompts: Prompt0 and Prompt1. Prompt0 does not add any place name category information, and Prompt1 only uses the generic-name-derived place name category as the prompt. In this experiment, the prompts (Prompt0, Prompt1, and Prompt2) were combined with bidirectional language models (Ernie and Bert) to form the experimental groups, in order to study the performance of each prompt and pre-trained language model in the generic-name-derived place name recognition task; Prompt0 was the benchmark experimental group. The experimental metrics used were accuracy, precision, recall, and F1-score (as shown in Table 3). The model training parameters for the experiment were as follows: the maximum sequence length for the tokenizer encoder was set to 60, epochs to 10, batch size to 128, learning rate to 5e-8, and the weight decay parameter of the Adam optimizer to 0.1. To prevent overfitting, the experiment used soft labels as the model prediction labels, with the label smoothing epsilon set to 0.4. To compare the effectiveness of each experimental group, the F1-score was used as the comprehensive evaluation metric for the generic-name-derived geographical name recognition experiment.
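The effect of the soft labels can be illustrated with a short sketch. It assumes the common uniform-prior form of label smoothing, in which each one-hot entry y becomes (1 - eps) * y + eps / K for K classes; the paper states only that soft labels with a smoothing epsilon are used, so the exact formula is an assumption.

```python
def smooth_labels(one_hot_label, eps):
    """Uniform label smoothing: (1 - eps) * y + eps / K for each entry y."""
    k = len(one_hot_label)
    return [(1.0 - eps) * y + eps / k for y in one_hot_label]


# Binary task with epsilon 0.4: the positive class target drops from 1.0
# to about 0.8 and the negative class target rises from 0.0 to about 0.2.
print(smooth_labels([0.0, 1.0], 0.4))
```

With epsilon 0.4 the targets are fairly soft, which limits how confident the classifier can become on any single training example.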
The experiment results indicate that, compared to Prompt0, which does not add any geographical name information, both Prompt1, which adds generic-name-derived geographical name categories, and Prompt2, which adds both generic-name-derived geographical name categories and original geographical name categories, significantly improved the performance in recognizing generic name derived geographical names, with Prompt2 showing the largest improvement. Additionally, the experiments demonstrated that, compared to the Bert language model, the Ernie model exhibited better performance in the generic-name-derived geographical name recognition task. Among the multiple sets of experiments, the experiment that used both generic-name-derived geographical name categories and original geographical name categories as prompt words, combined with the Ernie pre-trained model, achieved the best recognition performance across all evaluation metrics.
In the general-name-derived place name recognition experiment, the experimental groups that added toponym category information all achieved good recognition results, indicating that the general-name-derived place name recognition method proposed in this paper, based on the combination of prompt learning and a bidirectional pre-trained language model, performs well.

2.2. Geographical Entity Generic Name Derivation Relationship Recognition Experiments

In addition to the SSD-AFF feature fusion model, three other feature fusion models are designed for the recognition method proposed in this paper, which integrates the semantic features and spatiotemporal distribution features of geographical entities: SF (SimpleFusion), SA-AFF (Self-Attention Attention Feature Fusion), and SENet-AFF (Squeeze-and-Excitation Networks Attention Feature Fusion). SF directly splices the general-name-derived semantic features with the spatiotemporal distribution features. SA-AFF, building on SF, uses a Self-Attention module to calculate the weight of the concatenated general-name-derived relation feature vector and adopts the AFF feature fusion method for feature fusion. SENet-AFF, building on SA-AFF, uses the SENet module to calculate the weight of the general-name-derived relation feature vector. At the same time, to compare the influence of the text input structure on the general-name-derived semantic features, another text input structure (Input0) is designed in addition to the text input structure used in this paper (Input1). In Input0, the category of the generic-name-derived place name and the generic-name-derived place name form the first sentence, and the category of the neighboring place name and the neighboring place name form the second sentence. In this paper, the feature fusion modules are combined with the text input structures, and multiple control experiments are constructed with the Ernie model and the Bert model to study the recognition performance of general-name-derived semantic relations for each combination. Accuracy, precision, recall, and F1 are selected as the evaluation metrics in this experiment, and the model parameters are as follows: the maximum encoding sequence length of the tokenizer is 100, the number of epochs is 20, the batch size is 128, the learning_rate is 1e-5, the weight_decay of the optimizer is 0.8, and the label_smooth_eps of the soft label is 0.1.
To compare the performance of the groups, F1 was selected as the comprehensive evaluation index, and the combination of SF and Input0 was used as the benchmark experimental group. The experiment shows that, in terms of the text input structure of general-name-derived semantic features, Input0 and Input1 perform similarly on the recognition task of general-name-derived relations across multiple groups of experiments. Compared with SF, SA-AFF and SSD-AFF improve the recognition of general-name-derived relations in most groups (Table 4): SA-AFF raises F1 in three of the four model and input combinations, and SSD-AFF raises F1 in all four, by 2.84 to 11.46 percentage points. SSD-AFF also achieves the highest F1 in three of the four combinations, indicating that the SSD-AFF feature fusion module is better at extracting the semantic features and spatiotemporal distribution features of general-name-derived semantic relations.
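The Lift values reported for each group appear to be F1 differences in percentage points. The sketch below rests on two assumptions inferred from the reported numbers rather than stated in the text: Lift is a plain difference of F1 scores, and the baseline is the SF group with the same language model and the same input structure. The function name `f1_lift` is hypothetical.

```python
def f1_lift(f1, baseline_f1):
    """F1 gain over the baseline group, in percentage points.

    Assumption: Lift = F1(group) - F1(SF baseline); inferred, not stated.
    """
    return round(f1 - baseline_f1, 2)


# F1 values (in percent) taken from Table 4, Ernie model with Input0.
baseline = 81.04                              # SF
ssd_aff_lift = f1_lift(90.53, baseline)       # SSD-AFF
senet_aff_lift = f1_lift(79.45, baseline)     # SENet-AFF
```

A positive value means the fusion module improves on simple concatenation; a negative value means it does worse.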
In the generic-name-derived relation recognition experiment, the experimental group composed of Bert, SSD-AFF, and Input0 performs best (F1 of 93.22%), which indicates that the recognition method proposed in this paper, based on the SSD-AFF feature fusion module, has good recognition performance.

3. Conclusion

To automate the recognition of generic-name-derivation semantic relationships between large-scale geographical entities, this paper proposes a deep learning-based method for recognizing generic-name-derivation semantic relationships in geographical entities. This method effectively improves the efficiency of recognizing generic-name-derivation relationships and performs well in the experiments. It has practical value in fields such as the construction of large-scale geographical name knowledge graphs and geographical name translation. A generic-name-derived geographical entity usually has a large number of neighboring geographical entities in its vicinity. However, the method proposed in this paper does not filter these entities when retrieving nearby geographical entities. In the future, we will consider integrating knowledge graph technology to filter out certain types of nearby geographical entities based on prior knowledge of the semantic association between the generic-name-derived part and the original geographical name category. This would further improve the efficiency of recognizing generic-name-derivation semantic relationships in geographical entities.

Author Contributions

Conceptualization, Liu Hanyou; methodology, Liu Hanyou; software, Liu Hanyou; validation, Liu Hanyou; formal analysis, Liu Hanyou; investigation, Liu Hanyou; resources, Liu Hanyou; data curation, Liu Hanyou; writing—original draft preparation, Liu Hanyou; writing—review and editing, Liu Hanyou; visualization, Liu Hanyou; supervision, Mao Xi; project administration, Mao Xi; funding acquisition, Mao Xi. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Basic Scientific Research Business Funding Project of the Research Institute of Chinese Academy of Surveying and Mapping (project numbers 7771802 and 7771721).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wei, C.; Fei, W.; Xin, H.; Dongyou, Z.; Xuxiang, X.; Meng, Y.; Ziwei, W.; Jiejun, H. Multi-Scale Semantic Segmentation and Spatial Relationship Recognition of Remote Sensing Images Based on an Attention Model. Remote Sensing 2019, 11. [Google Scholar] [CrossRef]
  2. Stuti Ahuja, Sonali Patil, U. B. Semantic understanding of high spatial resolution remote sensing images using directional geospatial relationships. Annals of GIS 2023, 29, 401–414. [Google Scholar] [CrossRef]
  3. Jie Chen, Y.H.; Wan, L.; Zhou, X.; Deng, M. Geospatial relation captioning for high-spatial-resolution images by using an attention-based neural network. International Journal of Remote Sensing 2019, 40, 6482–6498. [Google Scholar] [CrossRef]
  4. Meng, J.; Chuncheng, Y.; Haibin, S.; Zhilong, Q.; Zefan, W. Improved CasRel model method for joint extraction of geographic entities and overlapping spatial relations. Journal of Surveying and Mapping 2023, 52, 1387–1397. [Google Scholar]
  5. Li, Y.; Feng, L.; Xiliang, L. Bootstrapping method for open geographic entity relation extraction. Journal of Surveying and Mapping 2016, 45, 616–622. [Google Scholar]
  6. Membrado-Tena, J.C. Interpreting protohistoric societies through geographical names of landscape features: a case study in València, Spain. Landscape Research 2021, 46, 811–827. [Google Scholar] [CrossRef]
  7. Maixner, B. *Sæheimr: Just a Settlement by the Sea? Dating, Naming Motivation and Function of an Iron Age Maritime Geographical namein Scandinavia. Journal of Maritime Archaeology 2020, 15, 5–39. [Google Scholar] [CrossRef]
  8. Cheng, H. Derived Geographical names and Their Translation Methods. Chinese scientific and technological terms 2016, 18, 5–8. [Google Scholar] [CrossRef]
  9. Hanyou, L.; Jizhou, W.; Xi, M.; Weijun, M. Mining of Complete Derived Geographical names and Generic Derived Geographical names. Surveying and mapping science 2022, 47, 176–181+220. [Google Scholar] [CrossRef]
  10. Hanyou, L. Automatic Recognition and Translation of English Derived Geographical names for Global Mapping. Master’s thesis, Liaoning Technical University, 2022. [CrossRef]
  11. Jiang, T.; Jiao, J.; Huang, S.; Zhang, Z.; Wang, D.; Zhuang, F.; Wei, F.; Huang, H.; Deng, D.; Zhang, Q. PromptBERT: Improving BERT Sentence Embeddings with Prompts, 2022. arXiv:cs.CL/2201.04337].
  12. Luo, X.; Xue, Y.; Xing, Z.; Sun, J. PRCBERT: Prompt Learning for Requirement Classification using BERT-based Pretrained Language Models. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering; Association for Computing Machinery: New York, NY, USA, 2023; ASE ’22. [CrossRef]
  13. Zhu, Y.; Zhou, X.; Qiang, J.; Li, Y.; Yuan, Y.; Wu, X. Prompt-Learning for Short Text Classification, 2022. arXiv:cs.CL/2202.11345].
  14. Jiangtao, B.; Wei, P.; Yongjian, H.; Yongqiang, Z.; Huihui, Y. Research on the Construction of Historical Geographical name Comprehensive Information System Based on TGIS and Big Data Technology. Journal of Global Change Data 2021, 5, 363–372+520–529. [Google Scholar]
  15. Liu Yu, W.K.; Xiaoyue, X.; Hao, G.; Weiyu, Z.; Qinyao, L.; Song, G.; Zhou, H.; Haifeng, L.; Xin, L.; Jiao’e, W.; Jinfeng, W.; Di, Z. Spatial Effects in Geographical Analysis. Acta Geographica Sinica 2023, 78, 517–531. [Google Scholar]
  16. Hu, Y.; Janowicz, K. An Empirical Study on the Names of Points of Interest and Their Changes with Geographic Distance. Geographic Information Science,Geographic Information Science 2018.
  17. Xueying ZHANG, C.Z.a.W.; LV, G. Spatiotemporal features based geographical knowledge graph construction. SCIENTIA SINICA Informationis 2020, 50, 1019–1032. [Google Scholar] [CrossRef]
  18. Hu, Y. Study on the spatial-temporal relationship between Ancient and modern place names in Genealogical GIS. Master’s thesis, Nanjing Normal University, 2008.
  19. Mandillah, L. A morphosyntactic and semantic analysis of toponyms among the Luhya: A case of Bungoma County. Journal of Languages, Linguistics and Literary Studies 2022, p. 28–37. [CrossRef]
  20. Tobler, W.R. A computer movie simulating urban growth in the Detroit region. Economic geography 1970, 46, 234–240. [Google Scholar] [CrossRef]
  21. Zhang, C.; Zhang, X.; Ji, L.; WANG, H. Mapping relationship between place name and Geographical element type. Journal of Wuhan University (Information Science Edition) 2011, 36, 857–861. [Google Scholar]
  22. Dai, Y.; Gieseke, F.; Oehmcke, S.; Wu, Y.; Barnard, K. Attentional Feature Fusion. 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021. [CrossRef]
  23. Feng, M.; Zhang, R.; Wang, H.; Liu, Y.; Yang, G. Two-Level Feature Fusion Network for Remote Sensing Image Change Detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2024, 17, 8477–8489. [Google Scholar] [CrossRef]
  24. Jiao, Y.; Wang, X.; Wang, W.; Li, S. Image Semantic Segmentation Fusion of Edge Detection and AFF Attention Mechanism. Applied Sciences 2022, 12. [Google Scholar] [CrossRef]
Figure 1. Semantic relation of general-name-derived toponym.
Figure 2. Generic name derived place name recognition.
Figure 3. Identification of the derivational relationships of general name.
Figure 4. Spatial topological adjacency relationship.
Figure 5. Spatial proximity relationship.
Figure 6. SSD-AFF.
Table 1. The template of generic name derived toponymy recognition.
| Name | Content |
|---|---|
| Template | [CLS] the category of derived toponymy is D. the category of original toponymy is O. [SEP] the derived toponymy is P. [SEP] |

| Example Type | Positive | Negative |
|---|---|---|
| Input[O] | River | River |
| Input[D] | Road | River |
| Input[P] | West Riverside Avenue | Dead River |
| Output | 1 | 0 |
| Answer Map | General name derived toponym | Regular toponym |
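The template in Table 1 can be filled in mechanically. The following is a minimal sketch that builds the input string for the positive example; the function name `build_prompt` and the constant `ANSWER_MAP` are hypothetical, and writing the `[CLS]` and `[SEP]` markers into the string by hand is done only to mirror the table (a BERT-style tokenizer would normally add these special tokens itself when given a sentence pair).

```python
# Maps the classifier output to the answer labels listed in Table 1.
ANSWER_MAP = {1: "General name derived toponym", 0: "Regular toponym"}


def build_prompt(derived_category, original_category, toponym):
    """Fill the Table 1 template with D, O, and P."""
    first = (
        f"the category of derived toponymy is {derived_category}. "
        f"the category of original toponymy is {original_category}."
    )
    second = f"the derived toponymy is {toponym}."
    # Special tokens are written out explicitly for illustration only.
    return f"[CLS] {first} [SEP] {second} [SEP]"


# Positive example from Table 1: O = River, D = Road, P = West Riverside Avenue.
prompt = build_prompt("Road", "River", "West Riverside Avenue")
label = ANSWER_MAP[1]
```

Keeping the category prompt and the place name in separate sentences matches the two-segment layout of the template, so the same function covers both the positive and the negative example.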
Table 2. The relationship characteristics of generic name derivation.
| Feature name | Feature value |
|---|---|
| Spatial topological features | Intersects, Disjoint, Contains, Within, Equal, Overlap, Touch, Cross |
| Spatial metric features | Adjacent geographic entity, Non-adjacent geographical entities |
| Toponym lifetime features | The sign of $dif_{start}$ (+, -), the sign of $dif_{end}$ (+, -) |
| Derivation pattern features | Generic name derived toponym's categories, original toponym's categories |
| Semantic association features | Derived geographical name word sequences, original geographical name word sequences |
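The spatial and lifetime features in Table 2 can be turned into a numeric vector before fusion. The sketch below is one possible encoding and relies on several assumptions that the table does not confirm: a single topological relation per entity pair encoded as a one-hot vector, the lifetime differences computed as derived minus original, and a zero difference treated as positive. The names `TOPOLOGY`, `sign`, and `encode_features` are hypothetical.

```python
# Topological relations in the order listed in Table 2.
TOPOLOGY = ["Intersects", "Disjoint", "Contains", "Within",
            "Equal", "Overlap", "Touch", "Cross"]


def sign(value):
    """Return +1.0 or -1.0; zero is treated as positive (assumption)."""
    return 1.0 if value >= 0 else -1.0


def encode_features(topology, adjacent, derived_life, original_life):
    """Encode spatial and lifetime features as a flat list of floats.

    derived_life and original_life are (start, end) pairs, e.g. years.
    """
    if topology not in TOPOLOGY:
        raise ValueError(f"unknown topological relation: {topology}")
    one_hot = [1.0 if topology == name else 0.0 for name in TOPOLOGY]
    # Assumed direction of subtraction: derived toponym minus original.
    dif_start = derived_life[0] - original_life[0]
    dif_end = derived_life[1] - original_life[1]
    return one_hot + [1.0 if adjacent else 0.0, sign(dif_start), sign(dif_end)]


# A derived toponym that touches its neighbor and appeared 50 years later.
vector = encode_features("Touch", True, (1950, 2024), (1900, 2024))
```

The resulting vector has eleven entries: eight for the topological relation, one for adjacency, and two for the signs of the lifetime differences.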
Table 3. The relationship characteristics of generic name derivation.
| Model Name | Accuracy | Precision | Recall | F1 | Lift |
|---|---|---|---|---|---|
| Ernie_Prompt0 | 85.89% | 86.65% | 86.65% | 85.82% | |
| Ernie_Prompt1 | 94.43% | 94.55% | 94.43% | 94.43% | +8.61% |
| Ernie_Prompt2 | 95.80% | 96.05% | 95.80% | 95.80% | +9.98% |
| Bert_Prompt0 | 82.94% | 83.53% | 82.94% | 82.85% | |
| Bert_Prompt1 | 94.08% | 94.16% | 94.08% | 94.08% | +11.23% |
| Bert_Prompt2 | 95.46% | 95.54% | 95.46% | 95.46% | +12.61% |
Table 4. The relationship characteristics of generic name derivation.
| Model Name | Accuracy | Precision | Recall | F1 | Lift |
|---|---|---|---|---|---|
| Ernie_SF_Input0 | 81.11% | 85.70% | 81.11% | 81.04% | |
| Ernie_SA-AFF_Input0 | 80.79% | 83.01% | 80.79% | 80.91% | -0.13% |
| Ernie_SENet-AFF_Input0 | 79.44% | 83.33% | 79.44% | 79.45% | -1.59% |
| Ernie_SSD-AFF_Input0 | 90.52% | 90.71% | 90.52% | 90.53% | +9.49% |
| Ernie_SF_Input1 | 81.43% | 85.52% | 81.43% | 81.38% | |
| Ernie_SA-AFF_Input1 | 91.30% | 92.24% | 91.30% | 91.37% | +9.99% |
| Ernie_SENet-AFF_Input1 | 79.94% | 83.67% | 79.94% | 79.93% | -1.45% |
| Ernie_SSD-AFF_Input1 | 84.16% | 86.61% | 84.16% | 84.22% | +2.84% |
| Bert_SF_Input0 | 81.85% | 86.81% | 81.85% | 81.76% | |
| Bert_SA-AFF_Input0 | 89.77% | 90.08% | 89.77% | 89.80% | +8.42% |
| Bert_SENet-AFF_Input0 | 70.45% | 71.14% | 70.45% | 69.47% | -12.29% |
| Bert_SSD-AFF_Input0 | 93.18% | 93.89% | 93.18% | 93.22% | +11.46% |
| Bert_SF_Input1 | 80.15% | 86.18% | 80.15% | 79.90% | |
| Bert_SA-AFF_Input1 | 87.75% | 88.95% | 87.75% | 87.82% | +7.92% |
| Bert_SENet-AFF_Input1 | 79.87% | 85.71% | 79.87% | 79.69% | -0.21% |
| Bert_SSD-AFF_Input1 | 89.77% | 91.01% | 89.77% | 89.83% | +9.93% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.