Spatial embedding

Spatial embedding is a feature learning technique used in spatial analysis in which points, lines, polygons, or other spatial data types[1] representing geographic locations are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with many dimensions per geographic object to a continuous vector space of much lower dimension.

Such embedding methods allow complex spatial data to be used in neural networks and have been shown to improve performance in spatial analysis tasks.[2][3]

Embedded data types

Geographic data can take many forms: text,[4][5][6] images,[7][8] graphs,[9][10] trajectories,[11][12][13] and polygons.[14] Depending on the task, multimodal data from different sources may need to be combined.[2][15] The following sections describe examples of different data types and their uses.

Text

Geolocated posts on social media can be used to build a collection of documents tied to a given place, which can then be transformed into embedding vectors using word embedding techniques.[4]
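As a rough illustration of the idea above, the sketch below averages word vectors over all posts tied to a place to obtain one vector per place. The word vectors, place names, and posts are made-up toy values, and the helper `embed_place` is hypothetical; in practice the vectors would come from a trained word embedding model, and the cited works may use different aggregation schemes.

```python
def embed_place(posts, word_vectors, dim):
    """Average the vectors of all known words in the posts of one place."""
    total = [0.0] * dim
    count = 0
    for post in posts:
        for token in post.lower().split():
            vec = word_vectors.get(token)
            if vec is not None:  # skip words without a vector
                total = [t + v for t, v in zip(total, vec)]
                count += 1
    return [t / count for t in total] if count else total

# Toy 2-dimensional word vectors (assumed, not learned from data).
word_vectors = {
    "coffee": [1.0, 0.0],
    "latte": [0.8, 0.2],
    "goal": [0.0, 1.0],
    "match": [0.1, 0.9],
}

# Hypothetical geolocated posts grouped by place.
posts_by_place = {
    "cafe": ["Great coffee", "coffee latte"],
    "stadium": ["What a goal", "goal match"],
}

place_vectors = {
    place: embed_place(posts, word_vectors, dim=2)
    for place, posts in posts_by_place.items()
}
```

Places whose posts use similar vocabulary end up with nearby vectors, which is the property that downstream models rely on.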

Image

Satellites and aircraft collect remotely sensed images, a form of digital spatial data that can be used in machine learning. Such images are sometimes hard to analyse using basic image analysis methods, so convolutional neural networks can be used to obtain an embedding of the images tied to a given geographical object or region.[7]

Example of a satellite image of Seattle acquired using remote sensing methods.

Point

A single point of interest (POI) can be assigned multiple features that can be used in machine learning. These could be demographic, transportation, meteorological, or economic data, for example. When embedding single points, it is common to consider the entire set of available points as nodes in a graph.[10]

Example of a point-of-interest map from OpenPoiMap.
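One simple way to treat a set of POIs as nodes in a graph, as described above, is to connect each POI to its nearest neighbours. The sketch below assumes a k-nearest-neighbour rule based on great-circle distance; the POI names and coordinates are illustrative, and the cited work may construct its graph differently.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def knn_graph(pois, k=2):
    """Return {poi_name: [names of its k nearest POIs, closest first]}."""
    graph = {}
    for name, coords in pois.items():
        others = [(haversine_km(coords, c), n) for n, c in pois.items() if n != name]
        graph[name] = [n for _, n in sorted(others)[:k]]
    return graph

# Hypothetical POIs (name -> (lat, lon)); coordinates are illustrative only.
pois = {
    "cafe": (47.6097, -122.3331),
    "library": (47.6067, -122.3325),
    "museum": (47.6076, -122.3381),
    "stadium": (47.5952, -122.3316),
}

graph = knn_graph(pois, k=2)
```

The resulting adjacency lists can then be passed to a graph embedding method, with the per-POI features attached to the nodes.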

Line / multiline

Motion trajectories, among other things, are represented as lines (multilines). Individual trajectories are embedded by taking into account travel times, distances, and the features of points visited along the way. Embedding trajectories improves performance in tasks such as clustering and categorization.[13]

Example of mobility trajectories from the GeoLife dataset (Beijing, China), plotted on a black-and-white map of the city.
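The travel time and distance of a trajectory, mentioned above as inputs to trajectory embedding, can be computed directly from timestamped points. The sketch below assumes a trajectory is a time-ordered list of (Unix seconds, latitude, longitude) tuples and produces a small hand-crafted feature vector. It is not a learned embedding, only an example of the kind of input such a model might consume.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def trajectory_features(points):
    """points: list of (unix_seconds, lat, lon), ordered by time.

    Returns [duration in seconds, path length in km, mean speed in km/h].
    """
    duration_s = points[-1][0] - points[0][0]
    length_km = sum(
        haversine_km(p[1], p[2], q[1], q[2]) for p, q in zip(points, points[1:])
    )
    mean_speed_kmh = length_km / (duration_s / 3600) if duration_s > 0 else 0.0
    return [duration_s, length_km, mean_speed_kmh]

# Illustrative trajectory: one degree of longitude along the equator in one hour.
features = trajectory_features([
    (0, 0.0, 0.0),
    (1800, 0.0, 0.5),
    (3600, 0.0, 1.0),
])
```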

Polygon

The geographic areas analyzed in machine learning are defined either by administrative boundaries or by a top-down division into grids of regular shapes, such as rectangles. Both types are represented as polygons and, like points, can be assigned different demographic, transportation, or economic features. A polygon can also have features related to the size or shape of the area it represents.

Example of a regular hexagonal tiling dividing the San Francisco Bay Area, created using Uber's H3 library.
Map of San Francisco administrative districts.
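Size and shape features of a polygon, as mentioned above, can be derived from its vertices. The sketch below assumes a simple (non-self-intersecting) polygon given in planar, projected coordinates such as metres. It computes the area with the shoelace formula, the perimeter, and the Polsby-Popper compactness score 4πA/P², which is one possible shape descriptor chosen here for illustration.

```python
import math

def polygon_features(vertices):
    """vertices: list of (x, y) in planar units, in order around the boundary."""
    n = len(vertices)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1  # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2
    compactness = 4 * math.pi * area / perimeter ** 2  # 1.0 for a circle
    return {"area": area, "perimeter": perimeter, "compactness": compactness}

# Illustrative 2 x 2 square.
square = polygon_features([(0, 0), (2, 0), (2, 2), (0, 2)])
```

For geographic coordinates in degrees, the polygon would first need to be projected to a planar coordinate system for these values to be meaningful.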

Graph

One domain where a graph representation is used is a city's street layout, where vertices can be intersections and edges can be roads. The vertices can also be destination points such as public transport stops or important points in the city, with edges representing the flow between them. Embedding graphs or individual vertices improves the accuracy of analysis methods in which the geographical domain can be represented as a network.[9]

Example of a city network: the Rennes Metro (French: Métro de Rennes). In this example, metro stops are vertices and the tracks between them are edges.
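A network like the one above can be stored as an adjacency list. One common family of graph embedding methods (DeepWalk-style approaches) samples random walks over the graph and feeds them to a skip-gram model as if they were sentences. The sketch below shows only the walk-sampling step on a hypothetical four-stop line; the stop names are placeholders, and the cited works may use other techniques such as graph neural networks.

```python
import random

def random_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Sample fixed-length random walks starting from every vertex."""
    rng = random.Random(seed)  # seeded for reproducibility
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            # Extend the walk until it is long enough or hits a dead end.
            while len(walk) < walk_length and adjacency[walk[-1]]:
                walk.append(rng.choice(adjacency[walk[-1]]))
            walks.append(walk)
    return walks

# Hypothetical line with four stops: A - B - C - D.
adjacency = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

walks = random_walks(adjacency)
```

Vertices that frequently co-occur in walks are close in the network, so a model trained on these sequences tends to place them near each other in the embedding space.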

Usage

  • POI recommendation[15][16] - generating personalized point-of-interest recommendations based on user preferences.
  • Next/future location prediction[10][17] - predicting the next location a person will visit based on their historical trajectory.
  • Zone function classification[13] - predicting the function of a given area in a city from the mobility of people or the distribution of POIs.
  • Crime prediction[18] - estimating the crime rate in different regions of a city.
  • Local event detection[6] - studying spatio-temporal changes in embeddings to help detect local events occurring at a specific location.
  • Regional mobility popularity prediction[11] - analyzing mobility to reveal patterns in the popularity of different regions of a city.
  • Shape matching[14] - finding shapes similar to a given polygon, for example finding buildings with the same shape as an input building.
  • Travel time estimation[19][20][21] - predicting travel time given current traffic conditions and special events.
  • Time estimation for on-demand food delivery[22] - estimating the delivery time when an order is placed through a website.

Temporal aspect

Some of the analyzed data has a timestamp associated with it. In some analyses this information is omitted; in others it is used to divide the data set into groups. The most common divisions separate weekdays from weekends or split the day into hours. This is particularly important in the analysis of mobility data, because mobility patterns differ substantially between weekdays and weekends and across times of day.[3][23][24] Division by time, for example into individual months, can also be used in the analysis of tourism in a given region.[16] To take such a split into account, embedding methods either handle the timestamp explicitly or develop separate versions of the model for different subgroups of the analyzed set.
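The weekday/weekend and hour-of-day split described above can be sketched as a simple grouping step. The record layout (a timestamp paired with a payload) and the trip identifiers are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

def time_bucket(ts):
    """Map a timestamp to a (day type, hour of day) bucket."""
    day_type = "weekend" if ts.weekday() >= 5 else "weekday"  # 5 = Saturday, 6 = Sunday
    return (day_type, ts.hour)

def split_by_time(records):
    """Group (timestamp, payload) records by their time bucket."""
    groups = defaultdict(list)
    for ts, payload in records:
        groups[time_bucket(ts)].append(payload)
    return dict(groups)

# Hypothetical mobility records.
records = [
    (datetime(2021, 1, 18, 8, 15), "trip-1"),  # Monday morning
    (datetime(2021, 1, 18, 8, 50), "trip-2"),  # Monday morning
    (datetime(2021, 1, 23, 14, 5), "trip-3"),  # Saturday afternoon
]

groups = split_by_time(records)
```

Each group can then be used to train a separate model, or the bucket itself can be supplied to a single model as an additional input.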

References

  1. Schneider, Markus (2009), Liu, Ling; Özsu, M. Tamer, eds. (in en), Spatial Data Types, Boston, MA: Springer US, pp. 2698–2702, doi:10.1007/978-0-387-39940-9_354, ISBN 978-0-387-39940-9, https://doi.org/10.1007/978-0-387-39940-9_354, retrieved 2021-01-19
  2. Li, Youru; Zhu, Zhenfeng; Kong, Deqiang; Xu, Meixiang; Zhao, Yao (2019-07-17). "Learning Heterogeneous Spatial-Temporal Representation for Bike-Sharing Demand Prediction". Proceedings of the AAAI Conference on Artificial Intelligence 33: 1004–1011. doi:10.1609/aaai.v33i01.33011004. ISSN 2374-3468. https://aaai.org/ojs/index.php/AAAI/article/view/3890.
  3. Cao, Hancheng; Xu, Fengli; Sankaranarayanan, Jagan; Li, Yong; Samet, Hanan (2020-05-01). "Habit2vec: Trajectory Semantic Embedding for Living Pattern Recognition in Population". IEEE Transactions on Mobile Computing 19 (5): 1096–1108. doi:10.1109/TMC.2019.2902403. ISSN 1536-1233. https://ieeexplore.ieee.org/document/8656580.
  4. Dassereto, Federico; Di Rocco, Laura; Guerrini, Giovanna; Bertolotto, Michela (2020), Kyriakidis, Phaedon; Hadjimitsis, Diofantos; Skarlatos, Dimitrios et al., eds., "Evaluating the Effectiveness of Embeddings in Representing the Structure of Geospatial Ontologies" (in en), Geospatial Technologies for Local and Regional Development, Lecture Notes in Geoinformation and Cartography (Cham: Springer International Publishing): pp. 41–57, doi:10.1007/978-3-030-14745-7_3, ISBN 978-3-030-14744-0, http://link.springer.com/10.1007/978-3-030-14745-7_3, retrieved 2021-01-19
  5. Jin, Jiaqi; Xiao, Zhuojian; Qiu, Qiang; Fang, Jinyun (July 2019). "A Geohash Based Place2vec Model". IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium. Yokohama, Japan: IEEE. pp. 3344–3347. doi:10.1109/IGARSS.2019.8898375. ISBN 978-1-5386-9154-0. https://ieeexplore.ieee.org/document/8898375. 
  6. Silva, Amila; Karunasekera, Shanika; Leckie, Christopher; Luo, Ling (December 2019). "USTAR: Online Multimodal Embedding for Modeling User-Guided Spatiotemporal Activity". 2019 IEEE International Conference on Big Data (Big Data). Los Angeles, CA, USA: IEEE. pp. 1211–1217. doi:10.1109/BigData47090.2019.9005461. ISBN 978-1-7281-0858-2. https://ieeexplore.ieee.org/document/9005461.
  7. Zhang, Sen; Li, Shaobo; Li, Xiang; Yao, Yong (2020-04-02). "Representation of Traffic Congestion Data for Urban Road Traffic Networks Based on Pooling Operations" (in en). Algorithms 13 (4): 84. doi:10.3390/a13040084. ISSN 1999-4893.
  8. Dao, Minh-Son; Zettsu, Koji (September 2018). "A Raster-Image-Based Approach for Understanding Associations of Urban Sensing Data". 2018 IEEE First International Conference on Artificial Intelligence and Knowledge Engineering (AIKE). Laguna Hills, CA: IEEE. pp. 134–137. doi:10.1109/AIKE.2018.00029. ISBN 978-1-5386-9555-5. https://ieeexplore.ieee.org/document/8527461. 
  9. Wu, Ning; Zhao, Xin Wayne; Wang, Jingyuan; Pan, Dayan (2020-08-23). "Learning Effective Road Network Representation with Hierarchical Graph Neural Networks" (in en). Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. KDD '20. Virtual Event CA USA: ACM. pp. 6–14. doi:10.1145/3394486.3403043. ISBN 978-1-4503-7998-4. https://dl.acm.org/doi/10.1145/3394486.3403043.
  10. Xu, Shuai; Cao, Jiuxin; Legg, Phil; Liu, Bo; Li, Shancang (June 2020). "Venue2Vec: An Efficient Embedding Model for Fine-Grained User Location Prediction in Geo-Social Networks". IEEE Systems Journal 14 (2): 1740–1751. doi:10.1109/JSYST.2019.2913080. ISSN 1932-8184. Bibcode: 2020ISysJ..14.1740X. https://ieeexplore.ieee.org/document/8713862.
  11. Fu, Yanjie; Wang, Pengyang; Du, Jiadi; Wu, Le; Li, Xiaolin (2019-07-17). "Efficient Region Embedding with Multi-View Spatial Networks: A Perspective of Locality-Constrained Spatial Autocorrelations" (in en). Proceedings of the AAAI Conference on Artificial Intelligence 33 (1): 906–913. doi:10.1609/aaai.v33i01.3301906. ISSN 2374-3468. https://ojs.aaai.org/index.php/AAAI/article/view/3879.
  12. Ouyang, Kun; Shokri, Reza; Rosenblum, David S.; Yang, Wenzhuo (July 2018). "A Non-Parametric Generative Model for Human Trajectories" (in en). Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Stockholm, Sweden: International Joint Conferences on Artificial Intelligence Organization. pp. 3812–3817. doi:10.24963/ijcai.2018/530. ISBN 978-0-9992411-2-7. https://www.ijcai.org/proceedings/2018/530. 
  13. Yao, Zijun; Fu, Yanjie; Liu, Bin; Hu, Wangsu; Xiong, Hui (July 2018). "Representing Urban Functions through Zone Embedding with Human Mobility Patterns" (in en). Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Stockholm, Sweden: International Joint Conferences on Artificial Intelligence Organization. pp. 3919–3925. doi:10.24963/ijcai.2018/545. ISBN 978-0-9992411-2-7. https://www.ijcai.org/proceedings/2018/545.
  14. Yan, Xiongfeng; Ai, Tinghua; Yang, Min; Tong, Xiaohua (2020-05-25). "Graph convolutional autoencoder model for the shape coding and cognition of buildings in maps" (in en). International Journal of Geographical Information Science 35 (3): 490–512. doi:10.1080/13658816.2020.1768260. ISSN 1365-8816. https://www.tandfonline.com/doi/full/10.1080/13658816.2020.1768260.
  15. Chang, Buru; Park, Yonggyu; Park, Donghyeon; Kim, Seongsoon; Kang, Jaewoo (July 2018). "Content-Aware Hierarchical Point-of-Interest Embedding Model for Successive POI Recommendation" (in en). Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Stockholm, Sweden: International Joint Conferences on Artificial Intelligence Organization. pp. 3301–3307. doi:10.24963/ijcai.2018/458. ISBN 978-0-9992411-2-7. https://www.ijcai.org/proceedings/2018/458.
  16. Bin, Chenzhong; Gu, Tianlong; Jia, Zhonghao; Zhu, Guimin; Xiao, Cihan (June 2020). "A neural multi-context modeling framework for personalized attraction recommendation" (in en). Multimedia Tools and Applications 79 (21–22): 14951–14979. doi:10.1007/s11042-019-08554-5. ISSN 1380-7501. http://link.springer.com/10.1007/s11042-019-08554-5.
  17. "Next Location Prediction with a Graph Convolutional Network Based on a Seq2seq Framework". KSII Transactions on Internet and Information Systems 14 (5). 2020-05-31. doi:10.3837/tiis.2020.05.003. http://itiis.org/digital-library/23549. 
  18. Qian, Yiting; Pan, Li; Wu, Peng; Xia, Zhengmin (July 2020). "GeST: A Grid Embedding based Spatio-Temporal Correlation Model for Crime Prediction". 2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC). Hong Kong, Hong Kong: IEEE. pp. 1–7. doi:10.1109/DSC50466.2020.00009. ISBN 978-1-7281-9558-2. https://ieeexplore.ieee.org/document/9172874. 
  19. Wang, Meng-xiang; Lee, Wang-Chien; Fu, Tao-yang; Yu, Ge (2019-11-05). "Learning Embeddings of Intersections on Road Networks" (in en). Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Chicago IL USA: ACM. pp. 309–318. doi:10.1145/3347146.3359075. ISBN 978-1-4503-6909-1. 
  20. Xu, Saijun; Xu, Jiajie; Zhou, Rui; Liu, Chengfei; Li, Zhixu; Liu, An (2020), Nah, Yunmook; Cui, Bin; Lee, Sang-Won et al., eds., "TADNM: A Transportation-Mode Aware Deep Neural Model for Travel Time Estimation" (in en), Database Systems for Advanced Applications, Lecture Notes in Computer Science (Cham: Springer International Publishing) 12112: pp. 468–484, doi:10.1007/978-3-030-59410-7_32, ISBN 978-3-030-59409-1, http://link.springer.com/10.1007/978-3-030-59410-7_32, retrieved 2021-01-19 
  21. Xu, Saijun; Zhang, Ruoqian; Cheng, Wanjun; Xu, Jiajie (2020-08-15). "MTLM: a multi-task learning model for travel time estimation" (in en). GeoInformatica 26 (2): 379–395. doi:10.1007/s10707-020-00422-x. ISSN 1384-6175. http://link.springer.com/10.1007/s10707-020-00422-x. 
  22. Zhu, Lin; Yu, Wei; Zhou, Kairong; Wang, Xing; Feng, Wenxing; Wang, Pengyu; Chen, Ning; Lee, Pei (2020-08-23). "Order Fulfillment Cycle Time Estimation for On-Demand Food Delivery" (in en). Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. KDD '20. Virtual Event CA USA: ACM. pp. 2571–2580. doi:10.1145/3394486.3403307. ISBN 978-1-4503-7998-4. https://dl.acm.org/doi/10.1145/3394486.3403307. 
  23. Du, Bowen; Peng, Hao; Wang, Senzhang; Bhuiyan, Md Zakirul Alam; Wang, Lihong; Gong, Qiran; Liu, Lin; Li, Jing (March 2020). "Deep Irregular Convolutional Residual LSTM for Urban Traffic Passenger Flows Prediction". IEEE Transactions on Intelligent Transportation Systems 21 (3): 972–985. doi:10.1109/TITS.2019.2900481. ISSN 1524-9050. https://ieeexplore.ieee.org/document/8664646. 
  24. Hong, Huiting; Lin, Yucheng; Yang, Xiaoqing; Li, Zang; Fu, Kung; Wang, Zheng; Qie, Xiaohu; Ye, Jieping (2020-08-23). "HetETA" (in en). Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. KDD '20. Virtual Event CA USA: ACM. pp. 2444–2454. doi:10.1145/3394486.3403294. ISBN 978-1-4503-7998-4. https://dl.acm.org/doi/10.1145/3394486.3403294.