Scholarly record

A REPRODUCIBLE MACHINE LEARNING APPROACH FOR SPATIAL DATA MODELING: CLUSTERING BUCHAREST SECTORS USING DEMOGRAPHIC AND SPATIAL INDICATORS

Radu-Anton Moldovan, Marian Pompiliu Cristescu, Ana-Maria Constantinescu

First published: 2026. DOI pending.

Abstract

This paper examines how machine learning techniques can be applied to spatial data modeling and the identification of urban patterns using a limited set of demographic and geographic indicators. Drawing on sector-level data for Bucharest from the National Institute of Statistics of Romania, the analysis uses four variables: surface area, population, population density, and distance to the city center, computed as the Euclidean distance between each sector centroid and a defined central reference point. The dataset is processed through a standardized analytical workflow that includes feature normalization and k-means clustering, with the number of clusters determined by the elbow method. Principal component analysis is used to visualize and interpret the results. The findings reveal three distinct spatial groupings, corresponding to peripheral, intermediate, and central urban structures. Sector 1 forms a cluster of its own, characterized by large surface area, low population density, and greater distance from the city center, while Sectors 2, 4, and 5 are more compact and centrally located. Sectors 3 and 6 display intermediate profiles, combining moderate spatial and demographic features. The silhouette score of 0.266 suggests moderate cluster separation, reflecting the limited size and dimensionality of the dataset. These results highlight the potential of simple machine learning approaches to support exploratory spatial analysis, even in data-constrained contexts. The study demonstrates that interpretable and reproducible workflows can provide insight into urban spatial structure and serve as a foundation for more advanced modeling approaches.
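
The workflow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the six rows below are hypothetical placeholder units (not the National Institute of Statistics figures), the central reference point and planar coordinates are assumptions, and the function names are invented for illustration. The sketch derives density and Euclidean distance to the center, standardizes the features, runs a plain Lloyd-style k-means, reports inertia per k for an elbow check, and computes a silhouette score. The PCA projection is omitted to avoid external dependencies; in practice a library such as scikit-learn would typically be used for all of these steps.

```python
import math
import random

# Hypothetical placeholder rows for illustration only; NOT the official data.
# (unit, area_km2, population, centroid_x_km, centroid_y_km)
SECTORS = [
    ("A", 70.0, 250_000, -3.0, 6.0),
    ("B", 32.0, 370_000, 3.0, 2.5),
    ("C", 34.0, 480_000, 5.0, -1.5),
    ("D", 34.0, 330_000, 1.5, -4.0),
    ("E", 30.0, 300_000, -2.5, -3.5),
    ("F", 38.0, 390_000, -5.0, 0.5),
]
CENTER = (0.0, 0.0)  # assumed central reference point, same planar units


def build_features(rows, center):
    """Return [area, population, density, distance_to_center] per unit."""
    feats = []
    for _, area, pop, x, y in rows:
        dist = math.hypot(x - center[0], y - center[1])  # Euclidean distance
        feats.append([area, pop, pop / area, dist])
    return feats


def standardize(X):
    """Z-score each column (population std); assumes no constant column."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def assign(X, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in X]


def kmeans(X, k, seed=0, iters=100):
    """Plain Lloyd iterations; returns (labels, inertia)."""
    centroids = random.Random(seed).sample(X, k)
    for _ in range(iters):
        labels = assign(X, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(X, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(X, centroids)
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(X, labels))
    return labels, inertia


def silhouette(X, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(X, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # Mean distance to the other members (distance to self is 0).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to any other cluster.
        b = min(sum(math.dist(p, q) for q in m) / len(m)
                for other, m in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    Z = standardize(build_features(SECTORS, CENTER))
    for k in range(1, 6):  # elbow check: look for the bend in inertia
        print(k, round(kmeans(Z, k)[1], 3))
    labels, _ = kmeans(Z, 3)
    print(labels, round(silhouette(Z, labels), 3))
```

With only six observations, the inertia curve and silhouette value are sensitive to initialization and to the placeholder inputs, so the printed numbers should not be compared with the 0.266 reported above.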

Publication details

Title
A REPRODUCIBLE MACHINE LEARNING APPROACH FOR SPATIAL DATA MODELING: CLUSTERING BUCHAREST SECTORS USING DEMOGRAPHIC AND SPATIAL INDICATORS
Authors
Radu-Anton Moldovan, Marian Pompiliu Cristescu, Ana-Maria Constantinescu
Proceedings
SWS 2026 Conference Preprints
Publisher
STEF92 Technology
Year
2026
Pages
Not available yet
ISSN
1314-2704
ISBN
Not available yet
Language
English
Publication type
Preprint