Implementation of a Province Clustering Dashboard Based on IPM and Smoking Prevalence 2024 Using K-Means with KNN Validation as a Support for Data-Driven Decision Making

Authors

  • Indry Widiyani, Program Studi Teknik Informatika, Universitas Pelita Bangsa, Indonesia
  • Asep Arwan Sulaeman, Program Studi Teknik Informatika, Universitas Pelita Bangsa, Indonesia
  • Handala Simetris Harahap, Program Studi Teknik Informatika, Universitas Pelita Bangsa, Indonesia

DOI:

https://doi.org/10.51601/ijse.v6i2.649

Abstract

Differences in the Human Development Index (HDI; Indonesian: Indeks Pembangunan Manusia, IPM) and smoking prevalence across Indonesia's provinces point to variations in social conditions that warrant systematic analysis. This study implements an interactive dashboard that clusters provinces by their 2024 HDI and smoking prevalence using the K-Means algorithm and validates the resulting clusters with K-Nearest Neighbor (KNN). The data were obtained from the Central Statistics Agency (BPS) and the Regional Management Information System (SIMREG-Bappenas) and processed in RapidMiner Studio. The research stages comprised data cleaning, normalization by Z-transformation, K-Means clustering into three clusters, validation with KNN under cross-validation, and implementation of a Streamlit-based dashboard. The results show that the provinces can be grouped into three clusters with distinct HDI and smoking-prevalence profiles. The dashboard presents the analysis as tables, graphs, and interactive maps, which eases interpretation and supports data-driven decision-making. The validation indicates that the cluster assignments are highly consistent, so the clustering is suitable as a basis for policy recommendations on regional development and public health.
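
The stages named in the abstract (Z-transformation, K-Means with three clusters, KNN checked by cross-validation) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the study processed its data in RapidMiner Studio, and everything below is an assumption made for illustration. The input values are synthetic placeholders rather than BPS or SIMREG-Bappenas figures, the number of neighbors and folds are arbitrary because the abstract does not state them, and a farthest-first initialization is swapped in for random seeding so that the run is deterministic.

```python
import math
from collections import Counter
from statistics import mean, pstdev

# Synthetic (HDI, smoking %) pairs around three made-up centers.
# These are NOT real provincial figures; they only exercise the pipeline.
toy_centers = [(65.0, 32.0), (72.0, 27.0), (80.0, 22.0)]
offsets = (-0.5, 0.0, 0.5)
data = [(h + dx, s + dy) for h, s in toy_centers for dx in offsets for dy in offsets]

def zscore(rows):
    """Z-transformation: rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]

def farthest_first(points, k):
    """Deterministic seeding (an assumption): repeatedly pick the point farthest from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k=3, iters=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster empties
                centers[j] = tuple(mean(c) for c in zip(*members))
    return labels

def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the n_neighbors closest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda t: math.dist(t[0], query))
    return Counter(y for _, y in nearest[:n_neighbors]).most_common(1)[0][0]

def cv_accuracy(points, labels, folds=5, n_neighbors=3):
    """k-fold cross-validation: share of held-out points whose K-Means label KNN recovers."""
    hits = 0
    for f in range(folds):
        test_idx = set(range(f, len(points), folds))
        train_x = [p for i, p in enumerate(points) if i not in test_idx]
        train_y = [lab for i, lab in enumerate(labels) if i not in test_idx]
        hits += sum(
            knn_predict(train_x, train_y, points[i], n_neighbors) == labels[i]
            for i in test_idx
        )
    return hits / len(points)

scaled = zscore(data)
labels = kmeans(scaled, k=3)
accuracy = cv_accuracy(scaled, labels)
print("cluster sizes:", sorted(Counter(labels).values()))
print("KNN cross-validated agreement:", accuracy)
```

In this reading, a high cross-validated agreement means that a province's cluster can be recovered from its nearest neighbors in the standardized feature space, which is one way to interpret the "consistency" the abstract reports. On the well-separated toy data above the agreement is perfect by construction, so the number says nothing about the real dataset.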

Published

2026-06-24

How to Cite

Indry Widiyani, Asep Arwan Sulaeman, & Handala Simetris Harahap. (2026). Implementation of a Province Clustering Dashboard Based on IPM and Smoking Prevalence 2024 Using K-Means with KNN Validation as a Support for Data-Driven Decision Making. International Journal of Science and Environment (IJSE), 6(2), 1682–1690. https://doi.org/10.51601/ijse.v6i2.649

Section

Articles