Clustering of the stock price using minimum spanning tree

Document Type: Original Article

Authors

Department of Mathematics, Faculty of Basic Sciences, Ayatollah Boroujerdi University, Boroujerd, Iran

Abstract

As individuals and legal entities become increasingly active in the capital market, and as this market turns into one of the most important economic drivers of any country, better-informed share selection should lead to higher profitability. This paper proposes clustering stock price time series using the minimum spanning tree algorithm. The dataset consists of the daily closing prices of the shares of companies listed on the Tehran Stock Exchange from 09/23/2019 to 11/10/2020. In the first stage, we form sub-clusters of companies whose prices behave similarly. Then, based on a similarity criterion, the sub-clusters are merged until the desired clusters, whose members are most similar to one another, are obtained. The main advantage of the proposed method is that the similarity measures are calculated locally, which lowers the computational cost compared to other methods. The results indicate that the method performs the clustering readily, especially for large datasets, with favorable accuracy.
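The general idea of clustering price series with a minimum spanning tree can be illustrated with a minimal sketch. It is not the two-stage split-and-merge procedure of the paper, whose local similarity measure is not given in the abstract. Instead it shows the classic variant: build the tree with Prim's algorithm and cut its heaviest edges. The correlation-based distance on log returns, the toy prices, and all function names are assumptions made for illustration only.

```python
import math


def log_returns(prices):
    """Convert a closing-price series into daily log returns."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def correlation_distance(x, y):
    """sqrt(2 * (1 - rho)) with rho the Pearson correlation (assumed metric).

    Assumes neither series is constant, otherwise rho is undefined.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    rho = cov / (sx * sy)
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))  # clamp rounding noise


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix; returns (weight, i, j) edges."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((dist[parent[u]][u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    return edges


def mst_clusters(dist, k):
    """Cut the k - 1 heaviest MST edges and label the connected components."""
    n = len(dist)
    kept = sorted(minimum_spanning_tree(dist))[: n - k]  # the n - k lightest edges
    root = list(range(n))  # union-find parent pointers

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)
    # Labels are representative node indices, not consecutive integers.
    return [find(i) for i in range(n)]


# Hypothetical closing prices: A and B move together, C and D move together.
prices = {
    "A": [100, 102, 101, 105, 107, 106],
    "B": [50, 51, 50.5, 52.6, 53.5, 53.1],
    "C": [80, 78, 79, 76, 75, 76],
    "D": [20, 19.5, 19.8, 19.0, 18.7, 19.0],
}
names = list(prices)
series = [log_returns(prices[name]) for name in names]
dist = [[correlation_distance(a, b) for b in series] for a in series]
labels = mst_clusters(dist, k=2)
```

In this toy example, A and B receive one label and C and D another, because the only heavy edge in the tree is the one joining the two groups. Only the n - 1 tree edges are inspected when forming clusters, which is one reason tree-based methods are attractive for large datasets. The dense distance matrix used here is a simplification that a scalable implementation would avoid.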

