A Method Based on Deep Learning and a Sentiment Lexicon for Sentiment Analysis of Persian Texts

Article type: Research article

Authors

Faculty of Electrical and Computer Engineering, University of Sistan and Baluchestan, Zahedan, Iran.

Abstract

Sentiment analysis is an important branch of natural language processing that aims to classify texts according to the sentiment and attitude of their author. In Persian, texts written on social networks are often short, unstructured, and full of colloquial and informal expressions, and these characteristics markedly reduce the performance of sentiment analysis algorithms. This paper presents a method based on deep learning and a sentiment lexicon for sentiment analysis of Persian texts written on social networks. Because most existing Persian sentiment lexicons are small and lack colloquial and informal expressions, a method is first presented for expanding existing sentiment lexicons by adding colloquial expressions that are widely used on social media and whose polarity has been determined with the help of ChatGPT. Then, a combination of the sentiment lexicon and a dual-channel convolutional neural network is used to determine the polarity of texts. The evaluation results show that expanding the existing sentiment lexicons with the two proposed methods increases the accuracy of the sentiment analysis algorithm by 1.74 and 2.14 percent, respectively, which indicates the success of ChatGPT in determining the polarity of Persian colloquial expressions. In addition, using features extracted from the sentiment lexicon in a dual-channel convolutional neural network increases the precision of the two baseline models examined by 1.6 and 3.2 percent.
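The lexicon expansion step described above can be illustrated with a minimal Python sketch. It assumes the slang expressions have already been labeled as positive, negative, or neutral (for example by a language model such as ChatGPT). The function names `expand_lexicon` and `lexicon_score`, the label strings, the numeric scores, and the sample entries are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed mapping from textual polarity labels to numeric scores.
LABEL_TO_SCORE = {"positive": 1, "negative": -1, "neutral": 0}


def expand_lexicon(
    base: Dict[str, int],
    labeled_slang: Iterable[Tuple[str, str]],
) -> Dict[str, int]:
    """Return a copy of `base` extended with labeled slang entries.

    Entries already present in `base` keep their original score;
    entries with an empty phrase or an unrecognized label are skipped.
    """
    expanded = dict(base)
    for phrase, label in labeled_slang:
        key = phrase.strip()
        score = LABEL_TO_SCORE.get(label.strip().lower())
        if not key or score is None:
            continue
        expanded.setdefault(key, score)
    return expanded


def lexicon_score(tokens: List[str], lexicon: Dict[str, int]) -> int:
    """Sum the polarity scores of the tokens found in the lexicon."""
    return sum(lexicon.get(token, 0) for token in tokens)


# Illustrative entries only; not drawn from any real lexicon.
base_lexicon = {"خوب": 1, "بد": -1}
slang_labels = [
    ("باحال", "Positive"),
    ("ضایع", "negative"),
    ("خوب", "negative"),   # already in the base lexicon, so ignored
    ("چیز", "unknown"),    # unrecognized label, so skipped
]

expanded_lexicon = expand_lexicon(base_lexicon, slang_labels)
print(lexicon_score(["فیلم", "ضایع", "بود"], expanded_lexicon))
```

Keeping the base lexicon's score when a phrase appears in both sources is a design choice made here for illustration; a real pipeline might instead flag such conflicts for manual review.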

Article title [English]

Proposing an approach based on deep learning and sentiment lexicon for Persian sentiment analysis

Authors [English]

  • Samira Noferesti
  • Mahshid Miri
Faculty of Electrical and Computer Engineering, University of Sistan and Baluchestan, Zahedan, Iran.
Abstract [English]

Sentiment analysis is an important branch of natural language processing that aims to classify texts according to the feelings and attitudes of their author. In Persian, texts written on social networks are often short, unstructured, and full of slang and informal expressions. These features significantly reduce the performance of sentiment analysis algorithms. This paper presents a method based on deep learning and sentiment lexicons for sentiment analysis of Persian texts written on social networks. Since most existing Persian sentiment lexicons are small and lack slang and informal expressions, two ChatGPT-based methods are first proposed to expand the existing lexicons by adding slang expressions that are widely used on social media. Then, the sentiment lexicon is combined with a dual-channel convolutional neural network (DC-CNN) to determine the polarity of texts. Experimental results show that expanding the existing sentiment lexicons with the two proposed methods increases the accuracy of the sentiment analysis algorithm by 1.74 and 2.14 percent, respectively, which indicates the success of ChatGPT in polarity classification of Persian slang expressions. In addition, employing features extracted from the sentiment lexicon in a DC-CNN increases the precision of the two base models by 1.6 and 3.2 percent.
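One way to picture the dual-channel idea is as two parallel convolution-and-pooling branches, one over word embeddings and one over per-token lexicon polarity features, whose outputs are concatenated before classification. The sketch below is a plain-Python illustration under that assumption. The abstract does not specify the actual DC-CNN architecture, kernel sizes, or feature encoding, so every name, kernel, and number here is hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def lexicon_channel(tokens: List[str], lexicon: Dict[str, int]) -> Matrix:
    """Encode each token as a one-hot polarity vector [negative, neutral, positive]."""
    rows = []
    for token in tokens:
        score = lexicon.get(token, 0)
        rows.append([float(score < 0), float(score == 0), float(score > 0)])
    return rows


def conv1d_max(rows: Matrix, kernel: Matrix) -> float:
    """Slide a (width x dim) kernel over the rows, apply ReLU, and max-pool over time."""
    width = len(kernel)
    best = 0.0  # starting at 0.0 makes the running max act as ReLU followed by max pooling
    for start in range(len(rows) - width + 1):
        activation = sum(
            kernel[j][c] * rows[start + j][c]
            for j in range(width)
            for c in range(len(kernel[j]))
        )
        best = max(best, activation)
    return best


def dual_channel_features(
    emb_rows: Matrix,
    lex_rows: Matrix,
    emb_kernels: List[Matrix],
    lex_kernels: List[Matrix],
) -> List[float]:
    """Concatenate pooled features from the embedding and lexicon channels."""
    emb_part = [conv1d_max(emb_rows, k) for k in emb_kernels]
    lex_part = [conv1d_max(lex_rows, k) for k in lex_kernels]
    return emb_part + lex_part


# Toy inputs: three tokens with 2-dimensional stand-in embeddings.
tokens = ["a", "b", "c"]
toy_lexicon = {"b": -1, "c": 1}
emb_rows = [[0.5, -1.0], [1.0, 0.0], [0.0, 2.0]]
lex_rows = lexicon_channel(tokens, toy_lexicon)

emb_kernels = [[[1.0, 0.5]]]                        # one kernel of width 1
lex_kernels = [[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]]  # "negative then positive" pattern

features = dual_channel_features(emb_rows, lex_rows, emb_kernels, lex_kernels)
print(features)
```

In a trained model the kernels would be learned and the concatenated vector would feed a classifier layer; the fixed kernels here only show how the lexicon channel can contribute features that the embedding channel alone does not encode.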

Keywords [English]

  • Sentiment analysis
  • Polarity classification
  • Deep learning
  • Dual-channel CNN
  • Sentiment lexicon
  • Slang expressions