A Novel Intelligent Persian Authorship System based on Writing Style

Authors

Abstract

The rapid growth of Internet communication, together with misuse of the anonymity inherent in online written documents, has led to serious security issues. The anonymity of Internet tools such as email, blogs, and Web sites has made them attractive channels for criminal activity. At the same time, global social and political relations have generated great interest in the Persian language, leading to the spread of Persian documents on the Internet. In this paper, an intelligent writeprint technique is introduced to identify the author of a Persian text based on his or her writing style. We used (1) lexical, syntactic, and semantic features and (2) application-specific features to identify the Persian writer. We also evaluated (1) the impact of these features on performance and (2) the KNN and Delta classification methods, each combined with a genetic algorithm, on a database. To make the proposed approach implementable, we designed a part-of-speech (POS) tagger that detects Persian nouns, adjectives, and adverbs from word structure. The experimental results showed that the combination of KNN and the genetic algorithm is more accurate for Persian authorship recognition than the other methods tested.

Keywords