2021 Papers

Denoising distant supervision for ontology lexicalization using semantic similarity measures

Mehdi Jabalameli, Mohammad Ali Nematbakhsh, and Reza Ramezani

Ontology lexicalization aims to provide information about how the elements of an ontology are verbalized in a given language. Most ontology lexicalization techniques require labeled training data, which are usually generated automatically using the distant supervision technique. This technique is based upon the assumption that if a sentence contains two entities of a triple in a knowledge base, it expresses the relation stated in that triple.
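The distant supervision assumption described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `distant_label`, the relation name `birthPlace`, and the plain substring matching are all assumptions made for illustration. The second example sentence shows the kind of noisy label that a denoising step would need to filter out.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object); illustrative type alias


def distant_label(sentences: List[str], triples: List[Triple]) -> List[Tuple[str, str]]:
    """Label a sentence with a relation when it mentions both entities of a triple."""
    labeled = []
    for sentence in sentences:
        text = sentence.lower()
        for subj, relation, obj in triples:
            # Naive substring match; a real system would use entity linking.
            if subj.lower() in text and obj.lower() in text:
                labeled.append((sentence, relation))
    return labeled


triples = [("Barack Obama", "birthPlace", "Honolulu")]
sentences = [
    "Barack Obama was born in Honolulu.",        # expresses the relation
    "Barack Obama visited Honolulu last week.",  # matches, but is a noisy label
    "Honolulu is the capital of Hawaii.",        # only one entity, so not labeled
]
labeled = distant_label(sentences, triples)
```

Both of the first two sentences receive the `birthPlace` label, although only the first one actually expresses it.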

A novel similarity measure for spatial entity resolution based on data granularity model: Managing inconsistencies in place descriptions

Mohammad Khodizadeh-Nahari, Nasser Ghadiri, Ahmad Baraani & Jörg-Rüdiger Sack

Tremendous amounts of data are generated every day by different sources and stored in heterogeneous databases. Providing an integrated view by fusing these data is essential to enhance data utilization. Spatial data are an indispensable type of data, with diverse application domains including GIS, e-commerce, military, and tourism. The concept of location forms a key part of user-generated data and poses serious challenges, including uncertainty.

RMI-DBG algorithm: A more agile iterative de Bruijn graph algorithm in short read genome assembly

Zeinab Zare Hosseini, Shekoufeh Kolahdouz Rahimi, Esmaeil Forouzan, and Ahmad Baraani

The de Bruijn Graph (DBG) algorithm, one of the cornerstone algorithms in short read assembly, has been extended with the rapid advancement of Next Generation Sequencing (NGS) technologies and the low-cost production of millions of high-quality short reads. Erroneous reads, non-uniform coverage, and genomic repeats are three major problems that influence the performance of short read assemblers. To counter these problems, the iterative DBG algorithm applies multiple k-mers instead of a single k-mer, by iterating the de Bruijn graph over a range of k-mer sizes from the minimum to the maximum.
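As a rough illustration of iterating over a range of k-mer sizes, the sketch below builds a separate de Bruijn graph for each k. It is a simplified assumption-laden example, not the RMI-DBG algorithm: the function names and the `step` parameter are hypothetical, and a real iterative assembler would carry contigs forward from one k to the next rather than rebuilding each graph independently.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def kmers(read: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_dbg(reads: Iterable[str], k: int) -> Dict[str, Set[str]]:
    """Build a de Bruijn graph: nodes are (k-1)-mers, each k-mer is an edge."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])  # prefix -> suffix
    return dict(graph)


def iterate_k(reads: List[str], k_min: int, k_max: int, step: int = 1) -> Dict[int, Dict[str, Set[str]]]:
    """Build one graph per k, from the minimum to the maximum k-mer size."""
    return {k: build_dbg(reads, k) for k in range(k_min, k_max + 1, step)}


graphs = iterate_k(["ACGTAC"], k_min=3, k_max=5)
```

Small k values tolerate low coverage but tangle on repeats, while large k values resolve repeats but need higher coverage, which is the usual motivation for sweeping over several sizes.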

A survey on meta-heuristic algorithms for the influence maximization problem in the social networks

Zahra Aghaee, Mohammad Mahdi Ghasemi, Hamid Ahmadi Beni, Asgarali Bouyer, and Afsaneh Fatemi

The various interactions among users in social networks play a key role in how users affect one another. This effect is important when users can achieve their goals through these interactions. The study of the effect of specific users on other users has been modeled as the influence maximization problem in social networks. To solve this problem, different algorithms have been proposed, each of which attempts to improve on the influence spread and running time of the others.

An influence maximization algorithm based on community detection using topological features

Zahra Aghaee and Afsaneh Fatemi

Due to the growing use of social networks and of viral marketing in these networks, finding influential people to maximize information diffusion has become an important problem. This problem is known as the Influence Maximization Problem in social networks. Its main goal is to select a set of influential nodes that maximizes the influence spread in a social network. Researchers in this field have proposed different algorithms, but finding the influential people in the shortest possible time is still a challenge that attracts the attention of researchers.

A Weighted TF-IDF-based Approach for Authorship Attribution

Ali Abedzadeh, Reza Ramezani, and Afsaneh Fatemi

Authorship Attribution (AA) is a task in which a disputed text is automatically assigned to an author chosen from a list of candidate authors. To this end, a model is trained on a dataset of textual documents with known authors, which can be considered as a multi-class single-label classification task. In this paper, we approach this task differently by extending information retrieval techniques to train an AA model. It is based on weighting the AARR technique, presented in our previous study, to relax the value of term frequency.

ParSQuAD: Persian Question Answering Dataset based on Machine Translation of SQuAD 2.0

Negin Abadani, Jamshid Mozafari, Afsaneh Fatemi, Mohamadali Nematbakhsh, Arefeh Kazemi

Recent developments in Question Answering (QA) have improved state-of-the-art results, and various datasets have been released for this task. Since substantial English training datasets are available for this task, the majority of published work addresses English Question Answering. However, due to the lack of Persian datasets, less research has been done on Persian, making comparisons difficult. This paper introduces the Persian Question Answering Dataset (ParSQuAD) based on the machine translation of the SQuAD 2.0 dataset.

An emotion-aware music recommender system: bridging the user’s interaction and music recommendation

Saba Yousefian Jazi, Marjan Kaedi, and Afsaneh Fatemi

In emotion-aware music recommender systems, the user’s current emotion is identified and considered in recommending music to the user. We have two motivations to extend the existing systems: (1) To the best of our knowledge, the current systems first estimate the user’s emotions and then suggest music based on them. Therefore, the emotion estimation error affects the recommendation accuracy. (2) Studies show that the pattern of users’ interactions with input devices can reflect their emotions.

A language-independent authorship attribution approach for author identification of text documents

Reza Ramezani

In the Authorship Attribution (AA) task, the most likely author of textual documents, such as books, papers, news, text messages, and posts, is identified using statistical and computational methods. In this paper, a new computational approach is presented for identifying the most likely author of text documents. The proposed solution emphasizes lazy profile-based classification and, by using the Term Frequency-Inverse Document Frequency (TF-IDF) scheme, introduces a new measure for identifying important terms of documents. The importance of the terms is then used to calculate the similarity between an anonymous document and known documents.
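The general idea of lazy, profile-based attribution with TF-IDF term weights can be sketched as follows. This is only an illustration under stated assumptions: it uses a standard smoothed TF-IDF weighting with cosine similarity, not the new term-importance measure the paper introduces, and the function names and toy author profiles are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def idf_table(profiles: Dict[str, List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency over the author profiles."""
    n = len(profiles)
    df = Counter(term for tokens in profiles.values() for term in set(tokens))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}


def tf_idf(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Weight each term by relative frequency times IDF (0 for unseen terms)."""
    total = len(tokens) or 1
    return {t: (c / total) * idf.get(t, 0.0) for t, c in Counter(tokens).items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def most_likely_author(anonymous: List[str], profiles: Dict[str, List[str]]) -> str:
    """Lazy classification: no training step, compare against profiles at query time."""
    idf = idf_table(profiles)
    query = tf_idf(anonymous, idf)
    scores = {author: cosine(query, tf_idf(tokens, idf)) for author, tokens in profiles.items()}
    return max(scores, key=scores.get)


# Toy profiles: each author's known texts concatenated into one token list.
profiles = {
    "alice": "the whale swam in the deep sea".split(),
    "bob": "the market fell as the bank raised rates".split(),
}
author = most_likely_author("a whale in the sea".split(), profiles)
```

Because each author is represented by a single concatenated profile, adding a new candidate author only requires adding one entry to `profiles`.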

Exact and efficient reliability and performance optimization of synchronous task graphs

Reza Ramezani, Abolfazl Ghavidel, Yasser Sedaghat

SRAM-based FPGAs have found many applications in modern computer systems. In these systems, high-performance computing applications are executed as task graphs in which reliability and performance are crucial constraints. In this paper, an exact method is presented to efficiently optimize the reliability and performance of synchronous task graphs running on SRAM-based FPGAs in harsh environments.

Using ParsBert on Augmented Data for Persian News Classification

Mohammadreza Varasteh, Arefeh Kazemi

Text classification is a fundamental task in Natural Language Processing (NLP). Although much work has been done on text classification in English, the number of studies on Persian text classification is limited. Previous work on Persian text classification often uses classic machine learning methods such as Naive Bayes, Support Vector Machines, and Decision Trees. While these methods are fast and straightforward, they need feature engineering, and their performance heavily depends on the selected features.

ParSQuAD: Machine Translated SQuAD dataset for Persian Question Answering

Negin Abadani, Jamshid Mozafari, Afsaneh Fatemi, Mohammad Ali Nematbakhsh, Arefeh Kazemi

Recent advances in the field of Question Answering (QA) have improved state-of-the-art results. Due to the availability of rich English training datasets for this task, most results reported are for this language. However, due to the lack of Persian datasets, less research has been done for Persian, and the results are therefore hard to compare. In the present work, we introduce the Persian Question Answering Dataset (ParSQuAD) translated from the well-known SQuAD 2.0 dataset.