Ali Hürriyetoğlu
2020
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020
Ali Hürriyetoğlu | Erdem Yörük | Vanni Zavarella | Hristo Tanev
Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report
Ali Hürriyetoğlu | Vanni Zavarella | Hristo Tanev | Erdem Yörük | Ali Safaya | Osman Mutlu
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020
We describe our effort on automated extraction of socio-political events from news in the scope of a workshop and a shared task we organized at the Language Resources and Evaluation Conference (LREC 2020). We believe that event extraction studies in computational linguistics and in the social and political sciences should further support each other in order to enable large-scale collection of socio-political event information across sources, countries, and languages. The event consists of two tracks: regular research papers and a shared task on event sentence coreference identification (ESCI). All submissions were reviewed by five members of the program committee. The workshop attracted research papers on the evaluation of machine learning methodologies, language resources, and material conflict forecasting, as well as a shared task participation report, all in the scope of socio-political event information collection. It has shown us the volume and variety of both the data sources and the event information collection approaches related to socio-political events, and the need to fill the gap between automated text processing techniques and the requirements of the social and political sciences.
Analyzing ELMo and DistilBERT on Socio-political News Classification
Berfu Büyüköz | Ali Hürriyetoğlu | Arzucan Özgür
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020
This study evaluates the robustness of two state-of-the-art deep contextual language representations, ELMo and DistilBERT, on supervised learning of binary protest news classification (PC) and sentiment analysis (SA) of product reviews. A "cross-context" setting is enabled using test sets that are distinct from the training data. The models are fine-tuned and fed into a Feed-Forward Neural Network (FFNN) and a Bidirectional Long Short-Term Memory network (BiLSTM). Multinomial Naive Bayes (MNB) and Linear Support Vector Machine (LSVM) are used as traditional baselines. The results suggest that DistilBERT can transfer generic semantic knowledge to other domains better than ELMo. DistilBERT is also 30% smaller and 83% faster than ELMo, which suggests it is the better choice for smaller computational training budgets. When generalization is not the top priority and the test domain is similar to the training domain, traditional machine learning (ML) algorithms can still be considered more economical alternatives to deep language representations.