PropagandaDetectionNLP

SEMEVAL 2020 TASK 11 “DETECTION OF PROPAGANDA TECHNIQUES IN NEWS ARTICLES”

Propaganda is commonly defined as information of a biased or misleading nature, possibly purposefully shaped, to promote an agenda or a cause. This project builds a machine learning system for the detection of propaganda techniques in news articles. It addresses two subtasks: Span Identification (SI) and Technique Classification (TC). Our system ranked 17th on the leaderboard for the SI task and 20th for the TC task.

The propaganda detection pipeline includes two subtasks.
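
The way the two subtasks chain together can be sketched as follows. This is a minimal illustration, not this repository's code: it assumes spans are represented as character offsets into the article text, and `Span`, `identify_spans`, `classify_span`, and `run_pipeline` are hypothetical names. The phrase matching and the repetition rule are toy stand-ins for trained models.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    start: int                       # inclusive character offset
    end: int                         # exclusive character offset
    technique: Optional[str] = None  # filled in by the TC step


def identify_spans(text: str, trigger_phrases: List[str]) -> List[Span]:
    """Toy SI step: mark every occurrence of a trigger phrase (assumption:
    a real system would use a trained sequence tagger instead)."""
    spans = []
    for phrase in trigger_phrases:
        pos = text.find(phrase)
        while pos != -1:
            spans.append(Span(pos, pos + len(phrase)))
            pos = text.find(phrase, pos + 1)
    return sorted(spans, key=lambda s: s.start)


def classify_span(text: str, span: Span) -> str:
    """Toy TC step: a placeholder rule standing in for a 14-class classifier."""
    fragment = text[span.start:span.end]
    return "Repetition" if text.count(fragment) > 1 else "Loaded_Language"


def run_pipeline(text: str, trigger_phrases: List[str]) -> List[Span]:
    """SI produces spans; TC then assigns one technique label to each span."""
    spans = identify_spans(text, trigger_phrases)
    for span in spans:
        span.technique = classify_span(text, span)
    return spans


if __name__ == "__main__":
    article = "A total disaster. Those traitors caused a total disaster."
    for s in run_pipeline(article, ["total disaster", "traitors"]):
        print(s.start, s.end, s.technique, repr(article[s.start:s.end]))
```

The point of the split is that the TC step only ever sees fragments that the SI step (or the gold annotations) has already located, so each stage can be developed and evaluated on its own.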

14-class distribution

Task 2 (Technique Classification) is a 14-class classification task. The distribution among the classes is shown below. The dataset is highly imbalanced.
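
One common way to account for this kind of imbalance is to weight each class inversely to its frequency during training. The sketch below illustrates the idea; it is not confirmed to be the approach used in this repository. The helper `class_weights` is hypothetical, and the label names and counts are made up for illustration rather than taken from the dataset.

```python
from collections import Counter
from typing import Dict, List


def class_weights(labels: List[str]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


if __name__ == "__main__":
    # Made-up toy distribution (assumption, not the real class counts).
    toy_labels = ["Loaded_Language"] * 6 + ["Repetition"] * 3 + ["Slogans"] * 1
    for label, weight in sorted(class_weights(toy_labels).items(), key=lambda kv: kv[1]):
        print(f"{label}: {weight:.2f}")
```

Weights of this form can be passed to a weighted loss function so that errors on rare techniques cost more than errors on frequent ones.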

Word Cloud – Propaganda spans from the training dataset

Many propaganda spans include words such as "god", "church", and "Muslim". This suggests that religion is a frequent theme in propaganda.

Baseline Architecture

Final Architecture

Leaderboard Results

Team Information

SI task

TC task

Tools and Technologies

How to use this repository

Dataset Detail

References