Sinhalese Language Based Hate Speech Detection and Intent Analysis on Tweets

• Researched developing a tool to detect Sinhalese language-based hate speech and analyze the intention of tweets.
• Semantic analysis and data mining techniques used in addressing the problem.
• Technologies used: Python, NLTK, Pandas, NumPy, Scikit-learn, PyCharm. 

Mobirise
Mobirise

Text Classification Module

 Researched Sinhalese Language Based Short Text Classification
      to Improve Information Filtering

Module 2: Extract Sinhala twitter texts, Sinhala text preprocessing and Categorize tweet content considering Subject and Domain .

My contribution for the project was implementing the text preprocessing module of which outcomes are used as inputs for analysis, process of classifying sinhala texts found on Twitter into 5 primary domains(Religion, Racist, Political, Sports, Sexism and Others) and further classifying Sinhala texts into 3 subjective categories(Opinion, News, Emotion)

Purpose of implementing this module is to further analyze hate according to particular domain and subject classes to define severity level of the twitter text content. 

Twitter image text analysis

Categories tweets according to context

Hate detection and intent analysis

© Copyright 2022 Isurie K. Liyanage - All Rights Reserved