Dhananjay Bhagat / Pranali Dhawas
Hate speech and harassment are widespread in online communication, owing to users' freedom and anonymity and the lack of regulation on social media platforms. As a result, cyber trolling and bullying have become major issues in society. To address this problem, we can use machine learning for hate speech detection to capture common properties from topic-generic datasets and transfer this knowledge to recognize specific manifestations of hate speech, using NLP, ML and data analysis. Our main goal is to apply an efficient model of this kind to text data and obtain accurate results. We use different machine learning and deep learning techniques, including multimodal approaches. We use a dataset that is divided into topic-specific subsets such as misogyny, sexism, racism, xenophobia and homophobia. Training a model on a combination of training sets from several topic-specific datasets is more effective than training a model on a topic-generic dataset. Data can be gathered from various sources, such as the YouTube API, the Twitter API, web scraping or government sources. Our aim is to perform preprocessing and exploratory data analysis on the collected data and derive conclusions from it.
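As a rough illustration of the preprocessing and exploratory analysis steps described above, the following minimal sketch cleans raw posts, merges several topic-specific datasets into one training pool, and computes simple summary counts. It is not the pipeline used in this work: the function names (`preprocess`, `combine`, `summarize`), the cleaning rules and the placeholder posts are all assumptions made for the example, and only the Python standard library is used.

```python
import re
from collections import Counter

# Hypothetical cleaning rules; a real pipeline would likely be more elaborate.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def preprocess(text):
    """Lowercase, drop URLs and @mentions, keep letters only, split on whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    return text.split()


def combine(datasets):
    """Merge topic-specific datasets into one list of (topic, tokens, label) rows."""
    rows = []
    for topic, samples in datasets.items():
        for text, label in samples:
            rows.append((topic, preprocess(text), label))
    return rows


def summarize(rows):
    """Basic exploratory counts: rows per topic, rows per label, token frequencies."""
    per_topic = Counter(topic for topic, _, _ in rows)
    per_label = Counter(label for _, _, label in rows)
    vocab = Counter(tok for _, tokens, _ in rows for tok in tokens)
    return per_topic, per_label, vocab


# Placeholder posts standing in for collected data (label 1 = hateful, 0 = not).
toy_datasets = {
    "sexism": [
        ("Placeholder post ONE, see https://example.com", 1),
        ("@someone placeholder post two!", 0),
    ],
    "racism": [
        ("Placeholder post three #tag", 1),
    ],
}

rows = combine(toy_datasets)
per_topic, per_label, vocab = summarize(rows)
print(per_topic)
print(per_label)
print(vocab.most_common(3))
```

The combined `rows` list keeps the topic alongside each sample, so class balance and vocabulary can be inspected per topic before any model is trained on the merged data.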