by Aansh Shah, Eric Liu, Heila Precel, Zander Chase
Initially developed at a hackathon, Stop Harassment is a Chrome extension that hides harassing content across common social media platforms. At the start of this project it supported only YouTube and Twitter, and its harassment classifier was not especially robust: it required people to list the specific root words they experience harassment around.
The goal of this semester’s project at Brown was to support additional social platforms, improve the harassment classification, add more resources and reporting options, and get the extension ready for release on the Chrome Web Store.
We were paired with Uplift-Together. Uplift is dedicated to combating sexual abuse in online communities through education and advocacy. They work to ensure that these flourishing communities are safe for the millions of people who connect through them.
Through the Menu Bar, you can enable or disable the extension. When the Selection Tool is enabled, content that our extension flags as harassment is highlighted. At the bottom of each highlighted post, you can block the user from your newsfeed or get more detailed instructions on how to report them.
Users can subscribe to premade word lists containing the words they want filtered out of their feeds.
This page manages the words you have chosen to filter out, both custom-added words and words from the premade word lists. You can also toggle between blocking any comment that contains one of these words outright and passing such comments through our classifier first, as sketched below.
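To make the two modes concrete, here is a minimal TypeScript sketch of that filtering decision. The names (`WordListSettings`, `classifyToxicity`) and the 0.5 score threshold are illustrative assumptions, not the extension's actual code.

```typescript
// Illustrative sketch of the two filtering modes described above.
interface WordListSettings {
  subscribedWords: Set<string>; // words from custom entries and premade lists
  blockOutright: boolean;       // true = hide any match, false = defer to the classifier
}

async function shouldFilterComment(
  text: string,
  settings: WordListSettings,
  classifyToxicity: (text: string) => Promise<number>, // toxicity score in [0, 1]
): Promise<boolean> {
  const words = text.toLowerCase().split(/\s+/);
  const hasMatch = words.some((w) => settings.subscribedWords.has(w));
  if (!hasMatch) return false;              // no subscribed word: leave the comment alone
  if (settings.blockOutright) return true;  // mode 1: block every comment containing a match
  return (await classifyToxicity(text)) > 0.5; // mode 2: filter only if the classifier agrees
}
```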
Users that you have blocked or whitelisted can be managed here.
When the Selection Tool is enabled and a comment containing a word you have subscribed to in a word list is classified as toxic, the comment is blacked out. You then have the option of blocking the user, whitelisting the user, or using our simplified reporting feature.
When the Selection Tool is disabled and a comment containing a word you have subscribed to in a word list is classified as toxic, the comment disappears from your feed.
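Below is a rough TypeScript sketch of how a content script might handle these two states. The DOM manipulation and button labels are assumptions for illustration, not the extension's actual implementation.

```typescript
// Illustrative content-script logic for the two Selection Tool states.
function handleToxicComment(commentEl: HTMLElement, selectionToolEnabled: boolean): void {
  if (selectionToolEnabled) {
    // Black out the comment but keep it in place so the user can act on it.
    commentEl.style.backgroundColor = "black";
    commentEl.style.color = "black";
    commentEl.appendChild(buildActionBar());
  } else {
    // Remove the comment from the feed entirely.
    commentEl.style.display = "none";
  }
}

function buildActionBar(): HTMLElement {
  const bar = document.createElement("div");
  for (const label of ["Block user", "Whitelist user", "Report"]) {
    const button = document.createElement("button");
    button.textContent = label;
    bar.appendChild(button);
  }
  return bar;
}
```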
We built a robust toxicity classifier trained on millions of comments using a Long Short-Term Memory (LSTM) model. While designing the classifier, we tried numerous models, including Bag of Words and Convolutional Neural Networks. Due to a lack of publicly available toxicity datasets, we created our own datasets to train our models, achieving an accuracy of approximately 96%.
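As an illustration of this kind of architecture, here is a minimal LSTM text classifier sketched with TensorFlow.js. The vocabulary size, sequence length, and layer sizes are assumptions made for the example; they are not the parameters of our actual model.

```typescript
import * as tf from "@tensorflow/tfjs";

// Assumed preprocessing parameters for the sketch.
const VOCAB_SIZE = 20000; // tokens kept in the vocabulary
const MAX_LEN = 100;      // comments padded/truncated to this length

// Embedding -> LSTM -> sigmoid output gives a binary toxicity probability.
const model = tf.sequential();
model.add(tf.layers.embedding({ inputDim: VOCAB_SIZE, outputDim: 128, inputLength: MAX_LEN }));
model.add(tf.layers.lstm({ units: 64 }));
model.add(tf.layers.dense({ units: 1, activation: "sigmoid" })); // P(toxic)

model.compile({
  optimizer: "adam",
  loss: "binaryCrossentropy",
  metrics: ["accuracy"],
});

// Training would then pad/tokenize each comment and call, e.g.:
// await model.fit(paddedTokenIds, labels, { epochs: 3, validationSplit: 0.1 });
```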
To serve toxicity classifications to our Chrome extension, we built a scalable RESTful API that handles thousands of concurrent queries.
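A hypothetical example of how the extension could query such an API from a content script is shown below; the endpoint URL and response shape are assumptions, not our actual interface.

```typescript
interface ToxicityResponse {
  toxic: boolean;
  score: number; // model confidence in [0, 1]
}

// Send a comment's text to the classification endpoint and parse the result.
async function classifyComment(text: string): Promise<ToxicityResponse> {
  const response = await fetch("https://example.com/api/classify", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!response.ok) {
    throw new Error(`Classification request failed: ${response.status}`);
  }
  return (await response.json()) as ToxicityResponse;
}
```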