Comment Toxicity Checker is a website that lets you type in a sentence and, using machine learning, get a rating of how "toxic" that sentence is.
Specifically, the website tells you to what degree your sentence is toxic by giving it a rating from 0% to 100%. It calculates this rating for each of six toxicity categories:
- toxic
- severe toxic
- obscene
- threat
- insult
- identity hate
Behind the scenes, the website uses a Natural Language Processing (NLP) model to predict the probability that a sentence belongs to each of the six toxicity categories.
Here's a highly simplified walk-through of how the model works:
- Tokenize the sentence and feed it into an NLP model.
- The model outputs the probability that the sentence falls into each of the six categories: toxic, severe toxic, obscene, threat, insult, and identity hate.
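To make the walkthrough concrete, here is a minimal sketch of that flow in Keras terms: a batch of token IDs goes in, and six independent sigmoid probabilities come out. The model path, the fixed sequence length of 100, and the dummy input are illustrative assumptions; the real training and inference code is linked further down in this README.

```python
import numpy as np
import tensorflow as tf

# Column names from the Kaggle dataset, one sigmoid output per label.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hypothetical saved Keras model; the project's real weights live in the
# TensorFlowLite file described later in this README.
model = tf.keras.models.load_model("toxicity_model.h5")

# Stand-in for a tokenized sentence: a batch of one sequence of 100 token IDs.
token_ids = np.zeros((1, 100), dtype=np.int32)

probs = model.predict(token_ids)[0]          # shape (6,)
for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.0%}")
```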
I built a text classification model using TensorFlow to predict probabilities for each toxicity class.
The model uses pretrained word embeddings, but I fine-tuned it on this Kaggle dataset.
If you would like to view the model training procedure, check out this Kaggle notebook.
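The exact architecture, embeddings, and hyperparameters live in that notebook; the sketch below only illustrates the general shape of the setup, assuming GloVe-style pretrained vectors, a bidirectional LSTM, and a six-way sigmoid output trained with binary cross-entropy on the Kaggle train.csv. The vocabulary size, sequence length, and embedding dimension are placeholder values, and the dictionary dump at the end is just one plausible way the saved tokenizer dictionary could have been produced.

```python
import json
from collections import Counter

import numpy as np
import pandas as pd
import tensorflow as tf

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
VOCAB, MAX_LEN, EMB_DIM = 20000, 100, 100    # placeholder sizes

# Load the Kaggle training file and build a simple word -> id dictionary
# (id 0 is reserved for padding / out-of-vocabulary words).
train = pd.read_csv("train.csv")
counts = Counter(w for text in train["comment_text"] for w in text.lower().split())
word_index = {w: i + 1 for i, (w, _) in enumerate(counts.most_common(VOCAB - 1))}

def encode(text: str) -> list[int]:
    ids = [word_index.get(w, 0) for w in text.lower().split()][:MAX_LEN]
    return ids + [0] * (MAX_LEN - len(ids))

X = np.array([encode(t) for t in train["comment_text"]])
y = train[LABELS].values

# Stand-in for pretrained word vectors (e.g. GloVe) aligned with word_index;
# the notebook loads the real vectors from an embeddings file.
embedding_matrix = np.random.normal(size=(VOCAB, EMB_DIM)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        VOCAB, EMB_DIM,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
    ),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(len(LABELS), activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
model.fit(X, y, batch_size=256, epochs=2, validation_split=0.1)

# Save the word -> id dictionary so the website can tokenize sentences the
# same way (see model/tokenizer_dictionary.json below).
with open("tokenizer_dictionary.json", "w") as f:
    json.dump(word_index, f)
```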
After training the model, I converted it into the TensorFlowLite format (an efficient TensorFlow model format designed for edge devices) and stored it in this repository in the model directory.
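The conversion itself only takes a few lines with the TensorFlow Lite converter. The snippet below continues from the training sketch above (it reuses its model variable), and the output filename is illustrative rather than the exact one used in the model directory.

```python
import tensorflow as tf

# Convert the trained Keras model to the TensorFlowLite flatbuffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Recurrent layers sometimes need the TensorFlow-ops fallback to convert.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]

tflite_bytes = converter.convert()
with open("model/toxicity.tflite", "wb") as f:   # illustrative output path
    f.write(tflite_bytes)
```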
Through the TensorFlowJS API, the website is able to load the model from this repository.
Tokenization is performed on the client itself before the sentence is passed to the model; this is possible because the tokenizer dictionary is saved alongside the TensorFlowLite model - you can find it at model/tokenizer_dictionary.json.
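Concretely, client-side tokenization amounts to looking each word up in that dictionary and padding the result to a fixed length. The sketch below shows the equivalent steps in Python for readability; the website performs them in JavaScript, and the out-of-vocabulary ID, padding scheme, and sequence length shown here are assumptions.

```python
import json

import numpy as np

MAX_LEN = 100   # assumed fixed input length of the model

# The saved word -> integer-id mapping shipped with the model.
with open("model/tokenizer_dictionary.json") as f:
    word_index = json.load(f)

def tokenize(sentence: str) -> np.ndarray:
    ids = [word_index.get(w, 0) for w in sentence.lower().split()][:MAX_LEN]
    ids += [0] * (MAX_LEN - len(ids))        # pad with zeros to MAX_LEN
    return np.array([ids], dtype=np.int32)   # batch of one sequence

# This id sequence is what gets handed to the TensorFlowLite model.
print(tokenize("thanks, this was really helpful").shape)   # (1, 100)
```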
Any contributions are welcome, whether it's a brand-new feature or a typo fix.
If you would like to see a feature implemented, raise an Issue.
If you want to contribute to the project, feel free to send a PR.