
Use keywords, the TLD, and other attributes of a domain name to classify whether the domain's content is pornographic.


themains/keyword_porn


Classifying Pornographic Domains Using Keywords and Domain Suffixes

We build a model that predicts whether a domain carries pornographic content, using a short list of keywords and a list of domain suffixes. To build the model, we use data from Shallalist, which maintains a database of the content categories of domains. The method is described in Where's the Porn? Classifying Porn Domains Using a Calibrated Keyword Classifier.

Using the Shallalist data, the list of keywords, and the list of domain suffixes, the classifier achieves an accuracy of nearly 80%.
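To illustrate the general idea, here is a minimal sketch of how a domain name could be turned into keyword and suffix indicator features and scored with a logistic function. It is not the repository's implementation: the keyword list, suffix list, weights, bias, and the function names `domain_features` and `predict_proba` are all hypothetical and chosen only for illustration. A real model would estimate its weights from labeled data such as Shallalist.

```python
import math

# Hypothetical keyword and suffix lists (assumptions, not the repository's lists).
KEYWORDS = ["porn", "sex", "xxx"]
SUFFIXES = ["com", "net", "xxx"]


def domain_features(domain):
    """Map a domain name to binary keyword and suffix indicator features."""
    name = domain.lower().strip(".")
    if "." in name:
        # Split off the last label as the suffix; search keywords in the rest.
        body, suffix = name.rsplit(".", 1)
    else:
        body, suffix = name, ""
    features = {f"kw_{k}": int(k in body) for k in KEYWORDS}
    features.update({f"tld_{s}": int(suffix == s) for s in SUFFIXES})
    return features


def predict_proba(features, weights, bias):
    """Return a logistic score from a linear combination of the features."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Made-up weights for illustration only; not estimated from any data.
WEIGHTS = {"kw_porn": 3.0, "kw_sex": 2.0, "kw_xxx": 2.5, "tld_xxx": 2.0}
BIAS = -2.0

if __name__ == "__main__":
    for d in ["freeporn.com", "example.org"]:
        print(d, round(predict_proba(domain_features(d), WEIGHTS, BIAS), 3))
```

Substring matching of this kind is deliberately crude: a keyword can appear inside an unrelated word, which is one reason a keyword classifier of this type is not expected to be perfectly accurate.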
