
Text Mining to Reveal Gender Bias in Parenting books using Word2Vec


xiaofanliang/TextMining


Text Mining to Reveal Gender Biases in Parenting Books

Text mining and digitized text data have fueled the rise of natural language processing. People are eager to teach machines to master human language and have already started building technology on machine learning models trained on text data. However, human language itself carries many biases (e.g., gender, race, age), and without intervention, machine learning predictions will reproduce the same biases in their output.

This project, the final for my machine learning class, explores this space by learning how to train Word2Vec, a popular word-vectorization model, and by examining whether parenting books associate different "gendered" language with "mom" and "dad". I will

  1. Explain the Word2Vec algorithm and how it works
  2. Train a Word2Vec model on the text dataset of six parenting books
  3. Experiment with the model's basic functionality (e.g., word similarity scores)
  4. Experiment with different parametrizations
  5. Output words along the gender (dad-mom) axis and examine whether the text data has gender bias.
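To make the last step concrete, here is a minimal sketch of how words could be ranked along a dad-mom axis once word vectors are available. It is not the project's actual code: the function names (`cosine_similarity`, `gender_axis`, `rank_along_axis`) and the three-dimensional `toy_vectors` are made up for illustration, and real vectors would come from the Word2Vec model trained on the parenting books.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_axis(vectors, male="dad", female="mom"):
    """Direction pointing from the 'mom' vector toward the 'dad' vector."""
    return [m - f for m, f in zip(vectors[male], vectors[female])]


def rank_along_axis(vectors, axis, exclude=()):
    """Sort words from most 'mom'-leaning to most 'dad'-leaning."""
    scores = {
        word: cosine_similarity(vec, axis)
        for word, vec in vectors.items()
        if word not in exclude
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical toy vectors, NOT taken from the trained model.
toy_vectors = {
    "dad": [1.0, 0.2, 0.0],
    "mom": [-1.0, 0.2, 0.0],
    "work": [0.6, 0.5, 0.1],
    "nurse": [-0.7, 0.4, 0.2],
    "sleep": [0.0, 1.0, 0.3],
}

axis = gender_axis(toy_vectors)
ranking = rank_along_axis(toy_vectors, axis, exclude=("dad", "mom"))

for word, score in ranking:
    print(f"{word:>6}  {score:+.3f}")
```

A negative score means the word points toward "mom", a positive score toward "dad", and a score near zero means the word is roughly neutral along this axis. Cosine similarity is used here so that word frequency (which tends to affect vector length) does not dominate the ranking; a raw dot product would be another reasonable choice.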

You can access the code here
