Want to explore the data associated with your Google account?
Google has a data export tool, but it delivers the data inside HTML files. This project:
- Converts your Search history into a CSV for your own EDA, model building, and miscellaneous use.
- Clusters your Google searches into a specified number of topics and shows the words that comprise each topic.

To get started:
- Visit Google Takeout and request a copy of your data.
- Install the required libraries.
- Change `path` to your Google HTML file.
- Change `OUTPUT_FILE` to your desired CSV name.
- Optional: specify a start date and end date in the `ModelData` class to see search query topics from a previous time period.
- Run

The resulting data includes the following fields:
- Log type
- Query (raw)
- Date
- URL
- Location (coordinates)
- Day
- Desktop/Mobile
- Site
- Location (address)
- Query (cleaned for topic modeling)
* Note: Only searches have location data
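
As a rough illustration of the HTML-to-CSV step, the sketch below pulls a few of the fields above out of activity-style HTML using only the Python standard library. It is not the project's own code: the `content-cell` markup, the `SAMPLE_HTML` snippet, the `ActivityParser` class, and the column names are all assumptions made for the example, and the real Takeout export may be structured differently.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample; the real Takeout HTML may use different markup.
SAMPLE_HTML = """
<div class="content-cell">Searched for <a href="https://www.google.com/search?q=python+csv">python csv</a><br>Jan 5, 2020, 9:15:02 AM</div>
<div class="content-cell">Visited <a href="https://docs.python.org/3/">Python docs</a><br>Jan 6, 2020, 8:00:00 PM</div>
"""

FIELDS = ["log_type", "query", "url", "date"]


class ActivityParser(HTMLParser):
    """Collects one row per <div class="content-cell"> block."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # row currently being filled, if any
        self._in_link = False  # True while inside an <a> tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "content-cell":
            self._row = dict.fromkeys(FIELDS, "")
        elif tag == "a" and self._row is not None:
            self._row["url"] = attrs.get("href") or ""
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "div" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        text = data.strip()
        if self._row is None or not text:
            return
        if self._in_link:
            self._row["query"] = text                # link text is the raw query
        elif not self._row["log_type"]:
            self._row["log_type"] = text.split()[0]  # e.g. "Searched", "Visited"
        else:
            self._row["date"] = text                 # text after the link


def rows_to_csv(rows):
    """Serialize the parsed rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


parser = ActivityParser()
parser.feed(SAMPLE_HTML)
csv_text = rows_to_csv(parser.rows)
print(csv_text)
```

Writing to an in-memory buffer keeps the example self-contained; to produce a file, open your `OUTPUT_FILE` path and pass the file handle to `csv.DictWriter` instead.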

- `ProcessGoogleData` class: creates the CSV from Google's HTML
- `GenerateFeatures` class: builds additional useful features
- `ModelData` class: creates NMF or LDA topic models from the Google Search queries*
* Only tested on Google search since I rarely use the others.