This repository has been archived by the owner on Jul 15, 2023. It is now read-only.

Preprocess: u'geoNetwork.networkDomain', u'geoNetwork.networkLocation', u'geoNetwork.region', u'geoNetwork.subContinent', #75

Open
wesleytian opened this issue Oct 7, 2018 · 1 comment
Assignees
Labels
easy preprocessing Preprocessing for data matrix

Comments

@wesleytian

Preprocess the following features:

u'geoNetwork.networkDomain',
u'geoNetwork.networkLocation',
u'geoNetwork.region',
u'geoNetwork.subContinent',

  1. Standardization: http://scikit-learn.org/stable/modules/preprocessing.html#standardization-or-mean-removal-and-variance-scaling

  2. Impute missing values: http://scikit-learn.org/stable/modules/impute.html

  3. Normalization: http://scikit-learn.org/stable/modules/preprocessing.html#normalization

  4. Encode categorical features (optional): http://scikit-learn.org/stable/modules/preprocessing.html#encoding-categorical-features

  5. Discretization (optional): http://scikit-learn.org/stable/modules/preprocessing.html#discretization

http://scikit-learn.org/stable/modules/preprocessing.html
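As a rough starting point, the steps above could be chained in a single scikit-learn pipeline. This is only a sketch under assumptions: the four features are treated as string categories, so imputation and encoding run first and the scaling steps operate on the encoded columns. The toy rows, their values, and the use of `np.nan` as the missing-value marker are hypothetical; the real data may mark missing entries differently.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer, OneHotEncoder, StandardScaler

# Hypothetical toy rows. Column order follows the issue:
# networkDomain, networkLocation, region, subContinent
X = np.array(
    [
        ["comcast.net", "loc-a", "California", "Northern America"],
        [np.nan, "loc-a", "England", "Northern Europe"],
        ["comcast.net", np.nan, "California", "Northern America"],
    ],
    dtype=object,
)

pipeline = make_pipeline(
    # Step 2: fill missing entries with the most frequent value per column
    SimpleImputer(strategy="most_frequent"),
    # Step 4: one-hot encode; categories unseen at fit time become all zeros
    OneHotEncoder(handle_unknown="ignore"),
    # Step 1: scale to unit variance (no centering, so sparse input is kept)
    StandardScaler(with_mean=False),
    # Step 3: rescale each row to unit L2 norm
    Normalizer(),
)

X_t = pipeline.fit_transform(X)  # sparse matrix, one row per input record
print(X_t.shape)
```

Whether scaling one-hot columns helps depends on the downstream model, so the last two steps may be worth dropping or reordering.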

@wesleytian added the easy and preprocessing (Preprocessing for data matrix) labels Oct 7, 2018
@mengqiuteng self-assigned this Oct 8, 2018
@mengqiuteng

If multiple records with different geoNetwork values are found for a given fullVisitorID, how should we deal with this? Do we just take the majority value?
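For reference, the majority-value option could look roughly like the sketch below. It is not a decided approach. The helper name `majority_by_visitor`, the record layout (a list of dicts), and the sample values are all hypothetical. Ties go to the value seen first for that visitor.

```python
from collections import Counter, defaultdict


def majority_by_visitor(records, key="fullVisitorID", field="geoNetwork.region"):
    """Return {visitor id: most common value of `field`} (hypothetical helper)."""
    counts = defaultdict(Counter)
    for row in records:
        counts[row[key]][row[field]] += 1
    # most_common(1) keeps first-seen order among equal counts (Python 3.7+)
    return {vid: c.most_common(1)[0][0] for vid, c in counts.items()}


# Hypothetical sample records
records = [
    {"fullVisitorID": "1", "geoNetwork.region": "California"},
    {"fullVisitorID": "1", "geoNetwork.region": "Texas"},
    {"fullVisitorID": "1", "geoNetwork.region": "California"},
    {"fullVisitorID": "2", "geoNetwork.region": "England"},
]
majority = majority_by_visitor(records)
print(majority)
```

The same helper can be called once per geoNetwork column by changing `field`.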
