Training error when column to predict has more than 100 variants #120

Open
joelchen opened this issue Apr 17, 2022 · 2 comments

Comments

@joelchen
Contributor

When the column to predict has more than 100 variants for multiclass classification, training fails with the following error:

✅ Inferring train table columns. 6s
✅ Loading train table. 6s
✅ Shuffling. 0s 846ms
✅ Computing train stats. 10s
✅ Computing test stats. 2s
✅ Finalizing stats. 11s
error: invalid target column type
@nitsky
Contributor

nitsky commented Apr 17, 2022

Hi @joelchen, the default settings assume that a column with more than 100 non-numeric unique values is a text column, not an enum column. Training fails because a text column cannot be used as the target. You can force the CLI to treat your target column as an enum column using a config file.
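
For anyone else who hits this, here is a rough Python sketch of the heuristic as described in the comment above. It is not the CLI's actual code: the function name `infer_column_type`, the constant `MAX_ENUM_VARIANTS`, and the choice to label all-numeric columns as `"number"` are assumptions made for illustration. Only the 100-value threshold and the text-versus-enum outcome come from the comment.

```python
from typing import Iterable

# Threshold taken from the comment above: more than 100 unique
# non-numeric values means the column is treated as text.
MAX_ENUM_VARIANTS = 100


def _is_number(value: str) -> bool:
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_column_type(
    values: Iterable[str],
    max_enum_variants: int = MAX_ENUM_VARIANTS,
) -> str:
    """Hypothetical sketch of default column type inference.

    Assumption: empty strings are ignored and a column whose values
    are all numeric is a "number" column. Neither detail is confirmed
    by the thread.
    """
    unique = {v for v in values if v != ""}
    if unique and all(_is_number(v) for v in unique):
        return "number"
    # Too many distinct non-numeric values: assume free text.
    if len(unique) > max_enum_variants:
        return "text"
    return "enum"


if __name__ == "__main__":
    labels_100 = [f"class_{i}" for i in range(100)]
    labels_101 = [f"class_{i}" for i in range(101)]
    print(infer_column_type(labels_100))
    print(infer_column_type(labels_101))
```

Under these assumptions, a target column with 101 distinct labels is inferred as text rather than enum, which would presumably explain the `invalid target column type` error. Declaring the column as an enum in the config file bypasses the inference.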

@joelchen
Contributor Author

@nitsky Alright. The accuracy with 100 variants is low, and I have not retrained with the target column set to enum in a config file. Other users may run into this issue, though, so I will leave it to your team to decide whether there is room for improvement.
