Use chardet by default to find the encoding #2040

Merged: 1 commit, Aug 27, 2023

Conversation

jere344 (Contributor) commented Aug 10, 2023

See #2022 for context

I tried a few different sources, and it did not seem to cause any issues.
I still exposed the encoding as an argument so that it can be overridden when chardet can't determine the encoding.
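As a rough illustration of the behaviour described above (detect by default, allow an explicit override), here is a minimal sketch. It is not the code from this PR: `decode_body`, its `encoding` and `detect` parameters, and the UTF-8 fallback are all hypothetical names and choices made for the example. The only library call assumed is `chardet.detect`, which returns a dict containing `encoding` and `confidence`.

```python
from typing import Callable, Optional


def _chardet_detect(raw: bytes) -> dict:
    # Imported lazily so this sketch loads even when chardet is not installed.
    import chardet

    return chardet.detect(raw)


def decode_body(
    raw: bytes,
    encoding: Optional[str] = None,
    detect: Callable[[bytes], dict] = _chardet_detect,
) -> str:
    """Decode raw response bytes, preferring an explicitly passed encoding.

    Hypothetical helper: the name and parameters are assumptions for
    illustration, not the project's API.
    """
    if encoding is None:
        # The detector result may carry encoding=None when nothing matches;
        # falling back to UTF-8 here is an assumption of this sketch.
        encoding = detect(raw).get("encoding") or "utf-8"
    # errors="replace" avoids raising on bytes that do not fit the encoding.
    return raw.decode(encoding, errors="replace")
```

A caller that knows a source serves GBK, for instance, could pass `encoding="gbk"` and skip detection entirely.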

zGadli (Contributor) commented Aug 10, 2023

Check whether the 69shu.com source works with this change.

jere344 (Contributor, Author) commented Aug 10, 2023

When I remove the custom get_soup, it works with 69shu.com without any issues.

jere344 (Contributor, Author) commented Aug 13, 2023

Chardet seems to make a lot of mistakes when detecting the encoding, so I'm not sure this should be merged. For example, it picked the wrong encoding here: https://chrysanthemumgarden.com/novel-tl/ygbg/ygbg-1/
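One possible way to limit the damage from wrong guesses, sketched here purely as an illustration and not something this PR implements, is to trust the detector only above a confidence threshold and otherwise use a fallback. `pick_encoding`, the `0.8` threshold, and the UTF-8 fallback are hypothetical; the input is assumed to be a dict shaped like the result of `chardet.detect`, with `encoding` and `confidence` keys.

```python
def pick_encoding(
    detection: dict,
    fallback: str = "utf-8",
    min_confidence: float = 0.8,
) -> str:
    """Return the detected encoding only when the detector is confident.

    Hypothetical helper: the threshold and fallback are arbitrary
    choices for illustration.
    """
    guess = detection.get("encoding")
    # Treat a missing or None confidence as zero confidence.
    confidence = detection.get("confidence") or 0.0
    if guess is None or confidence < min_confidence:
        return fallback
    return guess
```

A threshold like this does not make detection correct; it only decides when to ignore a weak guess, so a per-source explicit encoding may still be needed.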

dipu-bd merged commit 6e820d6 into dipu-bd:dev on Aug 27, 2023
0 of 5 checks passed
dipu-bd (Owner) commented Aug 27, 2023

I merged this to enable passing a custom encoding.
