dash doesn't like the yahoo page #33

Open
gamag opened this issue Jun 28, 2017 · 4 comments

Comments

@gamag

gamag commented Jun 28, 2017

When using scrape_yahoo on Debian with sh linked to dash 0.5.7, the output file is empty most of the time.

Testing
sh -c 'echo "$(wget https://help.yahoo.com/kb/SLN23997.html -q -O -)"'
mostly gives an incomplete page (the output stops somewhere in the CSS styles). Without the echo and $(), nothing is missing.
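
One difference between the shells can be checked without the network: dash's built-in echo interprets backslash escapes such as \c (stop output), while bash's echo leaves them alone by default. Whether that is what truncates the Yahoo page is not confirmed in this thread. The sample string below is made up for illustration, and the sketch only compares how much of it survives echo versus printf '%s\n':

```shell
#!/bin/sh
# Made-up sample text containing a backslash escape (not taken from the real page).
page='color: red; content: "\c"; margin: 0'

# echo may interpret the backslash sequence, depending on which shell runs this.
via_echo=$(echo "$page")

# printf with a %s format prints the argument verbatim in any POSIX shell.
via_printf=$(printf '%s\n' "$page")

printf 'echo   kept %s characters\n' "${#via_echo}"
printf 'printf kept %s characters\n' "${#via_printf}"
```

Running it once with dash and once with bash shows whether the two counts differ on a given system.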

I assume this is due to some Unicode characters or null bytes that dash doesn't like. Changing the shebang to bash solves the problem for me. Piping wget directly into grep seems to work too, but maybe there is a cleaner solution.
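
A minimal sketch of the direct-pipe variant, so the page never passes through a shell variable or echo. The function name and the pattern are placeholders, not what scrape_yahoo actually uses:

```shell
#!/bin/sh
# Hypothetical helper: stream the download straight into grep.
# $1 = URL to fetch, $2 = extended regular expression to keep.
scrape_stream() {
    # -q silences wget's progress output, -O - writes the page to stdout.
    wget -q -O - "$1" | grep -E -- "$2"
}

# Example call (needs network access; the pattern is a placeholder):
#   scrape_stream 'https://help.yahoo.com/kb/SLN23997.html' 'yahoo'
```

The exit status is grep's, so a failed download simply looks like "no matches".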

@drybjed

drybjed commented Aug 19, 2017

I can confirm that this happens consistently. The shebang interpreter should be changed to /bin/bash to solve this issue. In the meantime, if anybody else is looking for a solution, you can run the scrape_yahoo script from cron via bash <path-to>/scrape_yahoo.

@stevejenkins
Owner

Hi, @drybjed. The original script was rewritten a while ago to rely on /bin/sh instead of Bash so that it could run on more systems. I can't replicate the issue: I'm running the script on a CentOS box, and /bin/sh seems to parse the Yahoo data fine.

Any other suggestions for fixing the problem for Debian users while still keeping the script as "universal" as possible?

@gamag
Author

gamag commented Oct 12, 2017

Downloading to a temporary file might be an option - wget/curl can write directly to files, grep can read from files, so no redirection from the shell would be needed.
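
Roughly what that could look like, as a sketch only. The function name is hypothetical, and mktemp is assumed to be available (it is not required by POSIX, though most systems ship it):

```shell
#!/bin/sh
# Hypothetical helper: download to a temporary file, then let grep read the file.
# $1 = URL to fetch, $2 = extended regular expression to keep.
scrape_via_tempfile() {
    url=$1
    pattern=$2
    tmpfile=$(mktemp) || return 1

    # wget writes directly to the file, so the page never enters a shell variable.
    if wget -q -O "$tmpfile" "$url"; then
        grep -E -- "$pattern" "$tmpfile"
        status=$?
    else
        status=1
    fi

    # Clean up whether or not the download or the match succeeded.
    rm -f "$tmpfile"
    return "$status"
}

# Example call (needs network access; the pattern is a placeholder):
#   scrape_via_tempfile 'https://help.yahoo.com/kb/SLN23997.html' 'yahoo'
```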

@gurubert

gurubert commented Oct 13, 2017

If /bin/sh != bash, then scrape_yahoo does not work, as it is a bash script and not, e.g., a dash script. If you are using the bash dialect in your script, you should use /bin/bash as the interpreter.

On CentOS /bin/sh is a symbolic link to /bin/bash.
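
To see what /bin/sh resolves to on a given box, something like the following should do. It assumes readlink supports -f (GNU coreutils and BusyBox do), and the function name is only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the real file a path points to after following
# every symbolic link. Defaults to /bin/sh when no argument is given.
resolve_sh() {
    readlink -f "${1:-/bin/sh}"
}

# Prints the resolved path of the system's /bin/sh.
resolve_sh /bin/sh
```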
