Document menu scraping #33

qinghao1 · 2020-08-02T13:22:18Z

Hi there, I don't think the code to scrape the OHS menu is in this repo, and the scraper that is here is written for MongoDB. Could you add the code you're using, either here or in a separate repo?

jonchan51 · 2020-08-02T13:32:24Z

Hi, you can refer to this branch for a psql version. I didn't merge it in because the way we were scraping the PDFs was rather awkward: the OHS filenames weren't standardized. You'll need to modify menu_downloader.py to make it work with the new file names they are now using on the website.

qinghao1 · 2020-08-02T13:51:09Z

Thanks! I see. It might be better to get the URL by parsing https://uci.nus.edu.sg/ohs/current-residents/students/daily-menu-2/. I'll see if I can help you with that if I have the time haha
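
A rough sketch of what that parsing could look like, using only the standard library. This is not the code from the branch or the PR; `PdfLinkParser` and `extract_pdf_links` are hypothetical names, and it assumes the menu PDFs appear on the page as ordinary `<a href="...pdf">` links, which I haven't verified against the live page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Page mentioned in this thread; the markup on it is an assumption.
MENU_PAGE_URL = "https://uci.nus.edu.sg/ohs/current-residents/students/daily-menu-2/"


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of every <a> tag whose href ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_pdf_links(html, base_url=MENU_PAGE_URL):
    """Return the PDF links found in `html`, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

The page itself would be fetched separately (for example with `urllib.request.urlopen`) and its decoded body passed to `extract_pdf_links`, so the downloader no longer has to guess the file names.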

qinghao1 · 2020-08-06T13:41:37Z

Just an update on this, I think they actually changed the URLs (https://uci.nus.edu.sg/ohs/current-residents/students/daily-menu-2/). I'm going to work on the parsing but please let me know if you have already found another way around it! Thanks

moziliar · 2020-08-11T11:32:12Z

Hi @qinghao1, may I know if the issue has been resolved? I was only in charge of the bot and am not very up to date with the scraper done by my teammates.

qinghao1 · 2020-08-11T12:09:38Z

Hi there, I think it hasn't been fixed, but this PR should provide everything you need to fix it. That's on the scraper side though, so I don't think it has anything to do with the bot itself.

moziliar · 2020-08-11T12:11:49Z

Thanks. I just tried running the scraper on the CentOS container again, and the lru seems to be breaking on it without a meaningful error message. I replaced it with a normal dict and it still doesn't work. May I have your input on this?

qinghao1 · 2020-08-11T12:16:54Z

What's the error that you're seeing? It might be the case that the OHS website is blocked. Maybe try running it locally?
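
Something like this could help tell a blocked site apart from a bug in the scraper. It's only a sketch: `check_reachable` is a hypothetical helper, not part of the repo, and the `opener` parameter is there purely so the function can be exercised without network access.

```python
import urllib.error
import urllib.request

# Page mentioned earlier in this thread.
MENU_PAGE_URL = "https://uci.nus.edu.sg/ohs/current-residents/students/daily-menu-2/"


def check_reachable(url=MENU_PAGE_URL, timeout=10, opener=urllib.request.urlopen):
    """Return (ok, detail) describing whether `url` could be fetched."""
    try:
        with opener(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 403 or 404).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return False, f"connection failed: {exc}"
```

Running `check_reachable()` both inside the container and locally and comparing the results should show whether the container's network is the problem.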

qinghao1 · 2020-08-11T12:17:22Z

Also, you'd have to install lru with pipenv.

moziliar · 2020-08-11T12:19:00Z

I did run pipenv install with lru included, but the installation just outputs a stack trace without a meaningful error message.

qinghao1 · 2020-08-11T12:20:54Z

I guess you could just replace it with a normal dict. I don't think memory usage will be a problem under normal use.
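
If you'd still like a size cap without the extra dependency, here is a minimal pure-Python sketch built on `collections.OrderedDict`. `SimpleLRU` is a hypothetical name and only covers dict-style get/set/contains; I haven't checked which other methods of the lru package the scraper actually calls, so treat it as a starting point rather than a drop-in replacement.

```python
from collections import OrderedDict


class SimpleLRU:
    """Dict-like cache that evicts the least recently used key when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        value = self._data[key]       # raises KeyError if the key is missing
        self._data.move_to_end(key)   # mark as most recently used
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used key


cache = SimpleLRU(capacity=2)
cache["mon"] = "menu for Monday"
cache["tue"] = "menu for Tuesday"
cache["mon"]                        # touching "mon" makes "tue" the oldest entry
cache["wed"] = "menu for Wednesday"  # evicts "tue"
```

Since this is pure Python, it avoids whatever is failing during the pipenv install of lru.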
