Gateway timeout #140

raffaem · 2020-03-31T09:16:55Z

During an

ab = AbstractRetrieval(scopus_id)

instruction, the following exception was thrown by requests:

Traceback (most recent call last):
File "C:\InstalledPrograms\Anaconda3\lib\site-packages\pybliometrics\scopus\utils\get_content.py", line 76, in get_content
error_type = errors[resp.status_code]
KeyError: 504

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "3_download_affiliations.py", line 63, in <module>
ab = AbstractRetrieval(scopus_id)
File "C:\InstalledPrograms\Anaconda3\lib\site-packages\pybliometrics\scopus\abstract_retrieval.py", line 617, in __init__
api='AbstractRetrieval', refresh=refresh, view=view)
File "C:\InstalledPrograms\Anaconda3\lib\site-packages\pybliometrics\scopus\classes\retrieval.py", line 60, in __init__
Base.__init__(self, qfile, refresh, params=params, url=url)
File "C:\InstalledPrograms\Anaconda3\lib\site-packages\pybliometrics\scopus\classes\base.py", line 106, in __init__
content = get_content(url, params, *args, **kwds).text.encode('utf-8')
File "C:\InstalledPrograms\Anaconda3\lib\site-packages\pybliometrics\scopus\utils\get_content.py", line 83, in get_content
resp.raise_for_status()
File "C:\InstalledPrograms\Anaconda3\lib\site-packages\requests\models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: GATEWAY_TIMEOUT for url: https://api.elsevier.com/content/abstract/scopus_id/85043991328?view=META_ABS

I propose that pybliometrics catch this exception and retry the download automatically. Maybe let the user set a maximum number of retries before pybliometrics throws an exception?
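In the meantime, something like this could be done on the caller's side. This is only a rough sketch, not anything pybliometrics provides: `call_with_retries`, `is_gateway_timeout`, and the parameters `max_retries` and `wait_seconds` are hypothetical names. It assumes the raised exception carries a `response` attribute with a `status_code`, as `requests.exceptions.HTTPError` does in the traceback above.

```python
import time


def call_with_retries(func, is_retryable, max_retries=3, wait_seconds=1.0,
                      sleep=time.sleep):
    """Call func(); if it raises a retryable exception, wait and try again.

    Hypothetical helper, not part of pybliometrics. Once max_retries retries
    have been used up, the last exception is re-raised unchanged.
    """
    attempt = 0
    while True:
        try:
            return func()
        except Exception as exc:
            if attempt >= max_retries or not is_retryable(exc):
                raise
            attempt += 1
            # Linear back-off: wait a little longer after each failed attempt.
            sleep(wait_seconds * attempt)


def is_gateway_timeout(exc):
    """True if exc carries an HTTP response whose status code is 504."""
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None) == 504


# Possible use with the call from this report (assumes pybliometrics is set up):
# ab = call_with_retries(lambda: AbstractRetrieval(scopus_id), is_gateway_timeout)
```

Any other exception, or a 504 that persists after `max_retries` retries, still propagates to the caller, so the script fails the same way it does now once retries are exhausted.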


Michael-E-Rose · 2020-04-02T20:22:27Z

Yes, I came across these errors as well. Thanks for bringing it to attention again.

One fix is to attempt the download again, if certain exceptions occur.

I suspect however that the gateway timeout occurs when requests are too frequent (as explained in #125). If this is the case, a better fix is to enforce a speed limit. This involves a deeper restructuring of the architecture of pybliometrics because right now, from one call to the next, pybliometrics doesn't know how many requests have been made.
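To illustrate what a speed limit needs, here is a minimal sketch of a throttle that remembers when the previous request went out and sleeps if the next one comes too soon. `Throttle` and `min_interval` are hypothetical names, and the interval value in the usage comment is a made-up number, not an actual Scopus quota. The point is only that some object has to carry state from one call to the next.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait().

    Hypothetical helper, not part of pybliometrics.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, kept across calls

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Possible use: one shared instance, consulted before every request.
# throttle = Throttle(min_interval=0.5)  # 0.5 s is an assumed value
# throttle.wait()
# ab = AbstractRetrieval(scopus_id)
```

A single shared instance only works within one Python process; limiting the rate across separate script runs would need the timestamp to be stored somewhere persistent.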

Michael-E-Rose added Backend Effort: High labels Apr 16, 2020

Michael-E-Rose closed this as completed Jul 6, 2020
