Raster_Identify Failed to download: invalid url? #234

Open
howff opened this issue Nov 24, 2017 · 6 comments

Comments

@howff

howff commented Nov 24, 2017

The URL reported in the error message seems to be missing the host and path parts when I add a raster resource to a dataset:

Downloading resource from http://nppdnbdaysdr.17319131254.tif to: /var/local/ckan/default/tmp//rasterstorer//Cov_.raster

Is that the reason why celery sticks?
Maybe there is some magic configuration file to edit?

Full message:

[2017-11-20 15:03:33,593: INFO/PoolWorker-1] rasterstorer.identify[24bc5372-7c28-4524-93e8-b222e62fbf12]: [Raster_Identify]Downloading resource a32a6046-ce1f-42a4-a417-118376bb32e3...
[2017-11-20 15:03:33,593: INFO/PoolWorker-1] rasterstorer.identify[24bc5372-7c28-4524-93e8-b222e62fbf12]: [Raster_DownloadResource] Downloading resource a32a6046-ce1f-42a4-a417-118376bb32e3 from http://nppdnbdaysdr.17319131254.tif to: /var/local/ckan/default/tmp//rasterstorer/a32a6046-ce1f-42a4-a417-118376bb32e3/Cov_a32a6046_ce1f_42a4_a417_118376bb32e3.raster
[2017-11-20 15:03:33,596: ERROR/PoolWorker-1] rasterstorer.identify[24bc5372-7c28-4524-93e8-b222e62fbf12]: [Raster_Identify] Failed to download: Failed to download http://nppdnbdaysdr.17319131254.tif: <urlopen error [Errno -2] Name or service not known>
[2017-11-20 15:03:33,610: ERROR/MainProcess] Task rasterstorer.identify[24bc5372-7c28-4524-93e8-b222e62fbf12] raised exception: CannotDownload('Failed to download http://nppdnbdaysdr.17319131254.tif: <urlopen error [Errno -2] Name or service not known>',)
Traceback (most recent call last):
  File "/var/local/ckan/default/pyenv/local/lib/python2.7/site-packages/celery/execute/trace.py", line 47, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/var/local/ckan/default/pyenv/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 247, in __call__
    return self.run(*args, **kwargs)
  File "/var/local/ckan/default/pyenv/local/lib/python2.7/site-packages/celery/app/__init__.py", line 175, in run
    return fun(*args, **kwargs)
  File "/var/local/ckan/default/pyenv/src/ckanext-publicamundi/ckanext/publicamundi/storers/raster/tasks.py", line 28, in rasterstorer_identify
    rasterstorer_identify.retry(exc=ex, countdown=60)
  File "/var/local/ckan/default/pyenv/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 535, in retry
    self.name, options["task_id"], args, kwargs))
CannotDownload: Failed to download http://nppdnbdaysdr.17319131254.tif: <urlopen error [Errno -2] Name or service not known>
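
For readers following along: "[Errno -2] Name or service not known" is a name-resolution failure, so it can help to see how a URL parser splits the reported URL. The sketch below uses only the Python 3 standard library (on the Python 2.7 install shown in the log, the import would be `from urlparse import urlsplit`). It is an illustration of the symptom, not code from ckanext-publicamundi.

```python
from urllib.parse import urlsplit

# URL copied verbatim from the error message above.
reported_url = "http://nppdnbdaysdr.17319131254.tif"

parts = urlsplit(reported_url)

# The whole file-like name lands in the host position and the path is empty,
# so the downloader ends up asking DNS to resolve what looks like a file name.
print("scheme:  ", repr(parts.scheme))
print("hostname:", repr(parts.hostname))
print("path:    ", repr(parts.path))
```

This only shows where the name ends up after parsing; it does not say which component produced the URL in the first place.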

@kalxas
Member

kalxas commented Nov 25, 2017

The URL does not seem to be a valid one.

@howff
Author

howff commented Nov 25, 2017

That's right, but it was generated by the raster importer, so either there is a bug in the raster importer or something needs to be configured somewhere. Any ideas?

(I created a dataset in the CKAN web interface and attached a GeoTIFF resource.)

@kalxas
Member

kalxas commented Nov 25, 2017

@drmalex07 any ideas?

@drmalex07
Member

Well, I think you should ping the rasdaman team, which was the only one involved with the raster-storer plugin.

@vladmerti
Member

Cross-posting from the rasdaman-dev mailing list in case I'm missing something:

Hi Andrew,

It's been a while since I looked over this code.

The error occurs in https://github.com/PublicaMundi/ckanext-publicamundi/blob/master/ckanext/publicamundi/storers/raster/tasks.py#L11. The method exposes a celery task which prepares a resource for ingestion in rasdaman. In your case the URL points to a non-existent resource, so the rasterstorer cannot import it.

You could track where the URL is coming from. It appears in the task context, and is passed on to a utility class for download. On https://github.com/PublicaMundi/ckanext-publicamundi/blob/master/ckanext/publicamundi/storers/raster/tasks.py#L18 you already have a dump of the context, so a first step would be to check whether the URL is OK in the context or is already broken when it gets there. If the URL is correct in the context but it still fails to download, you can have a look at https://github.com/PublicaMundi/ckanext-publicamundi/blob/master/ckanext/publicamundi/storers/raster/raster_plugin_util.py#L49 (but my intuition is that the URL is already pointing to nothing in the context).

The celery task itself is created whenever a new resource with one of the geotiff, png, jpeg, zip or raster formats is added to ckan (https://github.com/PublicaMundi/ckanext-publicamundi/blob/master/ckanext/publicamundi/storers/raster/plugin.py#L60).

HTH,
Vlad
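
The first step suggested above (checking whether the URL is already broken in the task context) could be done with a small check like the one below. This is a minimal sketch, not part of ckanext-publicamundi: the helper name `url_problems`, the `RASTER_SUFFIXES` tuple, and the assumption that the URL can be pulled out of the dumped context as a plain string are all hypothetical. It uses the Python 3 standard library and does no network access.

```python
from urllib.parse import urlsplit

# Hypothetical list of suffixes that suggest a file name rather than a host.
RASTER_SUFFIXES = (".tif", ".tiff", ".png", ".jpg", ".jpeg", ".zip")


def url_problems(url):
    """Return a list of human-readable problems found in a resource URL."""
    problems = []
    parts = urlsplit(url or "")

    if parts.scheme not in ("http", "https"):
        problems.append("scheme is %r, expected http or https" % parts.scheme)

    host = parts.hostname or ""
    if not host:
        problems.append("no host part")
    elif host.endswith(RASTER_SUFFIXES):
        # Heuristic only: a host ending in ".tif" is almost certainly a file name.
        problems.append("host %r looks like a file name" % host)

    if parts.path in ("", "/"):
        problems.append("no path to a file")

    return problems


# URL taken from the log in this issue, as it would appear in the context dump.
for problem in url_problems("http://nppdnbdaysdr.17319131254.tif"):
    print(problem)
```

If the list is non-empty for the URL found in the context dump, the URL was malformed before it reached the download utility, which would point at whatever stored the resource URL rather than at the downloader.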

@howff
Author

howff commented Dec 5, 2017 via email

4 participants