add task: download_from_url() to download a URL to a file #562
This adds a task to tasks_utils.wdl, `download_from_url()`, to download content from an individual URL to a file via `wget`. This task exists as a workaround until Terra supports this functionality natively (Cromwell already does).

The task has the following inputs:
- `url_to_download`: The URL to download; this is passed to `wget`. (required)
- `output_filename`: The filename to use for the downloaded file. It can be helpful when the server does not suggest a filename via the `Content-Disposition` HTTP response header. (optional)
- `additional_wget_opts`: Additional options passed to `wget` as part of the download command. (optional)
- `request_method`: The request method (`GET`, `POST`, etc.) passed to `wget`. (default: `GET`)
- `request_max_retries`: The maximum number of additional retries to attempt if the download fails. (optional)
- `md5_hash_expected`: The (binary-mode) md5 hash expected for the file to download. If provided and the value does not match the md5 hash of the downloaded file, the task will fail. Mutually exclusive with `md5_hash_expected_file_url`. (optional)
- `md5_hash_expected_file_url`: The URL of a file containing the (binary-mode) md5 hash expected for the file to download. If provided and the value does not match the md5 hash of the downloaded file, the task will fail. Mutually exclusive with `md5_hash_expected`. (optional)
- `save_response_header_to_file`: If `save_response_header_to_file=true`, HTTP response headers will be saved to a separate output file. Only applicable to http[s] URLs. (optional)
- `disk_size`: The size of the disk used for the instance downloading the file. (default: 50 GB)

Note: at present, this task only downloads a single file from a single URL. This is a design decision made for a couple of reasons:
`Content-Disposition` response header. Returning a single file takes advantage of the separation of task outputs at runtime to avoid collisions.
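
To illustrate the md5 check described for `md5_hash_expected` and `md5_hash_expected_file_url`, here is a minimal Python sketch. It is not the task's actual implementation (which lives in the WDL task and is not shown here). The function names are hypothetical, and the sketch assumes the hash file has already been fetched and uses `md5sum`-style content, with the hash as the first whitespace-separated token.

```python
import hashlib
from pathlib import Path
from typing import Optional


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in binary mode."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_expected_md5(
    md5_hash_expected: Optional[str] = None,
    md5_file_contents: Optional[str] = None,
) -> Optional[str]:
    """Pick the expected hash from at most one of the two sources.

    md5_file_contents stands in for the text of the file found at
    md5_hash_expected_file_url (fetching it is outside this sketch).
    """
    if md5_hash_expected is not None and md5_file_contents is not None:
        raise ValueError(
            "md5_hash_expected and md5_hash_expected_file_url "
            "are mutually exclusive"
        )
    if md5_hash_expected is not None:
        return md5_hash_expected.strip().lower()
    if md5_file_contents is not None:
        # Assumption: md5sum-style content, hash is the first token.
        tokens = md5_file_contents.split()
        if not tokens:
            raise ValueError("md5 hash file is empty")
        return tokens[0].lower()
    return None  # No hash supplied, so no check is requested.


def check_download(path: Path, expected: Optional[str]) -> None:
    """Raise if an expected hash was supplied and does not match."""
    if expected is None:
        return
    actual = md5_of_file(path)
    if actual != expected:
        raise RuntimeError(f"md5 mismatch: expected {expected}, got {actual}")
```

Calling `check_download(Path("some_file"), resolve_expected_md5(md5_hash_expected="..."))` raises on a mismatch, mirroring the "task will fail" behavior described above, and supplying both sources at once is rejected up front.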