Hawk data transfer guide

Markus Battarbee edited this page Jan 27, 2021 · 1 revision

Copying data between Hawk and Datacloud / Allas

Since Hawk does not allow any outgoing network connections, transferring data out is a bit tricky. You can pull the data from a remote machine via rsync, but you can also work around the restriction by using your ssh connection as a SOCKS proxy. The trick is to connect with ssh -R <portNumber>, where <portNumber> is a number of your choice above 1024 that becomes the SOCKS proxy port on Hawk, and then, on Hawk, to set your http_proxy and https_proxy environment variables to socks5://localhost:<portNumber>.
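
As a minimal sketch of the connection step, the snippet below picks a port and prints the commands to run. It is meant for your local machine. The login string is a placeholder, not a real address, and the variable names port and hawk_login are made up for this illustration. It also assumes a local OpenSSH of version 7.6 or newer, since the port-only form of -R (reverse dynamic forwarding) is only available from that release on.

```shell
# Run on your LOCAL machine, not on Hawk.
# Pick a pseudo-random port in the unprivileged range 1025-65024 (POSIX awk).
# Nothing here checks that the port is actually free on Hawk; if ssh
# reports that the forwarding failed, simply pick another number.
port=$(awk 'BEGIN { srand(); print 1025 + int(rand() * 64000) }')

# Placeholder login: replace with your own user name and Hawk login node.
hawk_login="<username>@<hawk-login-node>"

# With only a port as its argument, -R makes ssh act as a SOCKS proxy
# that listens on the remote (Hawk) side of the connection.
echo "ssh -R ${port} ${hawk_login}"

# Lines to paste into the Hawk shell once you are logged in:
echo "export http_proxy=socks5://localhost:${port}"
echo "export https_proxy=socks5://localhost:${port}"
```

The snippet only prints the commands instead of running them, so you can inspect the chosen port before connecting.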

Then you will need to install rclone and the OpenStack/Swift clients into your user environment. This is a bit more difficult than on other machines, because the pip installation on Hawk does not come with SOCKS proxy support out of the box, so you will need to provide it first. On your local machine, download the pip package:

pip3 download pysocks -d .

and copy the PySocks-*.whl file you receive over onto Hawk. There, you can execute the following commands to set everything up:

export http_proxy=socks5://localhost:<portNumber>
export https_proxy=socks5://localhost:<portNumber>
module load python
pip install --user PySocks-*.whl -f . --no-index
pip install --user python-openstackclient
pip install --user python-swiftclient
curl https://downloads.rclone.org/rclone-current-linux-amd64.zip -o rclone.zip
unzip rclone.zip && rm rclone.zip
mkdir -p ~/bin
ln -s ~/rclone-*-linux-amd64/rclone ~/bin/
git clone https://github.com/CSCfi/allas-cli-utils
cd allas-cli-utils
pip install --user six
pip install --user pyparsing
pip install --user decorator
pip install --user wrapt
pip install --user pytz
export PATH=$PATH:~/.local/bin
source ./allas_conf -u <CSCusername>
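
If one of the pip or curl steps above fails with a connection error, checking the proxy variables is a reasonable first step. The helper below is a small sketch for that purpose. The function name check_proxy_vars is made up for this illustration. It only inspects the two environment variables and does not test whether the ssh tunnel itself is still alive.

```shell
# Returns 0 if both proxy variables point at a SOCKS5 proxy on localhost,
# and 1 otherwise. Hypothetical helper, not part of any installed tool.
check_proxy_vars() {
    status=0
    for name in http_proxy https_proxy; do
        # Indirect lookup of the variable called $name (POSIX sh compatible).
        eval "value=\${$name:-}"
        case "$value" in
            socks5://localhost:[0-9]*)
                echo "$name is set to $value" ;;
            *)
                echo "$name is missing or not a socks5://localhost:<portNumber> URL" >&2
                status=1 ;;
        esac
    done
    return "$status"
}

# Example usage on Hawk:
#   check_proxy_vars && pip install --user python-openstackclient
```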

Afterwards, configure Datacloud in rclone as you would on any other machine, and you should be good to go.