I benchmarked the performance of BytePS with cross barrier using the script in /example/pytorch/benchmark_cross_barrier_byteps.py.
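For context, this is roughly what such a BytePS PyTorch benchmark boils down to. The sketch below uses the standard public `byteps.torch` API (`bps.init`, `bps.DistributedOptimizer`, the broadcast helpers); the actual cross-barrier script differs in how it schedules push-pull against the backward pass, so treat this as an illustration, not the script's contents:

```python
# Minimal BytePS PyTorch benchmark skeleton (standard byteps.torch API;
# the real cross_barrier script adds dependency-aware scheduling on top).
import torch
import torchvision.models as models
import byteps.torch as bps

bps.init()                               # reads the DMLC_* environment set below
torch.cuda.set_device(bps.local_rank())  # one GPU per local process

model = models.resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
optimizer = bps.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# Make all workers start from the same state.
bps.broadcast_parameters(model.state_dict(), root_rank=0)
bps.broadcast_optimizer_state(optimizer, root_rank=0)

# Synthetic data, matching --batch-size 64.
data = torch.randn(64, 3, 224, 224).cuda()
target = torch.randint(0, 1000, (64,)).cuda()

for _ in range(500):                     # matching --num-iters 500
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()                     # gradient push-pull happens here
```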
The complete commands are as follows:
scheduler:

```
export DMLC_NUM_WORKER=2
export DMLC_ROLE=scheduler
export DMLC_NUM_SERVER=2
export DMLC_PS_ROOT_URI=ip1
export DMLC_PS_ROOT_PORT=1234
export DMLC_INTERFACE=xgbe1
export DMLC_NODE_HOST=ip1
bpslaunch
```

server1:

```
export DMLC_NUM_WORKER=2
export DMLC_ROLE=server
export DMLC_NUM_SERVER=2
export DMLC_PS_ROOT_URI=ip1
export DMLC_PS_ROOT_PORT=1234
export DMLC_INTERFACE=xgbe1
export DMLC_NODE_HOST=ip1
bpslaunch
```

server2:

```
export DMLC_NUM_WORKER=2
export DMLC_ROLE=server
export DMLC_NUM_SERVER=2
export DMLC_PS_ROOT_URI=ip1
export DMLC_PS_ROOT_PORT=1234
export DMLC_INTERFACE=xgbe1
export DMLC_NODE_HOST=ip2
bpslaunch
```

worker1:

```
export NVIDIA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export DMLC_WORKER_ID=0
export DMLC_NUM_WORKER=2
export DMLC_ROLE=worker
export DMLC_NUM_SERVER=2
export DMLC_PS_ROOT_URI=ip1
export DMLC_PS_ROOT_PORT=1234  # the scheduler port
export DMLC_INTERFACE=xgbe1
export DMLC_NODE_HOST=ip3
bpslaunch python3 /usr/local/byteps/example/pytorch/benchmark_cross_barrier_byteps.py --model resnet50 --batch-size 64 --num-iters 500
```

worker2:

```
export NVIDIA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export DMLC_WORKER_ID=1
export DMLC_NUM_WORKER=2
export DMLC_ROLE=worker
export DMLC_NUM_SERVER=2
export DMLC_PS_ROOT_URI=ip1
export DMLC_PS_ROOT_PORT=1234
export DMLC_INTERFACE=xgbe1
export DMLC_NODE_HOST=ip4
bpslaunch python3 /usr/local/byteps/example/pytorch/benchmark_cross_barrier_byteps.py --model resnet50 --batch-size 64 --num-iters 500
```
After executing the commands, worker1 prints throughput, but worker2 hangs:
Finished:
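One way to see where worker2 stalls (rendezvous with the scheduler vs. the first push-pull) is to relaunch it with verbose logging. The two variables below are the documented ps-lite and BytePS debug switches; values are assumptions for illustration:

```
export PS_VERBOSE=2            # ps-lite connection/rendezvous tracing
export BYTEPS_LOG_LEVEL=DEBUG  # BytePS core logging
bpslaunch python3 /usr/local/byteps/example/pytorch/benchmark_cross_barrier_byteps.py \
    --model resnet50 --batch-size 64 --num-iters 500
```

With these set, worker2's log should show whether it ever finishes connecting to the scheduler at ip1:1234.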