Lapse is a parameter server with dynamic parameter allocation, i.e., it can relocate parameters among nodes at runtime. This capability can be key for efficient distributed machine learning. More information can be found in our paper on dynamic parameter allocation: PVLDB (slightly longer version on arXiv). Details on the experiment settings for this paper can be found in docs/experiments-vldb20.md.
Lapse provides the following primitives:
- `Pull(keys)`: retrieve the values of a set of parameters (identified by keys) from the corresponding servers
- `Push(keys, updates)`: send updates for parameters to the corresponding servers
- `Localize(keys)`: request local allocation of parameters

By default, primitives execute asynchronously. `Wait()` can be used to execute any primitive synchronously. For example: `Wait(Pull(keys))`.
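Because a primitive returns a handle that can be passed to `Wait()`, communication can be overlapped with computation: issue a pull early, do other work, and block only when the result is actually needed. A minimal sketch, assuming the `ps::KVWorker` interface from the example below:

```cpp
#include <cstdint>
#include <vector>
#include "ps/ps.h"

// Sketch: overlap communication with computation by deferring Wait().
void overlap_example(ps::KVWorker<float>& kv) {
  std::vector<uint64_t> keys = {1, 3, 5};
  std::vector<float> vals;

  int handle = kv.Pull(keys, &vals);  // issues the pull, returns immediately
  // ... do unrelated computation here while the pull is in flight ...
  kv.Wait(handle);                    // block until vals has been filled
}
```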
The Lapse implementation is based on PS-Lite.
A simple example:
std::vector<uint64_t> keys = {1, 3, 5};
std::vector<float> updates = {1, 1, 1};
std::vector<float> recv_vals;
ps::KVWorker<float> kv;
kv.Wait(kv.Pull(keys, &recv_vals));
kv.Wait(kv.Push(keys, updates));
kv.Wait(kv.Localize(keys));
kv.Wait(kv.Pull(keys, &recv_vals)); // access is now local
Lapse requires a C++11 compiler such as `g++ >= 4.8`, and Boost for some of the application examples. On Ubuntu >= 13.10, you can install these dependencies with:
sudo apt-get update && sudo apt-get install -y build-essential git libboost-all-dev
Then clone and build:
git clone https://github.com/alexrenz/lapse-ps
cd lapse-ps && make
A very simple example can be found in simple.cc. To run it, first compile it:
make apps/simple
and run
python tracker/dmlc_local.py -s 1 build/apps/simple
to run with one node and default parameters or
python tracker/dmlc_local.py -s 3 build/apps/simple -v 5 -i 10 -k 14 -t 4
to run with 3 nodes and specific parameters. Run build/apps/simple --help
to see available parameters.
To test dynamic parameter allocation (i.e., moving parameters between servers), run
make -j 4 tests/test_dynamic_allocation
python tracker/dmlc_local.py -s 4 tests/test_dynamic_allocation
There are multiple start scripts. At the moment, we mostly use the following ones:
- tracker/dmlc_local.py to run on a local machine
- tracker/dmlc_ssh.py to run on a cluster
To see more information, run, for example:
python tracker/dmlc_local.py --help
The `-s` flag specifies how many processes (i.e., nodes) to use; e.g., `-s 4` uses 4 nodes. In each process, Lapse starts one server thread and multiple worker threads.
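For example, the following launches the simple example from above on 4 nodes with default parameters (assuming `build/apps/simple` has been built as described earlier):

```shell
python tracker/dmlc_local.py -s 4 build/apps/simple
```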
You can find example applications in the apps/ directory; launch commands for running toy examples locally are given below. The toy datasets are in apps/data/.
make apps/matrix_factorization
python tracker/dmlc_local.py -s 2 build/apps/matrix_factorization --dataset apps/data/mf/ -r 2 --num_keys 12 --epochs 10
make apps/knowledge_graph_embeddings
python tracker/dmlc_local.py -s 2 build/apps/knowledge_graph_embeddings --dataset apps/data/kge/ --num_entities 280 --num_relations 112 --num_epochs 4 --embed_dim 100 --eval_freq 2
make apps/word2vec
python tracker/dmlc_local.py -s 2 build/apps/word2vec --num_threads 2 --negative 2 --binary 1 --num_keys 4970 --embed_dim 10 --input_file apps/data/lm/small.txt --num_iterations 4 --window 2 --localize_pos 1 --localize_neg 1 --data_words 10000
Lapse starts one process per node. Within this process, worker threads access the parameter store directly; a parameter server thread handles requests from other nodes and parameter relocations.