Two implementations of Divide and Conquer Sequential Monte Carlo (DC SMC), described in our Journal of Computational and Graphical Statistics 2017 paper.
- One implementation has a specific model hard-coded in it, namely the hierarchical model with a binomial emission model used in Section 5.2 of the paper.
- The second implementation offers convenient interfaces for applying the algorithm to other models, and also provides distributed and parallel computing functionality. It is used in Section 5.3 of the paper.
Both implementations are based on Algorithm 2 of our arXiv preprint and do not currently support the extensions described in Section 4 of the preprint.
Please cite the following paper:
Fredrik Lindsten, Adam M. Johansen, Christian A. Naesseth,
Bonnie Kirkpatrick, Thomas B. Schön, John Aston,
Alexandre Bouchard-Côté. (2017) Divide-and-conquer with sequential Monte Carlo.
Journal of Computational and Graphical Statistics 26:445–458.
There are three ways to install:
- To use DC SMC from a Gradle project, simply add the following lines to your build script (replacing 3.0.0 with the current version; see the git tags):
repositories {
  mavenCentral()
  jcenter()
  maven {
    url "http://www.stat.ubc.ca/~bouchard/maven/"
  }
}
dependencies {
  compile group: 'ca.ubc.stat', name: 'divide-and-conquer-smc', version: '3.0.0'
}
- To build from source instead:
- Check out the source:
git clone git@github.com:alexandrebouchard/divide-and-conquer-smc.git
- Compile using
./gradlew installDist
- Add the jars in
build/install/divide-and-conquer-smc/lib/
to your classpath
- To set up an Eclipse development environment:
- Check out the source:
git clone git@github.com:alexandrebouchard/divide-and-conquer-smc.git
- Type
./gradlew eclipse
from the root of the repository
- From Eclipse: select Import in the File menu, then Import existing projects into workspace
- Select the root
- Deselect Copy projects into workspace to avoid having duplicates
Running plain DC SMC on the binomial emission hierarchical model (implementation 1)
- Assuming you have checked out the repo and used
./gradlew installDist
successfully, download and prepare the data by typing
scripts/prepare-data.sh
from the root of the repository (requires wget). This writes the preprocessed data in data/preprocessedNYSData.csv
- Run the software using
scripts/run-simple.sh -inputData data/preprocessedNYSData.csv -nParticles 1000
- Note: for the plots to work, you need to have R in your PATH variable (more precisely, Rscript)
- Various output files are written in
results/latest/
The main interest of this implementation is to replicate the results in Sections 5.2 and 5.3 of the paper. To do so, see the separate public repository that contains the exact sets of options used to run the experiments as well as the plotting scripts. To extend DC SMC to other models, use the second implementation instead, described next.
Running parallel and distributed DC SMC on the binary emission hierarchical model (implementation 2)
- Assuming you have checked out the repo and used
./gradlew installDist
successfully, download and prepare the data by typing
scripts/prepare-data.sh
from the root of the repository (requires wget). This writes the preprocessed data in data/preprocessedNYSData.csv
- Run the software using
scripts/run-distributed.sh -dataFile data/preprocessedNYSData.csv
To run in a distributed fashion, simply issue the same command line on different machines. The machines will
discover each other (via multicast) and distribute the work dynamically. You can see in the output a variable
nWorkers=[integer]
showing the number of nodes cooperating. For testing purposes, you can do this on
one machine by opening two terminals (however, there is a better way to get multi-threading, see below).
The machines will only cooperate if all the command line options match exactly. For example, you can check that running
scripts/run-distributed.sh -dataFile data/preprocessedNYSData.csv -nParticles 2000
will not cooperate with
scripts/run-distributed.sh -dataFile data/preprocessedNYSData.csv -nParticles 1000
You can also force groups of machines not to cooperate by using the -clusterSubGroup [integer] option.
To use multiple threads within one machine, use -nThreadsPerNode [integer]. This can be used with or without distributed computation.
If you want the machines to wait for each other, use -minimumNumberOfClusterMembersToStart [integer]. This will cause each machine to wait to start until this number
of machines has gathered OR until maximumTimeToWaitInMinutes minutes have elapsed.
Whether or not these options are used, machines can always join at any point during the execution of the algorithm.
The waiting options were used for wall-clock running time analyses; in most cases, they can be ignored.
When the number of nodes in the DC-SMC tree is much larger than the number of threads times the number of machines, it is useful to
coarsen the granularity of the subtasks. To do so, you can use -maximumDistributionDepth [integer], which alters the
way subtasks are created as follows: instead of the basic initial tasks being the leaves of the tree, they are the
subtrees rooted at the depth from the root given by maximumDistributionDepth. The subtrees below that depth are computed
serially within each node/thread.
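For example, the following command (an illustrative combination of the options described above; the particular values are arbitrary) uses 4 threads per machine and creates distributed subtasks from the subtrees rooted at depth 2:
scripts/run-distributed.sh -dataFile data/preprocessedNYSData.csv -nThreadsPerNode 4 -maximumDistributionDepth 2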
Creating new DC SMC models (implementation 2)
First, build a class to encode each individual particle. The only requirement is that it implements Serializable. This will become the generic type P in the following.
Second, build a class to encode nodes in the tree. The only requirement is that you override hashCode and equals to ensure that each node in the tree is unique (i.e. not .equals(..) to any other node, since nodes are inserted into a hash table) AND reproducible (the default implementation of hashCode depends on the memory location, so it would differ from machine to machine). A reasonable default implementation is in prototype.Node. This will become the generic type N in the following.
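For concreteness, here is a minimal sketch of what such classes could look like. The class names and fields below are hypothetical and not part of the library; the only requirements stated above are Serializable for P and value-based, reproducible equals and hashCode for N:
import java.io.Serializable;
import java.util.Objects;

// Hypothetical particle type (the generic type P). Any Serializable class works;
// here a particle is just a success probability for a binomial emission.
class BinomialParticle implements Serializable {
  private static final long serialVersionUID = 1L;
  final double probability;
  BinomialParticle(double probability) { this.probability = probability; }
}

// Hypothetical node type (the generic type N). Identity must be value-based and
// reproducible across machines, so equals and hashCode depend only on the label.
// (Serializable is not required of N by the text above, but is a harmless addition.)
class TreeNode implements Serializable {
  private static final long serialVersionUID = 1L;
  final String label; // e.g. a path in the hierarchy, such as "NY/district-12/school-3"
  TreeNode(String label) { this.label = label; }
  @Override public boolean equals(Object o) {
    return o instanceof TreeNode && ((TreeNode) o).label.equals(label);
  }
  @Override public int hashCode() { return Objects.hash(label); }
}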
Next, the main step consists in providing the code that proposes, i.e. merges sub-populations. This is done by creating a class implementing dc.DCProposal. This class is responsible both for proposing and for providing a LOG weight update for the proposal (including taking care of computing the ratio in step 2(c) of Algorithm 2 in the arXiv pre-print).
Here is an example, based on a simple finite-state Markov chain model in which a transition matrix and a prior on the root are provided:
public static DCProposal<Integer> markovChainNaiveProposal(final Matrix transition, final Matrix prior)
{
final int nStates = transition.nRows();
checkValidMarkovChain(transition, prior);
// Model: a finite state Markov chain, with:
// transition, an n by n transition matrix
// prior, an n by 1 initial distribution (i.e. prior on the root)
// NB: this is just for illustration/testing purpose as one can do exact inference
// on this model
return new DCProposal<Integer>() { // In this example, particles are just integers
/**
* Propose a parent particle given the children.
* All the randomness should be obtained via the provided random object.
*
* @param random
* @param childrenParticles
* @return A pair containing the LOG weight update and the proposed particle
* In the basic algorithm, this is given by gamma(x') / q(x'|x_1, .., x_C) / gamma(x_1) / .. / gamma(x_C)
* Where x' is the tree, and x_1, .., x_C are the C children subtrees.
*/
@Override
public Pair<Double, Integer> propose(Random random, List<Integer> childrenParticles)
{
// Naive proposal: uniform
Integer proposal = random.nextInt(nStates);
double weightUpdate = nStates; // 1 / q(x'|x), where q(x'|x) = 1/nStates
weightUpdate *= prior.get(proposal, 0); // prior(proposal) is part of gamma(x')
for (Integer childState : childrenParticles)
{
weightUpdate *= transition.get(proposal, childState); // transition(child | proposal) is part of gamma(x')
weightUpdate /= prior.get(childState, 0); // the prior on the child's root is part of gamma(x_c) but not of gamma(x'); everything else in gamma(x_c) cancels with gamma(x')
}
return Pair.of(Math.log(weightUpdate), proposal);
}
};
}
After building a DCProposal, we also need a DCProposalFactory, which instantiates one proposal for each node of the tree, for each thread, and for each machine.
This can be useful to produce different types of proposals depending on the part of the tree. For example, in our simple example where we arbitrarily set all the observations to the state 0, the proposals at the leaves propose the value 0 deterministically, while the other nodes use the proposal defined above.
public static DCProposalFactory<Integer, Node> markovChainNaiveProposalFactory(final Matrix transition, final Matrix prior)
{
return new DCProposalFactory<Integer,Node>()
{
@Override
public DCProposal<Integer> build(
Random random,
Node currentNode,
List<Node> childrenNodes)
{
if (childrenNodes.size() == 0)
// In this toy example, we set the leaves as observed and equal to 0
// This is modelled as a proposal that always returns state 0 if this
// is a leaf.
return (rand, children) -> Pair.of(Math.log(prior.get(0, 0)), 0); // NB: DCProposal is a FunctionalInterface
else
// Else, return the proposal defined above
return markovChainNaiveProposal(transition, prior);
}
};
}
Here is an example of how to put it all together:
System.out.println(transition);
// 2 x 2 dense matrix
// 0 1
// 0 | 0.800000 0.200000
// 1 | 0.200000 0.800000
System.out.println(prior);
// 2 x 1 dense matrix
// 0
// 0 | 0.500000
// 1 | 0.500000
DCProposalFactory<Integer,Node> factory = markovChainNaiveProposalFactory(transition, prior);
// Topology, here a simple perfect binary tree of depth 3
final DirectedTree<Node> tree = perfectBinaryTree(3);
// Use DCOptions to set options, either programmatically as below,
// or see DDCMain for an example of how to parse these options from the command line.
DCOptions options = new DCOptions();
options.nThreadsPerNode = 4;
options.nParticles = 1_000_000;
options.masterRandomSeed = 31; // Note: given the seed, output is deterministic even if distributed and/or parallelized
options.resamplingScheme = ResamplingScheme.MULTINOMIAL; // currently supported: STRATIFIED and MULTINOMIAL (default)
options.relativeEssThreshold = 0.5; // only resample if the relative ESS falls below this threshold
// Prepare the simulation
DistributedDC<Integer, Node> ddc = DistributedDC.createInstance(options, factory, tree);
// By default, the processor below is included; it prints and records in the results folder
// some basic statistics such as ESS, logZ estimates, etc. I.e. the line below is not needed, but
// exemplifies how other custom processors would be added.
// ddc.addProcessorFactory(new DefaultProcessorFactory<>());
// Perform the sampling
ddc.start();
// timing: node=0, ESS=548622.6672945316, rESS=0.5486226672945316, logZ=-3.5950250310183574, nWorkers=1, iterationProposalTime=196, globalTime=3324
// Compare to exact log normalization obtained by sum product:
final double exactLogZ = computeExactLogZ(transition, prior, tree, (node) -> 0);
System.out.println("exact = " + exactLogZ);
// exact = -3.600962588536195
final double relativeError = Math.abs(exactLogZ - ddc.getRootPopulation().logNormEstimate()) / Math.abs(exactLogZ); // divide by |exactLogZ| since exactLogZ is negative
Assert.assertTrue(relativeError < 0.01);
The example above is from dc.Doc.