Propagating Changes in One Directory to Another
In this use case, we create Brooklin datastreams to reflect changes in one file system directory to another.
- Source: File System (Directory)
- Destination: File System (Directory)
- Connector: DirectoryConnector
- Transport Provider: DirectoryTransportProvider
Brooklin requires Java Development Kit 8+.
- Download the latest stable release of ZooKeeper.
- Untar the ZooKeeper tarball
tar -xzf zookeeper-3.4.14.tar.gz
cd zookeeper-3.4.14
- Start a ZooKeeper server
bin/zkServer.sh start conf/zoo_sample.cfg &
- Download the latest tarball (tgz) from Brooklin releases.
- Untar the Brooklin tarball
tar -xzf brooklin-1.0.0.tgz
cd brooklin-1.0.0
- Run Brooklin
bin/brooklin-server-start.sh config/dir-sync-example.properties >/dev/null 2>&1 &
- Create a datastream to sync changes made in a source directory to a destination directory.
# Replace <src-dir> and <dest-dir> below with file paths of source and destination
# directories, respectively
bin/brooklin-rest-client.sh -o CREATE -u http://localhost:32311/ -n first-dir-datastream -s <src-dir> -d <dest-dir> -dp 1 -c dirC -p 1 -t dirTP -m '{"owner":"test-user"}'
Here are the options we used to create this datastream:
-o CREATE                     The operation is datastream creation
-u http://localhost:32311/    Datastream Management Service URI
-n first-dir-datastream       Datastream name
-s <src-dir>                  Datastream source (source directory path in this case)
-d <dest-dir>                 Datastream destination (destination directory path in this case)
-c dirC                       Connector name ("dirC" is the name we use to refer to DirectoryConnector in config)
-t dirTP                      Transport provider name ("dirTP" is the name we use to refer to DirectoryTransportProvider in config)
-p 1                          Number of source partitions
-dp 1                         Number of destination partitions
-m '{"owner":"test-user"}'    Datastream metadata (specifying datastream owner is mandatory)
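To avoid retyping the paths, the creation command can be assembled by a small shell function. This is only a sketch: the function name `print_create_cmd` and the `/tmp/...` paths are made up for illustration and are not part of Brooklin, while the flags are exactly the ones listed above. The function only prints the command (a dry run), so it is safe to try before Brooklin is running. Paths that themselves contain single quotes are not handled.

```shell
# Hypothetical helper (not part of Brooklin): prints the datastream creation
# command for a given name ($1), source directory ($2), and destination
# directory ($3). Nothing is executed; the command is only printed.
print_create_cmd() {
  printf "bin/brooklin-rest-client.sh -o CREATE -u http://localhost:32311/ -n '%s' -s '%s' -d '%s' -dp 1 -c dirC -p 1 -t dirTP -m '%s'\n" \
    "$1" "$2" "$3" '{"owner":"test-user"}'
}

# Example with placeholder paths (assumptions, substitute your own directories)
print_create_cmd first-dir-datastream /tmp/brooklin-src /tmp/brooklin-dest
```

Once the printed command looks right, it can be copied and run from inside the Brooklin directory.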
- Verify the datastream creation by requesting all datastream metadata from Brooklin using the command line REST client.
bin/brooklin-rest-client.sh -o READALL -u http://localhost:32311/
- You can also view more information about the different Datastreams and DatastreamTasks by querying the health monitoring REST endpoint of the Datastream Management Service.
curl -s "http://localhost:32311/health"
- Add/Modify/Delete files and/or directories in the source directory you specified when you created the datastream in step 3.
Please note that files/directories present in the source directory before datastream creation will not be copied to the destination directory. Only the ones you change after the datastream is created will be reflected in the destination.
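A quick way to exercise all three kinds of change is a small script like the one below. It is a sketch only: the function name `make_sample_changes` and the file names are invented for illustration, and the directory argument should be the source directory passed via `-s` when the datastream was created.

```shell
# Hypothetical helper: performs one add, one modify, and one delete
# inside the directory given as $1.
make_sample_changes() {
  src="$1"
  # Add two new files
  printf 'first version\n' > "$src/hello.txt"
  printf 'temporary\n' > "$src/scratch.txt"
  # Modify one of them by appending a line
  printf 'second version\n' >> "$src/hello.txt"
  # Delete the other
  rm "$src/scratch.txt"
}

# Example (replace <src-dir> with your source directory):
# make_sample_changes <src-dir>
```

After it runs, the source directory contains `hello.txt` with two lines and no `scratch.txt`.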
- Observe the destination directory you specified when you created the datastream in step 3.
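One way to check the result without inspecting it by eye is to compare individual files across the two directories. The sketch below assumes that changed files appear in the destination under the same relative names, which this page implies but does not spell out. `same_in_both` is a hypothetical helper name, not a Brooklin utility.

```shell
# Hypothetical helper: succeeds if the file named $3 exists in both
# directories ($1 = source, $2 = destination) with identical content.
same_in_both() {
  [ -f "$1/$3" ] && [ -f "$2/$3" ] && cmp -s "$1/$3" "$2/$3"
}

# Example (replace the placeholders with your directories and a file you changed):
# same_in_both <src-dir> <dest-dir> hello.txt && echo "in sync" || echo "not in sync"
```

A whole-directory comparison such as `diff -r` is less suitable here, because files that existed in the source before the datastream was created are not copied and would always show up as differences.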
- If you wish to delete the datastream you created, you can do so by running:
bin/brooklin-rest-client.sh -o DELETE -u http://localhost:32311/ -n first-dir-datastream
- Feel free to explore the various operations you can perform on datastreams using the REST client utility.
bin/brooklin-rest-client.sh --help
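When you are done, run the following commands to stop all running apps. The `cd` paths are relative, so run each pair of commands from the directory in which you extracted the corresponding tarball.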
cd brooklin-1.0.0
bin/brooklin-server-stop.sh
cd zookeeper-3.4.14
bin/zkServer.sh stop conf/zoo_sample.cfg