CPP API
Euler provides C++ interfaces that do not rely on any particular machine learning framework, so it can be used as a standalone graph engine. Users access the graph engine through the `Graph` class in `euler::client`; see the header file `euler/client/graph.h`.
```cpp
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <thread>
#include <tuple>
#include <vector>

#include "euler/client/graph.h"

int main() {
  euler::client::GraphConfig config;
  config.Add("mode", "Local");
  config.Add("directory", ".");  // path of graph data

  auto graph = euler::client::Graph::NewGraph(config);

  // Request the neighbors of nodes 0 and 1 over edge types 0 and 1.
  graph->GetFullNeighbor(
      {0, 1}, {0, 1}, [](const euler::client::IDWeightPairVec &result) {
        for (const auto &neighbors : result) {
          for (const auto &tuple : neighbors) {
            euler::client::NodeID target;
            float weight;
            int32_t edge_type;
            std::tie(target, weight, edge_type) = tuple;
            std::cout << "(" << target << ", "
                      << weight << ", "
                      << edge_type << ") ";
          }
          std::cout << std::endl;
        }
      });
  // The call is asynchronous: give the callback time to run.
  std::this_thread::sleep_for(std::chrono::milliseconds(100));

  // Request binary feature 0 of nodes 1 and 2.
  graph->GetNodeBinaryFeature(
      {1, 2}, {0}, [](const euler::client::BinaryFatureVec &result) {
        for (const auto &features : result) {
          for (const std::string &feature : features) {
            std::cout << feature;
          }
          std::cout << std::endl;
        }
      });
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
```
Running the above program on the graph in this example outputs:
```
(1, 2, 0) (2, 4, 0)
(2, 3, 1)
a fruit 6 phone
a start s8 phone
```
Before using the graph, the user needs to create a `Graph` object, either from a `GraphConfig` or from a configuration file:
```cpp
std::unique_ptr<Graph> NewGraph(const std::string& config_file);
std::unique_ptr<Graph> NewGraph(const GraphConfig& config);
```
The above example can also be initialized with the following configuration file:
```
mode=Local
directory=.
```
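
To make the correspondence between the file format and the `config.Add(key, value)` calls concrete, here is a minimal sketch that parses `key=value` lines into a map. `ParseConfig` is a hypothetical helper, not part of Euler, and Euler's own parser may handle whitespace, comments, or malformed lines differently.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Euler): parse "key=value" lines.
// Lines without '=' are skipped; no whitespace trimming is done.
std::map<std::string, std::string> ParseConfig(const std::string &text) {
  std::map<std::string, std::string> config;
  std::istringstream input(text);
  std::string line;
  while (std::getline(input, line)) {
    const auto pos = line.find('=');
    if (pos == std::string::npos) continue;
    config[line.substr(0, pos)] = line.substr(pos + 1);
  }
  return config;
}
```

Calling `ParseConfig("mode=Local\ndirectory=.\n")` yields a map with `mode` set to `Local` and `directory` set to `.`, the same pairs that the C++ example passes to `GraphConfig::Add`.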
There are two modes, Local and Remote. Local mode initializes a graph engine in the memory space of the current process, while Remote mode accesses a set of graph engine services through RPC. Each mode has its own set of configuration options:
Mode | Local |
---|---|
directory | Path of graph data; only Unix file systems are supported currently |
load_type | fast / compact |

Mode | Remote |
---|---|
zk_server | ZooKeeper address, ip:port, used to obtain meta information of servers |
zk_path | ZooKeeper node, used to obtain meta information of servers |
num_retries | Number of RPC retries; non-positive for infinite retries; defaults to 10 |
num_channels_per_host | Number of RPC channels per host, defaults to 1 |
bad_host_cleanup_interval | Detection interval for bad servers, defaults to 1 |
bad_host_timeout | Recovery interval for bad servers, defaults to 10 |
To use Remote mode, users need to [start a group of Euler services](#Graph service Initialization).
Type | Definition | Description |
---|---|---|
NodeID | `uint64_t` | node id |
EdgeID | `tuple<NodeID, NodeID, int32_t>` | edge id, represented as source, destination, and edge type |
IDWeightPair | `tuple<NodeID, float, int32_t>` | neighbor information: neighbor id, edge weight, and edge type |
In this section and the next, `tuple`, `vector`, and `string` stand for `std::tuple`, `std::vector`, and `std::string`.
As shown in the example above, the C++ interfaces provided by Euler are all asynchronous. The last parameter of each method is a callback of the form `std::function<void(const ResultType& result)>`, which is called after the corresponding data is received.
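
The example at the top of this page waits for each callback with a fixed `sleep_for`, which is simple but fragile. One alternative is to bridge the callback to a `std::future` and block until the result arrives. The sketch below shows the pattern with a stand-in asynchronous function; `FakeAsyncGet` and `WaitForResult` are hypothetical names used only for illustration and are not part of Euler. It assumes the callback is invoked exactly once.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <vector>

using Result = std::vector<int>;
using Callback = std::function<void(const Result &)>;

// Hypothetical stand-in for an asynchronous Euler-style method: it returns
// immediately and invokes the callback later from another thread.
void FakeAsyncGet(int count, Callback callback) {
  std::thread([count, callback]() {
    Result result(count, 7);  // pretend this data came from the graph engine
    callback(result);
  }).detach();
}

// Blocks the caller until the callback has delivered the result.
Result WaitForResult(int count) {
  // The promise is shared with the callback so that it stays alive until
  // the callback has finished, even if the caller returns first.
  auto promise = std::make_shared<std::promise<Result>>();
  std::future<Result> future = promise->get_future();
  FakeAsyncGet(count, [promise](const Result &result) {
    promise->set_value(result);
  });
  return future.get();
}
```

With a real `Graph` object, the same lambda shape would be passed as the last argument of the method being called, with `Result` replaced by that method's result type.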
All results are returned in row-major order. For `vector<vector<IDWeightPair>>`, the first dimension corresponds to the nodes in the request and the second to the neighbors of each node. For `vector<vector<vector<T>>>`, the first dimension corresponds to the nodes or edges in the request, the second to the requested features of each node or edge, and the third to the values of each feature.
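
As an illustration of this layout, the sketch below walks a result of the three-level shape. `FeatureResult` is a local alias and `DescribeFeatures` is a hypothetical helper, neither taken from Euler; only the indexing convention comes from the text above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

using FeatureResult = std::vector<std::vector<std::vector<float>>>;

// Hypothetical helper: result[i][j][k] is the k-th value of the j-th
// requested feature of the i-th requested node (or edge).
std::string DescribeFeatures(const FeatureResult &result) {
  std::ostringstream out;
  for (std::size_t i = 0; i < result.size(); ++i) {       // request order
    for (std::size_t j = 0; j < result[i].size(); ++j) {  // feature id order
      out << "item " << i << ", feature " << j << ":";
      for (float value : result[i][j]) {                  // feature values
        out << " " << value;
      }
      out << "\n";
    }
  }
  return out.str();
}
```

For a made-up result covering two nodes and one feature id, `DescribeFeatures({{{1.5f, 2.5f}}, {{3.5f}}})` returns the two lines `item 0, feature 0: 1.5 2.5` and `item 1, feature 0: 3.5`.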
Interface | Parameters | Return | Description |
---|---|---|---|
SampleNode | `int node_type`, `int count` | `vector<NodeID>` | Sample nodes by type |
SampleEdge | `int edge_type`, `int count` | `vector<EdgeID>` | Sample edges by type |
SampleNeighbor | `vector<NodeID> node_ids`, `vector<int> edge_types`, `int count` | `vector<vector<IDWeightPair>>` | Sample outgoing edges of each node in the request, by edge type |
GetTopKNeighbor | `vector<NodeID> node_ids`, `vector<int> edge_types`, `int k` | `vector<vector<IDWeightPair>>` | Get the k outgoing edges with the largest weights for each node in the request, by edge type |
GetFullNeighbor | `vector<NodeID> node_ids`, `vector<int> edge_types` | `vector<vector<IDWeightPair>>` | Get all outgoing edges of each node in the request, by edge type |
GetSortedFullNeighbor | `vector<NodeID> node_ids`, `vector<int> edge_types` | `vector<vector<IDWeightPair>>` | Get all outgoing edges of each node in the request, by edge type, sorted by dst_id |
BiasedSampleNeighbor | `vector<NodeID> node_ids`, `vector<NodeID> parent_node_ids`, `vector<int> edge_types`, `vector<int> parent_edge_types`, `int count`, `float p`, `float q` | `vector<vector<IDWeightPair>>` | Biased sampling of outgoing edges, as used in node2vec |
GetNodeFloat32Feature | `vector<NodeID> node_ids`, `vector<int> fids` | `vector<vector<vector<float>>>` | Get the float features of nodes in the request |
GetNodeUint64Feature | `vector<NodeID> node_ids`, `vector<int> fids` | `vector<vector<vector<uint64_t>>>` | Get the uint64 features of nodes in the request |
GetNodeBinaryFeature | `vector<NodeID> node_ids`, `vector<int> fids` | `vector<vector<string>>` | Get the binary features of nodes in the request |
GetEdgeFloat32Feature | `vector<EdgeID> edge_ids`, `vector<int> fids` | `vector<vector<vector<float>>>` | Get the float features of edges in the request |
GetEdgeUint64Feature | `vector<EdgeID> edge_ids`, `vector<int> fids` | `vector<vector<vector<uint64_t>>>` | Get the uint64 features of edges in the request |
GetEdgeBinaryFeature | `vector<EdgeID> edge_ids`, `vector<int> fids` | `vector<vector<string>>` | Get the binary features of edges in the request |
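
`BiasedSampleNeighbor` takes the parameters `p` and `q`, which in node2vec are the return and in-out parameters. As a reminder of what they control, the sketch below computes unnormalized node2vec transition weights for the candidate neighbors of one node. It follows the definition in the node2vec paper rather than Euler's source, so the actual implementation may differ; `Node2VecWeights` and its argument layout are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using NodeID = std::uint64_t;

// Hypothetical helper: unnormalized node2vec weights for a walk that came
// from `parent` and now chooses among `candidates` (neighbor id, edge weight).
// `parent_neighbors` holds the ids of the neighbors of `parent`.
std::vector<float> Node2VecWeights(
    NodeID parent, const std::set<NodeID> &parent_neighbors,
    const std::vector<std::pair<NodeID, float>> &candidates,
    float p, float q) {
  assert(p > 0 && q > 0);
  std::vector<float> weights;
  weights.reserve(candidates.size());
  for (const auto &candidate : candidates) {
    float bias = 1.0f / q;  // candidate is two hops away from the parent
    if (candidate.first == parent) {
      bias = 1.0f / p;      // candidate is the parent itself (return step)
    } else if (parent_neighbors.count(candidate.first) > 0) {
      bias = 1.0f;          // candidate is also a neighbor of the parent
    }
    weights.push_back(bias * candidate.second);
  }
  return weights;
}
```

A small `p` makes the walk more likely to step back to the parent, and a small `q` pushes it toward nodes farther from the parent. Sampling then draws a neighbor with probability proportional to these weights.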
When using the Remote mode, users need to start a set of graph services via `StartService` in `euler::service`. Refer to the header file `euler/service/graph_service.h`.
```cpp
void StartService(const ServiceConfig& conf);
```

```cpp
euler::service::StartService({
    {"directory", "/path/to/data"},
    {"loader_type", "hdfs"},
    {"hdfs_addr", "namenode.example.com"},
    {"hdfs_port", "9000"},
    {"shard_idx", "0"},
    {"shard_num", "1"},
    {"zk_addr", "zk.example.com:2181"},
    {"zk_path", "/path/for/euler"},
    {"global_sampler_type", "node"},
    {"graph_type", "compact"}
});
```
`StartService` accepts the following configuration items:
Key | Description |
---|---|
directory | Path of graph data |
loader_type | Type of file system: local or hdfs |
hdfs_addr | HDFS Namenode address |
hdfs_port | HDFS Namenode port |
shard_idx | Index of the shard |
shard_num | Number of shards |
zk_addr | ZooKeeper address, ip:port, used to publish meta information of the server |
zk_path | ZooKeeper path, used to publish meta information of the server |
global_sampler_type | Type of global sampler: all / node / edge / none |
graph_type | Type of graph engine: compact or fast |
server_thread_num | Number of RPC service threads, defaults to the number of CPU cores |
When the Euler service is started, the entire graph is divided into multiple shards, and each shard can have multiple Euler service instances.
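
Because each service instance is started with its own `shard_idx`, a sharded deployment needs one configuration per shard. The sketch below generates those configurations as plain string maps. `ConfigMap` and `MakeShardConfigs` are hypothetical and not part of Euler. The sketch assumes shard indices run from 0 to `shard_num - 1`, which is consistent with the single-shard example above but not stated on this page, and it assumes `ServiceConfig` can be built from such key/value pairs, as the brace-initialized call above suggests.

```cpp
#include <map>
#include <string>
#include <vector>

using ConfigMap = std::map<std::string, std::string>;

// Hypothetical helper: copy the shared settings and stamp each copy with
// its own shard index. Assumes indices are 0-based.
std::vector<ConfigMap> MakeShardConfigs(const ConfigMap &base, int shard_num) {
  std::vector<ConfigMap> configs;
  for (int idx = 0; idx < shard_num; ++idx) {
    ConfigMap config = base;
    config["shard_idx"] = std::to_string(idx);
    config["shard_num"] = std::to_string(shard_num);
    configs.push_back(config);
  }
  return configs;
}
```

Each resulting map would then be handed to `StartService` in its own process, with the remaining keys (`directory`, `zk_addr`, `zk_path`, and so on) shared across shards.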