CPP API

Siran Yang edited this page Jun 4, 2019 · 1 revision

Euler provides C++ interfaces that do not depend on any particular machine learning framework, so it can be used as a standalone graph engine. Users access the graph engine through the Graph class in euler::client. Refer to the header file euler/client/graph.h.

#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <thread>
#include <tuple>
#include <vector>

#include "euler/client/graph.h"

int main() {
  euler::client::GraphConfig config;
  config.Add("mode", "Local");
  config.Add("directory", "."); // path of graph data
  auto graph = euler::client::Graph::NewGraph(config);

  graph->GetFullNeighbor(
      {0, 1}, {0, 1}, [](const euler::client::IDWeightPairVec &result) {
        for (const auto &neighbors : result) {
          for (const auto &tuple : neighbors) {
            euler::client::NodeID target;
            float weight;
            int32_t edge_type;
            std::tie(target, weight, edge_type) = tuple;
            std::cout << "(" << target << ", "
                             << weight << ", "
                             << edge_type << ") ";
          }
          std::cout << std::endl;
        }
      });
  // The call is asynchronous: wait briefly so the callback can run.
  std::this_thread::sleep_for(std::chrono::milliseconds(100));

  graph->GetNodeBinaryFeature(
      {1, 2}, {0}, [](const euler::client::BinaryFatureVec& result) {
        for (const auto &features : result) {
          for (const std::string &feature: features) {
            std::cout << feature;
          }
          std::cout << std::endl;
        }
      });
  // Wait again so the second callback can run before main() returns.
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
}

Run against the graph used in this example, the program above prints:

(1, 2, 0) (2, 4, 0)
(2, 3, 1)
a fruit 6 phone
a start s8 phone

Graph client initialization

Before using the graph, the user must create a Graph object, either from a GraphConfig or from a configuration file:

std::unique_ptr<Graph> NewGraph(const std::string& config_file);
std::unique_ptr<Graph> NewGraph(const GraphConfig& config);

The above example can also be initialized with the following configuration file:

mode=Local
directory=.
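
Each line of the configuration file is a key=value pair that mirrors one config.Add(key, value) call in the example. As a rough illustration of the format only, the sketch below reads such text into a map. ParseKeyValueConfig is a hypothetical helper, not part of Euler, and it assumes there is no whitespace around '=' and no comment syntax, since this page specifies neither.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not Euler's parser): turns "key=value" lines
// into a map. Lines without '=' are skipped.
std::map<std::string, std::string> ParseKeyValueConfig(
    const std::string& text) {
  std::map<std::string, std::string> config;
  std::istringstream stream(text);
  std::string line;
  while (std::getline(stream, line)) {
    const std::string::size_type pos = line.find('=');
    if (pos == std::string::npos) continue;
    // Everything before the first '=' is the key, the rest is the value.
    config[line.substr(0, pos)] = line.substr(pos + 1);
  }
  return config;
}
```

Calling ParseKeyValueConfig("mode=Local\ndirectory=.\n") yields the same two pairs that the example passes to GraphConfig::Add.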

There are two modes, Local and Remote. Local mode initializes a graph engine in the memory space of the current process; Remote mode accesses a set of graph engine services through RPC. Each mode has its own set of configuration keys:

| Key | Description |
| --- | --- |
| mode | Local |
| directory | path of graph data; only Unix is currently supported |
| load_type | fast / compact |

| Key | Description |
| --- | --- |
| mode | Remote |
| zk_server | ZooKeeper address (ip:port), used to obtain meta information about the servers |
| zk_path | ZooKeeper node, used to obtain meta information about the servers |
| num_retries | number of RPC retries; a non-positive value means infinite retries; defaults to 10 |
| num_channels_per_host | number of RPC channels per host; defaults to 1 |
| bad_host_cleanup_interval | detection interval for bad servers; defaults to 1 |
| bad_host_timeout | recovery interval for bad servers; defaults to 10 |

When using Remote mode, users need to [start a group of Euler services](#Graph service Initialization).

Data Type

| Type | Definition | Description |
| --- | --- | --- |
| NodeID | uint64_t | node id |
| EdgeID | tuple<NodeID, NodeID, int32_t> | edge id, represented as source, destination, and edge type |
| IDWeightPair | tuple<NodeID, float, int32_t> | neighbor information: neighbor id, edge weight, and edge type |

In this section and the next, tuple, vector, and string stand for std::tuple, std::vector, and std::string.
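
To make the tuple layouts concrete, the following sketch declares local aliases that mirror the table and unpacks an IDWeightPair in the same way as the example program. The aliases are redeclared here only so the snippet compiles on its own; real code would take them from euler/client/graph.h. FormatNeighbor and EdgeOf are hypothetical helpers introduced for illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <tuple>

// Local stand-ins mirroring the table above (assumption: real code uses
// the definitions from euler/client/graph.h instead).
using NodeID = std::uint64_t;
using EdgeID = std::tuple<NodeID, NodeID, std::int32_t>;
using IDWeightPair = std::tuple<NodeID, float, std::int32_t>;

// Formats a neighbor entry as "(id, weight, edge_type)".
std::string FormatNeighbor(const IDWeightPair& pair) {
  NodeID target;
  float weight;
  std::int32_t edge_type;
  std::tie(target, weight, edge_type) = pair;
  std::ostringstream out;
  out << "(" << target << ", " << weight << ", " << edge_type << ")";
  return out.str();
}

// Builds the EdgeID (source, destination, edge type) for the edge that
// leads from `source` to the given neighbor entry.
EdgeID EdgeOf(NodeID source, const IDWeightPair& pair) {
  return EdgeID(source, std::get<0>(pair), std::get<2>(pair));
}
```

An EdgeID built this way has the shape that the edge feature interfaces in the next section take as input.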

Data access interface

As shown in the example above, the C++ interfaces provided by Euler are all asynchronous. The last parameter of each method is a callback of the form std::function<void(const ResultType& result)>, which is called once the corresponding data has been received.
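
The example program waits for its callbacks with a fixed sleep. One alternative, sketched below under stated assumptions, is to have the callback fulfil a std::promise and block on the matching std::future. FakeAsyncGraph is a hypothetical stand-in that invokes the callback from another thread; it is not Euler's Graph class, and whether Euler invokes callbacks on a separate thread is an assumption of this sketch.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <vector>

using Result = std::vector<int>;
using Callback = std::function<void(const Result&)>;

// Hypothetical stand-in for an asynchronous graph client: each request
// delivers a fixed result to the callback from a worker thread.
class FakeAsyncGraph {
 public:
  ~FakeAsyncGraph() {
    for (std::thread& worker : workers_) worker.join();
  }
  void GetValues(Callback callback) {
    workers_.emplace_back([callback]() { callback(Result{1, 2, 3}); });
  }

 private:
  std::vector<std::thread> workers_;
};

// Issues the request and blocks until the callback has delivered a result.
Result GetValuesBlocking(FakeAsyncGraph* graph) {
  // The promise is shared so it stays alive for as long as the callback does.
  auto promise = std::make_shared<std::promise<Result>>();
  std::future<Result> future = promise->get_future();
  graph->GetValues(
      [promise](const Result& result) { promise->set_value(result); });
  return future.get();
}
```

Unlike a fixed sleep, future.get() returns as soon as the callback has run and never returns before it.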

All results are returned in row-major order. In vector<vector<IDWeightPair>>, the first dimension corresponds to the nodes in the request and the second to the neighbors of each node. In vector<vector<vector<T>>>, the first dimension corresponds to the nodes or edges in the request, the second to the requested properties (fids) of each node or edge, and the third to the values of each property.
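
The indexing order can be illustrated with a small sketch over a float feature result, where result[i][j][k] is value k of requested property j for request item i. FloatFeatureVec is declared locally here as an alias for the nested vector type, and both functions are hypothetical helpers written for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Local alias for the nested result type described above.
using FloatFeatureVec = std::vector<std::vector<std::vector<float>>>;

// Sums the values of property slot `feature` for request item `item`.
// at() throws std::out_of_range if either index is invalid.
float SumFeatureValues(const FloatFeatureVec& result, std::size_t item,
                       std::size_t feature) {
  float sum = 0.0f;
  for (float value : result.at(item).at(feature)) sum += value;
  return sum;
}

// Counts every value in the result, walking the three dimensions in
// order: request item, then property, then value.
std::size_t CountValues(const FloatFeatureVec& result) {
  std::size_t count = 0;
  for (const auto& item : result) {
    for (const auto& feature : item) count += feature.size();
  }
  return count;
}
```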

| Interface | Parameters | Result type | Description |
| --- | --- | --- | --- |
| SampleNode | int node_type, int count | vector<NodeID> | Sample nodes by type |
| SampleEdge | int edge_type, int count | vector<EdgeID> | Sample edges by type |
| SampleNeighbor | vector<NodeID> node_ids, vector<int> edge_types, int count | vector<vector<IDWeightPair>> | Sample outgoing edges of each node in the request, by edge type |
| GetTopKNeighbor | vector<NodeID> node_ids, vector<int> edge_types, int k | vector<vector<IDWeightPair>> | Get the k outgoing edges with the largest weights for each node in the request, by edge type |
| GetFullNeighbor | vector<NodeID> node_ids, vector<int> edge_types | vector<vector<IDWeightPair>> | Get the outgoing edges of each node in the request, by edge type |
| GetSortedFullNeighbor | vector<NodeID> node_ids, vector<int> edge_types | vector<vector<IDWeightPair>> | Get the outgoing edges of each node in the request, by edge type, sorted by dst_id |
| BiasedSampleNeighbor | vector<NodeID> node_ids, vector<NodeID> parent_node_ids, vector<int> edge_types, vector<int> parent_edge_types, int count, float p, float q | vector<vector<IDWeightPair>> | Biased sampling of outgoing edges, as used in node2vec |
| GetNodeFloat32Feature | vector<NodeID> node_ids, vector<int> fids | vector<vector<vector<float>>> (float feature) | Get the float features of the nodes in the request |
| GetNodeUint64Feature | vector<NodeID> node_ids, vector<int> fids | vector<vector<vector<uint64_t>>> (uint64 feature) | Get the uint64 features of the nodes in the request |
| GetNodeBinaryFeature | vector<NodeID> node_ids, vector<int> fids | vector<vector<string>> (binary feature) | Get the binary features of the nodes in the request |
| GetEdgeFloat32Feature | vector<EdgeID> edge_ids, vector<int> fids | vector<vector<vector<float>>> (float feature) | Get the float features of the edges in the request |
| GetEdgeUint64Feature | vector<EdgeID> edge_ids, vector<int> fids | vector<vector<vector<uint64_t>>> (uint64 feature) | Get the uint64 features of the edges in the request |
| GetEdgeBinaryFeature | vector<EdgeID> edge_ids, vector<int> fids | vector<vector<string>> (binary feature) | Get the binary features of the edges in the request |
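
This page does not explain the p and q parameters of BiasedSampleNeighbor beyond naming node2vec. As background, node2vec scales the weight of each candidate edge by 1/p when the candidate is the node the walk just came from, by 1 when the candidate is also a neighbor of that previous node, and by 1/q otherwise. The sketch below shows only that rule. Node2VecBias is a hypothetical function, not Euler's implementation, and how Euler maps parent_node_ids and parent_edge_types onto this rule is an assumption.

```cpp
#include <cstdint>
#include <unordered_set>

using NodeID = std::uint64_t;

// Returns the node2vec bias factor for moving to `candidate`, given the
// previous node of the walk (`parent`) and that node's neighbor set.
// The sampling probability is proportional to this factor multiplied
// by the weight of the candidate edge.
float Node2VecBias(NodeID parent,
                   const std::unordered_set<NodeID>& parent_neighbors,
                   NodeID candidate, float p, float q) {
  if (candidate == parent) return 1.0f / p;                 // step back
  if (parent_neighbors.count(candidate) > 0) return 1.0f;   // stay close
  return 1.0f / q;                                          // move away
}
```

Under this rule, a large p discourages immediately returning to the previous node, and a large q keeps the walk near it.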

Graph service initialization

When using the Remote mode, users need to start a set of graph services via StartService in euler::service. Refer to the header file euler/service/graph_service.h.

void StartService(const ServiceConfig& conf);

For example:

euler::service::StartService({
    {"directory", "/path/to/data"},
    {"loader_type", "hdfs"},
    {"hdfs_addr", "namenode.example.com"},
    {"hdfs_port", "9000"},
    {"shard_idx", "0"},
    {"shard_num", "1"},
    {"zk_addr", "zk.example.com:2181"},
    {"zk_path", "/path/for/euler"},
    {"global_sampler_type", "node"},
    {"graph_type", "compact"}
});

The configuration keys accepted by StartService are:

| Key | Description |
| --- | --- |
| directory | path of graph data |
| loader_type | type of file system: local or hdfs |
| hdfs_addr | HDFS NameNode address |
| hdfs_port | HDFS NameNode port |
| shard_idx | id of the shard |
| shard_num | number of shards |
| zk_addr | ZooKeeper address (ip:port), used to publish meta information about the server |
| zk_path | ZooKeeper path, used to publish meta information about the server |
| global_sampler_type | type of global sampler: all / node / edge / none |
| graph_type | type of graph engine: compact or fast |
| server_thread_num | number of RPC service threads; defaults to the number of CPU cores |

When the Euler service is started, the entire graph is divided into multiple shards, and each shard can have multiple Euler service instances.
