WARNING! This package is no longer maintained or supported. Please transition to the currently maintained Python SDK for Pachyderm: pachyderm-sdk
Official Python Pachyderm client.
This library provides the autogenerated gRPC/protobuf code for Pachyderm, along with a higher-level, more Pythonic Client class.
See the API docs.
pip install python-pachyderm
Here's an example that creates a repo and adds a file:
import python_pachyderm
# Connects to a pachyderm cluster on localhost:30650.
# For other options, see the API docs.
client = python_pachyderm.Client()
# Create a pachyderm repo called `test`
client.create_repo("test")
# Create a file in `(repo="test", branch="master")` at `/dir_a/data.txt`
# Similar to `pachctl put file test@master:/dir_a/data.txt`
with client.commit("test", "master") as commit:
    client.put_file_bytes(commit, "/dir_a/data.txt", b"DATA")
# Get back the file
f = client.get_file(("test", "master"), "/dir_a/data.txt")
print(f.read()) # >>> b"DATA"
How to load a CSV file into a pandas DataFrame:
import pandas as pd
f = client.get_file(("my_repo", "my_branch"), "/path_to/my_data.csv")
df = pd.read_csv(f)
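If pandas is not installed, the same bytes can be parsed with the standard library instead. The sketch below is not part of this library: `rows_from_csv_bytes` is a hypothetical helper, and the `sample` bytes stand in for what `f.read()` returns (the first example above shows `read()` returning `bytes`). It also assumes the file is UTF-8 encoded and has a header row.

```python
import csv
import io
from typing import Dict, List


def rows_from_csv_bytes(data: bytes, encoding: str = "utf-8") -> List[Dict[str, str]]:
    """Parse CSV bytes into a list of dicts keyed by the header row.

    Hypothetical helper for illustration; not provided by python-pachyderm.
    """
    text = data.decode(encoding)
    # newline="" leaves line-ending handling to the csv module.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


# Stand-in for `client.get_file(...).read()`; no cluster is needed to run this.
sample = b"name,count\nalpha,1\nbeta,2\n"
rows = rows_from_csv_bytes(sample)
print(rows[0]["name"], rows[1]["count"])  # alpha 2
```

Against a real cluster, you would presumably pass `f.read()` in place of `sample`. Note that every value comes back as a string, whereas pandas infers column types.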
For more sophisticated examples, see the examples directory.
Prior to python-pachyderm 2.0, this library's versioning synced with pachyderm's core versioning; e.g. version 1.8.5 of this library synced with 1.8.5 of pachyderm core. python-pachyderm 2.0 onwards uses semver instead, so versions are not tied to pachyderm core. This was done for two reasons:
- Sometimes this library makes breaking or backwards-incompatible changes, which aren't properly conveyed by revision changes.
- Pachyderm core is stable enough that most features of this library will work for disparate versions of pachyderm clusters. To help ensure this, this library's CI tests against several versions of pachyderm core.
However, if for whatever reason you need to know which version of pachyderm core a version of python-pachyderm was built with, consult CHANGELOG.md. As a broad rule of thumb, we recommend working with the latest version of both pachyderm core and python-pachyderm where possible.
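In practice, semver means that a change in the major version number is the signal to check for breaking changes before upgrading. A minimal sketch of that check follows. The helpers `parse_version` and `may_break` are hypothetical and not part of this library, the version strings are illustrative only, and the code assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release suffix.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'MAJOR.MINOR.PATCH' string into integers."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def may_break(installed: str, candidate: str) -> bool:
    """Return True when the major versions differ.

    Under semver, a major version bump is how incompatible API
    changes are announced; minor and patch bumps should be compatible.
    """
    return parse_version(installed)[0] != parse_version(candidate)[0]


# Illustrative version numbers, not a statement about actual releases.
print(may_break("6.2.0", "7.0.0"))  # True
print(may_break("7.1.0", "7.3.2"))  # False
```

A result of `True` is a prompt to read the changelog before upgrading, not a guarantee that your code will break.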
This driver is co-maintained by Pachyderm and the community. If you're looking to contribute to the project, this is a fantastic place to get involved. Take a look at the contributing guide for more info (including testing instructions).