Skip to content

adamhadani/HBasta

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HBasta
A Simple wrapper around Thrift HBase API


Introduction
------------
HBasta aims to do three things:

* Ease the pain of initially setting up all requirements to actually use the
  hbase Thrift python API (e.g thrift, thrift python bindings, generated hbase bindings)

* Provide a thin wrapper on top of the generated hbase thrift API
  that takes care of some of the boilerplate needed to setup a connection,
  as well as dealing with most of the Thrift types, implicitly converting
  them to native python structures (e.g dict, list, tuple) in the most natural
  way possible.

* Provide some simple command-line tools to streamline working
  with hbase. In particular, easy ways of creating data dumps/views
  by scanning different ranges in tables

It was designed for lazy people (like me) in mind :)

This software is distributed under the Apache 2.0 license.

Prerequisites
-------------
* setuptools

  If you've done any serious python, you probably have setuptools installed. 
  Easy way to check would be to see if you have the 'easy_install' executable in your path.
  Otherwise, It can easily be installed with most linux package manages. on Ubuntu:

	apt-get install python-setuptools

* Thrift

  Thrift can be typically installed e.g by downloading Thrift from project homepage (http://thrift.apache.org/download/)
  and using the Makefile:

        tar xzf thrift-0.5.0.tar.gz
	cd thrift-0.5.0
	./configure && make
	sudo make install

* Thrift python bindings 

  These should be installed automatically when using setuptools (see below on installation instructions)

* hbase generated library

  Run the included 'generate_hbase_py.sh' to generate the hbase python module.
  Afterwards, it should be automatically installed when using setuptools (see below on installation instructions).

  NOTE: This library will by default reflect latest HBase Thrift API from the HBase trunk.
  If you'd like to bundle a different hbase generated API, simply generate it yourself from the appropriate .thrift
  file and then place all the generated files in a folder called 'hbase', under the hbasta project root before installing.


Installing
----------
Installation should be straightforward:

1. As mentioned above, use 'generate_hbase_py.sh' to get latest Hbase.thrift from Hbase trunk and generate the
   Hbase python bindings on the fly:

	./generate_hbase_py.sh

2. Now you can install with setuptools easily:

	python setup.py install


Usage
------
The HBasta class exposes a simple programmatic interface to most of the common HBase operations:

* Create a table

>>> from hbasta import HBasta
>>> client = HBasta('localhost', 9999)
>>> client.create_table('tablename')


* Get a whole / partial row

>>> from hbasta import HBasta
>>> client = HBasta('localhost', 9999)
>>> row = client.get_row('mytable', 'row_key')
>>> print row
{'colfamily:col1': col1_val, ...}
>>> partial_row = client.get_row('mytable', 'row_key', colspec=['colfamily:col2'])
>>> print partial_row
{'colfamily:col2': col2_val}


* Add a new row

>>> client.add_row('mytable', 'row_key', cols = { 'myfamily:mycolumn' => 'mydata' })


* Use a scanner

>>> from hbasta import HBasta
>>> client = HBasta('localhost', 9999)
>>> scanner_id = client.scanner_open('mytable', 'row_0_id', colspec=('data',))
>>> for row, cols in client.scanner_get_list(scanner_id, num_rows=100):
...
>>> client.scanner_close(scanner_id)


For usage instructions on how to use the command-line utilities, use the --help flag.
Currently supported command-line tools:

* hbase-prefix-scan - Run a prefix-based scanner on a table



	

About

Simple wrapper around HBase/Thrift Python client

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published