GitHub - vardancse/incubator-zeppelin: Mirror of Apache Zeppelin (Incubating)

Zeppelin

Documentation: User Guide
Mailing List: User and Dev mailing list
Continuous Integration:
Contributing: Contribution Guide
License: Apache 2.0

Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala, and more.

Core features:

Web-based notebook-style editor
Built-in Apache Spark support

To learn more about Zeppelin, visit our website: http://zeppelin.incubator.apache.org

Requirements

Java 1.7
Tested on Mac OS X, Ubuntu 14.X, CentOS 6.X
Maven (if you want to build from the source code)
Node.js Package Manager (npm)
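
As a quick sanity check before building, the sketch below reports which required tools are missing from your PATH. The function name `missing_tools` is hypothetical (it is not part of Zeppelin), and the command names `java`, `mvn`, `npm`, and `git` are assumed to match how the tools are installed on your system.

```shell
# Report which of the given commands are missing from PATH.
# Prints one missing command name per line; returns non-zero if any are missing.
missing_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools named in the requirements list.
if missing_tools java mvn npm git; then
  echo "All build prerequisites found."
else
  echo "Install the tools listed above before building." >&2
fi
```

This only checks that the commands exist; it does not verify versions (for example, that Java is 1.7).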

Getting Started

Before Build

If you don't have the requirements installed, install them first. (The installation method varies by environment; the example below is for Ubuntu.)

sudo apt-get update
sudo apt-get install openjdk-7-jdk
sudo apt-get install git
sudo apt-get install maven
sudo apt-get install npm

Build

To build Zeppelin from source, first clone this repository, then run:

mvn clean package -DskipTests

Build with specific Spark version

Spark 1.4.x

mvn clean package -Pspark-1.4 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.3.x

mvn clean package -Pspark-1.3 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.2.x

mvn clean package -Pspark-1.2 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.1.x

mvn clean package -Pspark-1.1 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

Spark 1.5.x

mvn clean package -Pspark-1.5 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests

CDH 5.X

mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests

Yarn (Hadoop 2.7.x)

mvn clean package -Pspark-1.4 -Dspark.version=1.4.1 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests

Yarn (Hadoop 2.6.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests

Yarn (Hadoop 2.4.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.4.0 -Phadoop-2.4 -Pyarn -DskipTests

Yarn (Hadoop 2.3.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.3.0 -Phadoop-2.3 -Pyarn -DskipTests

Yarn (Hadoop 2.2.x)

mvn clean package -Pspark-1.1 -Dhadoop.version=2.2.0 -Phadoop-2.2 -Pyarn -DskipTests

Ignite (1.1.0-incubating and later)

mvn clean package -Dignite.version=1.1.0-incubating -DskipTests
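
The Spark and Yarn commands above share one pattern: a Spark profile, a Hadoop version, a Hadoop profile, and optionally `-Pyarn`. The sketch below composes that command line from its parts. The helper `zeppelin_build_cmd` is hypothetical and only prints the command; it takes the Hadoop profile explicitly because the profile does not always match the Hadoop version (see the CDH and Hadoop 2.7.x examples).

```shell
# Print the mvn command for a Spark profile, Hadoop version, and Hadoop profile.
# Usage: zeppelin_build_cmd <spark-profile> <hadoop-version> <hadoop-profile> [yarn]
zeppelin_build_cmd() {
  local spark_profile="$1"
  local hadoop_version="$2"
  local hadoop_profile="$3"
  local yarn="${4:-}"
  local cmd="mvn clean package -Pspark-${spark_profile}"
  cmd="$cmd -Dhadoop.version=${hadoop_version} -Phadoop-${hadoop_profile}"
  if [ "$yarn" = "yarn" ]; then
    cmd="$cmd -Pyarn"
  fi
  echo "$cmd -DskipTests"
}

# Reproduces the "Spark 1.4.x" and "Yarn (Hadoop 2.6.x)" commands above.
zeppelin_build_cmd 1.4 2.2.0 2.2
zeppelin_build_cmd 1.1 2.6.0 2.6 yarn
```

To run a composed command rather than print it, you could pass the output to `sh -c` from the source root.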

Configure

To configure Zeppelin options (such as the port number), edit the following files:

./conf/zeppelin-env.sh
./conf/zeppelin-site.xml

(You can copy ./conf/zeppelin-env.sh.template into ./conf/zeppelin-env.sh. Same for zeppelin-site.xml.)
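
The copy step can be scripted so that it never overwrites a file you have already edited. This is a minimal sketch; the function name `init_conf` is hypothetical, and it assumes each template sits next to its target with a `.template` suffix, as described above.

```shell
# Copy <name>.template to <name> inside a conf directory unless <name> exists.
# Usage: init_conf <conf-dir> <file-name>...
init_conf() {
  local conf_dir="$1"
  shift
  local name
  for name in "$@"; do
    if [ -e "$conf_dir/$name" ]; then
      echo "keep   $conf_dir/$name"
    elif [ -f "$conf_dir/$name.template" ]; then
      cp "$conf_dir/$name.template" "$conf_dir/$name"
      echo "create $conf_dir/$name"
    else
      echo "skip   $conf_dir/$name (no template found)" >&2
    fi
  done
}

# When run from the Zeppelin source root, prepare both configuration files.
if [ -d ./conf ]; then
  init_conf ./conf zeppelin-env.sh zeppelin-site.xml
fi
```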

Setting SPARK_HOME and HADOOP_HOME

If SPARK_HOME and HADOOP_HOME are not set, Zeppelin uses the embedded Spark and Hadoop binaries selected by the mvn build options. To use a system-provided Spark and Hadoop instead, export SPARK_HOME and HADOOP_HOME in zeppelin-env.sh. This lets you use any supported version of Spark without rebuilding Zeppelin.

# ./conf/zeppelin-env.sh
export SPARK_HOME=...
export HADOOP_HOME=...
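
The fallback rule above can be checked from the shell before starting Zeppelin. The sketch below only inspects the environment; the function name `describe_spark` is hypothetical and is not part of Zeppelin's own scripts.

```shell
# Describe which Spark installation would be used, following the rule above:
# with SPARK_HOME unset, Zeppelin falls back to the binaries embedded at build time.
describe_spark() {
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "embedded"
  elif [ -d "$SPARK_HOME" ]; then
    echo "external: $SPARK_HOME"
  else
    echo "invalid: $SPARK_HOME is not a directory"
    return 1
  fi
}

describe_spark || echo "Fix SPARK_HOME in ./conf/zeppelin-env.sh" >&2
```

The same check could be repeated for HADOOP_HOME.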

External cluster configuration

Mesos

# ./conf/zeppelin-env.sh
export MASTER=mesos://...
# Set either spark.executor.uri ...
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.uri=/path/to/spark-*.tgz"
# ... or SPARK_HOME:
# export SPARK_HOME="/path/to/spark_home"
export MESOS_NATIVE_LIBRARY=/path/to/libmesos.so

If you set SPARK_HOME, deploy the Spark binary to the same location on all worker nodes. If you set spark.executor.uri, every worker must be able to read that file on its node.

Yarn

# ./conf/zeppelin-env.sh
export SPARK_HOME=/path/to/spark_dir

Run

./bin/zeppelin-daemon.sh start

Then open localhost:8080 in your browser.

For configuration details, check the ./conf subdirectory.
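
The daemon may take a moment to start listening. The sketch below polls the URL until it responds. It assumes `curl` is installed; the function name `wait_for_url` and the default of 30 attempts at 2-second intervals are illustrative choices, not Zeppelin settings.

```shell
# Poll a URL until it responds successfully or the attempts run out.
# Usage: wait_for_url <url> [attempts] [delay-seconds]
wait_for_url() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP error responses.
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (run from the Zeppelin root):
# ./bin/zeppelin-daemon.sh start && wait_for_url http://localhost:8080
```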

Package

To package the final distribution, run:

  mvn clean package -P build-distr

The archive is generated under the zeppelin-distribution/target directory.

Run end-to-end tests

Zeppelin comes with a set of end-to-end acceptance tests that drive a headless Selenium browser.

  # Assumes zeppelin-server is running on localhost:8080 (use -Durl=.. to override)
  mvn verify

  # Or let the build start and stop zeppelin-server from the packaged zeppelin-distribution/target
  mvn verify -P using-packaged-distr
