Spark Shell

Jonathan Janetzki edited this page Jul 26, 2017 · 1 revision

This page explains how to use the Spark Shell, using the example of counting the elements in a Cassandra table.

1. Preparation

Start spark-shell with your packaged jar on the classpath. For example:

spark-shell --jars jars/DataLakeImport-assembly-1.0.jar
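If your Cassandra node does not run on localhost, the connection host can also be set at startup via the connector's `spark.cassandra.connection.host` property. A sketch (the host address and jar path below are placeholders for your own setup):

```shell
# Placeholder host and jar name; adjust to your environment.
spark-shell \
  --jars jars/DataLakeImport-assembly-1.0.jar \
  --conf spark.cassandra.connection.host=127.0.0.1
```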

2. Import your package

import DataLake._

Your case classes are now available in the shell.
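The exact fields of `Subject` depend on the DataLakeImport jar; a minimal sketch of what such a case class might look like (the field names here are illustrative assumptions, not the actual schema):

```scala
// Hypothetical sketch: the real Subject case class lives in the
// DataLakeImport jar; these fields are assumptions for illustration.
case class Subject(
  id: java.util.UUID,
  name: String,
  properties: Map[String, List[String]] = Map()
)

// The connector maps Cassandra columns to case class fields by name.
val s = Subject(java.util.UUID.randomUUID(), "Example")
println(s.name)
```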

3. Import libraries, especially the DataStax Spark Cassandra connector

import com.datastax.spark.connector._

4. Query the table you are interested in

val subjects = sc.cassandraTable[Subject]("datalake","subject")
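Instead of loading whole rows, the connector can also push a projection and filter down to Cassandra via its `select` and `where` methods. A sketch, assuming the same keyspace and table as above (the column name and filter value are assumptions and require an appropriately indexed column):

```scala
// Sketch: requires a running spark-shell with the connector on the classpath
// and a reachable Cassandra cluster. The column "name" and the filter value
// are illustrative assumptions.
import com.datastax.spark.connector._

val names = sc.cassandraTable("datalake", "subject")
  .select("name")                 // fetch only this column
  .where("name = ?", "Example")   // filter server-side
```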

5. Count the elements within

subjects.count()

6. Et voilà

res0: Long = 859390
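Beyond counting, the returned RDD supports the usual Spark actions and transformations. A short sketch, assuming the spark-shell session from the steps above (the `name` field is an assumption about the `Subject` case class):

```scala
// Sketch: assumes the running spark-shell session set up in steps 1-4.
val subjects = sc.cassandraTable[Subject]("datalake", "subject")

subjects.take(5).foreach(println)                     // peek at a few rows
val named = subjects.filter(_.name != null).count()   // count rows with a name set
```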