- Generating events to an Azure Event Hub with a Kafka-enabled endpoint. The Apache Kafka connectors for Structured Streaming are packaged in the Databricks Runtime.
- Reading the generated events with the Kafka libraries. Sample notebook that reads data from an Event Hub with a Kafka-enabled endpoint and writes it to an Azure Data Lake Store, serialized as JSON and partitioned by ingest date.
- Reading the generated events with the Azure Event Hubs libraries.
- Best practice is to archive incoming events by enabling Event Hubs Capture on the Event Hub. Events are captured in Azure Blob Storage or Azure Data Lake Store in Avro format. This notebook demonstrates how to read the captured events.
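A minimal sketch of the consumer options needed to read a Kafka-enabled Event Hub with Structured Streaming; the namespace, Event Hub name, and connection string below are placeholders, not values from this repository. The Event Hubs Kafka endpoint listens on port 9093 and authenticates via SASL_SSL/PLAIN with `$ConnectionString` as the user name; on Databricks Runtime the bundled Kafka classes are shaded, hence the `kafkashaded.` prefix in the JAAS config.

```python
# Placeholder values -- substitute your own namespace, hub, and secret.
EH_NAMESPACE = "my-namespace"   # hypothetical Event Hubs namespace
EH_NAME = "my-eventhub"         # hypothetical Event Hub (used as the Kafka topic)
CONNECTION_STRING = "Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=..."

# Kafka options for a Kafka-enabled Event Hub endpoint:
kafka_options = {
    "kafka.bootstrap.servers": f"{EH_NAMESPACE}.servicebus.windows.net:9093",
    "kafka.security.protocol": "SASL_SSL",
    "kafka.sasl.mechanism": "PLAIN",
    "kafka.sasl.jaas.config": (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="$ConnectionString" password="{CONNECTION_STRING}";'
    ),
    "subscribe": EH_NAME,
}
```

These options would then be passed as `spark.readStream.format("kafka").options(**kafka_options).load()`; the same bootstrap server and JAAS settings apply when producing events with a `writeStream` to the same endpoint.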
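To locate the captured files, note that Event Hubs Capture by default names blobs as `{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}`. A small hypothetical helper (the function and path names are illustrative, not from the notebooks) that builds a wildcard path for one day's captures could look like:

```python
from datetime import datetime, timezone

def capture_glob(base, namespace, eventhub, day):
    # Wildcards cover PartitionId, Hour, Minute, and the Second-named
    # Avro file, following the default Capture naming convention.
    return f"{base}/{namespace}/{eventhub}/*/{day:%Y/%m/%d}/*/*/*.avro"

path = capture_glob("/mnt/capture", "my-namespace", "my-eventhub",
                    datetime(2019, 5, 1, tzinfo=timezone.utc))
# -> "/mnt/capture/my-namespace/my-eventhub/*/2019/05/01/*/*/*.avro"
```

Spark can then load these files with `spark.read.format("avro").load(path)`; the event payload sits in the binary `Body` column of the Capture schema and can be cast to string for JSON payloads.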
The driver notebook reads all tables in the given schema and triggers the copy notebook, which copies each Spark SQL table to the Azure SQL DB.
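The driver pattern above can be sketched as a plain loop. Here `list_tables` and `run_copy` are injected stand-ins so the pattern can be shown without a cluster; on Databricks they would be `spark.catalog.listTables(schema)` and `dbutils.notebook.run(...)` respectively (the notebook name and argument keys below are hypothetical):

```python
def copy_schema(schema, list_tables, run_copy):
    """Trigger the copy notebook once per table in the schema."""
    results = {}
    for table in list_tables(schema):
        # On Databricks this call would be e.g.:
        # dbutils.notebook.run("copy-notebook", 3600,
        #                      {"schema": schema, "table": table})
        results[table] = run_copy({"schema": schema, "table": table})
    return results

# Local illustration with injected stand-ins:
out = copy_schema("sales",
                  list_tables=lambda s: ["orders", "items"],
                  run_copy=lambda args: f"copied {args['table']}")
# -> {"orders": "copied orders", "items": "copied items"}
```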
Demo notebooks to read data from an Azure Storage container added as an additional custom endpoint to Azure IoT Hub:
In this example the blob file name format is configured as:
input/simdev/ingestdate={YYYY}-{MM}-{DD}/{HH}{mm}{iothub}{partition}.avro
Important:
- Use the file extension `.avro`.
- Think about how to partition the data. In this example it is partitioned daily; if you want to read data hourly, define partitions with a finer granularity.
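To see where a message batch lands, the token substitution IoT Hub performs on the file name format above can be mimicked with a small hypothetical helper (IoT Hub does this substitution itself; the function exists only for illustration):

```python
from datetime import datetime, timezone

def blob_path(fmt, when, iothub, partition):
    # Mimics IoT Hub's token substitution for the blob name format.
    # Note {MM} (month) and {mm} (minute) differ only in case.
    return (fmt.replace("{YYYY}", f"{when:%Y}")
               .replace("{MM}", f"{when:%m}")
               .replace("{DD}", f"{when:%d}")
               .replace("{HH}", f"{when:%H}")
               .replace("{mm}", f"{when:%M}")
               .replace("{iothub}", iothub)
               .replace("{partition}", str(partition)))

fmt = "input/simdev/ingestdate={YYYY}-{MM}-{DD}/{HH}{mm}{iothub}{partition}.avro"
path = blob_path(fmt, datetime(2019, 5, 1, 13, 30, tzinfo=timezone.utc), "myhub", 0)
# -> "input/simdev/ingestdate=2019-05-01/1330myhub0.avro"
```

The `ingestdate={YYYY}-{MM}-{DD}` folder is what makes the daily partitioning visible to Spark as a partition column when reading the container.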
Configure a route to the new custom endpoint for the device messages. Leave the routing query empty; it defaults to true, so all messages are routed to the endpoint.
Now all new messages are streamed to the new endpoint. If you mount the storage container in Azure Databricks, the data can be accessed directly under the mount point.
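Mounting the container is a one-time configuration step; a sketch assuming a Blob Storage container, with the container, account, secret scope, and mount point names as placeholders (runs only on Databricks, since it uses `dbutils`):

```python
# Placeholder names throughout -- substitute your own container/account/scope.
dbutils.fs.mount(
    source="wasbs://iot-events@mystorageaccount.blob.core.windows.net",
    mount_point="/mnt/iot-events",
    extra_configs={
        "fs.azure.account.key.mystorageaccount.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-key"),
    },
)

# Routed messages are then readable directly under the mount point, e.g.:
# spark.read.format("avro").load("/mnt/iot-events/input/simdev/ingestdate=2019-05-01/")
```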
Sample Notebooks can be found here: