Skip to content

Commit

Permalink
Merge pull request #468 from apache/kamir-patch-2
Browse files Browse the repository at this point in the history
Kamir patch 2
  • Loading branch information
2pk03 authored Sep 12, 2024
2 parents a2f5ed7 + b95e18c commit 886938e
Show file tree
Hide file tree
Showing 24 changed files with 1,097 additions and 24 deletions.
3 changes: 2 additions & 1 deletion bin/wayang-submit
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@

CLASS=$1


if [ -z "${CLASS}" ]; then
echo "Target Class for execution was not provided"
exit 1
Expand Down Expand Up @@ -120,5 +119,7 @@ do
ARGS="$ARGS \"${arg}\""
done

WAYANG_CLASSPATH="${WAYANG_CLASSPATH}:${WAYANG_APP_HOME}"

eval "$RUNNER $FLAGS -cp "${WAYANG_CLASSPATH}" $CLASS ${ARGS}"

41 changes: 41 additions & 0 deletions env_template_osx.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
export BOOTSTRAP_SERVER= ...
export CLUSTER_API_KEY= ...
export CLUSTER_API_SECRET= ...
export SR_ENDPOINT= ...
export SR_API_KEY= ...
export SR_API_SECRET= ...
export SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO=" ... : .... "
export SCHEMA_REGISTRY_URL="https://.... "

export SPARK_HOME= ...
export HADOOP_HOME= ...
export PATH=$PATH:$HADOOP_HOME/bin
export WAYANG_VERSION= ...
export WAYANG_HOME= ...
export WAYANG_APP_HOME= ...

echo "Hadoop home : $HADOOP_HOME"
echo "Spark home : $SPARK_HOME"
echo "Wayang home : $WAYANG_HOME"
echo "Wayang app : $WAYANG_APP_HOME"
echo "Wayang version : $WAYANG_VERSION"


7 changes: 4 additions & 3 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1339,14 +1339,15 @@
<modules>
<module>wayang-commons</module>
<module>wayang-platforms</module>
<module>wayang-tests-integration</module>
<module>wayang-api</module>
<module>wayang-profiler</module>
<module>wayang-plugins</module>
<module>wayang-resources</module>
<module>wayang-benchmark</module>
<module>wayang-assembly</module>
<module>wayang-ml4all</module>
<!-- <module>wayang-docs</module> -->
<module>wayang-applications</module>
<module>wayang-benchmark</module>
<module>wayang-tests-integration</module>
<!--module>wayang-docs</module-->
</modules>
</project>
Original file line number Diff line number Diff line change
Expand Up @@ -966,7 +966,8 @@ class DataQuanta[Out: ClassTag](val operator: ElementaryOperator, outputIndex: I
topicName,
new TransformationDescriptor(formatterUdf, basicDataUnitType[Out], basicDataUnitType[String], udfLoad)
)
sink.setName(s"Write to KafkaTopic $topicName")
sink.setName(s"*#-> Write to KafkaTopic $topicName")
println(s"*#-> Write to KafkaTopic $topicName")
this.connectTo(sink, 0)

// Do the execution.
Expand All @@ -991,6 +992,7 @@ class DataQuanta[Out: ClassTag](val operator: ElementaryOperator, outputIndex: I
new TransformationDescriptor(formatterUdf, basicDataUnitType[Out], basicDataUnitType[String], udfLoad)
)
sink.setName(s"Write to $url")

this.connectTo(sink, 0)

// Do the execution.
Expand Down
60 changes: 60 additions & 0 deletions wayang-applications/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# Wayang Applications

This module provides some example applications for using Apache Wayang in industrie specific scenarios.

## Example 1
Our traditional word-count example for Kafka topics is provided in the script:

```bash
run_wordcount_kafka.sh
```

This script needs the configuration files:

```bash
source .env.sh
source env.demo1.sh
```

Furthermore, the cluster properties are stored in the _default.properties_ file in the module with Kafka-Source and Kafka-Sink components.

**TODO:** We will improve this, by making the path to the Kafka client properties configurable soon.


## Prerequisites

The following scripts use Apache Kafka topics as source and sink:

- run_wordcount_kafka.sh

In order to make the demo working, you need a proper cluster setup.
Over time, we aim on a robust and reusable DEMO environment.
For the beginning, we use a Confluent cloud cluster, and its specific CLI tool to setup and teardown the topics.

Later on, an improved solution will follow.

For now you need the following tools installed, in addition to the Wayang, Spark, Hadoop libraries:

- Confluent CLI tool.
- jq

In OSX, both can be installed using homebrew. The installation and setup process for other environment are different.
```bash
brew install confluentinc/tap/cli
brew install jq
```

## Configuration

### Application environment
The file named _.env.sh_ is ignored by git, hence it is the place
for your personal configuration including credentials and cluster coordinates.
An example is given in _env_template.sh_.

### DEMO Setup
The file _env.demo1.sh_ contains additional properties, need in a particlar demo application.

In this file we will never see cluster or user specific details, only properties which are specific to the
particular application are listed here.


35 changes: 35 additions & 0 deletions wayang-applications/bin/bootstrap_ccloud_kafka_topics.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

source .env.sh
source env.demo1.sh

confluent kafka topic delete $topic_l1_a --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l1_b --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l1_c --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l2_a --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l2_b --cluster $DEMO1_CLUSTER1

confluent kafka topic create $topic_l1_a --cluster $DEMO1_CLUSTER1
confluent kafka topic create $topic_l1_b --cluster $DEMO1_CLUSTER1
confluent kafka topic create $topic_l1_c --cluster $DEMO1_CLUSTER1
confluent kafka topic create $topic_l2_a --cluster $DEMO1_CLUSTER1
confluent kafka topic create $topic_l2_b --cluster $DEMO1_CLUSTER1

confluent kafka topic list --cluster $DEMO1_CLUSTER1

curl --silent -X GET -u $SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO $SCHEMA_REGISTRY_URL/subjects | jq .
27 changes: 27 additions & 0 deletions wayang-applications/bin/cleanup_ccloud_kafka_topics.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

source .env.sh
source env.demo1.sh

confluent kafka topic delete $topic_l1_a --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l1_b --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l1_c --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l2_a --cluster $DEMO1_CLUSTER1
confluent kafka topic delete $topic_l2_b --cluster $DEMO1_CLUSTER1

confluent kafka topic list --cluster $DEMO1_CLUSTER1
24 changes: 24 additions & 0 deletions wayang-applications/bin/env.demo1.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# topics for demo1
export topic_l1_a=region_emea_counts
export topic_l1_b=region_apac_counts
export topic_l1_c=region_uswest_counts
export topic_l2_a=global_contribution
export topic_l2_b=global_averages
38 changes: 38 additions & 0 deletions wayang-applications/bin/env_template.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

export JAVA_HOME=...
export SPARK_HOME=...
export HADOOP_HOME=...
export PATH=$PATH:$HADOOP_HOME/bin
export WAYANG_HOME=...
export WAYANG_APP_HOME=...

# properties of brokers and schema registry of Ccloud cluster for demo 1
export BOOTSTRAP_SERVER=
export CLUSTER_API_KEY=
export CLUSTER_API_SECRET=
export SR_ENDPOINT=
export SR_API_KEY=
export SR_API_SECRET=

export SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO="...:..."
export SCHEMA_REGISTRY_URL="..."

# cluster-id of Ccloud cluster...
export DEMO1_CLUSTER1=...
34 changes: 34 additions & 0 deletions wayang-applications/bin/run_wordcount.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

######
# to adjust variables to your own environment, please configure them in env.sh
#
source .env.sh

cd ../..

mvn clean compile package install -pl :wayang-assembly -Pdistribution -DskipTests

cd wayang-applications

mvn compile package install -DskipTests

cd ..

bin/wayang-submit org.apache.wayang.applications.WordCount java file://$(pwd)/wayang-applications/data/case-study/DATA_REPO_001/README.md
35 changes: 35 additions & 0 deletions wayang-applications/bin/run_wordcount_kafka.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

######
# to adjust variables to your own environment, please configure them in env.sh
#
source .env.sh
source env.demo1.sh

cd ../..

mvn clean compile package install -pl :wayang-assembly -Pdistribution -DskipTests

cd wayang-applications

mvn compile package install -DskipTests

cd ..

bin/wayang-submit org.apache.wayang.applications.WordCountOnKafkaTopic
Loading

0 comments on commit 886938e

Please sign in to comment.