Merge pull request #27 from hpccsystems-solutions-lab/updates-to-empty-learnEcl-content

Updates to empty learn ecl content
bmcabrera authored Oct 27, 2023
2 parents 02cce72 + 95283d8 commit 8a9f267
Showing 35 changed files with 1,118 additions and 346 deletions.
32 changes: 16 additions & 16 deletions learnEcl/0100-introduction.md
@@ -3,31 +3,31 @@ title: Learn ECL
slug: introduction
---

# ECL
# WHAT IS ECL?

ECL - Enterprise Control Language is designed to handle and manipulate immense datasets which makes it a prefect language to solve big data problems. ECL can be used for both ETL (Extract, Transform, and Load) and querying data. ECL is a declarative language which allows processing big data without the need of programmer being involved with details and in-depth of imperative decisions.
ECL (Enterprise Control Language) is designed to handle and manipulate immense datasets, which makes it a perfect language for solving big data problems. ECL can be used for both ETL (Extract, Transform, and Load) and querying data. ECL is a declarative language, which allows big data to be processed without the programmer being involved in the details of imperative decisions.

## ECL vs SQL

ECL and SQL can both be used to query a relational database. Following tables displays similar features between ECL and SQL.
ECL and SQL can both be used to query a relational database. The following table lists corresponding features of ECL and SQL.

| SQL | ECL |
| --------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| Declarative Language | Declarative Language |
| Database Server | Thor Cluster or Roxie Cluster |
| A SQL Table | An ECL Logical File |
| A SQL Editor | VSCode Editor or ECL Cloud IDE |
| A SQL File | A ECL File |
| SQL | ECL |
| :- | :- |
| Declarative Language | Declarative Language |
| Database Server | Thor Cluster or Roxie Cluster |
| An SQL Table | An ECL Logical File |
| An SQL Editor | VSCode Editor or ECL Cloud IDE |
| An SQL File | An ECL File |
| Executing SQL means submitting the written SQL to the Database Server which in turn compiles it and executes it | Executing ECL means submitting the written ECL to a Thor or Roxie cluster which in turn compiles and executes it |
| SQL Execution History/Logs | ECL Workunits Database & ECL Watch Workunits View |
| SQL Execution History/Logs | ECL Workunits Database & ECL Watch Workunits View |
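
To make the comparison in the table concrete, here is a minimal sketch of one simple query in both languages. The record layout `PersonRec`, the inline dataset `People`, and all field names are hypothetical and exist only for illustration; the SQL statement appears as a comment.

```ecl
// Hypothetical record layout (assumption, not part of the original content)
PersonRec := RECORD
    STRING20  name;
    UNSIGNED1 age;
END;

// Hypothetical inline dataset standing in for a SQL table / ECL logical file
People := DATASET([{'Ann', 34}, {'Bob', 19}, {'Cy', 52}], PersonRec);

// SQL equivalent: SELECT * FROM People WHERE age > 21 ORDER BY name;
// Definition: a filter in parentheses, then a sort on the name field
Adults := SORT(People(age > 21), name);

// Action: submits the work and writes the result to the workunit
OUTPUT(Adults, NAMED('Adults'));
```

As in SQL, the code states which records are wanted and in what order; how the filter and sort are carried out across the cluster is left to the platform.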

## Language Highlights

- ECL is not case sensitive language, but it is recommended to use uppercase for reserved words
- ECL is not a case sensitive language, but it is recommended to use uppercase for reserved words
- White spaces are ignored, but it is strongly recommended to use white space for clarity and readability
- Declarative Programming Language, which means you specifies what needs to be done rather than how to do it
- ECL is a declarative programming language, which means you specify what needs to be done rather than how to do it
- Source-to-source compiler
- ECL code translated to C++ that is compiled to shared libraries and executed within a custom frame-work
- ECL code gets translated to C++ that is compiled to shared libraries and executed within a custom framework

Please refer [ECL syntax](./syntax.md) to learn about ECL standards. Or,
[jump right into coding](./output.md) and skip all the introductions.
Please refer to [ECL syntax](./0200-syntax.md) to learn about ECL standards. Or,
[jump right into coding](./0500-output.md) and skip all the introductions.
208 changes: 123 additions & 85 deletions learnEcl/0200-syntax.md
@@ -5,122 +5,160 @@ slug: ecl-syntax

# ECL SYNTAX

ECL syntax is characterized by its English-like readability and declarative nature. Developers use ECL to define data transformation workflows by specifying operations on datasets. ECL scripts consist of modules and functions that manipulate data through a sequence of transformations, including filters, joins, sorts, and aggregations. ECL's unique feature is its ability to optimize and parallelize these transformations across distributed computing resources, making it well-suited for big data processing tasks.
ECL syntax is characterized by its English-like readability and declarative nature. Developers use ECL to define data transformation workflows by specifying operations on datasets. ECL scripts consist of modules and functions that manipulate data through a sequence of transformations, including filters, joins, sorts, and aggregations. ECL's unique feature is its ability to optimize and parallelize these transformations across distributed computing resources, making it well-suited for big data processing tasks.

## Definition
- ECL is not case-sensitive but usually reserved keywords and built-in functions are written in ALL CAPS
- White space is ignored, allowing formatting for readability as needed

- Definition operator is :=
- Terminator for statement is ;
## Definitions

<EclCode
ECL definitions are the basic building blocks of ECL. A definition specifies what needs to be done rather than how it is to be done.

code= "Val1 := 12;
Val2 := 65;
- The Definition operator (`:=` read as "is defined as") defines an expression
- Definitions must be explicitly terminated with a semi-colon (`;`)

Result := Val1 + Val2;"
**Syntax**
<pre>
<EclCode
code="
[attrib_type] attrib_name := value
">
</EclCode>
</pre>

> </EclCode>
| Value | Definition |
| :- | :- |
| attrib_type | Optional. Compiler can infer it from definition. |
| attrib_name | The name by which the definition will be invoked. |
| value | Assigned value to the definition. |

## Comments

| Comment Type | Symbol | Example |
| ------------ | ------- | ------------------------------------- |
| Single Line | `//` | `// This is a single-line comment.` |
| Multi Line | `/* */` | `/* This is a multi-line comment. */` |
<pre>
<EclCode
code="
// attrib_name Val1 is defined and value 12 is assigned to it
Val1 := 12;

// attrib_name Val2 is defined and value 65 is assigned to it
Val2 := 65;

// attrib_name Result is defined and the summation of Val1 and Val2 is assigned to it
Result := Val1 + Val2;
">
</EclCode>
</pre>


## Field Access
## Comments

You can use of object.property to access dataset fields and definitions.
Comments in ECL are supported using the following syntax.

- `dataset.fieldName` Referencing an attribute from a module
- `moduleName.definition` Referencing a field from dataset
| Comment Type | Symbol | Example |
| :- | :- | :- |
| Single Line | `//` | `// This is a single-line comment.` |
| Multi Line | `/* */` | `/* This is a multi-line comment. */` |

<pre>
<EclCode
<EclCode
code="
// This is a single-line comment.

/* This
is
a
multi-line
comment.
*/
">
</EclCode>
</pre>

## Field Access

You can use object.property to access dataset fields and definitions.

code="MyDataset.FieldName;
MyModule.ExportedValue;"
- `dataset.fieldName` Referencing a field from a dataset
- `moduleName.definition` Referencing an attribute from a module

>
</EclCode>
<pre>
<EclCode
code="
MyDataset.FieldName;
MyModule.ExportedValue;
">
</EclCode>
</pre>
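
The two access forms above can be sketched in a slightly fuller context. This is only an illustrative sketch: `PersonRec`, the inline data, and the contents of `MyModule` are assumptions made for the example and are not part of the original content.

```ecl
// Hypothetical record layout and inline dataset (assumptions for illustration)
PersonRec := RECORD
    STRING20  FirstName;
    UNSIGNED1 Age;
END;

MyDataset := DATASET([{'Ann', 34}, {'Bob', 19}], PersonRec);

// Hypothetical module exposing a single exported definition
MyModule := MODULE
    EXPORT ExportedValue := 42;
END;

// dataset.fieldName: the field is referenced inside a filter expression
OUTPUT(MyDataset(MyDataset.Age > 21), NAMED('Filtered'));

// moduleName.definition: the exported definition is referenced by name
OUTPUT(MyModule.ExportedValue, NAMED('FromModule'));
```

Qualifying a field with its dataset name is optional when the context is unambiguous, but it can make filters easier to read when several datasets are involved.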

## Statement Types

In ECL, coding revolves around two main approaches: Definitions and Actions. These provide the structure to define data intricacies using Definitions and execute tasks effectively through Actions, forming the foundation for robust ECL solutions.

**Example**

<pre>
<EclCode
id = "IntroExp_1"
tryMe="IntroExp_1"

code="// Action vs Definition Examples.
STRING Def1 := 'OUTPUT turns definition ';
STRING Def2 := 'to action.';

// Action: String concatenation
Def1 + Def2;

Val1 := 12;
Val2 := 50;

// Definition
SomeResult := Val1 + Val2;

// Action: print result
SomeResult;"

> </EclCode>
<EclCode
id = "IntroExp_1"
tryMe="IntroExp_1"
code="
/* Actions vs Definitions */

// Definitions
STRING Def1 := 'Concatenating two Definitions ';
STRING Def2 := 'and performing an OUTPUT Action.';

// Action: String Concatenation
Def1 + Def2;

// Definitions
Val1 := 12;
Val2 := 50;
SomeResult := Val1 + Val2;

// Action: Print Result
SomeResult;
">
</EclCode>
</pre>

## Definition

Assigning an expression to an attribute. Definitions can't not be executing unless it is wrapped in an action and are defined by `:=`. Let's take a look at an example:

`Val := 23;` is a Definition. Attribute Val is defined and value 23 is assigned to it. To turn `Val` to an action we can wrap it in an OUTPUT.`OUTPUT(Val);` is an Action and result would be 23.

<EclCode code="[attrib_type] attrib_name := value"></EclCode>

| Value | Definition |
| :---------- | :----------------------------------------------- |
| attrib_type | Optional, compiler can infer it from Definition. |
| attrib_name | The name by which the Definition will be invoked |
| value | Assigned value to the Definition. |

## Action

Action simply means "do something." Actions trigger execution of a workunit that produces results.
Action simply means "do something". Actions trigger execution of a workunit that produces output in the workunit. Actions do NOT have a return value.

<pre>
<EclCode code="OUTPUT('this is an action');
SUM(1,2);">
</EclCode>
<EclCode
code="
// Action
OUTPUT('This is an Action.');

// Action
SUM(1,2);
">
</EclCode>
</pre>

**Example**

<pre>
<EclCode
id="IntroExp_2"
tryMe="IntroExp_2"
code="//Action vs Definition Examples.

// Defining an attribute
str := 'Hello Word';

// Turning it into Action
OUTPUT(str, NAMED('My_First_Program'));

// Defining an Action
NumOne := MAX(1,2,5,6);

// Turning to Action
OUTPUT(NumOne, NAMED('ActionThis'));

// Simple Actions, followings produce result
'my first ECL code';
1 + 4 + 5;
2 * 3;"></EclCode>
<EclCode
id="IntroExp_2"
tryMe="IntroExp_2"
code="
/* Actions vs Definitions */

// Defining an attribute
  str := 'Hello World';

// Performing an OUTPUT Action
OUTPUT(str, NAMED('My_First_Program'));

  // Defining an attribute
NumOne := MAX(1,2,5,6);

// Performing an OUTPUT Action
OUTPUT(NumOne, NAMED('ActionThis'));

// Simple Actions
'My first ECL code';
1 + 4 + 5;
2 * 3;
">
</EclCode>
</pre>
18 changes: 8 additions & 10 deletions learnEcl/0300-bigData.md
@@ -5,33 +5,31 @@ slug: big-data

# BIG DATA

Big data refers to a large amount of data that is hard to process and manage using traditional data management systems.
Big data refers to a large amount of data that is difficult to process and manage using traditional data management systems.

Big data is defined by three common properties called the 3Vs: Volume, Variety and Velocity. While there are other Vs involved such as Value and Veracity, the 3Vs are the most famous ones.
Big data is defined by three common properties called the 3Vs: Volume, Variety, and Velocity. While there are other Vs involved, such as Value and Veracity, the 3Vs are the most famous ones.

**Velocity**
is the measurement of how fast data is coming into the system, it's processed and it's transferred to desire destination. The higher the velocity rate, the faster data is processed.
**Velocity** is the measurement of how fast data is coming into the system, how fast it's processed and how fast it's transferred to the desired destination. The higher the velocity rate, the faster data is processed.

**Variety** refers to different type of data. Big data is often comprised of all different kinds of data, each of which needs to processed separately.
**Variety** refers to different types of data. Big data is often comprised of all different kinds of data, each of which needs to be processed separately.

**Volume** is the size of the dataset. Larger datasets could require different processes or infrastructure.

## Big Data Types

Structured
Structured data is data that is clearly defined and formatted following organization standards and (possibly) relational database rules. Since data is formatted and clearly defined, querying this kind of data is easier and faster.

Exp: Relational database tables, address books.
E.g. relational database tables, address books.

### Unstructured

Unstructured data refers to the data that lacks any specific form or structure. Processing data is difficult, time consuming and prone to errors. Unstructured data is stored in its original format and remains that way until needed.
Keep in mind that in order to meaningfully work with data, some kind of structure must be imposed on it.

Exp: Pictures, videos, audios.
E.g. pictures, videos, audio files.

### Semi-structured

Contains both Structure and Unstructured data. Semi-structured data has internal tags and markings to describe the data elements.
Contains both Structured and Unstructured data. Semi-structured data has internal tags and markings to describe the data elements.

Exp: Emails; data saved in CSV, JSON, or YAML formats.
E.g. emails; data saved in CSV, JSON, or YAML formats.