Copy docs repo content (OpenLineage#2948)
* copy docs repo content

Signed-off-by: Pawel Leszczynski <[email protected]>

* fix spelling from the migrated docs

Signed-off-by: Pawel Leszczynski <[email protected]>

* remove existing github workflows for the moment

Signed-off-by: Pawel Leszczynski <[email protected]>

---------

Signed-off-by: Pawel Leszczynski <[email protected]>
pawel-big-lebowski authored Aug 22, 2024
1 parent b555577 commit fe7a75f
Showing 765 changed files with 132,893 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -68,7 +68,7 @@ repos:
         types: [text]
         args:
           - --ignore-words=spelling_wordlist.txt
-          - --skip=*.js,*.svg,*.xml,*/**/test/*,*/**/tests/*
+          - --skip=*.css,*.js,*.svg,*.xml,*/**/test/*,*/**/tests/*,*/yarn.lock,**/openapi/*.html,*/package-lock.json,*/spec/*
   - repo: local
     hooks:
       - id: check_schemas
4 changes: 3 additions & 1 deletion spelling_wordlist.txt
@@ -1,3 +1,5 @@
crate
ons
ser
ser
wit
historial
95 changes: 95 additions & 0 deletions website/.github/workflows/visual-difference-detection.yml
@@ -0,0 +1,95 @@
name: Visual difference detection
# This workflow performs visual comparisons between the main branch and a pull request.
# It activates when the "visual-comparison-required" label is added to a pull request.
#
# KEY POINTS:
# - Activation: The workflow triggers only when the "visual-comparison-required" label is added.
# - Trigger Conditions: Subsequent commits or amendments to the pull request will not trigger the workflow again.
#   To re-trigger, remove and re-add the "visual-comparison-required" label.
# - Label Handling: Other labels can also activate the workflow. However, the workflow will halt if the "visual-comparison-required" label is missing.
# - Caution: If the "visual-comparison-required" label is present, adding other labels will still trigger and execute the entire workflow.

on:
  pull_request:
    types: [ labeled ]

jobs:
  check-label:
    # Checks if the "visual-comparison-required" label is present on the pull request.
    # The remaining jobs will only run if this label is found.
    name: Check for "visual-comparison-required" label
    runs-on: ubuntu-latest
    outputs:
      visual-comparison-required-label-found: ${{ steps.check.outputs.visual-comparison-required-label-found }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - name: Check for "visual-comparison-required" label
        id: check
        env:
          VISUAL_COMPARISON_REQUIRED_LABEL_PRESENT: ${{ contains(github.event.pull_request.labels.*.name, 'visual-comparison-required') }}
        run: |
          echo "visual-comparison-required-label-found=$VISUAL_COMPARISON_REQUIRED_LABEL_PRESENT" | tee "$GITHUB_OUTPUT"
  take-screenshots-main:
    # This job takes screenshots of the main branch for visual comparison.
    # It runs only if the "visual-comparison-required" label is found.
    name: Take screenshots (main branch)
    needs: check-label
    if: needs.check-label.outputs.visual-comparison-required-label-found == 'true'
    runs-on: ubuntu-latest
    steps:
      # We switch to the main branch to take the screenshots
      - name: Check out repository code
        uses: actions/checkout@v4
        with:
          ref: main
      - name: Use Node.js
        uses: actions/setup-node@v3
        with:
          node-version: current
      - name: Install dependencies
        run: yarn install --frozen-lockfile
      - name: Install Playwright browsers
        run: yarn playwright install --with-deps chromium
      - name: Build the website
        run: yarn docusaurus build
      - name: Take screenshots with Playwright
        run: yarn workspace argos screenshot
      # Argos needs two extra pieces of information to associate screenshots with the branch.
      # - We have to set the ARGOS_BRANCH variable to main, so that it is labelled properly in the UI
      # - We have to set the ARGOS_COMMIT variable to the main branch commit sha. It is necessary, because Argos
      #   uses this information to make sure the screenshots for comparison are the ancestors of the version
      #   in the pull request
      - name: Store the main branch sha in GitHub environment variables
        run: echo "ARGOS_COMMIT=$(git rev-parse HEAD)" >> $GITHUB_ENV
      - name: Upload screenshots to Argos
        run: yarn workspace argos upload
        env:
          ARGOS_BRANCH: main
          ARGOS_COMMIT: ${{ env.ARGOS_COMMIT }}

  take-screenshots-pull-request:
    # This job takes screenshots of the pull request branch for visual comparison.
    # It runs only if the "visual-comparison-required" label is found.
    name: Take screenshots (pull request branch)
    needs: [check-label, take-screenshots-main]
    if: needs.check-label.outputs.visual-comparison-required-label-found == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - name: Use Node.js
        uses: actions/setup-node@v3
        with:
          node-version: current
      - name: Install dependencies
        run: yarn install --frozen-lockfile
      - name: Install Playwright browsers
        run: yarn playwright install --with-deps chromium
      - name: Build the website
        run: yarn docusaurus build
      - name: Take screenshots with Playwright
        run: yarn workspace argos screenshot
      - name: Upload screenshots to Argos
        run: yarn workspace argos upload
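
Reviewer note: the label gating in `check-label` can be reasoned about outside GitHub Actions. The TypeScript below is a minimal sketch and is not part of this commit. `LabeledEvent` and `shouldRunVisualComparison` are hypothetical names, and the event shape is reduced to the single field the `contains(...)` expression reads. GitHub documents `contains` as not case sensitive, so the sketch compares lowercased names; treat that as an approximation of the expression, not a specification of it.

```typescript
// Hypothetical, reduced shape of a pull_request "labeled" event payload.
interface LabeledEvent {
  pull_request: { labels: { name: string }[] };
}

const REQUIRED_LABEL = "visual-comparison-required";

// Approximates contains(github.event.pull_request.labels.*.name, '...'):
// true when any label currently on the pull request has the required name.
function shouldRunVisualComparison(event: LabeledEvent): boolean {
  return event.pull_request.labels.some(
    (label) => label.name.toLowerCase() === REQUIRED_LABEL
  );
}

// An unrelated label still fires the "labeled" trigger, but the check fails,
// so the screenshot jobs are skipped.
const unrelated: LabeledEvent = {
  pull_request: { labels: [{ name: "documentation" }] },
};

// With the required label already present, adding any other label passes the
// check again and the whole workflow re-runs (the "Caution" case above).
const both: LabeledEvent = {
  pull_request: {
    labels: [{ name: "visual-comparison-required" }, { name: "documentation" }],
  },
};

console.log(shouldRunVisualComparison(unrelated)); // false
console.log(shouldRunVisualComparison(both)); // true
```

The check inspects every label on the pull request, not just the one that was added, which is why the second case triggers a full run.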
26 changes: 26 additions & 0 deletions website/.gitignore
@@ -0,0 +1,26 @@
# Dependencies
node_modules

# Production
/build

# Generated files
.docusaurus
.cache-loader

# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local

npm-debug.log*
yarn-debug.log*
yarn-error.log*

# intellij
.idea

argos/screenshots
argos/test-results
1 change: 1 addition & 0 deletions website/CNAME
@@ -0,0 +1 @@
openlineage.io
94 changes: 94 additions & 0 deletions website/README.md
@@ -0,0 +1,94 @@
# OpenLineage Docs

[![Covered by Argos Visual Testing](https://argos-ci.com/badge.svg)](https://app.argos-ci.com/pawel-big-lebowski/docs/reference?utm_source=OpenLineage&utm_campaign=oss)

This is a Docusaurus site, and all content can be found in `docs/`. Contributions are welcome in the form of issues or pull requests. Pages that require attention have been marked with Docusaurus Admonitions.

### New posts

We love new blog posts, and welcome content about OpenLineage! Topics include:
* experiences from users of all kinds
* supporting products and technologies
* proposals for discussion

If you are familiar with the GitHub pull request process, it is easy to propose a new blog post:

1. Fork this project.
2. Make a new directory in `/blog`. The name of the directory will become part of the post's URL, so choose something descriptive and unique.
3. Create an `index.mdx` file in the new directory containing your blog content. Use one of the other posts as a template. The `title`, `date`, `authors`, and `description` front matter fields are all required.
4. Add your author information -- name, title, url (optional), and image_url (optional) -- to `blog/authors.yml`.
5. Build the site locally if you want to see it in a browser and build confidence in your formatting choices.
6. Commit your changes and submit a pull request.
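
Before opening the pull request, it can help to confirm that the required front matter fields from step 3 are present. The following TypeScript is a minimal sketch under stated assumptions: `missingFrontMatterFields` is a hypothetical helper that does not exist in this repository, the sample post is invented, and only top-level `key: value` lines are recognized, so it is not a real YAML parser.

```typescript
// Fields that the README lists as required for every blog post.
const REQUIRED_FIELDS = ["title", "date", "authors", "description"];

// Returns the required fields that are absent from the leading `---` block.
// Hypothetical helper: it only recognizes top-level `key:` lines.
function missingFrontMatterFields(source: string): string[] {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  if (block === null) {
    // No front matter block at all, so every required field is missing.
    return REQUIRED_FIELDS.slice();
  }
  const present: string[] = [];
  const lines = block[1].split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const key = /^([A-Za-z_]+):/.exec(lines[i]);
    if (key !== null) {
      present.push(key[1]);
    }
  }
  return REQUIRED_FIELDS.filter((field) => present.indexOf(field) === -1);
}

// Invented sample post that forgets the `description` field.
const draft = [
  "---",
  "title: My lineage story",
  "date: 2024-08-22",
  "authors: [someone]",
  "---",
  "Post body goes here.",
].join("\n");

console.log(missingFrontMatterFields(draft)); // reports "description" as missing
```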

### New ecosystem partners for the Ecosystem page

- Add a rectangular logo in SVG format twice as wide as it is tall to static/img.
- Add a record to the appropriate file and array in static/ecosystem, using simply the filename of the logo for the image value.

### Changes to basepages

If you want to make a change to a basepage - e.g. to add a new member to the Ecosystem page - the best way is to submit a pull request.

These basepages can be found in `src/pages`, and are formatted in markdown.

### Building openapi docs

To build the openapi docs using `redoc-cli`, run:

```
% yarn run build:docs
```

## Local development

First, clone the repo.

Install the [node version manager](https://github.com/nvm-sh/nvm) and use it to create a Node 16 environment:

```
$ nvm install 16
$ nvm use 16
```

Run Yarn to install all of the Node dependencies for the project:

```
$ yarn
```

## Local site build

Build the documentation contents first. This step is required before starting the Docusaurus server.

```
$ yarn build
```

This command generates static content into the `build` directory. If you want to look at it, try `cd build && python3 -m http.server`.

## Local server start

Tell Yarn to start a development server:

```
$ yarn start
```

This command provides a URL where the doc site can be viewed. Most changes are reflected live without having to restart the server.

By default, the server port will be set to 3000. In case the port is already being used, you can specify the port number when starting the server:

```
$ yarn start --port 3001
```

## Deployment

Once the site has been launched, pull requests to `main` will cause a new doc site to be shipped via GitHub Pages.

The site is deployed using the [Gatsby Publish GitHub action](https://github.com/OpenLineage/docs/blob/main/.github/workflows/deploy.yml) whenever a change is merged into `main`.

This GitHub Action will:
* Execute `scripts/build-docs.sh`, which performs a build of the OpenAPI docs based on the latest version of the spec that has been published into `static/spec` by the [OpenLineage release script](https://github.com/OpenLineage/OpenLineage/blob/main/spec/release.sh). The resulting docs are placed into `static/apidocs/openapi`.
* Execute `yarn run build`, which performs a build of the Gatsby landing pages and places them into `public/`. The `static/` directory, containing the OpenAPI and Java client documentation, is copied into `public/` during this step.
* Replace the contents of the `gh-pages` branch of the [org domain repo](https://github.com/OpenLineage/OpenLineage.github.io) with the contents of `public/`. This will cause that repo's GitHub Action to deploy the new content.
17 changes: 17 additions & 0 deletions website/argos/package.json
@@ -0,0 +1,17 @@
{
  "name": "argos",
  "version": "0.0.0",
  "description": "Workspace for visual difference detection",
  "license": "MIT",
  "private": true,
  "scripts": {
    "screenshot": "playwright test",
    "upload": "npx @argos-ci/cli upload ./screenshots"
  },
  "devDependencies": {
    "@argos-ci/cli": "^0.6.0",
    "@argos-ci/playwright": "^0.0.7",
    "@playwright/test": "^1.38.1",
    "cheerio": "^1.0.0-rc.12"
  }
}
20 changes: 20 additions & 0 deletions website/argos/playwright.config.ts
@@ -0,0 +1,20 @@
import {devices} from '@playwright/test';
import type {PlaywrightTestConfig} from '@playwright/test';

const config: PlaywrightTestConfig = {
  webServer: {
    cwd: "..",
    port: 3000,
    command: 'yarn serve',
  },
  projects: [
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
      },
    },
  ],
};

export default config;
19 changes: 19 additions & 0 deletions website/argos/screenshot.css
@@ -0,0 +1,19 @@
/* Iframes can load lazily */
iframe,
/* Avatars can be flaky due to using external sources: GitHub/Unavatar */
.avatar__photo,
/* Gifs load lazily and are animated */
img[src$='.gif'],
/* Algolia keyboard shortcuts appear with a little delay */
.DocSearch-Button-Keys > kbd,
/* The live playground preview can often display dates/counters */
[class*='playgroundPreview'] {
  visibility: hidden;
}

/* Different docs last-update dates can alter layout */
.theme-last-updated,
/* Mermaid diagrams are rendered client-side and produce layout shifts */
.docusaurus-mermaid-container {
  display: none;
}
36 changes: 36 additions & 0 deletions website/argos/screenshot.spec.ts
@@ -0,0 +1,36 @@
import * as fs from "fs";
import {test} from "@playwright/test";
import {argosScreenshot} from "@argos-ci/playwright";
import {extractSitemapPathnames, pathnameToArgosName} from "argos/utils";

// Constants:
const siteUrl = "http://localhost:3000";
const sitemapPath = "../build/sitemap.xml";
const stylesheetPath = "./screenshot.css";
const stylesheet = fs.readFileSync(stylesheetPath).toString();

// Wait for hydration, requires Docusaurus v2.4.3+
// See https://github.com/facebook/docusaurus/pull/9256
// Docusaurus adds a <html data-has-hydrated="true"> once hydrated
function waitForDocusaurusHydration() {
  // uncomment the line when Docusaurus is upgraded to v2.4.3
  // return document.documentElement.dataset.hasHydrated === "true";
  return true;
}

function screenshotPathname(pathname: string, index: number, numberOfPaths: number) {
  test(`pathname ${pathname}`, async ({page}) => {
    const url = siteUrl + pathname;
    console.log(`${index + 1}/${numberOfPaths} Screenshotting`, url);
    await page.goto(url);
    await page.waitForFunction(waitForDocusaurusHydration);
    await page.addStyleTag({content: stylesheet});
    await argosScreenshot(page, pathnameToArgosName(pathname));
  });
}

test.describe("Docusaurus site screenshots", () => {
  const pathnames = extractSitemapPathnames(sitemapPath);
  console.log("Pathnames to screenshot:", pathnames);
  pathnames.forEach((path, index) => screenshotPathname(path, index, pathnames.length));
});
17 changes: 17 additions & 0 deletions website/argos/utils.ts
@@ -0,0 +1,17 @@
import * as cheerio from "cheerio";
import * as fs from "fs";

export function extractSitemapPathnames(sitemapPath: string): string[] {
  const sitemap = fs.readFileSync(sitemapPath).toString();
  const $ = cheerio.load(sitemap, { xmlMode: true });
  const urls: string[] = [];
  $("loc").each(function handleLoc() {
    urls.push($(this).text());
  });
  return urls.map((url) => new URL(url).pathname);
}

// Converts a pathname to a decent screenshot name
export function pathnameToArgosName(pathname: string): string {
  return pathname.replace(/^\/|\/$/g, "") || "index";
}
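
Reviewer note: the two helpers above turn sitemap URLs into screenshot names. The sketch below illustrates that mapping without any dependencies and is not part of this commit. `pathnameToArgosName` repeats the regular expression from `utils.ts`, while `extractPathnamesFromXml` is a hypothetical stand-in that scans `<loc>` elements with a regular expression instead of cheerio. The sample sitemap entries are invented for illustration.

```typescript
// Same regular expression as pathnameToArgosName in utils.ts:
// strip one leading and one trailing slash, fall back to "index" for the root.
function pathnameToArgosName(pathname: string): string {
  return pathname.replace(/^\/|\/$/g, "") || "index";
}

// Hypothetical stand-in for the cheerio-based extractSitemapPathnames.
// A regex scan is enough for this flat sample but is not a general XML parser.
function extractPathnamesFromXml(sitemapXml: string): string[] {
  const pattern = /<loc>([^<]+)<\/loc>/g;
  const pathnames: string[] = [];
  let match = pattern.exec(sitemapXml);
  while (match !== null) {
    pathnames.push(new URL(match[1].trim()).pathname);
    match = pattern.exec(sitemapXml);
  }
  return pathnames;
}

// Invented sample sitemap; the real one is read from ../build/sitemap.xml.
const sampleSitemap = [
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  "  <url><loc>https://openlineage.io/</loc></url>",
  "  <url><loc>https://openlineage.io/docs/</loc></url>",
  "  <url><loc>https://openlineage.io/blog/0.1-release</loc></url>",
  "</urlset>",
].join("\n");

const names = extractPathnamesFromXml(sampleSitemap).map(pathnameToArgosName);
console.log(names); // index, docs, blog/0.1-release
```

Only the outermost slashes are removed, so nested paths keep their inner separators in the screenshot name.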
3 changes: 3 additions & 0 deletions website/babel.config.js
@@ -0,0 +1,3 @@
module.exports = {
  presets: [require.resolve('@docusaurus/core/lib/babel/preset')],
};
33 changes: 33 additions & 0 deletions website/blog/0.1-release/index.mdx
@@ -0,0 +1,33 @@
---
title: Introducing OpenLineage 0.1.0
date: 2021-09-03
authors: [Le Dem]
description: We are pleased to announce the initial release of OpenLineage. This release includes the core specification, data model, clients, and integrations with common data tools.
---
We are pleased to announce the initial release of OpenLineage. This release includes the core specification, data model, clients, and integrations with common data tools.

<!--truncate-->

We are pleased to announce the initial release of OpenLineage. This is the culmination of a broad community effort, and establishes a common framework for data lineage collection and analysis.

We want to thank [all the contributors](https://github.com/OpenLineage/OpenLineage/graphs/contributors) as well as all the projects and companies involved in the design (in alphabetical order): [Airflow](https://airflow.apache.org), [Astronomer](https://www.astronomer.io), [Datakin](https://datakin.com), [Data Mesh](https://datameshlearning.com), [dbt](https://www.getdbt.com), [Egeria](https://egeria.odpi.org), [GetInData](https://getindata.com), [Great Expectations](https://greatexpectations.io), [Iceberg](https://iceberg.apache.org) (and others that I am probably forgetting).

This release includes:
* The initial 1-0-0 release of the [OpenLineage specification](https://github.com/OpenLineage/OpenLineage/blob/main/spec/OpenLineage.md)
  * A core lineage model of Jobs, Runs and Datasets
  * Core facets
    * Data Quality Metrics and statistics
    * Dataset schema
    * Source code location
    * SQL
* Clients that send OpenLineage events to an HTTP backend
  * Java
  * Python
* [Integrations](https://github.com/OpenLineage/OpenLineage/tree/main/integration) that collect lineage metadata as OpenLineage events
  * Apache Airflow with support for BigQuery, Great Expectations, Postgres, Redshift, Snowflake
  * Apache Spark
  * dbt

This is only the beginning. We invite everyone interested to [consult and contribute to the roadmap](https://github.com/OpenLineage/OpenLineage/projects). The roadmap currently contains, among other things: adding support for [Kafka](https://github.com/OpenLineage/OpenLineage/issues/152), [BI dashboards](https://github.com/OpenLineage/OpenLineage/issues/207), and [column level lineage](https://github.com/OpenLineage/OpenLineage/issues/148)...but you can influence it by participating!

Follow the [repo](https://github.com/OpenLineage/OpenLineage) to stay updated. And, as always, you can [join the conversation](http://bit.ly/OpenLineageSlack) on Slack.