Merge pull request #12 from run-llama/feat/examples
Improve the basic and manual examples
fersilva16 authored Jul 25, 2024
2 parents f335e26 + 07857b2 commit 011f9d5
Showing 3 changed files with 120 additions and 125 deletions.
61 changes: 60 additions & 1 deletion examples/demo_basic.ipynb
@@ -4,7 +4,29 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# LlamaExtract Usage"
"# Infer a schema to extract data from files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we will demonstrate how to infer a schema from a set of files and use it to extract structured data from invoice PDF files.\n",
"\n",
"The steps are:\n",
"1. Infer a schema from the invoice files.\n",
"2. Extract structured data (i.e. JSON) from invoice PDF files.\n",
"\n",
"Additional Resources:\n",
"- `LlamaExtract`: https://docs.cloud.llamaindex.ai/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"Install the `llama-extract` client library:"
]
},
{
@@ -16,6 +38,13 @@
"%pip install llama-extract"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Apply `nest_asyncio` and bring your own LlamaCloud API key:"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -32,6 +61,14 @@
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-...\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Infer the schema\n",
"First, let's use `LlamaExtract` to infer the schema from the invoice files."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -47,6 +84,13 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Preview the inferred schema:"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -64,6 +108,14 @@
"print(extraction_schema.data_schema)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extract structured data\n",
"Now, with the schema, we can extract structured data (i.e. JSON) from our invoice files."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -84,6 +136,13 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Preview the extracted data:"
]
},
{
"cell_type": "code",
"execution_count": null,
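A note on the preview step in `demo_basic.ipynb`: the notebook prints `extraction_schema.data_schema`. As a minimal sketch, assuming that attribute is a JSON-schema-style `dict` with a top-level `properties` mapping (an assumption, since the diff does not show the cell output), a small helper could list the inferred fields. The `summarize_schema` helper and the sample schema below are hypothetical and not part of `llama-extract`.

```python
import json


def summarize_schema(data_schema: dict) -> list[str]:
    """Return one 'name: type' line per top-level property."""
    properties = data_schema.get("properties", {})
    return [
        f"{name}: {spec.get('type', 'unknown')}"
        for name, spec in properties.items()
    ]


# Hypothetical stand-in for an inferred invoice schema (illustrative only).
sample_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "line_items": {"type": "array"},
    },
}

summary = summarize_schema(sample_schema)
print(json.dumps(sample_schema, indent=2))  # full schema, pretty-printed
print("\n".join(summary))                   # compact field overview
```

In the notebook itself, the argument would be `extraction_schema.data_schema` rather than the sample dictionary.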
123 changes: 0 additions & 123 deletions examples/demo_existent_schema.ipynb

This file was deleted.

61 changes: 60 additions & 1 deletion examples/demo_manual.ipynb
@@ -4,7 +4,29 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Create a schema with your own schema to extract data from files"
"# Manually create a schema to extract data from files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we will demonstrate how to manually create a schema and use it to extract structured data from invoice PDF files.\n",
"\n",
"The steps are:\n",
"1. Create a schema using a valid JSON schema object.\n",
"2. Extract structured data (i.e. JSON) from invoice PDF files.\n",
"\n",
"Additional Resources:\n",
"- `LlamaExtract`: https://docs.cloud.llamaindex.ai/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"Install the `llama-extract` client library:"
]
},
{
@@ -16,6 +38,13 @@
"%pip install llama-extract"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Apply `nest_asyncio` and bring your own LlamaCloud API key:"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -32,6 +61,14 @@
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-...\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the schema\n",
"First, let's create the schema using a valid JSON schema object with `LlamaExtract`."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -54,6 +91,13 @@
"extraction_schema = extractor.create_schema(\"Test Schema\", data_schema)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's preview the created schema:"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -71,6 +115,14 @@
"print(extraction_schema)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extract structured data\n",
"Now, with the schema, we can extract structured data (i.e. JSON) from our invoice files."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -91,6 +143,13 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Preview the extracted data:"
]
},
{
"cell_type": "code",
"execution_count": null,
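A note on the schema creation step in `demo_manual.ipynb`: the diff shows `extractor.create_schema("Test Schema", data_schema)` but hides the definition of `data_schema`. The sketch below shows what a JSON schema object for invoices might look like, with a few basic sanity checks before it is handed to the client. The field names and the `check_schema` helper are hypothetical and chosen only for illustration; they are not taken from the notebook.

```python
import json

# Hypothetical invoice schema; field names are illustrative only.
data_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}


def check_schema(schema: dict) -> list[str]:
    """Return a list of problems found by a few basic sanity checks."""
    problems = []
    if schema.get("type") != "object":
        problems.append("top-level type should be 'object'")
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"required field '{name}' is not in properties")
    # A JSON schema object should itself be serializable as JSON.
    try:
        json.dumps(schema)
    except (TypeError, ValueError) as exc:
        problems.append(f"not JSON-serializable: {exc}")
    return problems


problems = check_schema(data_schema)
print(problems)
# In the notebook, the schema is then passed on as shown in the diff:
# extraction_schema = extractor.create_schema("Test Schema", data_schema)
```

These checks are deliberately shallow. They catch common slips such as a required field that was never declared, but they do not replace full JSON Schema validation.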
