Skip to content

Commit

Permalink
Implement CLI for finetuning (#85)
Browse files Browse the repository at this point in the history
* add cli

* remove hint

* fix

* updat docs

* lock
  • Loading branch information
jxnl authored Aug 24, 2023
1 parent 964c17c commit 64e7f51
Show file tree
Hide file tree
Showing 9 changed files with 1,257 additions and 571 deletions.
60 changes: 30 additions & 30 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -123,37 +123,37 @@ In this updated schema, we use the `Field` class from `pydantic` to add descript
!!! note "Code, schema, and prompt"
We can run `openai_schema` to see exactly what the API will see, notice how the docstrings, attributes, types, and field descriptions are now part of the schema. This describes on this library's core philosophies.

```python hl_lines="2 3"
class UserDetails(OpenAISchema):
"Correctly extracted user information"
name: str = Field(..., description="User's full name")
age: int

UserDetails.openai_schema
```

```json hl_lines="3 8"
{
"name": "UserDetails",
"description": "Correctly extracted user information",
"parameters": {
"type": "object",
"properties": {
"name": {
"description": "User's full name",
"type": "string"
},
"age": {
"type": "integer"
}
},
"required": [
"age",
"name"
]
}
```python hl_lines="2 3"
class UserDetails(OpenAISchema):
"Correctly extracted user information"
name: str = Field(..., description="User's full name")
age: int

UserDetails.openai_schema
```

```json hl_lines="3 8"
{
"name": "UserDetails",
"description": "Correctly extracted user information",
"parameters": {
"type": "object",
"properties": {
"name": {
"description": "User's full name",
"type": "string"
},
"age": {
"type": "integer"
}
```
},
"required": [
"age",
"name"
]
}
}
```

### Section 3: Calling the ChatCompletion

Expand Down
89 changes: 89 additions & 0 deletions docs/finetune.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# Using the Command Line Interface
The instructor CLI provides functionalities for managing fine-tuning jobs on OpenAI.

## Creating a Fine-Tuning Job

### View Jobs Options

```sh
$ instructor jobs --help

Usage: instructor jobs [OPTIONS] COMMAND [ARGS]...

Monitor and create fine tuning jobs

Options:
--help Show this message and exit.

Commands:
cancel Cancel a fine-tuning job.
create-from-file Create a fine-tuning job from a file.
create-from-id Create a fine-tuning job from an existing ID.
list Monitor the status of the most recent fine-tuning jobs.
```

### Create from File

The create-from-file command uploads and trains a model in a single step:

```sh
$ instructor jobs create-from-file transformed_data.jsonl
```

### Create from ID

The create-from-id command uses an uploaded file and trains a model

```sh
$ instructor files upload transformed_data.jsonl
$ instructor files list
...
$ instructor jobs create-from-file <file_id>
```


### Viewing Files and Jobs

#### Viewing Jobs

```sh
$ instructor jobs list

OpenAI Fine Tuning Job Monitoring
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ ┃ ┃ ┃ Completion ┃ ┃ ┃ ┃ ┃
┃ Job ID ┃ Status ┃ Creation Time ┃ Time ┃ Model Name ┃ File ID ┃ Epochs ┃ Base Model ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ ftjob-PWo6uwk… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 23:10:54 │ │ │ │ │ │
│ ftjob-1whjva8… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 22:47:05 │ │ │ │ │ │
│ ftjob-wGoBDld… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 22:44:12 │ │ │ │ │ │
│ ftjob-yd5aRTc… │ ✅ succeeded │ 2023-08-23 │ 2023-08-23 │ ft:gpt-3.5-tur… │ file-IQxAUDqX… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 14:26:03 │ 15:02:29 │ │ │ │ │
└────────────────┴──────────────┴────────────────┴────────────────┴─────────────────┴────────────────┴────────┴─────────────────┘
Automatically refreshes every 5 seconds, press Ctrl+C to exit
```


#### Viewing Files

```sh
$ instructor files list

OpenAI Files
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃ File ID ┃ Size (bytes) ┃ Creation Time ┃ Filename ┃ Purpose ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│ file-0lw2BSNRUlXZXRRu2beCCWjl │ 369523 │ 2023-08-23 23:31:57 │ file │ fine-tune │
│ file-IHaUXcMEykmFUp1kt2puCDEq │ 369523 │ 2023-08-23 23:09:35 │ file │ fine-tune │
│ file-ja9vRBf0FydEOTolaa3BMqES │ 369523 │ 2023-08-23 22:42:29 │ file │ fine-tune │
│ file-F7lJg6Z47CREvmx4kyvyZ6Sn │ 369523 │ 2023-08-23 22:42:03 │ file │ fine-tune │
│ file-YUxqZPyJRl5GJCUTw3cNmA46 │ 369523 │ 2023-08-23 22:29:10 │ file │ fine-tune │
└───────────────────────────────┴──────────────┴─────────────────────┴──────────┴───────────┘
```


## Conclusion
The instructor CLI offers an intuitive interface for managing OpenAI's fine-tuning jobs and related files. By utilizing simple commands, you can create, monitor, and manage your fine-tuning tasks with ease. Feel free to explore further options and parameters by using the --help flag with any command.
Empty file added instructor/cli/__init__.py
Empty file.
11 changes: 11 additions & 0 deletions instructor/cli/cli.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
import typer
import instructor.cli.jobs as jobs
import instructor.cli.files as files

app = typer.Typer(
name="instructor-ft",
help="A CLI for fine-tuning OpenAI's models",
)

app.add_typer(jobs.app, name="jobs", help="Monitor and create fine tuning jobs")
app.add_typer(files.app, name="files", help="Manage files on OpenAI's servers")
123 changes: 123 additions & 0 deletions instructor/cli/files.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
from typing import List
from typing_extensions import Annotated
from rich.live import Live
from rich.table import Table
from rich.spinner import Spinner
from rich.console import Console

from datetime import datetime
import openai
import typer
import time

app = typer.Typer()
console = Console()


# Sample response data
def generate_file_table(files: List[openai.File]) -> Table:
table = Table(
title="OpenAI Files",
)
table.add_column("File ID", style="dim")
table.add_column("Size (bytes)", justify="right")
table.add_column("Creation Time")
table.add_column("Filename")
table.add_column("Purpose")

for file in files:
table.add_row(
file["id"],
str(file["bytes"]),
str(datetime.fromtimestamp(file["created_at"])),
file["filename"],
file["purpose"],
)

return table


def get_files(limit: int = 5) -> List[openai.File]:
files = openai.File.list(limit=limit)["data"] # type: ignore
files = sorted(files, key=lambda x: x["created_at"], reverse=True)
return files[:limit]


def get_file_status(file_id: str) -> str:
response = openai.File.retrieve(file_id)
return response["status"]


@app.command(
help="Upload a file to OpenAI's servers, will monitor the upload status until it is processed",
)
def upload(
filepath: str = typer.Argument(..., help="Path to the file to upload"),
purpose: str = typer.Option("fine-tune", help="Purpose of the file"),
poll: int = typer.Option(5, help="Polling interval in seconds"),
):
with open(filepath, "rb") as file:
response = openai.File.create(file=file, purpose=purpose)
file_id = response["id"]
with console.status(f"Monitoring upload: {file_id}...") as status:
status.spinner_style = "dots"
while True:
file_status = get_file_status(file_id)
if file_status == "processed":
console.log(f"[bold green]File {file_id} uploaded successfully!")
break
time.sleep(poll)


@app.command(
help="Download a file from OpenAI's servers",
)
def download(
file_id: str = typer.Argument(..., help="ID of the file to download"),
output: str = typer.Argument(..., help="Output path for the downloaded file"),
):
with console.status(
f"[bold green]Downloading file {file_id}...", spinner="dots"
) as status:
content = openai.File.download(file_id)
with open(output, "wb") as file:
file.write(content)
console.log(f"[bold green]File {file_id} downloaded successfully!")


@app.command(
help="Delete a file from OpenAI's servers",
)
def delete(file_id: str = typer.Argument(..., help="ID of the file to delete")):
with console.status(
f"[bold red]Deleting file {file_id}...", spinner="dots"
) as status:
try:
openai.File.delete(file_id)
console.log(f"[bold red]File {file_id} deleted successfully!")
except Exception as e:
console.log(f"[bold red]Error deleting file {file_id}: {e}")
return


@app.command(
help="Monitor the status of a file on OpenAI's servers",
)
def status(
file_id: str = typer.Argument(..., help="ID of the file to check the status of")
):
with console.status(f"Monitoring status of file {file_id}...") as status:
while True:
file_status = get_file_status(file_id)
status.update(f"File status: {file_status}")
if file_status in ["pending", "processed"]:
break
time.sleep(5)


@app.command(
help="List the files on OpenAI's servers",
)
def list(limit: int = typer.Option(5, help="Limit the number of files to list")):
files = get_files(limit=limit)
console.log(generate_file_table(files))
Loading

0 comments on commit 64e7f51

Please sign in to comment.