updat
jxnl committed Sep 6, 2023
1 parent 8ab0bc2 commit 93330c4
Showing 4 changed files with 150 additions and 55 deletions.
33 changes: 18 additions & 15 deletions docs/finetune.md → docs/cli/finetune.md
@@ -6,20 +6,22 @@ The instructor CLI provides functionalities for managing fine-tuning jobs on Ope
### View Jobs Options

```sh
$ instructor jobs --help
$ instructor jobs --help

Usage: instructor jobs [OPTIONS] COMMAND [ARGS]...

Monitor and create fine tuning jobs

╭─ Options ───────────────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ cancel Cancel a fine-tuning job. │
│ create-from-file Create a fine-tuning job from a file. │
│ create-from-id Create a fine-tuning job from an existing ID. │
│ list Monitor the status of the most recent fine-tuning jobs. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Usage: instructor jobs [OPTIONS] COMMAND [ARGS]...

Monitor and create fine tuning jobs

Options:
--help Show this message and exit.

Commands:
cancel Cancel a fine-tuning job.
create-from-file Create a fine-tuning job from a file.
create-from-id Create a fine-tuning job from an existing ID.
list Monitor the status of the most recent fine-tuning jobs.
```

### Create from File
@@ -84,6 +86,7 @@ OpenAI Files
└───────────────────────────────┴──────────────┴─────────────────────┴──────────┴───────────┘
```

# Contributions

We aim to provide a light wrapper around the API rather than offering a complete CLI. Contributions are welcome! Please feel free to open an issue at [jxnl/instructor/issues](https://github.com/jxnl/instructor/issues) or submit a pull request.

## Conclusion
The instructor CLI offers an intuitive interface for managing OpenAI's fine-tuning jobs and related files. By utilizing simple commands, you can create, monitor, and manage your fine-tuning tasks with ease. Feel free to explore further options and parameters by using the --help flag with any command.
57 changes: 57 additions & 0 deletions docs/cli/usage.md
@@ -0,0 +1,57 @@
# Using the OpenAI API Usage CLI

The OpenAI API Usage CLI tool lets you monitor your OpenAI API usage, broken down by model, date, and cost.

## Monitoring API Usage

### View Usage Options

```sh
$ instructor usage --help

Usage: instructor usage [OPTIONS] COMMAND [ARGS]...

Check OpenAI API usage data

╭─ Options ───────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────────────────╮
│ list Displays OpenAI API usage data for the past N days. │
╰─────────────────────────────────────────────────────────────────╯
```

### List Usage for Specific Number of Days

To display API usage for the past 3 days, use the following command:

```sh
$ instructor usage list -n 3
```

This will output a table similar to:

```plaintext
Usage Summary by Date, Snapshot, and Cost
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Date       ┃ Snapshot ID               ┃ Total Requests ┃ Total Cost ($) ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ 2023-09-04 │ gpt-4-0613                │             44 │           0.68 │
│ 2023-09-04 │ gpt-3.5-turbo-16k-0613    │            195 │           0.84 │
│ 2023-09-04 │ text-embedding-ada-002-v2 │            276 │           0.00 │
│ 2023-09-04 │ gpt-4-32k-0613            │            328 │          49.45 │
└────────────┴───────────────────────────┴────────────────┴────────────────┘
```
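
The `Total Cost ($)` column is derived from token counts and per-model rates. As a rough sketch of that arithmetic, assuming the per-1K-token rates that appear in this commit's `MODEL_COSTS` table (they may not match current OpenAI pricing) and token counts that are purely hypothetical:

```python
# Per-token USD rates copied from this commit's MODEL_COSTS table.
# Assumption: these reflect pricing at the time of the commit and may be outdated.
RATES = {
    "gpt-4-32k": {"prompt": 0.06 / 1000, "completion": 0.12 / 1000},
    "gpt-3.5-turbo-16k": {"prompt": 0.003 / 1000, "completion": 0.004 / 1000},
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost for one usage record."""
    rate = RATES[model]
    return rate["prompt"] * prompt_tokens + rate["completion"] * completion_tokens


# Hypothetical token counts, chosen only for illustration.
cost = estimate_cost("gpt-4-32k", prompt_tokens=10_000, completion_tokens=2_000)
print(f"{cost:.2f}")  # prints 0.84
```

Prompt and completion tokens are priced separately here, which is why a model with long completions can dominate the total even with few requests.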

### List Usage for Today

To display the API usage for today, simply run:

```sh
$ instructor usage list
```
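
With no `-n` flag the command queries today's date only. The sketch below shows one way the list of dates could be derived, mirroring the `n_days > 1` branch in this commit's `get_usage_for_past_n_days`. The helper name `dates_to_fetch` and its `today` parameter are hypothetical and exist only to make the example testable:

```python
from datetime import date, timedelta
from typing import List, Optional


def dates_to_fetch(n_days: int, today: Optional[date] = None) -> List[str]:
    """Return the YYYY-MM-DD strings to query, newest first."""
    today = today or date.today()
    if n_days > 1:
        return [(today - timedelta(days=i)).strftime("%Y-%m-%d") for i in range(n_days)]
    # An n_days of 0 or 1 both mean "today only".
    return [today.strftime("%Y-%m-%d")]


print(dates_to_fetch(0, today=date(2023, 9, 6)))  # ['2023-09-06']
print(dates_to_fetch(3, today=date(2023, 9, 6)))  # ['2023-09-06', '2023-09-05', '2023-09-04']
```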

# Contributions

We aim to provide a light wrapper around the API rather than offering a complete CLI. Contributions are welcome! Please feel free to make an issue at [jxnl/instructor/issues](https://github.com/jxnl/instructor/issues) or submit a pull request.

112 changes: 73 additions & 39 deletions instructor/cli/usage.py
@@ -23,44 +23,79 @@ async def fetch_usage(date: str) -> dict:
return await resp.json()


def generate_usage_table(usage_data: List[dict]) -> Table:
table = Table(title="OpenAI API Usage")
table.add_column("Timestamp", style="dim")
table.add_column("Requests", justify="right")
table.add_column("Snapshot ID")
table.add_column("Context Tokens")
table.add_column("Generated Tokens")

for usage in usage_data:
table.add_row(
str(datetime.fromtimestamp(usage["aggregation_timestamp"])),
str(usage["n_requests"]),
usage["snapshot_id"],
str(usage["n_context_tokens_total"]),
str(usage["n_generated_tokens_total"]),
)
return table


async def get_usage_for_past_n_days(n_days: int) -> List[dict]:
tasks = []
all_data = []
with Progress() as progress:
task = progress.add_task("[green]Fetching usage data...", total=n_days)
for i in range(n_days):
date = (datetime.now() - timedelta(days=i)).strftime("%Y-%m-%d")
tasks.append(fetch_usage(date))
progress.update(task, advance=1)
if n_days > 1:
task = progress.add_task("[green]Fetching usage data...", total=n_days)
for i in range(n_days):
date = (datetime.now() - timedelta(days=i)).strftime("%Y-%m-%d")
tasks.append(fetch_usage(date))
progress.update(task, advance=1)
else:
tasks.append(fetch_usage(datetime.now().strftime("%Y-%m-%d")))

fetched_data = await asyncio.gather(*tasks)
for data in fetched_data:
all_data.extend(data.get("data", []))
return all_data


from collections import defaultdict
from datetime import datetime
from typing import List
from rich.table import Table

# Define the cost per unit for each model
MODEL_COSTS = {
    "gpt-3.5-turbo": {"prompt": 0.0015 / 1000, "completion": 0.002 / 1000},
    "gpt-3.5-turbo-16k": {"prompt": 0.003 / 1000, "completion": 0.004 / 1000},
    "gpt-4": {"prompt": 0.03 / 1000, "completion": 0.03 / 1000},
    "gpt-4-32k": {"prompt": 0.06 / 1000, "completion": 0.12 / 1000},
    "text-embedding-ada-002": 0.0001 / 1000,
    "text-embedding-ada-002-v2": 0.0001 / 1000,
}


def get_model_cost(model: str):
    """Get the cost details for a given model."""
    if model in MODEL_COSTS:
        return MODEL_COSTS[model]

    # Handle prefix-based matching
    if model.startswith("gpt-3.5-turbo-16k"):
        return MODEL_COSTS["gpt-3.5-turbo-16k"]
    elif model.startswith("gpt-3.5-turbo"):
        return MODEL_COSTS["gpt-3.5-turbo"]
    elif model.startswith("gpt-4-32k"):
        return MODEL_COSTS["gpt-4-32k"]
    elif model.startswith("gpt-4"):
        return MODEL_COSTS["gpt-4"]
    else:
        raise ValueError(f"Cost for model {model} not found")


def calculate_cost(
    snapshot_id: str, n_context_tokens: int, n_generated_tokens: int
) -> float:
    """Calculate the cost based on the snapshot ID and number of tokens."""
    cost = get_model_cost(snapshot_id)

    if isinstance(cost, float):
        return cost * (n_context_tokens + n_generated_tokens)

    prompt_cost = cost["prompt"] * n_context_tokens
    completion_cost = cost["completion"] * n_generated_tokens
    return prompt_cost + completion_cost


def group_and_sum_by_date_and_snapshot(usage_data: List[dict]) -> Table:
"""Group and sum the usage data by date and snapshot, including costs."""
summary = defaultdict(
lambda: defaultdict(lambda: {"total_requests": 0, "total_tokens": 0})
lambda: defaultdict(
lambda: {"total_requests": 0, "total_tokens": 0, "total_cost": 0.0}
)
)

for usage in usage_data:
@@ -71,41 +106,40 @@ def group_and_sum_by_date_and_snapshot(usage_data: List[dict]) -> Table:
summary[date][snapshot_id]["total_requests"] += usage["n_requests"]
summary[date][snapshot_id]["total_tokens"] += usage["n_generated_tokens_total"]

table = Table(title="Usage Summary by Date and Snapshot")
# Calculate and add the cost
cost = calculate_cost(
snapshot_id,
usage["n_context_tokens_total"],
usage["n_generated_tokens_total"],
)
summary[date][snapshot_id]["total_cost"] += cost

table = Table(title="Usage Summary by Date, Snapshot, and Cost")
table.add_column("Date", style="dim")
table.add_column("Snapshot ID", style="dim")
table.add_column("Model", style="dim")
table.add_column("Total Requests", justify="right")
table.add_column("Total Tokens", justify="right")
table.add_column("Total Cost ($)", justify="right")

for date, snapshots in summary.items():
for snapshot_id, data in snapshots.items():
table.add_row(
date,
snapshot_id,
str(data["total_requests"]),
str(data["total_tokens"]),
"{:.2f}".format(data["total_cost"]),
)

return table


@app.command(help="Displays OpenAI API usage data for the past N days.")
def list(
n: int = typer.Option(5, help="Number of days."),
n: int = typer.Option(0, help="Number of days."),
):
all_data = asyncio.run(get_usage_for_past_n_days(n))
table = group_and_sum_by_date_and_snapshot(all_data)
console.print(table)


@app.command(
help="Groups the OpenAI API usage data by snapshot_id and sums the total tokens."
)
def usage_group_by_snapshot():
usage_data = asyncio.run(fetch_usage(datetime.now().strftime("%Y-%m-%d")))
table = group_and_sum_by_snapshot(usage_data.get("data", []))
console.print(table)


if __name__ == "__main__":
app()
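
The `get_model_cost` function in this diff resolves dated snapshot IDs by testing prefixes in a fixed order, so that `gpt-3.5-turbo-16k-0613` matches the 16k family before the shorter `gpt-3.5-turbo` prefix. One alternative, shown here only as a sketch and not as what the commit does, is to pick the longest matching prefix so the ordering of checks no longer matters. The names `KNOWN_FAMILIES` and `resolve_family` are hypothetical:

```python
from typing import Iterable, Optional

# Hypothetical list of model families; order is irrelevant for longest-prefix matching.
KNOWN_FAMILIES = ["gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-4", "gpt-4-32k"]


def resolve_family(
    snapshot_id: str, families: Iterable[str] = KNOWN_FAMILIES
) -> Optional[str]:
    """Return the longest family name that prefixes snapshot_id, or None."""
    matches = [name for name in families if snapshot_id.startswith(name)]
    return max(matches, key=len) if matches else None


print(resolve_family("gpt-3.5-turbo-16k-0613"))  # gpt-3.5-turbo-16k
print(resolve_family("gpt-4-0613"))              # gpt-4
print(resolve_family("davinci-002"))             # None
```

Returning `None` for unknown models, rather than raising, would let a caller show a row without a cost instead of failing the whole table. Whether that is preferable is a design choice for the CLI.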
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -63,7 +63,8 @@ nav:
- "Introduction: Writing Prompts": "writing-prompts.md"
- "Prompting Templates": "chat-completion.md"
- CLI Reference:
- "Finetuning": finetune.md
- "Usage": "cli/usage.md"
- "Finetuning": "cli/finetune.md"
extra:
analytics:
provider: google