Improves logging and plots benchmark information #54

Merged
merged 68 commits into from
Oct 29, 2024
Changes from 62 commits
Commits
fc5cc70
Add rpc_state_reader and blockifier to tracing default directives
JulianGCalderon Sep 13, 2024
7d71fa8
Simplify tx execution logs
JulianGCalderon Sep 13, 2024
cc2ffe7
Add compilation time logs
JulianGCalderon Sep 13, 2024
dcd02f2
Update to blockifier native2.8.x-arc-tracing
JulianGCalderon Sep 13, 2024
159eec6
Use millis for compilation time
JulianGCalderon Sep 16, 2024
568a143
Add basic plotting script
JulianGCalderon Sep 16, 2024
19018f3
Plot with VM
JulianGCalderon Sep 16, 2024
0985e48
Improve plot
JulianGCalderon Sep 16, 2024
333185b
Add title
JulianGCalderon Sep 16, 2024
dfeb91b
Update lock
JulianGCalderon Sep 16, 2024
400f2fe
Rename plot script
JulianGCalderon Sep 16, 2024
dcc5010
Plot compilation time
JulianGCalderon Sep 16, 2024
a839624
Remove print
JulianGCalderon Sep 16, 2024
90225c3
Merge branch 'main' into bench-analysis
JulianGCalderon Sep 23, 2024
c5e0a97
Merge branch 'update-llvm19' into bench-analysis
JulianGCalderon Sep 25, 2024
9d94ad8
Merge branch 'contract_executor' into bench-analysis
JulianGCalderon Sep 25, 2024
cb6cd6e
Update Cargo.lock
JulianGCalderon Sep 25, 2024
20076ef
Add library size
JulianGCalderon Sep 25, 2024
bea8cdc
Merge branch 'contract_executor' into bench-analysis
JulianGCalderon Sep 25, 2024
02e1c51
Merge branch 'contract_executor' into bench-analysis
JulianGCalderon Sep 25, 2024
a39bfeb
Add plot_library_size.py script
JulianGCalderon Sep 25, 2024
ba24dc4
Merge branch 'main' into bench-analysis
JulianGCalderon Sep 26, 2024
d7dec16
Merge branch 'contract_executor' into bench-analysis
JulianGCalderon Sep 26, 2024
7c6d3a3
Add compilation span
JulianGCalderon Sep 27, 2024
bebad96
Add program length to compilation span
JulianGCalderon Sep 30, 2024
e0c1bd0
Enable cairo_native logs
JulianGCalderon Sep 30, 2024
2f64aa0
Plot finer compilation
JulianGCalderon Oct 1, 2024
0422ca1
Add vm contract compilation logs
JulianGCalderon Oct 2, 2024
7247a31
Rename
JulianGCalderon Oct 2, 2024
53f07a5
Plot compilation trend
JulianGCalderon Oct 2, 2024
dd842ec
Update compilation time
JulianGCalderon Oct 2, 2024
a6fd5df
Update compilation time finer
JulianGCalderon Oct 2, 2024
0635947
Move compilation span upwards
JulianGCalderon Oct 2, 2024
a2d965c
Add casm compilation time
JulianGCalderon Oct 2, 2024
9b364d7
Rename scripts
JulianGCalderon Oct 2, 2024
99b9f20
Update compilation memory
JulianGCalderon Oct 2, 2024
c7f3c4e
nit fix
JulianGCalderon Oct 2, 2024
faa4135
Calculate actual bytecode size
JulianGCalderon Oct 2, 2024
5fc9e7a
Plot compilation memory trend
JulianGCalderon Oct 2, 2024
c4d01be
Update execution time
JulianGCalderon Oct 2, 2024
626b273
Correlate casm with native comp size
JulianGCalderon Oct 2, 2024
7f79c31
Remove extra plot
JulianGCalderon Oct 2, 2024
a0c9d64
Update time trend
JulianGCalderon Oct 2, 2024
c3991a7
Add structured_logging feature
JulianGCalderon Oct 2, 2024
e776e3a
Document the plotting scripts
JulianGCalderon Oct 2, 2024
80abf59
Update README
JulianGCalderon Oct 2, 2024
0f69cd8
Improve docs
JulianGCalderon Oct 3, 2024
fd56c6d
Merge branch 'main' into bench-analysis
JulianGCalderon Oct 3, 2024
1810fcc
Update lock
JulianGCalderon Oct 3, 2024
c34b515
Use new commit for buildin runtime
JulianGCalderon Oct 3, 2024
f70623f
Update to latest native commit
JulianGCalderon Oct 3, 2024
794b3c0
Use /opt/homebrew/opt/llvm in CI
JulianGCalderon Oct 3, 2024
4cfeb21
Instal lld
JulianGCalderon Oct 3, 2024
7758db5
Remove lld
JulianGCalderon Oct 3, 2024
7e7e08f
Set envs with brew --prefix
JulianGCalderon Oct 3, 2024
4822152
Install lld
JulianGCalderon Oct 3, 2024
f142896
Remove lld
JulianGCalderon Oct 3, 2024
3a716fa
Restore ci.yml
JulianGCalderon Oct 3, 2024
b4b5799
Remove sierras
JulianGCalderon Oct 4, 2024
83dfc5f
Remove logs
JulianGCalderon Oct 4, 2024
94e80a8
Update brew
JulianGCalderon Oct 4, 2024
f2536c5
Merge branch 'main' into bench-analysis
JulianGCalderon Oct 10, 2024
3c231df
Merge branch 'main' into bench-analysis
JulianGCalderon Oct 28, 2024
4b3337a
Update commit
JulianGCalderon Oct 28, 2024
35aafda
Update README
JulianGCalderon Oct 28, 2024
b57ded8
Update lock
JulianGCalderon Oct 28, 2024
9253572
Modify log message
JulianGCalderon Oct 29, 2024
18df2ea
Merge branch 'main' into bench-analysis
JulianGCalderon Oct 29, 2024
22 changes: 22 additions & 0 deletions README.md
@@ -118,3 +118,25 @@ To compare the outputs, you can use the following scripts. Some of them required
```bash
> ./scripts/delta_state_dumps.sh
```

### Plotting

The `plotting` directory contains Python scripts that plot benchmark information. Before using them, run the replay with the `structured_logging` feature and redirect the output to a file. Do this for both Native and VM execution.

Make sure to erase the `compiled_programs` directory, then run:

```bash
cargo run --features structured_logging block mainnet 724000 | tee native-logs
cargo run --features structured_logging,only_cairo_vm block mainnet 724000 | tee vm-logs
```

Once you have done this, you can use the plotting scripts:

- `python ./plotting/plot_compilation_memory.py native-logs`: Size of the compiled native libraries, by contract class.
- `python ./plotting/plot_compilation_memory_corr.py native-logs vm-logs`: Size of the compiled native libraries, by the associated Casm contract size.
- `python ./plotting/plot_compilation_memory_trend.py native-logs vm-logs`: Size of the compiled native and casm contracts, by the sierra contract size.
- `python ./plotting/plot_compilation_time.py native-logs`: Native compilation time, by contract class.
- `python ./plotting/plot_compilation_time_trend.py native-logs vm-logs`: Native and Casm compilation time, by the sierra contract size.
- `python ./plotting/plot_execution_time.py native-logs vm-logs`: Plots the execution time of Native vs VM, by contract class. This is best used with the benchmark feature, as it ignores compilation and RPC calls.
- `python ./plotting/plot_compilation_time_finer.py native-logs`: Native compilation time, with fine-grained stage separation, by contract class. It requires a specific [Cairo Native branch](https://github.com/lambdaclass/cairo_native/tree/time-compilation), as it needs finer logging.
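
The plotting scripts read the log files as JSON lines and keep only the compilation events. A minimal plain-Python sketch of that extraction follows. It assumes the event shape implied by the keys the scripts access (`fields.message`, `fields.size`, `fields.time`, and a `spans` list carrying `class_hash`). The sample lines and the `parse_compilation_events` helper are hypothetical and not taken from real replay output.

```python
import json

# Hypothetical log lines, shaped after the keys the plotting scripts read.
# Real replay output may contain more fields or differ in detail.
SAMPLE_LOGS = "\n".join([
    json.dumps({
        "fields": {"message": "contract compilation finished",
                   "time": "153.2", "size": 2097152},
        "spans": [{"name": "contract compilation",
                   "class_hash": "07f3a1b2c3", "length": 40960}],
    }),
    json.dumps({"fields": {"message": "some unrelated event"}, "spans": []}),
    "plain text line that is not JSON",
])


def parse_compilation_events(text):
    """Return one record per 'contract compilation finished' event."""
    records = []
    for line in text.splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON output mixed into the log file
        if not isinstance(event, dict):
            continue
        fields = event.get("fields", {})
        if "contract compilation finished" not in fields.get("message", ""):
            continue
        # The class hash lives on the enclosing compilation span.
        span = next(
            (s for s in event.get("spans", []) if "contract compilation" in s["name"]),
            None,
        )
        if span is None:
            continue
        records.append({
            "class hash": span["class_hash"],
            "size_mib": fields["size"] / (1024 * 1024),  # bytes to MiB
            "time_ms": float(fields["time"]),
        })
    return records


records = parse_compilation_events(SAMPLE_LOGS)
print(records)
```

Skipping lines that fail to parse is a choice made for this sketch; the scripts themselves pass the whole file to `pd.read_json(..., lines=True)`.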

47 changes: 47 additions & 0 deletions plotting/plot_compilation_memory.py
@@ -0,0 +1,47 @@
from argparse import ArgumentParser

argument_parser = ArgumentParser('Stress Test Plotter')
argument_parser.add_argument("native_logs_path")
arguments = argument_parser.parse_args()

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

dataset = pd.read_json(arguments.native_logs_path, lines=True, typ="series")

def canonicalize_compilation_time(event):
    if "contract compilation finished" not in event["fields"]["message"]:
        return None

    compilation_span = find_span(event, "contract compilation")
    if compilation_span is None:
        return None

    return {
        "class hash": compilation_span["class_hash"],
        "size": event["fields"]["size"] / (1024 * 1024),
    }

def find_span(event, name):
    for span in event["spans"]:
        if name in span["name"]:
            return span
    return None

def format_hash(class_hash):
    return f"0x{class_hash[:6]}..."


dataset = dataset.apply(canonicalize_compilation_time).dropna().apply(pd.Series)

figure, ax = plt.subplots()

sns.set_color_codes("bright")
sns.barplot(ax=ax, y="class hash", x="size", data=dataset, formatter=format_hash) # type: ignore

ax.set_xlabel("Library Size (MiB)")
ax.set_ylabel("Class Hash")
ax.set_title("Library Size by Contract")

plt.show()
71 changes: 71 additions & 0 deletions plotting/plot_compilation_memory_corr.py
@@ -0,0 +1,71 @@
from argparse import ArgumentParser

argument_parser = ArgumentParser('Stress Test Plotter')
argument_parser.add_argument("native_logs_path")
argument_parser.add_argument("vm_logs_path")
arguments = argument_parser.parse_args()

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

dataset_native = pd.read_json(arguments.native_logs_path, lines=True, typ="series")
dataset_vm = pd.read_json(arguments.vm_logs_path, lines=True, typ="series")

def canonicalize_compilation_time(event):
    if "contract compilation finished" not in event["fields"]["message"]:
        return None

    compilation_span = find_span(event, "contract compilation")
    if compilation_span is None:
        return None

    return {
        "class hash": compilation_span["class_hash"],
        "size": event["fields"]["size"] / 1024,
    }

def find_span(event, name):
    for span in event["spans"]:
        if name in span["name"]:
            return span
    return None

def format_hash(class_hash):
    return f"0x{class_hash[:6]}..."


dataset_native = dataset_native.apply(canonicalize_compilation_time).dropna().apply(pd.Series)
dataset_vm = dataset_vm.apply(canonicalize_compilation_time).dropna().apply(pd.Series)

dataset_native = dataset_native.set_index("class hash")
dataset_vm = dataset_vm.set_index("class hash")

dataset = dataset_native.join(dataset_vm, lsuffix="_native", rsuffix="_casm")

figure, ax = plt.subplots()

sns.set_color_codes("bright")

sns.regplot(
    x="size_native",
    y="size_casm",
    label="Native (<1000)",
    data=dataset[dataset["size_native"] < 1000],
    ax=ax,
)
sns.regplot(
    x="size_native",
    y="size_casm",
    label="Native (>=1000)",
    data=dataset[dataset["size_native"] >= 1000],
    ax=ax,
)

ax.set_xlabel("Native Compilation Size (KiB)")
ax.set_ylabel("Casm Compilation Size (KiB)")
ax.set_title("Compilation Size Correlation")

ax.legend()

plt.show()
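
The script above pairs each native library with its Casm counterpart by class hash, then fits one regression per size group. A plain-Python sketch of that pairing and split follows, assuming sizes already converted to KiB. The hashes, sizes, and the helpers `join_sizes` and `split_by_native_size` are hypothetical, for illustration only.

```python
def join_sizes(native, casm):
    """Left join on class hash, like DataFrame.join with its default how='left'.

    Classes missing from the Casm side get None here (pandas would use NaN).
    """
    return {
        class_hash: {"size_native": size, "size_casm": casm.get(class_hash)}
        for class_hash, size in native.items()
    }


def split_by_native_size(joined, threshold_kib=1000):
    """Split rows into the two groups that are plotted separately."""
    small = {h: r for h, r in joined.items() if r["size_native"] < threshold_kib}
    large = {h: r for h, r in joined.items() if r["size_native"] >= threshold_kib}
    return small, large


# Made-up sizes in KiB, keyed by a shortened class hash.
native_kib = {"0xaaa": 512.0, "0xbbb": 2048.0, "0xccc": 300.0}
casm_kib = {"0xaaa": 40.0, "0xbbb": 160.0}

joined = join_sizes(native_kib, casm_kib)
small, large = split_by_native_size(joined)
print(sorted(small), sorted(large))
```

Because the join is a left join on the native side, a class that was compiled natively but has no Casm record still appears in the joined data, with no Casm size.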
76 changes: 76 additions & 0 deletions plotting/plot_compilation_memory_trend.py
@@ -0,0 +1,76 @@
from argparse import ArgumentParser

argument_parser = ArgumentParser('Stress Test Plotter')
argument_parser.add_argument("native_logs_path")
argument_parser.add_argument("vm_logs_path")
arguments = argument_parser.parse_args()

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

dataset_native = pd.read_json(arguments.native_logs_path, lines=True, typ="series")
dataset_vm = pd.read_json(arguments.vm_logs_path, lines=True, typ="series")

def canonicalize_compilation_time(event):
    if "contract compilation finished" not in event["fields"]["message"]:
        return None

    compilation_span = find_span(event, "contract compilation")
    if compilation_span is None:
        return None

    return {
        "class hash": compilation_span["class_hash"],
        "length": compilation_span["length"] / 1024,
        "size": event["fields"]["size"] / 1024,
    }

def find_span(event, name):
    for span in event["spans"]:
        if name in span["name"]:
            return span
    return None

def format_hash(class_hash):
    return f"0x{class_hash[:6]}..."


dataset_native = dataset_native.apply(canonicalize_compilation_time).dropna().apply(pd.Series)
dataset_vm = dataset_vm.apply(canonicalize_compilation_time).dropna().apply(pd.Series)

figure, ax = plt.subplots()

sns.set_color_codes("bright")

sns.regplot(
    x="length",
    y="size",
    label="Native (<1000)",
    data=dataset_native[dataset_native["size"] < 1000],
    ax=ax,
)
sns.regplot(
    x="length",
    y="size",
    label="Native (>=1000)",
    data=dataset_native[dataset_native["size"] >= 1000],
    ax=ax,
)
sns.regplot(
    x="length",
    y="size",
    label="Casm",
    data=dataset_vm,
    ax=ax,
)

ax.set_xlabel("Sierra size (KiB)")
ax.set_ylabel("Compiled size (KiB)")
ax.set_title("Compilation Size Trend")
ax.ticklabel_format(style="plain")


ax.legend()

plt.show()
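
By default, `sns.regplot` draws a scatter plot together with a fitted linear regression line, so each trend in the script above is a straight-line fit of compiled size against Sierra size. A minimal sketch of the ordinary least-squares fit behind such a line follows. It uses made-up points rather than measured data, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (Sierra size, compiled size) pairs in KiB that lie on y = 2x + 1.
sierra_kib = [1.0, 2.0, 3.0, 4.0]
compiled_kib = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(sierra_kib, compiled_kib)
print(f"compiled = {slope:.2f} * sierra + {intercept:.2f}")
```

The slope reads as compiled KiB per KiB of Sierra, which makes the Native and Casm trends directly comparable.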
47 changes: 47 additions & 0 deletions plotting/plot_compilation_time.py
@@ -0,0 +1,47 @@
from argparse import ArgumentParser

argument_parser = ArgumentParser('Stress Test Plotter')
argument_parser.add_argument("native_logs_path")
arguments = argument_parser.parse_args()

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

dataset = pd.read_json(arguments.native_logs_path, lines=True, typ="series")

def canonicalize_compilation_time(event):
    # keep contract compilation finished logs
    if "contract compilation finished" not in event["fields"]["message"]:
        return None

    compilation_span = find_span(event, "contract compilation")
    if compilation_span is None:
        return None

    return {
        "class hash": compilation_span["class_hash"],
        "time": float(event["fields"]["time"]),
    }

def find_span(event, name):
    for span in event["spans"]:
        if name in span["name"]:
            return span
    return None

def format_hash(class_hash):
    return f"0x{class_hash[:6]}..."

dataset = dataset.apply(canonicalize_compilation_time).dropna().apply(pd.Series)

figure, ax = plt.subplots()

sns.set_color_codes("bright")
sns.barplot(ax=ax, y="class hash", x="time", data=dataset, formatter=format_hash) # type: ignore

ax.set_xlabel("Compilation Time (ms)")
ax.set_ylabel("Class Hash")
ax.set_title("Native Compilation Time")

plt.show()
112 changes: 112 additions & 0 deletions plotting/plot_compilation_time_finer.py
@@ -0,0 +1,112 @@
from argparse import ArgumentParser
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np

argument_parser = ArgumentParser("Stress Test Plotter")
argument_parser.add_argument("native_logs_path")
arguments = argument_parser.parse_args()


dataset = pd.read_json(arguments.native_logs_path, lines=True, typ="series")


def canonicalize_compilation_time(event):
    # keep only events emitted inside a contract compilation span
    compilation_span = find_span(event, "contract compilation")
    if compilation_span is None:
        return None

    class_hash = compilation_span["class_hash"]
    class_length = compilation_span["length"]

    if "contract compilation finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "Total",
            "time": float(event["fields"]["time"]),
        }
    elif "sierra to mlir compilation finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "Sierra to MLIR",
            "time": float(event["fields"]["time"]),
        }
    elif "mlir passes finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "MLIR passes",
            "time": float(event["fields"]["time"]),
        }
    elif "mlir to llvm finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "MLIR to LLVM",
            "time": float(event["fields"]["time"]),
        }
    elif "llvm passes finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "LLVM passes",
            "time": float(event["fields"]["time"]),
        }
    elif "llvm to object compilation finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "LLVM to object",
            "time": float(event["fields"]["time"]),
        }
    elif "linking finished" in event["fields"]["message"]:
        return {
            "class hash": class_hash,
            "length": class_length,
            "type": "Linking",
            "time": float(event["fields"]["time"]),
        }
    return None


def find_span(event, name):
    for span in event["spans"]:
        if name in span["name"]:
            return span
    return None


def format_hash(class_hash):
    return f"0x{class_hash[:6]}..."


dataset = dataset.apply(canonicalize_compilation_time).dropna().apply(pd.Series)
dataset = dataset.pivot(index = ["class hash"], columns = "type", values = "time")

pd.set_option('display.max_columns', None)

figure, ax = plt.subplots()

sns.set_color_codes("pastel")
sns.barplot(data=dataset, y="class hash", x="Total", label="Other", ax=ax, formatter=format_hash)

bottom = np.zeros(len(dataset))
sections = ["Linking", "LLVM to object", "LLVM passes", "MLIR to LLVM", "MLIR passes", "Sierra to MLIR"]

for section in sections:
    bottom += dataset[section]

for section in sections:
    sns.barplot(y=dataset.index, x=bottom, ax=ax, label=section, formatter=format_hash, orient="h")
    bottom -= dataset[section]

ax.set_xlabel("Compilation Time (ms)")
ax.set_ylabel("Class Hash")
ax.set_title("Native Compilation Time")
ax.legend()

plt.show()
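
The script above builds its stacked bars by overlaying: it first draws the total, then draws each stage as a bar whose width is the running sum of the stages not yet drawn, so every later bar covers all but one segment of the previous one. A plain-Python sketch of that arithmetic follows. The stage times are made up, and `overlay_widths` and `visible_segments` are hypothetical helpers.

```python
# Same drawing order as the script: widest bar first.
SECTIONS = ["Linking", "LLVM to object", "LLVM passes",
            "MLIR to LLVM", "MLIR passes", "Sierra to MLIR"]


def overlay_widths(stage_times):
    """Width of the bar drawn for each section, from widest to narrowest."""
    remaining = sum(stage_times[section] for section in SECTIONS)
    widths = {}
    for section in SECTIONS:
        widths[section] = remaining
        remaining -= stage_times[section]
    return widths


def visible_segments(widths):
    """Part of each bar left uncovered by the next, narrower bar."""
    ordered = list(widths.items()) + [("", 0)]
    return {
        name: width - ordered[i + 1][1]
        for i, (name, width) in enumerate(ordered[:-1])
    }


# Made-up stage times in milliseconds for a single contract class.
times = {"Linking": 1, "LLVM to object": 2, "LLVM passes": 3,
         "MLIR to LLVM": 4, "MLIR passes": 5, "Sierra to MLIR": 6}
total = 25  # hypothetical "Total" reported by the compilation-finished event

widths = overlay_widths(times)
segments = visible_segments(widths)
other = total - widths["Linking"]  # time not attributed to any listed stage
print(widths, segments, other)
```

Each visible segment equals that stage's own time, and whatever remains of the total bar beyond the widest stage bar is what the plot labels "Other".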