
specifying log_compression results in log_avg_msec ignored for bw and iops logs #1746

Open
Rogach opened this issue Apr 6, 2024 · 5 comments

@Rogach commented Apr 6, 2024

Description of the bug:
If log_compression is specified together with log_avg_msec, the bw and iops logs still contain a separate entry for every completed I/O, instead of the averaged entries that log_avg_msec should produce.

Environment: Arch Linux, kernel 6.7.5-arch1-1

fio version: fio-3.37

Reproduction steps (test.fio):

[global]
size=1TB

kb_base=1000
ioengine=libaio
invalidate=1
direct=1

write_bw_log
write_iops_log
write_lat_log
log_compression=128MiB
log_avg_msec=10

[test]
rw=read
io_limit=2GiB
iodepth=64

Run with fio --filename /dev/sdX test.fio.

Observe lots of non-averaged entries in the bw and iops logs:

9, 2803, 0, 0, 0
9, 3107, 0, 0, 0
9, 3103, 0, 0, 0
9, 3102, 0, 0, 0
9, 3102, 0, 0, 0
9, 3101, 0, 0, 0
9, 3101, 0, 0, 0
9, 3101, 0, 0, 0
9, 3100, 0, 0, 0
9, 3100, 0, 0, 0
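
For contrast, if log_avg_msec=10 were honored here, the bw log should contain one averaged entry per ~10 ms window, with timestamps advancing in roughly 10 ms steps rather than many entries sharing the same millisecond. The expected shape would be along these lines (values are hypothetical and shown only to illustrate the format):

10, 3050, 0, 0, 0
20, 3120, 0, 0, 0
30, 3080, 0, 0, 0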
@vincentkfu (Collaborator) commented:

Do you know a fio version where this was working? I tried 3.36 and see the same issue.

root@localhost:~/fio-dev/1746/fio-canonical# ./fio-3.36 --name=test --ioengine=null --filesize=1T --write_bw_log=test --log_avg_msec=1000 --time_based --runtime=3s --log_compression=128M
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=null, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [R(1)][-.-%][r=14.8GiB/s][r=3868k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=16213: Mon Apr  8 14:13:48 2024
  read: IOPS=3857k, BW=14.7GiB/s (15.8GB/s)(44.1GiB/3000msec)
    clat (nsec): min=30, max=16490, avg=33.72, stdev=39.94
     lat (nsec): min=60, max=16640, avg=64.57, stdev=55.76
    clat percentiles (nsec):
     |  1.00th=[   32],  5.00th=[   32], 10.00th=[   32], 20.00th=[   33],
     | 30.00th=[   33], 40.00th=[   33], 50.00th=[   33], 60.00th=[   34],
     | 70.00th=[   34], 80.00th=[   34], 90.00th=[   35], 95.00th=[   35],
     | 99.00th=[   36], 99.50th=[   37], 99.90th=[  125], 99.95th=[  131],
     | 99.99th=[  137]
   bw (  MiB/s): min=  242, max=133333, per=100.00%, avg=119630.13, stdev=4764.08, samples=11570360
   iops        : min=3798766, max=3876458, avg=3859501.60, stdev=34036.42, samples=5
  lat (nsec)   : 50=99.89%, 100=0.01%, 250=0.10%, 500=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=99.83%, sys=0.13%, ctx=3, majf=0, minf=112
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=11570360,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=14.7GiB/s (15.8GB/s), 14.7GiB/s-14.7GiB/s (15.8GB/s-15.8GB/s), io=44.1GiB (47.4GB), run=3000-3000msec
root@localhost:~/fio-dev/1746/fio-canonical# head test_bw.1.log
999, 122248957, 0, 0, 0
999, 35008547, 0, 0, 0
999, 117028571, 0, 0, 0
999, 117028571, 0, 0, 0
999, 124121212, 0, 0, 0
999, 128000000, 0, 0, 0
999, 124121212, 0, 0, 0
999, 117028571, 0, 0, 0
999, 117028571, 0, 0, 0
999, 124121212, 0, 0, 0

@Rogach (Author) commented Apr 9, 2024

No, I only tested it on the latest version. Do you want me to bisect it?

@vincentkfu (Collaborator) commented:

I think the issue is in iolog.h:

static inline bool per_unit_log(struct io_log *log)
{
        return log && (!log->avg_msec || log->log_gz || log->log_gz_store);
}

Try reverting 8363654 and let me know if it works. The only problem is that I'm not sure what issues that patch fixed, so I don't know what harm would be caused by reverting it.
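
For reference, the practical effect of such a revert would presumably be that compression no longer forces per-unit logging, so avg_msec would be honored again. A minimal sketch of what per_unit_log() might then look like (an assumption about the revert's effect, not the actual diff of 8363654):

static inline bool per_unit_log(struct io_log *log)
{
        /*
         * Assumed post-revert behavior: log one entry per I/O only when no
         * averaging window (log_avg_msec) is set; log_gz/log_gz_store
         * (i.e. log_compression) no longer force per-unit entries.
         */
        return log && !log->avg_msec;
}

Whether that reintroduces whatever problem 8363654 originally fixed is exactly the open question noted above.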

@Rogach (Author) commented Apr 11, 2024

I reverted that commit, and got a segfault :) Seems that patch did have some purpose.

@vincentkfu (Collaborator) commented:

gdb says that the segfault occurs at:

#0  all_sw_idle (wq=0x7f12fbcb7b98) at workqueue.c:76
76			if (!(sw->flags & SW_F_IDLE))
[Current thread is 1 (Thread 0x7f12fbc896c0 (LWP 4449))]
(gdb) list
71		int i;
72	
73		for (i = 0; i < wq->max_workers; i++) {
74			struct submit_worker *sw = &wq->workers[i];
75	
76			if (!(sw->flags & SW_F_IDLE))
77				return false;
78		}
79	
80		return true;
