
core(long-tasks): add more task information to debugData #15198

Merged
merged 3 commits into main from more-long-tasks on Jul 10, 2023

Conversation

brendankenny (Member):

The long-tasks audit lists the 20 longest tasks, attributed to specific URLs when possible. This makes sense in the HTML report, where these 20 are typically representative of the worst behavior, and any more would risk overwhelming any prioritization it provides.

However, when doing analysis with HTTP Archive, you often want all the long tasks, to see where first and third parties are spending time when blocking the main thread. You also often want to see how the time was spent during each task (all JS? lengthy layout? etc.), which today we only provide in aggregate in mainthread-work-breakdown, not per task or per URL (third-party-summary offers yet another summary, with slightly different information summed per third-party URL).

This PR extends the debugData associated with the long-tasks audit to include more long tasks, plus a breakdown of how time was spent during each one (similar to some of the breakdown LoAF will provide).

As an example, here's the debug data taken from sample_v2:

{
  type: 'debugdata',
  urls: [
    'http://localhost:10200/dobetterweb/dbw_tester.html',
    'http://localhost:10200/dobetterweb/third_party/aggressive-promise-polyfill.js',
  ],
  tasks: [
    {
      urlIndex: 0,
      startTime: 6845.8,
      duration: 1174.9,
      other: 0.8,
      parseHTML: 0.6,
      scriptEvaluation: 1170.7,
      scriptParseCompile: 2.5,
      styleLayout: 0.2
    },
    {
      urlIndex: 1,
      startTime: 8044.8,
      duration: 145.8,
      other: 0.1,
      parseHTML: 13.5,
      scriptEvaluation: 120.7,
      scriptParseCompile: 10.4,
      styleLayout: 1.2
    }
  ]
}

The URLs are often the biggest contributor to LHR size, so this keeps them in a separate urls array and has each task entry index into it, so that each URL is listed at most once.
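As a quick illustration of consuming that shape, here's a minimal sketch in plain JavaScript; debugData stands for the object shown above, and the log format is purely illustrative:

// Minimal sketch: resolve each task entry back to its attributed URL.
// `debugData` is assumed to be the object shown above.
const {urls, tasks} = debugData;
for (const task of tasks) {
  const url = urls[task.urlIndex];
  console.log(`${url}: started at ${task.startTime}ms, blocked for ${task.duration}ms`);
}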

@brendankenny brendankenny requested a review from a team as a code owner June 23, 2023 18:22
@brendankenny brendankenny requested review from adamraine and removed request for a team June 23, 2023 18:22
@@ -14,6 +14,9 @@ import {getJavaScriptURLs, getAttributableURLForTask} from '../lib/tracehouse/ta

/** We don't always have timing data for short tasks, if we're missing timing data. Treat it as though it were 0ms. */
const DEFAULT_TIMING = {startTime: 0, endTime: 0, duration: 0};
const DISPLAYED_TASK_COUNT = 20;
// 99% of the May 2023 HTTP Archive run had fewer than 150 long tasks.
const DEBUG_TASK_COUNT = 150;
brendankenny (Member, Author):
I went back and forth on this number. 150 is a lot, but even in the top 1000 most popular sites in CrUX, there are still 18 sites with more than 150 long tasks in their May 2023 HTTP Archive run.

LHR size-wise, the good news is that 90% of the 27 million pages tested in the May run have ~50 long tasks or fewer (the median is 11 long tasks). Most pages won't have enough to move the needle.

150 long tasks will cover all long tasks for 99% of pages. Of the remaining 1% of pages, in 90% of them every long task beyond the 150th is shorter than 75ms (there's only 1 site in the top 1000 with more than 150 long tasks whose 151st long task is longer than 75ms).

At that point, we're capturing basically everything that's going on, and I don't think it's worth boosting this number to e.g. 240 to be able to capture the 99.9% case.
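For concreteness, a hypothetical sketch of how the two constants above relate (not the actual audit code; sortedTasks stands in for the long tasks already sorted by duration):

// Hypothetical illustration only: the report table shows at most
// DISPLAYED_TASK_COUNT tasks, while debugData can carry up to DEBUG_TASK_COUNT.
const displayedTasks = sortedTasks.slice(0, DISPLAYED_TASK_COUNT);
const debugTasks = sortedTasks.slice(0, DEBUG_TASK_COUNT);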

Comment on lines -73 to -77
} else {
  for (const task of tasks) {
    if (task.unbounded || task.parent) continue;
    taskTimingsByEvent.set(task.event, task);
  }
brendankenny (Member, Author):
The existing code uses taskTimingsByEvent for all task timing, including in the non-simulated case, by putting the tasks directly into the map here. To get the task type breakdown you have to go into child tasks, so this would have to drop the task.parent check and instead put all events in the map, which felt like overkill since the timings are already on the tasks we're traversing.

Instead, taskTimingsByEvent is only created in the throttlingMethod === 'simulate' case. Timings come from there if it exists; otherwise the timing comes from the task directly (this is all wrapped up in the getTiming function now).
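A rough sketch of the shape described here (not the exact PR code; the fallback to DEFAULT_TIMING mirrors the existing constant in the diff above):

// Rough sketch: taskTimingsByEvent is only populated when
// throttlingMethod === 'simulate'; otherwise use the task's own observed timings.
function getTiming(task, taskTimingsByEvent) {
  if (taskTimingsByEvent) {
    // Simulated throttling: look up the simulator-adjusted timing by trace event.
    return taskTimingsByEvent.get(task.event) || DEFAULT_TIMING;
  }
  // Observed throttling: the timings already live on the task itself.
  const {startTime, endTime, duration} = task;
  return {startTime, endTime, duration};
}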

benschwarz (Contributor) commented Jun 24, 2023:

We recently discussed this limitation internally, so just wanted to add my support for this change too!

As you've pointed out, several of the audits choose to truncate data for display in HTML reports, but it means that API users (like us) have to create custom audits or find other ways to collect certain information.

Increasing truncation limits is a worthy improvement 👍

However, one detail of note is that truncating results still means that any calculated totals (e.g. "total amount of script time for a third party") aren't really totals, but totals of what was sampled. Given that you've based the new limit on HTTP Archive data, this is less of a problem, so I'm not sure how much that detail really matters.
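For example (a sketch using the debugData fields from the sample above; the third-party check is just illustrative), a "total" computed like this only counts the sampled tasks whenever truncation applies:

// Sum scriptEvaluation time for tasks attributed to a given third party.
// If the task list was truncated, this is a lower bound, not a true total.
const thirdPartyScriptMs = tasks
  .filter(task => urls[task.urlIndex].includes('third_party'))
  .reduce((sum, task) => sum + (task.scriptEvaluation || 0), 0);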

brendankenny (Member, Author):

@benschwarz those are all good points. I'm taking a look at how bad it would be if we didn't include a truncation limit (or set it much much higher).

brendankenny (Member, Author):

I put together a cost/benefit analysis here (sorry, locked down). Removing the limit on the number of results puts its LHR size impact in the middle of current opportunities for the mean result (20 long tasks), and actually below many opportunities at p95 (75 long tasks). Being able to always calculate results from the full long-task data seems worth the tradeoff.

(Meanwhile, reducing the LHR size of all the image opportunities looks like it would be a major size win for LH 12+ if we want to prioritize that.)

@adamraine adamraine merged commit 4d3ee72 into main Jul 10, 2023
31 checks passed
@adamraine adamraine deleted the more-long-tasks branch July 10, 2023 19:40