From 143de7443a4d2152fae6927dad69440b5896632b Mon Sep 17 00:00:00 2001
From: Travis Long
Date: Mon, 21 Oct 2024 14:38:18 -0500
Subject: [PATCH] Bug 1926095 - Add how-to with tips on investigating data anomalies

---
 .dictionary                                   |  3 +-
 docs/user/SUMMARY.md                          |  1 +
 docs/user/user/howto/index.md                 | 14 ++++-
 .../investigating-data-issues.md              | 60 +++++++++++++++++++
 4 files changed, 75 insertions(+), 3 deletions(-)
 create mode 100644 docs/user/user/howto/investigating-data-issues/investigating-data-issues.md

diff --git a/.dictionary b/.dictionary
index cd56c3d518..3cf7b16fc8 100644
--- a/.dictionary
+++ b/.dictionary
@@ -1,4 +1,4 @@
-personal_ws-1.1 en 290 utf-8
+personal_ws-1.1 en 291 utf-8
 AAR
 AARs
 ABI
@@ -41,6 +41,7 @@ Gradle
 Grapheme
 Hotfix
 Howtos
+ISPs
 JDK
 JNA
 JNI
diff --git a/docs/user/SUMMARY.md b/docs/user/SUMMARY.md
index 94c7de3483..722d5db656 100644
--- a/docs/user/SUMMARY.md
+++ b/docs/user/SUMMARY.md
@@ -47,6 +47,7 @@
 - [Walkthroughs and How-tos](user/howto/index.md)
     - [Server Knobs Walkthrough](user/howto/server-knobs-walkthrough/server-knobs-walkthrough.md)
     - ["Real-Time" Events](user/howto/real-time-events/real-time-events.md)
+    - [Telemetry/Data Bug Investigation Recommendations](user/howto/investigating-data-issues/investigating-data-issues.md)
 
 # API Reference
 
diff --git a/docs/user/user/howto/index.md b/docs/user/user/howto/index.md
index 6791411fec..9bc41edbbc 100644
--- a/docs/user/user/howto/index.md
+++ b/docs/user/user/howto/index.md
@@ -4,6 +4,16 @@ This chapter contains various how-tos and walkthroughs to help aid you in using
 
 ### [Server Knobs Walkthrough]
 
-A step-by-step guide in setting up and launching a Server Knobs Experiment
+A step-by-step guide in setting up and launching a Server Knobs Experiment.
 
-[Server Knobs Walkthrough]: ./server-knobs-walkthrough/server-knobs-walkthrough.md
\ No newline at end of file
+### ["Real-Time" Events]
+
+A guide describing the different methods to collect and transmit data in a "real-time" fashion using Glean.
+
+### [Telemetry/Data Bug Investigation Recommendations]
+
+Recommendations and tips on investigating data anomalies.
+
+[Server Knobs Walkthrough]: ./server-knobs-walkthrough/server-knobs-walkthrough.md
+["Real-Time" Events]: ./real-time-events/real-time-events.md
+[Telemetry/Data Bug Investigation Recommendations]: ./investigating-data-issues/investigating-data-issues.md
diff --git a/docs/user/user/howto/investigating-data-issues/investigating-data-issues.md b/docs/user/user/howto/investigating-data-issues/investigating-data-issues.md
new file mode 100644
index 0000000000..5bb0270581
--- /dev/null
+++ b/docs/user/user/howto/investigating-data-issues/investigating-data-issues.md
@@ -0,0 +1,60 @@
+# Telemetry/Data Bug Investigation Recommendations
+
+This document outlines several diagnostic categories and the insights they may offer when investigating unusual telemetry patterns or data anomalies.
+
+### 1\. Countries (e.g., China, Iran, etc.)
+
+* Purpose: Identify geographical patterns that could explain anomalies.
+* Considerations:
+    * Are there ongoing national holidays or similar events that could affect data?
+    * Is the region known for bot activity or unusual behavior? (e.g., Malaysia, China, Ireland, etc.)
+
+### 2\. ISP (Internet Service Provider)
+
+* Purpose: Analyze data at a more granular level than countries to identify potential automation or bot activity.
+* Considerations:
+    * Could the anomaly be traced back to a single ISP, potentially indicating automation?
+    * Be mindful of the large number of ISPs; consider applying filters (e.g., HAVING clause) to exclude smaller ISPs.
+
+### 3\. Product Version / Build ID
+
+* Purpose: Check if issues began with a specific product version or build.
+* Considerations:
+    * Did the issue arise after a particular version update? If so, collaborate with the product team to identify changes.
+    * Ensure that the build ID matches a known Mozilla build. If not, it could be a clone, fork, or side-load build.
+
+### 4\. Glean SDK Version
+
+* Purpose: Determine whether the issue is tied to a specific Glean SDK version.
+* Considerations:
+    * Did the anomaly start after an update to Glean? Work with the Glean team to verify version changes.
+
+### 5\. Other Library Version Changes
+
+* Purpose: Identify possible regressions due to library updates.
+* Considerations:
+    * Review updates to Application Services, Gecko, and other dependencies (e.g., Viaduct, rkv) that could affect telemetry collection.
+
+### 6\. OS SDK Version (Android, iOS)
+
+* Purpose: Check if platform SDK changes are impacting data collection.
+* Considerations:
+    * Have there been changes to platform lifecycle events or background task behaviors (e.g., 0-duration pings, or ping submission issues)?
+
+### 7\. Time Differences: start/end\_time vs. submission\_timestamp
+
+* Purpose: Assess the delay between telemetry collection and submission.
+* Considerations:
+    * Are the recorded timestamps reasonable, both in terms of the ping time window and the delay from collection to submission?
+
+### 8\. Glean Errors
+
+* Purpose: Identify telemetry or network errors related to data collection.
+* Considerations:
+    * Are there networking errors, ingestion issues, or other telemetry failures that could be related to the anomaly?
+
+### 9\. Hardware Details (Manufacturer/Version)
+
+* Purpose: Determine if the issue is hardware-specific.
+* Considerations:
+    * Does the anomaly occur primarily on older or newer hardware models?
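
The ISP recommendation in section 2 of the new how-to (apply a filter such as a `HAVING` clause to exclude smaller ISPs) could be illustrated roughly as follows. This is only a sketch: the `pings` table, its `isp` and `client_id` columns, and the `pings_by_isp` helper are hypothetical names chosen for illustration, not a real Glean dataset schema, and an in-memory SQLite database stands in for whatever warehouse the data actually lives in.

```python
import sqlite3


def pings_by_isp(rows, min_pings):
    """Count pings per ISP, keeping only ISPs with at least `min_pings` pings.

    `rows` is an iterable of (isp, client_id) tuples. The table and column
    names are hypothetical and exist only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pings (isp TEXT, client_id TEXT)")
    conn.executemany("INSERT INTO pings VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT isp, COUNT(*) AS ping_count
        FROM pings
        GROUP BY isp
        HAVING COUNT(*) >= ?  -- drop the long tail of small ISPs
        ORDER BY ping_count DESC, isp
        """,
        (min_pings,),
    ).fetchall()
    conn.close()
    return result


# Made-up sample data: three ISPs of different sizes.
sample = [
    ("isp-a", "c1"), ("isp-a", "c2"), ("isp-a", "c3"),
    ("isp-b", "c4"),
    ("isp-c", "c5"), ("isp-c", "c6"),
]
print(pings_by_isp(sample, 2))
```

With a threshold of 2, the single-ping `isp-b` is filtered out, so any ISP that dominates the remaining rows is easier to spot. The right threshold depends on the volume of the dataset being investigated.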