Bug 1926095 - Add how-to with tips on investigating data anomalies
travis79 committed Oct 21, 2024
1 parent 36d1bb1 commit 143de74
Showing 4 changed files with 75 additions and 3 deletions.
3 changes: 2 additions & 1 deletion .dictionary
@@ -1,4 +1,4 @@
-personal_ws-1.1 en 290 utf-8
+personal_ws-1.1 en 291 utf-8
AAR
AARs
ABI
@@ -41,6 +41,7 @@ Gradle
Grapheme
Hotfix
Howtos
ISPs
JDK
JNA
JNI
1 change: 1 addition & 0 deletions docs/user/SUMMARY.md
@@ -47,6 +47,7 @@
- [Walkthroughs and How-tos](user/howto/index.md)
- [Server Knobs Walkthrough](user/howto/server-knobs-walkthrough/server-knobs-walkthrough.md)
- ["Real-Time" Events](user/howto/real-time-events/real-time-events.md)
- [Telemetry/Data Bug Investigation Recommendations](user/howto/investigating-data-issues/investigating-data-issues.md)

# API Reference

14 changes: 12 additions & 2 deletions docs/user/user/howto/index.md
@@ -4,6 +4,16 @@ This chapter contains various how-tos and walkthroughs to help aid you in using

### [Server Knobs Walkthrough]

-A step-by-step guide in setting up and launching a Server Knobs Experiment
+A step-by-step guide in setting up and launching a Server Knobs Experiment.

-[Server Knobs Walkthrough]: ./server-knobs-walkthrough/server-knobs-walkthrough.md
### ["Real-Time" Events]

A guide describing the different methods to collect and transmit data in a "real-time" fashion using Glean.

### [Telemetry/Data Bug Investigation Recommendations]

Recommendations and tips on investigating data anomalies.

[Server Knobs Walkthrough]: ./server-knobs-walkthrough/server-knobs-walkthrough.md
["Real-Time" Events]: ./real-time-events/real-time-events.md
[Telemetry/Data Bug Investigation Recommendations]: ./investigating-data-issues/investigating-data-issues.md
@@ -0,0 +1,60 @@
# Telemetry/Data Bug Investigation Recommendations

This document outlines several diagnostic categories and the insights they may offer when investigating unusual telemetry patterns or data anomalies.

### 1\. Countries (e.g., China, Iran)

* Purpose: Identify geographical patterns that could explain anomalies.
* Considerations:
    * Are there ongoing national holidays or similar events that could affect data?
    * Is the region known for bot activity or unusual behavior (e.g., Malaysia, China, Ireland)?

### 2\. ISP (Internet Service Provider)

* Purpose: Analyze data at a more granular level than countries to identify potential automation or bot activity.
* Considerations:
    * Could the anomaly be traced back to a single ISP, potentially indicating automation?
    * Be mindful of the large number of ISPs; consider applying filters (e.g., a SQL `HAVING` clause) to exclude smaller ISPs.
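
The ISP filter suggested above can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module as a stand-in for the real telemetry dataset; the `pings` table, its `isp` and `client_id` columns, and the `min_pings` threshold are hypothetical names chosen for the example, not part of any actual schema.

```python
import sqlite3


def isp_ping_counts(rows, min_pings):
    """Return (isp, ping_count) pairs for ISPs with at least `min_pings` pings.

    `rows` is an iterable of (isp, client_id) tuples. The table and column
    names are hypothetical; a real investigation would query the actual
    telemetry tables instead of this in-memory database.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE pings (isp TEXT, client_id TEXT)")
        conn.executemany("INSERT INTO pings VALUES (?, ?)", rows)
        query = """
            SELECT isp, COUNT(*) AS ping_count
            FROM pings
            GROUP BY isp
            HAVING COUNT(*) >= ?      -- drop ISPs below the threshold
            ORDER BY ping_count DESC, isp
        """
        return conn.execute(query, (min_pings,)).fetchall()
    finally:
        conn.close()


# Made-up sample data: three ISPs with 5, 2, and 1 pings respectively.
sample = [("isp-a", "c1")] * 5 + [("isp-b", "c2")] * 2 + [("isp-c", "c3")]
print(isp_ping_counts(sample, min_pings=2))
```

Raising `min_pings` trims the long tail of small ISPs so that a single dominant ISP, if there is one, stands out in the result.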

### 3\. Product Version / Build ID

* Purpose: Check if issues began with a specific product version or build.
* Considerations:
    * Did the issue arise after a particular version update? If so, collaborate with the product team to identify changes.
    * Ensure that the build ID matches a known Mozilla build. If not, it could be a clone, fork, or side-loaded build.
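
One way to sketch the build ID check is to count pings whose build ID is not in a list of known builds. The build ID values below are made up for illustration, and how the list of known builds is obtained is left open; `unknown_build_counts` is a hypothetical helper, not an existing tool.

```python
from collections import Counter


def unknown_build_counts(build_ids, known_build_ids):
    """Count occurrences of build IDs that are not in the known set.

    `build_ids` holds one entry per ping; `known_build_ids` is whatever
    list of official builds is available (an assumption of this sketch).
    """
    known = set(known_build_ids)
    return Counter(b for b in build_ids if b not in known)


# Made-up build IDs for illustration only.
known = ["20241001000000", "20241014000000"]
seen = [
    "20241014000000",
    "20241014000000",
    "19990101000000",
    "19990101000000",
    "20241001000000",
]
print(unknown_build_counts(seen, known))
```

A build ID that appears many times here but is absent from the known list is a candidate for a clone, fork, or side-loaded build and is worth a closer look.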

### 4\. Glean SDK Version

* Purpose: Determine whether the issue is tied to a specific Glean SDK version.
* Considerations:
    * Did the anomaly start after an update to Glean? Work with the Glean team to verify version changes.

### 5\. Other Library Version Changes

* Purpose: Identify possible regressions due to library updates.
* Considerations:
    * Review updates to Application Services, Gecko, and other dependencies (e.g., Viaduct, rkv) that could affect telemetry collection.

### 6\. OS SDK Version (Android, iOS)

* Purpose: Check if platform SDK changes are impacting data collection.
* Considerations:
    * Have there been changes to platform lifecycle events or background task behaviors that could show up as, for example, 0-duration pings or ping submission issues?

### 7\. Time Differences: start/end\_time vs. submission\_timestamp

* Purpose: Assess the delay between telemetry collection and submission.
* Considerations:
    * Are the recorded timestamps reasonable, both in terms of the ping time window and the delay from collection to submission?
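
The two quantities mentioned above, the ping time window and the collection-to-submission delay, can be computed roughly as follows. This sketch assumes all three timestamps are ISO 8601 strings that include a UTC offset; the one-day and two-day thresholds and the flag names are arbitrary assumptions for illustration, not recommended values.

```python
from datetime import datetime, timedelta


def ping_timing(start_time, end_time, submission_timestamp):
    """Return (window, delay) as timedeltas.

    window = end_time - start_time
    delay  = submission_timestamp - end_time
    All inputs are assumed to be ISO 8601 strings with a UTC offset.
    """
    start = datetime.fromisoformat(start_time)
    end = datetime.fromisoformat(end_time)
    submitted = datetime.fromisoformat(submission_timestamp)
    return end - start, submitted - end


def timing_flags(window, delay,
                 max_window=timedelta(days=1),
                 max_delay=timedelta(days=2)):
    """Label suspicious timings; thresholds are illustrative assumptions."""
    flags = []
    if window < timedelta(0):
        flags.append("end_before_start")
    elif window == timedelta(0):
        flags.append("zero_duration")
    elif window > max_window:
        flags.append("long_window")
    if delay < timedelta(0):
        # Possibly client clock skew rather than a pipeline problem.
        flags.append("submitted_before_end")
    elif delay > max_delay:
        flags.append("late_submission")
    return flags


window, delay = ping_timing(
    "2024-10-21T10:00:00+00:00",
    "2024-10-21T10:00:00+00:00",
    "2024-10-24T10:00:00+00:00",
)
print(window, delay, timing_flags(window, delay))
```

Aggregating these flags by day, version, or country can show whether odd timings line up with the start of the anomaly.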

### 8\. Glean Errors

* Purpose: Identify telemetry or network errors related to data collection.
* Considerations:
    * Are there networking errors, ingestion issues, or other telemetry failures that could be related to the anomaly?

### 9\. Hardware Details (Manufacturer/Version)

* Purpose: Determine if the issue is hardware-specific.
* Considerations:
    * Does the anomaly occur primarily on older or newer hardware models?
