Merge branch 'main' into hackathon-xray-web-stream
jcreixell authored Feb 8, 2024
2 parents 7d25a5d + 1a035ee commit 489dff5
Showing 383 changed files with 12,377 additions and 8,869 deletions.
116 changes: 58 additions & 58 deletions .drone/drone.yml

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion .github/workflows/integration-tests.yml
@@ -16,7 +16,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: "1.21"
go-version: "1.22"
- name: Set OTEL Exporter Endpoint
run: echo "OTEL_EXPORTER_ENDPOINT=172.17.0.1:4318" >> $GITHUB_ENV
- name: Run tests
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
@@ -17,10 +17,10 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Go 1.21
- name: Set up Go 1.22
uses: actions/setup-go@v5
with:
go-version: "1.21"
go-version: "1.22"
cache: true
- name: Test
run: make GO_TAGS="nodocker" test
82 changes: 74 additions & 8 deletions CHANGELOG.md
@@ -10,23 +10,51 @@ internal API changes are not present.
Main (unreleased)
-----------------

### Security fixes
### Breaking changes

- Fixes the following vulnerabilities (@hainenber)
- [GO-2023-2409](https://github.com/advisories/GHSA-mhpq-9638-x6pw)
- [GO-2023-2412](https://github.com/advisories/GHSA-7ww5-4wqc-m92c)
- [CVE-2023-49568](https://github.com/advisories/GHSA-mw99-9chc-xw7r)
- Prohibit the configuration of services within modules. (@wildum)

- For `otelcol.exporter` components, change the default value of `disable_high_cardinality_metrics` to `true`. (@ptodev)

### Features

- A new `discovery.process` component for discovering Linux OS processes on the current host. (@korniltsev)

- A new `pyroscope.java` component for profiling Java processes using async-profiler. (@korniltsev)

- A new `otelcol.processor.resourcedetection` component which inserts resource attributes
to OTLP telemetry based on the host on which Grafana Agent is running. (@ptodev)

- Expose `track_timestamps_staleness` on Prometheus scraping, to fix the issue where container metrics live for 5 minutes after the container disappears. (@ptodev)

### Enhancements

- Include line numbers in profiles produced by `pyroscope.java` component. (@korniltsev)
- Add an option to the Windows static mode installer for expanding environment variables in the YAML config. (@erikbaranowski)
- Add authentication support to `loki.source.awsfirehose` (@sberz)

- Sort kubelet endpoint to reduce pressure on K8s's API server and watcher endpoints. (@hainenber)

- Expose `physical_disk` collector from `windows_exporter` v0.24.0 to
- Expose `physical_disk` collector from `windows_exporter` v0.24.0 to
Flow configuration. (@hainenber)

- Renamed Grafana Agent Mixin's "prometheus.remote_write" dashboard to
"Prometheus Components" and added charts for `prometheus.scrape` success rate
and duration metrics. (@thampiotr)

- Removed `ClusterLamportClockDrift` and `ClusterLamportClockStuck` alerts from
Grafana Agent Mixin to focus on alerting on symptoms. (@thampiotr)

- Increased clustering alert periods to 10 minutes to improve the
signal-to-noise ratio in Grafana Agent Mixin. (@thampiotr)

- `mimir.rules.kubernetes` has a new `prometheus_http_prefix` argument to configure
the HTTP endpoint on which to connect to Mimir's API. (@hainenber)

- `service_name` label is inferred from discovery meta labels in `pyroscope.java` (@korniltsev)

- Mutex and block pprofs are now available via the pprof endpoint. (@mattdurham)

### Bugfixes

- Fix an issue in `remote.s3` where the exported content of an object would be an empty string if `remote.s3` failed to fully retrieve
@@ -37,6 +65,16 @@ Main (unreleased)
- Fix a duplicate metrics registration panic when sending metrics to a static
mode metric instance's write handler. (@tpaschalis)

- Fix issue causing duplicate logs when a docker target is restarted. (@captncraig)

- Fix an issue where blocks having the same type and the same label across
modules could result in missed updates. (@thampiotr)

- Fix an issue with static integrations-next marshaling where non singletons
would cause `/-/config` to fail to marshal. (@erikbaranowski)

- Fix divide-by-zero issue when sharding targets. (@hainenber)

### Other changes

- Removed support for Windows 2012 in line with Microsoft end of life. (@mattdurham)
@@ -45,6 +83,34 @@ Main (unreleased)

- Updated dependency to add support for Go 1.22 (@stefanb)

- Use Go 1.22 for builds. (@rfratto)

v0.39.2 (2024-01-31)
--------------------

### Bugfixes

- Fix error introduced in v0.39.0 preventing remote write to Amazon Managed Prometheus. (@captncraig)

- An error will be returned in the converter from Static to Flow when `scrape_integration` is set
to `true` but no `remote_write` is defined. (@erikbaranowski)


v0.39.1 (2024-01-19)
--------------------

### Security fixes

- Fixes the following vulnerabilities (@hainenber)
- [GO-2023-2409](https://github.com/advisories/GHSA-mhpq-9638-x6pw)
- [GO-2023-2412](https://github.com/advisories/GHSA-7ww5-4wqc-m92c)
- [CVE-2023-49568](https://github.com/advisories/GHSA-mw99-9chc-xw7r)

### Bugfixes

- Fix issue where installing the Windows Agent Flow installer would hang then crash. (@mattdurham)


v0.39.0 (2024-01-09)
--------------------

@@ -63,7 +129,7 @@ v0.39.0 (2024-01-09)
- This change will not break any existing configurations and you can opt in to validation via the `validate_dimensions` configuration option.
- Before this change, pulling metrics for azure resources with variable dimensions required one configuration per metric + dimension combination to avoid an error.
- After this change, you can include all metrics and dimensions in a single configuration and the Azure APIs will only return dimensions which are valid for the various metrics.

### Features

- A new `discovery.ovhcloud` component for discovering scrape targets on OVHcloud. (@ptodev)
@@ -160,7 +226,7 @@ v0.39.0 (2024-01-09)
- Attach unique Agent ID header to remote-write requests. (@captncraig)

- Update to v2.48.1 of `github.com/prometheus/prometheus`.
Previously, a custom fork of v2.47.2 was used.
Previously, a custom fork of v2.47.2 was used.
The custom fork of v2.47.2 also contained prometheus#12729 and prometheus#12677.

v0.38.1 (2023-11-30)
4 changes: 2 additions & 2 deletions Makefile
@@ -17,7 +17,7 @@
##
## test Run tests
## lint Lint code
## integration-tests Run integration tests
## integration-test Run integration tests
##
## Targets for building binaries:
##
@@ -167,7 +167,7 @@ test-packages:
docker pull $(BUILD_IMAGE)
go test -tags=packaging ./packaging

.PHONY: integration-tests
.PHONY: integration-test
integration-test:
cd integration-tests && $(GO_ENV) go run .

2 changes: 1 addition & 1 deletion build-image/Dockerfile
@@ -23,7 +23,7 @@ FROM alpine:3.17 as helm
RUN apk add --no-cache helm

# Dependency: Go and Go dependencies
FROM golang:1.21.4-bullseye as golang
FROM golang:1.22.0-bullseye as golang

# Keep in sync with cmd/grafana-agent-operator/DEVELOPERS.md
ENV CONTROLLER_GEN_VERSION v0.9.2
2 changes: 1 addition & 1 deletion build-image/windows/Dockerfile
@@ -1,4 +1,4 @@
FROM library/golang:1.21.4-windowsservercore-1809
FROM library/golang:1.22.0-windowsservercore-1809

SHELL ["powershell", "-command"]

2 changes: 1 addition & 1 deletion cmd/grafana-agent-operator/DEVELOPERS.md
@@ -74,7 +74,7 @@ running.
### Apply the CRDs

Generated CRDs used by the operator can be found in [the Production
folder](../../production/operator/crds). Deploy them from the root of the
folder](../../operations/agent-static-operator/crds). Deploy them from the root of the
repository with:

```
38 changes: 37 additions & 1 deletion cmd/internal/flowmode/cmd_run.go
@@ -8,6 +8,8 @@ import (
"os"
"os/signal"
"path/filepath"
"runtime"
"strconv"
"strings"
"sync"
"syscall"
@@ -178,6 +180,9 @@ func (fr *flowRun) Run(configPath string) error {

level.Info(l).Log("boringcrypto enabled", boringcrypto.Enabled)

// Enable mutex and block profiling.
setMutexBlockProfiling(l)

// Immediately start the tracer.
go func() {
err := t.Run(ctx)
@@ -365,7 +370,7 @@ func getEnabledComponentsFunc(f *flow.Flow) func() map[string]interface{} {
components := component.GetAllComponents(f, component.InfoOptions{})
componentNames := map[string]struct{}{}
for _, c := range components {
componentNames[c.Registration.Name] = struct{}{}
componentNames[c.ComponentName] = struct{}{}
}
return map[string]interface{}{"enabled-components": maps.Keys(componentNames)}
}
@@ -455,3 +460,34 @@ func splitPeers(s, sep string) []string {
}
return strings.Split(s, sep)
}

func setMutexBlockProfiling(l log.Logger) {
mutexPercent := os.Getenv("PPROF_MUTEX_PROFILING_PERCENT")
if mutexPercent != "" {
rate, err := strconv.Atoi(mutexPercent)
if err == nil && rate > 0 {
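// SetMutexProfileFraction samples on average 1/n of events, so the percentage is converted with 100/rate.
// For example, 50 becomes 100/50 = 2, which means 1/2 or 50%. Integer division truncates, so values above 100 become 0, which turns mutex profiling off.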
runtime.SetMutexProfileFraction(100 / rate)
} else {
level.Error(l).Log("msg", "error setting PPROF_MUTEX_PROFILING_PERCENT", "err", err, "value", mutexPercent)
runtime.SetMutexProfileFraction(1000)
}
} else {
// Default to 1000 because that is what Istio defaults to, which seemed a reasonable starting point. This is 0.1% sampling.
runtime.SetMutexProfileFraction(1000)
}
blockRate := os.Getenv("PPROF_BLOCK_PROFILING_RATE")
if blockRate != "" {
rate, err := strconv.Atoi(blockRate)
if err == nil && rate > 0 {
runtime.SetBlockProfileRate(rate)
} else {
level.Error(l).Log("msg", "error setting PPROF_BLOCK_PROFILING_RATE", "err", err, "value", blockRate)
runtime.SetBlockProfileRate(10_000)
}
} else {
// This should have a negligible impact. This will track anything over 10_000ns, and will randomly sample shorter durations.
// Default taken from https://github.com/DataDog/go-profiler-notes/blob/main/block.md
runtime.SetBlockProfileRate(10_000)
}
}
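
The percent-to-fraction conversion in `setMutexBlockProfiling` can be explored in isolation. The sketch below is not part of the commit: `mutexFraction` is a hypothetical helper that reproduces only the arithmetic, so the effect of Go's integer division on different `PPROF_MUTEX_PROFILING_PERCENT` values is easy to see.

```go
package main

import "fmt"

// mutexFraction is a hypothetical helper (not in the commit) that mirrors the
// arithmetic used before calling runtime.SetMutexProfileFraction, which
// samples on average 1/n of mutex contention events.
func mutexFraction(percent int) int {
	if percent <= 0 {
		// Same fallback as the diff: sample 1 in 1000 events.
		return 1000
	}
	// Integer division: 50 -> 2 (50%), 30 -> 3 (about 33%), 200 -> 0.
	return 100 / percent
}

func main() {
	for _, p := range []int{1, 30, 50, 100, 200} {
		fmt.Printf("percent=%d fraction=%d\n", p, mutexFraction(p))
	}
}
```

Two details follow from the truncation: percentages that do not divide 100 evenly are rounded to the nearest achievable fraction, and a value above 100 yields 0, which `runtime.SetMutexProfileFraction` treats as "profiling off".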
3 changes: 3 additions & 0 deletions component/all/all.go
@@ -25,6 +25,7 @@ import (
_ "github.com/grafana/agent/component/discovery/nomad" // Import discovery.nomad
_ "github.com/grafana/agent/component/discovery/openstack" // Import discovery.openstack
_ "github.com/grafana/agent/component/discovery/ovhcloud" // Import discovery.ovhcloud
_ "github.com/grafana/agent/component/discovery/process" // Import discovery.process
_ "github.com/grafana/agent/component/discovery/puppetdb" // Import discovery.puppetdb
_ "github.com/grafana/agent/component/discovery/relabel" // Import discovery.relabel
_ "github.com/grafana/agent/component/discovery/scaleway" // Import discovery.scaleway
@@ -81,6 +82,7 @@ import (
_ "github.com/grafana/agent/component/otelcol/processor/k8sattributes" // Import otelcol.processor.k8sattributes
_ "github.com/grafana/agent/component/otelcol/processor/memorylimiter" // Import otelcol.processor.memory_limiter
_ "github.com/grafana/agent/component/otelcol/processor/probabilistic_sampler" // Import otelcol.processor.probabilistic_sampler
_ "github.com/grafana/agent/component/otelcol/processor/resourcedetection" // Import otelcol.processor.resourcedetection
_ "github.com/grafana/agent/component/otelcol/processor/span" // Import otelcol.processor.span
_ "github.com/grafana/agent/component/otelcol/processor/tail_sampling" // Import otelcol.processor.tail_sampling
_ "github.com/grafana/agent/component/otelcol/processor/transform" // Import otelcol.processor.transform
@@ -127,6 +129,7 @@ import (
_ "github.com/grafana/agent/component/prometheus/remotewrite" // Import prometheus.remote_write
_ "github.com/grafana/agent/component/prometheus/scrape" // Import prometheus.scrape
_ "github.com/grafana/agent/component/pyroscope/ebpf" // Import pyroscope.ebpf
_ "github.com/grafana/agent/component/pyroscope/java" // Import pyroscope.java
_ "github.com/grafana/agent/component/pyroscope/scrape" // Import pyroscope.scrape
_ "github.com/grafana/agent/component/pyroscope/write" // Import pyroscope.write
_ "github.com/grafana/agent/component/remote/http" // Import remote.http
6 changes: 3 additions & 3 deletions component/component_provider.go
@@ -93,8 +93,8 @@ type Info struct {
// this component depends on, or is depended on by, respectively.
References, ReferencedBy []string

Registration Registration // Component registration.
Health Health // Current component health.
ComponentName string // Name of the component.
Health Health // Current component health.

Arguments Arguments // Current arguments value of the component.
Exports Exports // Current exports value of the component.
@@ -157,7 +157,7 @@ func (info *Info) MarshalJSON() ([]byte, error) {
}

return json.Marshal(&componentDetailJSON{
Name: info.Registration.Name,
Name: info.ComponentName,
Type: "block",
ModuleID: info.ID.ModuleID,
LocalID: info.ID.LocalID,
10 changes: 8 additions & 2 deletions component/discovery/discovery.go
@@ -45,7 +45,13 @@ func (t *DistributedTargets) Get() []Target {
return t.targets
}

res := make([]Target, 0, (len(t.targets)+1)/len(t.cluster.Peers()))
peerCount := len(t.cluster.Peers())
resCap := (len(t.targets) + 1)
if peerCount != 0 {
resCap = (len(t.targets) + 1) / peerCount
}

res := make([]Target, 0, resCap)

for _, tgt := range t.targets {
peers, err := t.cluster.Lookup(shard.StringKey(tgt.NonMetaLabels().String()), 1, shard.OpReadWrite)
@@ -55,7 +61,7 @@ func (t *DistributedTargets) Get() []Target {
// back to owning the target ourselves.
res = append(res, tgt)
}
if peers[0].Self {
if len(peers) == 0 || peers[0].Self {
res = append(res, tgt)
}
}
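
The divide-by-zero guard in this hunk can be checked on its own. The following is a minimal sketch rather than the project's code: `shardCapacity` is a hypothetical helper that takes plain integers in place of `len(t.targets)` and `len(t.cluster.Peers())`.

```go
package main

import "fmt"

// shardCapacity is a hypothetical helper (not in the commit) that reproduces
// the capacity hint computed before allocating the result slice.
func shardCapacity(targetCount, peerCount int) int {
	// With no known peers, assume this node may own every target.
	capHint := targetCount + 1
	if peerCount != 0 {
		// Otherwise expect roughly an even share of the targets.
		capHint = (targetCount + 1) / peerCount
	}
	return capHint
}

func main() {
	fmt.Println(shardCapacity(9, 0)) // no peers: no division is attempted
	fmt.Println(shardCapacity(9, 2)) // two peers: about half of the targets
}
```

The value is only a capacity hint for `make`, so an imprecise estimate costs at most a reallocation, whereas the unguarded division would panic when the peer list is empty.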
37 changes: 37 additions & 0 deletions component/discovery/process/args.go
@@ -0,0 +1,37 @@
package process

import (
"time"

"github.com/grafana/agent/component/discovery"
)

type Arguments struct {
Join []discovery.Target `river:"join,attr,optional"`
RefreshInterval time.Duration `river:"refresh_interval,attr,optional"`
DiscoverConfig DiscoverConfig `river:"discover_config,block,optional"`
}

type DiscoverConfig struct {
Cwd bool `river:"cwd,attr,optional"`
Exe bool `river:"exe,attr,optional"`
Commandline bool `river:"commandline,attr,optional"`
Username bool `river:"username,attr,optional"`
UID bool `river:"uid,attr,optional"`
ContainerID bool `river:"container_id,attr,optional"`
}

var DefaultConfig = Arguments{
Join: nil,
RefreshInterval: 60 * time.Second,
DiscoverConfig: DiscoverConfig{
Cwd: true,
Exe: true,
Commandline: true,
ContainerID: true,
},
}

func (args *Arguments) SetToDefault() {
*args = DefaultConfig
}