Add Zstd compression support to S3 plugin #439

Open: wants to merge 3 commits into base: master


@ddukbg ddukbg commented Oct 18, 2024

Summary

This PR adds support for Zstd compression in the Fluentd S3 plugin.

Changes

  • Implemented Zstd compression using the zstd-ruby library.
  • Introduced the ZstdCompressor class to handle log compression before uploading to S3.
  • Updated the example configuration to demonstrate the use of store_as zstd.
  • Ensured that the Zstd module is properly loaded to avoid uninitialized constant errors.
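The shape of the compressor described in the list above can be sketched roughly as follows. This is not the PR's actual implementation: `SketchCompressor`, `FakeChunk`, and `CODEC` are hypothetical names, and only the `compress(chunk, tmp)` call shape and the warning text are taken from the test code in this PR. The `require` is guarded so that a missing gem is handled at load time instead of surfacing later as an uninitialized constant error. The Zlib fallback exists purely so the sketch runs without `zstd-ruby` installed.

```ruby
require 'stringio'
require 'tempfile'

# Guarded load: prefer zstd-ruby, but keep the sketch runnable without it.
# The Zlib fallback is a demonstration stand-in only, not part of the PR.
CODEC =
  begin
    require 'zstd-ruby'
    ->(data) { Zstd.compress(data) }
  rescue LoadError
    require 'zlib'
    ->(data) { Zlib::Deflate.deflate(data) }
  end

# Hypothetical stand-in for the PR's ZstdCompressor (names are assumptions).
class SketchCompressor
  def initialize(codec:, log:)
    @codec = codec # callable: String -> compressed String
    @log = log     # any object responding to #warn
  end

  # Reads the whole chunk, compresses it, and writes the result to tmp.
  # On failure, logs a warning and re-raises the original error.
  def compress(chunk, tmp)
    data = nil
    chunk.open { |io| data = io.read }
    tmp.binmode
    tmp.write(@codec.call(data))
    tmp.flush
  rescue StandardError => e
    @log.warn("zstd compression failed: #{e.message}")
    raise
  end
end

# Minimal fake of a buffer chunk: #open yields an IO, like the test double.
FakeChunk = Struct.new(:data) do
  def open
    yield StringIO.new(data)
  end
end
```

Calling `SketchCompressor.new(codec: CODEC, log: some_logger).compress(FakeChunk.new("..."), Tempfile.new)` writes the compressed bytes to the temp file. Injecting the codec keeps the error-handling path testable without the native extension.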

Testing

  • Successfully tested the Zstd compression functionality by sending large log data to S3.
  • All test cases passed without errors.

Test Code

require 'fluent/test'
require 'fluent/plugin/output'
require_relative '../lib/fluent/plugin/out_s3'  # explicitly load out_s3.rb
require 'fluent/plugin/s3_compressor_zstd'
require 'zstd-ruby'
require 'stringio'  # needed for StringIO in the chunk double
require 'tempfile'

RSpec.describe Fluent::Plugin::S3Output::ZstdCompressor do
  let(:log) { double('log', warn: nil) }
  let(:compressor) { described_class.new(buffer_type: 'memory', log: log) }

  describe '#compress' do
    let(:test_data) { "This is a test log message" }

    it 'compresses the data using zstd' do
      chunk = double('chunk')
      allow(chunk).to receive(:open).and_yield(StringIO.new(test_data))

      tmp_file = Tempfile.new
      compressor.compress(chunk, tmp_file)
      tmp_file.rewind

      compressed_data = tmp_file.read
      expect(compressed_data).not_to eq(test_data)
      expect(Zstd.decompress(compressed_data)).to eq(test_data)
    end

    it 'logs a warning if compression fails' do
      chunk = double('chunk')
      allow(chunk).to receive(:open).and_raise(StandardError.new("Mock compression error"))

      tmp_file = Tempfile.new
      expect { compressor.compress(chunk, tmp_file) }.to raise_error(StandardError)

      expect(log).to have_received(:warn).with(/zstd compression failed: Mock compression error/)
    end
  end
end

Result

rspec test/zstd_compressor_spec.rb 

..

Finished in 0.03044 seconds (files took 0.59546 seconds to load)
2 examples, 0 failures

store_as (Zstd) Test

#fluent.conf
# -*- encoding: utf-8 -*-
<source>
  @type forward
  @id   input
  @label @mainstream
</source>

<label @mainstream>
  <match **>
    @type s3
    s3_bucket fluent-test-yw
    s3_region ap-northeast-2
    path logs/
    store_as zstd
    <format>
      @type json
      time_key timestamp   # specify the JSON time key
      encoding utf-8       # specify the encoding
    </format>
    <buffer>
      @type memory              # use the memory buffer
      chunk_limit_size 1m        # set the chunk size to 1 MB (handle larger logs in one go)
      flush_interval 1s          # flush every second
      flush_thread_count 4       # number of concurrent flush threads (can be raised depending on system performance)
      retry_max_interval 10s     # adjust the maximum retry interval
      retry_timeout 60m          # retry time limit
    </buffer>
  </match>
</label>

Test Data
echo '{"message": "'$(head -c 1000000 </dev/zero | tr '\0' 'A')'"}' | fluent-cat test.tag

fluentd log
2024-10-18 17:49:37 +0900 [info]: #0 fluent/log.rb:362:info: [Aws::S3::Client 200 0.162773 0 retries] head_object(bucket:"fluent-test-yw",key:"logs/20241018_0.zst")

S3 Data
(image not preserved)
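One quick way to sanity-check an uploaded object after downloading it is to look at its first four bytes. The sketch below uses only the Ruby standard library. The helper name `zstd_frame?` is hypothetical and any local file path is an assumption. It only checks for the Zstandard frame magic number (0xFD2FB528, stored little-endian) and does not validate the rest of the frame.

```ruby
# Zstandard frames begin with the magic number 0xFD2FB528, stored little-endian,
# so the first four bytes on disk are 28 B5 2F FD.
ZSTD_MAGIC = [0x28, 0xB5, 0x2F, 0xFD].pack('C*').freeze

# Returns true when the file at path starts with a Zstandard frame header.
# Hypothetical helper for illustration; not part of the plugin.
def zstd_frame?(path)
  File.binread(path, 4) == ZSTD_MAGIC
end
```

For example, `zstd_frame?('20241018_0.zst')` could be run against a local copy of the object named in the log line above. For a full check, decompress the file with `Zstd.decompress` from `zstd-ruby`, as the spec in this PR does.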

Why this feature?

Zstd generally offers a better compression ratio and faster compression and decompression than gzip, making it a valuable option for users who want efficient log storage on S3.

@daipom daipom self-requested a review October 18, 2024 08:59
@daipom
Contributor

daipom commented Oct 18, 2024

Thanks for this enhancement.
Could you please add DCO to all commits?

Review comment on fluent-plugin-s3.gemspec (outdated, resolved)
@ddukbg ddukbg force-pushed the master branch 2 times, most recently from 0d0bf95 to 6af3b5d Compare October 18, 2024 09:30
dependabot bot and others added 2 commits October 18, 2024 18:30
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: yongwoo.kim <[email protected]>