Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small perf improvement #1533

Closed
wants to merge 1 commit into from
Closed

Conversation

ptolts
Copy link

@ptolts ptolts commented Oct 27, 2023

I saw this method in a few profiles and used ChatGPT to make it faster. Seems worth the effort.

require 'benchmark/ips'

MAX_MEMBER_COUNT = 1000

class OldVersion
  def initialize(hash)
    excess = hash.size - MAX_MEMBER_COUNT
    hash = Hash[hash.drop(excess)] if excess.positive?
    @hash = hash.freeze
  end
end

class NewVersion
  def initialize(hash)
    excess = hash.size - MAX_MEMBER_COUNT
    @hash = excess.positive? ? Hash[hash.first(MAX_MEMBER_COUNT)] : hash
    @hash.freeze
  end
end

small_hash = Hash[(1..500).map { |i| [i, i] }] 
medium_hash = Hash[(1..1500).map { |i| [i, i] }]
large_hash = Hash[(1..5000).map { |i| [i, i] }]


Benchmark.ips do |x|
  x.config(:time => 5, :warmup => 2)
  
  x.report("Old version with small input") do 
    OldVersion.new(small_hash) 
  end

  x.report("New version with small input") do 
    NewVersion.new(small_hash) 
  end

  x.compare!
end

Benchmark.ips do |x|
  x.config(:time => 5, :warmup => 2)

  x.report("Old version with medium input") do 
    OldVersion.new(medium_hash) 
  end

  x.report("New version with medium input") do 
    NewVersion.new(medium_hash) 
  end

  x.compare!
end

Benchmark.ips do |x|
  x.config(:time => 5, :warmup => 2)
  
  x.report("Old version with large input") do 
    OldVersion.new(large_hash) 
  end

  x.report("New version with large input") do 
    NewVersion.new(large_hash) 
  end

  x.compare!
end
Warming up --------------------------------------
Old version with small input
                       567.474k i/100ms
New version with small input
                       613.909k i/100ms
Calculating -------------------------------------
Old version with small input
                          5.561M (± 3.2%) i/s -     27.806M in   5.005872s
New version with small input
                          6.145M (± 0.7%) i/s -     31.309M in   5.095542s

Comparison:
New version with small input:  6144778.6 i/s
Old version with small input:  5561220.3 i/s - 1.10x  slower

Warming up --------------------------------------
Old version with medium input
                         1.209k i/100ms
New version with medium input
                         1.486k i/100ms
Calculating -------------------------------------
Old version with medium input
                         11.875k (± 2.7%) i/s -     60.450k in   5.094750s
New version with medium input
                         14.949k (± 0.9%) i/s -     75.786k in   5.069964s

Comparison:
New version with medium input:    14949.2 i/s
Old version with medium input:    11874.8 i/s - 1.26x  slower

Warming up --------------------------------------
Old version with large input
                       523.000  i/100ms
New version with large input
                         1.492k i/100ms
Calculating -------------------------------------
Old version with large input
                          5.184k (± 3.2%) i/s -     26.150k in   5.050380s
New version with large input
                         14.630k (± 3.1%) i/s -     74.600k in   5.105030s

Comparison:
New version with large input:    14630.0 i/s
Old version with large input:     5184.2 i/s - 2.82x  slower

@linux-foundation-easycla
Copy link

CLA Not Signed

Copy link
Contributor

@fbogsany fbogsany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change in behaviour. Hash#drop(n) removes the first n elements, leaving the remainder. Hash#first(m) retains the first m elements, and drops the remainder. I.e. if MAX_MEMBER_COUNT is 2 and hash is {"a" => 1, "b" => 2, "c" => 3} then the original code will set @hash = {"b" => 2, "c" => 3} whereas your new code will set @hash = {"a" => 1, "b" => 2}.

irb(main):001> hash = {"a" => 1, "b" => 2, "c" => 3}
=> {"a"=>1, "b"=>2, "c"=>3}
irb(main):002> Hash[hash.drop(1)]
=> {"b"=>2, "c"=>3}
irb(main):003> Hash[hash.first(2)]
=> {"a"=>1, "b"=>2}

@ptolts
Copy link
Author

ptolts commented Oct 27, 2023 via email

@ptolts ptolts closed this Oct 27, 2023
@ptolts ptolts deleted the pt/perf-improvement branch October 27, 2023 20:54
@fbogsany
Copy link
Contributor

Thanks for highlighting this, though. I misread what this was fixing, and went off to check the spec for Span attribute limits. Turns out our behaviour is wrong, probably because we implemented it before the limits were added to the spec in open-telemetry/opentelemetry-specification#1130. We retain elements added later, and drop earlier elements in

def trim_span_attributes(attrs)
return if attrs.nil?
excess = attrs.size - @span_limits.attribute_count_limit
excess.times { attrs.shift } if excess.positive?
truncate_attribute_values(attrs, @span_limits.attribute_length_limit)
nil
end
whereas the spec requires dropping any unique attributes added after the limit is reached.

Fixing that will also change behaviour, but is required for spec compliance. I'll have to re-read the spec for tracestate propagation to see what's required there. I would prefer to avoid changing behaviour if the spec is ... not specific.

@fbogsany
Copy link
Contributor

Followup: the W3C trace-context spec states:

In a situation where tracestate needs to be truncated due to size limitations, the vendor MUST truncate whole entries. Entries larger than 128 characters long SHOULD be removed first. Then entries SHOULD be removed starting from the end of tracestate. Note that other truncation strategies like safe list entries, blocked list entries, or size-based truncation MAY be used, but are highly discouraged. Those strategies decrease the interoperability of various tracing vendors.

So the code in this PR is more correct, but not yet most correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants