class Logging::Stats::Sampler

A simple class for fast, basic statistics sampling. Feed it numeric samples with #sample, or call #tick to record the time elapsed since the previous call. When you are done, call sum, sumsq, num, min, max, mean, or sd to read the results, or call #to_s to see everything at once.

It is fast and uses a constant amount of memory: the samples are not stored, and only a handful of running totals are updated as each sample arrives.
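
To make the on-the-fly idea concrete, here is a minimal sketch, independent of the library, showing that three running totals are enough to recover the mean and standard deviation. The `RunningStats` class and its `add` method are hypothetical names used only for illustration.

```ruby
# Hypothetical illustration (not the library class): keeps only running totals.
class RunningStats
  attr_reader :sum, :sumsq, :num

  def initialize
    @sum = 0.0
    @sumsq = 0.0
    @num = 0
  end

  # Fold one value into the totals; the value itself is not retained.
  def add(x)
    @sum += x
    @sumsq += x * x
    @num += 1
    self
  end

  def mean
    num < 1 ? 0.0 : sum / num
  end

  # Sample standard deviation computed from the totals alone.
  def sd
    return 0.0 if num < 2
    Math.sqrt((sumsq - (sum * sum / num)) / (num - 1))
  end
end

stats = RunningStats.new
[2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0].each { |x| stats.add(x) }

puts stats.mean
puts stats.sd
```

Memory use stays the same no matter how many values are added, which is the trade-off the class makes: individual samples cannot be recovered later.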

Attributes

last[R]
max[R]
min[R]
name[R]
num[R]
sum[R]
sumsq[R]

Public Class Methods

keys()

Returns the column headers a CSV file would use for the values this stats object tracks, in the same order as to_a.

# File lib/logging/stats.rb, line 88
def self.keys
  %w[name sum sumsq num mean sd min max]
end

new( name )

Create a new sampler.

# File lib/logging/stats.rb, line 22
def initialize( name )
  @name = name
  reset
end

Public Instance Methods

coalesce( other )

Coalesce the statistics from the other sampler into this one. The other sampler is not modified by this method.

Coalescing the same two samplers multiple times should only be done if one of the samplers is reset between calls to this method. Otherwise statistics will be counted multiple times.

# File lib/logging/stats.rb, line 47
def coalesce( other )
  @sum += other.sum
  @sumsq += other.sumsq
  if other.num > 0
    @min = other.min if @min > other.min
    @max = other.max if @max < other.max
    @last = other.last
  end
  @num += other.num
end
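
A short sketch of the coalesce-then-reset pattern described above. So that it runs without the gem installed, it uses a stand-in `Sampler` class that copies the `reset`, `sample`, and `coalesce` source shown on this page; the sampler names and values are made up. The `total` sampler is given a sample of its own first because, per the source shown, a fresh sampler starts with `min` and `max` at 0.0.

```ruby
# Stand-in mirroring the reset, sample, and coalesce source on this page.
class Sampler
  attr_reader :name, :sum, :sumsq, :num, :min, :max, :last

  def initialize(name)
    @name = name
    reset
  end

  def reset
    @sum = 0.0
    @sumsq = 0.0
    @num = 0
    @min = 0.0
    @max = 0.0
    @last = nil
    self
  end

  def sample(s)
    @sum += s
    @sumsq += s * s
    if @num == 0
      @min = @max = s
    else
      @min = s if @min > s
      @max = s if @max < s
    end
    @num += 1
    @last = s
  end

  def coalesce(other)
    @sum += other.sum
    @sumsq += other.sumsq
    if other.num > 0
      @min = other.min if @min > other.min
      @max = other.max if @max < other.max
      @last = other.last
    end
    @num += other.num
  end
end

total  = Sampler.new("total")
worker = Sampler.new("worker")

total.sample(1.0)
[3.0, 5.0].each { |s| worker.sample(s) }

total.coalesce(worker)   # total now covers 1.0, 3.0, 5.0
worker.reset             # clear the worker so its samples are not counted twice

worker.sample(7.0)
total.coalesce(worker)   # adds only the new sample

puts total.num
puts total.sum
puts total.max
```

Without the `worker.reset` call, the second `coalesce` would add 3.0 and 5.0 to the totals again.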

mark()

Calling tick repeatedly gives you the time between successive sample periods, but often you want to measure how long something takes between a start point and an end point. Call mark at the beginning and tick at the end to get this kind of measurement. Don't mix mark/tick sampling with tick-only sampling, or the measurements will be meaningless.

# File lib/logging/stats.rb, line 124
def mark
  @last_time = Time.now.to_f
end
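
A minimal sketch of start/end timing with mark and tick. It uses a trimmed stand-in `Sampler` (only the pieces of the source on this page that timing needs) so that it runs without the gem; the `sleep` calls are placeholders for real work.

```ruby
# Trimmed stand-in based on the reset, sample, mark, and tick source on this page.
class Sampler
  attr_reader :num, :sum, :last

  def initialize(name)
    @name = name
    reset
  end

  def reset
    @sum = 0.0
    @num = 0
    @last = nil
    @last_time = Time.now.to_f
    self
  end

  def sample(s)
    @sum += s
    @num += 1
    @last = s
  end

  def mark
    @last_time = Time.now.to_f
  end

  def tick
    now = Time.now.to_f
    sample(now - @last_time)
    @last_time = now
  end
end

timer = Sampler.new("work")

3.times do
  sleep 0.02    # setup time that should not be measured
  timer.mark    # start of the measured region
  sleep 0.005   # placeholder for the work being timed
  timer.tick    # end of the measured region: records time since mark
end

puts timer.num
puts timer.sum / timer.num
```

Because mark resets the reference time, each recorded sample covers only the span between mark and tick, not the setup time before it.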

mean()

Calculates and returns the mean for the data passed so far.

# File lib/logging/stats.rb, line 99
def mean
  return 0.0 if num < 1
  sum / num
end

reset()

Resets the internal counters so you can start sampling again.

# File lib/logging/stats.rb, line 29
def reset
  @sum = 0.0
  @sumsq = 0.0
  @num = 0
  @min = 0.0
  @max = 0.0
  @last = nil
  @last_time = Time.now.to_f
  self
end

sample( s )

Adds a sample to the calculations.

# File lib/logging/stats.rb, line 60
def sample( s )
  @sum += s
  @sumsq += s * s
  if @num == 0
    @min = @max = s
  else
    @min = s if @min > s
    @max = s if @max < s
  end
  @num += 1
  @last = s
end

sd()

Calculates the standard deviation of the data so far.

# File lib/logging/stats.rb, line 106
def sd
  return 0.0 if num < 2

  # (sqrt( ((s).sumsq - ( (s).sum * (s).sum / (s).num)) / ((s).num-1) ))
  begin
    return Math.sqrt( (sumsq - ( sum * sum / num)) / (num-1) )
  rescue Errno::EDOM
    return 0.0
  end
end
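
Written out as a formula, the expression in the source above appears to be the usual running-totals form of the sample standard deviation, where sum is $\sum x_i$, sumsq is $\sum x_i^2$, and num is $n$:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 \;-\; \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n - 1}}
```

The $n - 1$ divisor is why the method returns 0.0 when fewer than two samples have been recorded.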

tick()

Adds the time elapsed since the last call to tick (or to mark or reset) as a sample. Over many calls, the mean gives you the average time between two activities.

An example is:

t = Sampler.new("do_stuff")
10000.times { do_stuff(); t.tick }
puts t.to_s

# File lib/logging/stats.rb, line 137
def tick
  now = Time.now.to_f
  sample(now - @last_time)
  @last_time = now
end

to_a()

Returns an array of the values: [name, sum, sumsq, num, mean, sd, min, max]

# File lib/logging/stats.rb, line 81
def to_a
  [name, sum, sumsq, num, mean, sd, min, max]
end
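
Since keys and to_a use the same ordering, they can be paired to produce a header line and a data row. The sketch below uses a stand-in `Sampler` that copies the relevant source from this page so it runs without the gem; the sampler name and values are made up, and the plain `join` does no CSV quoting.

```ruby
# Stand-in mirroring the keys, sample, mean, sd, and to_a source on this page.
class Sampler
  attr_reader :name, :sum, :sumsq, :num, :min, :max

  def self.keys
    %w[name sum sumsq num mean sd min max]
  end

  def initialize(name)
    @name = name
    @sum = 0.0
    @sumsq = 0.0
    @num = 0
    @min = 0.0
    @max = 0.0
  end

  def sample(s)
    @sum += s
    @sumsq += s * s
    if @num == 0
      @min = @max = s
    else
      @min = s if @min > s
      @max = s if @max < s
    end
    @num += 1
  end

  def mean
    return 0.0 if num < 1
    sum / num
  end

  def sd
    return 0.0 if num < 2
    Math.sqrt((sumsq - (sum * sum / num)) / (num - 1))
  end

  def to_a
    [name, sum, sumsq, num, mean, sd, min, max]
  end
end

reads = Sampler.new("reads")
[1.0, 2.0, 3.0].each { |s| reads.sample(s) }

header = Sampler.keys.join(",")   # column names
row    = reads.to_a.join(",")     # values in the same order

puts header
puts row
```

For names that might contain commas or quotes, a real CSV writer would be a safer choice than `join`.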

to_hash()
# File lib/logging/stats.rb, line 92
def to_hash
  {:name => name, :sum => sum, :sumsq => sumsq, :num => num,
   :mean => mean, :sd => sd, :min => min, :max => max}
end

to_s()

Returns statistics in a common format.

# File lib/logging/stats.rb, line 75
def to_s
  "[%s]: SUM=%0.6f, SUMSQ=%0.6f, NUM=%d, MEAN=%0.6f, SD=%0.6f, MIN=%0.6f, MAX=%0.6f" % to_a
end