1. 04 4月, 2014 6 次提交
    • A
      Transparent LZF compression initial implementation. · f7501fb1
      antirez 提交于
      This commit shapes the main ideas for the implementation but doesn't
      fix all the command implementations, nor handles loading of LZF
      compressed objects in a way able to perserve the compression.
      f7501fb1
    • A
      tryObjectEncoding() refactoring. · 2a7da8a7
      antirez 提交于
      We also avoid to re-create an object that is already in EMBSTR encoding.
      2a7da8a7
    • A
      Changed HyperLogLog hash seed to a non-zero value. · 433ce7f8
      antirez 提交于
      Using a seed of zero has the side effect of having the empty string
      hashing to what is a very special case in the context of HyperLogLog: a
      very long run of zeroes.
      
      This did not influenced the correctness of the result with 16k registers
      because of the harmonic mean, but still it is inconvenient that a so
      obvious value maps to a so special hash.
      
      The seed 0xadc83b19 is used instead, which is the first 64 bits of the
      SHA1 of the empty string.
      
      Reference: issue #1657.
      433ce7f8
    • A
      Initial HyperLogLog tests. · 352208ff
      antirez 提交于
      352208ff
    • A
      Return "WRONGTYPE" error on PF* type mismatch. · d2ca4bb6
      antirez 提交于
      d2ca4bb6
    • A
      Fix PFADD infinite loop. · 349c9781
      antirez 提交于
      We need to guarantee that the last bit is 1, otherwise an element may
      hash to just zeroes with probability 1/(2^64) and trigger an infinite
      loop.
      
      See issue #1657.
      349c9781
  2. 03 4月, 2014 5 次提交
  3. 02 4月, 2014 5 次提交
  4. 01 4月, 2014 4 次提交
  5. 31 3月, 2014 12 次提交
    • A
      HLLMERGE fixed by adding a... missing loop! · f1b76081
      antirez 提交于
      f1b76081
    • A
      hll-gnuplot-graph.rb: added new filter "all". · 0c9f06a2
      antirez 提交于
      0c9f06a2
    • A
      HyperLogLog apply bias correction using a polynomial. · ec1ee662
      antirez 提交于
      Better results can be achieved by compensating for the bias of the raw
      approximation just after 2.5m (when LINEARCOUNTING is no longer used) by
      using a polynomial that approximates the bias at a given cardinality.
      
      The curve used was found using this web page:
      
          http://www.xuru.org/rt/PR.asp
      
      That performs polynomial regression given a set of values.
      ec1ee662
    • A
      HLLMERGE implemented. · f2277475
      antirez 提交于
      Merge N HLL data structures by selecting the max value for every
      M[i] register among the set of HLLs.
      f2277475
    • A
      HLLCOUNT is technically a write command · 4ab45183
      antirez 提交于
      When we update the cached value, we need to propagate the command and
      signal the key as modified for WATCH.
      4ab45183
    • A
      HLLADD: propagate write when only variable name is given. · 8aeb0c19
      antirez 提交于
      The following form is given:
      
          HLLADD myhll
      
      No element is provided in the above case so if 'myhll' var does not
      exist the result is to just create an empty HLL structure, and no update
      will be performed on the registers.
      
      In this case, the DB should still be set dirty and the command
      propagated.
      8aeb0c19
    • A
      hll-gnuplot-graph.rb: Use |error| when filter is :max · 69a93194
      antirez 提交于
      69a93194
    • A
      Ignore txt files inside utils/hyperloglog. · a8fb1a32
      antirez 提交于
      Those are generated to trace graphs using gnuplot.
      a8fb1a32
    • A
      HyperLogLog: use LINEARCOUNTING up to 3m. · 60e60f4e
      antirez 提交于
      The HyperLogLog original paper suggests using LINEARCOUNTING for
      cardinalities < 2.5m, however for P=14 the median / max error
      curves show that a value of '3' is the best pick for m = 16384.
      60e60f4e
    • A
      7f9d289e
    • A
      HyperLogLog approximated cardinality caching. · 307a1899
      antirez 提交于
      The more we add elements to an HyperLogLog counter, the smaller is
      the probability that we actually update some register.
      
      From this observation it is easy to see how it is possible to use
      caching of a previously computed cardinality and reuse it to serve
      HLLCOUNT queries as long as no register was updated in the data
      structure.
      
      This commit does exactly this by using just additional 8 bytes for the
      data structure to store a 64 bit unsigned integer value cached
      cardinality. When the most significant bit of the 64 bit integer is set,
      it means that the value computed is no longer usable since at least a
      single register was modified and we need to recompute it at the next
      call of HLLCOUNT.
      
      The value is always stored in little endian format regardless of the
      actual CPU endianess.
      307a1899
    • A
      String value unsharing refactored into proper function. · 543ede03
      antirez 提交于
      All the Redis functions that need to modify the string value of a key in
      a destructive way (APPEND, SETBIT, SETRANGE, ...) require to make the
      object unshared (if refcount > 1) and encoded in raw format (if encoding
      is not already REDIS_ENCODING_RAW).
      
      This was cut & pasted many times in multiple places of the code. This
      commit puts the small logic needed into a function called
      dbUnshareStringValue().
      543ede03
  6. 30 3月, 2014 1 次提交
    • A
      Use endian neutral hash function for HyperLogLog. · aaf6db45
      antirez 提交于
      We need to be sure that you can save a dataset in a Redis instance,
      reload it in a different architecture, and continue to count in the same
      HyperLogLog structure.
      
      So 32 and 64 bit, little or bit endian, must all guarantee to output the
      same hash for the same element.
      aaf6db45
  7. 29 3月, 2014 7 次提交