HyperLogLog

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def alpha(bits: Int): Double
def approximateSize(bits: Int, size: Int, zeroCnt: Int, z: Double): Approximate[Long]
def asApprox(bits: Int, v: Double): Approximate[Long]
final def asInstanceOf[T0]: T0

Definition Classes
Any
def bitsForError(err: Double): Int

This gives you a number of bits to use to have a given standard error
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def error(bits: Int): Double

The true error is distributed like a Gaussian with this standard deviation.
The true error is distributed like a Gaussian with this standard deviation. let m = 2^bits. The size of the HLL is m bytes.
bits | size | error 9 512 0.0460 10 1024 0.0325 11 2048 0.0230 12 4096 0.0163 13 8192 0.0115 14 16384 0.0081 15 32768 0.0057 16 65536 0.0041 17 131072 0.0029 18 262144 0.0020 19 524288 0.0014 20 1048576 0.0010
Keep in mind, to store N distinct longs, you only need 8N bytes. See SetSizeAggregator for an approach that uses an exact set when the cardinality is small, and switches to HLL after we have enough items. Ideally, you would keep an exact set until it is smaller to store the HLL (but actually since we use sparse vectors to store the HLL, a small HLL takes a lot less than the size above).
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def fromByteBuffer(bb: ByteBuffer): HLL
def fromBytes(bytes: Array[Byte]): HLL
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hash(input: Array[Byte]): Array[Byte]
def hashCode(): Int

Definition Classes
AnyRef → Any
val hashSize: Int
def initialEstimate(bits: Int, size: Int, zeroCnt: Int, z: Double): Double
implicit def int2Bytes(i: Int): Array[Byte]
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def j(bytes: Array[Byte], bits: Int): Int
def jRhoW(in: Array[Byte], bits: Int): (Int, Byte)

We are computing j and \rho(w) from the paper, sorry for the name, but it allows someone to compare to the paper extremely low probability rhow (position of the leftmost one bit) is > 127, so we use a Byte to store it Given a hash <w_0, w_1, w_2 ...
We are computing j and \rho(w) from the paper, sorry for the name, but it allows someone to compare to the paper extremely low probability rhow (position of the leftmost one bit) is > 127, so we use a Byte to store it Given a hash <w_0, w_1, w_2 ... w_n> the value 'j' is equal to <w_0, w_1 ... w_(bits-1)> and the value 'w' is equal to <w_bits ... w_n>. The function rho counts the number of leading zeroes in 'w'. We can calculate rho(w) at once with the method rhoW.
implicit def long2Bytes(i: Long): Array[Byte]
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def rhoW(bytes: Array[Byte], bits: Int): Byte

The value 'w' is represented as a bitset (encoding in bytes).
The value 'w' is represented as a bitset (encoding in bytes). This function counts the number of leading zeros in 'w'.
Each byte is treated as a set of bits (little-endian). That is, the one bit represents the first value, then the two bit, then four, and so on.
We treat the leading bits bits as if they were instead a single zero bit.
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toBytes(h: HLL): Array[Byte]
def toString(): String

Definition Classes
AnyRef → Any
def twopow(i: Int): Double

Annotations
@inline()
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Deprecated Value Members

def j(bsl: BitSetLite, bits: Int): Int

Annotations
@deprecated
Deprecated
(Since version 0.12.3) This is no longer used. Use j(Array[Byte], Int) instead.
def rhoW(bsl: BitSetLite, bits: Int): Byte

Annotations
@deprecated
Deprecated
(Since version 0.12.3) This is no longer used. Use rhoW(Array[Byte], Int) instead.

Related Doc: package algebird

object HyperLogLog

Value Members

final def !=(arg0: Any): Boolean

final def ##(): Int

final def ==(arg0: Any): Boolean

def alpha(bits: Int): Double

def approximateSize(bits: Int, size: Int, zeroCnt: Int, z: Double): Approximate[Long]

def asApprox(bits: Int, v: Double): Approximate[Long]

final def asInstanceOf[T0]: T0

def bitsForError(err: Double): Int

def clone(): AnyRef

final def eq(arg0: AnyRef): Boolean

def equals(arg0: Any): Boolean

def error(bits: Int): Double

def finalize(): Unit

def fromByteBuffer(bb: ByteBuffer): HLL

def fromBytes(bytes: Array[Byte]): HLL

final def getClass(): Class[_]

def hash(input: Array[Byte]): Array[Byte]

def hashCode(): Int

val hashSize: Int

def initialEstimate(bits: Int, size: Int, zeroCnt: Int, z: Double): Double

implicit def int2Bytes(i: Int): Array[Byte]

final def isInstanceOf[T0]: Boolean

def j(bytes: Array[Byte], bits: Int): Int

def jRhoW(in: Array[Byte], bits: Int): (Int, Byte)

implicit def long2Bytes(i: Long): Array[Byte]

final def ne(arg0: AnyRef): Boolean

final def notify(): Unit

final def notifyAll(): Unit

def rhoW(bytes: Array[Byte], bits: Int): Byte

final def synchronized[T0](arg0: ⇒ T0): T0

def toBytes(h: HLL): Array[Byte]

def toString(): String

def twopow(i: Int): Double

final def wait(): Unit

final def wait(arg0: Long, arg1: Int): Unit

final def wait(arg0: Long): Unit

Deprecated Value Members

def j(bsl: BitSetLite, bits: Int): Int

def rhoW(bsl: BitSetLite, bits: Int): Byte

Inherited from AnyRef

Inherited from Any

Ungrouped