
Float16

final class Float16(val raw: Short) extends AnyVal

Float16 represents 16-bit floating-point values.

This type does not actually support arithmetic directly. The expected use case is to convert to Float to perform any actual arithmetic, then convert back to a Float16 if needed.

Binary representation: 1 sign bit, 5 exponent bits, and 10 mantissa bits, from most significant to least significant.

Value interpretation (in order of precedence, with _ wild):

0 00000 0000000000    (positive) zero
1 00000 0000000000    negative zero
_ 00000 __________    subnormal number
_ 11111 0000000000    +/- infinity
_ 11111 __________    not-a-number
_ _____ __________    normal number

An exponent of all 1s signals a sentinel (NaN or infinity), and an exponent of all 0s signals a zero or a subnormal number. The remaining stored exponents, 1 through 30, are offset by a bias of 15, so the working "real" range of exponents we can express is [-14, +15].
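The precedence rules above can be sketched as a pattern match on the exponent and mantissa fields. This is a minimal illustration that works on a raw Short; the object name Float16Classify and the method classify are hypothetical and are not part of the library.

```scala
// Hypothetical helper (not library API): classifies a raw 16-bit pattern
// by applying the interpretation rules in order of precedence.
object Float16Classify {
  def classify(raw: Short): String = {
    val negative = (raw & 0x8000) != 0     // sign bit
    val exponent = (raw >>> 10) & 0x1f     // 5 stored exponent bits
    val mantissa = raw & 0x3ff             // 10 mantissa bits
    (exponent, mantissa) match {
      case (0, 0)    => if (negative) "negative zero" else "positive zero"
      case (0, _)    => "subnormal number"
      case (0x1f, 0) => if (negative) "negative infinity" else "positive infinity"
      case (0x1f, _) => "not-a-number"
      case _         => "normal number"
    }
  }
}
```

The order of the cases matters: the zero patterns must be tested before the general subnormal case, and infinity before NaN, because the later patterns use wildcards.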

For non-zero exponents, the mantissa has an implied leading 1 bit, so 10 bits of data provide 11 bits of precision for normal numbers.

For normal numbers, where exponent is the unbiased exponent in [-14, +15] (the stored 5-bit field minus 15):

x = (1 - sign*2) * 2^exponent * (1 + mantissa/1024)

For subnormal numbers, the implied leading 1 bit is absent. Thus, subnormal numbers have the same exponent as the smallest normal numbers, but without an implied 1 bit.

So for subnormal numbers:

x = (1 - sign*2) * 2^(-14) * (mantissa/1024)
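As a worked illustration, the two formulas can be evaluated directly from the raw bits. The sketch below assumes the stored exponent field is biased by 15; Float16Decode and decodeFinite are hypothetical names used only for this example and do not describe how the library itself converts values.

```scala
// Hypothetical helper (not library API): evaluates the normal and
// subnormal formulas for a finite raw 16-bit pattern.
object Float16Decode {
  def decodeFinite(raw: Short): Float = {
    val sign     = (raw >>> 15) & 0x1      // 0 or 1
    val exponent = (raw >>> 10) & 0x1f     // stored (biased) exponent
    val mantissa = raw & 0x3ff             // 10 mantissa bits
    require(exponent != 0x1f, "infinity and NaN are not covered by these formulas")
    val s = 1 - sign * 2                   // +1 or -1
    val x =
      if (exponent == 0) s * math.pow(2, -14) * (mantissa / 1024.0)        // subnormal (or zero)
      else s * math.pow(2, exponent - 15) * (1 + mantissa / 1024.0)        // normal
    x.toFloat
  }
}
```

For example, the pattern 0 01111 0000000000 (0x3C00) has a stored exponent of 15, so it decodes to 2^0 * (1 + 0/1024) = 1.0, and the smallest subnormal pattern 0x0001 decodes to 2^(-14) * (1/1024) = 2^(-24).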

Companion: object

Supertypes: class AnyVal, trait Matchable, class Any

Value members

Concrete methods

Whether this Float16 value is finite or not.

For the purposes of this method, infinities and NaNs are considered non-finite. For those values it returns false and for all other values it returns true.

Returns whether this is a zero value (positive or negative).
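Both predicates above reduce to simple mask tests on the raw bits. The following is a sketch under that assumption; Float16Predicates and its methods take a raw Short and are hypothetical stand-ins, not the library's own members.

```scala
// Hypothetical helper (not library API): bit-mask versions of the
// finiteness and zero checks described above.
object Float16Predicates {
  // Finite means the exponent field is not all 1s (neither infinity nor NaN).
  def isFinite(raw: Short): Boolean = (raw & 0x7c00) != 0x7c00

  // Zero means every bit except the sign bit is 0.
  def isZero(raw: Short): Boolean = (raw & 0x7fff) == 0
}
```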

Return the sign of a Float16 value as a Float.

There are five possible return values:

* NaN: the value is Float16.NaN (and has no sign)
* -1F: the value is a non-zero negative number
* -0F: the value is Float16.NegativeZero
* 0F: the value is Float16.Zero
* 1F: the value is a non-zero positive number

PositiveInfinity and NegativeInfinity return their expected signs.
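The five outcomes can be sketched from the raw bits as follows. This is only an illustration of the documented behaviour; Float16Sign and signum are hypothetical names, and the library may well compute the result differently (for instance by converting to Float first).

```scala
// Hypothetical helper (not library API): reproduces the five documented
// sign results from a raw 16-bit pattern.
object Float16Sign {
  // Negative zero built from its bit pattern, so the sign bit is certain to be set.
  private val negativeZero: Float = java.lang.Float.intBitsToFloat(0x80000000)

  def signum(raw: Short): Float = {
    val magnitude = raw & 0x7fff           // everything except the sign bit
    val negative  = (raw & 0x8000) != 0
    if (magnitude > 0x7c00) Float.NaN      // exponent all 1s, non-zero mantissa
    else if (magnitude == 0) { if (negative) negativeZero else 0f }
    else if (negative) -1f
    else 1f                                // includes positive infinity
  }
}
```

Because -0F == 0F under ordinary floating-point comparison, a caller that needs to tell the two zero results apart has to inspect the sign bit, for example by dividing 1F by the result and checking which infinity comes back.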

Convert this Float16 value to the nearest Float.

Non-finite values and zero values will be mapped to the corresponding Float value.

All other finite values will be handled depending on whether they are normal or subnormal. The relevant formulas, where exponent is the stored 5-bit field, are:

* normal: (1 - sign*2) * 2^(exponent-15) * (1 + mantissa/1024)
* subnormal: (1 - sign*2) * 2^(-14) * (mantissa/1024)

Given any (x: Float16), Float16.fromFloat(x.toFloat) = x

The reverse is not necessarily true, since there are many Float values that are not exactly representable as Float16 values.
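To make the asymmetry concrete, the sketch below decodes all 65536 bit patterns and checks whether a given Float is among the results. It does not call the library's conversion methods; Float16RoundTrip, decode, and isExactlyRepresentable are hypothetical helpers written for this example, assuming the bit layout described earlier.

```scala
// Hypothetical helper (not library API): brute-force check of which Float
// values have an exact 16-bit representation.
object Float16RoundTrip {
  // Decode any raw pattern, including infinities and NaN.
  def decode(raw: Short): Float = {
    val sign     = 1 - ((raw >>> 15) & 0x1) * 2
    val exponent = (raw >>> 10) & 0x1f
    val mantissa = raw & 0x3ff
    val magnitude: Double =
      if (exponent == 0x1f) { if (mantissa == 0) Double.PositiveInfinity else Double.NaN }
      else if (exponent == 0) math.pow(2, -14) * (mantissa / 1024.0)
      else math.pow(2, exponent - 15) * (1 + mantissa / 1024.0)
    (sign * magnitude).toFloat
  }

  // True if some 16-bit pattern decodes to exactly f (never true for NaN,
  // since NaN compares unequal to everything).
  def isExactlyRepresentable(f: Float): Boolean =
    (0 to 0xffff).exists(bits => decode(bits.toShort) == f)
}
```

With 11 bits of precision, consecutive values between 2048 and 4096 are 2 apart, so 2048F and 2050F are representable while 2049F is not. Converting 2049F to a 16-bit value and back therefore cannot return 2049F.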

String representation of this Float16 value.

Definition Classes: Any

Reverse the sign of this Float16 value.

This just involves toggling the sign bit with XOR.

-Float16.NaN has no meaningful effect. -Float16.Zero returns Float16.NegativeZero.
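A minimal sketch of the sign toggle on a raw Short, assuming the sign occupies the top bit (mask 0x8000). Float16Negate and negate are hypothetical names for illustration only.

```scala
// Hypothetical helper (not library API): negation as an XOR on the sign bit.
object Float16Negate {
  def negate(raw: Short): Short = (raw ^ 0x8000).toShort
}
```

Applying negate to 0x0000 (zero) yields 0x8000 (negative zero), and applying it twice returns the original pattern. A NaN pattern stays a NaN, because the exponent and mantissa bits are untouched.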
