Class JsonBinary


  • public class JsonBinary
    extends Object
    Utility to parse the binary-encoded value of a MySQL JSON type, translating the encoded representation into method calls on a supplied JsonFormatter implementation.

    Binary Format

    Each JSON value (scalar, object or array) has a one byte type identifier followed by the actual value.

    Scalar

    The binary value may contain a single scalar that is one of:
    • null
    • boolean
    • int16
    • int32
    • int64
    • uint16
    • uint32
    • uint64
    • double
    • string
    • DATE as a string of the form YYYY-MM-DD where YYYY can be positive or negative
    • TIME as a string of the form HH:MM:SS where HH can be positive or negative
    • DATETIME as a string of the form YYYY-MM-DD HH:MM:SS.ssssss where YYYY can be positive or negative
    • TIMESTAMP as the number of microseconds past epoch (January 1, 1970), or if negative the number of microseconds before epoch (January 1, 1970)
    • any other MySQL value encoded as an opaque binary value
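
The "type identifier followed by the actual value" layout can be illustrated with a minimal sketch. It assumes the type codes listed in the grammar on this page and little-endian numbers, and it covers only literals, signed integers, and doubles. The class name ScalarSketch is hypothetical and is not part of JsonBinary.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ScalarSketch {
    /** Decodes a document holding one literal or numeric scalar and returns its JSON text. */
    static String decode(byte[] doc) {
        int type = doc[0] & 0xFF;
        // The value starts right after the type byte; numbers are read as little-endian.
        ByteBuffer value = ByteBuffer.wrap(doc, 1, doc.length - 1).order(ByteOrder.LITTLE_ENDIAN);
        switch (type) {
            case 0x04: { // literal
                int literal = value.get() & 0xFF;
                if (literal == 0x00) return "null";
                if (literal == 0x01) return "true";
                if (literal == 0x02) return "false";
                throw new IllegalArgumentException("unknown literal " + literal);
            }
            case 0x05: return String.valueOf(value.getShort());  // int16
            case 0x07: return String.valueOf(value.getInt());    // int32
            case 0x09: return String.valueOf(value.getLong());   // int64
            case 0x0b: return String.valueOf(value.getDouble()); // double
            default:
                throw new IllegalArgumentException("type not covered by this sketch: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x05, (byte) 0xFE, (byte) 0xFF})); // prints -2
        System.out.println(decode(new byte[] {0x04, 0x01}));                     // prints true
    }
}
```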

    JSON Object

    If the value is a JSON object, its binary representation will have a header that contains:
    • the member count
    • the size of the binary value in bytes
    • a list of pointers to each key
    • a list of pointers to each value
    The actual keys and values will come after the header, in the same order as in the header.
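
As a rough illustration of this header layout, the sketch below walks a hand-built small object equivalent to {"a": 1}. It is not the JsonBinary implementation: it assumes that offsets are counted from the first byte after the type identifier, and it handles only values that are inlined int16 numbers. SmallObjectSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class SmallObjectSketch {
    /** Decodes a small JSON object whose member values are all inlined int16 values. */
    static Map<String, Integer> decode(byte[] doc) {
        if (doc[0] != 0x00) {
            throw new IllegalArgumentException("not a small JSON object");
        }
        // Assumption: offsets are relative to the first byte after the type byte.
        ByteBuffer obj = ByteBuffer.wrap(doc, 1, doc.length - 1).slice().order(ByteOrder.LITTLE_ENDIAN);
        int count = obj.getShort() & 0xFFFF; // member count
        int size = obj.getShort() & 0xFFFF;  // size of the binary value in bytes
        if (size > obj.capacity()) {
            throw new IllegalArgumentException("object is truncated");
        }
        // Key entries: one (offset, length) pair per member.
        int[] keyOffsets = new int[count];
        int[] keyLengths = new int[count];
        for (int i = 0; i < count; i++) {
            keyOffsets[i] = obj.getShort() & 0xFFFF;
            keyLengths[i] = obj.getShort() & 0xFFFF;
        }
        // Value entries: a type byte plus a 2-byte slot holding an offset or an inlined value.
        Map<String, Integer> members = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            int type = obj.get() & 0xFF;
            short slot = obj.getShort();
            if (type != 0x05) {
                throw new IllegalArgumentException("sketch handles only inlined int16 values");
            }
            byte[] key = new byte[keyLengths[i]];
            for (int j = 0; j < key.length; j++) {
                key[j] = obj.get(keyOffsets[i] + j); // absolute read, position is unchanged
            }
            members.put(new String(key, StandardCharsets.UTF_8), (int) slot);
        }
        return members;
    }

    public static void main(String[] args) {
        byte[] doc = {0x00, 0x01, 0x00, 0x0C, 0x00, 0x0B, 0x00, 0x01, 0x00, 0x05, 0x01, 0x00, 0x61};
        System.out.println(decode(doc)); // prints {a=1}
    }
}
```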

    JSON Array

    If the value is a JSON array, the binary representation will have a header with
    • the element count
    • the size of the binary value in bytes
    • a list of pointers to each value
    followed by the actual values, in the same order as in the header.
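
The array header can be sketched in the same spirit. The example below decodes a hand-built small array equivalent to [1, "ab"], showing one inlined value and one value reached through its pointer. It assumes offsets counted from the first byte after the type identifier and string lengths that fit in a single length byte; SmallArraySketch is a hypothetical name, not part of this class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SmallArraySketch {
    /** Decodes a small JSON array containing inlined int16 values and short strings. */
    static List<Object> decode(byte[] doc) {
        if (doc[0] != 0x02) {
            throw new IllegalArgumentException("not a small JSON array");
        }
        // Assumption: offsets are relative to the first byte after the type byte.
        ByteBuffer arr = ByteBuffer.wrap(doc, 1, doc.length - 1).slice().order(ByteOrder.LITTLE_ENDIAN);
        int count = arr.getShort() & 0xFFFF; // element count
        int size = arr.getShort() & 0xFFFF;  // size of the binary value in bytes
        if (size > arr.capacity()) {
            throw new IllegalArgumentException("array is truncated");
        }
        List<Object> values = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int type = arr.get() & 0xFF;
            int slot = arr.getShort() & 0xFFFF;
            if (type == 0x05) {
                values.add((int) (short) slot);       // int16 stored in the entry itself
            } else if (type == 0x0c) {
                int length = arr.get(slot) & 0xFF;    // slot is an offset to the string
                if (length > 127) {
                    throw new IllegalArgumentException("multi-byte lengths not handled here");
                }
                byte[] text = new byte[length];
                for (int j = 0; j < length; j++) {
                    text[j] = arr.get(slot + 1 + j);
                }
                values.add(new String(text, StandardCharsets.UTF_8));
            } else {
                throw new IllegalArgumentException("type not covered by this sketch: " + type);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] doc = {0x02, 0x02, 0x00, 0x0D, 0x00, 0x05, 0x01, 0x00, 0x0C, 0x0A, 0x00, 0x02, 0x61, 0x62};
        System.out.println(decode(doc)); // prints [1, ab]
    }
}
```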

    Grammar

    The grammar of the binary representation of JSON values is defined in the MySQL codebase in the json_binary.h file:
       doc ::= type value
       type ::=
           0x00 |  // small JSON object
           0x01 |  // large JSON object
           0x02 |  // small JSON array
           0x03 |  // large JSON array
           0x04 |  // literal (true/false/null)
           0x05 |  // int16
           0x06 |  // uint16
           0x07 |  // int32
           0x08 |  // uint32
           0x09 |  // int64
           0x0a |  // uint64
           0x0b |  // double
           0x0c |  // utf8mb4 string
           0x0f    // custom data (any MySQL data type)
       value ::=
           object  |
           array   |
           literal |
           number  |
           string  |
           custom-data
       object ::= element-count size key-entry* value-entry* key* value*
       array ::= element-count size value-entry* value*
       // number of members in object or number of elements in array
       element-count ::=
           uint16 |  // if used in small JSON object/array
           uint32    // if used in large JSON object/array
       // number of bytes in the binary representation of the object or array
       size ::=
           uint16 |  // if used in small JSON object/array
           uint32    // if used in large JSON object/array
       key-entry ::= key-offset key-length
       key-offset ::=
           uint16 |  // if used in small JSON object
           uint32    // if used in large JSON object
       key-length ::= uint16    // key length must be less than 64KB
       value-entry ::= type offset-or-inlined-value
       // This field holds either the offset to where the value is stored,
       // or the value itself if it is small enough to be inlined (that is,
       // if it is a JSON literal or a small enough [u]int).
       offset-or-inlined-value ::=
           uint16 |   // if used in small JSON object/array
           uint32     // if used in large JSON object/array
       key ::= utf8mb4-data
       literal ::=
           0x00 |   // JSON null literal
           0x01 |   // JSON true literal
           0x02     // JSON false literal
       number ::=  ....  // little-endian format for [u]int(16|32|64), whereas
                         // double is stored in a platform-independent, eight-byte
                         // format using float8store()
       string ::= data-length utf8mb4-data
       custom-data ::= custom-type data-length binary-data
       custom-type ::= uint8   // type identifier that matches the
                               // internal enum_field_types enum
       data-length ::= uint8*  // If the high bit of a byte is 1, the length
                               // field is continued in the next byte,
                               // otherwise it is the last byte of the length
                               // field. So we need 1 byte to represent
                               // lengths up to 127, 2 bytes to represent
                               // lengths up to 16383, and so on...
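
The offset-or-inlined-value rule in the grammar above can be read as a small decision function. The sketch below is one interpretation of "a JSON literal or a small enough [u]int": an integer is treated as inlined when its width fits the 2-byte slot of the small format or the 4-byte slot of the large format. InlineRuleSketch and its method names are hypothetical.

```java
class InlineRuleSketch {
    static final int LITERAL = 0x04;
    static final int INT16 = 0x05;
    static final int UINT16 = 0x06;
    static final int INT32 = 0x07;
    static final int UINT32 = 0x08;

    /** Returns true if a value of the given type is stored in the value-entry itself. */
    static boolean isInlined(int type, boolean small) {
        switch (type) {
            case LITERAL:
            case INT16:
            case UINT16:
                return true;   // fits in 2 bytes, so it fits in either slot width
            case INT32:
            case UINT32:
                return !small; // needs the 4-byte slot of the large format
            default:
                return false;  // everything else is referenced by offset
        }
    }

    /** Size in bytes of one value-entry: the type byte plus the slot. */
    static int valueEntrySize(boolean small) {
        return 1 + (small ? 2 : 4);
    }

    public static void main(String[] args) {
        System.out.println(isInlined(INT32, true));  // prints false
        System.out.println(isInlined(INT32, false)); // prints true
    }
}
```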
     
    Author:
    Randall Hauch
    • Constructor Detail

      • JsonBinary

        public JsonBinary​(byte[] bytes)
    • Method Detail

      • parseAsString

        public static String parseAsString​(byte[] bytes)
                                    throws IOException
        Parse the MySQL binary representation of a JSON value and return the JSON string representation.

        This method is equivalent to parse(byte[], JsonFormatter) using the JsonStringFormatter.

        Parameters:
        bytes - the binary representation; may not be null
        Returns:
        the JSON string representation; never null
        Throws:
        IOException - if there is a problem reading or processing the binary representation
      • parse

        public static void parse​(byte[] bytes,
                                 JsonFormatter formatter)
                          throws IOException
        Parse the MySQL binary representation of a JSON value and call the supplied JsonFormatter for the various components of the value.
        Parameters:
        bytes - the binary representation; may not be null
        formatter - the formatter that will be called as the binary representation is parsed; may not be null
        Throws:
        IOException - if there is a problem reading or processing the binary representation
      • getString

        public String getString()
      • parseObject

        protected void parseObject​(boolean small,
                                   JsonFormatter formatter)
                            throws IOException
        Parse a JSON object.

        The grammar of the binary representation of JSON objects is defined in the MySQL code base in the json_binary.h file:

        Grammar

           value ::=
               object  |
               array   |
               literal |
               number  |
               string  |
               custom-data
           object ::= element-count size key-entry* value-entry* key* value*
           // number of members in object or number of elements in array
           element-count ::=
               uint16 |  // if used in small JSON object/array
               uint32    // if used in large JSON object/array
           // number of bytes in the binary representation of the object or array
           size ::=
               uint16 |  // if used in small JSON object/array
               uint32    // if used in large JSON object/array
           key-entry ::= key-offset key-length
           key-offset ::=
               uint16 |  // if used in small JSON object
               uint32    // if used in large JSON object
           key-length ::= uint16    // key length must be less than 64KB
           value-entry ::= type offset-or-inlined-value
           // This field holds either the offset to where the value is stored,
           // or the value itself if it is small enough to be inlined (that is,
           // if it is a JSON literal or a small enough [u]int).
           offset-or-inlined-value ::=
               uint16 |   // if used in small JSON object/array
               uint32     // if used in large JSON object/array
           key ::= utf8mb4-data
           literal ::=
               0x00 |   // JSON null literal
               0x01 |   // JSON true literal
               0x02     // JSON false literal
           number ::=  ....  // little-endian format for [u]int(16|32|64), whereas
                             // double is stored in a platform-independent, eight-byte
                             // format using float8store()
           string ::= data-length utf8mb4-data
           custom-data ::= custom-type data-length binary-data
           custom-type ::= uint8   // type identifier that matches the
                                   // internal enum_field_types enum
           data-length ::= uint8*  // If the high bit of a byte is 1, the length
                                   // field is continued in the next byte,
                                   // otherwise it is the last byte of the length
                                   // field. So we need 1 byte to represent
                                   // lengths up to 127, 2 bytes to represent
                                   // lengths up to 16383, and so on...
         
        Parameters:
        small - true if the object being read is "small", or false otherwise
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseArray

        protected void parseArray​(boolean small,
                                  JsonFormatter formatter)
                           throws IOException
        Parse a JSON array.

        The grammar of the binary representation of JSON arrays is defined in the MySQL code base in the json_binary.h file:

        Grammar

           value ::=
               object  |
               array   |
               literal |
               number  |
               string  |
               custom-data
           array ::= element-count size value-entry* value*
           // number of members in object or number of elements in array
           element-count ::=
               uint16 |  // if used in small JSON object/array
               uint32    // if used in large JSON object/array
           // number of bytes in the binary representation of the object or array
           size ::=
               uint16 |  // if used in small JSON object/array
               uint32    // if used in large JSON object/array
           value-entry ::= type offset-or-inlined-value
           // This field holds either the offset to where the value is stored,
           // or the value itself if it is small enough to be inlined (that is,
           // if it is a JSON literal or a small enough [u]int).
           offset-or-inlined-value ::=
               uint16 |   // if used in small JSON object/array
               uint32     // if used in large JSON object/array
           key ::= utf8mb4-data
           literal ::=
               0x00 |   // JSON null literal
               0x01 |   // JSON true literal
               0x02     // JSON false literal
           number ::=  ....  // little-endian format for [u]int(16|32|64), whereas
                             // double is stored in a platform-independent, eight-byte
                             // format using float8store()
           string ::= data-length utf8mb4-data
           custom-data ::= custom-type data-length binary-data
           custom-type ::= uint8   // type identifier that matches the
                                   // internal enum_field_types enum
           data-length ::= uint8*  // If the high bit of a byte is 1, the length
                                   // field is continued in the next byte,
                                   // otherwise it is the last byte of the length
                                   // field. So we need 1 byte to represent
                                   // lengths up to 127, 2 bytes to represent
                                   // lengths up to 16383, and so on...
         
        Parameters:
        small - true if the array being read is "small", or false otherwise
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseBoolean

        protected void parseBoolean​(JsonFormatter formatter)
                             throws IOException
        Parse a literal value that is either null, true, or false.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseInt16

        protected void parseInt16​(JsonFormatter formatter)
                           throws IOException
        Parse a 2 byte integer value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseUInt16

        protected void parseUInt16​(JsonFormatter formatter)
                            throws IOException
        Parse a 2 byte unsigned integer value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseInt32

        protected void parseInt32​(JsonFormatter formatter)
                           throws IOException
        Parse a 4 byte integer value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseUInt32

        protected void parseUInt32​(JsonFormatter formatter)
                            throws IOException
        Parse a 4 byte unsigned integer value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseInt64

        protected void parseInt64​(JsonFormatter formatter)
                           throws IOException
        Parse an 8 byte integer value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseUInt64

        protected void parseUInt64​(JsonFormatter formatter)
                            throws IOException
        Parse an 8 byte unsigned integer value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
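
Java has no unsigned primitive types, so unsigned values have to be widened or converted after reading. The sketch below shows one common way to do this for little-endian input; it is an illustration only and does not claim to mirror how parseUInt16, parseUInt32, or parseUInt64 are implemented. UnsignedSketch is a hypothetical name.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class UnsignedSketch {
    /** Reads a little-endian uint16 into an int. */
    static int readUInt16(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort() & 0xFFFF;
    }

    /** Reads a little-endian uint32 into a long. */
    static long readUInt32(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
    }

    /** Reads a little-endian uint64 into a BigInteger, since it may exceed Long.MAX_VALUE. */
    static BigInteger readUInt64(byte[] bytes) {
        long raw = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
        return new BigInteger(Long.toUnsignedString(raw));
    }

    public static void main(String[] args) {
        byte ff = (byte) 0xFF;
        System.out.println(readUInt16(new byte[] {ff, ff}));                         // prints 65535
        System.out.println(readUInt64(new byte[] {ff, ff, ff, ff, ff, ff, ff, ff})); // prints 18446744073709551615
    }
}
```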
      • parseDouble

        protected void parseDouble​(JsonFormatter formatter)
                            throws IOException
        Parse an 8 byte double value.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseString

        protected void parseString​(JsonFormatter formatter)
                            throws IOException
        Parse the length and value of a string stored in MySQL's "utf8mb4" character set (which equates to Java's UTF-8 character set). The length is stored as a variable-length integer giving the number of bytes in the string.
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseOpaque

        protected void parseOpaque​(JsonFormatter formatter)
                            throws IOException
        Parse an opaque type. DATE, TIME, and DATETIME values are stored as opaque types, but they are unpacked into their components. TIMESTAMP values are also stored as opaque types, but MySQL converts them to DATETIME prior to storage. All other MySQL types are stored as opaque types and passed on to the formatter as opaque values.

        See the MySQL source code for the logic used in this method.

        Grammar

           custom-data ::= custom-type data-length binary-data
           custom-type ::= uint8   // type identifier that matches the
                                   // internal enum_field_types enum
           data-length ::= uint8*  // If the high bit of a byte is 1, the length
                                   // field is continued in the next byte,
                                   // otherwise it is the last byte of the length
                                   // field. So we need 1 byte to represent
                                   // lengths up to 127, 2 bytes to represent
                                   // lengths up to 16383, and so on...
         
        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseDate

        protected void parseDate​(JsonFormatter formatter)
                          throws IOException
        Parse a DATE value, which is stored using the same format as DATETIME: 5 bytes + fractional-seconds storage. However, the hour, minute, second, and fractional seconds are ignored.

        The non-fractional part is 40 bits:

           1 bit  sign           (1 = non-negative, 0 = negative)
          17 bits year*13+month  (year 0-9999, month 0-12)
           5 bits day            (0-31)
           5 bits hour           (0-23)
           6 bits minute         (0-59)
           6 bits second         (0-59)
         
        The fractional part is typically dependent upon the fsp (i.e., fractional seconds part) defined by a column, but in the case of JSON it is always 3 bytes.

        The format of all temporal values is outlined in the MySQL documentation, although since the MySQL JSON type is available only in 5.7 and later, only version 2 of the date-time formats is necessary.

        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseTime

        protected void parseTime​(JsonFormatter formatter)
                          throws IOException
        Parse a TIME value, which is stored using the same format as DATETIME: 5 bytes + fractional-seconds storage. However, the year, month, and day values are ignored.

        The non-fractional part is 40 bits:

           1 bit  sign           (1 = non-negative, 0 = negative)
          17 bits year*13+month  (year 0-9999, month 0-12)
           5 bits day            (0-31)
           5 bits hour           (0-23)
           6 bits minute         (0-59)
           6 bits second         (0-59)
         
        The fractional part is typically dependent upon the fsp (i.e., fractional seconds part) defined by a column, but in the case of JSON it is always 3 bytes.

        The format of all temporal values is outlined in the MySQL documentation, although since the MySQL JSON type is available only in 5.7 and later, only version 2 of the date-time formats is necessary.

        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • parseDatetime

        protected void parseDatetime​(JsonFormatter formatter)
                              throws IOException
        Parse a DATETIME value, which is stored as 5 bytes + fractional-seconds storage.

        The non-fractional part is 40 bits:

           1 bit  sign           (1 = non-negative, 0 = negative)
          17 bits year*13+month  (year 0-9999, month 0-12)
           5 bits day            (0-31)
           5 bits hour           (0-23)
           6 bits minute         (0-59)
           6 bits second         (0-59)
         
        The sign bit is always 1. A value of 0 (negative) is reserved. The fractional part is typically dependent upon the fsp (i.e., fractional seconds part) defined by a column, but in the case of JSON it is always 3 bytes. Unlike the documentation, however, the 8 byte value is in little-endian form.

        The format of all temporal values is outlined in the MySQL documentation, although since the MySQL JSON type is available only in 5.7 and later, only version 2 of the date-time formats is necessary.

        Parameters:
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
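
The 40-bit layout described for parseDatetime can be unpacked with shifts and masks. The sketch below assumes an 8-byte little-endian value whose low 3 bytes hold the fractional seconds in microseconds and whose upper 5 bytes hold the fields in the order listed; the sign bit is masked off rather than interpreted. DatetimeSketch and its pack helper are hypothetical, and pack exists only to build sample input.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Locale;

class DatetimeSketch {
    /** Unpacks an 8-byte little-endian DATETIME and formats it as text. */
    static String unpack(byte[] eightBytes) {
        long raw = ByteBuffer.wrap(eightBytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
        int micros = (int) (raw & 0xFFFFFF);              // low 3 bytes: fractional seconds
        long value = raw >>> 24;                          // upper 40 bits: sign plus fields
        int yearMonth = (int) ((value >>> 22) & 0x1FFFF); // 17 bits: year*13+month
        int day = (int) ((value >>> 17) & 0x1F);          // 5 bits
        int hour = (int) ((value >>> 12) & 0x1F);         // 5 bits
        int minute = (int) ((value >>> 6) & 0x3F);        // 6 bits
        int second = (int) (value & 0x3F);                // 6 bits
        return String.format(Locale.ROOT, "%04d-%02d-%02d %02d:%02d:%02d.%06d",
                yearMonth / 13, yearMonth % 13, day, hour, minute, second, micros);
    }

    /** Packs fields using the same layout with the sign bit left clear (sample input only). */
    static byte[] pack(int year, int month, int day, int hour, int minute, int second, int micros) {
        long value = ((long) (year * 13 + month) << 22) | ((long) day << 17)
                | ((long) hour << 12) | ((long) minute << 6) | second;
        long raw = (value << 24) | micros;
        return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(raw).array();
    }

    public static void main(String[] args) {
        System.out.println(unpack(pack(2015, 1, 15, 10, 20, 30, 123456))); // prints 2015-01-15 10:20:30.123456
    }
}
```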
      • parseDecimal

        protected void parseDecimal​(int length,
                                    JsonFormatter formatter)
                             throws IOException
        Parse a DECIMAL value. The first two bytes are the precision and scale, followed by the binary representation of the decimal itself.
        Parameters:
        length - the length of the complete binary representation
        formatter - the formatter to be notified of the parsed value; may not be null
        Throws:
        IOException - if there is a problem reading the JSON value
      • readFractionalSecondsInMicroseconds

        protected int readFractionalSecondsInMicroseconds()
                                                   throws IOException
        Throws:
        IOException
      • readBigEndianLong

        protected long readBigEndianLong​(int numBytes)
                                  throws IOException
        Throws:
        IOException
      • readUnsignedIndex

        protected int readUnsignedIndex​(int maxValue,
                                        boolean isSmall,
                                        String desc)
                                 throws IOException
        Throws:
        IOException
      • readVariableInt

        protected int readVariableInt()
                               throws IOException
        Read a variable-length integer value.

        If the high bit of a byte is 1, the length field is continued in the next byte, otherwise it is the last byte of the length field. So we need 1 byte to represent lengths up to 127, 2 bytes to represent lengths up to 16383, and so on...

        Returns:
        the integer value
        Throws:
        IOException - if we don't encounter an end-of-int marker
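
The continuation-bit scheme described for readVariableInt can be sketched as follows. This is an illustration rather than the method's actual code: it assumes the 7-bit groups are stored least significant first and that a length never needs more than 5 bytes. VariableIntSketch is a hypothetical name.

```java
class VariableIntSketch {
    /** Decodes a variable-length integer that starts at the given offset. */
    static int decode(byte[] data, int offset) {
        int value = 0;
        // Assumption: at most 5 bytes are used for a 32-bit length.
        for (int i = 0; i < 5; i++) {
            if (offset + i >= data.length) {
                break; // ran out of input before seeing the last byte
            }
            int b = data[offset + i] & 0xFF;
            value |= (b & 0x7F) << (7 * i); // low 7 bits carry data, least significant group first
            if ((b & 0x80) == 0) {
                return value;               // high bit clear: this was the last byte
            }
        }
        throw new IllegalArgumentException("no end-of-int marker found");
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x7F}, 0));                     // prints 127
        System.out.println(decode(new byte[] {(byte) 0x80, 0x01}, 0));        // prints 128
        System.out.println(decode(new byte[] {(byte) 0xFF, 0x7F}, 0));        // prints 16383
    }
}
```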
      • asHex

        protected static String asHex​(byte b)
      • asHex

        protected static String asHex​(int value)