AMD64MathCosOp

java.lang.Object
- org.graalvm.compiler.lir.LIRInstruction
- - org.graalvm.compiler.lir.amd64.AMD64LIRInstruction
  - - org.graalvm.compiler.lir.amd64.AMD64MathIntrinsicUnaryOp
    - - org.graalvm.compiler.lir.amd64.AMD64MathCosOp

public final class AMD64MathCosOp
extends AMD64MathIntrinsicUnaryOp

                     ALGORITHM DESCRIPTION - COS()
                     ---------------------

     1. RANGE REDUCTION

     We perform an initial range reduction from X to r with

          X =~= N * pi/32 + r

     so that |r| <= pi/64 + epsilon. We restrict inputs to those
     where |N| <= 932560. Beyond this, the range reduction is
     insufficiently accurate. For extremely small inputs,
     denormalization can occur internally, impacting performance.
     This means that the main path is actually only taken for
     2^-252 <= |X| < 90112.

     To avoid branches, we perform the range reduction to full
     accuracy each time.

          X - N * (P_1 + P_2 + P_3)

     where P_1 and P_2 are 32-bit numbers (so multiplication by N
     is exact) and P_3 is a 53-bit number. Together, these
     approximate pi well enough for all cases in the restricted
     range.

     The main reduction sequence is:

             y = 32/pi * x
             N = integer(y)
     (computed by adding and subtracting off SHIFTER)

             m_1 = N * P_1
             m_2 = N * P_2
             r_1 = x - m_1
             r = r_1 - m_2
     (this r can be used for most of the calculation)

             c_1 = r_1 - r
             m_3 = N * P_3
             c_2 = c_1 - m_2
             c = c_2 - m_3

     2. MAIN ALGORITHM

     The algorithm uses a table lookup based on B = M * pi / 32
     where M = N mod 64. The stored values are:
       sigma             closest power of 2 to cos(B)
       C_hl              53-bit cos(B) - sigma
       S_hi + S_lo       2 * 53-bit sin(B)

     The computation is organized as follows:

          sin(B + r + c) = [sin(B) + sigma * r] +
                           r * (cos(B) - sigma) +
                           sin(B) * [cos(r + c) - 1] +
                           cos(B) * [sin(r + c) - r]

     which is approximately:

          [S_hi + sigma * r] +
          C_hl * r +
          S_lo + S_hi * [(cos(r) - 1) - r * c] +
          (C_hl + sigma) * [(sin(r) - r) + c]

     and this is what is actually computed. We separate this sum
     into four parts:

          hi + med + pols + corr

     where

          hi       = S_hi + sigma r
          med      = C_hl * r
          pols     = S_hi * (cos(r) - 1) + (C_hl + sigma) * (sin(r) - r)
          corr     = S_lo + c * ((C_hl + sigma) - S_hi * r)

     3. POLYNOMIAL

     The polynomial S_hi * (cos(r) - 1) + (C_hl + sigma) *
     (sin(r) - r) can be rearranged freely, since it is quite
     small, so we exploit parallelism to the fullest.

          psc4       =   SC_4 * r_1
          msc4       =   psc4 * r
          r2         =   r * r
          msc2       =   SC_2 * r2
          r4         =   r2 * r2
          psc3       =   SC_3 + msc4
          psc1       =   SC_1 + msc2
          msc3       =   r4 * psc3
          sincospols =   psc1 + msc3
          pols       =   sincospols *
                         

     4. CORRECTION TERM

     This is where the "c" component of the range reduction is
     taken into account; recall that just "r" is used for most of
     the calculation.

          -c   = m_3 - c_2
          -d   = S_hi * r - (C_hl + sigma)
          corr = -c * -d + S_lo

     5. COMPENSATED SUMMATIONS

     The two successive compensated summations add up the high
     and medium parts, leaving just the low parts to add up at
     the end.

          rs        =  sigma * r
          res_int   =  S_hi + rs
          k_0       =  S_hi - res_int
          k_2       =  k_0 + rs
          med       =  C_hl * r
          res_hi    =  res_int + med
          k_1       =  res_int - res_hi
          k_3       =  k_1 + med

     6. FINAL SUMMATION

     We now add up all the small parts:

          res_lo = pols(hi) + pols(lo) + corr + k_1 + k_3

     Now the overall result is just:

          res_hi + res_lo

     7. SMALL ARGUMENTS

     Inputs with |X| < 2^-252 are treated specially as
     1 - |x|.

 Special cases:
  cos(NaN) = quiet NaN, and raise invalid exception
  cos(INF) = NaN and raise invalid exception
  cos(0) = 1

- Nested Class Summary
  - Nested classes/interfaces inherited from class org.graalvm.compiler.lir.LIRInstruction
    LIRInstruction.Alive, LIRInstruction.Def, LIRInstruction.OperandFlag, LIRInstruction.OperandMode, LIRInstruction.State, LIRInstruction.Temp, LIRInstruction.Use
- Field Summary
  
  Fields
  Modifier and Type Field and Description
  
  static LIRInstructionClass<AMD64MathCosOp> TYPE
  - Fields inherited from class org.graalvm.compiler.lir.amd64.AMD64MathIntrinsicUnaryOp
    input, output, temps
  - Fields inherited from class org.graalvm.compiler.lir.LIRInstruction
    ADDRESS_FLAGS, ALLOWED_FLAGS
- Constructor Summary
  
  Constructors
  Constructor and Description
  
  AMD64MathCosOp()
- Method Summary
  
  All Methods Instance Methods Concrete Methods
  Modifier and Type Method and Description
  
  void emitCode(CompilationResultBuilder crb, AMD64MacroAssembler masm)
  - Methods inherited from class org.graalvm.compiler.lir.amd64.AMD64MathIntrinsicUnaryOp
    emitLIRWrapper
  - Methods inherited from class org.graalvm.compiler.lir.amd64.AMD64LIRInstruction
    emitCode
  - Methods inherited from class org.graalvm.compiler.lir.LIRInstruction
    addStackSlotsToTemporaries, destroysCallerSavedRegisters, forEachAlive, forEachAlive, forEachInput, forEachInput, forEachOutput, forEachOutput, forEachRegisterHint, forEachRegisterHint, forEachState, forEachState, forEachState, forEachState, forEachTemp, forEachTemp, getComment, getLIRInstructionClass, getPosition, hashCode, hasOperands, hasState, id, isLoadConstantOp, isMoveOp, isValueMoveOp, name, needsClearUpperVectorRegisters, setComment, setId, setPosition, toString, toString, toStringWithIdPrefix, verify, visitEachAlive, visitEachAlive, visitEachInput, visitEachInput, visitEachOutput, visitEachOutput, visitEachState, visitEachState, visitEachTemp, visitEachTemp
  - Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, notify, notifyAll, wait, wait, wait

Fields
Modifier and Type	Field and Description
`static LIRInstructionClass<AMD64MathCosOp>`	`TYPE`

Constructors
Constructor and Description
`AMD64MathCosOp()`

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`void`	`emitCode(CompilationResultBuilder crb, AMD64MacroAssembler masm)`

Field Detail

TYPE

public static final LIRInstructionClass<AMD64MathCosOp> TYPE

Constructor Detail
- AMD64MathCosOp
```
public AMD64MathCosOp()
```

Method Detail

emitCode

public void emitCode(CompilationResultBuilder crb,
                     AMD64MacroAssembler masm)

Specified by:: emitCode in class AMD64LIRInstruction

Class AMD64MathCosOp

Nested Class Summary

Nested classes/interfaces inherited from class org.graalvm.compiler.lir.LIRInstruction

Field Summary

Fields inherited from class org.graalvm.compiler.lir.amd64.AMD64MathIntrinsicUnaryOp

Fields inherited from class org.graalvm.compiler.lir.LIRInstruction

Constructor Summary

Method Summary

Methods inherited from class org.graalvm.compiler.lir.amd64.AMD64MathIntrinsicUnaryOp

Methods inherited from class org.graalvm.compiler.lir.amd64.AMD64LIRInstruction

Methods inherited from class org.graalvm.compiler.lir.LIRInstruction

Methods inherited from class java.lang.Object

Field Detail

TYPE

Constructor Detail

AMD64MathCosOp

Method Detail

emitCode