Package

za.co.absa.cobrix.spark.cobol.source

index

package index

Type Members

  1. case class ExecutorPair(newExecutor: String, busyExecutor: String) extends Product with Serializable

Value Members

  1. object LocationBalancer

    This object provides methods to rebalance partitions when some executors are idle.

    For instance, assume there were only 4 nodes (n1 ... n4) in the cluster when the files to be processed were stored.

    Later on, two new nodes were added (n5 and n6).

    When trying to achieve record-level locality, the new nodes might stay idle, since they hold no HDFS blocks for the files being processed.

    This object provides methods to send some work to those new nodes.

    Although some shuffling will happen, the overall benefit of having more workers should outweigh it.
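
    The rebalancing idea described above can be illustrated with a minimal sketch. It is not the Cobrix implementation: RebalanceSketch, rebalance, and the (partition id, executor) representation are hypothetical names chosen for illustration, and the local ExecutorPair only mirrors the shape of the case class listed under Type Members. The sketch assumes each idle executor takes exactly one partition from the currently busiest executor.

```scala
// Illustrative sketch only; names and signatures here are assumptions,
// not the LocationBalancer API.
object RebalanceSketch {

  // Same shape as the ExecutorPair case class documented above.
  final case class ExecutorPair(newExecutor: String, busyExecutor: String)

  /** Moves one partition from the busiest executor to each idle executor.
    *
    * @param assignment (partition id, executor) pairs, e.g. derived from block locations
    * @param executors  all executors currently available in the cluster
    */
  def rebalance(assignment: Seq[(Int, String)], executors: Seq[String]): Seq[(Int, String)] = {
    // Executors that own no partition at all are considered idle.
    val idle = executors.distinct.filterNot(e => assignment.exists(_._2 == e))

    idle.foldLeft(assignment) { (current, newExecutor) =>
      if (current.isEmpty) current
      else {
        // Partition count per executor; ties are broken by name for determinism.
        val load = current.groupBy(_._2).map { case (e, ps) => e -> ps.size }
        val (busyExecutor, busyLoad) = load.toSeq.sortBy { case (e, n) => (-n, e) }.head

        // Moving work only helps if the busy executor keeps at least one partition.
        if (busyLoad < 2) current
        else {
          val pair = ExecutorPair(newExecutor, busyExecutor)
          val idx = current.indexWhere(_._2 == pair.busyExecutor)
          current.updated(idx, (current(idx)._1, pair.newExecutor))
        }
      }
    }
  }
}
```

    With eight partitions spread evenly over n1 ... n4 and n5, n6 newly added, calling RebalanceSketch.rebalance leaves every one of the six executors with at least one partition, at the cost of reading two partitions remotely.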
