Aggregates (reduces) the array with the provided functions, similar to foldLeft on Scala collections.
type of the transformed values.
zero value.
function to combine the previous result with the element of the array.
the column reference with the applied transformation.
scaladoc link (issue #135)
org.apache.spark.sql.functions.aggregate
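Since the doc compares this to foldLeft on Scala collections, a minimal plain-Scala sketch of the analogous reduction (the sample data is illustrative, not doric API):

```scala
// aggregate(zero)(merge) reduces an array column the way foldLeft
// reduces a Scala collection: start from a zero value and fold each
// element into the running result.
val xs = List(1, 2, 3, 4)

// The (zero, merge) pair here: 0 and integer addition.
val sum = xs.foldLeft(0)((acc, x) => acc + x)
// sum == 10
```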
Aggregates (reduces) the array with the provided functions, similar to foldLeft on Scala collections, with a final transformation.
type of the intermediate values.
type of the final value to return.
zero value.
function to combine the previous result with the element of the array.
the final transformation.
the column reference with the applied transformation.
scaladoc link (issue #135)
org.apache.spark.sql.functions.aggregate
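The three-argument variant adds a final transformation on the reduced value; in plain Scala this is a foldLeft followed by one last function application (illustrative data, not doric API):

```scala
// Average via fold-then-finish: fold into (sum, count), then apply
// the final transformation that divides one by the other.
val ys = List(1.0, 2.0, 3.0, 4.0)

val (total, count) =
  ys.foldLeft((0.0, 0)) { case ((s, n), x) => (s + x, n + 1) }

val avg = total / count // the "final transformation" step
// avg == 2.5
```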
Returns null if the array is null, true if the array contains the value, and false otherwise.
Removes duplicate values from the array.
Returns the element of the array at the given index.
Returns an array of the elements in the first array but not in the second array, without duplicates. The order of the elements in the result is not determined.
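The same "first minus second, without duplicates" semantics can be sketched with plain Scala collections (the sample data is made up):

```scala
// Elements of `a` that do not appear in `b`, deduplicated.
val a = List(1, 2, 2, 3, 4)
val b = List(3, 4, 5)

val except = a.distinct.filterNot(b.toSet)
// except == List(1, 2)
```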
Returns whether a predicate holds for one or more elements in the array.
df.select(colArray("i").exists(_ % 2 === 0))
scaladoc link not available for spark 2.4
org.apache.spark.sql.functions.exists
Creates a new row for each element in the given array column.
Creates a new row for each element in the given array column. Unlike explode, if the array is null or empty then null is produced.
Filters the array elements using the provided condition.
the condition to filter.
the column reference with the filter applied.
scaladoc link (issue #135)
org.apache.spark.sql.functions.filter
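The semantics mirror filter on Scala collections; a plain-Scala sketch with illustrative data:

```scala
// Keep only the elements that satisfy the predicate.
val nums = List(1, 2, 3, 4, 5, 6)

val evens = nums.filter(_ % 2 == 0)
// evens == List(2, 4, 6)
```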
Selects the nth element of the array, returning null if the array length is shorter than n.
the index of the element to retrieve.
the DoricColumn with the selected element.
Returns an array of the elements in the intersection of the given two arrays, without duplicates.
Concatenates the elements of the column using the delimiter. Null values are removed.
scaladoc link (issue #135)
org.apache.spark.sql.functions.array_join
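Dropping nulls and then joining with the delimiter can be sketched with plain Scala collections (illustrative data):

```scala
// Nulls are removed before the elements are joined.
val parts: List[String] = List("a", null, "b", "c")

val joined = parts.filter(_ != null).mkString(",")
// joined == "a,b,c"
```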
Concatenates the elements of the column using the delimiter. Null values are replaced with nullReplacement.
scaladoc link (issue #135)
org.apache.spark.sql.functions.array_join
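The nullReplacement variant substitutes nulls instead of dropping them; in plain Scala (illustrative data, with "?" standing in for nullReplacement):

```scala
// Replace each null with the substitute, then join.
val items: List[String] = List("a", null, "b")

val joinedWithDefault =
  items.map(x => if (x == null) "?" else x).mkString(",")
// joinedWithDefault == "a,?,b"
```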
Transforms the original value to a literal.
a literal with the same type.
Creates a new map column. The array in the first column is used for keys. The array in the second column is used for values.
java.lang.RuntimeException
if the arrays don't have the same length, or if a key is null.
Returns the maximum value in the array.
Returns the minimum value in the array.
Returns true if a1 and a2 have at least one non-null element in common. If not, and both arrays are non-empty and either of them contains a null, it returns null. Otherwise it returns false.
Creates a new row for each element with position in the given array column.
ORIGINAL        SPARK       DORIC
+------------+  +---+---+   +------+
|col         |  |pos|col|   |col   |
+------------+  +---+---+   +------+
|[a, b, c, d]|  |0  |a  |   |{0, a}|
|[e]         |  |1  |b  |   |{1, b}|
|[]          |  |2  |c  |   |{2, c}|
|null        |  |3  |d  |   |{3, d}|
+------------+  |0  |e  |   |{0, e}|
                +---+---+   +------+
WARNING: Unlike Spark, doric returns a struct. It uses the default column name pos for the position and value for the elements of the array.
Creates a new row for each element with position in the given array column. Unlike posexplode, if the array is null or empty then a null row is produced.
ORIGINAL        SPARK         DORIC
+------------+  +----+----+   +------+
|col         |  |pos |col |   |col   |
+------------+  +----+----+   +------+
|[a, b, c, d]|  |0   |a   |   |{0, a}|
|[e]         |  |1   |b   |   |{1, b}|
|[]          |  |2   |c   |   |{2, c}|
|null        |  |3   |d   |   |{3, d}|
+------------+  |0   |e   |   |{0, e}|
                |null|null|   |null  |
                |null|null|   |null  |
                +----+----+   +------+
WARNING: Unlike Spark, doric returns a struct. It uses the default column name pos for the position and col for the elements of the array.
Locates the position of the first occurrence of the value in the given array, as a long. Returns null if either of the arguments is null.
The position is 1-based, not 0-based. Returns 0 if the value could not be found in the array.
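The 1-based / 0-on-miss convention differs from Scala's indexOf (0-based, -1 on miss); shifting by one reproduces it in plain Scala (position is a hypothetical helper, not doric API):

```scala
// 1-based position; 0 when the value is absent.
def position[A](xs: Seq[A], value: A): Long =
  (xs.indexOf(value) + 1).toLong

val found   = position(List("a", "b", "c"), "b") // 2
val missing = position(List("a", "b", "c"), "z") // 0
```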
Removes all elements equal to the given element from the array.
Returns an array with the elements in reverse order.
Returns a random permutation of the given array.
The function is non-deterministic.
Returns the length of the array.
The function returns null for null input if spark.sql.legacy.sizeOfNull is set to false or spark.sql.ansi.enabled is set to true. Otherwise, the function returns -1 for null input. With the default settings, the function returns -1 for null input.
Returns an array containing all the elements in the column from index start (or from the end if start is negative) with the specified length.
scaladoc link (issue #135)
if start == 0 an exception will be thrown
org.apache.spark.sql.functions.slice
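A plain-Scala sketch of the 1-based, negative-aware slicing (sliceCol is a hypothetical helper, not doric API; the data is illustrative):

```scala
// 1-based start; a negative start counts from the end; 0 is invalid.
def sliceCol[A](xs: Seq[A], start: Int, length: Int): Seq[A] = {
  require(start != 0, "start must not be 0")
  val from = if (start > 0) start - 1 else xs.length + start
  xs.slice(from, from + length)
}

val fromStart = sliceCol(List(1, 2, 3, 4, 5), 2, 3)  // List(2, 3, 4)
val fromEnd   = sliceCol(List(1, 2, 3, 4, 5), -2, 2) // List(4, 5)
```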
Sorts the input array for the given column in ascending or descending order, according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order.
Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array.
Sorts the input array in ascending order. The elements of the input array must be orderable. Null elements will be placed at the end of the returned array.
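The natural-ordering behaviour matches sorted on Scala collections (illustrative data; the null-placement rules aside, which a plain Int list cannot show):

```scala
// Ascending sort by the elements' natural ordering.
val sortedAsc = List(3, 1, 2).sorted
// sortedAsc == List(1, 2, 3)

// Descending order via the reversed ordering.
val sortedDesc = List(3, 1, 2).sorted(Ordering[Int].reverse)
// sortedDesc == List(3, 2, 1)
```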
Converts a column containing a StructType into a JSON string with the specified schema.
java.lang.IllegalArgumentException
in the case of an unsupported type.
scaladoc link (issue #135)
org.apache.spark.sql.functions.to_json(e:org\.apache\.spark\.sql\.Column,options:scala\.collection\.immutable\.Map\[java\.lang\.String,java\.lang\.String\]):*
org.apache.spark.sql.functions.to_csv
Creates a new map column. The array in the first column is used for keys. The array in the second column is used for values.
java.lang.RuntimeException
if the arrays don't have the same length, or if a key is null.
Transforms each element with the provided function.
the type of the array elements to return.
lambda with the transformation to apply.
the column reference with the applied transformation.
scaladoc link (issue #135)
org.apache.spark.sql.functions.transform
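The semantics mirror map on Scala collections; a minimal sketch with illustrative data:

```scala
// Apply the lambda to every element.
val doubled = List(1, 2, 3).map(_ * 2)
// doubled == List(2, 4, 6)
```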
Transforms each element of the array using the provided function, which also receives the index of the element in the array.
the type of the elements of the array
the lambda that takes the element of the array and its index and returns a new element.
the column reference with the provided transformation.
scaladoc link (issue #135)
org.apache.spark.sql.functions.transform
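In plain Scala the indexed variant corresponds to zipWithIndex followed by map (illustrative data):

```scala
// Each element is transformed together with its 0-based index.
val labeled =
  List("a", "b", "c").zipWithIndex.map { case (e, i) => s"$i:$e" }
// labeled == List("0:a", "1:b", "2:c")
```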
Returns an array of the elements in the union of the given N arrays, without duplicates.
Returns a merged array of structs in which the N-th struct contains the N-th values of all input arrays.
Merges two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.
df.select(colArray("val1").zipWith(col("val2"), concat(_, _)))
scaladoc link not available for spark 2.4
org.apache.spark.sql.functions.zip_with
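The pad-with-nulls-then-merge behaviour can be sketched with zipAll on Scala collections (illustrative data; string concatenation stands in for the merge function):

```scala
// The shorter side is padded with nulls before the function is applied.
val left  = List("a", "b", "c")
val right = List("1", "2")

val merged = left.zipAll(right, null, null).map { case (l, r) => s"$l$r" }
// merged == List("a1", "b2", "cnull")
```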
DORIC EXCLUSIVE! Given any array[e] column, this method returns a new array[struct[i, e]] column, where the first element of each struct is the index and the second is the value itself.
Extension methods for arrays