class UnitTestsGenerator extends Closeable
Unit test generator. Generates ScalaTest tests in the Prophecy format for a given component from its input and output DataFrames.
Note that, for the generated unit tests to be correct, this code should be executed on gold-standard datasets.
Example usage:
  val ut = new UnitTestsGenerator("hdfs:///path/to/generated/tests/")
  val dfInput = Input(spark)
  val (dfDistribute1, dfDistribute2) = Distribute(spark, dfInput)
  ut.generateUnitTests("Distribute", Seq(dfInput), Seq(dfDistribute1, dfDistribute2))
  val dfSomeJoin = SomeJoin(spark, dfDistribute1, dfDistribute2)
  ut.generateUnitTests("SomeJoin", Seq(dfDistribute1, dfDistribute2), Seq(dfSomeJoin))
The above sets up a typical Spark graph with additional calls to one of the generateUnitTests overloads, each of which executes the Spark workflow for the given inputs and outputs and writes the unit tests:
- [UnitTestsGenerator!generateUnitTests(String, Seq[DataFrame], Seq[DataFrame], Option[Int])]
- [UnitTestsGenerator!generateUnitTests(String, Seq[DataFrame], Seq[String], Seq[DataFrame], Seq[String])]
TODO: To improve performance and reduce the number of Spark actions executed, this could be reimplemented as a new logical plan operator (similar to our org.apache.spark.sql.InterimExec). However, because the tests are executed on only a very limited amount of data, the current approach should not cause significant performance degradation for now.
Linear Supertypes
- Closeable
- AutoCloseable
- AnyRef
- Any
Instance Constructors
-
new
UnitTestsGenerator(path: String = "hdfs:///tmp/unit-tests/")
- path
Path where all the generated unit tests (both JSON and Scala) are saved
Type Members
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native() @HotSpotIntrinsicCandidate()
-
def
close(): Unit
- Definition Classes
- UnitTestsGenerator → Closeable → AutoCloseable
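Because the class implements Closeable, callers are expected to call close() once generation is finished. A minimal sketch of one way to guarantee that follows. RecordingGenerator is a hypothetical stand-in that only records component names so the snippet runs without Spark, and withResource is an illustrative helper, not part of this API.

```scala
import java.io.Closeable
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for UnitTestsGenerator: it only records component
// names. The real class takes DataFrames and writes test files under `path`.
final class RecordingGenerator(val path: String) extends Closeable {
  private var closed = false
  val generated: ArrayBuffer[String] = ArrayBuffer.empty[String]

  def generateUnitTests(name: String): Unit = {
    generated += name
  }

  override def close(): Unit = {
    closed = true
  }

  def isClosed: Boolean = closed
}

object CloseExample {
  // Illustrative helper: runs `body` and always closes the resource,
  // even if `body` throws.
  def withResource[R <: Closeable, A](resource: R)(body: R => A): A =
    try body(resource)
    finally resource.close()
}
```

With the real class, the same shape would wrap the generateUnitTests calls from the usage example at the top of this page, so that close() runs even when a Spark action fails.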
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
generateUnitTests(name: String, inputs: Seq[DataFrame], inputPorts: Seq[String], outputs: Seq[DataFrame], outputPorts: Seq[String]): Unit
Generates and writes the unit tests to the path (defined when constructing the Generator) based on the passed DataFrames.
- name
The name of the component (used for the file name and object call)
- inputs
All the DataFrames used as an input for the component
- inputPorts
The names of the ports for each input DataFrame
- outputs
All the DataFrames returned as an output for the component
- outputPorts
The names of the ports for each output DataFrame
-
def
generateUnitTests(name: String, inputs: Seq[DataFrame], outputs: Seq[DataFrame], limit: Option[Int] = Some(10)): Unit
Generates and writes the unit tests to the path (defined when constructing the generator) based on the passed DataFrames. This overload automatically assigns the port names (in, out1, out2, etc.) and applies the limit to each input and output DataFrame.
- name
The name of the component (used for the file name and object call)
- inputs
All the DataFrames used as an input for the component
- outputs
All the DataFrames returned as an output for the component
- limit
Optional limit on the number of sample input & output rows
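The defaults this overload applies can be pictured with plain collections. The sketch below is an assumption-laden illustration, not the library's implementation: guessPorts and applyLimit are hypothetical names, the naming rule is inferred only from the "in, out1, out2" example in the description, and None is assumed to mean "no limit".

```scala
object PortDefaults {
  // ASSUMPTION: a single DataFrame gets the bare prefix ("in"), while several
  // get numbered names ("out1", "out2"). The real naming rule may differ.
  def guessPorts(prefix: String, count: Int): Seq[String] =
    if (count == 1) Seq(prefix)
    else (1 to count).map(i => s"$prefix$i")

  // ASSUMPTION: None keeps every row; Some(n) keeps at most the first n rows.
  // On a Spark DataFrame the analogous call would be df.limit(n).
  def applyLimit[A](rows: Seq[A], limit: Option[Int]): Seq[A] =
    limit.fold(rows)(n => rows.take(n))
}
```

Under these assumptions, the Distribute call in the usage example (one input, two outputs) would receive the ports in, out1 and out2, and each DataFrame would be sampled down to the default of 10 rows.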
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
val
path: String
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
Deprecated Value Members
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] ) @Deprecated @deprecated
- Deprecated
(Since version ) see corresponding Javadoc for more information.