shortcut name for command-line arguments, after extraction of the hadoop and scoobi ones
make a DList runnable, executing the computation and returning the values
make a DObject runnable, executing the computation and returning the values
Get the contents of a text file.
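The DList and text-file operations above might be combined as in the following sketch, assuming the standard Scoobi import and placeholder file paths (this is illustrative, not a verbatim Scoobi example):

```scala
import com.nicta.scoobi.Scoobi._

// A minimal ScoobiApp sketch: read a text file, transform it, and
// persist the result. The input/output paths are placeholders.
object WordLengths extends ScoobiApp {
  def run() {
    // fromTextFile yields a DList[String], one element per line
    val lines: DList[String] = fromTextFile("input/lines.txt")
    // a distributed, lazily-evaluated transformation
    val lengths: DList[Int] = lines.map(_.length)
    // persisting executes the computation and writes the output
    persist(toTextFile(lengths, "output/lengths"))
  }
}
```

Nothing runs until `persist` is called; the `map` only builds the computation graph.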
the categories to show when logging, as a regular expression
the classes directories to include on a job classpath
set the configuration so that the next job is run on the cluster - this is the default
store the value of the configuration in a lazy val, so that it can be updated and still be referenced
set command-line arguments on the configuration object
a configuration with cluster setup
a configuration with memory setup
a configuration with local setup
a configuration where the appropriate properties are set up for uploaded jars: distributed files + classpath
delete the remote jars currently on the cluster
true if the libjars must be deleted before the Scoobi job runs
a function to display execution times. The default uses log messages
execute some code locally
execute some code on the cluster, setting the filesystem / jobtracker addresses and setting up the classpath
execute some code locally
the filesystem address
execute some code in memory, using a collection backend, possibly showing execution times
true if you want to include the library jars in the jar that is sent to the cluster for each job
set the configuration so that the next job is run in memory
true if the cluster argument is specified and the local argument is not
alias for locally
the list of library jars to upload
the jobtracker address
false if temporary files and the working directory must be cleaned up after job execution
the log level to use when logging
the path of the directory to use when loading jars to the filesystem.
set the configuration so that the next job is run locally
the execution is local if the file system is local, as determined by the configuration files loaded by the hadoop script or if "local" is passed on the command line.
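The decision described above can be sketched as follows; this is a hypothetical condition written for illustration, not Scoobi's actual source:

```scala
// Hypothetical sketch of the "is this a local execution?" decision:
// local if the configured default filesystem is the local one, or if
// "local" was passed among the scoobi command-line arguments.
def isLocalExecution(fsAddress: String, scoobiArgs: Seq[String]): Boolean =
  fsAddress.startsWith("file:") || scoobiArgs.contains("local")
```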
if locally returns true then we might attempt to upload the dependent jars to the cluster and to add them to the classpath
List a path.
parse the command-line arguments and:
true if the main jar contains all the dependencies for this application. By default this is delegated to the Classes trait, which looks for the presence of a scoobi_* jar or for com/nicta/scoobi jar entries in the main jar
false if libjars are used
execute some code on the cluster, possibly showing the execution time
execute some code, either locally or on the cluster, depending on the local argument being passed on the commandline
execute some code locally, possibly showing execution times
Persisting
allows calling list.persist
allows calling object.persist
true to suppress log messages
this method must be overridden to define the code to be executed
run a list.
This is equivalent to:

    val obj = list.materialise
    run(obj)
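The equivalence described above might be written out as in this sketch, assuming the in-memory `DList(...)` constructor from the standard Scoobi import (placeholder data, for illustration only):

```scala
import com.nicta.scoobi.Scoobi._

object RunEquivalence extends ScoobiApp {
  def run() {
    val list: DList[Int] = DList(1, 2, 3)

    // running the list directly executes the computation
    // and returns the values...
    val values: Seq[Int] = run(list)

    // ...which is equivalent to materialising the list into a
    // DObject first and running that
    val obj: DObject[Seq[Int]] = list.materialise
    val sameValues: Seq[Int] = run(obj)
  }
}
```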
the result of the in-memory run
the cluster evaluation of t
the result of the local run
this provides the arguments which are parsed to change the behavior of the Scoobi app: logging, local/cluster, ...
ScoobiUserArgs
set the default configuration, depending on the arguments
Static setup to use a testing log factory
true if the debug logs must show the computation graph
measure the time taken by some executed code and display the time with a specific display function
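A self-contained sketch of the timing idea: measure the elapsed time of a block of code and hand it to a pluggable display function, defaulting to a log-style message. This is a hypothetical helper written for illustration, not Scoobi's actual implementation:

```scala
// Hypothetical timing helper: runs a block of code, measures the
// elapsed wall-clock time, and reports it through a display function.
object Timing {
  def showTime[A](display: Long => Unit =
                    ms => println(s"[INFO] executed in ${ms}ms"))(code: => A): A = {
    val start  = System.currentTimeMillis
    val result = code                                  // run the measured code
    display(System.currentTimeMillis - start)          // report the duration
    result
  }
}

// usage: Timing.showTime() { expensiveComputation() }
```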
true to display execution times for each job
upload the jars unless 'nolibjars' has been set on the command-line
upload the jars which don't exist yet in the library directory on the cluster
the remote jars currently on the cluster
true if cluster configuration must be loaded from Hadoop's configuration directory
the time for the execution of a piece of code
A REPL for Scoobi.
Run the 'scoobi' script, which will bring you into the Scala REPL
You're now good to go!!