An example of building a spark.ml classification model to predict the newsgroup of
articles from the 20 newsgroups data (see http://qwone.com/~jason/20Newsgroups/)
hosted in a Solr collection.
Prerequisites
Before running this example, run mvn -DskipTests package in the spark-solr project,
download a Spark 1.6.1 binary distribution, and point the environment variable
$SPARK_HOME to the unpacked distribution directory.
Then follow the instructions in the NewsgroupsIndexer example's scaladoc to populate a
Solr collection with articles from the above-linked 20 newsgroups data.
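The build and environment prerequisites can be sanity-checked before running the example. The following is a minimal sketch, not part of spark-solr: the helper name check_spark_home is hypothetical, and the distribution path in the commented usage is only a placeholder. It assumes nothing more than that a Spark binary distribution ships an executable bin/spark-submit.

```shell
# Hypothetical helper (not part of spark-solr): verifies that SPARK_HOME
# is set and points at a directory containing an executable bin/spark-submit.
check_spark_home() {
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
    echo "No executable bin/spark-submit under $SPARK_HOME" >&2
    return 1
  fi
  return 0
}

# Example usage (commented out; the path below is a placeholder):
#   mvn -DskipTests package            # run from the spark-solr project root
#   export SPARK_HOME=/path/to/unpacked/spark-1.6.1-distribution
#   check_spark_home && echo "Spark prerequisites look OK"
```

The check only confirms the directory layout; it does not verify the Spark version or that the spark-solr build succeeded.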
Example invocation
To see a description of all available options, run the following: