package breeze.text.transform

Type Members

  1. case class MinimumLengthFilter(minLength: Int) extends Transformer with Product with Serializable

    Filters out tokens composed of fewer than minLength characters.

  2. class RemoveRareWords extends AnyRef

    Filter that removes rare words, that is, words that occur in fewer than threshold documents. Syntax: new RemoveRareWords(10) apply (data)

  3. case class StopWordFilter(language: String) extends Transformer with Product with Serializable

    Filter that removes stop words.

  4. sealed trait TokenType extends AnyRef

    An enumeration over token types (see the inner objects of the TokenType companion object), based on regex patterns originally defined by Steven Bethard.

  5. trait Transformer extends (Iterable[String]) ⇒ Iterable[String] with Serializable

    A generic (loadable) transformation of a tokenized input text.

  6. case class WordsAndNumbersOnlyFilter() extends Transformer with Product with Serializable

    A filter that only accepts word and number tokens.
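
The signatures above suggest that each filter is an ordinary function from a token sequence to a token sequence. The following is a minimal sketch of that idea, not the breeze implementation: SketchTransformer, SketchMinimumLengthFilter, and removeRareWords are hypothetical stand-ins defined locally so the example runs without the library, and the document-frequency logic is an assumption based only on the one-line descriptions above.

```scala
object TransformSketch {
  // Hypothetical stand-in mirroring the listed Transformer signature.
  trait SketchTransformer
      extends (Iterable[String] => Iterable[String])
      with Serializable

  // Hypothetical stand-in: keeps tokens with at least minLength characters,
  // i.e. filters out tokens composed of fewer than minLength characters.
  case class SketchMinimumLengthFilter(minLength: Int) extends SketchTransformer {
    override def apply(tokens: Iterable[String]): Iterable[String] =
      tokens.filter(_.length >= minLength)
  }

  // Hypothetical helper: removes words that occur in fewer than
  // `threshold` documents (assumed meaning of "rare").
  def removeRareWords(threshold: Int)(
      docs: Seq[Iterable[String]]): Seq[Iterable[String]] = {
    // Document frequency: each word is counted once per document.
    val docFreq: Map[String, Int] =
      docs
        .flatMap(_.toSeq.distinct)
        .groupBy(identity)
        .map { case (word, occurrences) => (word, occurrences.size) }
    docs.map(_.filter(word => docFreq.getOrElse(word, 0) >= threshold))
  }

  // Two tiny tokenized documents used as sample input.
  val docs: Seq[Iterable[String]] =
    Seq(Seq("a", "cat", "sat"), Seq("the", "cat", "ran"))

  // Apply the per-document filter first, then the corpus-level filter.
  val longEnough: Seq[Iterable[String]] = docs.map(SketchMinimumLengthFilter(3))
  val common: Seq[Iterable[String]] = removeRareWords(2)(longEnough)
}
```

In this sketch the length filter works on one document at a time, while rare-word removal needs the whole collection to count document frequencies. That difference may be why RemoveRareWords is listed as a plain class rather than as a Transformer.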

Value Members

  1. object StopWordFilter extends Serializable

  2. object TokenType
