ContentExtractors

net.ruippeixotog.scalascraper.scraper.ContentExtractors

An object containing HtmlExtractor instances for extracting primitive data such as text, elements or attributes, as well as more complex information such as form data. Because they perform little to no navigation through the document, they are typically preceded by a CSS query that locates the data to be retrieved in the HTML document.
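
The pairing of a CSS query with an extractor can be sketched as follows. This is a minimal illustration, assuming the library's JsoupBrowser and the DSL imports are on the classpath; the sample markup and the object name CssQueryThenExtract are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object CssQueryThenExtract {
  // Hypothetical sample markup, used only for illustration.
  val html: String =
    """<html><body>
      |  <h1 id="header">Test page</h1>
      |  <p class="intro">First paragraph</p>
      |</body></html>""".stripMargin

  val browser = JsoupBrowser()
  val doc = browser.parseString(html)

  // The CSS query "#header" locates the node; `text` then extracts its content.
  val header: String = doc >> text("#header")
}
```

Here the query does the navigation and the extractor only reads the matched node, which is the division of labour described above.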

Attributes

Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def attr(attr: String): HtmlExtractor[Element, String]

An extractor for the value of an attribute of the first matched element.

Value parameters

attr

the attribute name to extract

Attributes

Returns

an extractor for an attribute of the first matched element.

def attrs(attr: String): HtmlExtractor[Element, Iterable[String]]

An extractor for a lazy iterable of the value of an attribute of each matched element.

Value parameters

attr

the attribute name to extract

Attributes

Returns

an extractor for a lazy iterable of the value of an attribute of each matched element.
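
The difference between attr and attrs can be sketched as follows. This is a minimal illustration, assuming a JsoupBrowser and the DSL imports; the markup and the object name AttrExample are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object AttrExample {
  // Hypothetical sample markup with two links.
  val html: String =
    """<div id="menu"><a href="/home">Home</a><a href="/about">About</a></div>"""

  val browser = JsoupBrowser()
  val doc = browser.parseString(html)

  // `attr` reads the attribute from the first matched element only.
  val firstHref: String = doc >> attr("href")("#menu a")

  // `attrs` yields one value per matched element; the iterable is lazy,
  // so it is forced into a List here.
  val allHrefs: List[String] = (doc >> attrs("href")("#menu a")).toList
}
```

Forcing the result with toList is a choice made for the example; the iterable can also be consumed incrementally.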

Concrete fields

val allText: HtmlExtractor[Element, String]

An extractor for the text in all matched elements.

val element: HtmlExtractor[Element, Element]

An extractor for the first element matched.

val elementList: HtmlExtractor[Element, List[Element]]

An extractor for a list of the matched elements.

val elements: HtmlExtractor[Element, ElementQuery[Element]]

An extractor for an ElementQuery with the matched elements.
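
The three element extractors above can be compared side by side. This is a rough sketch, assuming they are exposed under the names element, elementList and elements (mirroring pElement, pElementList and pElements) and that ElementQuery can be traversed like an ordinary collection; the markup and the object name ElementExtractorsExample are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object ElementExtractorsExample {
  val browser = JsoupBrowser()
  // Hypothetical sample markup with two list items.
  val doc = browser.parseString("""<ul id="menu"><li>Home</li><li>About</li></ul>""")

  // First matched element only.
  val first = doc >> element("#menu li")
  // All matched elements as a strict List.
  val asList = doc >> elementList("#menu li")
  // All matched elements as an ElementQuery (assumed to be Iterable).
  val asQuery = doc >> elements("#menu li")

  val firstText: String = first.text
  val allTexts: List[String] = asList.map(_.text)
  val matchCount: Int = asQuery.size
}
```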

val formData: HtmlExtractor[Element, Map[String, String]]

An extractor for the form data present in the matched elements.

val formDataAndAction: HtmlExtractor[Element, (Map[String, String], String)]

An extractor for the form data present in the matched elements, together with the submission URL in the form.
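
Both form extractors can be tried on a small form. This is a sketch under assumptions: the CSS query is taken to match the form element itself, and the markup, field names and the object name FormExample are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object FormExample {
  // Hypothetical login form with two named inputs.
  val html: String =
    """<form id="login" action="/login">
      |  <input type="text" name="user" value="alice">
      |  <input type="hidden" name="token" value="abc123">
      |</form>""".stripMargin

  val browser = JsoupBrowser()
  val doc = browser.parseString(html)

  // Field names mapped to their current values.
  val fields: Map[String, String] = doc >> formData("#login")

  // The same map, paired with the form's submission URL.
  val (fieldsAgain, action) = doc >> formDataAndAction("#login")
}
```

The pair returned by formDataAndAction is convenient when the extracted fields are to be posted back to the form's own action URL.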

val pElement: PolyHtmlExtractor { type Out = [E] =>> E; }

An extractor for the first element matched. It retains the concrete type of the elements being extracted.

val pElementList: PolyHtmlExtractor { type Out = List; }

An extractor for a list of the matched elements. It retains the concrete type of the elements being extracted.

val pElements: PolyHtmlExtractor { type Out = [E <: Element] =>> ElementQuery[E]; }

An extractor for an ElementQuery with the matched elements. It retains the concrete type of the elements being extracted.

val table: HtmlExtractor[Element, Vector[Vector[Element]]]

An extractor for the cells of an HTML table.

Cells spanning multiple rows or columns are repeated in each of the positions they occupy. As such, well-formed rectangular tables always result in a Vector of Vectors with identical sizes.

Rows in thead elements are always presented first, while rows inside tfoot elements are always at the end.
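
The repetition of spanning cells can be illustrated with a small table. This is a sketch under assumptions: the CSS query is taken to match the table element itself, and the markup and the object name TableExample are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object TableExample {
  // Hypothetical table: one header row, one row with a cell spanning
  // both columns, and one ordinary row.
  val html: String =
    """<table id="t">
      |  <thead><tr><th>Name</th><th>Qty</th></tr></thead>
      |  <tbody>
      |    <tr><td colspan="2">Fruit</td></tr>
      |    <tr><td>Apple</td><td>3</td></tr>
      |  </tbody>
      |</table>""".stripMargin

  val browser = JsoupBrowser()
  val doc = browser.parseString(html)

  // Rows of cell elements; a spanning cell appears once per position it covers.
  val cells = doc >> table("#t")

  // Convert each cell to its text to get a plain grid of strings.
  val grid: Vector[Vector[String]] = cells.map(_.map(_.text))
}
```

Because the colspan cell is repeated, the row holding "Fruit" has the same width as the other rows, which keeps column indices aligned across the grid.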

val text: HtmlExtractor[Element, String]

An extractor for the text in the first element matched.

val texts: HtmlExtractor[Element, Iterable[String]]

An extractor for a lazy iterable of the text in each element matched.
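
The three text extractors (text, texts and allText) differ only in how many matches they read. This is a minimal sketch, assuming a JsoupBrowser and the DSL imports; the markup and the object name TextExample are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object TextExample {
  val browser = JsoupBrowser()
  // Hypothetical sample markup with two list items.
  val doc = browser.parseString("""<ul id="items"><li>one</li><li>two</li></ul>""")

  // Text of the first matched element only.
  val first: String = doc >> text("#items li")

  // One string per matched element; forced into a List because the iterable is lazy.
  val each: List[String] = (doc >> texts("#items li")).toList

  // A single string built from the text of every matched element.
  val combined: String = doc >> allText("#items li")
}
```

The sketch makes no assumption about how allText joins the individual pieces, so code that needs a specific separator may prefer texts followed by an explicit mkString.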
