Package it.unimi.dsi.parser.callback
Class LinkExtractor
java.lang.Object
it.unimi.dsi.parser.callback.DefaultCallback
it.unimi.dsi.parser.callback.LinkExtractor
- All Implemented Interfaces:
Callback
public class LinkExtractor extends DefaultCallback
-
Field Summary
-
Constructor Summary
Constructors
Constructor | Description
LinkExtractor() |
-
Method Summary
Modifier and Type | Method | Description
String | base() | Returns the URL specified by the BASE element.
void | configure(BulletParser parser) | Configure the parser to parse elements and certain attributes.
String | metaLocation() | Returns the URL specified by META HTTP-EQUIV elements of location type.
String | metaRefresh() | Returns the URL specified by META HTTP-EQUIV elements of refresh type.
void | startDocument() | Receive notification of the beginning of the document.
boolean | startElement(Element element, Map<Attribute,MutableString> attrMap) | Receive notification of the start of an element.
Methods inherited from class it.unimi.dsi.parser.callback.DefaultCallback
cdata, characters, endDocument, endElement, getInstance
-
Field Details
-
urls
The URLs resulting from the parsing process.
-
-
Constructor Details
-
LinkExtractor
public LinkExtractor()
-
-
Method Details
-
configure
Configure the parser to parse elements and certain attributes. The required attributes are SRC, HREF, HTTP-EQUIV, and CONTENT.
- Specified by:
configure in interface Callback
- Overrides:
configure in class DefaultCallback
-
startDocument
public void startDocument()
Description copied from interface: Callback
Receive notification of the beginning of the document. The callback must use this method to reset its internal state so that it can be reused. It must be safe to invoke this method several times.
- Specified by:
startDocument in interface Callback
- Overrides:
startDocument in class DefaultCallback
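The reset contract described above can be illustrated with a minimal, library-independent sketch. The class ResettableCollector and its helper methods are hypothetical stand-ins, not part of it.unimi.dsi.parser.callback; only the idea (clear all per-document state, and tolerate repeated calls) comes from the description.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a reusable callback; not the library's class.
class ResettableCollector {
    // Collected URLs, in encounter order and without duplicates.
    final Set<String> urls = new LinkedHashSet<>();
    // First BASE URL seen in the current document, or null.
    private String base;

    // Clears all per-document state; calling it several times in a row is harmless.
    void startDocument() {
        urls.clear();
        base = null;
    }

    void addUrl(String url) {
        urls.add(url);
    }

    // Keeps only the first value offered.
    void setBaseIfAbsent(String url) {
        if (base == null) base = url;
    }

    String base() {
        return base;
    }
}
```

Because the reset only clears state and never depends on what was there before, the same instance can be handed to a parser for one document after another.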
-
startElement
Description copied from interface: Callback
Receive notification of the start of an element. For simple elements, this is the only notification that the callback will ever receive.
- Specified by:
startElement in interface Callback
- Overrides:
startElement in class DefaultCallback
- Parameters:
element - the element whose opening tag was found.
attrMap - a map from Attributes to MutableStrings.
- Returns:
true to keep the parser parsing, false to stop it.
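To show how a boolean return value can steer a parse loop, here is a small stand-alone sketch. StopSignalSketch and drive are hypothetical names, and the loop is an assumption about how a driver might honour the return value; it is not the parser's actual implementation.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical driver loop; the real parser's internals are not shown in this page.
class StopSignalSketch {
    // Feeds element names to the handler until it returns false.
    // Returns how many elements were delivered, including the one that stopped the loop.
    static int drive(List<String> elements, Predicate<String> handler) {
        int delivered = 0;
        for (String element : elements) {
            delivered++;
            if (!handler.test(element)) break; // false means "stop parsing"
        }
        return delivered;
    }
}
```

A handler that returns false on the first element it cares about lets the caller skip the rest of the document.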
-
metaLocation
Returns the URL specified by META HTTP-EQUIV elements of location type. More precisely, this method returns a non-null result iff there is at least one META HTTP-EQUIV element specifying a location URL (if there is more than one, we keep the first one).
- Returns:
the first URL specified by a META HTTP-EQUIV element of location type, or null.
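The "keep the first one" rule can be sketched without the library. FirstLocation and onMeta are hypothetical names; the case-insensitive match on "location" and the use of the CONTENT value as the URL are assumptions made for illustration.

```java
// Hypothetical holder illustrating "keep the first one"; not the library's code.
class FirstLocation {
    // Stays null until a location value is seen.
    private String location;

    // Records the CONTENT value only for HTTP-EQUIV "location" (assumed case-insensitive),
    // and only the first time such an element is seen.
    void onMeta(String httpEquiv, String content) {
        if (location == null && content != null && "location".equalsIgnoreCase(httpEquiv)) {
            location = content;
        }
    }

    // Non-null iff at least one location-type element was offered.
    String metaLocation() {
        return location;
    }
}
```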
-
base
Returns the URL specified by the BASE element. More precisely, this method returns a non-null result iff there is at least one BASE element specifying a derelativisation URL (if there is more than one, we keep the first one).
- Returns:
the first URL specified by a BASE element, or null.
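A caller would typically use the value returned by base() to derelativise the extracted links. The following sketch uses only java.net.URI; BaseResolutionSketch and resolve are hypothetical names, and falling back to the page URL when no BASE element is present is an assumption about caller behaviour, not something this class is documented to do.

```java
import java.net.URI;

// Hypothetical helper showing derelativisation against a BASE URL.
class BaseResolutionSketch {
    // Resolves a possibly relative link against the BASE URL if one was found,
    // otherwise against the URL of the page itself.
    static String resolve(String pageUrl, String base, String link) {
        URI context = URI.create(base != null ? base : pageUrl);
        return context.resolve(link).toString();
    }
}
```

Links that are already absolute pass through unchanged, since URI.resolve returns an absolute argument as is.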
-
metaRefresh
Returns the URL specified by META HTTP-EQUIV elements of refresh type. More precisely, this method returns a non-null result iff there is at least one META HTTP-EQUIV element specifying a refresh URL (if there is more than one, we keep the first one).
- Returns:
the first URL specified by a META HTTP-EQUIV element of refresh type, or null.
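Refresh-type elements conventionally carry a CONTENT value of the form "seconds; URL=target". As a rough illustration of pulling the target out of such a value, here is a stand-alone sketch. RefreshContentSketch and refreshUrl are hypothetical names, and the exact matching rules used by LinkExtractor are not stated on this page, so the case-insensitive search for "URL=" is an assumption.

```java
// Hypothetical parser for the CONTENT value of a refresh-type META element.
class RefreshContentSketch {
    // Returns the trimmed text following the first "URL=" (matched case-insensitively),
    // or null if the value carries no URL part.
    static String refreshUrl(String content) {
        if (content == null) return null;
        for (int i = 0; i + 4 <= content.length(); i++) {
            if (content.regionMatches(true, i, "URL=", 0, 4)) {
                return content.substring(i + 4).trim();
            }
        }
        return null;
    }
}
```

A value such as "30", which only sets a delay, yields null under this sketch, mirroring the documented case where no refresh URL is specified.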
-