java.lang.Object gate.corpora.DocumentContentImpl
public class DocumentContentImpl
Represents the commonalities between all sorts of document contents.
Constructor Summary | |
---|---|
DocumentContentImpl()
Default construction |
|
DocumentContentImpl(String s)
For ranges |
|
DocumentContentImpl(URL u,
String encoding,
Long start,
Long end)
Construction from URL and offsets. |
Method Summary | |
---|---|
boolean |
equals(Object other)
Two documents are the same if their contents are the same |
DocumentContent |
getContent(Long start,
Long end)
Return the contents under a particular span. |
String |
getOriginalContent()
Return the original content of the document received during the loading phase or on construction from string. |
int |
hashCode()
Calculate the hash value for the object. |
Long |
size()
The size of this content (e.g. character length for textual content). |
String |
toString()
Returns the String representing the content in case of a textual document. |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public DocumentContentImpl()
public DocumentContentImpl(URL u, String encoding, Long start, Long end) throws IOException
IOException
public DocumentContentImpl(String s)
Method Detail |
---|
public DocumentContent getContent(Long start, Long end) throws InvalidOffsetException
DocumentContent
Conceptually the annotation offsets are defined as falling in between characters, with "0" pointing before the first character. Because of that, the offset where an annotation ends and the offset where the space after it starts are the same.
So this is what the "abcde" string looks like with the offsets explicitly included: 0a1b2c3d4e5
"ab cd" would then look like this: 0a1b2 3c4d5
with the following annotations:
Token "ab" [0,2]
SpaceToken " " [2,3]
Token "cd" [3,5]
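The offset convention above can be tried out on a plain Java string. The following is only a minimal sketch, not the GATE implementation: the class `OffsetSpanSketch` and its `span` helper are hypothetical names introduced for illustration, and `IllegalArgumentException` stands in for GATE's `InvalidOffsetException` so that the example compiles without the GATE library. It assumes that `start` equal to `end` is allowed and yields an empty span, since only `start` larger than `end` is documented as an error.

```java
public class OffsetSpanSketch {

    /**
     * Returns the text between two in-between-character offsets.
     * start is inclusive, end is exclusive (hypothetical helper, not GATE API).
     */
    public static String span(String text, long start, long end) {
        // Reject negative start, end past the text length, or start after end.
        // IllegalArgumentException is a stand-in for InvalidOffsetException.
        if (start < 0 || end > text.length() || start > end) {
            throw new IllegalArgumentException(
                "Invalid offsets [" + start + "," + end + "] for length " + text.length());
        }
        return text.substring((int) start, (int) end);
    }

    public static void main(String[] args) {
        String text = "ab cd"; // offsets: 0a1b2 3c4d5
        System.out.println("[" + span(text, 0, 2) + "]"); // Token "ab"
        System.out.println("[" + span(text, 2, 3) + "]"); // SpaceToken " "
        System.out.println("[" + span(text, 3, 5) + "]"); // Token "cd"
        System.out.println("[" + span(text, 2, 2) + "]"); // empty span
    }
}
```

Because the end of one annotation and the start of the next share an offset, concatenating the spans [0,2], [2,3] and [3,5] reproduces the whole string with no gap and no overlap.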
getContent
in interface DocumentContent
start - the beginning index, inclusive.
end - the ending index, exclusive.
InvalidOffsetException - if start is negative, or end is larger than the length of this DocumentContent object, or start is larger than end.
public String toString()
toString
in class Object
public Long size()
size
in interface DocumentContent
public boolean equals(Object other)
equals
in class Object
public int hashCode()
hashCode
in class Object
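The method summary states that two documents are the same if their contents are the same, and the general `java.lang.Object` contract requires equal objects to return equal hash codes. The following is a minimal sketch of content-based equality under those two constraints. `TextContentSketch` is a hypothetical class made up for illustration; it is not how `DocumentContentImpl` is necessarily implemented.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public final class TextContentSketch {
    private final String content;

    public TextContentSketch(String content) {
        this.content = Objects.requireNonNull(content);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof TextContentSketch)) return false;
        // Equality is decided by the text alone.
        return content.equals(((TextContentSketch) other).content);
    }

    @Override
    public int hashCode() {
        // Derived from the same field as equals, so equal objects hash alike.
        return content.hashCode();
    }

    public static void main(String[] args) {
        TextContentSketch a = new TextContentSketch("ab cd");
        TextContentSketch b = new TextContentSketch("ab cd");
        Set<TextContentSketch> seen = new HashSet<>();
        seen.add(a);
        // b is a distinct instance with the same text, so the set already contains it.
        System.out.println(a.equals(b) + " " + seen.contains(b));
    }
}
```

Deriving both methods from the same field is what lets content objects behave predictably as keys in hash-based collections.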
public String getOriginalContent()