Interface TextCorpus

    • Method Detail

      • getLanguage

        String getLanguage()
        Gets the language of the text/tokens in this TextCorpus.
        Returns:
        language of TextCorpus.
      • getLayers

        List<TextCorpusLayer> getLayers()
        Gets all annotation layers of this TextCorpus.
        Returns:
        annotation layers.
      • getTextLayer

        TextLayer getTextLayer()
        Gets text layer of this TextCorpus.
        Returns:
        annotation layer containing text.
      • createTextLayer

        TextLayer createTextLayer()
        Creates empty TextLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • getTokensLayer

        TokensLayer getTokensLayer()
        Gets tokens layer of this TextCorpus.
        Returns:
        annotation layer containing tokens.
      • createTokensLayer

        TokensLayer createTokensLayer()
        Creates empty TokensLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • createTokensLayer

        TokensLayer createTokensLayer​(boolean hasCharOffsets)
        Creates empty TokensLayer in this TextCorpus.
        Parameters:
        hasCharOffsets - true if the Token objects in this TokensLayer will contain their character offsets in the text, false otherwise.
        Returns:
        annotation layer that has been created.
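
        To illustrate what the hasCharOffsets flag refers to, the following minimal sketch computes tokens together with their character offsets in the text. The OffsetToken class and the whitespace tokenizer are hypothetical stand-ins, not part of the TextCorpus API; they only show the kind of information a token with character offsets would carry.

```java
import java.util.ArrayList;
import java.util.List;

class TokenOffsetsSketch {

    // Hypothetical stand-in for a token that carries character offsets;
    // this is NOT the library's Token type.
    static class OffsetToken {
        final String string;
        final int start; // index of the first character in the text (inclusive)
        final int end;   // index just past the last character (exclusive)

        OffsetToken(String string, int start, int end) {
            this.string = string;
            this.start = start;
            this.end = end;
        }
    }

    // Splits the text on whitespace and records where each token sits in it.
    static List<OffsetToken> tokenize(String text) {
        List<OffsetToken> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip any whitespace before the next token.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the token itself.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new OffsetToken(text.substring(start, i), start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (OffsetToken t : tokenize("Peter ate  pizza")) {
            System.out.println(t.string + " [" + t.start + ", " + t.end + ")");
        }
    }
}
```

        With offsets recorded this way, text.substring(start, end) recovers the token string, which is what makes offsets useful for mapping annotations back to the original text.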
      • getLemmasLayer

        LemmasLayer getLemmasLayer()
        Gets lemmas layer of this TextCorpus.
        Returns:
        layer containing lemma annotations on Token objects from TokensLayer.
      • createLemmasLayer

        LemmasLayer createLemmasLayer()
        Creates empty LemmasLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • getPosTagsLayer

        PosTagsLayer getPosTagsLayer()
        Gets part-of-speech layer of this TextCorpus.
        Returns:
        layer containing part-of-speech annotations on Token objects from TokensLayer.
      • createPosTagsLayer

        PosTagsLayer createPosTagsLayer​(String tagset)
        Creates empty PosTagsLayer with the given tagset in this TextCorpus.
        Parameters:
        tagset - of the part-of-speech annotations.
        Returns:
        annotation layer that has been created.
      • getTopologicalFieldsLayer

        TopologicalFieldsLayer getTopologicalFieldsLayer()
        Gets topological fields layer of this TextCorpus.
        Returns:
        layer containing topological field annotations on Token objects from TokensLayer.
      • createTopologicalFieldsLayer

        TopologicalFieldsLayer createTopologicalFieldsLayer​(String tagset)
        Creates empty TopologicalFieldsLayer with the given tagset in this TextCorpus.
        Parameters:
        tagset - of the topological fields.
        Returns:
        annotation layer that has been created.
      • getSentencesLayer

        SentencesLayer getSentencesLayer()
        Gets sentences layer of this TextCorpus.
        Returns:
        layer containing sentence boundary annotations on Token objects from TokensLayer.
      • createSentencesLayer

        SentencesLayer createSentencesLayer()
        Creates empty SentencesLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • createSentencesLayer

        SentencesLayer createSentencesLayer​(boolean hasCharOffsets)
        Creates empty SentencesLayer in this TextCorpus.
        Parameters:
        hasCharOffsets - true if the Sentence objects in this SentencesLayer will contain their character offsets in the text, false otherwise.
        Returns:
        annotation layer that has been created.
      • getConstituentParsingLayer

        ConstituentParsingLayer getConstituentParsingLayer()
        Gets constituent parsing layer of this TextCorpus.
        Returns:
        layer containing constituent parsing annotations on Token objects from TokensLayer.
      • createConstituentParsingLayer

        ConstituentParsingLayer createConstituentParsingLayer​(String tagset)
        Creates empty ConstituentParsingLayer with the given tagset in this TextCorpus.
        Parameters:
        tagset - of the parsing annotations.
        Returns:
        annotation layer that has been created.
      • getDependencyParsingLayer

        DependencyParsingLayer getDependencyParsingLayer()
        Gets dependency parsing layer of this TextCorpus.
        Returns:
        layer containing dependency parsing annotations on Token objects from TokensLayer.
      • createDependencyParsingLayer

        DependencyParsingLayer createDependencyParsingLayer​(boolean multipleGovernorsPossible,
                                                            boolean emptyTokensPossible)
        Creates empty DependencyParsingLayer in this TextCorpus.
        Parameters:
        multipleGovernorsPossible - true if a dependent can be governed by more than 1 governor, false otherwise.
        emptyTokensPossible - true if dependency annotations can contain empty tokens.
        Returns:
        annotation layer that has been created.
      • createDependencyParsingLayer

        DependencyParsingLayer createDependencyParsingLayer​(String tagset,
                                                            boolean multipleGovernorsPossible,
                                                            boolean emptyTokensPossible)
        Creates empty DependencyParsingLayer with the given tagset in this TextCorpus.
        Parameters:
        tagset - of the functions between dependent and governor.
        multipleGovernorsPossible - true if a dependent can be governed by more than 1 governor, false otherwise.
        emptyTokensPossible - true if dependency annotations can contain empty tokens.
        Returns:
        annotation layer that has been created.
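
        The multipleGovernorsPossible flag distinguishes tree-like analyses, where every dependent has at most one governor, from graph-like ones. The sketch below checks that property on a list of dependency edges. The Edge class, the token ID strings and the hasSingleGovernors helper are assumptions made for illustration and are not part of the TextCorpus API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GovernorCheckSketch {

    // Hypothetical stand-in for one dependency relation, identified by token IDs.
    static class Edge {
        final String dependentId;
        final String governorId;

        Edge(String dependentId, String governorId) {
            this.dependentId = dependentId;
            this.governorId = governorId;
        }
    }

    // Returns true if no dependent is attached to two different governors.
    // A repeated identical edge does not count as a second governor.
    static boolean hasSingleGovernors(List<Edge> edges) {
        Map<String, String> governorOf = new HashMap<>();
        for (Edge e : edges) {
            String previous = governorOf.putIfAbsent(e.dependentId, e.governorId);
            if (previous != null && !previous.equals(e.governorId)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Edge> tree = Arrays.asList(new Edge("t1", "t2"), new Edge("t3", "t2"));
        List<Edge> graph = Arrays.asList(new Edge("t1", "t2"), new Edge("t1", "t3"));
        System.out.println("tree has single governors:  " + hasSingleGovernors(tree));
        System.out.println("graph has single governors: " + hasSingleGovernors(graph));
    }
}
```

        Data for which such a check fails would presumably call for a layer created with multipleGovernorsPossible set to true.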
      • getMorphologyLayer

        MorphologyLayer getMorphologyLayer()
        Gets morphology layer of this TextCorpus.
        Returns:
        layer containing morphological analysis annotations on Token objects from TokensLayer.
      • createMorphologyLayer

        MorphologyLayer createMorphologyLayer()
        Creates empty MorphologyLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • createMorphologyLayer

        MorphologyLayer createMorphologyLayer​(String tagset)
        Creates empty MorphologyLayer in this TextCorpus.
        Parameters:
        tagset - of the morphology annotations.
        Returns:
        annotation layer that has been created.
      • createMorphologyLayer

        MorphologyLayer createMorphologyLayer​(boolean hasSegmentation)
        Creates empty MorphologyLayer in this TextCorpus.
        Parameters:
        hasSegmentation - true if morphology annotations contain segmentation analysis.
        Returns:
        annotation layer that has been created.
      • createMorphologyLayer

        MorphologyLayer createMorphologyLayer​(String tagset,
                                              boolean hasSegmentation)
        Creates empty MorphologyLayer in this TextCorpus.
        Parameters:
        tagset - of the morphology annotations.
        hasSegmentation - true if morphology annotations contain segmentation analysis.
        Returns:
        annotation layer that has been created.
      • createMorphologyLayer

        MorphologyLayer createMorphologyLayer​(boolean hasSegmentation,
                                              boolean hasCharOffsets)
        Creates empty MorphologyLayer in this TextCorpus.
        Parameters:
        hasSegmentation - true if morphology annotations contain segmentation analysis.
        hasCharOffsets - true if the MorphologyAnalysis objects in this layer will contain character offsets for the segmentation within the token, false otherwise.
        Returns:
        annotation layer that has been created.
      • createMorphologyLayer

        MorphologyLayer createMorphologyLayer​(String tagset,
                                              boolean hasSegmentation,
                                              boolean hasCharOffsets)
        Creates empty MorphologyLayer in this TextCorpus.
        Parameters:
        tagset - of the morphology annotations.
        hasSegmentation - true if morphology annotations contain segmentation analysis.
        hasCharOffsets - true if the MorphologyAnalysis objects in this layer will contain character offsets for the segmentation within the token, false otherwise.
        Returns:
        annotation layer that has been created.
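
        The six createMorphologyLayer overloads accept different subsets of the same three options. One common way to organise such a family is to let the shorter overloads delegate to the fullest one, as sketched below. The Options class and the static create methods are hypothetical stand-ins, and the defaults used for omitted arguments (null tagset, false flags) are assumptions; the documentation above does not state what an implementation of TextCorpus actually uses.

```java
class MorphologyOverloadsSketch {

    // Hypothetical record of the options a morphology layer was created with;
    // this is NOT the library's MorphologyLayer type.
    static class Options {
        final String tagset;
        final boolean hasSegmentation;
        final boolean hasCharOffsets;

        Options(String tagset, boolean hasSegmentation, boolean hasCharOffsets) {
            this.tagset = tagset;
            this.hasSegmentation = hasSegmentation;
            this.hasCharOffsets = hasCharOffsets;
        }
    }

    // Each shorter overload fills in assumed defaults and delegates.
    static Options create() {
        return create(null, false, false);
    }

    static Options create(String tagset) {
        return create(tagset, false, false);
    }

    static Options create(boolean hasSegmentation) {
        return create(null, hasSegmentation, false);
    }

    static Options create(String tagset, boolean hasSegmentation) {
        return create(tagset, hasSegmentation, false);
    }

    static Options create(boolean hasSegmentation, boolean hasCharOffsets) {
        return create(null, hasSegmentation, hasCharOffsets);
    }

    // The fullest overload is the only one that builds the result.
    static Options create(String tagset, boolean hasSegmentation, boolean hasCharOffsets) {
        return new Options(tagset, hasSegmentation, hasCharOffsets);
    }

    public static void main(String[] args) {
        Options o = create("example-tagset", true);
        System.out.println(o.tagset + " " + o.hasSegmentation + " " + o.hasCharOffsets);
    }
}
```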
      • getNamedEntitiesLayer

        NamedEntitiesLayer getNamedEntitiesLayer()
        Gets named entities layer of this TextCorpus.
        Returns:
        layer containing named entity annotations on Token objects from TokensLayer.
      • createNamedEntitiesLayer

        NamedEntitiesLayer createNamedEntitiesLayer​(String entitiesType)
        Creates empty NamedEntitiesLayer with the given tagset for named entity types in this TextCorpus.
        Parameters:
        entitiesType - tagset of the named entity annotations.
        Returns:
        annotation layer that has been created.
      • createChunksLayer

        ChunksLayer createChunksLayer​(String entitiesType)
        Creates empty ChunksLayer with the given tagset for chunk types in this TextCorpus.
        Parameters:
        entitiesType - tagset of the chunk annotations.
        Returns:
        annotation layer that has been created.
      • getChunksLayer

        ChunksLayer getChunksLayer()
        Gets chunks layer of this TextCorpus.
        Returns:
        layer containing chunk annotations on Token objects from TokensLayer.
      • getReferencesLayer

        ReferencesLayer getReferencesLayer()
        Gets references layer of this TextCorpus.
        Returns:
        layer containing reference/coreference annotations on Token objects from TokensLayer.
      • createReferencesLayer

        ReferencesLayer createReferencesLayer​(String typetagset,
                                              String reltagset,
                                              String externalReferencesSource)
        Creates empty ReferencesLayer in this TextCorpus, ready to be filled with reference data.
        Parameters:
        typetagset - tagset for the mention type values of the references (should be null if no types are defined).
        reltagset - tagset for relation values between the references (should be null if no relations are defined).
        externalReferencesSource - name of the external source (should be null if entities from an external source are not referenced).
        Returns:
        annotation layer that has been created.
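
        Each of the three createReferencesLayer arguments may be null to signal that the corresponding information is absent. The sketch below shows that convention with a hypothetical ReferencesSettings holder; the class, its helper methods and the placeholder tagset name are assumptions for illustration and do not reproduce the library's ReferencesLayer.

```java
class ReferencesArgumentsSketch {

    // Hypothetical holder mirroring the three documented arguments.
    static class ReferencesSettings {
        final String typetagset;
        final String reltagset;
        final String externalReferencesSource;

        ReferencesSettings(String typetagset, String reltagset, String externalReferencesSource) {
            this.typetagset = typetagset;
            this.reltagset = reltagset;
            this.externalReferencesSource = externalReferencesSource;
        }

        // A null argument means the corresponding information is not defined.
        boolean hasTypes() {
            return typetagset != null;
        }

        boolean hasRelations() {
            return reltagset != null;
        }

        boolean hasExternalReferences() {
            return externalReferencesSource != null;
        }
    }

    public static void main(String[] args) {
        // Mention types are defined; relations and external references are not.
        ReferencesSettings s = new ReferencesSettings("example-types", null, null);
        System.out.println("types: " + s.hasTypes());
        System.out.println("relations: " + s.hasRelations());
        System.out.println("external references: " + s.hasExternalReferences());
    }
}
```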
      • getMatchesLayer

        MatchesLayer getMatchesLayer()
        Gets matches layer of this TextCorpus.
        Returns:
        layer containing corpus match annotations on Token objects from TokensLayer.
      • createMatchesLayer

        MatchesLayer createMatchesLayer​(String queryLanguage,
                                        String queryString)
        Creates empty MatchesLayer in this TextCorpus, ready to be filled with corpus match annotations.
        Parameters:
        queryLanguage - language of the query used to extract corpus matches from a corpus.
        queryString - the query used to extract corpus matches from a corpus.
        Returns:
        annotation layer that has been created.
      • getWordSplittingLayer

        WordSplittingLayer getWordSplittingLayer()
        Gets word splitting layer of this TextCorpus.
        Returns:
        layer containing word splitting annotations (e.g. hyphenation) on Token objects from TokensLayer.
      • createWordSplittingLayer

        WordSplittingLayer createWordSplittingLayer​(String type)
        Creates empty WordSplittingLayer with the given type of the splitting in this TextCorpus.
        Parameters:
        type - of the splitting, e.g. hyphenation.
        Returns:
        annotation layer that has been created.
      • getPhoneticsLayer

        PhoneticsLayer getPhoneticsLayer()
        Gets phonetics layer of this TextCorpus.
        Returns:
        layer containing phonetic transcriptions of Token objects from TokensLayer.
      • createPhotenicsLayer

        PhoneticsLayer createPhotenicsLayer​(String alphabet)
        Creates empty PhoneticsLayer with the given alphabet for phonetic transcriptions in this TextCorpus.
        Parameters:
        alphabet - of the phonetic transcription annotations.
        Returns:
        annotation layer that has been created.
      • getGeoLayer

        GeoLayer getGeoLayer()
        Gets geo layer of this TextCorpus.
        Returns:
        layer containing geographical location annotations on Token objects from TokensLayer.
      • createGeoLayer

        GeoLayer createGeoLayer​(String source,
                                GeoLongLatFormat coordFormat)
        Creates empty GeoLayer in this TextCorpus.
        Parameters:
        source - of the geographical coordinates.
        coordFormat - format of the geographical coordinates.
        Returns:
        annotation layer that has been created.
      • createGeoLayer

        GeoLayer createGeoLayer​(String source,
                                GeoLongLatFormat coordFormat,
                                GeoContinentFormat conitentFormat,
                                GeoCountryFormat countryFormat,
                                GeoCapitalFormat capitalFormat)
        Creates empty GeoLayer in this TextCorpus.
        Parameters:
        source - of the geographical coordinates.
        coordFormat - format of the geographical coordinates.
        conitentFormat - format of the continent (in case no continent is specified should be null).
        countryFormat - format of the country (in case no country is specified should be null).
        capitalFormat - format of the capital (in case no capital is specified should be null).
        Returns:
        annotation layer that has been created.
      • getOrthographyLayer

        OrthographyLayer getOrthographyLayer()
        Gets orthography layer of this TextCorpus.
        Returns:
        layer containing correct orthographic spellings of misspelled Token objects from TokensLayer.
      • createOrthographyLayer

        OrthographyLayer createOrthographyLayer()
        Creates empty OrthographyLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • getTextStructureLayer

        TextStructureLayer getTextStructureLayer()
        Gets text structure layer of this TextCorpus.
        Returns:
        layer containing original text structure (such as paragraphs, lines, pages, etc.), anchored on Token objects from TokensLayer.
      • createSynonymyLayer

        LexicalSemanticsLayer createSynonymyLayer()
        Creates empty synonymy layer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • createAntonymyLayer

        LexicalSemanticsLayer createAntonymyLayer()
        Creates empty antonymy layer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • createHyponymyLayer

        LexicalSemanticsLayer createHyponymyLayer()
        Creates empty hyponymy layer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • createHyperonymyLayer

        LexicalSemanticsLayer createHyperonymyLayer()
        Creates empty hyperonymy layer in this TextCorpus.
        Returns:
        annotation layer that has been created.
      • getDiscourseConnectivesLayer

        DiscourseConnectivesLayer getDiscourseConnectivesLayer()
        Gets discourse connectives layer of this TextCorpus.
        Returns:
        layer containing discourse connectives annotations on Token objects from TokensLayer.
      • createDiscourseConnectivesLayer

        DiscourseConnectivesLayer createDiscourseConnectivesLayer​(String typeTagset)
        Creates empty DiscourseConnectivesLayer in this TextCorpus.
        Parameters:
        typeTagset - tagset used to label semantic types of the connectives.
        Returns:
        annotation layer that has been created.
      • getWordSensesLayer

        WordSensesLayer getWordSensesLayer()
        Gets word senses layer of this TextCorpus.
        Returns:
        layer containing word sense annotations on Token objects from TokensLayer.
      • createWordSensesLayer

        WordSensesLayer createWordSensesLayer​(String source)
        Creates empty WordSensesLayer in this TextCorpus.
        Parameters:
        source - from which the word senses are taken.
        Returns:
        annotation layer that has been created.
      • getTextSourceLayer

        TextSourceLayer getTextSourceLayer()
        Gets text source layer of this TextCorpus.
        Returns:
        annotation layer containing the text source.
      • createTextSourceLayer

        TextSourceLayer createTextSourceLayer()
        Creates empty TextSourceLayer in this TextCorpus.
        Returns:
        annotation layer that has been created.