public class BilingualAlignmentService
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
static int | TRANSLATION_STRATEGY_MOST_FREQUENT |
static int | TRANSLATION_STRATEGY_MOST_SPECIFIC |
static int | TRANSLATION_STRATEGY_PRORATA |
Constructor and Description |
---|
BilingualAlignmentService() |
Modifier and Type | Method and Description |
---|---|
BilingualAlignmentService | addTranslation(java.lang.String sourceLemma, java.lang.String targetLemma) |
BilingualAlignmentService | addTranslation(Term sourceTerm, Term targetTerm) |
java.util.List<TranslationCandidate> | align(Term sourceTerm, int nbCandidates, int minCandidateFrequency) |
java.util.List<TranslationCandidate> | alignCompositional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency) |
java.util.List<TranslationCandidate> | alignCompositionalSize2(TermService lemmaTerm1, TermService lemmaTerm2, int nbCandidates, int minCandidateFrequency, TermService sourceTerm) |
java.util.List<TranslationCandidate> | alignDico(TermService sourceTerm, int nbCandidates) |
java.util.List<TranslationCandidate> | alignDicoThenDistributional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency) Translates the source term with the help of the dictionary and computes the list of contextSize closest candidate terms in the target terminology. |
java.util.List<TranslationCandidate> | alignDistributional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency) |
java.util.List<TranslationCandidate> | alignGraphically(AlignmentMethod method, TermService sourceTerm, int nbCandidates, java.util.Collection<TermService> targetTerms) |
java.util.List<TranslationCandidate> | alignNeoclassical(Term sourceTerm, int nbCandidates, int minCandidateFrequency) |
java.util.List<TranslationCandidate> | alignNeoclassical(TermService sourceTerm, int nbCandidates, int minCandidateFrequency) Aligns a term using TermSuite's neoclassical alignment method. |
java.util.List<TranslationCandidate> | alignSemiDistributional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency) |
java.util.List<TranslationCandidate> | alignSemiDistributionalSize2Syntagmatic(TermService lemmaTerm1, TermService lemmaTerm2, int nbCandidates, int minCandidateFrequency, TermService sourceTerm) |
java.util.List<TranslationCandidate> | alignSize2(Term sourceTerm, int nbCandidates, int minCandidateFrequency) |
java.util.List<TranslationCandidate> | alignSize2(TermService sourceTerm, int nbCandidates, int minCandidateFrequency) Alias for #align(Term, int, int, true). |
java.util.List<TranslationCandidate> | alignSize2(TermService sourceTerm, int nbCandidates, int minCandidateFrequency, boolean allowDistributionalAlignment) |
boolean | canAlignCompositional(TermService sourceTerm) |
boolean | canAlignNeoclassical(TermService sourceTerm) |
boolean | canAlignSemiDistributional(TermService sourceTerm) |
BilingualDictionary | getDico() |
java.util.Collection<Term> | getMorphologicalExtensionsAsTerms(TermIndex lemmaLowerCaseIndex, TermService compound, Component component) E.g. |
java.util.List<java.util.List<TermService>> | getSourceSingleLemmaTerms(TermService term) Gives the list of all possible single-lemma term decompositions for a complex term. |
static java.util.Set<Term> | getSwtSetFromComponent(TermIndex lemmaLowerCaseIndex, Component c) |
ContextVector | translateVector(ContextVector sourceVector, BilingualDictionary dictionary, int translationStrategy, TerminologyService targetTermino) Translates all ContextVector components (i.e. its coTerms) into the target language of this aligner by means of one of the available strategies: TRANSLATION_STRATEGY_MOST_FREQUENT, TRANSLATION_STRATEGY_PRORATA, TRANSLATION_STRATEGY_EQUI_REPARTITION, TRANSLATION_STRATEGY_MOST_SPECIFIC. |
public static final int TRANSLATION_STRATEGY_PRORATA
public static final int TRANSLATION_STRATEGY_MOST_FREQUENT
public static final int TRANSLATION_STRATEGY_MOST_SPECIFIC
public BilingualAlignmentService addTranslation(Term sourceTerm, Term targetTerm)
public BilingualAlignmentService addTranslation(java.lang.String sourceLemma, java.lang.String targetLemma)
Parameters:
sourceLemma -
targetLemmas -
public java.util.List<TranslationCandidate> alignDicoThenDistributional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency)
Translates the source term with the help of the dictionary and computes the list of contextSize closest candidate terms in the target terminology. sourceTerm's context vector must be computed and normalized, as well as all terms' context vectors in the target term index.
Parameters:
sourceTerm - the term to align with the target term index
nbCandidates - the number of TranslationCandidate to return in the returned list
minCandidateFrequency - the minimum frequency of a target candidate
Returns:
a list of TranslationCandidate sorted by distance desc. Each TranslationCandidate is a container for a target term index's term and its translation score.
public boolean canAlignNeoclassical(TermService sourceTerm)
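The ranking step described for alignDicoThenDistributional (normalized context vectors, a minimum candidate frequency, a bounded list sorted by descending score) can be sketched as follows. This is a minimal illustration, not the service's implementation: the classes Target and Candidate are hypothetical stand-ins for target terms and TranslationCandidate, and the score is assumed to be the dot product of unit-length vectors (i.e. cosine similarity), since the similarity measure is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DistributionalRankingSketch {

    /** Hypothetical stand-in for a target term: lemma, frequency, unit-length context vector. */
    static final class Target {
        final String lemma;
        final int frequency;
        final Map<String, Double> vector;

        Target(String lemma, int frequency, Map<String, Double> vector) {
            this.lemma = lemma;
            this.frequency = frequency;
            this.vector = vector;
        }
    }

    /** Hypothetical stand-in for TranslationCandidate: a target lemma and its score. */
    static final class Candidate {
        final String lemma;
        final double score;

        Candidate(String lemma, double score) {
            this.lemma = lemma;
            this.score = score;
        }
    }

    /** Dot product of two sparse vectors; equals the cosine when both have unit length. */
    static double dot(Map<String, Double> a, Map<String, Double> b) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                sum += e.getValue() * other;
            }
        }
        return sum;
    }

    /** Scores each sufficiently frequent target, sorts by score descending, keeps the top nbCandidates. */
    static List<Candidate> rank(Map<String, Double> translatedSource, List<Target> targets,
                                int nbCandidates, int minCandidateFrequency) {
        List<Candidate> scored = new ArrayList<>();
        for (Target t : targets) {
            if (t.frequency >= minCandidateFrequency) {
                scored.add(new Candidate(t.lemma, dot(translatedSource, t.vector)));
            }
        }
        scored.sort((x, y) -> Double.compare(y.score, x.score));
        int limit = Math.max(0, Math.min(nbCandidates, scored.size()));
        return new ArrayList<>(scored.subList(0, limit));
    }
}
```

Filtering by frequency before scoring avoids computing similarities for candidates that would be discarded anyway.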
public java.util.List<TranslationCandidate> alignNeoclassical(Term sourceTerm, int nbCandidates, int minCandidateFrequency)
public java.util.List<TranslationCandidate> alignNeoclassical(TermService sourceTerm, int nbCandidates, int minCandidateFrequency)
Aligns a term using TermSuite's neoclassical alignment method.
Parameters:
sourceTerm - the source term to align
nbCandidates - the maximum number of TranslationCandidate returned
minCandidateFrequency - the minimum frequency of returned translation candidates
Returns:
the list of TranslationCandidate produced by this method, or an empty list if the term could not be aligned using the neoclassical method.
See Also:
#canAlignNeoclassical(Term), CompoundType.NEOCLASSICAL
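Because alignNeoclassical returns an empty list when the term cannot be aligned, callers may want to guard the call with canAlignNeoclassical and treat an empty result as a failure. The sketch below shows one way to do that. The NeoclassicalAligner interface is a hypothetical, simplified view that only mirrors the two documented method names, with plain strings standing in for TermService and TranslationCandidate; the fallback list is likewise an assumption of this example.

```java
import java.util.List;

final class NeoclassicalGuardSketch {

    /** Hypothetical minimal view of the service; terms and candidates are plain strings here. */
    interface NeoclassicalAligner {
        boolean canAlignNeoclassical(String sourceTerm);

        List<String> alignNeoclassical(String sourceTerm, int nbCandidates, int minCandidateFrequency);
    }

    /** Returns the neoclassical candidates when there are any, otherwise the supplied fallback. */
    static List<String> alignOrFallback(NeoclassicalAligner aligner, String sourceTerm,
                                        int nbCandidates, int minCandidateFrequency,
                                        List<String> fallback) {
        if (!aligner.canAlignNeoclassical(sourceTerm)) {
            return fallback; // the term is not eligible for the neoclassical method
        }
        List<String> candidates =
                aligner.alignNeoclassical(sourceTerm, nbCandidates, minCandidateFrequency);
        // An empty list is the documented signal that the alignment did not succeed.
        return candidates.isEmpty() ? fallback : candidates;
    }
}
```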
public java.util.Collection<Term> getMorphologicalExtensionsAsTerms(TermIndex lemmaLowerCaseIndex, TermService compound, Component component)
Parameters:
termino -
compound -
component -
public java.util.List<TranslationCandidate> alignGraphically(AlignmentMethod method, TermService sourceTerm, int nbCandidates, java.util.Collection<TermService> targetTerms)
public java.util.List<TranslationCandidate> alignDistributional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency)
public java.util.List<TranslationCandidate> align(Term sourceTerm, int nbCandidates, int minCandidateFrequency)
public java.util.List<TranslationCandidate> alignSize2(Term sourceTerm, int nbCandidates, int minCandidateFrequency)
public java.util.List<TranslationCandidate> alignSize2(TermService sourceTerm, int nbCandidates, int minCandidateFrequency)
Alias for #align(Term, int, int, true).
Parameters:
sourceTerm -
nbCandidates -
minCandidateFrequency -
public java.util.List<TranslationCandidate> alignSize2(TermService sourceTerm, int nbCandidates, int minCandidateFrequency, boolean allowDistributionalAlignment)
Parameters:
sourceTerm -
nbCandidates -
minCandidateFrequency -
allowDistributionalAlignment -
public java.util.List<TranslationCandidate> alignDico(TermService sourceTerm, int nbCandidates)
public boolean canAlignCompositional(TermService sourceTerm)
public java.util.List<TranslationCandidate> alignCompositional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency)
public boolean canAlignSemiDistributional(TermService sourceTerm)
public java.util.List<TranslationCandidate> alignSemiDistributional(TermService sourceTerm, int nbCandidates, int minCandidateFrequency)
public java.util.List<TranslationCandidate> alignCompositionalSize2(TermService lemmaTerm1, TermService lemmaTerm2, int nbCandidates, int minCandidateFrequency, TermService sourceTerm)
public java.util.List<TranslationCandidate> alignSemiDistributionalSize2Syntagmatic(TermService lemmaTerm1, TermService lemmaTerm2, int nbCandidates, int minCandidateFrequency, TermService sourceTerm)
public BilingualDictionary getDico()
public ContextVector translateVector(ContextVector sourceVector, BilingualDictionary dictionary, int translationStrategy, TerminologyService targetTermino)
Translates all ContextVector components (i.e. its coTerms) into the target language of this aligner by means of one of the available strategies:
- TRANSLATION_STRATEGY_MOST_FREQUENT
- TRANSLATION_STRATEGY_PRORATA
- TRANSLATION_STRATEGY_EQUI_REPARTITION
- TRANSLATION_STRATEGY_MOST_SPECIFIC
Parameters:
sourceVector - the source context vector object to be translated into the target language
dictionary - the dictionary used in the translation process
translationStrategy - the translation strategy applied to the sourceVector. Possible values: TRANSLATION_STRATEGY_MOST_FREQUENT, TRANSLATION_STRATEGY_PRORATA, TRANSLATION_STRATEGY_EQUI_REPARTITION, TRANSLATION_STRATEGY_MOST_SPECIFIC
See Also:
BilingualDictionary
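To make the strategy names concrete, here is a minimal sketch of how a context vector's weights might be redistributed over dictionary translations. It is an interpretation based on the strategy names only, not the service's code. Vectors are modelled as plain maps, the dictionary as a map from a source co-term to its candidate translations, and the integer constants are hypothetical. MOST_SPECIFIC is omitted; it would presumably select by specificity instead of frequency.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class VectorTranslationSketch {
    // Hypothetical constants; the actual values of the TRANSLATION_STRATEGY_* fields are not given here.
    static final int MOST_FREQUENT = 1;
    static final int PRORATA = 2;
    static final int EQUI_REPARTITION = 3;

    /** Frequency of a target term, or 0 when the term is unknown. */
    static int freq(Map<String, Integer> targetFrequency, String term) {
        Integer f = targetFrequency.get(term);
        return f == null ? 0 : f;
    }

    /** Moves each co-term weight onto its translation(s) according to the chosen strategy. */
    static Map<String, Double> translate(Map<String, Double> sourceVector,
                                         Map<String, List<String>> dictionary,
                                         Map<String, Integer> targetFrequency,
                                         int strategy) {
        Map<String, Double> targetVector = new HashMap<>();
        for (Map.Entry<String, Double> coTerm : sourceVector.entrySet()) {
            List<String> translations = dictionary.get(coTerm.getKey());
            if (translations == null || translations.isEmpty()) {
                continue; // co-terms without a dictionary entry are dropped
            }
            double weight = coTerm.getValue();
            switch (strategy) {
                case MOST_FREQUENT: {
                    // The whole weight goes to the most frequent translation.
                    String best = translations.get(0);
                    for (String t : translations) {
                        if (freq(targetFrequency, t) > freq(targetFrequency, best)) {
                            best = t;
                        }
                    }
                    targetVector.merge(best, weight, Double::sum);
                    break;
                }
                case PRORATA: {
                    // The weight is split in proportion to each translation's frequency.
                    int total = 0;
                    for (String t : translations) {
                        total += freq(targetFrequency, t);
                    }
                    if (total == 0) {
                        break; // no frequency information: nothing to distribute
                    }
                    for (String t : translations) {
                        targetVector.merge(t, weight * freq(targetFrequency, t) / total, Double::sum);
                    }
                    break;
                }
                case EQUI_REPARTITION: {
                    // The weight is split equally among all translations.
                    for (String t : translations) {
                        targetVector.merge(t, weight / translations.size(), Double::sum);
                    }
                    break;
                }
                default:
                    throw new IllegalArgumentException("Unknown strategy: " + strategy);
            }
        }
        return targetVector;
    }
}
```

With a co-term of weight 2.0 translated by two candidates of frequency 30 and 10, the first strategy assigns 2.0 to the more frequent candidate, the second assigns 1.5 and 0.5, and the third assigns 1.0 to each.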
public java.util.List<java.util.List<TermService>> getSourceSingleLemmaTerms(TermService term)
Gives the list of all possible single-lemma term decompositions for a complex term.
Parameters:
termino -
term -