public interface PepperExporter extends PepperModule
A mapping task in the Pepper workflow is not a monolithic block. It consists of several smaller steps.
    public MyModule() {
        super("Name of the module");
        setSupplierContact(URI.createURI("Contact address of the module's supplier"));
        setSupplierHomepage(URI.createURI("homepage of the module"));
        setDesc("A short description of what is the intention of this module, for instance which formats are importable. ");
        this.addSupportedFormat("The name of a format which is importable e.g. txt", "The version corresponding to the format name", null);
    }
public boolean isReadyToStart() { return (true); }
The corpus-structure is exported by the method exportCorpusStructure(). It is
invoked at the top of the method start() of the PepperExporter. To change the
default behavior completely, override this method. The aim of
exportCorpusStructure() is to fill the map that relates the corpus-structure
to the file structure. The file structure is computed automatically, but only
as URIs pointing to virtual files or folders. Creating the actual file or
folder is left to the Pepper module itself, in the method
PepperMapper.mapSCorpus() or PepperMapper.mapSDocument(). To adapt the
creation of this 'virtual' file structure, you first have to choose the export
mode. You can do this, for instance, in the method isReadyToStart(), as shown
in the following snippet, or in the constructor.
    public boolean isReadyToStart(){
        ...
        //option 1
        setExportMode(EXPORT_MODE.NO_EXPORT);
        //option 2
        setExportMode(EXPORT_MODE.CORPORA_ONLY);
        //option 3
        setExportMode(EXPORT_MODE.DOCUMENTS_IN_FILES);
        //sets the ending, which should be added to the documents name
        setDocumentEnding(ENDING_TAB);
        ...
    }

In this snippet, option 1 means that nothing will be mapped. Option 2 means
that only SCorpus objects are mapped to a folder and SDocument objects are
ignored. Option 3 means that SCorpus objects are mapped to a folder and
SDocument objects are mapped to a file. The ending of that file is set by
passing it to the method setDocumentEnding(String). In the given snippet, a
URI with the ending 'tab' is created for each SDocument.
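To make the three export modes concrete, here is a minimal sketch that does not use Pepper at all. The class `ExportModeSketch`, the method `buildTable`, and the use of plain strings for ids and paths are assumptions made for illustration; only the three mode names come from the snippet above. It builds a table of "virtual" paths and creates nothing on disk.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in sketch, not Pepper code: mimics how an export mode shapes the id-to-path table. */
final class ExportModeSketch {

    /** Mirrors the three constant names used in the snippet above. */
    enum ExportMode { NO_EXPORT, CORPORA_ONLY, DOCUMENTS_IN_FILES }

    /**
     * Builds a table from element id to a "virtual" path. Nothing is created on disk.
     * Ids are plain strings such as "myCorpus/subcorpus/doc1" (an assumption of this sketch).
     */
    static Map<String, String> buildTable(ExportMode mode, String root,
            List<String> corpusIds, List<String> documentIds, String ending) {
        Map<String, String> table = new LinkedHashMap<>();
        if (mode == ExportMode.NO_EXPORT) {
            return table; // option 1: nothing is mapped
        }
        for (String id : corpusIds) {
            table.put(id, root + "/" + id); // corpora are mapped to folders
        }
        if (mode == ExportMode.DOCUMENTS_IN_FILES) {
            for (String id : documentIds) {
                table.put(id, root + "/" + id + "." + ending); // documents are mapped to files
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> corpora = List.of("myCorpus", "myCorpus/subcorpus");
        List<String> documents = List.of("myCorpus/subcorpus/doc1");
        for (ExportMode mode : ExportMode.values()) {
            System.out.println(mode + " -> "
                    + buildTable(mode, "/tmp/out", corpora, documents, "tab"));
        }
    }
}
```

In a real module the table is kept by the framework and holds Identifier and URI objects rather than strings; the sketch only shows which entries each mode would be expected to produce.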
In the method PepperModule.createPepperMapper(Identifier), a PepperMapper
object needs to be initialized and returned. The PepperMapper is the major
part doing the mapping. It provides the method PepperMapper.mapSCorpus() to
handle the mapping of a single SCorpus object and PepperMapper.mapSDocument()
to handle a single SDocument object. Both methods are invoked by the Pepper
framework. PepperMapper.getResourceURI() offers the mapper the file or folder
of the current SCorpus or SDocument object. For it to return that location,
the field needs to be set in the method
PepperModule.createPepperMapper(Identifier). The following snippet shows a
dummy of that method:
    public PepperMapper createPepperMapper(Identifier sElementId) {
        PepperMapper mapper = new PepperMapperImpl() {
            @Override
            public DOCUMENT_STATUS mapSCorpus() {
                // handling the mapping of a single corpus

                // accessing the current file or folder
                getResourceURI();

                // returning, that the corpus was mapped successfully
                return (DOCUMENT_STATUS.COMPLETED);
            }

            @Override
            public DOCUMENT_STATUS mapSDocument() {
                // handling the mapping of a single document

                // accessing the current file or folder
                getResourceURI();

                // returning, that the document was mapped successfully
                return (DOCUMENT_STATUS.COMPLETED);
            }
        };
        // pass current file or folder to mapper. When using
        // PepperImporter.importCorpusStructure or
        // PepperExporter.exportCorpusStructure, the mapping between file or
        // folder and SCorpus or SDocument was stored here
        mapper.setResourceURI(getIdentifier2ResourceTable().get(sElementId));
        return (mapper);
    }
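Since the module itself has to create the file or folder behind the resource URI, the body of a mapper method might look roughly like the following sketch. It is written against java.nio only and does not use Pepper types; `ResourceWriterSketch`, `writeCorpusFolder` and `writeDocument` are hypothetical names, and the `Path` argument stands in for whatever location getResourceURI() returns.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Stand-in sketch, not Pepper code: what a mapper body might do with its resource location. */
final class ResourceWriterSketch {

    /** Creates the folder for a corpus; plays the role of a mapSCorpus() body. */
    static Path writeCorpusFolder(Path folder) throws IOException {
        return Files.createDirectories(folder);
    }

    /**
     * Creates missing parent folders and writes one document file;
     * plays the role of a mapSDocument() body.
     */
    static Path writeDocument(Path file, String content) throws IOException {
        Path parent = file.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.write(file, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory keeps the demo away from real corpus data.
        Path root = Files.createTempDirectory("pepper-sketch");
        writeCorpusFolder(root.resolve("myCorpus/subcorpus"));
        Path doc = writeDocument(root.resolve("myCorpus/subcorpus/doc1.tab"), "tok\tpos\n");
        System.out.println("wrote " + doc);
    }
}
```

In an actual mapper the path would be derived from the URI returned by getResourceURI(), and the method would return a DOCUMENT_STATUS value as in the dummy above.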
    public void end() {
        super.end();
        // do some clean up like closing of streams etc.
    }
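Read together, the snippets in this section suggest an order of steps: constructor, isReadyToStart(), exportCorpusStructure() at the top of start(), one mapping call per corpus or document, and finally end(). The following sketch replays that order with a hypothetical `MiniExporter` and a hypothetical driver `run`; neither is part of Pepper, and the real framework may interleave or parallelize these steps differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in sketch, not Pepper code: replays the order of steps described in this section. */
final class LifecycleSketch {

    /** Hypothetical miniature of an exporter; only the method names echo the snippets above. */
    static class MiniExporter {
        final List<String> log = new ArrayList<>();

        MiniExporter() { log.add("constructor"); }

        boolean isReadyToStart() { log.add("isReadyToStart"); return true; }

        void exportCorpusStructure() { log.add("exportCorpusStructure"); }

        void map(String elementId) { log.add("map:" + elementId); }

        void end() { log.add("end"); }

        /** Exports the structure first, then maps every element. */
        void start(List<String> elementIds) {
            exportCorpusStructure();
            for (String id : elementIds) {
                map(id);
            }
        }
    }

    /** Hypothetical driver playing the role of the framework. */
    static List<String> run(List<String> elementIds) {
        MiniExporter exporter = new MiniExporter();
        if (exporter.isReadyToStart()) {
            exporter.start(elementIds);
            exporter.end();
        }
        return exporter.log;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("corpus_1", "doc_1")));
    }
}
```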
Modifier and Type | Interface and Description |
---|---|
static class | PepperExporter.EXPORT_MODE: Determines how the corpus-structure should be exported. |
ENDING_ALL_FILES, ENDING_FOLDER, ENDING_LEAF_FOLDER, ENDING_TAB, ENDING_TXT, ENDING_XML
Modifier and Type | Method and Description |
---|---|
FormatDesc | addSupportedFormat(String formatName, String formatVersion, org.eclipse.emf.common.util.URI formatReference) |
org.eclipse.emf.common.util.URI | createFolderStructure(org.corpus_tools.salt.graph.Identifier sElementId): Creates a folder structure based on the passed corpus path (CorpusDesc.getCorpusPath()). |
void | exportCorpusStructure(): This method is called by PepperModule.start() to export the corpus-structure into a folder-structure. |
CorpusDesc | getCorpusDesc(): TODO docu |
String | getDocumentEnding(): Returns the format ending for files to be exported and related to SDocument objects. |
PepperExporter.EXPORT_MODE | getExportMode(): Returns how corpus-structure is exported. |
Map<org.corpus_tools.salt.graph.Identifier,org.eclipse.emf.common.util.URI> | getIdentifier2ResourceTable(): Returns the table of correspondences between Identifier objects and resources. |
List<FormatDesc> | getSupportedFormats(): TODO docu |
void | setCorpusDesc(CorpusDesc corpusDesc): TODO docu |
void | setDocumentEnding(String sDocumentEnding): Sets the format ending for files to be exported and related to SDocument objects. |
void | setExportMode(PepperExporter.EXPORT_MODE exportMode): Determines how the corpus-structure should be exported. |
createPepperMapper, done, done, end, getComponentContext, getCorpusGraph, getDesc, getFingerprint, getModuleController, getModuleType, getName, getProgress, getProgress, getProperties, getResources, getSaltProject, getSelfTestDesc, getStartProblems, getSupplierContact, getSupplierHomepage, getSymbolicName, getTemproraries, getVersion, isMultithreaded, isReadyToStart, proposeImportOrder, setCorpusGraph, setDesc, setIsMultithreaded, setPepperModuleController_basic, setPepperModuleController, setProperties, setResources, setSaltProject, setSupplierContact, setSupplierHomepage, setSymbolicName, setTemproraries, setVersion, start, start
List<FormatDesc> getSupportedFormats()
CorpusDesc getCorpusDesc()
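String getDocumentEnding()
Returns the format ending for files to be exported and related to SDocument objects.
Returns: file ending for SDocument objects to be exported.
void setDocumentEnding(String sDocumentEnding)
Sets the format ending for files to be exported and related to SDocument objects.
Parameters: sDocumentEnding - file ending for SDocument objects to be exported.
void setCorpusDesc(CorpusDesc corpusDesc)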
Map<org.corpus_tools.salt.graph.Identifier,org.eclipse.emf.common.util.URI> getIdentifier2ResourceTable()
Returns the table of correspondences between Identifier objects and resources.
The table stores the Identifier objects of SDocument and SCorpus objects
together with the resource each element is imported from or, for an exporter,
exported to. It is filled during the run of
#importCorpusStructure(SCorpusGraph) or, for an exporter,
exportCorpusStructure(). For example:

Identifier | Resource |
---|---|
corpus_1 | /home/me/corpora/myCorpus |
corpus_2 | /home/me/corpora/myCorpus/subcorpus |
doc_1 | /home/me/corpora/myCorpus/subcorpus/document1.xml |
doc_2 | /home/me/corpora/myCorpus/subcorpus/document2.xml |

Returns: table of correspondences between Identifier objects and resources.
org.eclipse.emf.common.util.URI createFolderStructure(org.corpus_tools.salt.graph.Identifier sElementId)
Creates a folder structure based on the passed corpus path
(CorpusDesc.getCorpusPath()). For each segment in the Identifier a folder is
created.
Returns: the Identifier as a file path, which was created on disk.
PepperExporter.EXPORT_MODE getExportMode()
void setExportMode(PepperExporter.EXPORT_MODE exportMode)
Determines how the corpus-structure should be exported.
EXPORT_MODE.NO_EXPORT: nothing is mapped.
EXPORT_MODE.CORPORA_ONLY: SCorpus objects are exported into a folder structure, but SDocument objects are not exported.
EXPORT_MODE.DOCUMENTS_IN_FILES: SCorpus objects are exported into a folder structure and SDocument objects are stored in files having the ending determined by PepperExporter#getDocumentEnding().
Parameters: exportMode
void exportCorpusStructure()
This method is called by PepperModule.start() to export the corpus-structure
into a folder-structure. That means each Identifier belonging to an SDocument
or SCorpus object is stored in getIdentifier2ResourceTable() together with the
corresponding file-structure object (file or folder) located by a URI. The URI
objects corresponding to files get the file ending determined by
getDocumentEnding(), which can be set by setDocumentEnding(String). To adapt
the creation of the URIs, set the export mode via setExportMode(EXPORT_MODE).
FormatDesc addSupportedFormat(String formatName, String formatVersion, org.eclipse.emf.common.util.URI formatReference)
Copyright © 2009–2019 Humboldt-Universität zu Berlin, INRIA. All rights reserved.