All Implemented Interfaces:
java.lang.Runnable, net.maizegenetics.plugindef.Plugin, net.maizegenetics.plugindef.PluginListener, net.maizegenetics.util.ProgressListener
public class CreateIntervalsFileFromGffPlugin extends AbstractPlugin
This class creates the interval files needed for running the GATK haplotype caller, and the csv files needed for loading reference sequence into the database. Two sets of files are created: one set has coordinates based on the reference gene coordinates alone; the other has gene coordinates plus user-specified flanking regions.

Algorithm:
1. Read the GFF file and grab the gene coordinates.
2. For each chromosome: merge genes that overlap and toss genes that are embedded within another gene. Store the list as mergedGeneList.
3. Using the mergedGeneList from step 2, create a second per-chromosome coordinate list that includes flanking regions.
4. Write files:
   interval format (chrom:start-end): a. mergedGeneList; b. mergedGeneList with flanking
   csv format (chr,anchorstart,anchorend,geneStart,geneEnd,geneName): a. mergedGeneList; b. mergedGeneList with flanking
   debug files: a list of merged genes and a list of embedded genes, written for informational purposes

NOTE: the csv files contain the names of all genes contained in an anchor. This data is not stored in the DB. It is included because the biologists have at times asked for it, and this is a good place for it to be stored and retrieved.

INPUT:
1. refFile: String: path to the reference genome. Needed to find the size of each chromosome so that flanking regions can be added to the last entry on a chromosome.
2. geneFile: String: path to a single file containing gene data for all chromosomes in GFF format, or path to a directory containing per-chromosome files with gene data in GFF format. These data files must consist of GFF gene data alone, not the full GFF.
3. outputBase: String: directory, including trailing "/", where output files will be written.
4. numFlanking: int: number of flanking bps to add on each end of the anchors.

OUTPUT:
1. intervals file based on gene coordinates
2. intervals file based on gene coordinates + numFlanking bps
3. csv file based on gene coordinates
4. csv file based on gene coordinates + numFlanking bps
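Step 2 of the algorithm (merge overlapping genes, toss embedded ones) can be illustrated with a minimal sketch. This is not the plugin's actual implementation: the class GeneMergeSketch, the Gene type, and the choice to join the names of merged genes with "-" are assumptions made for illustration. Genes are assumed to use closed, 1-based coordinates on a single chromosome.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of the plugin's API.
class GeneMergeSketch {

    // Closed, 1-based gene interval on one chromosome.
    static final class Gene {
        final String name;
        final int start;
        final int end;

        Gene(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // Returns a merged list: overlapping genes are fused into one entry,
    // genes fully contained in an earlier entry are dropped.
    static List<Gene> merge(List<Gene> genes) {
        List<Gene> sorted = new ArrayList<>(genes);
        sorted.sort(Comparator.comparingInt((Gene g) -> g.start)
                .thenComparingInt(g -> g.end));

        List<Gene> merged = new ArrayList<>();
        for (Gene g : sorted) {
            if (merged.isEmpty()) {
                merged.add(g);
                continue;
            }
            Gene last = merged.get(merged.size() - 1);
            if (g.start > last.end) {
                // No overlap: start a new entry.
                merged.add(g);
            } else if (g.end > last.end) {
                // Partial overlap: extend the previous entry.
                // Joining names with "-" is an assumption of this sketch.
                merged.set(merged.size() - 1,
                        new Gene(last.name + "-" + g.name, last.start, g.end));
            }
            // Otherwise g is embedded in the previous entry and is tossed.
        }
        return merged;
    }
}
```

Sorting by start coordinate first means each gene only has to be compared with the most recent merged entry, so the pass over a chromosome is linear after the sort.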
-
-
Field Summary
Modifier and Type             Field                 Description
public final static String    DEFAULT_CITATION
public final static String    POSITION_LIST_NONE
public final static String    TAXA_LIST_NONE
-
Constructor Summary
Constructor                                                                   Description
CreateIntervalsFileFromGffPlugin()
CreateIntervalsFileFromGffPlugin(Frame parentFrame)
CreateIntervalsFileFromGffPlugin(Frame parentFrame, boolean isInteractive)
-
Method Summary
Modifier and Type                   Method                        Description
DataSet                             processData(DataSet input)
String                              runPlugin(DataSet input)      Convenience method to run plugin with one return object.
ImageIcon                           getIcon()
String                              getButtonName()
String                              getToolTipText()
String                              refFile()                     FASTA file containing the reference genome.
CreateIntervalsFileFromGffPlugin    refFile(String value)         Set Ref Genome File.
String                              geneFile()                    Tab-delimited .txt file containing gene-only GFF data from the reference GFF file.
CreateIntervalsFileFromGffPlugin    geneFile(String value)        Set Gene File.
String                              outputDir()                   Directory where output files will be written.
CreateIntervalsFileFromGffPlugin    outputDir(String value)       Set Output Directory.
Integer                             numFlanking()                 Number of flanking base pairs to add at each end of the gene sequence.
CreateIntervalsFileFromGffPlugin    numFlanking(Integer value)    Set Number of Flanking BPs.
-
Methods inherited from class net.maizegenetics.plugindef.AbstractPlugin
addListener, cancel, convert, dataSetReturned, getCitation, getInputs, getListeners, getMenu, getPanel, getParameter, getParentFrame, getUsage, getUsageHTML, hasListeners, isInteractive, isPluginParameter, performFunction, pluginDescription, pluginParameters, pluginUserManualURL, progress, receiveInput, reverseTrace, run, setConfigParameters, setParameter, setParameters, setParametersToDefault, setThreaded, trace, usageParameters, wasCancelled
-
Methods inherited from class net.maizegenetics.plugindef.Plugin
getPluginInstance, isPlugin
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
-
Method Detail
-
processData
DataSet processData(DataSet input)
-
getButtonName
String getButtonName()
-
getToolTipText
String getToolTipText()
-
refFile
CreateIntervalsFileFromGffPlugin refFile(String value)
Set Ref Genome File. FASTA file containing the reference genome.
Parameters:
value - Ref Genome File
-
geneFile
String geneFile()
Tab-delimited .txt file containing gene-only GFF data from the reference GFF file.
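As an illustration of what reading one line of such a gene file might look like, here is a minimal sketch. It assumes the standard nine tab-separated GFF columns (seqid, source, type, start, end, score, strand, phase, attributes) and an "ID=" entry in the attributes column; the class GffGeneLineSketch and the GeneRecord type are hypothetical and are not the plugin's own parser. The check on the type column is defensive, since the file is expected to hold gene records only.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of the plugin's API.
class GffGeneLineSketch {

    // Chromosome, closed 1-based coordinates, and gene identifier.
    static final class GeneRecord {
        final String chrom;
        final int start;
        final int end;
        final String id;

        GeneRecord(String chrom, int start, int end, String id) {
            this.chrom = chrom;
            this.start = start;
            this.end = end;
            this.id = id;
        }
    }

    // Parses one GFF line; returns empty for comments, blank lines,
    // malformed lines, and features whose type is not "gene".
    static Optional<GeneRecord> parse(String line) {
        if (line.isEmpty() || line.startsWith("#")) {
            return Optional.empty();
        }
        String[] cols = line.split("\t", -1);
        if (cols.length < 9 || !cols[2].equals("gene")) {
            return Optional.empty();
        }
        // Pull the ID out of the semicolon-separated attributes column.
        String id = "";
        for (String attr : cols[8].split(";")) {
            if (attr.startsWith("ID=")) {
                id = attr.substring("ID=".length());
            }
        }
        int start = Integer.parseInt(cols[3]);
        int end = Integer.parseInt(cols[4]);
        return Optional.of(new GeneRecord(cols[0], start, end, id));
    }
}
```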
-
geneFile
CreateIntervalsFileFromGffPlugin geneFile(String value)
Set Gene File. Tab-delimited .txt file containing gene-only GFF data from the reference GFF file.
Parameters:
value - Gene File
-
outputDir
CreateIntervalsFileFromGffPlugin outputDir(String value)
Set Output Directory. Directory where output files will be written.
Parameters:
value - Output Directory
-
numFlanking
Integer numFlanking()
Number of flanking base pairs to add at each end of the gene sequence.
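The effect of this parameter on a single anchor can be sketched as follows. This is an illustration under stated assumptions rather than the plugin's code: the class FlankingSketch and both method names are hypothetical, coordinates are taken to be closed and 1-based, and the chromosome length is assumed to come from the reference genome. The sketch only clamps to the chromosome bounds; how the plugin handles flanks of neighbouring anchors that would run into each other is not described in this page and is not modelled here.

```java
// Hypothetical helper for illustration; not part of the plugin's API.
class FlankingSketch {

    // Extends [start, end] by numFlanking on each side, clamped to
    // [1, chromLength]. Returns {flankedStart, flankedEnd}.
    static int[] addFlanking(int start, int end, int numFlanking, int chromLength) {
        int flankedStart = Math.max(1, start - numFlanking);
        int flankedEnd = Math.min(chromLength, end + numFlanking);
        return new int[] {flankedStart, flankedEnd};
    }

    // Formats an anchor in the interval format chrom:start-end.
    static String toInterval(String chrom, int start, int end) {
        return chrom + ":" + start + "-" + end;
    }
}
```

For example, an anchor at 100-200 with numFlanking = 1000 on a chromosome of length 1100 is clamped at both ends and would be written as 1:1-1100 for chromosome "1".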
-
numFlanking
CreateIntervalsFileFromGffPlugin numFlanking(Integer value)
Set Number of Flanking BPs. Number of flanking base pairs to add at each end of the gene sequence.
Parameters:
value - Number of Flanking BPs