public class FindProteomeGenesInAssembly
NOTE: This class needs to be reworked for the new db. The PHG calls are obsolete and must be replaced with calls for the new schema.

Input:
1. The db from which to pull the assembly/haplotype data.
2. The reference genome for pulling the gene sequence (a subset of the anchor sequence).
3. The name of the reference line as stored in the db (e.g. B73Ref).
4. The name of the assembly as stored in the db (e.g. W22Assembly).
5. The name of the line from haplotypeCaller as stored in the db (e.g. W22_Haplotype_Caller), or NONE if this line was not processed via haplotype caller.
6. A tab-delimited file containing the list of anchorids, the gene start/end on which the anchor is based, and whether the gene is a proteome gene.
7. The output directory for writing files.

Output:
1. A tab-delimited file containing the headers: AnchorID\tChromosome\tGene\tGeneInProteome\tAssemblyHasAnchor\tGeneSeqIn_\tGeneSeqIn_
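As an illustration of input 6, the following is a minimal sketch of how a row of the tab-delimited anchor file might be parsed. It is not the code of FindProteomeGenesInAssembly. The class name AnchorGeneFileSketch, the column order (anchor id, gene start, gene end, proteome flag), and the true/false encoding of the flag are all assumptions made for this example; the actual file layout may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the documented class.
public class AnchorGeneFileSketch {

    /** One row of the anchor file. Field names and order are assumed. */
    static final class AnchorGene {
        final int anchorId;
        final int geneStart;
        final int geneEnd;
        final boolean inProteome;

        AnchorGene(int anchorId, int geneStart, int geneEnd, boolean inProteome) {
            this.anchorId = anchorId;
            this.geneStart = geneStart;
            this.geneEnd = geneEnd;
            this.inProteome = inProteome;
        }
    }

    /** Parses one tab-delimited row: anchorId, geneStart, geneEnd, proteome flag. */
    static AnchorGene parseLine(String line) {
        // Limit -1 keeps trailing empty columns so the count check is reliable.
        String[] cols = line.split("\t", -1);
        if (cols.length < 4) {
            throw new IllegalArgumentException(
                "Expected at least 4 tab-delimited columns, found " + cols.length);
        }
        return new AnchorGene(
            Integer.parseInt(cols[0].trim()),
            Integer.parseInt(cols[1].trim()),
            Integer.parseInt(cols[2].trim()),
            // Assumption: the flag is written as "true" or "false".
            Boolean.parseBoolean(cols[3].trim()));
    }

    /** Parses all non-blank rows; assumes the caller has removed any header row. */
    static List<AnchorGene> parseLines(List<String> lines) {
        List<AnchorGene> rows = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                rows.add(parseLine(line));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>();
        sample.add("12\t1000\t2500\ttrue");
        sample.add("13\t4000\t5200\tfalse");
        for (AnchorGene g : parseLines(sample)) {
            System.out.println(g.anchorId + " " + g.geneStart + "-" + g.geneEnd
                + " proteome=" + g.inProteome);
        }
    }
}
```

Failing fast on rows with too few columns is a deliberate choice in this sketch: a malformed anchor file is reported at load time rather than surfacing later as missing rows in the output file.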