public class VariantsProcessingUtils
This class contains methods to aid in processing a VariantContext list into the PHG db variants, alleles, and haplotypes tables
-
-
Method Summary
Modifier and Type, Method, Description
-
static List<String> createAlleleList(VariantContext vc)
Returns a list containing just the ref allele string if the VariantContext record is a ref record.
-
static List<String> createAlleleList(HaplotypeNode.VariantInfo vi)
Creates a list of alleles based on whether the variant info is a ref or a variant record.
-
static Tuple<String, VariantMappingData> getVariantData(String chrom, Map<String, Integer> alleleHashMap, HaplotypeNode.VariantInfo vi, Connection dbConn)
Returns a Tuple of the hash of (chrom, start_position, refAlleleID, altAlleleID) and a VariantMappingData record. The alleleHashMap should be pre-populated with the initial alleles.
-
static List<Long> getLongListOfVariantData(List<Tuple<String, VariantMappingData>> variantMappingDataList, Map<String, Integer> hashIDMap)
Takes a list of (variant ID hash, VariantMappingData) tuples and creates a list of long values, each representing the variant context for one entry.
-
static long getLongFromVMData(int vmID, VariantMappingData vmd)
Takes a variant mapping ID and a VariantMappingData object and encodes the ID, with additional data, as a long.
-
static long getLongRefRecord(int refLen, int refDepth, int startPos)
Takes a reference length, depth, and start position on the chromosome.
-
static long getLongVariantRecord(int vmID, int refDepth, int altDepth, byte isIndel, int otherData)
Takes a variant_mapping ID, reference depth, alternate depth, an indel indicator, and a dummy int. Returns a long formatted with this data.
-
static Array<byte> encodeVariantLongListToByteArray(List<Long> variantLongList)
Takes a List of Long objects and converts it to a Snappy-compressed byte stream.
-
static List<Long> decodeByteArrayToVariantLongList(Array<byte> encodedByteArray)
Takes a Snappy-compressed byte stream and decodes it into a List of Long objects.
-
static Array<byte> longListToByteArray(Collection<Long> ListOfLongs)
-
static List<Long> byteArrayToLongList(Array<byte> byteArray)
-
static int findAlleleIDFromDB(String allele, Connection dbConn)
From an allele string, computes the hash value and searches for a corresponding ID in the DB alleles table.
-
Method Detail
-
createAlleleList
static List<String> createAlleleList(VariantContext vc)
Returns a list containing just the ref allele string if the VariantContext record is a ref record, or both the ref allele and the first alt allele if the VariantContext record is a variant record. This method does not check whether the allele already exists. The returned list contains ALL alleles; this assumes they will be added via an INSERT/IGNORE db command.
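The ref-versus-variant rule described above can be sketched as follows. This is only an illustration, not the PHG implementation: it replaces the htsjdk VariantContext with two plain parameters (a ref allele string and a list of alt allele strings), and it assumes that a "ref record" is one with no alt alleles. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for createAlleleList; the real method takes a VariantContext.
final class AlleleListSketch {

    // Returns [ref] for a ref record (assumed here to mean "no alt alleles"),
    // or [ref, firstAlt] for a variant record. No existence check is done,
    // mirroring the documented behaviour.
    static List<String> alleleList(String refAllele, List<String> altAlleles) {
        List<String> alleles = new ArrayList<>();
        alleles.add(refAllele);
        if (!altAlleles.isEmpty()) {
            alleles.add(altAlleles.get(0)); // only the first alt allele is kept
        }
        return alleles;
    }
}
```

Because every allele is returned whether or not it is already stored, the caller is expected to rely on the database to ignore duplicates.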
-
createAlleleList
static List<String> createAlleleList(HaplotypeNode.VariantInfo vi)
Creates a list of alleles based on whether the variant info is a ref or a variant record.
-
getVariantData
static Tuple<String, VariantMappingData> getVariantData(String chrom, Map<String, Integer> alleleHashMap, HaplotypeNode.VariantInfo vi, Connection dbConn)
Returns a Tuple of the hash of (chrom, start_position, refAlleleID, altAlleleID) and a VariantMappingData record. The alleleHashMap should be pre-populated with the initial alleles. If the allele ID cannot be found in this hash map, the db is queried. Generally, about 2/3 of all alleles will be found in the hash map.
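The map-first, database-second lookup described above might look like the sketch below. It assumes the map is keyed by an allele hash string, and it stands in a plain function for the database query so the example needs no JDBC connection. The names `AlleleLookupSketch`, `alleleId`, and `dbLookup` are hypothetical and not part of the PHG API.

```java
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical illustration of "check the pre-populated map, then fall back to the db".
final class AlleleLookupSketch {

    // alleleHashMap: allele hash -> allele ID (assumed key/value meaning).
    // dbLookup: stand-in for a query against the alleles table.
    static int alleleId(String alleleHash,
                        Map<String, Integer> alleleHashMap,
                        ToIntFunction<String> dbLookup) {
        Integer cached = alleleHashMap.get(alleleHash);
        if (cached != null) {
            return cached;                      // common case: no db round trip
        }
        return dbLookup.applyAsInt(alleleHash); // miss: ask the database
    }
}
```

The sketch does not write the database result back into the map, since the documentation does not say whether the real method does so.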
-
getLongListOfVariantData
static List<Long> getLongListOfVariantData(List<Tuple<String, VariantMappingData>> variantMappingDataList, Map<String, Integer> hashIDMap)
Takes a list of (variant ID hash, VariantMappingData) tuples and creates a list of long values, each representing the variant context for one entry.
- Parameters:
hashIDMap
- Map of variant_hash to variantID
-
getLongFromVMData
static long getLongFromVMData(int vmID, VariantMappingData vmd)
Takes a variant mapping ID and a VariantMappingData object and encodes the ID, with additional data, as a long.
-
getLongRefRecord
static long getLongRefRecord(int refLen, int refDepth, int startPos)
Takes a reference length, depth, and start position on the chromosome. Returns an encoded long holding this information. Format: 1 bit = ref | 2 bytes 7 bits = refLength | 1 byte = refDepth | 4 bytes = position on chrom
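The field widths in this format add up to 64 bits (1 + 23 + 8 + 32). A minimal sketch of such packing is shown below. It assumes the fields are laid out from the most significant bit downward in the order listed, and that out-of-range lengths and depths are clamped; neither detail is confirmed by the documentation. All names are hypothetical.

```java
// Hypothetical sketch of the documented ref-record layout, most significant bit first:
// [63] ref flag | [62..40] refLength (23 bits) | [39..32] refDepth | [31..0] start position
final class RefRecordSketch {
    static final int MAX_REF_LEN = (1 << 23) - 1; // largest value that fits in 23 bits
    static final int MAX_DEPTH = 0xFF;            // largest value that fits in 1 byte

    static long packRefRecord(int refLen, int refDepth, int startPos) {
        // Clamping is an assumption; the real method may handle overflow differently.
        long len = Math.min(Math.max(refLen, 0), MAX_REF_LEN);
        long depth = Math.min(Math.max(refDepth, 0), MAX_DEPTH);
        return (1L << 63) | (len << 40) | (depth << 32) | (startPos & 0xFFFFFFFFL);
    }

    static boolean isRef(long record) { return (record >>> 63) == 1L; }
    static int refLength(long record) { return (int) ((record >>> 40) & MAX_REF_LEN); }
    static int refDepth(long record)  { return (int) ((record >>> 32) & MAX_DEPTH); }
    static int startPos(long record)  { return (int) (record & 0xFFFFFFFFL); }

    public static void main(String[] args) {
        long record = packRefRecord(150, 12, 1_000_000);
        System.out.println(isRef(record) + " " + refLength(record) + " "
                + refDepth(record) + " " + startPos(record));
    }
}
```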
-
getLongVariantRecord
static long getLongVariantRecord(int vmID, int refDepth, int altDepth, byte isIndel, int otherData)
Takes a variant_mapping ID, reference depth, alternate depth, an indicator of whether the variant is an indel, and a dummy int. Returns a long formatted with this data. Format: 4 bytes = variant_mapping table id | 1 byte = refDepth | 1 byte = altDepth | 1 byte = isIndel | 1 byte unused
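A sketch of this 8-byte layout follows, again assuming the fields run from the most significant byte downward and that depths are clamped to one byte. The dummy `otherData` argument is omitted because the final byte is documented as unused. Under this assumed layout a non-negative variant_mapping ID leaves the top bit clear, which would distinguish a variant record from a ref record. All names are hypothetical.

```java
// Hypothetical sketch of the documented variant-record layout, most significant byte first:
// [63..32] variant_mapping id | [31..24] refDepth | [23..16] altDepth | [15..8] isIndel | [7..0] unused
final class VariantRecordSketch {

    static long packVariantRecord(int vmID, int refDepth, int altDepth, byte isIndel) {
        // Clamping depths to one byte is an assumption, not documented behaviour.
        long ref = Math.min(Math.max(refDepth, 0), 0xFF);
        long alt = Math.min(Math.max(altDepth, 0), 0xFF);
        return ((vmID & 0xFFFFFFFFL) << 32)
                | (ref << 24)
                | (alt << 16)
                | ((isIndel & 0xFFL) << 8); // low byte stays zero (unused)
    }

    static int variantMappingId(long record) { return (int) (record >>> 32); }
    static int refDepth(long record)         { return (int) ((record >>> 24) & 0xFF); }
    static int altDepth(long record)         { return (int) ((record >>> 16) & 0xFF); }
    static boolean isIndel(long record)      { return ((record >>> 8) & 0xFF) != 0; }

    public static void main(String[] args) {
        long record = packVariantRecord(42, 10, 3, (byte) 1);
        System.out.println(variantMappingId(record) + " " + refDepth(record) + " "
                + altDepth(record) + " " + isIndel(record));
    }
}
```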
-
encodeVariantLongListToByteArray
static Array<byte> encodeVariantLongListToByteArray(List<Long> variantLongList)
Takes a List of Long objects and converts it to a Snappy-compressed byte stream.
-
decodeByteArrayToVariantLongList
static List<Long> decodeByteArrayToVariantLongList(Array<byte> encodedByteArray)
Takes a Snappy-compressed byte stream and decodes it into a List of Long objects.
-
longListToByteArray
static Array<byte> longListToByteArray(Collection<Long> ListOfLongs)
- Parameters:
ListOfLongs
- a List or Collection of Longs to be converted into a byte[] array
-
byteArrayToLongList
static List<Long> byteArrayToLongList(Array<byte> byteArray)
- Parameters:
byteArray
- an array of bytes to be converted into a list of longs
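The two conversions above (longListToByteArray and byteArrayToLongList) can be illustrated with `java.nio.ByteBuffer`. This sketch assumes 8 bytes per long in big-endian order, which is ByteBuffer's default; the byte order actually used by PHG is not stated here, and the Snappy compression step applied by the encode/decode methods is not shown. Class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical round-trip between a collection of longs and a raw byte array.
final class LongBytesSketch {

    static byte[] longsToBytes(Collection<Long> longs) {
        ByteBuffer buffer = ByteBuffer.allocate(longs.size() * Long.BYTES);
        for (long value : longs) {
            buffer.putLong(value); // big-endian by default (assumed byte order)
        }
        return buffer.array();
    }

    static List<Long> bytesToLongs(byte[] bytes) {
        if (bytes.length % Long.BYTES != 0) {
            throw new IllegalArgumentException("byte array length must be a multiple of 8");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        List<Long> longs = new ArrayList<>(bytes.length / Long.BYTES);
        while (buffer.hasRemaining()) {
            longs.add(buffer.getLong());
        }
        return longs;
    }
}
```

The length check in `bytesToLongs` is a defensive choice made for this sketch, so that a truncated array fails early rather than partway through decoding.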
-
findAlleleIDFromDB
static int findAlleleIDFromDB(String allele, Connection dbConn)
From an allele string, compute the hash value and search for a corresponding ID in the DB alleles table.
-