public class DBLoadingUtils
Common methods used by the postgres and sqlite dbs for loading and retrieving data from the PHG dbs. This is the place where encoding/decoding methods for table data should be stored. Authors: zrm22 and lcj34.
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class)
public enum DBLoadingUtils.AnchorType
public enum DBLoadingUtils.MethodType
public enum DBLoadingUtils.GenomeFileType
-
Field Summary
Fields (Modifier and Type, Field)
public final static String REGION_REFERENCE_RANGE_GROUP
public final static String INTER_REGION_REFERENCE_RANGE_GROUP
-
Method Summary
Methods (Modifier and Type, Method, Description)
static Connection connection(boolean createNew)
    Creates a database connection from the TASSEL ParameterCache. It is expected that only initial db loading methods will call this with "createNew" = true.
static Connection connection(String propertiesFile, boolean createNew)
    Creates a database connection given a properties file. It is expected that only initial db loading methods will call this with "createNew" = true.
static Connection connection(String host, String user, String password, String dbName, String type, boolean createNew)
    Creates a new database connection or returns a connection to an existing db. If createNew is false, this tries to connect and returns null if the db doesn't exist. NOTE for postgres: the user should never create a db whose name matches an existing db when both are lower-cased.
static Set<String> verifyIntervalRanges(String anchorFile)
static Array<byte> encodeSelectedVCFRegionsToByteArray(String fileName, boolean onlyVariants, boolean mergeRefRanges, Range<Position> interval)
static Array<byte> encodeVCFFileToByteArray(String fileName, boolean onlyVariants, boolean mergeRefRanges)
static Array<byte> encodeVariantContextStreamToByteArray(Stream<VariantContext> variantStream, boolean onlyVariants, boolean mergeRefRanges)
static Array<byte> encodeVariantContextListToByteArray(List<VariantContext> listOfVariants, boolean mergeRefRanges)
static List<VariantContext> decodeByteArrayToListOfVariantContext(Array<byte> encodedByteArray)
static Array<byte> encodeHapCountsArrayFromMultiset(Multiset<HaplotypeNode> perfectHitSet, Multiset<HaplotypeNode> exclusionHitSet)
    This method takes two multisets of HaplotypeNode objects: one indicating inclusion counts for a haplotype, the other indicating exclusion counts.
static Array<byte> encodeHapCountsArrayFromFile(String fileName)
static Array<Array<int>> decodeHapCountsArray(Array<byte> dataAsByteArray)
static Array<byte> encodePathArrayFromSet(Set<HaplotypeNode> paths)
static Array<byte> encodePathsFromIntArray(List<Integer> paths)
    This method takes a list of haplotype ids and compresses them to a byte array.
static Array<int> decodePathsArray(Array<byte> dataAsByteArray)
static Array<byte> encodePathArrayForMultipleLists(List<List<HaplotypeNode>> paths)
static List<List<Integer>> decodePathsForMultipleLists(Array<byte> dataAsByteArray)
static List<String> splitCigar(String cigarString)
static List<String> createInitialAlleles(int maxKmerLen)
    This method creates a list of allele strings based on the allele set of A,C,G,T,N. The size of the set will be 5 + 5^2 + 5^3 + ...
static String formatMethodParamsToJSON(Map<String, String> parameterList)
    This method takes a Map of parameterName to parameterValue, and formats them into a JSON string.
static Map<String, String> parseMethodJsonParamsToString(String methodDescription)
    Takes a passed method description string from a PHG db methods table entry, and formats the JSON key/value pairs into a Map for the user.
static List<Integer> createPathNodesForGameteGrp(String taxon, Connection conn, int gamete_grp_id)
    This method connects to a database, finds the haplotypes for a specific gamete group, and creates an ordered-by-ref-range list of haplotype ids.
static String getChecksumForFile(File file, String protocol)
static Array<byte> encodeHapidListToByteArray(List<Integer> hapidList)
static List<Integer> decodeHapidList(Array<byte> encodedByteArray)
-
-
Method Detail
-
connection
static Connection connection(boolean createNew)
Creates a database connection from the TASSEL ParameterCache. It is expected that only initial db loading methods will call this with "createNew" = true.
Parameters:
createNew - Indicates whether the request is to connect to an existing db or to create a new one with the specified name.
-
connection
static Connection connection(String propertiesFile, boolean createNew)
Creates a database connection given a properties file. It is expected that only initial db loading methods will call this with "createNew" = true.
Parameters:
propertiesFile - properties file
createNew - Indicates whether the request is to connect to an existing db or to create a new one with the specified name.
-
connection
static Connection connection(String host, String user, String password, String dbName, String type, boolean createNew)
Creates a new database connection or returns a connection to an existing db. If createNew is false, this tries to connect and returns null if the db doesn't exist. NOTE for postgres: the user should never create a db whose name matches an existing db when both are lower-cased. This will cause errors, as our db check verifies names in all lower case. To get a camel-case db name, the db must be created and accessed using a double-quoted name. This is likely to cause confusion, so this code defaults to postgres all-lowercase db names.
Parameters:
host - hostname
user - user id
password - password
dbName - database name
type - database type (sqlite or postgres)
createNew - if true, delete old db if it exists; create new db from PHG schema
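The lower-case caveat in the note above can be illustrated with a small sketch. This is not the PHG implementation: the class and method names below are hypothetical, and the set of existing names stands in for whatever the real code reads from the postgres catalog. It only shows why a requested name such as "MaizePHG" is treated as a clash with an existing "maizephg" when the comparison is done in lower case.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration only; not part of DBLoadingUtils. */
final class DbNameCheckSketch {

    /**
     * Returns true when the requested name equals an existing db name
     * after both are lower-cased (assumed to mirror an all-lower-case check).
     */
    static boolean clashesWithExisting(Set<String> existingDbNames, String requested) {
        String wanted = requested.toLowerCase(Locale.ROOT);
        for (String name : existingDbNames) {
            if (name.toLowerCase(Locale.ROOT).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of("maizephg", "sorghum_db");
        System.out.println(clashesWithExisting(existing, "MaizePHG")); // true
        System.out.println(clashesWithExisting(existing, "wheat"));    // false
    }
}
```

Sticking to all-lower-case names avoids the clash entirely, which matches the default described above.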
-
verifyIntervalRanges
static Set<String> verifyIntervalRanges(String anchorFile)
-
encodeSelectedVCFRegionsToByteArray
static Array<byte> encodeSelectedVCFRegionsToByteArray(String fileName, boolean onlyVariants, boolean mergeRefRanges, Range<Position> interval)
-
encodeVCFFileToByteArray
static Array<byte> encodeVCFFileToByteArray(String fileName, boolean onlyVariants, boolean mergeRefRanges)
-
encodeVariantContextStreamToByteArray
static Array<byte> encodeVariantContextStreamToByteArray(Stream<VariantContext> variantStream, boolean onlyVariants, boolean mergeRefRanges)
-
encodeVariantContextListToByteArray
static Array<byte> encodeVariantContextListToByteArray(List<VariantContext> listOfVariants, boolean mergeRefRanges)
-
decodeByteArrayToListOfVariantContext
static List<VariantContext> decodeByteArrayToListOfVariantContext(Array<byte> encodedByteArray)
-
encodeHapCountsArrayFromMultiset
static Array<byte> encodeHapCountsArrayFromMultiset(Multiset<HaplotypeNode> perfectHitSet, Multiset<HaplotypeNode> exclusionHitSet)
This method takes two multisets of HaplotypeNode objects: one indicating inclusion counts for a haplotype, the other indicating exclusion counts. These sets are on a per-taxon basis. The data will be written compressed to a byte array for storage in the PHG db haplotype_counts table. If indicated, the data will also be written to files.
-
encodeHapCountsArrayFromFile
static Array<byte> encodeHapCountsArrayFromFile(String fileName)
-
decodeHapCountsArray
static Array<Array<int>> decodeHapCountsArray(Array<byte> dataAsByteArray)
-
encodePathArrayFromSet
static Array<byte> encodePathArrayFromSet(Set<HaplotypeNode> paths)
-
encodePathsFromIntArray
static Array<byte> encodePathsFromIntArray(List<Integer> paths)
This method takes a list of haplotype ids and compresses them to a byte array.
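The general idea of packing a list of haplotype ids into a compressed byte array, and reversing it, can be sketched as follows. This is an assumption-laden illustration, not the PHG encoding: the byte layout (a count followed by 4-byte ints) and the use of java.util.zip are stand-ins chosen so the example runs with the JDK alone, and the class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration only; the real PHG byte layout may differ. */
final class PathCodecSketch {

    /** Writes [count, id0, id1, ...] as 4-byte ints, then deflates the buffer. */
    static byte[] encode(List<Integer> hapids) {
        ByteBuffer raw = ByteBuffer.allocate(Integer.BYTES * (hapids.size() + 1));
        raw.putInt(hapids.size());
        for (int id : hapids) {
            raw.putInt(id);
        }
        Deflater deflater = new Deflater();
        deflater.setInput(raw.array());
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates the bytes and reads the ids back in their original order. */
    static List<Integer> decode(byte[] encoded) {
        Inflater inflater = new Inflater();
        inflater.setInput(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid encoded path", e);
        } finally {
            inflater.end();
        }
        ByteBuffer raw = ByteBuffer.wrap(out.toByteArray());
        int count = raw.getInt();
        List<Integer> hapids = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            hapids.add(raw.getInt());
        }
        return hapids;
    }

    public static void main(String[] args) {
        List<Integer> path = List.of(101, 205, 309);
        System.out.println(decode(encode(path)).equals(path)); // true
    }
}
```

Storing the count first lets the decoder rebuild the list without relying on the decompressed length.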
-
decodePathsArray
static Array<int> decodePathsArray(Array<byte> dataAsByteArray)
-
encodePathArrayForMultipleLists
static Array<byte> encodePathArrayForMultipleLists(List<List<HaplotypeNode>> paths)
-
decodePathsForMultipleLists
static List<List<Integer>> decodePathsForMultipleLists(Array<byte> dataAsByteArray)
-
splitCigar
static List<String> splitCigar(String cigarString)
-
createInitialAlleles
static List<String> createInitialAlleles(int maxKmerLen)
This method creates a list of allele strings based on the allele set of A,C,G,T,N. The size of the set will be 5 + 5^2 + 5^3 + ... + 5^n, where "n" is the maxKmerLen passed in and "5^n" is 5 to the nth power. For example: if maxKmerLen = 3, the size of the initial allele list is 5 + 25 + 125 = 155; if maxKmerLen = 5, the size of the initial allele list is 5 + 25 + 125 + 625 + 3125 = 3905.
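The size arithmetic above can be checked with a short sketch that enumerates every string over A,C,G,T,N of length 1 through maxKmerLen. The class name and the ordering of the generated strings (shortest first, bases in A,C,G,T,N order) are assumptions for illustration; the actual method may order its list differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the PHG implementation. */
final class InitialAllelesSketch {
    private static final char[] BASES = {'A', 'C', 'G', 'T', 'N'};

    /** Builds every string over A,C,G,T,N with length 1..maxKmerLen, shortest first. */
    static List<String> createInitialAlleles(int maxKmerLen) {
        List<String> alleles = new ArrayList<>();
        List<String> previous = new ArrayList<>();
        previous.add(""); // seed: the empty prefix
        for (int len = 1; len <= maxKmerLen; len++) {
            List<String> current = new ArrayList<>(previous.size() * BASES.length);
            for (String prefix : previous) {
                for (char base : BASES) {
                    current.add(prefix + base);
                }
            }
            alleles.addAll(current);
            previous = current; // strings of length len feed the next round
        }
        return alleles;
    }

    /** Closed form of 5 + 5^2 + ... + 5^n, which is (5^(n+1) - 5) / 4. */
    static long expectedSize(int maxKmerLen) {
        long power = 1;
        for (int i = 0; i <= maxKmerLen; i++) {
            power *= 5; // ends at 5^(n+1)
        }
        return (power - 5) / 4;
    }

    public static void main(String[] args) {
        System.out.println(createInitialAlleles(3).size()); // 155
        System.out.println(createInitialAlleles(5).size()); // 3905
    }
}
```

The list grows by a factor of five with each extra base, so maxKmerLen should stay small.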
-
formatMethodParamsToJSON
static String formatMethodParamsToJSON(Map<String, String> parameterList)
This method takes a Map of parameterName to parameterValue, and formats them into a JSON string. This string will be used by the calling method as the description entry for the PHG methods table.
-
parseMethodJsonParamsToString
static Map<String, String> parseMethodJsonParamsToString(String methodDescription)
Takes a passed method description string from a PHG db methods table entry, and formats the JSON key/value pairs into a Map for the user. If the string does not parse to JSON, a single map entry of "notes":methodDescription will be created and returned.
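The format-then-parse round trip, including the "notes" fallback, might look like the sketch below. It is a simplified stand-in, not the PHG code: it handles only a flat JSON object whose keys and values are strings, escapes only quotes and backslashes, and uses a regular expression instead of a JSON library so that it runs with the JDK alone. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; handles flat string-to-string JSON objects. */
final class MethodParamsSketch {
    // A JSON string: quoted text in which a backslash escapes the next character.
    private static final String STR = "\"((?:[^\"\\\\]|\\\\.)*)\"";
    private static final String PAIR = STR + "\\s*:\\s*" + STR;
    private static final Pattern PAIR_PATTERN = Pattern.compile(PAIR);
    private static final Pattern OBJECT_PATTERN = Pattern.compile(
            "\\s*\\{\\s*(?:" + PAIR + "(?:\\s*,\\s*" + PAIR + ")*)?\\s*\\}\\s*");

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    private static String unescape(String s) {
        return s.replaceAll("\\\\(.)", "$1");
    }

    /** Formats parameterName to parameterValue pairs as a one-line JSON object. */
    static String format(Map<String, String> params) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    /** Parses a description; anything that is not a flat JSON object becomes "notes". */
    static Map<String, String> parse(String description) {
        Map<String, String> result = new LinkedHashMap<>();
        if (description != null && OBJECT_PATTERN.matcher(description).matches()) {
            Matcher m = PAIR_PATTERN.matcher(description);
            while (m.find()) {
                result.put(unescape(m.group(1)), unescape(m.group(2)));
            }
        } else {
            result.put("notes", description);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("minTaxa", "2");
        params.put("note", "say \"hi\"");
        System.out.println(parse(format(params)).equals(params)); // true
        System.out.println(parse("created by hand"));             // {notes=created by hand}
    }
}
```

The fallback keeps free-text descriptions readable through the same Map interface, so callers don't need a separate code path for them.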
-
createPathNodesForGameteGrp
static List<Integer> createPathNodesForGameteGrp(String taxon, Connection conn, int gamete_grp_id)
This method connects to a database, finds the haplotypes for a specific gamete group, and creates an ordered-by-ref-range list of haplotype ids. The intended use is for path creation for Assembly and WGS input.
-
getChecksumForFile
static String getChecksumForFile(File file, String protocol)
-
encodeHapidListToByteArray
static Array<byte> encodeHapidListToByteArray(List<Integer> hapidList)
-
decodeHapidList
static List<Integer> decodeHapidList(Array<byte> encodedByteArray)