public class KmerUtils
public static long dnaToKmerLong(@NotNull java.lang.String kmerAsDNA, int kmerSize)
Checks that the length of kmerAsDNA matches kmerSize, then encodes the sequence as a Long.
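The check-then-encode pattern described above can be sketched as follows. This is only an illustration, not the library's implementation: the class name KmerEncodingSketch, the method name encode, and the 2-bit code (A=0, C=1, G=2, T=3) are all assumptions, since the source does not state which encoding is used.

```java
// Hypothetical sketch; KmerEncodingSketch and encode are illustrative names.
class KmerEncodingSketch {

    // Assumed 2-bit code per base: A=0, C=1, G=2, T=3 (not confirmed by the source).
    static long encode(String kmerAsDNA, int kmerSize) {
        // Step 1: check that the sequence length matches the expected kmer size.
        if (kmerAsDNA.length() != kmerSize) {
            throw new IllegalArgumentException(
                "Expected a kmer of length " + kmerSize + " but got " + kmerAsDNA.length());
        }
        // At 2 bits per base, a 64-bit long can hold at most 32 bases.
        if (kmerSize > 32) {
            throw new IllegalArgumentException("kmerSize must be at most 32");
        }
        // Step 2: shift in 2 bits per base, first base in the highest position.
        long encoded = 0L;
        for (int i = 0; i < kmerSize; i++) {
            long code;
            switch (kmerAsDNA.charAt(i)) {
                case 'A': code = 0L; break;
                case 'C': code = 1L; break;
                case 'G': code = 2L; break;
                case 'T': code = 3L; break;
                default:
                    throw new IllegalArgumentException(
                        "Unsupported base: " + kmerAsDNA.charAt(i));
            }
            encoded = (encoded << 2) | code;
        }
        return encoded;
    }
}
```

Under these assumptions, encode("ACGT", 4) packs the bits 00 01 10 11 and returns 27.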
public static long byteArrayToKmerLong(@NotNull kotlin.Array[] kmerAsByte, int kmerSize)
Checks that the length of kmerAsByte matches kmerSize, then encodes the byte sequence as a Long.
public static void exportKmerToHapIdMapToTextFile(@NotNull java.lang.String fileName, @NotNull it.unimi.dsi.fastutil.longs.Long2ObjectMap<kotlin.Array[]> kmerMap)
Writes the Long2ObjectMap to a text file in the format: kmerAsLong\tHapId1\tHapId2\tHapId3...
@NotNull public static it.unimi.dsi.fastutil.longs.Long2ObjectMap<kotlin.Array[]> importKmerToHapIdMapFromTextFile(@NotNull java.lang.String fileName)
Imports the KmerToHapIdMap from a text file in the format: kmerAsLong\tHapId1\tHapId2\tHapId3...
public static void exportKmerToIntMap(@NotNull java.lang.String fileName, @NotNull it.unimi.dsi.fastutil.longs.Long2IntMap kmerMap)
Exports the kmer-to-ID map (a Long2IntMap) to a file.
@NotNull public static it.unimi.dsi.fastutil.longs.Long2IntMap importKmerToIntMap(@NotNull java.lang.String fileName)
public static void exportPurgeArray(@NotNull java.lang.String fileName, @NotNull KmerToRefRangeIdToPurgeArray purgeArray)
@NotNull public static kotlin.Pair<java.lang.Long,kotlin.Array[]> parseKmerLine(@NotNull java.lang.String currentString)
Parses a kmer line: splits currentString, converts the split values to integers, and adds them to an array.
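A minimal sketch of reading and writing one line in the kmerAsLong\tHapId1\tHapId2... format is shown below. The names KmerLineSketch, parseLine and formatLine are hypothetical, and the use of Map.Entry in place of kotlin.Pair and of int[] for the hap IDs are assumptions made to keep the example in plain Java.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical sketch; not the library's parseKmerLine.
class KmerLineSketch {

    // Splits a tab-delimited line into the kmer key and its hap IDs.
    static Map.Entry<Long, int[]> parseLine(String line) {
        String[] fields = line.split("\t");
        // First field: the kmer encoded as a long.
        long kmer = Long.parseLong(fields[0]);
        // Remaining fields: hap IDs, assumed here to be integers.
        int[] hapIds = new int[fields.length - 1];
        for (int i = 1; i < fields.length; i++) {
            hapIds[i - 1] = Integer.parseInt(fields[i]);
        }
        return new AbstractMap.SimpleImmutableEntry<>(kmer, hapIds);
    }

    // Inverse of parseLine: joins the kmer and hap IDs with tabs.
    static String formatLine(long kmer, int[] hapIds) {
        StringBuilder sb = new StringBuilder(Long.toString(kmer));
        for (int hapId : hapIds) {
            sb.append('\t').append(hapId);
        }
        return sb.toString();
    }
}
```

With this sketch, formatLine(27L, new int[] {1, 5, 9}) produces the line "27\t1\t5\t9" (with real tab characters), and parseLine recovers the same key and IDs from it.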
public static void exportKmerToHapIdMapToBinaryFile(@NotNull java.lang.String fileName, @NotNull it.unimi.dsi.fastutil.longs.Long2ObjectMap<kotlin.Array[]> kmerMap)
@NotNull public static it.unimi.dsi.fastutil.longs.Long2ObjectMap<kotlin.Array[]> importKmerToHapIdMapFromBinaryFile(@NotNull java.lang.String fileName)
@NotNull public static KmerToRefRangeIdToPurgeArray markKmersForPurgeUsingHammingDistance(@NotNull KmerMap kmerToRefRange, int kmerSize, int minAllowedHammingDist, boolean verboseLogging)
Marks kmers to purge using Hamming distance.
The input is a KmerMap that maps kmer -> RefRangeId.
Returns the data class KmerToRefRangeIdToPurgeArray, which holds 3 primitive arrays. A swapper is created, and the Hamming calculation is then run twice: once for the first half of the kmer and once for the second half.
public static int hammingDistance(long kmer1, long kmer2)
Computes the paired-bit Hamming distance between two encoded kmers.
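One common way to compute a paired-bit Hamming distance is sketched below. It assumes each base occupies 2 bits of the long, which the source does not confirm, and the class name PairedBitHammingSketch is hypothetical. The idea is to XOR the two kmers, collapse each 2-bit pair to a single flag bit, and count the flags.

```java
// Hypothetical sketch; assumes a 2-bit-per-base encoding.
class PairedBitHammingSketch {

    // Selects the low bit of every 2-bit pair: 0101...0101.
    private static final long LOW_BIT_OF_EACH_PAIR = 0x5555555555555555L;

    static int hammingDistance(long kmer1, long kmer2) {
        // Bits that differ between the two encoded kmers.
        long diff = kmer1 ^ kmer2;
        // A base differs if either bit of its pair differs, so OR the high bit
        // of each pair down onto the low bit and keep only the low bits.
        long mismatchFlags = (diff | (diff >>> 1)) & LOW_BIT_OF_EACH_PAIR;
        // Each remaining set bit marks one mismatching base.
        return Long.bitCount(mismatchFlags);
    }
}
```

Counting pairs rather than raw bits matters because two bases can differ in both bits (for example 00 versus 11) and should still count as a single mismatch.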
public static void compareKmerLoop(@NotNull kotlin.Array[] kmerArray, @NotNull kotlin.Array[] refRangeIdArray, @NotNull kotlin.Array[] purgeArray, long mask, int minAllowedHammingDist, @NotNull it.unimi.dsi.fastutil.Swapper kmerRefRangePurgeSwapper)
Core function for the Hamming distance comparison.
First creates a comparator and sorts the arrays by the masked comparison. It then loops through each kmer and, for the kmers that match its masked version, compares Hamming distances. If the ref range IDs differ and the Hamming distance is below minAllowedHammingDist, the entries are marked to be purged.
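The sort-by-mask-then-compare loop can be sketched roughly as follows. This is a simplified illustration rather than the library's code: it sorts an index array with the standard library instead of sorting the three arrays in place with a fastutil Swapper, it assumes long[], int[] and boolean[] for the kmer, ref range ID and purge arrays, and the names MaskedCompareSketch, markForPurge and pairedHamming are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch; swaps in an index sort for the in-place Swapper sort.
class MaskedCompareSketch {

    // Paired-bit Hamming distance, assuming 2 bits per base.
    static int pairedHamming(long a, long b) {
        long diff = a ^ b;
        return Long.bitCount((diff | (diff >>> 1)) & 0x5555555555555555L);
    }

    static void markForPurge(long[] kmers, int[] refRangeIds, boolean[] purge,
                             long mask, int minAllowedHammingDist) {
        // Sort indices so that kmers sharing the same masked value are adjacent.
        Integer[] order = new Integer[kmers.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Long.compare(kmers[a] & mask, kmers[b] & mask));

        int start = 0;
        while (start < order.length) {
            // Find the run of kmers that match on the masked bits.
            long key = kmers[order[start]] & mask;
            int end = start + 1;
            while (end < order.length && (kmers[order[end]] & mask) == key) {
                end++;
            }
            // Compare every pair inside the run.
            for (int i = start; i < end; i++) {
                for (int j = i + 1; j < end; j++) {
                    int a = order[i];
                    int b = order[j];
                    boolean differentRange = refRangeIds[a] != refRangeIds[b];
                    boolean tooClose =
                        pairedHamming(kmers[a], kmers[b]) < minAllowedHammingDist;
                    if (differentRange && tooClose) {
                        purge[a] = true;
                        purge[b] = true;
                    }
                }
            }
            start = end;
        }
    }
}
```

Grouping by the masked value keeps the pairwise comparison inside each run instead of across the whole array. Running the loop once with a mask over the first half of the kmer and once with a mask over the second half lets a pair that differs only within one half be caught by the pass that masks the other half.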