public class Minimap2Utils
@NotNull public static java.util.Map<java.lang.Integer,net.maizegenetics.pangenome.api.ReferenceRange> getHapToRefRangeMap(@NotNull HaplotypeGraph graph)
Function to create a HapId to RefRange mapping for all of the HapIds in a graph
@NotNull public static java.util.Map<java.lang.Integer,java.lang.Integer> getHapIdToSequenceLength(@NotNull HaplotypeGraph graph)
@NotNull public static java.util.Map<java.lang.Integer,java.util.Map> getRefRangeToHapidMap(@NotNull HaplotypeGraph graph)
public static int getHaplotypeListIdForGraph(@NotNull HaplotypeGraph graph, @NotNull PHGdbAccess phgAccessor)
public static boolean filterRead(@NotNull htsjdk.samtools.SAMRecord currentSamRecord, boolean pairedEnd, boolean clippingFilter)
@NotNull public static ReadMappingKeyFileParsed loadInReadMappingKeyFile(@NotNull java.lang.String keyFileName, @NotNull ReadMappingInputFileFormat inputFileFormat, @NotNull java.lang.String inputFileDir, @NotNull java.lang.String methodName)
@NotNull public static htsjdk.samtools.SamReader loadSAMFileIntoSAMReader(@NotNull java.lang.String fileName)
Function to load a SAM file into a SamReader. This could be removed, but it is kept because reading from a file may be needed in the future.
public static void runMinimapFromKeyFile(@NotNull java.lang.String minimapLocation, @NotNull java.lang.String keyFileName, @NotNull java.lang.String inputFileDir, @NotNull java.lang.String referenceFile, @Nullable HaplotypeGraph graph, double maxRefRangeError, @NotNull java.lang.String methodName, @Nullable java.lang.String methodDescription, @NotNull java.util.Map<java.lang.String,java.lang.String> pluginParams, @NotNull java.lang.String outputDebugReadMappingDir, boolean outputSecondaryMappingStats, int maxSecondary, @NotNull ReadMappingInputFileFormat inputFileFormat, @NotNull java.lang.String fParameter, boolean isTestMethod, boolean updateDB, boolean runWithoutGraph, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap, @NotNull java.util.Map<java.lang.Integer,java.lang.Integer> hapIdToLengthMap, @NotNull java.util.Map<java.lang.Integer,? extends java.util.Map<java.lang.Integer,java.lang.Integer>> refRangeToHapIdMap, @NotNull java.lang.String inputFileName)
Method to run minimap2 using information provided by the Key file.
This assumes that the key file has a column named filename and, if paired end is requested, a column named filename2.
There are additional pieces of information that will also be provided by the keyfile, but will be implemented once the DB changes are implemented.
public static void outputKeyFiles(@NotNull java.lang.String keyFileName, @NotNull java.util.List<? extends java.util.List<java.lang.String>> keyFileLines, int flowcellCol, @NotNull PHGdbAccess phg, int taxonCol, @NotNull java.lang.String methodName, @NotNull java.util.Map<java.lang.String,java.lang.Integer> keyFileColumnNameMap)
Function to output mapping id and path key files to be used by the path finding.
public static void loadReadMappingsToDB(@NotNull java.util.Map<java.util.List,java.lang.Integer> hapIdMapping, @NotNull java.lang.String taxon, @NotNull java.lang.String fileGroupName, @NotNull java.util.Map<java.lang.String,java.lang.String> pluginParams, @Nullable java.lang.String methodDescription, @NotNull PHGdbAccess phg, @NotNull java.lang.String methodName, @NotNull java.util.Map<net.maizegenetics.pangenome.hapCalling.KeyFileUniqueRecord,java.lang.Integer> keyFileRecordsToMappingId, @NotNull KeyFileUniqueRecord keyFileRecord, @NotNull java.lang.String outputDebugReadMappingDir, boolean isTestMethod, int hapListId, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap, @NotNull java.util.Map<java.lang.Integer,? extends java.util.Map<java.lang.Integer,java.lang.Integer>> refRangeToHapIdMap)
Function to load the Read Mappings to the DB. This will encode the ReadMappings and then load them using the provided information.
@NotNull public static kotlin.Pair<java.util.Map,java.util.List> readInKeyFile(@NotNull java.lang.String fileName)
Function to read in the key file. The first of the pair is the column mapping and the second is a 2-d list.
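The shape of that return value can be illustrated with a minimal sketch. It assumes a tab-delimited key file whose first line is the header; the source does not state the delimiter. KeyFileSketch, ParsedKeyFile and parse are hypothetical names used only for this example and are not part of Minimap2Utils.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the PHG implementation.
class KeyFileSketch {

    /** Holds the two parts of the result: column name -> index, and the data rows. */
    public static final class ParsedKeyFile {
        public final Map<String, Integer> columnIndex;
        public final List<List<String>> rows;

        ParsedKeyFile(Map<String, Integer> columnIndex, List<List<String>> rows) {
            this.columnIndex = columnIndex;
            this.rows = rows;
        }
    }

    /** Parses key file lines; assumes tab-delimited text with a header on the first line. */
    public static ParsedKeyFile parse(List<String> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("Key file has no header line");
        }
        // Map each header name to its column position.
        String[] header = lines.get(0).split("\t", -1);
        Map<String, Integer> columnIndex = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            columnIndex.put(header[i], i);
        }
        // Every remaining non-empty line becomes one row of the 2-d list.
        List<List<String>> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            if (line.isEmpty()) {
                continue;
            }
            rows.add(new ArrayList<>(Arrays.asList(line.split("\t", -1))));
        }
        return new ParsedKeyFile(columnIndex, rows);
    }
}
```

A caller would look up a column position by name (for example filename) in columnIndex and then use that index to read the value from each row.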
public static void verifyNoDuplicatesInKeyFile(@NotNull java.util.List<net.maizegenetics.pangenome.hapCalling.KeyFileUniqueRecord> keyFileRecords)
Function to verify that there are no duplicate entries in the key file. This is only meant to let the user know if there is a duplicate.
public static boolean isKeyEntryInDir(@NotNull NonExistentClass fileNames, @NotNull java.util.List<java.lang.String> currentKeyRecord, int fileCol1, int fileCol2)
Method to check whether files listed in the key file are missing from the directory. If they are, that is OK; we expect the key file to have more entries than the directory.
@NotNull public static htsjdk.samtools.SamReader setupMinimapRun(@NotNull java.lang.String minimapLocation, @NotNull java.lang.String referenceFile, @NotNull java.lang.String firstFastq, @NotNull java.lang.String secondFastq, int maxSecondary, @NotNull java.lang.String fParameter)
Function that sets up the paired or single end minimap commands and builds the SamReader
The SamReader is used later to pick optimal hits from the alignments.
If secondFastq is "", a single-end run is assumed.
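The single-end versus paired-end switch can be sketched as a command builder. This is only an illustration under stated assumptions: the source does not say which minimap2 flags PHG passes, so the use of -a (SAM output), -x sr (short-read preset), --secondary=yes, -N (maximum secondary alignments) and -f (repetitive-minimizer filter) is a guess at a plausible command line. MinimapCommandSketch and buildCommand are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; the flags chosen here are assumptions.
class MinimapCommandSketch {

    /** Builds a minimap2 argument list; an empty secondFastq means single end. */
    public static List<String> buildCommand(String minimapLocation, String referenceFile,
                                            String firstFastq, String secondFastq,
                                            int maxSecondary, String fParameter) {
        List<String> cmd = new ArrayList<>();
        cmd.add(minimapLocation);
        cmd.add("-a");                 // write alignments as SAM
        cmd.add("-x");
        cmd.add("sr");                 // short-read preset (assumption)
        cmd.add("--secondary=yes");    // keep secondary alignments
        cmd.add("-N");
        cmd.add(Integer.toString(maxSecondary));
        cmd.add("-f");
        cmd.add(fParameter);
        cmd.add(referenceFile);
        cmd.add(firstFastq);
        if (!secondFastq.isEmpty()) {
            cmd.add(secondFastq);      // second file only for paired-end runs
        }
        return cmd;
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder, with the process output stream wrapped by a SAM reader; that wiring is left out here because the source does not describe it.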
@NotNull public static java.util.Map<java.util.List,java.lang.Integer> scoreSamFileCountHapSetHits(@NotNull htsjdk.samtools.SamReader samReader, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap, double maxRefRangeError, boolean pairedEnd, @NotNull java.lang.String outputDebugFile, @NotNull java.util.Map<java.lang.Integer,java.lang.Integer> hapIdToLengthMap)
Function to score SAM records. This will output a Map keyed by the hapId hit set.
For each subset of hapids we return a count of how many reads hit that exact subset. This mapping is used in the HMM.
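The counting step described above can be sketched in isolation. The example assumes the set of hapIds hit by each read has already been determined, and it uses a sorted list as the map key so that the same subset always produces the same key; whether the real method sorts its keys is not stated in the source. HapSetCountSketch and countHapSetHits are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not the PHG implementation.
class HapSetCountSketch {

    /** Counts how many reads hit each exact subset of hapIds. */
    public static Map<List<Integer>, Integer> countHapSetHits(List<Set<Integer>> hitsPerRead) {
        Map<List<Integer>, Integer> counts = new HashMap<>();
        for (Set<Integer> hits : hitsPerRead) {
            // Sort so that {1, 2} and {2, 1} map to the same key (assumption).
            List<Integer> key = new ArrayList<>(hits);
            Collections.sort(key);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

Two reads that hit hapIds 1 and 2 and one read that hits only hapId 3 would give a count of 2 for the key [1, 2] and a count of 1 for the key [3].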
public static void exportHapIdStats(@NotNull java.lang.String outputDebugFile, @NotNull java.util.Map<java.lang.Integer,net.maizegenetics.pangenome.hapCalling.HapIdStats> hapIdToStatMap, @NotNull java.util.Map<java.lang.Integer,java.lang.Integer> hapIdToLengthMap)
Function to export the HapIdStats file. If outputDebugFile is not specified, this will do nothing.
public static void attemptToAddSAMRecordToBestReadMap(boolean pairedEnd, @NotNull htsjdk.samtools.SAMRecord currentSamRecord, @NotNull java.lang.String readName, @NotNull java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> bestReadMap)
Function to try to add a SAM record to bestReadMap. bestReadMap holds the currently best known set of reads, which have the best (lowest) NM. If the current record is better, we replace the old entry with a new one. If the current record is the same, we add it to the list. If the current record is worse, we ignore it.
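The replace, append, or ignore rule can be sketched without htsjdk. This simplified example keys the map by read name alone and represents each alignment by a contig name and its NM (edit distance) value; the real method keys by a kotlin.Pair and stores SAMRecord-based BestAlignmentGroup objects, whose fields are not shown in the source. BestReadMapSketch, BestGroup and attemptToAdd are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the PHG implementation.
class BestReadMapSketch {

    /** The best (lowest) edit distance seen for a read and every hit that achieves it. */
    public static final class BestGroup {
        public final int bestEditDistance;
        public final List<String> hits = new ArrayList<>();

        BestGroup(int bestEditDistance) {
            this.bestEditDistance = bestEditDistance;
        }
    }

    /** Applies the better / same / worse rule to one alignment. */
    public static void attemptToAdd(Map<String, BestGroup> bestReadMap,
                                    String readName, String contig, int editDistance) {
        BestGroup current = bestReadMap.get(readName);
        if (current == null || editDistance < current.bestEditDistance) {
            // Better (or first) alignment: replace the old entry with a new one.
            BestGroup replacement = new BestGroup(editDistance);
            replacement.hits.add(contig);
            bestReadMap.put(readName, replacement);
        } else if (editDistance == current.bestEditDistance) {
            // Equally good alignment: add it to the existing list.
            current.hits.add(contig);
        }
        // Worse alignment: ignored.
    }
}
```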
public static void addBestReadMapToHapIdMultiset(@NotNull java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> bestReadMap, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap, double maxRefRangeError, boolean pairedEnd, @NotNull HapIdMultiset hapIdMultiset, @NotNull java.util.Map<java.lang.Integer,net.maizegenetics.pangenome.hapCalling.HapIdStats> hapIdToStatMap)
@NotNull public static java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> keepHapIdsForSingleRefRange(@NotNull java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> bestHitMap, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRangeMap, double maxRefRangeError)
Function to keep hapids if they are hitting multiple reference ranges too frequently. If there is a little bit of noise it can be filtered.
public static boolean spansSingleRefRange(@NotNull java.util.Map.Entry<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> currentMapping, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRangeMap)
Function to check to see if the reads only hit one reference range
@NotNull public static kotlin.Pair<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> removeExtraRefRangeHits(@NotNull java.util.Map.Entry<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> currentMapping, @NotNull java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRangeMap, double maxRefRangeError)
Function to remove any reads which hit more than one reference range ambiguously.
public static int findBestRefRange(@NotNull com.google.common.collect.Multimap<java.lang.Integer,java.lang.Integer> refRangeToIdMapping, double maxRefRangeError)
Function to check whether the read hits multiple reference ranges less than maxRefRangeError. Basically, this is to filter out any reads which hit multiple reference ranges equally well.
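One possible reading of this check is sketched below. It is an assumption, not the documented behavior: the reference range with the most hapId hits is treated as the best one, the error is taken to be the fraction of hits that fall outside it, and -1 is returned when that fraction is not less than maxRefRangeError. The sketch uses a plain Map of lists in place of the Guava Multimap in the real signature. RefRangeSketch is a hypothetical name, and the -1 sentinel is also an assumption.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; the scoring rule here is an assumption.
class RefRangeSketch {

    /** Returns the id of the dominant reference range, or -1 if the read is too ambiguous. */
    public static int findBestRefRange(Map<Integer, List<Integer>> refRangeToHapIds,
                                       double maxRefRangeError) {
        int totalHits = 0;
        int bestRefRange = -1;
        int bestCount = 0;
        for (Map.Entry<Integer, List<Integer>> entry : refRangeToHapIds.entrySet()) {
            int count = entry.getValue().size();
            totalHits += count;
            if (count > bestCount) {
                bestCount = count;
                bestRefRange = entry.getKey();
            }
        }
        if (totalHits == 0) {
            return -1; // nothing was hit
        }
        // Fraction of hits that landed outside the best reference range.
        double error = 1.0 - ((double) bestCount / totalHits);
        return error < maxRefRangeError ? bestRefRange : -1;
    }
}
```

Under this reading, a read with three hits in one reference range and one hit in another has an error of 0.25 and is kept when maxRefRangeError is 0.3, while a read split evenly across two ranges has an error of 0.5 and is rejected.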