Minimap2Utils

net.maizegenetics.pangenome.hapCalling.Minimap2Utils

```
public class Minimap2Utils
```

Method Detail

getHapToRefRangeMap

@NotNull
public static java.util.Map<java.lang.Integer,net.maizegenetics.pangenome.api.ReferenceRange> getHapToRefRangeMap(@NotNull
                                                                                                                           HaplotypeGraph graph)

Function to create a HapId to RefRange mapping for all of the HapIds in a graph.
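The shape of this mapping can be illustrated without a HaplotypeGraph. The following is a minimal sketch, not the library's implementation: the class `HapToRefRangeSketch`, the method `invert`, and the use of plain Integer IDs in place of ReferenceRange objects are all hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: reference ranges are plain Integer IDs here,
// and the graph is replaced by a refRangeId -> hapIds map.
final class HapToRefRangeSketch {
    // Inverts refRangeId -> hapIds into hapId -> refRangeId.
    static Map<Integer, Integer> invert(Map<Integer, List<Integer>> refRangeToHapIds) {
        Map<Integer, Integer> hapToRefRange = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : refRangeToHapIds.entrySet()) {
            for (Integer hapId : entry.getValue()) {
                hapToRefRange.put(hapId, entry.getKey());
            }
        }
        return hapToRefRange;
    }
}
```

A lookup on the result answers "which reference range does this hapId belong to?" in constant time, which is the kind of query the scoring functions on this page need for every alignment.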

getHapIdToSequenceLength

@NotNull
public static java.util.Map<java.lang.Integer,java.lang.Integer> getHapIdToSequenceLength(@NotNull
                                                                                                   HaplotypeGraph graph)

getRefRangeToHapidMap

@NotNull
public static java.util.Map<java.lang.Integer,java.util.Map> getRefRangeToHapidMap(@NotNull
                                                                                            HaplotypeGraph graph)

getHaplotypeListIdForGraph

public static int getHaplotypeListIdForGraph(@NotNull
                                             HaplotypeGraph graph,
                                             @NotNull
                                             PHGdbAccess phgAccessor)

filterRead

public static boolean filterRead(@NotNull
                                 htsjdk.samtools.SAMRecord currentSamRecord,
                                 boolean pairedEnd,
                                 boolean clippingFilter)

loadInReadMappingKeyFile

@NotNull
public static ReadMappingKeyFileParsed loadInReadMappingKeyFile(@NotNull
                                                                         java.lang.String keyFileName,
                                                                         @NotNull
                                                                         ReadMappingInputFileFormat inputFileFormat,
                                                                         @NotNull
                                                                         java.lang.String inputFileDir,
                                                                         @NotNull
                                                                         java.lang.String methodName)

loadSAMFileIntoSAMReader

@NotNull
public static htsjdk.samtools.SamReader loadSAMFileIntoSAMReader(@NotNull
                                                                          java.lang.String fileName)

Function to load the SAM file into a SAM reader. This could be removed, but it is kept because reading from a file may be needed in the future.

runMinimapFromKeyFile

public static void runMinimapFromKeyFile(@NotNull
                                         java.lang.String minimapLocation,
                                         @NotNull
                                         java.lang.String keyFileName,
                                         @NotNull
                                         java.lang.String inputFileDir,
                                         @NotNull
                                         java.lang.String referenceFile,
                                         @Nullable
                                         HaplotypeGraph graph,
                                         double maxRefRangeError,
                                         @NotNull
                                         java.lang.String methodName,
                                         @Nullable
                                         java.lang.String methodDescription,
                                         @NotNull
                                         java.util.Map<java.lang.String,java.lang.String> pluginParams,
                                         @NotNull
                                         java.lang.String outputDebugReadMappingDir,
                                         boolean outputSecondaryMappingStats,
                                         int maxSecondary,
                                         @NotNull
                                         ReadMappingInputFileFormat inputFileFormat,
                                         @NotNull
                                         java.lang.String fParameter,
                                         boolean isTestMethod,
                                         boolean updateDB,
                                         boolean runWithoutGraph,
                                         @NotNull
                                         java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap,
                                         @NotNull
                                         java.util.Map<java.lang.Integer,java.lang.Integer> hapIdToLengthMap,
                                         @NotNull
                                         java.util.Map<java.lang.Integer,? extends java.util.Map<java.lang.Integer,java.lang.Integer>> refRangeToHapIdMap,
                                         @NotNull
                                         java.lang.String inputFileName)

Method to run minimap2 using information provided by the Key file.

This assumes that the key file has a column named filename and, if paired-end mapping is requested, a column named filename2.

The key file will also provide additional pieces of information, but support for them will be added once the DB changes are implemented.

outputKeyFiles

public static void outputKeyFiles(@NotNull
                                  java.lang.String keyFileName,
                                  @NotNull
                                  java.util.List<? extends java.util.List<java.lang.String>> keyFileLines,
                                  int flowcellCol,
                                  @NotNull
                                  PHGdbAccess phg,
                                  int taxonCol,
                                  @NotNull
                                  java.lang.String methodName,
                                  @NotNull
                                  java.util.Map<java.lang.String,java.lang.Integer> keyFileColumnNameMap)

Function to output the mapping ID and path key files used by path finding.

loadReadMappingsToDB

public static void loadReadMappingsToDB(@NotNull
                                        java.util.Map<java.util.List,java.lang.Integer> hapIdMapping,
                                        @NotNull
                                        java.lang.String taxon,
                                        @NotNull
                                        java.lang.String fileGroupName,
                                        @NotNull
                                        java.util.Map<java.lang.String,java.lang.String> pluginParams,
                                        @Nullable
                                        java.lang.String methodDescription,
                                        @NotNull
                                        PHGdbAccess phg,
                                        @NotNull
                                        java.lang.String methodName,
                                        @NotNull
                                        java.util.Map<net.maizegenetics.pangenome.hapCalling.KeyFileUniqueRecord,java.lang.Integer> keyFileRecordsToMappingId,
                                        @NotNull
                                        KeyFileUniqueRecord keyFileRecord,
                                        @NotNull
                                        java.lang.String outputDebugReadMappingDir,
                                        boolean isTestMethod,
                                        int hapListId,
                                        @NotNull
                                        java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap,
                                        @NotNull
                                        java.util.Map<java.lang.Integer,? extends java.util.Map<java.lang.Integer,java.lang.Integer>> refRangeToHapIdMap)

Function to load the Read Mappings to the DB. This will encode the ReadMappings and then load them in, given the provided information.

readInKeyFile

@NotNull
public static kotlin.Pair<java.util.Map,java.util.List> readInKeyFile(@NotNull
                                                                               java.lang.String fileName)

Function to read in the key file. The first element of the pair is the column mapping and the second is a 2-D list of the key file rows.
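A minimal sketch of the returned structure follows. It assumes a tab-delimited key file whose first line is the header, and it parses lines already held in memory rather than opening a file. The class `KeyFileSketch`, the method `parse`, and the use of `Map.Entry` in place of `kotlin.Pair` are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KeyFileSketch {
    // Assumption: tab-delimited input, first line is the header.
    // Key of the result: column name -> column index.
    // Value of the result: one List<String> per non-empty data line.
    static Map.Entry<Map<String, Integer>, List<List<String>>> parse(List<String> lines) {
        Map<String, Integer> columnIndex = new LinkedHashMap<>();
        List<List<String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return new SimpleEntry<>(columnIndex, rows);
        }
        String[] header = lines.get(0).split("\t");
        for (int i = 0; i < header.length; i++) {
            columnIndex.put(header[i], i);
        }
        for (String line : lines.subList(1, lines.size())) {
            if (!line.isEmpty()) {
                // limit -1 keeps trailing empty cells so column indices stay valid
                rows.add(Arrays.asList(line.split("\t", -1)));
            }
        }
        return new SimpleEntry<>(columnIndex, rows);
    }
}
```

With this layout, a cell is read as `rows.get(r).get(columnIndex.get("filename"))`, which matches how the integer column arguments (such as flowcellCol and taxonCol in outputKeyFiles) index into each row.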

verifyNoDuplicatesInKeyFile

public static void verifyNoDuplicatesInKeyFile(@NotNull
                                               java.util.List<net.maizegenetics.pangenome.hapCalling.KeyFileUniqueRecord> keyFileRecords)

Function to verify that there are no duplicate entries in the key file. This is just to let the user know if there is a duplicate.
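One common way to detect duplicates is a single pass with a set. The sketch below is an assumption about the technique, not the library's code: the documented method returns void, whereas the hypothetical `findDuplicates` returns the repeated records so the caller can decide how to report them. It relies on the record type defining `equals` and `hashCode`.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DuplicateCheckSketch {
    // Returns every record that occurs more than once,
    // in the order the repeats are encountered.
    static <T> Set<T> findDuplicates(List<T> records) {
        Set<T> seen = new HashSet<>();
        Set<T> duplicates = new LinkedHashSet<>();
        for (T record : records) {
            // Set.add returns false when the element was already present
            if (!seen.add(record)) {
                duplicates.add(record);
            }
        }
        return duplicates;
    }
}
```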

isKeyEntryInDir

public static boolean isKeyEntryInDir(@NotNull
                                      NonExistentClass fileNames,
                                      @NotNull
                                      java.util.List<java.lang.String> currentKeyRecord,
                                      int fileCol1,
                                      int fileCol2)

Method to check whether the files listed in a key file entry are present in the directory. If they are missing, that is OK: we expect the key file to have more entries than the directory.
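The check can be sketched as a set lookup per file column. Everything beyond the parameter names is an assumption here: the generated signature does not show the real type of fileNames, so a `Set<String>` is used, and a negative fileCol2 is taken to mean that the key file has no second file column.

```java
import java.util.List;
import java.util.Set;

final class KeyEntryInDirSketch {
    // fileNames: names of the files found in the input directory (assumed type).
    static boolean isKeyEntryInDir(Set<String> fileNames, List<String> currentKeyRecord,
                                   int fileCol1, int fileCol2) {
        if (!fileNames.contains(currentKeyRecord.get(fileCol1))) {
            return false;
        }
        // Assumption: a negative fileCol2 means there is no filename2 column.
        return fileCol2 < 0 || fileNames.contains(currentKeyRecord.get(fileCol2));
    }
}
```

A caller would skip, rather than fail on, any entry for which this returns false.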

setupMinimapRun

@NotNull
public static htsjdk.samtools.SamReader setupMinimapRun(@NotNull
                                                                 java.lang.String minimapLocation,
                                                                 @NotNull
                                                                 java.lang.String referenceFile,
                                                                 @NotNull
                                                                 java.lang.String firstFastq,
                                                                 @NotNull
                                                                 java.lang.String secondFastq,
                                                                 int maxSecondary,
                                                                 @NotNull
                                                                 java.lang.String fParameter)

Function that sets up the paired-end or single-end minimap2 command and builds the SamReader.

The SamReader is used later to pick optimal hits from the alignments.

If secondFastq is "", single-end mapping is assumed.
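The single-end versus paired-end switch can be sketched as building an argument list. The exact flags this class passes to minimap2 are not shown on this page, so the set below is an assumption: `-a` (SAM output), `-x sr` (short-read preset), `-N` (maximum secondary alignments) and `-f` (repetitive minimizer filter) are documented minimap2 options, but their use here, and the names `MinimapCommandSketch` and `buildCommand`, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MinimapCommandSketch {
    static List<String> buildCommand(String minimapLocation, String referenceFile,
                                     String firstFastq, String secondFastq,
                                     int maxSecondary, String fParameter) {
        List<String> command = new ArrayList<>(List.of(
                minimapLocation,
                "-ax", "sr",                        // SAM output, short-read preset
                "-N", String.valueOf(maxSecondary), // cap on secondary alignments
                "-f", fParameter,                   // minimizer occurrence filter
                referenceFile,
                firstFastq));
        // An empty secondFastq means single-end; otherwise append the mate file.
        if (!secondFastq.isEmpty()) {
            command.add(secondFastq);
        }
        return command;
    }
}
```

A list like this could be handed to `java.lang.ProcessBuilder`, with the process output stream then wrapped in a SAM reader.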

scoreSamFileCountHapSetHits

@NotNull
public static java.util.Map<java.util.List,java.lang.Integer> scoreSamFileCountHapSetHits(@NotNull
                                                                                                   htsjdk.samtools.SamReader samReader,
                                                                                                   @NotNull
                                                                                                   java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap,
                                                                                                   double maxRefRangeError,
                                                                                                   boolean pairedEnd,
                                                                                                   @NotNull
                                                                                                   java.lang.String outputDebugFile,
                                                                                                   @NotNull
                                                                                                   java.util.Map<java.lang.Integer,java.lang.Integer> hapIdToLengthMap)

Function to score the SAM records provided by the SamReader. This will output a Map which is the hapId hit set.

For each subset of hapids we return a count of how many reads hit that exact subset. This mapping is used in the HMM.
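The counting step can be sketched with plain collections. This sketch assumes that the best hits for each read have already been reduced to a collection of hapIds. The class `HapSetCountSketch` and the method `countHapSetHits` are hypothetical, and the filtering by NM and maxRefRangeError that the real function performs is left out.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class HapSetCountSketch {
    // Each element of readHits holds the hapIds hit by one read.
    static Map<List<Integer>, Integer> countHapSetHits(List<? extends Collection<Integer>> readHits) {
        Map<List<Integer>, Integer> counts = new HashMap<>();
        for (Collection<Integer> hits : readHits) {
            // Sort and de-duplicate so that {3,1} and {1,3} share one key.
            List<Integer> key = new ArrayList<>(new TreeSet<>(hits));
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

Normalizing each hit set to a sorted list is what makes "that exact subset" a usable map key: two reads count toward the same entry only if they hit precisely the same hapIds.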

exportHapIdStats

public static void exportHapIdStats(@NotNull
                                    java.lang.String outputDebugFile,
                                    @NotNull
                                    java.util.Map<java.lang.Integer,net.maizegenetics.pangenome.hapCalling.HapIdStats> hapIdToStatMap,
                                    @NotNull
                                    java.util.Map<java.lang.Integer,java.lang.Integer> hapIdToLengthMap)

Function to export the HapIdStats file. If outputDebugFile is not specified, this will do nothing.

attemptToAddSAMRecordToBestReadMap

public static void attemptToAddSAMRecordToBestReadMap(boolean pairedEnd,
                                                      @NotNull
                                                      htsjdk.samtools.SAMRecord currentSamRecord,
                                                      @NotNull
                                                      java.lang.String readName,
                                                      @NotNull
                                                      java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> bestReadMap)

Function to try to add a SAM record to bestReadMap. bestReadMap holds the currently best known set of reads, which have the best (lowest) NM. If the current record is better, we replace the old entry with a new one. If the current record is the same, we add it to the list. If the current record is worse, we ignore it.
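The replace, append, or ignore rule can be sketched without htsjdk. In this sketch the nested `Group` class is a hypothetical stand-in for BestAlignmentGroup, the map is keyed by read name alone rather than by a `kotlin.Pair`, and the NM value (the SAM edit-distance tag) is passed in as a plain int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class BestAlignmentSketch {
    // Hypothetical stand-in for BestAlignmentGroup:
    // the lowest NM seen so far plus every hapId that achieved it.
    static final class Group {
        final int bestNm;
        final List<Integer> hapIds = new ArrayList<>();

        Group(int bestNm, int hapId) {
            this.bestNm = bestNm;
            this.hapIds.add(hapId);
        }
    }

    static void attemptToAdd(Map<String, Group> bestReadMap, String readName, int nm, int hapId) {
        Group current = bestReadMap.get(readName);
        if (current == null || nm < current.bestNm) {
            // First record for this read, or strictly better: replace the entry.
            bestReadMap.put(readName, new Group(nm, hapId));
        } else if (nm == current.bestNm) {
            // Tie: keep it alongside the existing best hits.
            current.hapIds.add(hapId);
        }
        // nm > current.bestNm: worse alignment, ignored.
    }
}
```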

addBestReadMapToHapIdMultiset

public static void addBestReadMapToHapIdMultiset(@NotNull
                                                 java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> bestReadMap,
                                                 @NotNull
                                                 java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRefRangeMap,
                                                 double maxRefRangeError,
                                                 boolean pairedEnd,
                                                 @NotNull
                                                 HapIdMultiset hapIdMultiset,
                                                 @NotNull
                                                 java.util.Map<java.lang.Integer,net.maizegenetics.pangenome.hapCalling.HapIdStats> hapIdToStatMap)

keepHapIdsForSingleRefRange

@NotNull
public static java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> keepHapIdsForSingleRefRange(@NotNull
                                                                                                                                        java.util.Map<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> bestHitMap,
                                                                                                                                        @NotNull
                                                                                                                                        java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRangeMap,
                                                                                                                                        double maxRefRangeError)

Function to keep hapids if they are hitting multiple reference ranges too frequently. If there is a little bit of noise it can be filtered.

spansSingleRefRange

public static boolean spansSingleRefRange(@NotNull
                                          java.util.Map.Entry<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> currentMapping,
                                          @NotNull
                                          java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRangeMap)

Function to check whether the reads hit only one reference range.

removeExtraRefRangeHits

@NotNull
public static kotlin.Pair<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> removeExtraRefRangeHits(@NotNull
                                                                                                                                  java.util.Map.Entry<kotlin.Pair,net.maizegenetics.pangenome.hapCalling.BestAlignmentGroup> currentMapping,
                                                                                                                                  @NotNull
                                                                                                                                  java.util.Map<java.lang.Integer,? extends net.maizegenetics.pangenome.api.ReferenceRange> hapIdToRangeMap,
                                                                                                                                  double maxRefRangeError)

Function to remove any reads which hit more than one reference range ambiguously.

findBestRefRange

public static int findBestRefRange(@NotNull
                                   com.google.common.collect.Multimap<java.lang.Integer,java.lang.Integer> refRangeToIdMapping,
                                   double maxRefRangeError)

Function to check whether the read hits multiple reference ranges at a rate below maxRefRangeError. This filters out any reads which hit multiple reference ranges equally well.
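A minimal sketch of this kind of threshold test follows. It makes several assumptions that this page does not confirm: the error is computed as 1 - (hits in the most-hit range / total hits), the comparison is strict, and -1 is returned when no range qualifies. A plain `Map<Integer, List<Integer>>` stands in for the Guava Multimap.

```java
import java.util.List;
import java.util.Map;

final class BestRefRangeSketch {
    // refRangeToHapIds: refRangeId -> hapIds hit within that range.
    // Returns the most-hit refRangeId, or -1 if the read is too ambiguous.
    static int findBestRefRange(Map<Integer, List<Integer>> refRangeToHapIds, double maxRefRangeError) {
        int total = 0;
        int bestRange = -1;
        int bestCount = 0;
        for (Map.Entry<Integer, List<Integer>> entry : refRangeToHapIds.entrySet()) {
            int count = entry.getValue().size();
            total += count;
            if (count > bestCount) {
                bestCount = count;
                bestRange = entry.getKey();
            }
        }
        if (total == 0) {
            return -1;
        }
        // Assumed error definition: fraction of hits outside the best range.
        double error = 1.0 - (double) bestCount / total;
        return error < maxRefRangeError ? bestRange : -1;
    }
}
```

Under these assumptions, a read with three hits in one range and one in another has an error of 0.25, so it is kept with a threshold of 0.3 and dropped with a threshold of 0.2. A read split evenly across two ranges has an error of 0.5 and is dropped in both cases.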
