Mapping of Illumina reads to assembly ------------------------------------- We are mapping the Illumina reads to the largest contig of our assembly with BWA. BWA is a software package for mapping low-divergent sequences against a large reference genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the rest two for longer sequences ranged from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. The first step is to create an index on the assembly, to allow mapping:: bwa index ~/workdir/assembly/assembly.fasta Then we are mapping all reads to the contig. Note that we are shortening the process of creating a sorted and indexed bam file by piping the output of bwa directly to samtools, thereby avoiding temporary files:: cd ~/workdir/ mkdir illumina_mapping bwa mem -t 14 ~/workdir/assembly/assembly.fasta ~/workdir/data/illumina/Illumina_R1.fastq.gz ~/workdir/data/illumina/Illumina_R2.fastq.gz | samtools view - -Sb | samtools sort - -@14 -o ~/workdir/illumina_mapping/mapping.sorted.bam samtools index ~/workdir/illumina_mapping/mapping.sorted.bam minimap2 -a data/Reference.fna pilon/pilon_round4 We will use a genome browser to look at the mappings (to get an impression on our assembly quality). Start IGV:: igv Now let's look at the mapped contigs: 1. Load the reference genome into IGV. Use the menu ``Genomes->Load Genome from File...`` 2. Load the BAM file into IGV. Use menu ``File->Load from File...`` References ^^^^^^^^^^ **BWA** http://bio-bwa.sourceforge.net/ **samtools** http://www.htslib.org **IGV** http://www.broadinstitute.org/igv/