Linear, Chimeric, Supplementary, Primary and Secondary Alignments
What is the difference between the alignment types named in the title?
Linear Alignment: An alignment of a read to a single reference sequence that may include insertions, deletions, skips and clipping, but may not include direction changes (i.e. one portion of the alignment on forward strand and another portion of alignment on reverse strand). 1
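To make the "insertions, deletions, skips and clipping" part concrete, here is a minimal Python sketch that reads the CIGAR string of one linear alignment and counts how many bases it uses on the read and on the reference. The function names (parse_cigar, spans) are my own and do not come from any library; the table of which operations consume the read or the reference follows the SAM specification. It does not handle the "*" placeholder used when no CIGAR is available.

```python
import re

# CIGAR operations (SAM specification) grouped by what they consume.
# H (hard clip) and P (padding) consume neither read nor reference.
CONSUMES_QUERY = set("MIS=X")   # match, insertion, soft clip, =, X
CONSUMES_REF = set("MDN=X")     # match, deletion, skip, =, X

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '5S20M2I10M' into (length, op) pairs."""
    pairs = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Rebuild the string to reject input the pattern did not fully cover.
    if "".join(f"{n}{op}" for n, op in pairs) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return pairs


def spans(cigar):
    """Return (bases used on the read, bases covered on the reference)."""
    pairs = parse_cigar(cigar)
    query_len = sum(n for n, op in pairs if op in CONSUMES_QUERY)
    ref_len = sum(n for n, op in pairs if op in CONSUMES_REF)
    return query_len, ref_len


if __name__ == "__main__":
    # Soft clip, insertion and deletion inside one linear alignment.
    print(spans("5S20M2I10M3D15M"))
    # A skip (N), e.g. an intron in RNA-seq, only consumes the reference.
    print(spans("10M100N10M"))
```

All of these operations stay inside one linear alignment; only a change of strand or a jump that the CIGAR cannot express forces the read to be split into several records.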
Chimeric Alignment: An alignment of a read that cannot be represented as a linear alignment. Typically, one of the linear alignments in a chimeric alignment is considered the “representative” alignment, and the others are called “supplementary” and are distinguished by the supplementary alignment flag. 1
Chimeric reads are indicative of structural variation in DNA-seq and may indicate the presence of chimeric genes in RNA-seq. 2
In short, a chimeric read can be split into two or more parts, each of which maps to the reference (it is not hard-clipped), and the total length of the mapped parts is longer than the read length. 3
Representative alignment: A chimeric alignment that is represented as a set of linear alignments that do not have large overlaps typically has one linear alignment that is considered the representative alignment. 4
I do not fully understand what “representative” means here, because the word does not translate clearly into my mother tongue, and I could not find more information (or a figure) about it.
My reading is this: the parts of one read can align to several positions. Among these linear alignments, which do not overlap much, one is picked as the representative alignment, and the others are called supplementary alignments.
It seems that GATK can realign those representative reads to the correct position via RealignerTargetCreator and IndelRealigner. (WARNING: I am not quite sure I understand this correctly. If someone can help, please leave me a message below, thanks.)
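Supplementary Alignment: A linear alignment that belongs to a chimeric alignment but is not the representative alignment. It carries the supplementary alignment flag (0x800).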
Primary Alignment and Secondary Alignment: A read may map ambiguously to multiple locations, e.g. due to repeats. Only one of the multiple read alignments is considered primary, and this decision may be arbitrary. All other alignments have the secondary alignment flag. 5
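The definitions above map onto bits of the SAM FLAG field: 0x100 marks a secondary alignment and 0x800 marks a supplementary alignment, and a mapped record with neither bit set is a primary (or representative) line. Below is a small Python sketch of that lookup. The function name classify is my own, and reporting only "supplementary" when both bits are set is a simplification I chose for this example.

```python
# FLAG bits defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify(flag):
    """Label one SAM record by its FLAG value.

    Assumption of this sketch: if both 0x100 and 0x800 are set,
    the record is reported as "supplementary" only.
    """
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


if __name__ == "__main__":
    for flag in (0, 16, 99, 256, 2048, 2064):
        print(flag, hex(flag), classify(flag))
```

Note that strand (0x10) does not matter here: flag 16 is still a primary alignment, and flag 2064 (2048 + 16) is a supplementary alignment on the reverse strand.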
How to filter those reads?
I usually use samtools to filter those reads. The Picard tools website provides an online tool that explains SAM flags.
The samtools view -f/-F command can keep (-f) or filter out (-F) the reads with specific flags.
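For example, samtools view -F 0x900 drops both secondary (0x100) and supplementary (0x800) records, and samtools view -f 0x800 keeps only supplementary ones. The Python sketch below imitates that logic on plain SAM text so the rule is easy to see: -f requires all of the given bits to be set, -F rejects a record if any of the given bits is set. The function names and the example records are made up for illustration; this is not how samtools is implemented.

```python
def keep(flag, require=0, exclude=0):
    """Mimic samtools view -f (require) and -F (exclude) on one FLAG.

    -f: every bit in `require` must be set.
    -F: no bit in `exclude` may be set.
    """
    return (flag & require) == require and (flag & exclude) == 0


def filter_sam_lines(lines, require=0, exclude=0):
    """Yield the SAM alignment lines that pass the flag filter."""
    for line in lines:
        if line.startswith("@"):
            continue  # header lines carry no FLAG
        fields = line.rstrip("\n").split("\t")
        if keep(int(fields[1]), require, exclude):
            yield line


# Made-up records for illustration only (11 mandatory SAM columns).
EXAMPLE = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r1\t256\tchr1\t700\t0\t10M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t300\t60\t6M4S\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t2048\tchr1\t500\t60\t6H4M\t*\t0\t0\tGTAC\t*",
]

if __name__ == "__main__":
    # Same idea as: samtools view -F 0x900
    for line in filter_sam_lines(EXAMPLE, exclude=0x900):
        print(line)
```

In practice I would still run samtools itself on the BAM file; the sketch is only meant to show what the two options test.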
Reference: