<!DOCTYPE HTML> <html lang="en"> <head> <meta charset="utf-8"> <title>USADELLAB.org - Trimmomatic: A flexible read trimming tool for Illumina NGS data</title> <meta name="description" content="<!-- Html blob 'description' does not exist -->"> <meta name="generator" content="USADELLAB.org"> <meta name="author" content="http://www.usadellab.org/cms"> <meta name="robots" content="index, follow"> <meta name="viewport" content="width=device-width, initial-scale=1"> <link rel="shortcut icon" href="http://www.usadellab.org/cms/uploads/template1200/images/touch-icon.png"> <link rel="apple-touch-icon" href="http://www.usadellab.org/cms/uploads/template1200/images/touch-icon.png"> <meta name="application-name" content="USADELLAB.org"/> <meta name="msapplication-navbutton-color" content="white"/> <meta name="msapplication-TileColor" content="#000000"/> <meta name="msapplication-TileImage" content="http://www.usadellab.org/cms/uploads/template1200/images/touch-icon.png"/> <link href='http://fonts.googleapis.com/css?family=Roboto+Condensed:300,300italic,400,400italic,700,700italic' rel='stylesheet' type='text/css'> <link href='http://fonts.googleapis.com/css?family=Quicksand:300,400,700' rel='stylesheet' type='text/css'> <link rel="stylesheet" type="text/css" href="http://www.usadellab.org/cms/tmp/cache/stylesheet_combined_d1a97cc0f639e5066b57056fd7ff7833.css" /> <!-- IEMobile 10 viewport fix --> <script type="text/javascript"> if (navigator.userAgent.match(/IEMobile\/10\.0/)) { var msViewportStyle = document.createElement("style"); msViewportStyle.appendChild(document.createTextNode("@-ms-viewport{width:auto!important}")); document.getElementsByTagName("head")[0].appendChild(msViewportStyle); } </script> <script src="http://www.usadellab.org/cms/uploads/template1200/js/jquery-2.1.1.min.js" type="text/javascript"></script> <script src="http://www.usadellab.org/cms/uploads/template1200/js/template1200.scripts.min.js" type="text/javascript"></script> </head> <body> <div 
id="top"></div> <div id="header_container"> <div class="content"> <div class="section"> <div class="three-column logo"><a href="http://www.usadellab.org/cms"><h4>USADELLAB.org</h4></a></div> <div class="two-third-column"> <div id="menu-mobile" onclick="void(0)"> <div id="nav-mobile">USADELLAB.org</div> <ul id="menu"> <li><a href="http://www.usadellab.org/cms/"> Home </a> </li> <li><a href="/cms/index.php?page=trimmomatic#">Research</a> <ul> <li><a href="http://www.usadellab.org/cms/index.php?page=plantcellwalls"> Plant cell walls </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=metabolism"> Plant Metabolism </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=bioinformatics"> Bioinformatics &amp; Systems Biology </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=quantitative-bioeconomy"> Quantitative Bioeconomy </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=databases-and-data-science"> Databases and Data Science </a> </li></ul> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=projects"> Projects </a> <ul> <li><a href="http://www.usadellab.org/cms/index.php?page=salt-stress-in-tomato-roots"> Salt stress in tomato roots </a> </li></ul> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=education"> Education </a> <ul> <li><a href="http://www.usadellab.org/cms/index.php?page=introductiontogenetics"> Introduction to Genetics </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=moleculargenetics1"> Moleculargenetics I </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=MolecularBiology"> Molecular Biology </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=vtmplantcellmol"> VTM Plant CellMol </a> </li></ul> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=servicesoftware" > Service &amp; Software </a> <ul> <li><a href="http://www.usadellab.org/cms/index.php?page=trimmomatic" class="currentpage"> Trimmomatic </a> 
</li> <li><a href="http://mapman.gabipd.org/web/guest/app/mercator" target="_blank"> Mercator </a> </li> <li><a href="http://mapman.gabipd.org/web/guest/robin" target="_blank"> Robin &amp; RobiNA </a> </li> <li><a href="http://mapman.gabipd.org/web/guest/mapman" target="_blank"> MapMan </a> </li> <li><a href="http://mapman.gabipd.org/web/guest/pageman" target="_blank"> PageMan </a> </li> <li><a href="http://mapman.gabipd.org/web/guest/mapcave" target="_blank"> MapCave </a> </li> <li><a href="http://www.plabipd.de" target="_blank"> PlabiPD </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=corto"> CorTo </a> </li></ul> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=publications"> Publications </a> </li> <li><a href="/cms/index.php?page=trimmomatic#">Supporting Info</a> <ul> <li><a href="http://www.usadellab.org/cms/index.php?page=pennellii"> Solanum pennellii </a> </li></ul> </li> <li><a href="/cms/index.php?page=trimmomatic#">About Us</a> <ul> <li><a href="http://www.usadellab.org/cms/index.php?page=staff"> Staff </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=jobsprojects"> Jobs &amp; Projects </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=events"> Events </a> </li></ul> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=ngs-de-and-other-things"> NGS, DE and other things </a> </li> <li><a href="http://www.usadellab.org/cms/index.php?page=datenschutzhinweis"> Data Protection </a> </li> </ul> </div> </div> </div> <div class="section"><!-- clear section --></div> </div> </div> <div id="content_container"> <div class="content_blocks"> <div class="content"> <div class="section"> <div class="two-third-column one"> <h2>Trimmomatic: A flexible read trimming tool for Illumina NGS data</h2> <p>&nbsp;</p> <h3>Citations</h3> <p>Bolger, A. M., Lohse, M., &amp; Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. 
<em>Bioinformatics</em>, btu170.</p> <h3>&nbsp;</h3> <h3>Downloading Trimmomatic</h3> <p>Starting with version 0.40, we also offer a <a href="https://github.com/usadellab/Trimmomatic">GitHub page</a> (as well as older versions).</p> <p>Version 0.39: <a href="uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip">binary</a>, <a href="uploads/supplementary/Trimmomatic/Trimmomatic-Src-0.39.zip">source</a> and <a href="uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf">manual</a></p> <p>Version 0.36: <a href="uploads/supplementary/Trimmomatic/Trimmomatic-0.36.zip">binary</a> and <a href="uploads/supplementary/Trimmomatic/Trimmomatic-Src-0.36.zip">source</a></p> <h3>&nbsp;</h3> <h3>Quick start</h3> <h4>Paired End:</h4> <p>With most new data sets you can use gentle quality trimming and adapter clipping.</p> <p>You often don't need leading and trailing clipping. In general, setting <em>keepBothReads</em> to True can also be useful when working with paired-end data: you will keep even redundant information, but this likely makes your pipelines more manageable. Note the additional :2 in front of True (for keepBothReads): this is the minimum adapter length in palindrome mode, and you can even set it to 1. (The default is a very conservative 8.)</p> <p>If you have questions, please don't hesitate to contact us; this is not necessarily one size fits all (e.g. 
RNA-seq expression analysis vs DNA assembly).</p> <p><code>java -jar trimmomatic-0.39.jar PE input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3 TRAILING:3 MINLEN:36</code></p> <p>&nbsp;</p> <p>For reference only (less sensitive for adapters):</p> <p><code>java -jar trimmomatic-0.35.jar PE -phred33 input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36</code></p> <p>This will perform the following:</p> <ul> <li>Remove adapters (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10)</li> <li>Remove leading low quality or N bases (below quality 3) (LEADING:3)</li> <li>Remove trailing low quality or N bases (below quality 3) (TRAILING:3)</li> <li>Scan the read with a 4-base wide sliding window, cutting when the average quality per base drops below 15 (SLIDINGWINDOW:4:15)</li> <li>Drop reads shorter than 36 bases (MINLEN:36)</li> </ul> <h4>Single End:</h4> <p><code>java -jar trimmomatic-0.35.jar SE -phred33 input.fq.gz output.fq.gz ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36</code></p> <p>This will perform the same steps, using the single-end adapter file.</p> <p>&nbsp;</p> <h3>Description</h3> <p>Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-end data. The selection of trimming steps and their associated parameters is supplied on the command line.</p> <p>The current trimming steps are:</p> <ul> <li>ILLUMINACLIP: Cut adapter and other Illumina-specific sequences from the read.</li> <li>SLIDINGWINDOW: Perform a sliding window trimming, cutting once the average quality within the window falls below a threshold.</li> <li>LEADING: Cut bases off the start of a read, if 
below a threshold quality</li> <li>TRAILING: Cut bases off the end of a read, if below a threshold quality</li> <li>CROP: Cut the read to a specified length</li> <li>HEADCROP: Cut the specified number of bases from the start of the read</li> <li>MINLEN: Drop the read if it is below a specified length</li> <li>TOPHRED33: Convert quality scores to Phred-33</li> <li>TOPHRED64: Convert quality scores to Phred-64</li> </ul> <p>It works with FASTQ files (using phred + 33 or phred + 64 quality scores, depending on the Illumina pipeline used), either uncompressed or gzipped. Use of gzip format is determined from the .gz extension.</p> <p>For single-end data, one input and one output file are specified, plus the processing steps. For paired-end data, two input files and four output files are specified: two for the 'paired' output, where both reads survived the processing, and two for the corresponding 'unpaired' output, where a read survived but its partner read did not.</p> <h3>&nbsp;</h3> <h3>Running Trimmomatic</h3> <p>Since version 0.27, Trimmomatic can be executed using -jar. 
The 'old' method, using the explicit class, continues to work.</p> <h4>Paired End Mode:</h4> <p><code>java -jar &lt;path to trimmomatic jar&gt; PE [-threads &lt;threads&gt;] [-phred33 | -phred64] [-trimlog &lt;logFile&gt;] &lt;input 1&gt; &lt;input 2&gt; &lt;paired output 1&gt; &lt;unpaired output 1&gt; &lt;paired output 2&gt; &lt;unpaired output 2&gt; &lt;step 1&gt; ...</code></p> <p>or</p> <p><code>java -classpath &lt;path to trimmomatic jar&gt; org.usadellab.trimmomatic.TrimmomaticPE [-threads &lt;threads&gt;] [-phred33 | -phred64] [-trimlog &lt;logFile&gt;] &lt;input 1&gt; &lt;input 2&gt; &lt;paired output 1&gt; &lt;unpaired output 1&gt; &lt;paired output 2&gt; &lt;unpaired output 2&gt; &lt;step 1&gt; ...</code></p> <h4>Single End Mode:</h4> <p><code>java -jar &lt;path to trimmomatic jar&gt; SE [-threads &lt;threads&gt;] [-phred33 | -phred64] [-trimlog &lt;logFile&gt;] &lt;input&gt; &lt;output&gt; &lt;step 1&gt; ...</code></p> <p>or</p> <p><code>java -classpath &lt;path to trimmomatic jar&gt; org.usadellab.trimmomatic.TrimmomaticSE [-threads &lt;threads&gt;] [-phred33 | -phred64] [-trimlog &lt;logFile&gt;] &lt;input&gt; &lt;output&gt; &lt;step 1&gt; ...</code></p> <p>If no quality encoding is specified, phred-64 is the default. This will be changed to an 'autodetected' quality encoding in a future version.</p> <p>Specifying a trimlog file creates a log of all read trimmings, indicating the following details:</p> <ul> <li>the read name</li> <li>the surviving sequence length</li> <li>the location of the first surviving base, i.e. 
the amount trimmed from the start</li> <li>the location of the last surviving base in the original read</li> <li>the amount trimmed from the end</li> </ul> <p>Multiple steps can be specified as required, by using additional arguments at the end.</p> <p>Most steps take one or more settings, delimited by ':' (a colon).</p> <p>Step options:</p> <ul> <li>ILLUMINACLIP:&lt;fastaWithAdaptersEtc&gt;:&lt;seedMismatches&gt;:&lt;palindromeClipThreshold&gt;:&lt;simpleClipThreshold&gt; <ul> <li>fastaWithAdaptersEtc: specifies the path to a fasta file containing all the adapters, PCR sequences etc. The naming of the various sequences within this file determines how they are used. See below.</li> <li>seedMismatches: specifies the maximum mismatch count which will still allow a full match to be performed.</li> <li>palindromeClipThreshold: specifies how accurate the match between the two 'adapter ligated' reads must be for PE palindrome read alignment.</li> <li>simpleClipThreshold: specifies how accurate the match between any adapter etc. 
sequence must be against a read.</li> </ul> </li> </ul> <ul> <li>SLIDINGWINDOW:&lt;windowSize&gt;:&lt;requiredQuality&gt; <ul> <li>windowSize: specifies the number of bases to average across.</li> <li>requiredQuality: specifies the average quality required.</li> </ul> </li> </ul> <ul> <li>LEADING:&lt;quality&gt; <ul> <li>quality: specifies the minimum quality required to keep a base.</li> </ul> </li> </ul> <ul> <li>TRAILING:&lt;quality&gt; <ul> <li>quality: specifies the minimum quality required to keep a base.</li> </ul> </li> </ul> <ul> <li>CROP:&lt;length&gt; <ul> <li>length: specifies the number of bases to keep, from the start of the read.</li> </ul> </li> </ul> <ul> <li>HEADCROP:&lt;length&gt; <ul> <li>length: specifies the number of bases to remove from the start of the read.</li> </ul> </li> </ul> <ul> <li>MINLEN:&lt;length&gt; <ul> <li>length: specifies the minimum length of reads to be kept.</li> </ul> </li> </ul> <h3>Trimming Order</h3> <p>Trimming occurs in the order in which the steps are specified on the command line. It is recommended in most cases that adapter clipping, if required, is done as early as possible.</p> <h3>&nbsp;</h3> <h3>The Adapter Fasta</h3> <p>Illumina adapter and other technical sequences are copyrighted by Illumina, but we have been granted permission to distribute them with Trimmomatic. Suggested adapter sequences are provided for TruSeq2 (as used in GAII machines) and TruSeq3 (as used by HiSeq and MiSeq machines), for both single-end and paired-end mode. These sequences have not been extensively tested, and depending on specific issues which may occur in library preparation, other sequences may work better for a given dataset.</p> <p>To make a custom version of the adapter fasta, you must first understand how it will be used. 
Trimmomatic uses two strategies for adapter trimming: Palindrome and Simple.</p> <p>With 'simple' trimming, each adapter sequence is tested against the reads, and if a sufficiently accurate match is detected, the read is clipped appropriately.</p> <p>'Palindrome' trimming is specifically designed for the case of 'reading through' a short fragment into the adapter sequence on the other end. In this approach, the appropriate adapter sequences are 'in silico ligated' onto the start of the reads, and the combined adapter+read sequences, forward and reverse, are aligned. If they align in a manner which indicates 'read-through', the forward read is clipped and the reverse read dropped (since it contains no new data).</p> <p>Naming of the sequences indicates how they should be used. For 'Palindrome' clipping, the sequence names should both start with 'Prefix', and end in '/1' for the forward adapter and '/2' for the reverse adapter. All other sequences are checked using 'simple' mode. Sequences with names ending in '/1' or '/2' will be checked only against the forward or reverse read, respectively. Sequences not ending in '/1' or '/2' will be checked against both the forward and reverse read. If you want to check for the reverse-complement of a specific sequence, you need to specifically include the reverse-complemented form of the sequence as well, with another name.</p> <p>The thresholds used are a simplified log-likelihood approach. Each matching base adds just over 0.6, while each mismatch reduces the alignment score by Q/10. Therefore, a perfect match of a 12-base sequence will score just over 7, while 25 bases are needed to score 15. As such, we recommend values between 7 and 15 for the simple clip threshold. For palindromic matches, a longer alignment is possible; therefore the palindrome clip threshold can be higher, in the range of 30. The 'seed mismatch' parameter is used to make alignments more efficient, specifying the maximum base mismatch count in the 'seed' (16 bases). 
Typical values here are 1 or 2.</p> <h3>&nbsp;</h3> <h3>Contacts</h3> <p><a href="">Anthony Bolger</a></p> <!-- Add code here that should appear in the content block of all new pages --> </div> <div class="three-column round two"> <div></div> </div> </div> <div class="section"><!-- clear section --></div> </div> </div> <div class="footer"> <div class="content"> <div class="section"> <div class="one-column main-nav"><p><a href="#top" title="Top">^</a></p></div> </div> <div class="section footer-desktop"> <div class="four-column"><h4>Contact</h4> </div> <div class="four-column"><h4>Office hours</h4> <p>Monday-Friday</p> <p>9:00 - 12:00</p></div> <div class="four-column"><p>&nbsp;</p></div> <div class="four-column"><div class="search-form"> <form id="cntnt01moduleform_1" method="get" action="http://www.usadellab.org/cms/index.php?page=trimmomatic" class="cms_form"> <div class="hidden"> <input type="hidden" name="mact" value="Search,cntnt01,dosearch,0" /> <input type="hidden" name="cntnt01returnid" value="72" /> </div> <label for="cntnt01searchinput"><h4>Search:&nbsp;</h4></label> <p><input type="text" class="search-input" id="cntnt01searchinput" name="cntnt01searchinput" size="20" maxlength="50" value="Enter Search..." 
onfocus="if(this.value==this.defaultValue) this.value='';" onblur="if(this.value=='') this.value=this.defaultValue;"/></p> <br/> <p><input class="search-button" name="submit" value="Submit" type="submit" /></p> </form> </div></div> </div> <div class="section footer-mobile"> <div class="one-column"><div class="search-form"> <form id="cntnt01moduleform_2" method="get" action="http://www.usadellab.org/cms/index.php?page=trimmomatic" class="cms_form"> <div class="hidden"> <input type="hidden" name="mact" value="Search,cntnt01,dosearch,0" /> <input type="hidden" name="cntnt01returnid" value="72" /> </div> <label for="cntnt01searchinput"><h4>Search:&nbsp;</h4></label> <p><input type="text" class="search-input" id="cntnt01searchinput" name="cntnt01searchinput" size="20" maxlength="50" value="Enter Search..." onfocus="if(this.value==this.defaultValue) this.value='';" onblur="if(this.value=='') this.value=this.defaultValue;"/></p> <br/> <p><input class="search-button" name="submit" value="Submit" type="submit" /></p> </form> </div></div> </div> <div class="section footer-mobile"> <div class="one-column"><!-- Html blob 'footermobile' does not exist --></div> </div> <div class="section"><!-- clear section --></div> </div> </div> </div> </body> </html>
