Library preparation is the process of attaching adapters to DNA or cDNA fragments so that the sample can be captured on a flow cell (a slide with tiny channels) and sequenced. There are two methods used to make a library: ligation-based library preparation and tagmentation-based library preparation.
For library prep, DNA must be cut into small pieces and then connected to the flow cell with the help of adapters.
Adapters are very short, synthetic DNA sequences that enable high-throughput sequencing and make it possible to identify multiple samples in the same run (multiplexing).
Careful library preparation therefore matters: a poorly prepared library can affect the sequencing output and introduce biases into data interpretation.
In this article, I will explain the library prep methods for sequencing and their applications in biotechnology.
Library prep is the process where adapters (short synthetic DNA sequences) are attached to DNA fragments to be further captured and sequenced. Library preparation is important because it allows DNA samples to be sequenced in a high throughput manner in NGS applications.
A library is a collection of randomly sized DNA fragments with attached adapters from a given sample that is used for sequencing.
The initial input material for a library can be either DNA or RNA.
Advances in sequencing technology and polymerization chemistry have produced efficient sequencers and optimized polymerases, which provide in-depth, high-throughput information about genomes, gene expression, and the genetic diversity of organisms.
However, despite the achieved advances in next-generation sequencing (NGS), this technology still cannot sequence a complex genome in just one shot.
Therefore, DNA is broken into smaller pieces before being sequenced. The fragments are then attached to a flow cell, where each one is copied into a dense group of identical molecules in a process called clustering.
Finally, these molecules are sequenced to make millions of reads (sequences of base pairs obtained from DNA fragments).
Figure 1. Genomic DNA is colored to represent corresponding fragments for the library. Genomic DNA is first fragmented. The fragments of a given sample with attached adapters (dark blue) make up a library. Depending on the type of library preparation, fragments may also be amplified.
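To make the idea of a library as a collection of randomly sized fragments more concrete, here is a minimal sketch in Python. The function name `fragment`, the toy sequence, and the `mean_size` parameter are all made up for illustration; real fragmentation by sonication or enzymes is far less tidy than picking random cut positions.

```python
import random

def fragment(sequence, mean_size, seed=0):
    """Cut `sequence` at random positions into pieces averaging about `mean_size` bases."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_cuts = max(len(sequence) // mean_size - 1, 0)
    # Choose distinct interior cut positions, then sort them left to right.
    cuts = sorted(rng.sample(range(1, len(sequence)), n_cuts))
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]

genome = "ACGTTGCAAGCTTAGGCTAACGTTAGCCATGA" * 4  # toy 128-base "genome" (hypothetical)
fragments = fragment(genome, mean_size=16)
print([len(f) for f in fragments])  # fragment lengths vary from piece to piece
```

The fragments differ in length, but joined back together in order they reproduce the original sequence, which is what lets reads be assembled or aligned after sequencing.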
There are two main library preparation methods, ligation-based library preparation and tagmentation-based library preparation. Variations of each technique have been implemented for several specialized applications.
Ligation-based library preparation is a process where adapters, or small known synthetic DNA sequences, are ligated to DNA fragments. This method is also called TruSeq™ library prep.
There are three primary steps to ligation-based library prep: fragmentation, ligation of adapters and PCR (optional).
- Fragmentation
- Ligation of adapters
- PCR amplification (optional)
Figure 2. The basic steps of ligation-based library preparation. Genomic DNA is first fragmented into smaller pieces, and the fragments are polished. Adapters that enable sequencing are then added to the polished fragments. The collection of fragments with attached adapters is a library.
High-quality DNA is the input material for a library prep. In the case of RNA, this must be converted into complementary DNA (cDNA) using reverse transcriptase before sequencing.
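As a rough illustration of what "complementary DNA" means at the sequence level, the sketch below derives the first-strand cDNA from an RNA sequence by base pairing. The function name `first_strand_cdna` and the example sequence are assumptions for this example; the code models only the base-pairing rule, not the enzymology of reverse transcriptase.

```python
# Base-pairing rule used when DNA is synthesized on an RNA template.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def first_strand_cdna(rna):
    """Return the first-strand cDNA (written 5'->3') templated by an RNA sequence."""
    # The new strand is antiparallel to the template, so read the RNA backwards.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))

cdna = first_strand_cdna("AUGGCC")
print(cdna)
```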
Generally, total DNA/RNA extractions produce very long molecules, and because the polymerases used in NGS cannot sequence a complex complete genome in just one run, the DNA must be fragmented.
During the fragmentation step, double-stranded genomic DNA is cut into multiple smaller pieces using sonication or enzymes.
This process generally produces unpolished DNA fragments, or in simple words, double-stranded DNA fragments with tails or overhangs at different ends.
Figure 3. Simplistic view of double-stranded genomic DNA being fragmented, leaving unpolished DNA fragments with overhangs or tails.
After DNA fragmentation, researchers use a mixture of different enzymes to produce blunt ends (flat, matching ends with no overhangs).
Once the fragments have blunt ends, researchers add a single adenine base to the 3' end of each strand to produce an A-overhang. This overhang helps in the adapter ligation step.
Figure 4. An unpolished DNA fragment is treated with enzymes to produce blunt ends. Afterward, an adenine base is added to each end of the fragment.
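The A-tailing step can be sketched with two strings, one per strand. This is a simplified model that assumes the fragment is already blunt, so the bottom strand is simply the reverse complement of the top strand; the helper names `reverse_complement` and `a_tail` are made up for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand of `seq`, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def a_tail(top_strand):
    """Add one A to the 3' end of each strand of a blunt, double-stranded fragment."""
    bottom_strand = reverse_complement(top_strand)
    # Both strands are written 5'->3', so the 3' end is the right-hand end of each string.
    return top_strand + "A", bottom_strand + "A"

top, bottom = a_tail("GATTC")
print(top, bottom)
```

Each strand now ends in an unpaired A, which is the overhang that a T-overhang adapter can pair with.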
In the ligation step, adapters (artificial DNA molecules with a T-overhang) pair with the A-overhangs on the DNA fragments and are joined to them in a ligation reaction. Each DNA fragment receives two adapters, one at each end, and the DNA in between is called an insert.
At this point, you have your fragments with attached adapters, which means you have a prepared library. Congratulations!
Figure 5. Adapters pair with the A-overhangs on the DNA fragment and flank each end of it.
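The relationship between adapters and the insert can be shown with plain strings. In this sketch the two adapter sequences are invented placeholders, not the sequences of any real kit, and the functions `ligate` and `extract_insert` are hypothetical names used only here.

```python
# Placeholder adapter sequences, invented for illustration (not real kit sequences).
ADAPTER_1 = "AAGGTTCC"
ADAPTER_2 = "GGTTAACC"

def ligate(insert):
    """Model a library molecule: adapter, then the insert, then the second adapter."""
    return ADAPTER_1 + insert + ADAPTER_2

def extract_insert(molecule):
    """Return the DNA between the two adapters, or None if the adapters are missing."""
    long_enough = len(molecule) >= len(ADAPTER_1) + len(ADAPTER_2)
    if long_enough and molecule.startswith(ADAPTER_1) and molecule.endswith(ADAPTER_2):
        return molecule[len(ADAPTER_1):len(molecule) - len(ADAPTER_2)]
    return None

library_molecule = ligate("ACGTACGT")
print(library_molecule, extract_insert(library_molecule))
```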
In some cases there is not enough DNA to sequence, so researchers use PCR to amplify the DNA molecules together with their adapters. For this reason, some versions of ligation-based library prep are PCR-free and others are not.
An important note when it comes to PCR for library preparation is that there are different types of PCRs used that are more favorable for massive amplification. We have an article that goes into greater detail about the different types of PCR out there and appropriate applications, including bridge amplification and emulsion PCR.
The flow cell also contains thousands of artificially made short DNA molecules called oligos that are complementary to the adapters. This facilitates the DNA attachment to the flow cell and its subsequent sequencing.
Figure 6. Flow cells contain oligos that are complementary to the adapters on the DNA fragments.
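Complementarity between an adapter and a flow cell oligo can be checked in a few lines. This is a sketch under simple assumptions: perfect Watson-Crick pairing over the full length, and a made-up adapter sequence. The function `can_hybridize` is a hypothetical helper, not part of any sequencing software.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the strand that pairs with `seq`, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def can_hybridize(adapter, oligo):
    """True if the flow cell oligo is the exact reverse complement of the adapter."""
    return oligo == reverse_complement(adapter)

adapter = "AAGGTTCC"                        # placeholder adapter, invented for illustration
flow_cell_oligo = reverse_complement(adapter)
print(flow_cell_oligo, can_hybridize(adapter, flow_cell_oligo))
```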
Thanks to the versatility of these adapters, researchers can also add additional synthetic sequences called indexes to the adapters.
Indexes are known synthetic DNA sequences unique to each sample that facilitate the identification of different samples or treatments in the same sequencing run on a flow cell.
Indexes allow:
- The identification of each strand of DNA
- The pooling of samples to be sequenced in the same flow cell
The process of adding indexes to the DNA fragments is known as multiplexing or barcoding.
Figure 7. Indexes used in a library prep for multiplexing or barcoding.
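Once samples are pooled, reads have to be sorted back to their sample of origin by index. The sketch below shows that sorting step in its simplest form. It assumes, purely for illustration, that each read starts with a 4-base index and that only exact matches count; the index sequences, sample names, and the function `demultiplex` are all made up.

```python
from collections import defaultdict

# Made-up 4-base indexes mapped to made-up sample names.
INDEX_TO_SAMPLE = {"ACGT": "sample_A", "TGCA": "sample_B"}

def demultiplex(reads, index_length=4):
    """Group reads by the sample whose index matches the start of each read."""
    bins = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        # Reads with an unknown index are kept apart rather than guessed at.
        sample = INDEX_TO_SAMPLE.get(index, "undetermined")
        bins[sample].append(insert)
    return dict(bins)

pooled_reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTTTTGGG", "NNNNACACAC"]
print(demultiplex(pooled_reads))
```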
PCR amplification is optional in ligation-based library prep, and whether it is used depends on the application, as described in the sections below.
Researchers must also be aware that amplification can introduce bias: a polymerase may incorporate an incorrect base, and that error is then copied in every later cycle and carried into the sequencing data.
The advantage of ligation prep is that PCR is not always needed, which avoids introducing amplification bias. The disadvantage, however, is that it is more time-consuming than tagmentation.
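A little arithmetic shows why amplification errors matter. The sketch below assumes an idealized PCR in which every molecule doubles in every cycle, and asks what share of the final pool descends from one molecule that picked up an error in a given cycle. The function `error_fraction` and its parameters are hypothetical, and real PCR efficiency is lower and less uniform than this model.

```python
def error_fraction(start_molecules, error_cycle):
    """Share of the final pool carrying an error made on one molecule in `error_cycle`.

    Assumes ideal doubling: after `error_cycle` cycles there are
    start_molecules * 2**error_cycle molecules, exactly one of which is erroneous,
    and every later cycle doubles correct and erroneous molecules alike.
    """
    molecules_after_cycle = start_molecules * 2 ** error_cycle
    return 1 / molecules_after_cycle

# Compare an error made in the first cycle with one made in the tenth cycle.
for cycle in (1, 10):
    print(cycle, error_fraction(start_molecules=1, error_cycle=cycle))
```

Under these assumptions, an error made early is inherited by a much larger share of the library than one made late, which is one reason low-input samples that need many cycles are more exposed to this problem.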
To learn more about DNA fragmentation, check our GoldBio article How to break up DNA for NGS library prep.
Tagmentation-based (Nextera®) Library Prep
Tagmentation-based library preparation is a process where the steps of cutting and ligating are performed by an enzyme called transposase.
Tagmentation-based library prep is also called Nextera® library preparation.
Steps in tagmentation-based (Nextera®) library prep
- Cutting and ligating fragments by transposases
- PCR amplification
Figure 8. Tagmentation method cuts and inserts adapters to DNA fragments in the same reaction using an enzyme called transposase.
Tagmentation library preparation starts with uncut genomic DNA that is mixed with transposases holding partial adapters in their structure.
Transposases can cut the genomic DNA and insert the adapters in a single reaction. This process creates small DNA fragments with adapters attached on each side.
Next, a PCR reaction uses these adapters already inserted as priming sites to amplify the DNA fragments and add more adapters.
Consequently, in Nextera®-prep methods, the PCR step follows and is mandatory.
The main advantage of tagmentation is that fragmentation and ligation occur in the same reaction, saving time during library preparation.
However, the disadvantage is that the PCR step is mandatory, so some misleading errors can be generated by amplification.
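The "cut and insert in one reaction" idea can be sketched as a single function. This is a simplified model: the cut positions are supplied by hand, the adapter strings are invented placeholders, and only pieces lying between two cut sites are returned, since only those would carry an adapter on both sides. The name `tagment` is hypothetical.

```python
def tagment(genome, cut_sites, adapter_left, adapter_right):
    """Cut `genome` at `cut_sites` and flank each interior fragment with adapters in one step."""
    sites = sorted(set(cut_sites))
    library = []
    for start, end in zip(sites, sites[1:]):
        # A fragment bounded by two transposase cuts has an adapter on each side.
        library.append(adapter_left + genome[start:end] + adapter_right)
    return library

toy_genome = "AAAACCCCGGGGTTTT"           # hypothetical uncut DNA
library = tagment(toy_genome, [4, 8, 12], adapter_left="TTAA", adapter_right="GGCC")
print(library)
```

Compared with the ligation workflow, there is no separate fragmentation, polishing, or ligation function here, which mirrors the time saved at the bench.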
When to Use ligation-based library preparation
Ligation-based library preparation can be used for whole genome sequencing (WGS), RNA sequencing (RNA-Seq), or methylation sequencing.
For each of these, there are different library preps.
List of library preps for Whole Genome Sequencing (WGS)
- TruSeq™ PCR free DNA library prep
- TruSeq™ Nano DNA library prep
List of library preps for RNA-Sequencing
- TruSeq™ Stranded total RNA library prep
- TruSeq™ small RNA library prep
- TruSeq™ Stranded mRNA library prep
List of library preps for Methylation Sequencing
- TruSeq™ DNA methylation library prep
Tagmentation-based library preps can be used for whole genome sequencing and exome sequencing. Below are different library preps for each application.
List of library preps for Whole Genome sequencing
- Nextera® DNA library prep
- Nextera® XT DNA library prep
List of library preps for Exome Sequencing
- Nextera® Rapid capture exome library prep
- Nextera® Rapid capture expanded exome library prep
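The lists above can be collected into a small lookup table, which makes it easy to see which method offers a prep for a given application. The dictionary below only restates the preps named in this article (trademark symbols omitted); the variable `LIBRARY_PREPS` and the function `preps_for` are hypothetical names for this sketch.

```python
# (method, application) -> preps listed in this article.
LIBRARY_PREPS = {
    ("ligation", "whole genome sequencing"): [
        "TruSeq PCR free DNA library prep",
        "TruSeq Nano DNA library prep",
    ],
    ("ligation", "RNA sequencing"): [
        "TruSeq Stranded total RNA library prep",
        "TruSeq small RNA library prep",
        "TruSeq Stranded mRNA library prep",
    ],
    ("ligation", "methylation sequencing"): ["TruSeq DNA methylation library prep"],
    ("tagmentation", "whole genome sequencing"): [
        "Nextera DNA library prep",
        "Nextera XT DNA library prep",
    ],
    ("tagmentation", "exome sequencing"): [
        "Nextera Rapid capture exome library prep",
        "Nextera Rapid capture expanded exome library prep",
    ],
}

def preps_for(application):
    """Return {method: preps} for every method that lists the given application."""
    return {
        method: preps
        for (method, app), preps in LIBRARY_PREPS.items()
        if app == application
    }

print(preps_for("whole genome sequencing"))
```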
- Adapters: Artificially-made short DNA molecules that are attached to DNA fragments and used to bind with the flow cell.
- Barcoding: The process of identifying DNA samples by adding indexes to the DNA fragments in a library prep.
- Index: A short artificially made DNA molecule used to assign unique codes to samples, allowing identification during sequencing.
- Insert: DNA fragment within two adapters.
- Library: A collection of randomly sized DNA fragments from a given sample to be sequenced.
- Ligation library prep: Method for ligating adapters to DNA fragments that will be sequenced.
- Multiplexing: The process of adding indexes to DNA fragments in a library prep.
- Oligos: Artificial DNA molecules attached to the flow cell that bind, by complementarity, to the adapters on DNA fragments.
- Reads: sequences of base pairs obtained from DNA fragments.
- Tagmentation library prep: Method for cutting and ligating adapters to DNA fragments using a transposase enzyme.
- Transposase: Enzyme used in tagmentation library preparation to cut DNA and ligate adapters to the resulting fragments.