DNA Analytics for Genomic Study

A variety of protocols are available for extracting genomic DNA (gDNA) from different sources. It is important to use a validated method that yields gDNA of the highest quality, suitable for nucleotide variation analysis (single or array-based qPCR or end-point PCR) and for sequencing (Sanger, NGS).

All isolated DNA samples should be tested for quality before any further DNA analysis. The routine check is the ratio of the light absorption of the DNA solution at 260 nm to that at 280 nm; the A260/280 ratio should be > 1.9 [See also ICH E18].
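The purity check above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample readings are hypothetical, and the 1.9 cut-off is simply the value quoted in the text.

```python
def a260_280_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 absorbance ratio of a DNA solution."""
    if a280 <= 0:
        raise ValueError("A280 must be a positive absorbance reading")
    return a260 / a280


def passes_purity_check(a260: float, a280: float, threshold: float = 1.9) -> bool:
    """True if the ratio exceeds the threshold (1.9 is the value quoted in the text)."""
    return a260_280_ratio(a260, a280) > threshold


if __name__ == "__main__":
    # Hypothetical absorbance readings, for illustration only.
    readings = {"sample_A": (1.00, 0.50), "sample_B": (0.90, 0.60)}
    for name, (a260, a280) in readings.items():
        ratio = a260_280_ratio(a260, a280)
        status = "pass" if passes_purity_check(a260, a280) else "fail"
        print(f"{name}: A260/280 = {ratio:.2f} ({status})")
```

A sample that fails the check would normally be re-purified or re-extracted before any genotyping or sequencing work.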

Nucleotide variations can be determined with a variety of methods that focus either (i) on targeted sequence areas or (ii) on broader sequencing approaches. The latter encompass whole-exome sequencing (WES); whole-gene sequencing by Sanger or NGS, including promoter, introns and exons, on single or multiple genes; and whole-genome sequencing (WGS), which covers the entire genome with the exception of specific complex loci with high sequence homology.

It is important to test for known functionally relevant nucleotide variations, regardless of whether they are located in the coding or non-coding region of the gene. In the genes important for pharmacokinetics and pharmacodynamics, intronic variants are functionally important in some cases (for CYP genes, about 2 % of all important SNPs), which reduces the usefulness of plain WES.

It is often important to validate the sequencing results, either by using an independent, analytically valid method or by re-sequencing a second amplicon from the same region. Furthermore, it is important to validate the data obtained against samples known to lack the variation in question and against samples known to carry the genetic variation of interest.
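One way to summarise validation against known positive and negative control samples is a simple concordance table. The sketch below is an assumption about how such a summary might look, not a prescribed procedure; the function name, the dictionary layout and the sample IDs are all hypothetical.

```python
def control_concordance(calls: dict, truth: dict) -> dict:
    """Compare variant calls with the known status of control samples.

    Both arguments map a sample ID to True (variant present) or False (absent).
    """
    tp = fp = tn = fn = 0
    discordant = []
    for sample_id, has_variant in truth.items():
        called = calls[sample_id]
        if called != has_variant:
            discordant.append(sample_id)
        if has_variant and called:
            tp += 1
        elif has_variant and not called:
            fn += 1
        elif not has_variant and called:
            fp += 1
        else:
            tn += 1
    return {
        # None means no controls of that kind were included.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "discordant": discordant,
    }


if __name__ == "__main__":
    # Hypothetical controls: two known carriers, two known non-carriers.
    truth = {"pos1": True, "pos2": True, "neg1": False, "neg2": False}
    calls = {"pos1": True, "pos2": False, "neg1": False, "neg2": False}
    print(control_concordance(calls, truth))
```

Any sample listed as discordant would be a candidate for re-testing with an independent method.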

Current practice for the analysis of genetic variants includes SNP analyses (single or array-based qPCR, hybridization etc.), PCR (end-point or quantitative, with or without restriction-enzyme digestion) and sequencing (Sanger, NGS). A variety of procedures with different technical and/or chemical approaches are currently used for genomic biomarker analytics involving subjects. The main difference between the testing approaches lies in the number of variants tested per gene.

Methods using primer-based technologies are prone to allele drop-out artefacts: specific mutations can prevent primer hybridization, leading to erroneous genotyping results and therefore to an inaccurate phenotype assignment. This should be avoided by identifying the respective allele drop-outs or by using tests that are able to avoid known allele drop-outs.

Caution should be applied when proxy-SNPs are used to predict the presence of functionally relevant SNPs, since the linkage between the proxy-SNP and the functionally important SNP is not absolute. Preference should be given to direct analysis of the respective functionally relevant SNPs, whether by sequencing, by an array-based method or by another approach that analyzes the functional SNPs directly. Where proxy- or tag-SNPs have to be used, a risk estimate for miscalls should be included in the analytical report.
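As a rough illustration of what a miscall risk estimate for a proxy-SNP could look like, the sketch below derives two rates from haplotype counts in a reference panel. The function name, the input layout and the counts are hypothetical, and a real report would also need to consider the population the panel was drawn from.

```python
def proxy_miscall_rates(n_both: int, n_proxy_only: int,
                        n_functional_only: int, n_neither: int) -> dict:
    """Estimate how often a proxy allele mispredicts the functional allele.

    Counts are haplotypes carrying both alleles, only the proxy allele,
    only the functional allele, or neither (n_neither is used only for the total).
    """
    proxy_carriers = n_both + n_proxy_only
    functional_carriers = n_both + n_functional_only
    if proxy_carriers == 0 or functional_carriers == 0:
        raise ValueError("panel must contain carriers of both alleles")
    return {
        # Proxy allele present although the functional allele is absent.
        "false_positive_rate": n_proxy_only / proxy_carriers,
        # Functional allele present although the proxy allele is absent.
        "false_negative_rate": n_functional_only / functional_carriers,
        "haplotypes": n_both + n_proxy_only + n_functional_only + n_neither,
    }


if __name__ == "__main__":
    # Hypothetical reference-panel counts, for illustration only.
    print(proxy_miscall_rates(n_both=95, n_proxy_only=5,
                              n_functional_only=2, n_neither=898))
```

The two rates are reported separately because a falsely predicted functional variant and a missed one can have different consequences for phenotype assignment.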

Prior to use in clinical trials or in a diagnostic setting, the testing procedures need proper validation. The implementation of such a test must comply with existing regional guidelines, and the test should preferably be validated with respect to genetic variability by two different sequencing methods, e.g. NGS and Sanger techniques. Certain exemptions may apply for in-house tests.

NGS-based genetic test workflows include DNA extraction, DNA processing, preparation of libraries, generation of sequence reads and base calling, sequence mapping, variant annotation and filtering, and variant classification and interpretation. All of these steps need to be carried out carefully using validated methods and subjected to continuous, rigorous quality control. For NGS-based sequencing the DNA quality must be very high. Before starting a new project, it is recommended to analyse a small number of representative samples using NGS. The chosen DNA isolation methods should be shown to yield satisfactory results before the full study is initiated. The Quality Control (QC) steps necessary for the development of an in-house diagnostic need to be followed.

A specific issue affecting the reliability of NGS is the coverage the method provides for a specific DNA sequence. It is recommended that the technical predictive value should be at least 99.9%. For germline genetics, a minimum coverage of >30x seems to be a reasonable goal. If, however, the allele frequency of the mutations analyzed is very low, a higher coverage is needed to ensure that the rarer variants are also detected by the sequencing.
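The relationship between coverage and the chance of detecting a variant can be illustrated with a simple binomial model. This is a simplified sketch, not a validated calculation: it assumes reads are sampled independently, ignores sequencing and mapping errors, and uses a hypothetical calling rule of at least five variant-supporting reads.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float,
                          min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under a Binomial(depth, allele_fraction) model."""
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must lie between 0 and 1")
    # Probability of seeing fewer variant reads than the calling rule requires.
    p_below = sum(
        comb(depth, i) * allele_fraction ** i * (1.0 - allele_fraction) ** (depth - i)
        for i in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Heterozygous germline variant (about half of the reads) versus a rare allele.
    for depth in (30, 100, 300):
        for fraction in (0.5, 0.05):
            p = detection_probability(depth, fraction, min_alt_reads=5)
            print(f"depth {depth:>3}x, allele fraction {fraction:.2f}: P(detect) = {p:.4f}")
```

Under these assumptions a heterozygous variant is detected almost always at 30x, while a variant present in only a small fraction of reads needs considerably deeper coverage to reach a comparable detection probability.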

NGS analysis of complex loci with high GC-content (guanine-cytosine content) or with highly homologous genes and pseudogenes can produce miscalled variants due to sequencing artefacts. In such cases it is therefore recommended to include methods that use substantially longer read lengths, i.e. fragments >1000 base pairs. This can be achieved with initial DNA amplification using long PCR techniques, or with synthetic long-read methods, which partition and barcode longer DNA molecules before standard library preparation so that the short reads can be assembled into longer fragments. It is acknowledged that these techniques are not always technically possible, e.g. when using Formalin-Fixed Paraffin-Embedded (FFPE) samples.
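To decide where longer reads might be worth considering, GC-rich stretches of a target region can be flagged with a sliding window. The sketch below is illustrative only: the window size, step and the 0.65 threshold are arbitrary assumptions, not values taken from any guideline, and the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def high_gc_windows(seq: str, window: int = 100, step: int = 50,
                    threshold: float = 0.65) -> list:
    """Return (start, end, gc_fraction) for each window at or above the threshold."""
    if not seq:
        return []
    hits = []
    # A sequence shorter than the window is evaluated as a single window.
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        fraction = gc_fraction(chunk)
        if fraction >= threshold:
            hits.append((start, start + len(chunk), fraction))
    return hits


if __name__ == "__main__":
    # Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
    toy = "AT" * 10 + "GC" * 10
    for start, end, fraction in high_gc_windows(toy, window=10, step=10):
        print(f"{start}-{end}: GC = {fraction:.2f}")
```

Coordinates are zero-based and end-exclusive, which matches Python slicing; they would need converting before being compared with one-based genome annotations.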

