Review the key concepts, formulae, and examples before starting your quiz.
🔑Concepts
The Human Genome Project (HGP) was a 13-year 'Mega Project' (1990–2003) coordinated by the U.S. Department of Energy and the National Institutes of Health.
The human genome contains approximately $3 \times 10^9$ base pairs (bp).
Methodology - Expressed Sequence Tags (ESTs): An approach focused on identifying all the genes that are expressed as RNA.
Methodology - Sequence Annotation: Sequencing the entire genome (both coding and non-coding regions) and later assigning functions to the different regions.
Vectors and Hosts: Fragments of DNA were cloned in suitable hosts such as bacteria and yeast using specialized vectors called BAC (Bacterial Artificial Chromosomes) and YAC (Yeast Artificial Chromosomes).
Salient Features: The average gene consists of $3000$ bases; the largest known human gene is dystrophin at $2.4$ million bases.
The total number of genes is estimated at about $30{,}000$, and $99.9\%$ of nucleotide bases are exactly the same in all people.
Less than $2\%$ of the genome codes for proteins, while a large portion consists of repetitive sequences.
Scientists have identified about $1.4$ million locations where single-base differences (SNPs - Single Nucleotide Polymorphisms) occur in humans.
📐Formulae
💡Examples
Problem 1:
Given that the human genome has $3 \times 10^9$ bp and the average cost of sequencing was USD 3 per base pair, determine why HGP is considered a 'Mega Project' in terms of finance and data storage.
Solution:
- Finance: $3 \times 10^9 \text{ bp} \times \$3 = \$9 \text{ billion}$.
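- Data Storage: If the sequence were stored in books with $1000$ pages each and each page had $1000$ letters, each book would hold $10^6$ letters, so $3 \times 10^9$ bp would require about $3000$ books (the commonly quoted figure of $3300$ books corresponds to a genome of about $3.3 \times 10^9$ bp).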
Explanation:
The sheer scale of the financial investment (about USD 9 billion) and the massive requirement for high-speed computational devices for data storage and analysis categorize HGP as a mega project.
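The arithmetic in Problem 1 can be checked with a short Python sketch. The genome size ($3 \times 10^9$ bp) and cost (USD 3 per bp) come from the problem; the book dimensions (1000 pages per book, 1000 letters per page) are assumed here, and the function names are hypothetical, chosen only for illustration.

```python
# Figures from Problem 1; the book dimensions are assumptions for this sketch.
GENOME_BP = 3 * 10**9        # base pairs in the human genome
COST_PER_BP_USD = 3          # sequencing cost per base pair
PAGES_PER_BOOK = 1000        # assumed pages per book
LETTERS_PER_PAGE = 1000      # assumed letters (bases) per page


def sequencing_cost(bp, cost_per_bp):
    """Total cost of sequencing `bp` base pairs."""
    return bp * cost_per_bp


def books_needed(bp, pages_per_book, letters_per_page):
    """Books needed to print `bp` letters, rounded up to whole books."""
    letters_per_book = pages_per_book * letters_per_page
    return -(-bp // letters_per_book)  # ceiling division on integers


total_cost = sequencing_cost(GENOME_BP, COST_PER_BP_USD)
total_books = books_needed(GENOME_BP, PAGES_PER_BOOK, LETTERS_PER_PAGE)

print(f"Estimated cost: USD {total_cost:,}")
print(f"Books required: {total_books:,}")
```

Changing `GENOME_BP` to `33 * 10**8` shows how sensitive the book count is to the genome size used.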
Problem 2:
A researcher is studying a specific sequence difference at a single base in the human population. What is this phenomenon called and how many such locations are identified in the human genome?
Solution:
Such single-base differences are called SNPs (Single Nucleotide Polymorphisms), pronounced 'snips'. About $1.4$ million such locations have been identified in the human genome.
Explanation:
These are crucial for finding chromosomal locations for disease-associated sequences and tracing human history.
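To make the idea of a single-base difference concrete, here is a minimal Python sketch that compares two already-aligned sequences of equal length and reports the positions where they differ. The sequences and the function name are made up for illustration. A true SNP is a variation observed across a population, so this only shows the underlying comparison, not how SNPs were catalogued in the HGP.

```python
def find_single_base_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length;
    positions are 0-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical short sequences differing at a single base.
reference = "ATGCCGTAAGCT"
sample = "ATGCCATAAGCT"

differences = find_single_base_differences(reference, sample)
print(differences)
```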