Added Discovery Character section
Description: Adds surprise level and mode of discovery (serendipity vs systematic vs Edisonian)
# [SCI] Genomics & Computational Biology

**Genomics** is the large-scale study of entire genomes (their sequencing, structure, function, and evolution), made possible by the convergence of molecular biology, chemistry, and computational methods.

## Overview

Watson and Crick's DNA double helix (1953) revealed the information storage mechanism. Sanger's chain-termination sequencing (1977) made it possible to read DNA sequences. The Human Genome Project (1990–2003) produced an essentially complete human genome sequence at a cost of roughly USD 3 billion. Illumina's short-read sequencing (2007) helped reduce the cost to roughly USD 1,000 per genome by the mid-2010s. CRISPR-Cas9 (Doudna & Charpentier, 2012) enables precise genome editing. Bioinformatics, the application of information theory, statistics, and ML to genomic data, is now a major discipline.

## Key Figures & Recognition

- **Watson, Crick, Franklin, Wilkins**: DNA structure. **Nobel Prize 1962** (Watson, Crick, Wilkins; Franklin died in 1958).
- **Frederick Sanger** (1918–2013): DNA sequencing. **Nobel Prize 1980** (his second Nobel).
- **Jennifer Doudna** (1964–) & **Emmanuelle Charpentier** (1968–): CRISPR-Cas9. **Nobel Prize 2020**.

## Seminal Papers

- Watson, J. & Crick, F. "Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid." *Nature* 171 (1953).
- [Sanger, F. et al. "DNA sequencing with chain-terminating inhibitors." *PNAS* 74 (1977)](https://doi.org/10.1073/pnas.74.12.5463)

## What This Enables

- **[TECH] AI & Large Language Models**: Protein structure predictors build on the same attention-based architectures as LLMs. ESMFold is built on a transformer protein language model trained on protein sequence databases; AlphaFold is not a language model in the strict sense, but it uses attention over multiple sequence alignments and is trained on known protein structures.

## Discovery Character

**Surprise level**: High. The Human Genome Project's completion (2003) revealed far fewer genes than expected (~20,000 vs. 100,000 predicted) and vast non-coding regions of unclear function. AlphaFold's breakthrough on the 50-year-old protein-folding problem (2020) was a genuine shock to the structural biology community.

**Mode**: Systematic with competitive urgency and ethical complexity. Watson and Crick raced against Pauling; the double helix discovery used Franklin's X-ray data (Photo 51) without her knowledge or consent, a celebrated but ethically troubled origin. Modern genomics is Edisonian in data generation (sequence everything, analyse later) but increasingly systematic in interpretation via ML.

# Parents

* [TECH] Digital Computing
* [SCI] Information Theory
* [SCI] Machine Learning Theory
* [SCI] Molecular Biology & Biochemistry
* [SCI] Deep Learning