Topological data analysis for genomics and evolution : topology in biology /


Raúl Rabadán, Andrew J. Blumberg.
Book English 2020 · Electronic books.

Other title
Contributors
Published
Cambridge University Press
Extent
1 online resource (xvii, 501 pages) : digital, PDF file(s).
Edition
1st ed.
Notes
Title from publisher's bibliographic system (viewed on 21 Nov 2019).
Cover -- Half-title page -- Title page -- Copyright page -- Dedication -- Contents -- List of Contributors -- Preface -- Introduction -- 0.1 Why Algebraic Topology? -- 0.2 Combinatorial Algebraic Topology -- 0.3 Topological Data Analysis (TDA) -- 0.4 Genetics and Genomics -- 0.5 Why Is Topological Data Analysis Useful in Genomics? -- 0.6 What Is in This Book? -- Part I Topological Data Analysis -- 1 Basic Notions of Algebraic Topology -- 1.1 Sets -- 1.2 Metric Spaces -- 1.3 Topological Spaces -- 1.3.1 Maps between Topological Spaces -- 1.3.2 Homeomorphisms -- 1.4 Continuous Deformations and Homotopy Invariants -- 1.4.1 Homotopy Groups -- 1.5 Gluing and CW Complexes -- 1.6 Algebra -- 1.6.1 Groups -- 1.6.2 Homomorphisms -- 1.6.3 New Groups from Old -- 1.6.4 The Group Structure on π_n(X, x) -- 1.6.5 Rings and Fields -- 1.6.6 Vector Spaces and Linear Algebra -- 1.7 Category Theory -- 1.7.1 Functors -- 1.8 Simplicial Complexes -- 1.9 The Euler Characteristic -- 1.10 Simplicial Homology -- 1.10.1 Chains and Boundaries -- 1.10.2 Homology Groups -- 1.10.3 Homology of Chain Complexes -- 1.10.4 Simplicial Homology with Coefficients in an Abelian Group -- 1.11 Manifolds -- 1.12 Morse Functions and Reeb Spaces -- 1.13 Summary -- 1.14 Suggestions for Further Reading -- 2 Topological Data Analysis -- 2.1 Simplicial Complexes Associated to Data -- 2.2 The Niyogi-Smale-Weinberger Theorem -- 2.3 Persistent Homology -- 2.4 Stability of Persistent Homology under Perturbation -- 2.5 Zigzag Persistence -- 2.6 Multidimensional Persistence -- 2.6.1 Multidimensional Persistence -- 2.6.2 The Persistent Homology Transform -- 2.7 Efficient Computation of Persistent Homology -- 2.8 Multiscale Clustering: Mapper -- 2.9 Towards Persistent Algebraic Topology -- 2.10 Summary -- 2.11 Suggestions for Further Reading -- 3 Statistics and Topological Inference -- 3.1 What Can Topological Data Analysis Tell Us? -- 3.1.1 Persistent Homology and Sampling -- 3.1.2 Topological Inference -- 3.2 Background: Geometric Sampling and Metric Measure Spaces -- 3.2.1 Metric Measure Spaces -- 3.2.2 The Fréchet Mean and Variance of a Metric Measure Space -- 3.2.3 Distances on Measures and Metric Measure Spaces -- 3.3 Probability Theory in Barcode Space -- 3.3.1 Polish Spaces of Barcodes -- 3.3.2 Sampling and Hypothesis Testing in Barcode Space -- 3.4 Stability Theorems for Persistent Homology of Metric Measure Spaces -- 3.5 Estimating Persistent Homology from Samples -- 3.5.1 Estimating Persistent Homology by Density Estimation -- 3.5.2 Estimating Persistent Homology by Resampling -- 3.6 Summarizing Persistence Diagrams -- 3.6.1 Tractable Features from Persistence Diagrams -- 3.6.2 Kernel Methods for Barcodes -- 3.6.3 Persistence Landscapes -- 3.6.4 Coordinates on Persistent Homology -- 3.7 Stochastic Topology and the Expected Persistent Homology of Random Complexes -- 3.8 Euler Characteristics in Topological Data Analysis -- 3.9 Exploratory Data Analysis with Mapper -- 3.10 Summary -- 3.11 Suggestions for Further Reading -- 4 Dimensionality Reduction, Manifold Learning, and Metric Geometry -- 4.1 A Quick Refresher on Eigenvectors and Eigenvalues -- 4.2 Background on PCA and MDS -- 4.3 Manifold Learning -- 4.3.1 Isomap -- 4.3.2 Local Linear Embedding (LLE) -- 4.3.3 Laplacian Eigenmaps -- 4.3.4 Manifold Learning and Kernel Methods -- 4.3.5 Discrete Harmonic Analysis -- 4.3.6 Other Manifold Learning Techniques -- 4.3.7 Manifolds of Differing Dimension -- 4.4 Neighbor Embedding Algorithms -- 4.4.1 Stochastic Neighbor Embedding (SNE) -- 4.4.2 t-Distributed Stochastic Neighbor Embedding (t-SNE) -- 4.4.3 Reliable Use of t-SNE -- 4.5 Mapper and Manifold Learning -- 4.6 Dimensionality Estimation -- 4.7 Metric Trees and Spaces of Phylogenetic Trees -- 4.7.1 Inferring Trees from Metric Data -- 4.7.2 The Billera-Holmes-Vogtmann Metric Spaces of Phylogenetic Trees -- 4.7.3 Metric Geometry -- 4.8 Summary -- 4.9 Suggestions for Further Reading -- Part II Biological Applications -- 5 Evolution, Trees, and Beyond -- 5.1 Introduction -- 5.2 Evolution and Topology -- 5.3 Viral Evolution: Influenza A -- 5.3.1 Influenza A -- 5.3.2 Reassortments in Influenza through TDA -- 5.3.3 Influenza Virus Evolution and the Space of Phylogenetic Trees -- 5.4 Viral Evolution: HIV -- 5.4.1 Human Immunodeficiency Virus -- 5.4.2 Viral Recombination in HIV -- 5.4.3 Viral Recombination in Late-Stage HIV Infection -- 5.5 Other Viruses -- 5.6 Bacterial Evolution -- 5.6.1 Horizontal Gene Transfer in Bacteria -- 5.6.2 Pathogenic Bacteria -- 5.6.3 Multilocus Sequence Typing Analysis -- 5.6.4 Protein Family Analysis -- 5.6.5 Antibiotic Resistance in Staphylococcus aureus -- 5.7 Persistent Homology Estimators in Population Genetics -- 5.7.1 Coalescent Process -- 5.7.2 Statistical Model -- 5.7.3 Coalescent Simulations -- 5.8 Recombination Landscape in Humans -- 5.8.1 Fine-Scale Resolution of Human Recombination -- 5.9 Gene Trees and Species Trees -- 5.10 Extensions: Median Complex and Topological Minimal Graphs -- 5.10.1 The Median Complex Construction -- 5.10.2 Topological Minimal Graphs and Barcode Ensembles -- 5.11 Summary -- 5.12 Suggestions for Further Reading, Databases, and Software -- 6 Cancer Genomics -- 6.1 A Brief History of Cancer -- 6.2 Cancer in the Era of Molecular Biology -- 6.3 The Standard Model of Tumor Evolution -- 6.4 Cancer in the Era of Genomic Data -- 6.4.1 Point Mutations -- 6.4.2 Copy Number Alterations -- 6.4.3 Gene Fusions and Translocations -- 6.4.4 Viruses -- 6.5 Differential Gene Expression Analysis in Cancer -- 6.6 The Space of Glioblastomas -- 6.7 Cross-Sectional Data in Cancer and Patient Stratification Using Expression Data -- 6.8 Cross-Sectional Data in Cancer and Identifying Driver Genes in Cancer -- 6.9 The Tissue of Origin of Melanomas -- 6.10 Association between Drug Sensitivity and Genomic Alterations -- 6.11 Summary -- 6.12 Suggestions for Further Reading and Databases -- 7 Single Cell Expression Data -- 7.1 Introduction to Single Cell Technologies -- 7.2 Identifying Distinct Cell Subpopulations in Cancer -- 7.2.1 Clonal Heterogeneity from Single Cell Tumor Genomics -- 7.3 Asynchronous Differentiation Processes -- 7.4 Differentiation in Human Preimplantation Embryos -- 7.5 Summary -- 7.6 Suggestions for Further Reading, Databases, and Software -- 8 Three-Dimensional Structure of DNA -- 8.1 Background -- 8.2 TDA and Chromatin Structure -- 8.3 Simulations -- 8.4 The Topology of Bacterial DNA -- 8.5 The Topology of Human DNA -- 8.6 Summary -- 8.7 Suggestions for Databases and Software -- 9 Topological Data Analysis beyond Genomics -- 9.1 Topological Study of Series Analysis -- 9.1.1 Time Series Analysis of Gene Expression Data -- 9.1.2 Time Series Analysis Using Topological Data Analysis -- 9.1.3 Topological Data Analysis of Sliding Windows -- 9.1.4 Identification of Copy Number Alterations -- 9.2 Topological Data Analysis in Networks and Neuroscience -- 9.2.1 Cellular Scales: Neuronal Activity -- 9.2.2 Mesoscopic Scales: Brain Functional Networks -- 9.3 Topological Approaches to Biomedical Imaging -- 9.4 Spreading of Infectious Diseases -- 9.5 Summary -- 9.6 Suggestions for Further Reading -- 10 Conclusions -- Appendix A Algorithms in Topological Data Analysis -- Appendix B Introduction to Population Genetics -- Appendix C Molecular Phylogenetics -- References -- Index.
Biology has entered the age of Big Data. A technical revolution has transformed the field, and extracting meaningful information from large biological data sets is now a central methodological challenge.
Algebraic topology is a well-established branch of pure mathematics that studies qualitative descriptors of the shape of geometric objects. It aims to reduce comparisons of shape to a comparison of algebraic invariants, such as numbers, which are typically easier to work with. Topological data analysis is a rapidly developing subfield that leverages the tools of algebraic topology to provide robust multiscale analysis of data sets. This book introduces the central ideas and techniques of topological data analysis and its specific applications to biology, including the evolution of viruses, bacteria and humans, genomics of cancer, and single cell characterization of developmental processes. Bridging two disciplines, the book is for researchers and graduate students in genomics and evolutionary biology as well as mathematicians interested in applied topology.
Subjects
Genre
Dewey
ISBN
1-316-67166-6
