Lectures on bioinformatics: from statistics to genetic constructs

    There are many activities and projects that can help you immerse yourself in a relatively new scientific field. In recent years, their number and variety of formats have grown considerably: open lectures and entire science festivals, online courses and programs, summer internships and schools, informal lectures in bars, open-source projects, and so on.

    For five years, the Institute of Bioinformatics has been gathering bioinformatics scientists and students from all over the country for a week-long intensive summer school outside the city, steering biologists, physicians, computer scientists and mathematicians towards bioinformatics, which is still a rapidly developing field. Since 2013, we have been recording the lectures on video and collecting a selection of useful materials for those who cannot take part in the events but would like to develop in this area.

    The school program is designed to bring together the worlds of biology and programming and to stimulate not only professional development, but also interdisciplinary communication.


    We continue to share the archive of summer school lecture videos. Lectures that can be watched without additional training are marked with "*". The other lectures require some knowledge of biology and programming. Below the cut are descriptions of the lectures and links to slides and videos.

    Statistics in bioinformatics





    Statistical analysis of biomedical data (Mikhail Pyatnitsky, Orekhovich Research Institute of Biomedical Chemistry)
    Video | Slides

    The lecture is devoted to practical aspects of the statistical analysis of omics data. In particular, it describes methods of exploratory analysis, pattern recognition, and cluster analysis.

    How to work with data and not feel helpless? (Nikita Alekseev, George Washington University)
    Video | Slides

    On the one hand, the natural sciences produce huge amounts of data and ask a variety of questions about it. On the other hand, statistics offers many methods for answering such questions. This abundance brings its own difficulties: how to choose a method that suits exactly your problem, how to take all the nuances into account, and how not to get lost along the way. There is no universal recipe. The lecture discusses various approaches to this problem.

    How to ask your statistician friend a question (Nikita Alekseev, postdoc, George Washington University)
    Video | Slides

    The lecture will be useful to anyone who runs into problems with statistical data processing: what solutions are available, what difficulties arise, and what to ask a statistician you have managed to start collaborating with in order to get the most benefit for your project.

    Immunoinformatics




    Analysis of immune receptor repertoires (Vadim Nazarov, Higher School of Economics, Institute of Bioorganic Chemistry, RAS)
    Video | Slides

    The use of NGS technology in immunology has made it possible to sequence cell receptor repertoires very deeply. Unfortunately, it is impossible to simply look at the resulting data and gain insights; methods for analyzing repertoires have to be developed. The lecture covers which methods have been developed, how adequate they are, where the field is heading, and where you might find your place in it.

    Immunoinformatics: an algorithmic approach to solving applied problems of immunology (Yana Safonova, Center for Algorithmic Biotechnology, St. Petersburg State University)
    Video | Slides

    Analysis of the adaptive immune system is an important step in developing drugs, evaluating the effectiveness of treatment, and studying various diseases. Modern NGS technologies have made deep sequencing of antibody and T-cell receptor repertoires possible, which contributed to the emergence of a new field of bioinformatics: immunoinformatics.

    Immunoinformatics solves problems with applications in various areas of immunology: monitoring the development of the immune response, analyzing the evolution of repertoires, and understanding the diversity of the adaptive immune system. The lecture covers the tasks of modern immunoinformatics and discusses the prospects for its development.



    Molecular barcoding, analysis of repertoires of T-cell receptors and antibodies (Dmitry Chudakov, Head of the Laboratory of Adaptive Immunity Genomics at the Institute of Bioorganic Chemistry, Russian Academy of Sciences, Head of the Adaptive Immunity Group at CEITEC MU, Masaryk University)
    Video | Slides

    High-throughput sequencing of genomic fragments of interest (targeted resequencing) potentially allows for in-depth analysis that reveals rare sequence subvariants in a sample and gives a complete picture of the structure of sequence diversity in that sample.

    However, bottlenecks at the stages of obtaining and preparing samples for massive sequencing, quantitative distortions caused by the stochastic nature of PCR, unequal amplification and sequencing efficiency of different sequences, and the accumulation of PCR and sequencing errors significantly limit the possibilities of such analysis.

    Unique molecular barcoding (unique molecular barcodes, unique molecular identifiers, UMI) makes it possible to radically improve the quality of sequencing, including that of long sequences, to correct accumulated errors effectively without losing the real diversity of variants, to eliminate quantitative distortions, and to normalize samples almost perfectly for comparative analysis.

    The lecture describes how molecular barcoding approaches work with examples from personal experience with the repertoires of immune cell receptors — T-cell receptors and antibodies.
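    The idea behind UMI-based error correction can be illustrated with a minimal sketch. It is not the lecturer's pipeline; it assumes that reads have already been paired with their UMI, that reads sharing a UMI derive from one starting molecule, and that a simple per-position majority vote is an acceptable consensus. The function names and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Keep only reads of the most common length so that positions line up
        # (a simplification: real pipelines align reads instead).
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        aligned = [s for s in seqs if len(s) == length]
        # A majority vote at each position discards sporadic errors.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*aligned)
        )
    return consensus


def count_molecules(consensus):
    """Count starting molecules per variant: each UMI contributes exactly one."""
    return Counter(consensus.values())


# Toy data (made up): three molecules, one read carrying a simulated error.
reads = [
    ("AAAC", "TGTGCCAGC"),
    ("AAAC", "TGTGCCAGC"),
    ("AAAC", "TGTGACAGC"),  # error at position 4, outvoted by the other reads
    ("GGTA", "TGTGCCAGC"),
    ("CTTG", "TGTGCTAGC"),
    ("CTTG", "TGTGCTAGC"),
]
consensus = umi_consensus(reads)
molecule_counts = count_molecules(consensus)
```

    Counting UMIs rather than reads is what removes the quantitative distortion: a variant that happened to amplify well still counts once per starting molecule.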

    Systems Biology


    Introduction to Systems Biology (Ilya Serebriysky, Fox Chase Cancer Center, USA)
    Video | Slides

    The lecture gives a general idea of the systemic properties of biological objects and briefly describes the main components of systems biology: interactomics and model building. It also covers some achievements of systems biology (selectively, mainly in oncology) and the corresponding publicly available resources (TCGA / cBioPortal, CCLE).

    Computational systems biology for the study and treatment of cancer (Andrei Zinoviev, Institut Curie)
    Video | Slides






    Computational systems biology of cancer applies general systems biology approaches, namely the system-wide collection of genome-wide data and its mathematical modeling, to the study of carcinogenesis and to the prediction and development of new cancer treatments. This approach has to account for a number of specific features: the rapid evolution of a biological system under genomic and epigenomic instability, interactions with cells of the normal stroma, the effects of various intercellular factors, and the diversity and quality of clinical material. The lecture briefly describes several characteristic approaches to the analysis and modeling of data in cancer biology: the principles of formalizing knowledge of cancer biochemistry and using it in modeling (Atlas of Cancer Signalling Network), approaches to the deconvolution of genome-wide molecular profiles in cancer, and the construction of discrete mathematical models to predict tumor evolution.

    The problem of reproducibility in systems biology and beyond (Ilya Serebriysky, Fox Chase Cancer Center, USA)
    Video | Slides

    Reproducibility of results is a key issue for modern biology, and especially for systems biology. The lecture reviews the current state of affairs, the main reproducibility problems and their causes, the responsibility of organizations, scientific journals and researchers, the specifics of the problem in systems biology, and the main directions for solving it.

    Miscellanea


    “Motifs”: patterns in genomic sequences (Ivan Kulakovsky, IMB RAS; ING RAS)
    Video | Slides

    From the point of view of molecular biology, the lecture discusses the regulation of gene transcription in higher eukaryotes and the role of regulatory proteins, the transcription factors. From the point of view of bioinformatics, the lecturer explains how a computer representation of motifs, the characteristic patterns in genomic texts, helps to find the regulatory signals in DNA that transcription factors recognize. From the point of view of computer science, he treats the construction of a motif model as the task of finding local similarity in a set of sequences.

    Annotation of promoter sequences (Tatiana Tatarinova, University of Southern California)
    Video | Slides

    The lecture addresses the regularities and properties of promoter sequences: motifs and methylation of promoters, algorithms for the prediction and analysis of promoter sequences, and applications in biotechnology.
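    One common way to represent a motif and look for it in a promoter is a position weight matrix (PWM). The sketch below is an illustration under stated assumptions, not the method presented in the lectures: the binding sites and the promoter are made up, the background is uniform (0.25 per base), sequences contain only A, C, G and T, and the function names `build_pwm` and `scan` are hypothetical.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build log-odds scores per position from equal-length example sites."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        # log2 of (smoothed base frequency / background frequency)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        hits.append((start, sum(pwm[i][base] for i, base in enumerate(window))))
    return hits


# Made-up TATA-box-like sites and a made-up promoter fragment.
sites = ["TATAAA", "TATATA", "TATAAA", "TTTAAA"]
pwm = build_pwm(sites)
promoter = "GCGCTATAAAGGCGC"
best_start, best_score = max(scan(promoter, pwm), key=lambda hit: hit[1])
```

    Here `best_start` points at the window that best matches the motif. Real tools additionally choose a score threshold and scan both DNA strands.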

    Prediction of Origin Based on Admixture GPS and Readmix Algorithms (Tatiana Tatarinova, University of Southern California)
    Video | Slides

    The lecture is dedicated to genotyping and the selection of informative positions in the genome, a review of modern technologies, and the prediction of the biogeographical origin of humans and other organisms by genome analysis. It also analyzes and compares existing biogeography algorithms.

    Algorithms in bioinformatics (Anton Banevich, Center for Algorithmic Biotechnology, St. Petersburg State University)
    Video | Slides

    An introductory lecture on algorithms in bioinformatics, which discusses the main approaches and examples of their use.

    The connection between the brain and Deep Learning (Dmitry Fishman, Quretec, University of Tartu, Estonia)
    Video | Slides

    The lecture consists of four parts. The first looks at how the brain processes different signals from the outside world and makes decisions based on them. The second traces the evolution of machine learning methods that led to the emergence of deep learning, a technology that has revolutionized many areas of science. The third focuses on the similarities and differences between the basic principles of the brain and of Deep Learning. Finally, the lecturer gives several examples of the successful use of Deep Learning in bioinformatics and shows what can be achieved in medical imaging with Deep Neural Networks.

    This lecture was created by representatives of the Computational Neuroscience Research Group of the University of Tartu. In particular, the idea and the slides belong to Raul Vincente and Ilya Kuzovkin. Original presentation in English.

    Prospects for artificial modification of human genotypes (Alexey Kondrashov, MGU, MSU)
    Video

    No laws of nature prohibit the synthesis of long DNA molecules with a given sequence. What will be the phenotype of a person whose genotype carries no young derived alleles? That depends on how common sign epistasis and narrowing epistasis are. The lecture examines approaches to studying this question.

    Bioinformatics in the synthesis of genetic constructs (Pavel Yakovlev, BIOCAD)
    Video | Slides











    Advances in in silico molecular design make it possible to build any protein construct with desired properties. The resulting amino acid sequences are likely to form proteins with the desired functionality. But a new challenge arises: building a cell line that would synthesize such proteins. The lecture deals with the questions that come up in solving this problem: why it is impossible to just take any reverse transcript, how to assemble the required gene, how to insert it into a vector, and, of course, where bioinformatics comes in.

    Review of modern genomic measurements of individual cells (Peter Harchenko, Harvard University)
    Video | Slides






    The study of complex tissues and the classification of cell types have traditionally been based on morphological and cytological properties. Several experimental technologies now make it possible to study the genomic characteristics of individual cells and to measure hundreds or thousands of cells simultaneously. The lecture gives an overview of these technologies and of the bioinformatic methods used to classify cell types, states and genetic lineages from such data.

    The use of omics data in the study of human evolution (Philip, Shanghai Institutes for Biological Sciences, SkolTech)
    Video | Slides

    The concentrations of metabolites and lipids can be used to assess the physiological state of tissues. The lecture presents several comprehensive studies of metabolite and lipid concentrations in human and animal tissues, which provide new knowledge about the molecular mechanisms underlying the physiological features unique to humans.

    Afterword


    In 2016, the summer school on bioinformatics was supported by JetBrains, RVC, BIOCAD, EPAM Systems and Parseq Lab, and we thank them very much.

    In 2017, the summer school on bioinformatics will be held from July 31 to August 5 in Dolgoprudny at MIPT. The focus of the school this year is data mining in bioinformatics. The deadline for applications is June 10th. Hurry up and apply.
