Joe Bielawski
Institutional home pages
Arrival and Departure
-
Arrival: 25 May 2024 (Sat)
-
Depart: 04 June 2024 (Tue)
1. Content for workshop
-
1.1 Quick start: Models & methods
-
1.2 Quick start: Codon models
-
1.3 Lecture slides
-
1.4 PAML lab: English
-
1.5 PAML lab: Translated
-
1.6 Scientific ethics
2. Content for additional activities
-
2.1 Additional reading on codon models
-
2.2 Advanced PAML lab
-
2.3 Best practices in genome scans
-
2.4 Alternative software for codon models in the ML framework
1. Content for workshop
1.1 Quick start for models and methods
Are you completely new to models of sequence evolution? Start with this…
- Bielawski, J. P. (2016). Models of Molecular Evolution. In Encyclopedia of Evolutionary Biology. Vol 1, pp. 61-70. Oxford: Academic Press. (Section 2 can be skipped without any loss of continuity.)
Do you have some background in phylogenetics and DNA sequence models, but you are new to the area of detecting adaptive sequence evolution? Start with this…
- Bielawski & Jones (2016). Adaptive Molecular Evolution: Detection Methods. In Encyclopedia of Evolutionary Biology. Vol 1, pp. 16-25. Oxford: Academic Press.
1.2 Quick start for codon models
Are you completely new to codon models? Start with this…
- Delport, Scheffler, & Seoighe (2009). Models of coding sequence evolution. Briefings in bioinformatics, 10(1), 97-109.
Do you want to take a deep dive into the theory and legitimate (and illegitimate) interpretation of codon models? The review below combines evolutionary theory and statistical theory to explain the major inference challenges under codon models.
- Jones C.T., Susko E., Bielawski J.P., 2019. Looking for Darwin in genomic sequences: validity and success depends on the relationship between model and data. In Evolutionary Genomics: Statistical and Computational Methods. Maria Anisimova (ed.) 2nd edition, Humana Press.
1.3 Lecture slides
-
2024 Lecture slides (Part 1), Intro to the Neutral & Nearly Neutral Theories of Molecular Evolution: slide set 1
-
2024 Lecture slides (Part 2), Intro to Codon Models: slide set 2
1.4 PAML lab in English
-
PAML lab: PAML Lab website (In 2024, we will do exercises 1, 3, and 4 only.)
-
PAML lab resources: webpage
-
PAML lab slides: slides (PDF)
If you want to do the lab independently (at home, on your own computer), then download all the files from an archive here, or individually here.
NOTE: If you are doing the PAML Lab at the workshop, then use the VM and the symlink in your home directory named “moledata” to obtain the course data files!!!
1.5 Translated PAML Tutorials
You can do the tutorial in Portuguese via this link. Many thanks to Letícia Magpali for the translation!
Letícia Magpali and Esteban Salazar are working on a Spanish-language translation! Hopefully we can provide this soon.
If you have any suggestions or comments on the Portuguese translation, please send them to Letícia Magpali (leticiamagpali@dal.ca). Feel free to communicate directly with her in Portuguese.
If you would like to assist with a translation, please contact me or Letícia Magpali (leticiamagpali@dal.ca).
1.6 Scientific ethics
In 2022 we added a session dedicated to scientific ethics. If you want to get a copy of the “Applied Ethics Primer” you can get it here for free: https://caul-cbua.pressbooks.pub/aep/
Our discussions touch on the “unwritten rules” of science, and how such rules can privilege members of some groups and serve as a barrier to others.
If you just want to know more, here are a few resources.
-
We recently wrote a paper called “Ten simple rules for succeeding as an underrepresented STEM undergraduate” to help make explicit the “hidden curriculum” of science.
-
Many of the unwritten rules of science are the same as those that operate in society at large, including racism operating within people who adhere to egalitarian attitudes. In the essay “Science in the Belly of the Beast: my Career in the Academy” Joseph L. Graves, Jr. describes his personal experience with the “unwritten rules” of the academy, including the “one and one-quarter rule”.
-
Here is a copy of “Snow Brown and the Seven Detergents: A Metanarrative on Science and the Scientific Method” by Banu Subramaniam (Women’s Studies Quarterly, Vol. 28, No. 1/2, (2000), pp. 296-304).
2. Content for additional activities
2.1 Additional readings and advanced topics
-
A novel phenotype+genotype codon model (PG-BSM), formulated to test for and identify sites within a gene involved in phenotypic adaptation. This method does NOT require dN/dS > 1 to infer adaptive molecular evolution:
(Jones, C. T., Youssef, N., Susko, E., & Bielawski, J. P. (2020). A Phenotype–Genotype Codon Model for Detecting Adaptive Evolution. Systematic biology, 69(4), 722-738.)
-
Phenomenological load (PL) and biological conclusions under codon models:
(Jones C.T., Youssef N., Susko E., Bielawski J.P., 2018. Phenomenological Load on Model Parameters Can Lead to False Biological Conclusions. Mol Biol Evol. 35(6):1473-1488.)
-
Review of major inference challenges under codon models:
(Jones C.T., Susko E., Bielawski J.P., 2019. Looking for Darwin in genomic sequences: validity and success depends on the relationship between model and data. In Evolutionary Genomics: Statistical and Computational Methods. Maria Anisimova (ed.) 2nd edition, Humana Press.)
-
Positive selection, purifying selection, shifting balance & fitness landscapes:
(Jones, C., Youssef, N., Susko, E. and Bielawski, J., 2017. Shifting balance on a static mutation-selection landscape: a novel scenario of positive selection. Molecular Biology and Evolution, 34(2):391-407.)
-
Improved inference of site-specific positive selection under a generalized parametric codon model when there are multi-nucleotide mutations and multiple nonsynonymous rates:
(Dunn KA, Kenney T, Gu H, Bielawski JP. Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates. BMC Evol Biol. 2019 Jan 14;19(1):22.)
-
ModL: restoring regularity when testing for positive selection:
(Mingrone J, Susko E, Bielawski JP. ModL: exploring and restoring regularity when testing for positive selection. Bioinformatics. 2019 Aug 1;35(15):2545-2554.)
-
Smoothed Bootstrap Aggregation (SBA) for assessing and correcting parameter estimate uncertainty in codon models:
(Mingrone, J., Susko, E. and Bielawski, J., 2016. Smoothed bootstrap aggregation for assessing selection pressure at amino acid sites. Molecular Biology and Evolution, 33(11):2976-2989.)
-
Protocols, experimental design, and best practices for inference under complex codon models:
(Bielawski, J.P., Baker, J.L. and Mingrone, J., 2016. Inference of episodic changes in natural selection acting on protein coding sequences via CODEML. Current Protocols in Bioinformatics, pp.6-15.)
2.2 Alternative lab and advanced inferences
If you have some experience with codon models and want to try a tutorial on more advanced material, use the link below to download an archive containing a completely different set of PAML activities. This tutorial focuses on detecting episodic protein evolution via Branch-Site Model A. The tutorial also includes activities on (i) detecting MLE instabilities, (ii) carrying out robustness analyses, and (iii) using smoothed bootstrap aggregation (SBA). The protocols for each activity are presented in Current Protocols in Bioinformatics UNIT 6.15. The included PDF file for UNIT 6.15 also presents recommendations for “best practices” when carrying out a large-scale evolutionary survey for episodic adaptive evolution using PAML. The files required for this “alternative lab” are available via a Bitbucket repository; the link is given below.
Advanced PAML demo: Bitbucket repository
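As a rough illustration of the test at the heart of a Branch-Site Model A analysis, the sketch below computes a likelihood ratio test (LRT) from two log-likelihoods: one from the null fit (foreground ω fixed at 1) and one from the alternative fit (foreground ω estimated). It assumes a chi-square reference distribution with 1 degree of freedom, which is a commonly used conservative choice. The function name and the log-likelihood values are hypothetical and are not taken from the tutorial files.

```python
import math
from typing import Tuple


def lrt_one_df(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Return (2*delta_lnL, p-value) for an LRT with 1 degree of freedom.

    lnl_null: maximised log-likelihood under the null model
    lnl_alt:  maximised log-likelihood under the alternative model
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # Nested models should give stat >= 0. A negative value usually means
    # the alternative fit did not reach its maximum, so it is reported as-is
    # and treated as zero evidence when computing the p-value.
    clipped = max(stat, 0.0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(clipped / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only
stat, p = lrt_one_df(lnl_null=-2500.0, lnl_alt=-2497.0)
print(f"2*delta_lnL = {stat:.2f}, p = {p:.4f}")
```

A negative test statistic is worth treating as a warning sign rather than a result: rerunning the alternative model from different starting values is one simple check for the MLE instabilities that the tutorial covers.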
codeml_SBA: a program that implements Smoothed Bootstrap Aggregation (SBA) for assessing selection pressure at amino acid sites. https://github.com/Jehops/codeml_sba
DendroCypher: a program to assist labelling the branches of a Newick-formatted tree-file for use with a “branch model” or a “branch-site codon model”: Bitbucket repository
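To show what branch labelling looks like in practice, here is a minimal sketch that appends a PAML-style foreground label (`#1`) to selected tip names in a Newick string. It is not how DendroCypher works internally; the function name is hypothetical, and the sketch assumes an unlabelled tree without branch lengths and handles tip branches only (labelling an internal branch means placing the mark after the closing parenthesis of the clade).

```python
import re
from typing import Set


def label_tips(newick: str, foreground: Set[str], mark: str = "#1") -> str:
    """Append `mark` to every tip whose name is in `foreground`.

    Assumes an unlabelled Newick string without branch lengths.
    """
    def add_mark(m):
        prefix, name = m.group(1), m.group(2)
        if name in foreground:
            return f"{prefix}{name} {mark}"
        return m.group(0)  # leave other tips unchanged

    # A tip name directly follows "(" or "," (plus optional whitespace)
    # and contains no Newick punctuation.
    return re.sub(r"([(,]\s*)([^\s(),:;#]+)", add_mark, newick)


# Hypothetical four-taxon tree, for illustration only
tree = "((human,chimp),(mouse,rat));"
print(label_tips(tree, {"human"}))
```

For anything beyond a handful of tips, or for labelling internal branches and whole clades, a dedicated tool or a tree library is a safer choice than string substitution.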
2.3 Best practices in large-scale evolutionary surveys
Large-scale evolutionary surveys are now commonplace. But with the use of progressively more complex codon models, these surveys are fraught with perils. Complex models are more prone to statistical problems such as MLE irregularities, and some can be quite sensitive to model misspecification. UNIT 6.15 (see above) provides some recommended “best practices” for a 2-phase approach to quality control and robustness in evolutionary surveys. We have applied these to a large-scale survey for functional divergence in nuclear receptors during hominid evolution, and we used experimental approaches to investigate hypotheses about the role of a particular nuclear receptor (NR2C1) as a key modulator of developmental pluripotentiality during hominid evolution. The paper that illustrates the power of such an evolutionary survey, and the importance of an experimental design having explicit protocols for “best practices”, is given below.
Example large-scale survey: PDF
2.4 Alternative software for codon models in the ML framework
HyPhy: comparative sequence analysis using stochastic evolutionary models; http://www.hyphy.org/
DataMonkey: a server that supports a variety of HyPhy tools at no cost; http://www.datamonkey.org/
COLD: a program that implements a general-purpose parametric (GPP) codon model. Most codon models are special cases of the GPP codon model. https://github.com/tjk23/COLD
codeml_SBA: a program that implements Smoothed Bootstrap Aggregation (SBA) for assessing selection pressure at amino acid sites. https://github.com/Jehops/codeml_sba
ModL: a program for restoring regularity when testing for positive selection using codon models. https://github.com/jehops/codeml_modl