Joe Bielawski

Institutional home pages


Arrival and Departure

Arrival: 24 May 2024 (Fri)

Depart: 04 June 2024 (Tue)


Scientific ethics

In the 2022 workshop, our discussion of scientific ethics touched on the many ways that science has “unwritten rules”. Here I provide a few supplementary resources.

One way that unwritten rules impact who succeeds in science is sometimes referred to as the “hidden curriculum”. These are best practices for success in STEM that are not taught in classrooms, yet first-generation and underrepresented minority students must still navigate them. With this in mind, we recently wrote a paper called “Ten simple rules for succeeding as an underrepresented STEM undergraduate” to help make explicit the hidden curriculum of science.

Many of the unwritten rules of science are the same as those that operate in society at large, including racism operating within people who adhere to egalitarian attitudes. In the essay “Science in the Belly of the Beast: my Career in the Academy” Joseph L. Graves, Jr. describes his personal experience with the unwritten rules of the academy, including the “one and one-quarter rule”.


Getting started

Are you completely new to models of sequence evolution? Then this review might be good for you. Section 2 provides a great introduction to Markov models, but it can be skipped without any loss of continuity by readers in a hurry or who do not need to know the nitty-gritty details of the models.

  • Citation: Bielawski, J. P. (2016). Molecular Evolution, Models of. In Kliman, R.M. (ed.) Encyclopedia of Evolutionary Biology. Vol 1, pp. 61-70. Oxford: Academic Press.

Do you have some background in phylogenetics and DNA sequence models, but you are new to the area of detecting adaptive sequence evolution? Then this review will give you a very broad starting point (population genetics, non-coding, codon and amino-acid-based methods are all covered). After reading this, your next step would be to get started on the papers presented below.

  • Citation: Bielawski, Joseph & Jones, Chris. (2016). Adaptive Molecular Evolution: Detection Methods. In Kliman, R.M. (ed.) Encyclopedia of Evolutionary Biology. Vol 1, pp. 16-25. Oxford: Academic Press.


PAML lab Materials

The lab exercises (PAML demo) are available via a small website (link below). The site contains some additional resources that are worth a look when you have time. Please note that the slides may change a little prior to the lab. I will post modified PDFs as required.

PAML demo: PAML Lab website

PAML demo resources: webpage

PAML demo slides: slides (PDF) (updated for 2023)

If you want to do the lab independently of the workshop (at home, on your own time and on your own computer), then you can download all the necessary files from an archive here, or you can download the files individually for each exercise as you need them here.

NOTE: If you are doing the PAML Lab at the workshop, then use the VM and the symlink in your home directory named “moledata” to obtain the course data files!!!


Translated PAML Tutorials

[Will add Portuguese and possibly Spanish language translations in Summer of 2024.]


Lecture Materials

Current slides

I am changing the lecture content for 2022 and beyond. This lecture will provide a more general background on evolutionary forces, and the Neutral and Nearly-Neutral theories of molecular evolution. Some details about fitting codon models to real data have been moved to the “PAML Lab” lecture.

  • 2023 Lecture slides (Part 1), Intro to Neutral & Nearly Neutral Theories of Molecular Evolution: slide set 1 (updated)

  • 2023 Lecture slides (Part 2), Intro to Codon Models: slide set 2 (updated)


Some material from previous workshops

I have included links to the 2019 slides below. This update includes more information on mechanistic processes of codon evolution (via the MutSel framework). Also, some might be interested in parts 3 and 4, which cover more advanced statistical topics, such as the requirements for likelihood inference and “phenomenological load” on parameter estimates. These topics will not be covered in 2022.

I updated the lecture slides on codon models for 2017. Because the older slides tend to have more details about fitting codon models to real data, I have included links to the 2015 and 2016 slides below; these slides provide more information about the powers and pitfalls of inference under codon models.


Key papers related to the lecture material:


Alternative Lab (advanced topics)

If you have some experience with codon models and want to try a tutorial on more advanced material, then use the link below to download an archive containing a completely different set of PAML activities. This tutorial focuses on detecting episodic protein evolution via Branch-Site Model A. The tutorial also includes activities about (i) detecting MLE instabilities, (ii) carrying out robustness analyses, and (iii) use of smoothed bootstrap aggregation (SBA). The protocols for each activity are presented in Protocols in Bioinformatics UNIT 6.15. The included PDF file for UNIT 6.16 also presents recommendations for “best practices” when carrying out a large-scale evolutionary survey for episodic adaptive evolution by using PAML. The files required for this “alternative lab” are available via a Bitbucket repository. The repository link is given below.

Advanced PAML demo: Bitbucket repository
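
For readers new to the branch-site test, the arithmetic behind comparing Branch-Site Model A with its null (ω2 fixed at 1) can be sketched as a likelihood ratio test. This is a minimal illustration and is not part of the tutorial archive: the function name and the log-likelihood values are hypothetical, and real values would come from your own codeml output. It reports p-values under a χ² distribution with 1 degree of freedom and under a 50:50 mixture of a point mass at zero and that χ² distribution.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt):
    """Likelihood ratio test of Model A (alt) against its null (omega2 = 1).

    Returns the test statistic, the p-value under chi-square with 1 df,
    and the p-value under a 50:50 mixture of a point mass at 0 and chi-square(1).
    """
    # The alternative can never fit worse than the null in theory; a negative
    # difference signals an optimisation problem, so the statistic is floored at 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_chi2 = math.erfc(math.sqrt(stat / 2.0))
    # Under the mixture, half of the null mass sits at exactly 0.
    p_mixture = 0.5 * p_chi2 if stat > 0 else 1.0
    return stat, p_chi2, p_mixture


# Hypothetical log-likelihoods, for illustration only.
stat, p_chi2, p_mixture = branch_site_lrt(lnl_null=-1000.0, lnl_alt=-997.0)
print(f"2*delta_lnL = {stat:.2f}, p(chi2_1) = {p_chi2:.4f}, p(mixture) = {p_mixture:.4f}")
```

If the alternative model returns a lower log-likelihood than the null, treat that as a sign of a possible convergence problem and rerun with different starting values before interpreting the test.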

codeml_SBA: a program that implements Smoothed Bootstrap Aggregation (SBA) for assessing selection pressure at amino acid sites. https://github.com/Jehops/codeml_sba

DendroCypher: a program to assist labelling the branches of a Newick-formatted tree-file for use with a “branch model” or a “branch-site codon model”: Bitbucket repository
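
To show what branch labelling means in practice, here is a minimal sketch that marks one tip of a Newick tree as a foreground branch using the PAML "#1" label. It is not DendroCypher's implementation: the function name, the toy tree, and the taxon names are hypothetical, and the sketch assumes a topology-only tree (no branch lengths) with unique tip names.

```python
import re


def label_foreground(newick, taxon, label="#1"):
    """Append a PAML branch label to a single tip name in a Newick string."""
    # Match the taxon only as a whole name, so that "rat" does not
    # match inside "muskrat" or "rat_2".
    pattern = re.compile(r"(?<![\w.-])" + re.escape(taxon) + r"(?![\w.-])")
    labelled, count = pattern.subn(lambda m: m.group(0) + " " + label, newick)
    if count != 1:
        raise ValueError(f"expected exactly one tip named {taxon!r}, found {count}")
    return labelled


# Toy tree for illustration only.
tree = "((human,chimp),gorilla);"
print(label_foreground(tree, "chimp"))
```

Labelling an internal branch works the same way in the tree file, with the label placed after the closing parenthesis of the clade; doing that by hand on large trees is error-prone, which is the kind of task a dedicated tool is meant to help with.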


“Best practices” in large-scale evolutionary surveys

Large-scale evolutionary surveys are now commonplace. But with the use of progressively more complex codon models, these surveys are fraught with perils. Complex models are more prone to statistical problems such as MLE irregularities, and some can be quite sensitive to model misspecification. UNIT 6.16 (see above) provides some recommended “best practices” for a 2-phase approach to quality control and robustness in evolutionary surveys. We have applied these to a large-scale survey for functional divergence in nuclear receptors during hominid evolution, and we used experimental approaches to investigate hypotheses about the role of a particular nuclear receptor (NR2C1) as a key modulator of developmental pluripotentiality during hominid evolution. The paper that illustrates the power of such an evolutionary survey, and the importance of an experimental design having explicit protocols for “best practices”, is given below.

Example large-scale survey: PDF


Alternative software for codon models in the ML framework

HyPhy: comparative sequence analysis using stochastic evolutionary models; http://www.hyphy.org/

DataMonkey: a server that supports a variety of HyPhy tools at no cost; http://www.datamonkey.org/

COLD: a program that implements a general-purpose parametric (GPP) codon model. Most codon models are special cases of the GPP codon model. https://github.com/tjk23/COLD

codeml_SBA: a program that implements Smoothed Bootstrap Aggregation (SBA) for assessing selection pressure at amino acid sites. https://github.com/Jehops/codeml_sba

ModL: a program for restoring regularity when testing for positive selection using codon models. https://github.com/jehops/codeml_modl


At the Captain Kidd
Workshop concept map