Rogue Scholar

Computer and Information Sciences · English

Using Bayesian tools to be a better frequentist

Published 9 July 2025

I am not a staunch advocate of Bayesian methods — I can totally see how for some questions a frequentist approach may provide more satisfactory answers. In this post, we’ll explore how for a simple scenario (negative binomial regression with small sample size), standard frequentist methods fail at being frequentist while standard Bayesian methods provide good frequentist guarantees.

Computer and Information Sciences · English

Cross-validation — a fourth way to compute a Bayes factor

https://doi.org/10.59350/1x5nq-4y737

Published 23 March 2024

Author: Martin Modrák

In this post we’ll explore a particular link between Bayes factors and cross-validation I was introduced to via Fong & Holmes 2020. I’ll then argue why this is a reason not to trust Bayes factors too much. This is a follow-up to Three ways to compute a Bayes factor, though I will repeat all the important bits here.

Computer and Information Sciences · English

Brms hacking: linear predictors for random effect standard deviations

https://doi.org/10.59350/rd59h-hca42

Published 17 February 2024

Author: Martin Modrák

brms is a great package. It allows you to put predictors on a lot of things. Its power is however not absolute — one thing it doesn’t let you directly do is use data to predict variances of random/varying effects.

Computer and Information Sciences · English

The SBC package - check your models before you wreck yourself

https://doi.org/10.59350/rwg8s-q3838

Published 1 November 2023

Author: Martin Modrák

To celebrate a new paper out in Bayesian Analysis, let’s talk simulation-based calibration checking (SBC). SBC is a method where you use simulated datasets to verify that you implemented your model correctly and/or that your sampling algorithm works. It was introduced by Talts et al. and has been known and used for a while, but was considered to have a few shortcomings, which we try to address.
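To make the idea concrete, here is a minimal sketch of the basic SBC loop in plain Python. It does not use the SBC package from the post: the conjugate normal model, the function name `sbc_ranks`, and all parameter values are assumptions chosen so that the exact posterior is known and no sampler is needed. If the "posterior draws" are correct, the rank of the simulated parameter among those draws should be roughly uniform.

```python
import random

def sbc_ranks(n_sims=1000, n_obs=5, n_draws=19, seed=1):
    """Return one SBC rank per simulated dataset (hypothetical toy model)."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        # 1. Draw the "true" parameter from the prior, theta ~ N(0, 1).
        theta = rng.gauss(0.0, 1.0)
        # 2. Simulate a dataset given that parameter, y_i ~ N(theta, 1).
        ys = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        # 3. Draw from the posterior. Here it is known exactly:
        #    N(sum(y) / (n + 1), 1 / (n + 1)). In practice this step
        #    would be the sampler being checked.
        post_var = 1.0 / (n_obs + 1)
        post_mean = sum(ys) * post_var
        draws = [rng.gauss(post_mean, post_var ** 0.5) for _ in range(n_draws)]
        # 4. Record the rank of theta among the posterior draws (0..n_draws).
        ranks.append(sum(d < theta for d in draws))
    return ranks

ranks = sbc_ranks()
# Counts per rank value; these should look roughly flat for a correct posterior.
counts = [ranks.count(r) for r in range(20)]
print(counts)
```

Replacing step 3 with a deliberately wrong posterior (for example, a standard deviation that is too small) should visibly distort the rank counts, which is the kind of problem SBC is meant to reveal.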

Computer and Information Sciences · English

Using brms to model reaction times contaminated with errors

https://doi.org/10.59350/nda32-61r30

Published 1 April 2021

Author: Martin Modrák

Nathaniel Haines made a neat tweet showing off his model of reaction times that handles possible contamination with both implausibly short reaction times (e.g., if people make an anticipatory response that is not actually based on processing the stimulus of interest) and implausibly long reaction times (e.g., if their attention drifts away from the task, but they snap back to it after having “zoned out” for a few seconds). Response times that

Computer and Information Sciences · English

Three ways to compute a Bayes factor

https://doi.org/10.59350/80dxt-sxd75

Published 28 March 2021

Author: Martin Modrák

This post was inspired by a very interesting paper on Bayes factors: Workflow Techniques for the Robust Use of Bayes Factors by Schad, Nicenboim, Bürkner, Betancourt and Vasishth. I would specifically recommend it for its introduction to what a hypothesis actually is in the Bayesian context and for its insights into what Bayes factors are.

Computer and Information Sciences · English

Enforcing line-break rules in RMarkdown via Pandoc

https://doi.org/10.59350/21xxj-t5x81

Published 16 December 2020

Author: Martin Modrák

Generating documents via RMarkdown is fun! I recently used RMarkdown to generate reports written in Czech. Interestingly, Czech has rules about words that are not allowed to be the last on a line of text: these are almost all single-letter words and a few abbreviations. MS Word is actually smart enough to enforce this policy, but this does not happen for the HTML and PDF outputs from RMarkdown.

Computer and Information Sciences · English

Approximate Densities for Sums of Variables: Negative Binomials and Saddlepoint

https://doi.org/10.59350/dwf8j-hqb80

Published 20 June 2019

Author: Martin Modrák

I recently needed to find the distribution of a sum of non-identical but independent negative binomial (NB) random variables. Although for some special cases the sum is itself NB, an analytical solution is not feasible in the general case.
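One well-known special case is when all the NB variables share the same success probability: the sum is then NB with the size parameters added. The sketch below checks this numerically by convolving two probability mass functions. The parametrization (number of failures before the r-th success, integer r) and the helper names `nb_pmf` and `convolve_pmf` are assumptions made for this illustration and are not taken from the post, which works in Stan.

```python
from math import comb

def nb_pmf(k, r, p):
    """P(X = k): k failures before the r-th success, success probability p."""
    return comb(k + r - 1, k) * p**r * (1 - p)**k

def convolve_pmf(pmf_a, pmf_b):
    """PMF of the sum of two independent count variables, on the shared support."""
    n = min(len(pmf_a), len(pmf_b))
    return [sum(pmf_a[j] * pmf_b[k - j] for j in range(k + 1)) for k in range(n)]

p = 0.3
ks = range(40)

# Sum of NB(2, p) and NB(3, p), computed by direct convolution.
pmf_sum = convolve_pmf([nb_pmf(k, 2, p) for k in ks],
                       [nb_pmf(k, 3, p) for k in ks])

# Closed form for the special case: NB(2 + 3, p).
pmf_direct = [nb_pmf(k, 5, p) for k in ks]

max_abs_diff = max(abs(a - b) for a, b in zip(pmf_sum, pmf_direct))
print(max_abs_diff)  # differs only by floating-point rounding
```

When the success probabilities differ between the summands, this shortcut no longer applies, which is the situation that calls for an approximation.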

Computer and Information Sciences · English

Thank you: Statistics as a Journey

https://doi.org/10.59350/3s60r-q9963

Published 24 March 2019

Author: Martin Modrák

Recently another high-profile piece on abandoning statistical significance by Amrhein, Greenland & McShane was published. I have mixed feelings about this: my Twitter bubble and I are mostly like “Another one of those?!”… But how did I get from doing almost no statistics five years ago to considering myself a cool insider who can look down on a prominent piece by a group of lifelong experts?

Computer and Information Sciences · English

A Plea for Tests in R Markdown

https://doi.org/10.59350/yzane-9xe90

Published 17 September 2018

Author: Martin Modrák

I’ve read Yihui Xie’s thoughtful response to the I don’t like notebooks talk from JupyterCon 2018. And I agree with basically everything Yihui said, only one point felt like it could give a wrong impression. It states: This reads as if there is no room for automated tests in markdown/notebooks.

Computer and Information Sciences · English

Kangaroo and DESeq2 (ENBIK 2018)

https://doi.org/10.59350/s90tr-b0q76

Published 11 June 2018

Author: Martin Modrák

For the Czech national bioinformatics conference (ENBIK) I prepared a short presentation on Type S and Type M errors and on how to use simulations to understand what your method might do before conducting an experiment. I show how the t-test can fail, inspired by Andrew Gelman’s take on power = .06, and how DESeq2 (used to determine differentially expressed genes) does a good job of mitigating false positives at the cost of increased false negatives.