Computer and information sciences · English · Hugo

Martin Modrák

Recent content on Martin Modrák

I am not a staunch advocate of Bayesian methods — I can totally see how for some questions a frequentist approach may provide more satisfactory answers. In this post, we’ll explore how for a simple scenario (negative binomial regression with small sample size), standard frequentist methods fail at being frequentist while standard Bayesian methods provide good frequentist guarantees.

In this post we’ll explore a particular link between Bayes factors and cross-validation that I was introduced to via Fong & Holmes 2020. I’ll then argue why this is a reason not to trust Bayes factors too much. This is a follow-up to Three ways to compute a Bayes factor, though I will repeat all the important bits here.

To celebrate a new paper out in Bayesian Analysis, let’s talk simulation-based calibration checking (SBC). SBC is a method where you use simulated datasets to verify that you implemented your model correctly and/or that your sampling algorithm works. It was introduced by Talts et al. and has been known and used for a while, but was considered to have a few shortcomings, which we try to address.
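
To make the idea concrete, here is a minimal Python sketch of the basic rank-based check, not the method from the paper. It assumes a toy conjugate model (prior `mu ~ Normal(0, 1)`, observations `y ~ Normal(mu, 1)`), chosen so the exact posterior can be sampled directly instead of running MCMC. The function name `sbc_ranks` and all parameter values are illustrative.

```python
import random


def sbc_ranks(n_sims=1000, n_obs=5, n_draws=99, seed=1):
    """Return the rank of the true parameter among posterior draws, per simulation.

    Assumed toy model: mu ~ Normal(0, 1), y_i ~ Normal(mu, 1).
    Its exact posterior is Normal(sum(y) / (n_obs + 1), 1 / (n_obs + 1)).
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        # 1. Draw a "true" parameter from the prior.
        mu = rng.gauss(0.0, 1.0)
        # 2. Simulate a dataset given that parameter.
        y = [rng.gauss(mu, 1.0) for _ in range(n_obs)]
        # 3. Draw from the posterior (exact here; normally the sampler under test).
        post_mean = sum(y) / (n_obs + 1)
        post_sd = (1.0 / (n_obs + 1)) ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        # 4. Record how many posterior draws fall below the true value.
        ranks.append(sum(d < mu for d in draws))
    return ranks


ranks = sbc_ranks()
# If model and sampler are correct, ranks are uniform on 0..n_draws.
print(min(ranks), max(ranks), sum(ranks) / len(ranks))
```

If the posterior draws came from a buggy model or a failing sampler, the ranks would typically pile up at the edges or lean to one side instead of looking uniform.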

Nathaniel Haines made a neat tweet showing off his model of reaction times that handles possible contamination with both implausibly short reaction times (e.g., if people make an anticipatory response that is not actually based on processing the stimulus of interest) and implausibly long reaction times (e.g., if their attention drifts away from the task, but they snap back to it after having “zoned out” for a few seconds). Response times that

This post was inspired by a very interesting paper on Bayes factors: Workflow Techniques for the Robust Use of Bayes Factors by Schad, Nicenboim, Bürkner, Betancourt and Vasishth. I would specifically recommend it for its introduction to what a hypothesis actually is in the Bayesian context and for its insights into what Bayes factors are.

Generating documents via RMarkdown is fun! I recently used RMarkdown to generate reports written in Czech. Interestingly, Czech has rules about words that are not allowed to be the last on a line of text: almost all single-letter words and a few abbreviations. MS Word is smart enough to enforce this rule, but the HTML and PDF outputs from RMarkdown do not.
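
As an illustration of the rule only (this is a Python sketch, not the RMarkdown solution from the post), one common approach is to replace the space after a single-letter word with a non-breaking space. The word list below (`k`, `s`, `v`, `z`, `o`, `u`, `a`, `i`) is an assumption covering the usual single-letter prepositions and conjunctions; abbreviations are not handled, and the function name is made up for this example.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Assumed list of Czech single-letter words; matched only as standalone words,
# i.e. not preceded by another word character.
_SINGLE_LETTER = re.compile(r"(?<!\w)([ksvzouaiKSVZOUAI]) ")


def bind_single_letter_words(text):
    """Glue each single-letter word to the following word with a non-breaking space."""
    return _SINGLE_LETTER.sub(r"\1" + NBSP, text)


print(repr(bind_single_letter_words("Jdu k lesu a v lese")))
```

In the example sentence, the spaces after `k`, `a` and `v` become non-breaking, while the final `u` of `lesu` is left alone because it is part of a longer word.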

Contents: The Approximation - Big Picture; Saddlepoint for Sum of NBs; Implementing the Approximation in Stan; A Simple Baseline; Eyeballing Masses; Evaluating Performance; Summing up; Saddlepoint Approximations for Other Families.

I recently needed to find the distribution of a sum of non-identical but independent negative binomial (NB) random variables. Although for some special cases the sum is itself NB, an analytical solution is not feasible in the general case.
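
For readers who want a feel for the technique, the following is a minimal Python sketch of a first-order (unnormalized) saddlepoint approximation to the probability mass of a sum of independent NBs. It is not the Stan implementation from the post. It assumes the mean/size parametrization (success probability `p = size / (size + mu)`), works only for `x >= 1`, and solves the saddlepoint equation by plain bisection; all function names are illustrative. An exact convolution is included for comparison.

```python
import math


def nb_pmf(x, mu, size):
    """Exact NB probability mass, parametrized by mean `mu` and `size`."""
    p = size / (size + mu)
    return math.exp(math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
                    + size * math.log(p) + x * math.log1p(-p))


def exact_sum_pmf(x, mus, sizes):
    """P(sum = x) by direct convolution; feasible only for small x."""
    dist = [1.0]  # point mass at 0
    for mu, size in zip(mus, sizes):
        comp = [nb_pmf(k, mu, size) for k in range(x + 1)]
        dist = [sum(dist[j] * comp[k - j] for j in range(min(k, len(dist) - 1) + 1))
                for k in range(x + 1)]
    return dist[x]


def saddlepoint_sum_pmf(x, mus, sizes):
    """First-order saddlepoint approximation to P(sum = x), for x >= 1."""
    qs = [mu / (size + mu) for mu, size in zip(mus, sizes)]  # q = 1 - p
    t_max = min(-math.log(q) for q in qs)  # the CGF is finite only for t < t_max

    def k1(t):  # first derivative of the summed CGF
        return sum(s * q * math.exp(t) / (1 - q * math.exp(t)) for s, q in zip(sizes, qs))

    # Solve K'(t) = x by bisection; K' increases from 0 to infinity on (-inf, t_max).
    lo, hi = -50.0, t_max
    for _ in range(100):
        mid = (lo + hi) / 2
        if k1(mid) < x:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    e = math.exp(t)
    k0 = sum(s * (math.log1p(-q) - math.log(1 - q * e)) for s, q in zip(sizes, qs))
    k2 = sum(s * q * e / (1 - q * e) ** 2 for s, q in zip(sizes, qs))
    return math.exp(k0 - t * x) / math.sqrt(2 * math.pi * k2)


mus, sizes = [5.0, 10.0], [10.0, 10.0]
print(exact_sum_pmf(15, mus, sizes), saddlepoint_sum_pmf(15, mus, sizes))
```

The approximation is not normalized, so the masses do not sum exactly to one, and the relative error tends to be largest for small `x` and small `size`.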

Recently another high-profile piece on abandoning statistical significance by Amrhein, Greenland & McShane was published. I have mixed feelings about this; my Twitter bubble and I are mostly like “Another one of those?!”… But how did I get from doing almost no statistics five years ago to considering myself a cool insider who can look down on a prominent piece by a group of lifelong experts?

I’ve read Yihui Xie’s thoughtful response to the I don’t like notebooks talk from JupyterCon 2018. I agree with basically everything Yihui said; only one point felt like it could give a wrong impression. It states: This reads as if there is no room for automated tests in markdown/notebooks.

For the Czech national bioinformatics conference (ENBIK) I prepared a short presentation on Type S and Type M errors and how to use simulations to understand what your method might do before conducting an experiment. I show how a t-test can fail, inspired by Andrew Gelman’s take on power = .06, and how DESeq2 (used to determine differentially expressed genes) does a good job at mitigating false positives at the cost of increased false negatives.
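
A minimal Python sketch of this kind of simulation follows. It is not the code from the presentation: to stay within the standard library it swaps the t-test for a simpler z-test with a known standard error, and the true effect (2) and standard error (8.1) are illustrative assumptions chosen so that power comes out at roughly .06. The function name `type_s_m` is made up for this example.

```python
import random
from statistics import NormalDist


def type_s_m(true_effect, se, alpha=0.05, n_sims=100_000, seed=1):
    """Simulate estimates ~ Normal(true_effect, se) and summarize the significant ones.

    Returns (power, Type S rate, exaggeration ratio). Requires true_effect != 0.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = []
    for _ in range(n_sims):
        est = rng.gauss(true_effect, se)
        if abs(est) > z_crit * se:  # two-sided z-test against zero
            significant.append(est)
    power = len(significant) / n_sims
    # Type S: share of significant estimates with the wrong sign.
    type_s = sum((e > 0) != (true_effect > 0) for e in significant) / len(significant)
    # Type M: how much significant estimates overstate the true effect on average.
    exaggeration = sum(abs(e) for e in significant) / len(significant) / abs(true_effect)
    return power, type_s, exaggeration


power, type_s, exaggeration = type_s_m(true_effect=2.0, se=8.1)
print(f"power={power:.3f}  type S={type_s:.3f}  exaggeration={exaggeration:.1f}")
```

With such low power, a sizeable share of the significant results have the wrong sign, and every significant estimate overstates the true effect several times over.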