Rogue Scholar

Mathematics

The Inequality

Published November 23, 2015

Author Jeremy Kun

Math and computer science are full of inequalities, but there is one that shows up more often in my work than any other. Of course, I’m talking about $$\displaystyle 1+x \leq e^{x}$$ This is The Inequality. I’ve been told on many occasions that the entire field of machine learning reduces to The Inequality combined with the Chernoff bound (which is proved using The Inequality). Why does it show up so often in machine learning?
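As a quick sanity check on The Inequality, here is a minimal Python sketch that evaluates the gap $e^{x} - (1+x)$ at a few sample points. The function names `inequality_gap` and `survival_bound_holds` and the sample values are illustrative assumptions, not taken from the post. The second function shows one typical way the bound gets used: substituting $x = -p$ and raising both sides to the $n$-th power gives $(1-p)^n \leq e^{-pn}$ for $0 \leq p \leq 1$.

```python
import math


def inequality_gap(x: float) -> float:
    """Return e^x - (1 + x); The Inequality says this is never negative."""
    return math.exp(x) - (1.0 + x)


def survival_bound_holds(p: float, n: int) -> bool:
    """Check (1 - p)^n <= e^(-p*n), which follows from 1 - p <= e^(-p)."""
    return (1.0 - p) ** n <= math.exp(-p * n)


# Hypothetical sample points on both sides of zero.
samples = [-5.0, -1.0, -0.1, 0.0, 0.1, 1.0, 5.0]
gaps = {x: inequality_gap(x) for x in samples}

for x, gap in gaps.items():
    print(f"x = {x:5.1f}   e^x - (1 + x) = {gap:.6f}")
```

The gap is zero only at $x = 0$, where the line $1 + x$ is tangent to $e^{x}$, and it grows in both directions away from that point.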

Mathematics

A Quasipolynomial Time Algorithm for Graph Isomorphism: The Details

https://doi.org/10.59350/vre69-edk82

Published November 12, 2015

Author Jeremy Kun

Update 2017-01-09: Laci claims to have found a workaround to the previously posted error, and the claim is again quasipolynomial time! Updated arXiv paper to follow. Update 2017-01-04: Laci has posted an update on his paper. The short version is that one small step of his analysis was not quite correct, and the result is that his algorithm is sub-exponential, but not quasipolynomial time.

Mathematics

Serial Dictatorships and House Allocation

https://doi.org/10.59350/m254r-91257

Published October 26, 2015

Author Jeremy Kun

I was recently an invited speaker in a series of STEM talks at Moraine Valley Community College. My talk was called “What can algorithms tell us about life, love, and happiness?” and it’s on YouTube now so you can go watch it. The central theme of the talk was the lens of computation: algorithms and theoretical computer science can provide novel explanations for the natural phenomena we observe in the world.

Mathematics

One definition of algorithmic fairness: statistical parity

https://doi.org/10.59350/c1ea1-4gs47

Published October 19, 2015

Author Jeremy Kun

If you haven’t read the first post on fairness, I suggest you go back and read it because it motivates why we’re talking about fairness for algorithms in the first place. In this post I’ll describe one of the existing mathematical definitions of “fairness,” its origin, and discuss its strengths and shortcomings.

Mathematics

The Boosting Margin, or Why Boosting Doesn't Overfit

https://doi.org/10.59350/q83ve-8n204

Published September 21, 2015

Author Jeremy Kun

There’s a well-understood phenomenon in machine learning called overfitting. The idea is best shown by a graph: [Figure: overfitting] Let me explain. The vertical axis represents the error of a hypothesis. The horizontal axis represents the complexity of the hypothesis. The blue curve represents the error of a machine learning algorithm’s output on its training data, and the red curve represents the generalization of that hypothesis to the real world.

Mathematics

The Welch-Berlekamp Algorithm for Correcting Errors in Data

https://doi.org/10.59350/nbs03-gnh20

Published September 7, 2015

Author Jeremy Kun

In this post we’ll implement Reed-Solomon error-correcting codes and use them to play with codes. In our last post we defined Reed-Solomon codes rigorously, but in this post we’ll focus on intuition and code. As usual, the code and data used in this post are available on this blog’s GitHub page.

Mathematics

The Čech Complex and the Vietoris-Rips Complex

https://doi.org/10.59350/x6g7z-v8j53

Published August 6, 2015

Author Jeremy Kun

It’s about time we got back to computational topology. Previously in this series we endured a lightning tour of the fundamental group and homology, then we saw how to compute the homology of a simplicial complex using linear algebra. What we really want to do is talk about the inherent shape of data.

Mathematics

What does it mean for an algorithm to be fair?

https://doi.org/10.59350/hd748-25k50

Published July 13, 2015

Author Jeremy Kun

In 2014 the White House commissioned a 90-day study that culminated in a report (pdf) on the state of “big data” and related technologies. The authors give many recommendations, including this central warning. Warning: algorithms can facilitate illegal discrimination! Here’s a not-so-imaginary example of the problem. A bank wants people to take loans with high interest rates, and it also serves ads for these loans.

Mathematics

Methods of Proof — Diagonalization

https://doi.org/10.59350/6r5gy-d0n31

Published June 8, 2015

Author Jeremy Kun

A while back we featured a post about why learning mathematics can be hard for programmers, and I claimed a major issue was not understanding the basic methods of proof (the lingua franca between intuition and rigorous mathematics). I boiled these down to the “basic four”: direct implication, contrapositive, contradiction, and induction. But in mathematics there is an ever-growing supply of proof methods.

Mathematics

Weak Learning, Boosting, and the AdaBoost algorithm

https://doi.org/10.59350/c8nrk-jr359

Published May 18, 2015

Author Jeremy Kun

When addressing the question of what it means for an algorithm to learn, one can imagine many different models, and there are quite a few. This invariably raises the question of which models are “the same” and which are “different,” along with a precise description of how we’re comparing models.

Mathematics

The Many Faces of Set Cover

https://doi.org/10.59350/e539d-7gf86

Published May 4, 2015

Author Jeremy Kun

A while back Peter Norvig posted a wonderful pair of articles about regex golf. The idea behind regex golf is to come up with the shortest possible regular expression that matches one given list of strings, but not the other. [Comic: “Regex Golf,” by Randall Munroe.] In the first article, Norvig runs a basic algorithm to recreate and improve the results from the comic, and in the second he beefs it up with some improved search heuristics.
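To make the rules of regex golf concrete, here is a minimal Python sketch of a checker for a candidate solution. It is only an illustration of the problem statement, not Norvig's algorithm or the code from this post. The function name `solves_regex_golf`, the toy word lists, and the choice of `re.search` (a match anywhere in the string) are all assumptions made for the example.

```python
import re


def solves_regex_golf(pattern: str, winners: list[str], losers: list[str]) -> bool:
    """Return True if `pattern` matches every winner and no loser.

    Assumption: "matches" means re.search finds the pattern anywhere
    in the string, rather than requiring a full-string match.
    """
    regex = re.compile(pattern)
    matches_all_winners = all(regex.search(word) for word in winners)
    matches_no_losers = not any(regex.search(word) for word in losers)
    return matches_all_winners and matches_no_losers


# Hypothetical toy instance; real instances use much longer lists.
winners = ["foo", "food", "fool"]
losers = ["bar", "baz", "boo"]

candidates = ["f", "oo", "^f"]
valid = [p for p in candidates if solves_regex_golf(p, winners, losers)]
shortest = min(valid, key=len)
print(valid, shortest)
```

Verifying a candidate is the easy part. The hard part, and the connection to set cover in the title, is searching for the shortest pattern that passes this check.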

Math ∩ Programming
