Rogue Scholar

Mathematics · English

Big Dimensions, and What You Can Do About It

Published February 8, 2016

Author Jeremy Kun

Data is abundant, data is big, and big is a problem. Let me start with an example. Let’s say you have a list of movie titles and you want to learn their genre: romance, action, drama, etc. And maybe in this scenario IMDB doesn’t exist so you can’t scrape the answer. Well, the title alone is almost never enough information.

Mathematics · English

Concrete Examples of Quantum Gates

https://doi.org/10.59350/4qdcb-jj561

Published January 11, 2016

Author Jeremy Kun

So far in this series we’ve seen a lot of motivation and defined the basic ideas of what a quantum circuit is. But on rereading my posts, I think we would all benefit from some concreteness. “Local” operations: By now we’ve understood that quantum circuits consist of a sequence of gates $A_1, \dots, A_k$, where each $A_i$ is an 8-by-8 matrix that operates “locally” on some choice of three (or fewer) qubits.
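As a rough illustration of what "local" means here, the sketch below pads a one-qubit gate out to an 8-by-8 matrix on three qubits by taking Kronecker products with the identity. The choice of the Hadamard gate, the helper names `kron` and `apply`, and the convention that the first qubit is the most significant bit are all assumptions made for this example, not details taken from the post.

```python
import math


def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]          # a standard one-qubit gate
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]   # "do nothing" on one qubit

# Act with HADAMARD on the first qubit only; the other two are left alone.
# The result is an 8-by-8 matrix acting on the full three-qubit state.
gate = kron(kron(HADAMARD, IDENTITY), IDENTITY)

# The basis state |000> is the vector with a 1 in position 0.
state_000 = [1.0] + [0.0] * 7
result = apply(gate, state_000)

# Nonzero amplitudes sit at index 0 (|000>) and index 4 (|100>).
print([round(amplitude, 4) for amplitude in result])
```

The same padding idea extends to gates on two or three qubits: only the factor that is not the identity changes.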

Mathematics · English

Hashing to Estimate the Size of a Stream

https://doi.org/10.59350/8mcnb-9en81

Published January 4, 2016

Author Jeremy Kun

Problem: Estimate the number of distinct items in a data stream that is too large to fit in memory.
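To make the problem concrete, here is a minimal sketch of one standard hashing approach, often called the k-minimum-values estimator. It is not necessarily the method the post develops. The function names, the use of SHA-256 as a stand-in for a random hash function, and the default `k=256` are assumptions chosen for illustration.

```python
import hashlib
import heapq


def hash_to_unit_interval(item):
    """Map an item to a pseudo-random number in [0, 1) via SHA-256."""
    digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(stream, k=256):
    """Estimate the number of distinct items while storing only k hash values.

    If n distinct items hash uniformly into [0, 1), the k-th smallest hash
    value is close to k / n, so (k - 1) / (k-th smallest) estimates n.
    """
    heap = []        # negated hash values: a max-heap of the k smallest hashes
    members = set()  # the same hash values, used to skip repeated items
    for item in stream:
        h = hash_to_unit_interval(item)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # h beats the largest stored value, so swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct hashes seen: count is exact
    return int((k - 1) / -heap[0])


# 60,000 items in the stream, but only 20,000 distinct values.
stream = (i % 20000 for i in range(60000))
print(estimate_distinct(stream, k=1024))
```

Memory use depends on `k` rather than on the length of the stream, and a larger `k` trades memory for a tighter estimate.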

Mathematics · English

Load Balancing and the Power of Hashing

https://doi.org/10.59350/zebh6-4sw38

Published December 28, 2015

Author Jeremy Kun

Here’s a bit of folklore I often hear (and retell) that’s somewhere between a joke and deep wisdom: if you’re doing a software interview that involves some algorithms problem that seems hard, your best bet is to use hash tables. More succinctly put: Google loves hash tables. As someone with a passion for math and theoretical CS, I find this kind of silly and reductionist.

Mathematics · English

The Inequality

https://doi.org/10.59350/98pdm-tn731

Published November 23, 2015

Author Jeremy Kun

Math and computer science are full of inequalities, but there is one that shows up more often in my work than any other. Of course, I’m talking about $$\displaystyle 1+x \leq e^{x}$$ This is The Inequality. I’ve been told on many occasions that the entire field of machine learning reduces to The Inequality combined with the Chernoff bound (which is proved using The Inequality). Why does it show up so often in machine learning?
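For readers who want to see why The Inequality holds, here is a short calculus sketch. The helper function $f$ is introduced only for this illustration and is not taken from the post. Let $f(x) = e^{x} - 1 - x$. Its derivative $f'(x) = e^{x} - 1$ is negative for $x < 0$ and positive for $x > 0$, so $f$ attains its global minimum at $x = 0$, which gives

$$\displaystyle f(x) = e^{x} - 1 - x \geq f(0) = 0 \quad \Longrightarrow \quad 1 + x \leq e^{x} \text{ for all real } x$$

One typical way it gets applied, for example when bounding the probability that $n$ independent events of probability $p$ all fail to occur: substitute $x = -p$ with $0 \leq p \leq 1$ and raise both (nonnegative) sides to the $n$-th power to get

$$\displaystyle (1 - p)^{n} \leq e^{-pn}$$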

Mathematics · English

A Quasipolynomial Time Algorithm for Graph Isomorphism: The Details

https://doi.org/10.59350/vre69-edk82

Published November 12, 2015

Author Jeremy Kun

Update 2017-01-09: Laci claims to have found a workaround to the previously posted error, and the claim is again quasipolynomial time! Updated arXiv paper to follow. Update 2017-01-04: Laci has posted an update on his paper. The short version is that one small step of his analysis was not quite correct, and the result is that his algorithm is sub-exponential, but not quasipolynomial time.

Mathematics · English

Serial Dictatorships and House Allocation

https://doi.org/10.59350/m254r-91257

Published October 26, 2015

Author Jeremy Kun

I was recently an invited speaker in a series of STEM talks at Moraine Valley Community College. My talk was called “What can algorithms tell us about life, love, and happiness?” and it’s on YouTube now so you can go watch it. The central theme of the talk was the lens of computation: that algorithms and theoretical computer science can provide novel explanations for the natural phenomena we observe in the world.

Mathematics · English

One definition of algorithmic fairness: statistical parity

https://doi.org/10.59350/c1ea1-4gs47

Published October 19, 2015

Author Jeremy Kun

If you haven’t read the first post on fairness, I suggest you go back and read it because it motivates why we’re talking about fairness for algorithms in the first place. In this post I’ll describe one of the existing mathematical definitions of “fairness,” its origin, and discuss its strengths and shortcomings.

Mathematics · English

The Boosting Margin, or Why Boosting Doesn't Overfit

https://doi.org/10.59350/q83ve-8n204

Published September 21, 2015

Author Jeremy Kun

There’s a well-understood phenomenon in machine learning called overfitting. The idea is best shown by a graph: [Figure: overfitting] Let me explain. The vertical axis represents the error of a hypothesis. The horizontal axis represents the complexity of the hypothesis. The blue curve represents the error of a machine learning algorithm’s output on its training data, and the red curve represents the generalization of that hypothesis to the real world.

Mathematics · English

The Welch-Berlekamp Algorithm for Correcting Errors in Data

https://doi.org/10.59350/nbs03-gnh20

Published September 7, 2015

Author Jeremy Kun

In this post we’ll implement Reed-Solomon error-correcting codes and use them to play with codes. In our last post we defined Reed-Solomon codes rigorously, but in this post we’ll focus on intuition and code. As usual, the code and data used in this post are available on this blog’s GitHub page.

Mathematics · English

The Čech Complex and the Vietoris-Rips Complex

https://doi.org/10.59350/x6g7z-v8j53

Published August 6, 2015

Author Jeremy Kun

It’s about time we got back to computational topology. Previously in this series we endured a lightning tour of the fundamental group and homology, then we saw how to compute the homology of a simplicial complex using linear algebra. What we really want to do is talk about the inherent shape of data.

Math ∩ Programming
