Reservoir Sampling using JavaScript

I needed to use reservoir sampling recently, and because I’ve been working with JavaScript, just for hoots I figured I’d code an example from scratch.

The reservoir sampling algorithm can be used when you want to pull n items from the range [0,N). For example, reservoir(3, 6) returns three values between [0, 6) exclusive or equivalently, [0, 5] inclusive.


Click to enlarge.

If N isn’t too large, say 10,000 or less, then a simple approach is to create an array with values [0, N), shuffle the values (using the Fisher-Yates algorithm), then return the first n values in the array. But if N is unknown in advance, reservoir sampling is useful.

Reservoir sampling is very tricky, even though the code is short, and in my opinion, the best way to understand the algorithm is to examine a code implementation very carefully.

In my demo, instead of using the built-in JavaScript random number generator, which stunningly does not allow you to set the seed, I used a custom Erratic class that generates sort-of random values (which is often good enough for things like ML).

My reservoir() JavaScript function is pretty much the same as the NumPy Python choice() function. For example, the NumPy version is:

vals = np.random.choice(6, 3, replace=False) # three values between [0,5] inclusive

So, the coding exercise was good fun for me while I was taking a little work break this morning.



The popularity of artist Henry Clive (1882-1960) has waxed and waned (somewhat randomly in my mind) quite a bit over the past several decades.

This entry was posted in JavaScript, Miscellaneous. Bookmark the permalink.