Reservoir Sampling (offline)

Given an array A of n elements and a number 0<=k<=n, return a subset of A of size k .

Example

Input: A = [5, 11, 3, 7], k =3
Output: [3, 7, 11]
Note that there is no fixed output. It might vary for each run.

Solution

Idea:

  • Choose samples one by one

  • For each sample, choose a random integer which indicates the location, then swap this element to the beginning of the array

Time complexity: O(k)O(k)

Space complexity: O(1)O(1)

import(random)

def random_sampling(A, k):
    for i in range(k) :
        # generate a random number from i to len(A)-1
        j = random.randint(i, len(A)-1)
        A[i], A[j] = A[j], A[i]
    return A[:k]

Last updated