USC DSCI 553 - Foundations & Applications of Data Mining - Spring 2024 - Prof. Wei-Min Shen
Sampling methods for data streams
SAT'18 Paper: SPUR - Satisfying Perfectly Uniform Random sampler (Winner Best Student Paper)
Reservoir sampling implementation with akka-streams support
Reservoir sampling for group-by queries on the Flink platform, for effectively answering single-aggregate queries.
Output randomly sampled lines from input stream or file
Sample documents from MongoDB collections.
A collection of algorithms in Java 8 for the problem of random sampling with a reservoir
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Optimal implementation of reservoir sampling algorithm in Julia.
Produce a sample of lines from files.
Bloom filtering, Flajolet-Martin algorithm, and reservoir sampling
Data and processor parallelism for fast weighted sampling
A stream sampler extracts one or more sample sets, each with a given number of elements, from a stream. Each possible sample set of the given size has an equal probability of being extracted. A stream sampler is an online algorithm: the size of the input is unknown, and only one pass over the stream is possible.
A collection of random sampling algorithms in Python.
The aim of this project was to sample a sports data set
reservoir-sampling-go is a Go (Golang) implementation of the reservoir sampling algorithm.
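The stream-sampler description in this list (one pass, unknown input size, every sample set of a given size equally likely) matches classic reservoir sampling. Below is a minimal Python sketch of the textbook variant often called Algorithm R. It is not taken from any of the listed repositories; the function name `reservoir_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Illustrative sketch of Algorithm R: a single pass, O(k) memory,
    and no need to know the stream length in advance.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with
            # probability k / (i + 1), evicting a random earlier item.
            j = rng.randint(0, i)  # both endpoints are inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 values from a generator whose length is not known up front.
    print(reservoir_sample((x * x for x in range(10_000)), 5))
```

If the stream holds fewer than `k` items, the sketch simply returns all of them. Passing a seeded `random.Random` instance makes a run reproducible, which is convenient for testing.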