SampSimu

A Python package for multivariate sampling and resampling techniques for simulation, random sample generation, estimation, and experimental design


Author: Siavash Tabrizian - stabrizian@smu.edu


1 - Sampling:

There are different sampling techniques that can be used to generate samples. This package lets the user generate samples with three methods:

1 - Crude Monte Carlo sampling (Simple random sampling/SRS):

The unbiased sample mean estimator

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i ,$$

can be used in this case in order to estimate the population mean $\mu$. In this sampling technique, in order to obtain an observation, $R$ random numbers should first be generated from $U[0,1]$, where $R$ is the number of random variables, and after that, using the CDF of each variable, a value can be taken from its distribution.

Sampling steps for generating $n$ observations:

For i <= n:
    For j <= R:
		1. build the cumulative distribution function (CDF) of the jth random variable
		2. draw a random number r from the [0,1] interval
		3. find the value of the jth random variable at r using its CDF
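
Below is a minimal sketch of these steps for discrete random variables, assuming NumPy and the `[values, probabilities]` format used in the example section; the helper name `crude_mc_sample` is illustrative and not part of the package API.

```python
import numpy as np

def crude_mc_sample(rvs, n, rng=None):
    """Draw n observations of R discrete random variables by inverse-CDF sampling.

    rvs: list of [values, probabilities] pairs, one entry per random variable.
    Returns an (n, R) array of observations.
    """
    if rng is None:
        rng = np.random.default_rng()
    obs = np.empty((n, len(rvs)))
    for j, (values, probs) in enumerate(rvs):
        cdf = np.cumsum(probs)                                       # step 1: build the CDF
        r = rng.uniform(size=n)                                      # step 2: uniform draws from [0, 1]
        idx = np.minimum(np.searchsorted(cdf, r), len(values) - 1)   # step 3: invert the CDF
        obs[:, j] = np.asarray(values)[idx]
    return obs
```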

======================================

2 - Antithetic Sampling

In this sampling technique, in order to obtain a pair of observations, $R$ random numbers should first be generated from $U[0,1]$, where $R$ is the number of random variables, and after that, using the CDF, two values can be taken from the distribution using $r$ and $1-r$.

Sampling steps:

For i <= n/2:
    For j <= R:
		1. build the cumulative distribution function (CDF) of the jth random variable
		2. draw a random number r from the [0,1] interval
		3. find the value of the jth random variable at r using its CDF
		4. find the value of the jth random variable at 1-r using its CDF
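
A sketch of the antithetic variant under the same assumptions (NumPy, discrete `[values, probabilities]` variables, illustrative helper name): each uniform draw $r$ is paired with $1-r$, so only $n/2$ uniform draws are needed per variable.

```python
import numpy as np

def antithetic_sample(rvs, n, rng=None):
    """Draw n observations (n assumed even) using antithetic pairs r and 1 - r."""
    if rng is None:
        rng = np.random.default_rng()
    obs = np.empty((n, len(rvs)))
    for j, (values, probs) in enumerate(rvs):
        cdf = np.cumsum(probs)                                       # CDF of the jth variable
        r = rng.uniform(size=n // 2)                                 # only n/2 uniform draws
        u = np.concatenate([r, 1.0 - r])                             # pair each draw with 1 - r
        idx = np.minimum(np.searchsorted(cdf, u), len(values) - 1)
        obs[:, j] = np.asarray(values)[idx]
    return obs
```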

======================================

3 - Latin Hypercube Sampling (LHS)

In this sampling technique, in order to obtain $n$ observations, each random variable should first be stratified into $n$ intervals. Thereafter, a permutation of the intervals should be generated for each random variable; together these permutations define $n$ hypercubes in the sample space, and one observation can then be taken randomly from each hypercube.

Sampling steps:

Generate $R$ random permutations of \{1,...,n\}; the jth permutation is $p^j = (p^j_1,...,p^j_n)$
For i <= n:
    For j <= R:
		1. build the cumulative distribution function (CDF) of the jth random variable
		2. draw a random number r from the $p^j_i$ th subinterval of [0,1]
		3. find the value of the jth random variable at r using its CDF
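
A sketch of these steps under the same assumptions; `lhs_sample` is an illustrative name. For each variable, a random permutation of the $n$ strata fixes which subinterval of $[0,1]$ each observation's uniform draw falls into.

```python
import numpy as np

def lhs_sample(rvs, n, rng=None):
    """Draw n observations with Latin hypercube sampling over [0, 1]^R."""
    if rng is None:
        rng = np.random.default_rng()
    obs = np.empty((n, len(rvs)))
    for j, (values, probs) in enumerate(rvs):
        cdf = np.cumsum(probs)                                       # CDF of the jth variable
        perm = rng.permutation(n)                                    # random permutation of the n strata
        u = (perm + rng.uniform(size=n)) / n                         # one uniform draw inside each stratum
        idx = np.minimum(np.searchsorted(cdf, u), len(values) - 1)
        obs[:, j] = np.asarray(values)[idx]
    return obs
```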

2 - Resampling:

This section describes the second class of techniques in the sampling module:

1 - Monte Carlo simulation:

There are $M$ replications, and in each replication a sample is generated using one of the techniques from the previous section. The final estimate is the sample mean over the $M$ obtained estimates:

$$\hat{\mu} = \frac{1}{M}\sum_{m=1}^{M} \bar{X}_m$$
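
A sketch of the replication loop, reusing the hypothetical `crude_mc_sample` helper from above together with an evaluation function such as `evalfunc` from the example section.

```python
import numpy as np

def monte_carlo_estimate(rvs, evalfunc, n, M, rng=None):
    """Average M independent sample-mean estimates, each computed from n observations."""
    if rng is None:
        rng = np.random.default_rng()
    estimates = []
    for _ in range(M):
        obs = crude_mc_sample(rvs, n, rng)                  # one replication's sample
        estimates.append(np.mean([evalfunc(x) for x in obs]))
    return np.mean(estimates)                               # final estimate over the M replications
```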

2 - Bootstrapping:

In this resampling technique, $M$ samples of smaller size are generated, with replacement, from a given sample of larger size. The estimation can be done using the sample mean estimator.
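
A sketch of this step, assuming the given sample has already been evaluated into a 1-D array of numbers; the helper name is illustrative.

```python
import numpy as np

def bootstrap_estimate(sample, M, m, rng=None):
    """Draw M subsamples of size m with replacement and average their sample means."""
    if rng is None:
        rng = np.random.default_rng()
    sample = np.asarray(sample)
    means = [np.mean(rng.choice(sample, size=m, replace=True)) for _ in range(M)]
    return np.mean(means)
```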

3 - Jackknife:

It is another resampling technique for generating a set of samples of smaller size from a given sample of larger size. In this method, $n$ samples are generated from a sample of size $n$. In the $i$th sample, observation $i$ is taken out, which leads to $n$ samples of size $n-1$. The estimation, similar to bootstrapping, is obtained using the sample mean estimator.
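
A sketch of the leave-one-out jackknife under the same assumption (an evaluated 1-D sample); the helper name is illustrative.

```python
import numpy as np

def jackknife_estimate(sample):
    """Leave-one-out jackknife: average the n sample means of the n subsamples of size n - 1."""
    sample = np.asarray(sample)
    loo_means = [np.mean(np.delete(sample, i)) for i in range(len(sample))]
    return np.mean(loo_means)
```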

Example

In the experiment folder there are some experiments using this package. exp1 has 4 random variables, and the evaluation function is defined in evalfunc. We can test the law of large numbers as the sample size increases.

# each random variable is defined as [values, probabilities]
var1 = [[0.0,1.0,2.0],[0.1,0.4,0.5]]
var2 = [[1.5,2.5,3.5,4.5,8.0],[0.05,0.05,0.2,0.4,0.3]]
var3 = [[0.1,7.0],[0.05,0.95]]
var4 = [[0.0,0.05,0.07,0.9,0.4],[0.2,0.2,0.5,0.05,0.05]]

RVs = [var1,var2,var3,var4]


def evalfunc(obs):
    out = 0.0
    out += 10*obs[0]*obs[1]
    out += 100*obs[2]
    out += (10*obs[3])**2
    return out

dist = ProbDist(RVs, evalfunc) #instance of the distribution with its evaluation function

sampl = samp_gen(dist) #sampling object built on the distribution

resampl = resampling(sampl) #resampling object built on the sampler

visual = visualsamp(resampl,'Res/') #visualization helper with output directory 'Res/'

visual.lawlargevs() #plot the law-of-large-numbers behaviour as the sample size grows