How to speed up Kernel Density Estimation based sampling #414

@lizhengyuhang

Hi Jonathan:

Sampling from a distribution built with chaospy's kernel density estimation (`cp.GaussianKDE`) takes a very long time, whereas sampling from a KDE fitted with sklearn is very fast.
Generating orthogonal polynomials from the KDE-based distribution is also time-consuming.
Is there any way to speed up sampling and orthogonal polynomial generation for KDE-based distributions?

Here is my code:

My environment:
Python 3.8.8
chaospy 4.3.13
numpoly 1.2.11

# Sampling based on kernel density estimation
import time

import numpy as np
import chaospy as cp
from sklearn.neighbors import KernelDensity

# 4-dimensional reference data, 1000 points, mean 2 in every dimension
samples_x_mc = np.random.randn(4, 1000) + 2
print('MC:', np.mean(samples_x_mc, axis=1))

# sklearn: fit a Gaussian KDE and draw 1000 samples
time_b = time.time()
dist_kde = KernelDensity(bandwidth='silverman', kernel='gaussian').fit(samples_x_mc.T)
samples_kde = dist_kde.sample(1000).T
time_e = time.time()
print('SKL:', np.mean(samples_kde, axis=1))
print('SKL time:', time_e - time_b)

# chaospy: fit a Gaussian KDE and draw 1000 samples
time_b = time.time()
dist_cp = cp.GaussianKDE(samples_x_mc, estimator_rule='silverman')
samples_cp = dist_cp.sample(1000)
time_e = time.time()
print('CP:', np.mean(samples_cp, axis=1))
print('CP time:', time_e - time_b)

The output is:
MC: [1.99415654 1.97854686 1.99451553 1.99307159]
SKL: [2.01294929 2.01646145 2.02962535 1.91007853]
SKL time: 0.001994609832763672
CP: [1.92208752 1.95432996 1.98291103 1.94525712]
CP time: 81.79704451560974
With more samples the CP time grows further, to the point where the computation becomes impractical.

# Orthogonal polynomial generation based on kernel density estimation
import time

import numpy as np
import chaospy as cp

samples_x_mc = np.random.randn(5, 1000) + 2

# Fit the KDE and build a second-order orthogonal expansion
time_b = time.time()
distributions = cp.GaussianKDE(samples_x_mc, estimator_rule='silverman')
expansion, norms = cp.generate_expansion(2, distributions, rule="cholesky", retall=True)
time_e = time.time()
print('time:', time_e - time_b)

The output is:
time: 71.0097062587738
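
For comparison, here is a minimal sketch of the direct way to sample from a Gaussian KDE, which is roughly why the sklearn call is so cheap: pick a data point at random and add Gaussian noise with the kernel covariance. The function name `sample_gaussian_kde` is hypothetical and not part of chaospy. The bandwidth follows the Silverman factor as used by `scipy.stats.gaussian_kde`, which is an assumption and may not match chaospy's `estimator_rule='silverman'` exactly. It returns raw samples only, so it does not help with `cp.generate_expansion`.

```python
import numpy as np


def sample_gaussian_kde(data, size, seed=None):
    """Draw `size` samples from a Gaussian KDE fitted to `data` (shape: dims x n).

    Hypothetical helper, not a chaospy function.
    """
    rng = np.random.default_rng(seed)
    data = np.atleast_2d(data)
    dims, n = data.shape

    # Silverman factor in the scipy.stats.gaussian_kde convention (assumption:
    # chaospy's rule may scale the bandwidth differently).
    factor = (n * (dims + 2) / 4.0) ** (-1.0 / (dims + 4))
    kernel_cov = np.atleast_2d(np.cov(data)) * factor**2

    # Each sample is a randomly chosen data point plus kernel-shaped noise.
    centers = rng.integers(0, n, size=size)
    noise = rng.multivariate_normal(np.zeros(dims), kernel_cov, size=size).T
    return data[:, centers] + noise


samples_x_mc = np.random.randn(4, 1000) + 2
samples_direct = sample_gaussian_kde(samples_x_mc, 1000)
print('Direct:', np.mean(samples_direct, axis=1))
```

The cost here is one index draw and one normal draw per sample, with no inversion of the KDE's cumulative distribution.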
