python - Correlating an array row-wise with a vector - TagMerge

Correlating an array row-wise with a vector

Asked 1 year ago
3 answers

The following computes the correlation coefficient between each row of X and a given vector y in a vectorized manner.

import numpy as np

X = np.random.random([1000, 10])
y = np.random.random(10)
n = len(y)
r = (n * np.sum(X * y[None, :], axis=-1) - np.sum(X, axis=-1) * np.sum(y)) / np.sqrt(
    (n * np.sum(X**2, axis=-1) - np.sum(X, axis=-1) ** 2) * (n * np.sum(y**2) - np.sum(y) ** 2)
)
print(r[0], np.corrcoef(X[0], y)[0, 1])
# Example output (values vary with the random data): 0.4243951 0.4243951
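
The one-liner above is the raw-sums form of the Pearson coefficient. As a minimal sketch, it can be wrapped in a function and checked against np.corrcoef for every row, not just the first. The name `rowwise_corr` and the seeded generator are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np

def rowwise_corr(X, y):
    """Pearson r between each row of X (shape (N, n)) and y (shape (n,)).

    Hypothetical helper: same raw-sums formula as the one-liner above,
    split into named terms for readability.
    """
    n = len(y)
    sum_x = X.sum(axis=-1)                   # per-row sum of x
    sum_y = y.sum()                          # scalar sum of y
    sum_xy = (X * y[None, :]).sum(axis=-1)   # per-row sum of x*y
    sum_xx = (X ** 2).sum(axis=-1)           # per-row sum of x^2
    sum_yy = (y ** 2).sum()                  # scalar sum of y^2
    num = n * sum_xy - sum_x * sum_y
    den = np.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return num / den

rng = np.random.default_rng(0)               # seeded only for reproducibility
X = rng.random((1000, 10))
y = rng.random(10)

r = rowwise_corr(X, y)
expected = np.array([np.corrcoef(row, y)[0, 1] for row in X])
print(np.allclose(r, expected))
```

The raw-sums form subtracts large, nearly equal quantities when the data have a large mean relative to their spread, so a mean-centred formulation is usually the safer choice for such data.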


The first way you might consider doing this is in a loop with the np.corrcoef function, which returns the matrix of linear correlation coefficients between two vectors; element [0,1] is the coefficient between them:
N = X.shape[0]
r = np.zeros((N,1))
for i in range(N):
    r[i] = np.corrcoef(X[i,:], y)[0,1]
If N is large, or if you need to perform this calculation many times (with an outer loop wrapped around it), this will be very slow. We can time it as follows:
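import time
import numpy as np

# vic_runoff (array with 10000 rows) and obs_runoff (vector matching its row length) are assumed to be defined already
times = np.zeros(10)
for t in range(10): # average 10 timings

    start = time.time()
    r = np.zeros((10000,1))
    for k in range(10000):
        r[k] = np.corrcoef(vic_runoff[k,:], obs_runoff)[0,1]

    end = time.time()
    times[t] = end-start

print(np.mean(times))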
We can write a function using NumPy’s vectorized arithmetic to compute these values all at once rather than in a loop. For example, np.multiply(X,y) (also given by X*y) performs element-wise multiplication of the vector y over all rows of the matrix X. The function might look something like this:
def vcorrcoef(X,y):
    Xm = np.reshape(np.mean(X,axis=1),(X.shape[0],1))
    ym = np.mean(y)
    r_num = np.sum((X-Xm)*(y-ym),axis=1)
    r_den = np.sqrt(np.sum((X-Xm)**2,axis=1)*np.sum((y-ym)**2))
    r = r_num/r_den
    return r
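
To make the broadcasting in this function concrete, here is a small sketch on a hand-checkable 3x3 case. The input values are made up for illustration, and `keepdims=True` is used in place of the np.reshape call; both produce a column of row means with shape (N, 1).

```python
import numpy as np

# Toy data chosen so the correlations are easy to verify by hand
X = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0],
              [1.0, 3.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])

Xm = X.mean(axis=1, keepdims=True)   # shape (3, 1): one mean per row
ym = y.mean()                        # scalar mean of y

# (3, 3) * (3,): the centred y is broadcast across every row of X
centered = (X - Xm) * (y - ym)

r = centered.sum(axis=1) / np.sqrt(((X - Xm) ** 2).sum(axis=1) * ((y - ym) ** 2).sum())
print(Xm.shape, centered.shape, r)
```

The first row is perfectly correlated with y, the second perfectly anti-correlated, and the third gives 0.5. One caveat: a row with zero variance makes the denominator zero, so its entry comes out as nan.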
If we try timing it as before:
times = np.zeros(10)
for t in range(10): # average 10 timings

    start = time.time()
    r = vcorrcoef(vic_runoff,obs_runoff)
    end = time.time()
    times[t] = end-start

print(np.mean(times))
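
The two timing loops depend on vic_runoff and obs_runoff, which are not shown here. As a self-contained sketch, the comparison can be reproduced with synthetic data and the standard-library timeit module. The array shapes, the seed, and the helper name `loop_corrcoef` are assumptions for illustration, and the timings will vary by machine.

```python
import timeit
import numpy as np

def vcorrcoef(X, y):
    # Same mean-centred computation as the function above
    Xm = X.mean(axis=1, keepdims=True)
    ym = y.mean()
    r_num = np.sum((X - Xm) * (y - ym), axis=1)
    r_den = np.sqrt(np.sum((X - Xm) ** 2, axis=1) * np.sum((y - ym) ** 2))
    return r_num / r_den

def loop_corrcoef(X, y):
    # Hypothetical baseline: one np.corrcoef call per row
    return np.array([np.corrcoef(row, y)[0, 1] for row in X])

rng = np.random.default_rng(0)
X = rng.random((10000, 50))   # synthetic stand-in for vic_runoff (shape assumed)
y = rng.random(50)            # synthetic stand-in for obs_runoff

repeats = 3
t_loop = timeit.timeit(lambda: loop_corrcoef(X, y), number=repeats) / repeats
t_vec = timeit.timeit(lambda: vcorrcoef(X, y), number=repeats) / repeats
print(f"loop: {t_loop:.4f} s  vectorized: {t_vec:.4f} s")
```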


In Matlab this would be possible with the corr function, corr(X,y). In Python, however, np.corrcoef does not give this result directly; it treats y as one more row and returns the full correlation matrix of all rows:
import numpy as np
X = np.random.random([1000, 10])
y = np.random.random(10)
np.corrcoef(X,y).shape  # (1001, 1001)
Of course this could be done via a list comprehension:
np.array([np.corrcoef(X[i, :], y)[0,1] for i in range(X.shape[0])])
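
Since np.corrcoef(X, y) stacks y as an extra row, the row-wise correlations can also be read off the last row of the matrix it returns. The sketch below shows this and compares it with the list comprehension; the seeded generator is an assumption added for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 10))
y = rng.random(10)

C = np.corrcoef(X, y)   # y is appended as row 1000, so C has shape (1001, 1001)
r = C[-1, :-1]          # last row without the diagonal entry: corr(y, X[i]) for each i

loop = np.array([np.corrcoef(X[i, :], y)[0, 1] for i in range(X.shape[0])])
print(C.shape, np.allclose(r, loop))
```

This computes every row-to-row correlation only to discard most of them, so memory and time grow quadratically with the number of rows. For large arrays a direct vectorized formula is likely the better choice.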

