For some reason NumPy doesn't seem to have a built-in cross-correlation method that is fast for large input arrays, hence this code. It computes the CCF using FFTs. I know there's one in statsmodels, but mine has more options :P, and I wrote this partly as an exercise in understanding cross-correlation.
I adapted it to Pandas because Pandas totally rocks for organizing and munging, so I thought I would share this code. The functions not listed here are available upon request if there's interest in adding this to Pandas.

"""Module for cross-correlation."""

import numpy as np
import pandas as pd

from span.utils import nextpow2, pad_larger, get_fft_funcs, detrend_mean


def autocorr(x, n):
    """Compute the autocorrelation of `x`

    Parameters
    ----------
    x : array_like
        Input array
    n : int
        Number of fft points

    Returns
    -------
    r : array_like
        The autocorrelation of `x`.
    """
    ifft, fft = get_fft_funcs(x)
    return ifft(np.absolute(fft(x, n)) ** 2.0, n)


def crosscorr(x, y, n):
    """Compute the cross correlation of `x` and `y`

    Parameters
    ----------
    x, y : array_like
    n : int

    Returns
    -------
    c : array_like
        Cross correlation of `x` and `y`
    """
    ifft, fft = get_fft_funcs(x, y)
    return ifft(fft(x, n) * fft(y, n).conj(), n)


def matrixcorr(x, nfft=None):
    """Cross-correlation of the columns in a matrix

    Parameters
    ----------
    x : array_like
        The matrix from which to compute the cross correlations of each
        column with the others

    Returns
    -------
    c : array_like
    """
    m, n = x.shape
    if nfft is None:
        nfft = int(2 ** nextpow2(2 * m - 1))
    ifft, fft = get_fft_funcs(x)
    X = fft(x.T, nfft)
    Xc = X.conj()
    mx, nx = X.shape
    c = np.empty((mx ** 2, nx), dtype=X.dtype)
    # Row block i holds the spectrum of column i times the conjugate
    # spectrum of every column.
    for i in range(n):
        c[i * n:(i + 1) * n] = X[i] * Xc
    return ifft(c, nfft).T, m


def xcorr(x, y=None, maxlags=None, detrend=detrend_mean,
          scale_type='normalize'):
    """Compute the cross correlation of `x` and `y`.

    This function computes the cross correlation of `x` and `y`. It uses the
    equivalence of the cross correlation with the negative convolution
    computed using a FFT to achieve much faster cross correlation than is
    possible with the traditional signal processing definition.

    By default it computes the cross correlation at each of 1 - maxlags to
    maxlags, scaled by the lag 0 cross correlation after mean centering the
    data.

    Note that it is not necessary for `x` and `y` to be the same size.

    Parameters
    ----------
    x : array_like
    y : array_like, optional
        If not given or is equal to `x`, the autocorrelation is computed.
    maxlags : int, optional
        The highest lag at which to compute the cross correlation.
    detrend : callable, optional
        A callable to detrend the data.
    scale_type : str, optional

    Returns
    -------
    c : pandas.Series or pandas.DataFrame
    """
    def unbiased(c, lsize):
        """Scale each lag by the number of overlapping samples."""
        return c / (lsize - np.abs(c.index))

    def biased(c, lsize):
        """Scale every lag by the input length."""
        return c / lsize

    def normalize(c, lsize):
        """Scale so that the lag 0 autocorrelation is 1."""
        assert c.ndim in (1, 2), 'invalid size of cross correlation array'
        if c.ndim == 1:
            cdiv = c.loc[0]
        else:
            mc, nc = c.shape
            ncsqrt = int(np.sqrt(nc))
            # Column positions of the autocorrelations (the "diagonal").
            jkl = np.diag(np.r_[:nc].reshape((ncsqrt, ncsqrt)))
            tmp = np.sqrt(c.loc[0, jkl])
            cdiv = np.outer(tmp, tmp).ravel()
        return c / cdiv

    SCALE_FUNCTIONS = {
        None: lambda c, lsize: c,
        'none': lambda c, lsize: c,
        'unbiased': unbiased,
        'biased': biased,
        'normalize': normalize
    }

    assert x.ndim in (1, 2), 'x must be a vector or matrix'
    x = detrend(x)

    if x.ndim == 2 and np.all(np.greater(x.shape, 1)):
        assert y is None, 'y argument not allowed when x is a 2D array'
        ctmp, lsize = matrixcorr(x)
    elif y is None or y is x or np.array_equal(x, y):
        lsize = x.size
        ctmp = autocorr(x, int(2 ** nextpow2(2 * lsize - 1)))
    else:
        x, y, lsize = pad_larger(x, detrend(y))
        ctmp = crosscorr(x, y, int(2 ** nextpow2(2 * lsize - 1)))

    if maxlags is None:
        maxlags = lsize
    else:
        assert maxlags <= lsize, \
            'max lags must be less than or equal to %i' % lsize

    # Negative lags sit at the end of the FFT output, so negative indices
    # pick them up directly.
    lags = np.r_[1 - maxlags:maxlags]
    return_type = pd.DataFrame if ctmp.ndim == 2 else pd.Series
    scaler = SCALE_FUNCTIONS[scale_type]
    return scaler(return_type(ctmp[lags], index=lags), lsize)
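
Since the helpers from `span.utils` aren't listed, here is a rough standalone sketch of the same idea for the 1-D case, in case it helps anyone follow along. It is not the module above: the name `fft_xcorr` is made up, it assumes real-valued input, and it swaps in NumPy's `rfft`/`irfft` for `get_fft_funcs`, plain mean subtraction for `detrend_mean`, and a bit shift for `nextpow2`.

```python
import numpy as np
import pandas as pd


def fft_xcorr(x, y=None, scale_type="normalize"):
    """Cross-correlation of 1-D real arrays as a lag-indexed Series (sketch)."""
    x = np.asarray(x, dtype=float)
    y = x if y is None else np.asarray(y, dtype=float)
    # Mean-centre both inputs (assumed stand-in for the detrend step).
    x = x - x.mean()
    y = y - y.mean()
    m = max(x.size, y.size)
    # Pad to a power of two >= 2*m - 1 so the circular correlation computed
    # by the FFT cannot wrap around onto itself.
    nfft = 1 << int(np.ceil(np.log2(2 * m - 1)))
    c = np.fft.irfft(np.fft.rfft(x, nfft) * np.fft.rfft(y, nfft).conj(), nfft)
    # Negative lags sit at the end of the FFT output, so negative indices
    # pick them up directly.
    lags = np.arange(1 - m, m)
    c = pd.Series(c[lags], index=lags)
    if scale_type == "biased":
        return c / m
    if scale_type == "unbiased":
        return c / (m - np.abs(lags))
    if scale_type == "normalize":
        return c / c.loc[0]
    return c  # None or anything else: no scaling
```

For equal-length inputs, `fft_xcorr(x, y, scale_type=None)` should agree with `np.correlate(x - x.mean(), y - y.mean(), mode="full")` up to floating-point error, which is a handy sanity check on the lag ordering.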
Right now I'm using it like