dpnp.cov
- dpnp.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None, *, dtype=None)
Estimate a covariance matrix, given data and weights.
For full documentation refer to numpy.cov.
- Parameters:
m ({dpnp.ndarray, usm_ndarray}) -- A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.
y ({None, dpnp.ndarray, usm_ndarray}, optional) -- An additional set of variables and observations. y has the same form as that of m. Default: None.
rowvar (bool, optional) -- If rowvar is True, then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations. Default: True.
bias (bool, optional) -- Default normalization is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof. Default: False.
ddof ({None, int}, optional) -- If not None, the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details and the sketch after this list. Default: None.
fweights ({None, dpnp.ndarray, usm_ndarray}, optional) -- 1-D array of integer frequency weights; the number of times each observation vector should be repeated. It is required that fweights >= 0. However, the function will not raise an error when fweights < 0 for performance reasons. Default: None.
aweights ({None, dpnp.ndarray, usm_ndarray}, optional) -- 1-D array of observation vector weights. These relative weights are typically large for observations considered "important" and smaller for observations considered less "important". If ddof=0 the array of weights can be used to assign probabilities to observation vectors. It is required that aweights >= 0. However, the function will not raise an error when aweights < 0 for performance reasons. Default: None.
dtype ({None, str, dtype object}, optional) -- Data-type of the result. By default, the return data-type will have at least floating point type based on the capabilities of the device on which the input arrays reside. Default: None.
- Returns:
out -- The covariance matrix of the variables.
- Return type:
dpnp.ndarray
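The interaction of rowvar, bias, and ddof can be seen in a minimal sketch (an illustration added here, not part of the upstream reference):
>>> import dpnp as np
>>> x = np.array([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]])  # 2 variables, 3 observations
>>> c = np.cov(x)                       # rows are variables (rowvar=True); divide by N - 1 = 2
>>> c_t = np.cov(x.T, rowvar=False)     # columns are variables; same matrix as c
>>> c_b = np.cov(x, bias=True)          # divide by N = 3 instead of N - 1
>>> c_d = np.cov(x, bias=True, ddof=1)  # ddof overrides bias: back to dividing by N - 1
Here c, c_t, and c_d should all equal the unbiased estimate, while c_b is scaled by (N - 1)/N relative to it.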
See also
dpnp.corrcoef
Normalized covariance matrix.
Notes
Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:
>>> import dpnp as np
>>> m = np.arange(10, dtype=np.float32)
>>> f = np.arange(10) * 2
>>> a = np.arange(10) ** 2.0
>>> ddof = 1
>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(w * a)
>>> m -= np.sum(m * w, axis=None, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)
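As a sanity check (an addition to the upstream notes), the same value should come out of dpnp.cov applied directly to an unmodified copy of the data with the same weights, since the steps above are what the function is documented to compute:
>>> m2 = np.arange(10, dtype=np.float32)           # fresh copy; m was centered in place above
>>> np.cov(m2, fweights=f, aweights=a, ddof=ddof)  # expected to match cov from the steps above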
Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) goes over to 1 / (np.sum(f) - ddof) as it should.
Examples
>>> import dpnp as np
>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T
Consider two variables, \(x_0\) and \(x_1\), which correlate perfectly, but in opposite directions:
>>> x
array([[0, 1, 2],
       [2, 1, 0]])
Note how \(x_0\) increases while \(x_1\) decreases. The covariance matrix shows this clearly:
>>> np.cov(x)
array([[ 1., -1.],
       [-1.,  1.]])
Note that element \(C_{0,1}\), which shows the correlation between \(x_0\) and \(x_1\), is negative.
Further, note how x and y are combined:
>>> x = np.array([-2.1, -1, 4.3])
>>> y = np.array([3, 1.1, 0.12])
>>> X = np.stack((x, y), axis=0)
>>> np.cov(X)
array([[11.71      , -4.286     ],  # may vary
       [-4.286     ,  2.14413333]])
>>> np.cov(x, y)
array([[11.71      , -4.286     ],  # may vary
       [-4.286     ,  2.14413333]])
>>> np.cov(x)
array(11.71)
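The frequency weights can be illustrated directly as well (a sketch added here, based on the fweights description above): giving an observation a frequency weight of two is equivalent to duplicating the corresponding column of the data.
>>> x = np.array([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]])
>>> fw = np.array([1, 2, 1])   # count the middle observation twice
>>> np.cov(x, fweights=fw)     # should equal the covariance of the expanded data below
>>> x_rep = np.array([[0.0, 1.0, 1.0, 2.0], [2.0, 1.0, 1.0, 0.0]])
>>> np.cov(x_rep)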