batched_indices

sisl.utils.batched_indices(ref, y, axis=-1, atol=1e-8, batch_size=200, diff_func=None)

Locate y in ref by examining np.abs(diff_func(ref - y)) <= atol.

This method is necessary for very large data sets, since evaluating ref - y directly relies on broadcasting, which creates very large temporary arrays in memory.

It processes y in chunks so that the broadcast arrays stay within a size of batch_size MB.

The memory is calculated as np.prod(ref.shape) / ref.shape[axis] * n, where n is chosen such that a chunk of n entries of y uses batch_size MB of memory. n is always at least 1, regardless of batch_size.
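The chunking idea can be sketched in plain NumPy for the flattened (axis=None) case. This is a minimal illustration, not sisl's implementation: the helper name batched_locate_1d and the parameter batch_size_mb are hypothetical, the memory estimate assumes float64 temporaries, and the order of the returned indices may differ from what batched_indices returns.

```python
import numpy as np


def batched_locate_1d(ref, y, atol=1e-8, batch_size_mb=200.0):
    """Hypothetical helper: indices of `ref` whose value matches any entry of `y`."""
    ref = np.asarray(ref, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64).ravel()
    flat = ref.ravel()

    # One row of the broadcast difference holds flat.size float64 values.
    bytes_per_row = max(1, flat.size * flat.itemsize)
    # Entries of y handled per chunk; never fewer than 1.
    n = max(1, int(batch_size_mb * 1024**2 // bytes_per_row))

    hits = np.zeros(flat.size, dtype=bool)
    for start in range(0, y.size, n):
        chunk = y[start:start + n].reshape(-1, 1)
        # The temporary has shape (len(chunk), flat.size), not (y.size, flat.size).
        hits |= (np.abs(flat - chunk) <= atol).any(axis=0)

    # Return one index array per dimension of `ref`, like numpy.nonzero.
    return np.unravel_index(np.nonzero(hits)[0], ref.shape)


ref = np.array([[0.0, 1.0, 2.0],
                [3.0, 1.0, 5.0]])
rows, cols = batched_locate_1d(ref, [1.0, 5.0])
# rows is [0, 1, 1] and cols is [1, 1, 2]
```

A very small batch_size_mb forces n down to 1, so the loop visits one entry of y at a time and the result is unchanged; only the peak memory differs.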

Parameters:

ref (ndarray) – reference array in which to locate the indices of y
y (ndarray of 1D or 2D) – array to locate in ref. For 2D arrays and axis not None,
axis (int or None, optional) – axis along which to do a logical reduction. If None, the arrays are treated as 1D and no axis is reduced, i.e. the same as ref.ravel() - y.reshape(-1, 1) but retaining indices of the same dimension as ref.
atol (float or ndarray, optional) – absolute tolerance for comparing values. An array-like value must have the same length as ref.shape[axis].
batch_size (float, optional) – maximum size of the broadcast array, in MB. Internal algorithms determine the chunk size of y.
diff_func (callable, optional) – function to post-process the difference values; defaults to doing nothing. It should have the interface def diff_func(diff).
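How axis, an array-valued atol and diff_func interact can be illustrated with an unbatched NumPy sketch. It assumes that the logical reduction is a logical AND over the components along axis and that the last dimension of y lines up with ref.shape[axis]; neither is confirmed by the text above. The identity diff_func and the sample data are made up for the example, and batched_indices itself is not called.

```python
import numpy as np

# Three reference points and two points to look up, one point per row.
ref = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.5, 0.0],
                [2.0, 0.0, 1.0]])
y = np.array([[1.0, 0.5, 0.0],
              [2.0, 0.0, 1.2]])

# One tolerance per component along the reduced axis (length ref.shape[-1]).
atol = np.array([1e-8, 1e-8, 0.5])


def diff_func(diff):
    # Hypothetical post-processing hook; here it leaves the differences as-is.
    return diff


# Broadcast to shape (len(y), len(ref), 3): every y row against every ref row.
diff = diff_func(ref[None, :, :] - y[:, None, :])

# Assumed reduction: a row matches only if all components are within atol.
match = np.all(np.abs(diff) <= atol, axis=-1)

# Indices of ref rows matched by at least one row of y, as a nonzero-style tuple.
idx = np.nonzero(match.any(axis=0))
matched_rows = ref[idx]
```

Here the second row of y differs from the last row of ref by 0.2 in the final component, which still counts as a match because that component has the looser tolerance of 0.5.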

Returns:

The indices for each ref dimension that match the values in y. The returned indices behave like those returned by numpy.nonzero.

Return type:

tuple of ndarray