ArrayObject#

class abtem.array.ArrayObject(array, ensemble_axes_metadata=None, metadata=None)[source]#

Bases: Ensemble, EqualityMixin, CopyMixin

A base class for simulation objects described by an array and associated metadata.

Parameters:
  • array (ndarray) – Array representing the array object.

  • ensemble_axes_metadata (list of AxesMetadata) – Axis metadata for each ensemble axis. The axis metadata must be compatible with the shape of the array.

  • metadata (dict) – A dictionary defining wave function metadata. All items will be added to the metadata of measurements derived from the waves.

__init__(array, ensemble_axes_metadata=None, metadata=None)[source]#

Methods

__init__(array[, ensemble_axes_metadata, ...])

apply_func(func, **kwargs)

rtype:

TypeVar(T, bound= ArrayObject)

apply_transform(transform[, max_batch])

Transform the wave functions by a given transformation.

compute([progress_bar, profiler, ...])

Turn a lazy abTEM object into its in-memory equivalent.

copy()

Make a copy.

copy_to_device(device)

Copy array to specified device.

ensemble_blocks([chunks])

Split the ensemble into an array of smaller ensembles.

ensure_lazy([chunks])

Creates an equivalent lazy version of the array object.

expand_dims([axis, axis_metadata])

Expand the shape of the array object.

from_array_and_metadata(array, ...)

Creates array object from a given array and metadata.

from_zarr(url[, chunks])

Read an array object from a zarr file.

generate_blocks([chunks])

Generate chunks of the ensemble.

generate_ensemble([keepdims])

Generate every member of the ensemble.

get_from_metadata(name[, broadcastable])

get_items(items[, keepdims])

Index the array and the corresponding axes metadata.

lazy([chunks])

rtype:

TypeVar(T, bound= ArrayObject)

max([axis, keepdims, split_every])

Maximum of array object over one or more axes.

mean([axis, keepdims, split_every])

Mean of array object over one or more axes.

min([axis, keepdims, split_every])

Minimum of array object over one or more axes.

no_base_chunks()

Rechunk to remove chunks across the base dimensions.

rechunk(chunks, **kwargs)

Rechunk dask array.

select_block(index, chunks)

Select a block from the ensemble.

set_ensemble_axes_metadata(axes_metadata, axis)

Sets the axes metadata of an ensemble axis.

squeeze([axis])

Remove axes of length one from array object.

std([axis, keepdims, split_every])

Standard deviation of array object over one or more axes.

sum([axis, keepdims, split_every])

Sum of array object over one or more axes.

to_cpu()

Move the array to the host memory from an arbitrary source array.

to_data_array()

Convert ArrayObject to a xarray DataArray.

to_gpu([device])

Move the array from the host memory to a gpu.

to_hyperspy()

Convert ArrayObject to a Hyperspy signal.

to_tiff(filename, **kwargs)

Write data to a tiff file.

to_zarr(url[, compute, overwrite])

Write data to a zarr file.

Attributes

array

Underlying array describing the array object.

axes_metadata

List of AxisMetadata.

base_axes_metadata

List of AxisMetadata of the base axes.

base_dims

Number of base dimensions.

base_shape

Shape of the base axes of the underlying array.

device

The device where the array is stored.

dtype

Datatype of array.

ensemble_axes_metadata

List of AxisMetadata of the ensemble axes.

ensemble_dims

Number of ensemble dimensions.

ensemble_shape

Shape of the ensemble axes of the underlying array.

is_complex

True if array is complex.

is_lazy

True if array is lazy.

metadata

Metadata stored as a dictionary.

shape

Shape of the underlying array.

apply_transform(transform, max_batch='auto')[source]#

Transform the wave functions by a given transformation.

Parameters:
  • transform (ArrayObjectTransform) – The array object transformation to apply.

  • max_batch (int or str, optional) – The number of wave functions in each chunk of the Dask array. If ‘auto’ (default), the batch size is automatically chosen based on the abtem user configuration settings “dask.chunk-size” and “dask.chunk-size-gpu”.

Returns:

transformed_array_object – The transformed array object.

Return type:

ArrayObjectTransform

property array: ndarray | Array#

Underlying array describing the array object.

property axes_metadata: AxesMetadataList#

List of AxisMetadata.

property base_axes_metadata: list[AxisMetadata]#

List of AxisMetadata of the base axes.

property base_dims#

Number of base dimensions.

property base_shape: tuple[int, ...]#

Shape of the base axes of the underlying array.

compute(progress_bar=None, profiler=False, resource_profiler=False, **kwargs)[source]#

Turn a lazy abTEM object into its in-memory equivalent.

Parameters:
  • progress_bar (bool) – Display a progress bar in the terminal or notebook during computation. The progress bar is only displayed with a local scheduler.

  • profiler (bool) – Return the Profiler class used to profile Dask’s execution at the task level. Only execution with a local scheduler is profiled.

  • resource_profiler (bool) – Return the ResourceProfiler class used to profile Dask’s execution at the resource level.

  • kwargs – Additional keyword arguments passed to dask.compute.

copy()#

Make a copy.

copy_to_device(device)[source]#

Copy array to specified device.

Parameters:

device (str) –

Returns:

object_on_device

Return type:

T

property device: str#

The device where the array is stored.

property dtype: base#

Datatype of array.

property ensemble_axes_metadata#

List of AxisMetadata of the ensemble axes.

ensemble_blocks(chunks=None)#

Split the ensemble into an array of smaller ensembles.

Parameters:

chunks (iterable of tuples) – Block sizes along each dimension.

Return type:

Array

property ensemble_dims#

Number of ensemble dimensions.

property ensemble_shape: tuple[int, ...]#

Shape of the ensemble axes of the underlying array.
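
The relationship between shape, ensemble_shape and base_shape can be sketched with plain NumPy. The sketch assumes that the ensemble axes are the leading axes of the array and the base axes are the trailing base_dims axes. The helper split_shape is hypothetical and is not part of abTEM.

```python
import numpy as np


def split_shape(shape, base_dims):
    """Split a full array shape into (ensemble_shape, base_shape).

    Hypothetical helper: assumes the base axes are the trailing axes.
    """
    if not 0 <= base_dims <= len(shape):
        raise ValueError("base_dims must be between 0 and len(shape)")
    ensemble_dims = len(shape) - base_dims
    return tuple(shape[:ensemble_dims]), tuple(shape[ensemble_dims:])


# A 3 x 4 ensemble of images, each 64 x 64 pixels (base_dims = 2).
array = np.zeros((3, 4, 64, 64))
ensemble_shape, base_shape = split_shape(array.shape, base_dims=2)
```

Under these assumptions, ensemble_dims and base_dims always sum to the number of dimensions of the underlying array.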

ensure_lazy(chunks='auto')[source]#

Creates an equivalent lazy version of the array object.

Parameters:

chunks (int or tuple or str) – How to chunk the array. See dask.array.from_array.

Returns:

lazy_array_object – Lazy version of the array object.

Return type:

ArrayObject or subclass of ArrayObject

expand_dims(axis=None, axis_metadata=None)[source]#

Expand the shape of the array object.

Parameters:
  • axis (int or tuple of ints) – Position in the expanded axes where the new axis (or axes) is placed.

  • axis_metadata (AxisMetadata or List of AxisMetadata, optional) – The axis metadata describing the expanded axes. Default is UnknownAxis.

Returns:

expanded – View of array object with the number of dimensions increased.

Return type:

ArrayObject or subclass of ArrayObject

classmethod from_array_and_metadata(array, axes_metadata, metadata)[source]#

Creates array object from a given array and metadata.

Parameters:
  • array (array) – Complex array defining one or more 2D wave functions. The second-to-last and last dimensions are the wave function y- and x-axis, respectively.

  • axes_metadata (list of AxesMetadata) – Axis metadata for each axis. The axis metadata must be compatible with the shape of the array. The last two axes must be RealSpaceAxis.

  • metadata (dict) – A dictionary defining wave function metadata. All items will be added to the metadata of measurements derived from the waves. The metadata must contain the electron energy [eV].

Returns:

wave_functions – The created wave functions.

Return type:

Waves

classmethod from_zarr(url, chunks='auto')[source]#

Read an array object from a zarr file.

Parameters:
  • url (str) – Location of the data, typically a path to a local file. A URL can also include a protocol specifier like s3:// for remote data.

  • chunks (tuple of ints or tuples of ints) – Passed to dask.array.from_array(). Allows setting the chunks on initialisation if the chunking scheme in the on-disk dataset is not optimal for the calculations to follow.

Return type:

TypeVar(T, bound= ArrayObject)

generate_blocks(chunks=1)#

Generate chunks of the ensemble.

Parameters:

chunks (iterable of tuples) – Block sizes along each dimension.

generate_ensemble(keepdims=False)[source]#

Generate every member of the ensemble.

Parameters:

keepdims (bool, optional) – If True, all ensemble axes are left in the result as dimensions with size one. Default is False.

Yields:

ArrayObject or subclass of ArrayObject – Member of the ensemble.
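
The effect of keepdims when iterating over an ensemble can be illustrated with a NumPy-only sketch. It assumes that the ensemble axes are the leading axes. The generator iterate_ensemble is a hypothetical stand-in, not abTEM's implementation, and it yields the ensemble index alongside each member purely for illustration.

```python
import numpy as np


def iterate_ensemble(array, ensemble_dims, keepdims=False):
    """Yield (index, member) for every combination of ensemble indices.

    Hypothetical helper: assumes the ensemble axes are the leading axes.
    """
    ensemble_shape = array.shape[:ensemble_dims]
    for index in np.ndindex(*ensemble_shape):
        if keepdims:
            # Length-one slices keep the ensemble axes as size-one dimensions.
            key = tuple(slice(i, i + 1) for i in index)
        else:
            # Integer indexing drops the ensemble axes.
            key = index
        yield index, array[key]


array = np.zeros((2, 3, 8, 8))
members = list(iterate_ensemble(array, ensemble_dims=2))
members_kept = list(iterate_ensemble(array, ensemble_dims=2, keepdims=True))
```

With keepdims=False each member has only the base shape; with keepdims=True every ensemble axis remains with length one.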

get_items(items, keepdims=False)[source]#

Index the array and the corresponding axes metadata. Only ensemble axes can be indexed.

Parameters:
  • items (int or tuple of int or slice) – The array is indexed according to this.

  • keepdims (bool, optional) – If True, all ensemble axes are left in the result as dimensions with size one. Default is False.

Returns:

indexed_array – The indexed array object.

Return type:

ArrayObject or subclass of ArrayObject

property is_complex: bool#

True if array is complex.

property is_lazy: bool#

True if array is lazy.

max(axis=None, keepdims=False, split_every=2)[source]#

Maximum of array object over one or more axes. Only ensemble axes can be reduced.

Parameters:
  • axis (int or tuple of ints, optional) – Axis or axes along which the maxima are calculated. The default is to compute the maximum of the flattened array. If this is a tuple of ints, the maxima are calculated over multiple axes. The indicated axes must be ensemble axes.

  • keepdims (bool, optional) – If True, the reduced axes are left in the result as dimensions with size one. Default is False.

  • split_every (int) – Only used for lazy arrays. See dask.array.reductions.

Returns:

reduced_array – The reduced array object.

Return type:

ArrayObject or subclass of ArrayObject

mean(axis=None, keepdims=False, split_every=2)[source]#

Mean of array object over one or more axes. Only ensemble axes can be reduced.

Parameters:
  • axis (int or tuple of ints, optional) – Axis or axes along which the means are calculated. The default is to compute the mean of the flattened array. If this is a tuple of ints, the mean is calculated over multiple axes. The indicated axes must be ensemble axes.

  • keepdims (bool, optional) – If True, the reduced axes are left in the result as dimensions with size one. Default is False.

  • split_every (int) – Only used for lazy arrays. See dask.array.reductions.

Returns:

reduced_array – The reduced array object.

Return type:

ArrayObject or subclass of ArrayObject
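
The axis and keepdims arguments of the reduction methods (max, mean, min, std, sum) mirror the NumPy reductions. The sketch below uses plain NumPy on an array whose layout is assumed to be two ensemble axes followed by two base axes; it shows only the resulting shapes and does not touch abTEM's axes metadata.

```python
import numpy as np

# Assumed layout: 2 ensemble axes (3, 4) followed by 2 base axes (8, 8).
array = np.arange(3 * 4 * 8 * 8, dtype=float).reshape(3, 4, 8, 8)
ensemble_axes = (0, 1)

# Reduce a single ensemble axis; the base axes are untouched.
mean_first = array.mean(axis=0)

# Reduce all ensemble axes, once dropping them and once keeping
# them as dimensions of size one.
mean_all = array.mean(axis=ensemble_axes)
mean_all_kept = array.mean(axis=ensemble_axes, keepdims=True)
```

The keepdims=True result can be broadcast against the original array, which is convenient when, for example, subtracting an ensemble mean from every member.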

property metadata#

Metadata stored as a dictionary.

min(axis=None, keepdims=False, split_every=2)[source]#

Minimum of array object over one or more axes. Only ensemble axes can be reduced.

Parameters:
  • axis (int or tuple of ints, optional) – Axis or axes along which the minima are calculated. The default is to compute the minimum of the flattened array. If this is a tuple of ints, the minima are calculated over multiple axes. The indicated axes must be ensemble axes.

  • keepdims (bool, optional) – If True, the reduced axes are left in the result as dimensions with size one. Default is False.

  • split_every (int) – Only used for lazy arrays. See dask.array.reductions.

Returns:

reduced_array – The reduced array object.

Return type:

ArrayObject or subclass of ArrayObject

no_base_chunks()[source]#

Rechunk to remove chunks across the base dimensions.

rechunk(chunks, **kwargs)[source]#

Rechunk dask array.

Parameters:
  • chunks (int or tuple or str) – How to rechunk the array. See dask.array.rechunk.

  • kwargs – Additional keyword arguments passed to dask.array.rechunk.

select_block(index, chunks)#

Select a block from the ensemble.

Parameters:
  • index (tuple of ints) – Index of selected block.

  • chunks (iterable of tuples) – Block sizes along each dimension.
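
One way to read the chunks argument (block sizes along each dimension) together with a block index is sketched below in pure Python. The helper block_slices is hypothetical and is not how abTEM necessarily implements select_block; it only shows how a block index and a chunks specification could map to slices of the ensemble axes.

```python
from itertools import accumulate


def block_slices(index, chunks):
    """Return one slice per dimension for the block at `index`.

    Hypothetical helper: `chunks` holds the block sizes along each
    dimension, e.g. ((2, 1), (4,)) for a (3, 4) ensemble.
    """
    slices = []
    for i, sizes in zip(index, chunks):
        stops = list(accumulate(sizes))  # cumulative end of each block
        start = stops[i] - sizes[i]      # start = end minus block size
        slices.append(slice(start, stops[i]))
    return tuple(slices)


# A (3, 4) ensemble split into blocks of 2 and 1 along the first axis
# and a single block of 4 along the second axis.
chunks = ((2, 1), (4,))
first = block_slices((0, 0), chunks)
second = block_slices((1, 0), chunks)
```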

set_ensemble_axes_metadata(axes_metadata, axis)[source]#

Sets the axes metadata of an ensemble axis.

Parameters:
  • axes_metadata (AxisMetadata) – The new axis metadata.

  • axis (int) – The axis to set.

property shape: tuple[int, ...]#

Shape of the underlying array.

squeeze(axis=None)[source]#

Remove axes of length one from array object.

Parameters:

axis (int or tuple of ints, optional) – Selects a subset of the entries of length one in the shape.

Returns:

squeezed – The input array object, but with all or a subset of the dimensions of length 1 removed.

Return type:

ArrayObject or subclass of ArrayObject
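
The shape changes made by expand_dims and squeeze follow the NumPy functions of the same name. The sketch below uses plain NumPy only; the abTEM methods also take or update axis metadata, which this example does not attempt to reproduce.

```python
import numpy as np

array = np.zeros((5, 16, 16))

# Insert one new leading axis of length one, then two at once.
expanded = np.expand_dims(array, axis=0)
expanded_twice = np.expand_dims(array, axis=(0, 1))

# Without an axis argument every length-one axis is removed;
# with an axis argument only the selected axis is removed.
squeezed_all = np.squeeze(expanded_twice)
squeezed_one = np.squeeze(expanded_twice, axis=0)
```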

std(axis=None, keepdims=False, split_every=2)[source]#

Standard deviation of array object over one or more axes. Only ensemble axes can be reduced.

Parameters:
  • axis (int or tuple of ints, optional) – Axis or axes along which the standard deviations are calculated. The default is to compute the standard deviation of the flattened array. If this is a tuple of ints, the standard deviations are calculated over multiple axes. The indicated axes must be ensemble axes.

  • keepdims (bool, optional) – If True, the reduced axes are left in the result as dimensions with size one. Default is False.

  • split_every (int) – Only used for lazy arrays. See dask.array.reductions.

Returns:

reduced_array – The reduced array object.

Return type:

ArrayObject or subclass of ArrayObject

sum(axis=None, keepdims=False, split_every=2)[source]#

Sum of array object over one or more axes. Only ensemble axes can be reduced.

Parameters:
  • axis (int or tuple of ints, optional) – Axis or axes along which the sums are performed. The default is to compute the sum of the flattened array. If this is a tuple of ints, the sum is performed over multiple axes. The indicated axes must be ensemble axes.

  • keepdims (bool, optional) – If True, the reduced axes are left in the result as dimensions with size one. Default is False.

  • split_every (int) – Only used for lazy arrays. See dask.array.reductions.

Returns:

reduced_array – The reduced array object.

Return type:

ArrayObject or subclass of ArrayObject

to_cpu()[source]#

Move the array to the host memory from an arbitrary source array.

Return type:

TypeVar(T, bound= ArrayObject)

to_data_array()[source]#

Convert ArrayObject to a xarray DataArray.

to_gpu(device='gpu')[source]#

Move the array from the host memory to a gpu.

Return type:

TypeVar(T, bound= ArrayObject)

to_hyperspy()[source]#

Convert ArrayObject to a Hyperspy signal.

to_tiff(filename, **kwargs)[source]#

Write data to a tiff file.

Parameters:
  • filename (str) – The filename of the file to write.

  • kwargs – Keyword arguments passed to tifffile.imwrite.

to_zarr(url, compute=True, overwrite=False, **kwargs)[source]#

Write data to a zarr file.

Parameters:
  • url (str) – Location of the data, typically a path to a local file. A URL can also include a protocol specifier like s3:// for remote data.

  • compute (bool) – If True, compute immediately; otherwise return a dask.delayed.Delayed.

  • overwrite (bool) – If the given array already exists, overwrite=False will cause an error, whereas overwrite=True will replace the existing data.

  • kwargs – Keyword arguments passed to dask.array.to_zarr.