DataModel

class stdatamodels.DataModel(init=None, schema=None, pass_invalid_values=None, strict_validation=None, validate_on_assignment=None, validate_arrays=False, ignore_missing_extensions=True, ignore_unrecognized_tag=False, **kwargs)

Bases: ObjectNode

Base class of all of the data models.

Initialize a data model.

Parameters:
init : str, tuple, astropy.io.fits.HDUList, ndarray, dict, None
  • None : Create a default data model with no shape.

  • tuple : Shape of the data array. Initialize with default data array with shape specified by the tuple.

  • file path: Initialize from the given file (FITS or ASDF)

  • readable file object: Initialize from the given file object

  • astropy.io.fits.HDUList : Initialize from the given astropy.io.fits.HDUList.

  • A numpy array: Used to initialize the data array

  • dict: The object model tree for the data model

  • DataModel: Initialize from an existing DataModel instance. This will perform a shallow copy, and will convert between model subtypes as long as their schemas are compatible.

schema : dict or str, optional

The schema used to interpret the elements of the model: either a tree of objects representing a JSON schema or a string naming a schema. If not provided, the schema associated with this class is used.

pass_invalid_values : bool or None

If True, values that do not validate against the schema will be added to the metadata. If False, they will be set to None. If None, the value is taken from the environment variable PASS_INVALID_VALUES; if that variable is not set, the default is False.

strict_validation : bool or None

If True, schema validation errors will generate an exception. If False, they will generate a warning. If None, the value is taken from the environment variable STRICT_VALIDATION; if that variable is not set, the default is False.

validate_on_assignment : bool or None

Defaults to None. If None, the value is taken from the environment variable VALIDATE_ON_ASSIGNMENT, defaulting to True if that variable is not set. If True, attribute assignments are validated at the time of assignment; validation errors generate warnings and the offending values are set to None. If False, schema validation occurs only once, at the time of write; validation errors generate warnings.

validate_arrays : bool

If True, arrays will be validated against ndim, max_ndim, and datatype validators in the schemas.

ignore_missing_extensions : bool

When False, raise warnings when a file is read that contains metadata about extensions that are not available. Defaults to True.

ignore_unrecognized_tag : bool

When False, raise warnings when an unrecognized tag is encountered. When True, ignore unrecognized tags.

**kwargs

Additional keyword arguments are expected to be array-like attributes of the data model. These will be initialized with the given values only if they are defined in the schema and the schema expects an array-like value. Kwargs are only allowed when init is None, a tuple, or a numpy array. Example usage:

model = ImageModel(data=np.ones((10, 10)), dq=np.zeros((10, 10)))
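
For the three flags that accept None (pass_invalid_values, strict_validation, validate_on_assignment), the fallback order described above is: explicit argument, then environment variable, then built-in default. The following is a minimal sketch of that order only. The helper name resolve_flag and the set of spellings treated as true are assumptions made for illustration; they are not part of stdatamodels, and the library's own parsing may differ.

```python
def resolve_flag(value, env_name, default, environ):
    """Explicit argument wins, then the environment variable, then the default."""
    if value is not None:
        return bool(value)
    raw = environ.get(env_name)
    if raw is None:
        return default
    # Assumption: these spellings mean "true"; the library's parsing may differ.
    return raw.strip().lower() in ("1", "true", "t", "yes", "y")


# A plain dict stands in for os.environ so the sketch has no side effects.
fake_environ = {"STRICT_VALIDATION": "true"}

strict = resolve_flag(None, "STRICT_VALIDATION", False, fake_environ)
pass_invalid = resolve_flag(None, "PASS_INVALID_VALUES", False, fake_environ)
on_assignment = resolve_flag(None, "VALIDATE_ON_ASSIGNMENT", True, fake_environ)
explicit = resolve_flag(False, "STRICT_VALIDATION", False, fake_environ)
```

Passing an explicit True or False always bypasses the environment, which is why the last call ignores the variable that is set.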

Attributes Summary

crds_observatory

Get the CRDS observatory code for this model.

history

Get the history as a list of entries.

instance

override_handle

Identify in-memory models where a filepath would normally be used.

schema

Retrieve the schema for this model.

schema_url

The schema URI to validate the model against.

shape

Return the shape of the primary array.

Methods Summary

add_history_entry(description[, software])

Add an entry to the history list.

add_schema_entry(position, new_schema)

Extend the model's schema.

clone(target, source[, deepcopy, memo])

Clone the contents of one model into another.

close()

Close all file references.

copy([memo])

Return a deep copy of this model.

extend_schema(new_schema)

Extend the model's schema using the given schema, by combining it in an "allOf" array.

find_fits_keyword(keyword[, return_result])

Find a reference to a FITS keyword in this model's schema.

get_crds_parameters()

Get the parameters used by CRDS to select references for this model.

get_default(attr)

Retrieve the schema-defined default value of an attribute.

get_dtype(attr)

Retrieve the numpy dtype for an attribute, if defined.

get_primary_array_name()

Retrieve the name of the "primary" array for this model.

getarray_noinit(attribute)

Retrieve array but without initialization.

hasattr(attr)

Check if the node has an attribute in its instance.

info([max_rows, max_cols, show_values, ...])

Print a rendering of this file's tree to stdout.

items()

Iterate over all of the datamodel contents in a flat way.

keys()

Iterate over all of the datamodel contents in a flat way.

on_init(init)

Customize model attributes at the end of __init__.

on_save([path])

Modify the model just before saving to disk.

save(path[, dir_path])

Save to either a FITS or ASDF file, depending on the path.

search([key, type_, value, filter_])

Search this file's tree.

search_schema(substring)

Search the metadata schema for a particular phrase.

to_asdf(init, *args, **kwargs)

Write a data model to an ASDF file.

to_fits(init, *args, **kwargs)

Write a data model to a FITS file.

to_flat_dict([include_arrays])

Return a dictionary of all of the datamodel contents as a flat dictionary.

update(d[, only, extra_fits])

Update this model with the metadata elements from another model.

validate()

Validate the model instance against its schema.

values()

Iterate over all of the datamodel contents in a flat way.

Attributes Documentation

crds_observatory

Get the CRDS observatory code for this model.

Raises:
NotImplementedError

Subclasses should override this method to return a str.

history

Get the history as a list of entries.

Returns:
history : HistoryList

A list of history entries.

instance

override_handle

Identify in-memory models where a filepath would normally be used.

Returns:
str

A string that can be used to identify the model as an in-memory model.

schema

Retrieve the schema for this model.

Returns:
dict

The datamodel schema.

schema_url = None

The schema URI to validate the model against. If None, only basic validation of required metadata properties (filename, model_type) will occur.

shape

Return the shape of the primary array.

Methods Documentation

add_history_entry(description, software=None)

Add an entry to the history list.

Parameters:
description : str

A description of the change.

software : dict or list of dict, optional

A description of the software used. It should not include asdf itself, as that is automatically recorded in the asdf_library entry.

Each dict must have the following keys:

  • name: The name of the software

  • author: The author or institution that produced the software

  • homepage: A URI to the homepage of the software

  • version: The version of the software
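
As an illustration of the shape the software argument is expected to take, the sketch below builds one such dict and checks it for the four keys listed above. The checker missing_software_keys and all of the example values are hypothetical; the library performs its own validation.

```python
REQUIRED_SOFTWARE_KEYS = ("name", "author", "homepage", "version")


def missing_software_keys(software):
    """Return (entry index, key) pairs for every required key that is absent."""
    entries = [software] if isinstance(software, dict) else list(software)
    return [
        (index, key)
        for index, entry in enumerate(entries)
        for key in REQUIRED_SOFTWARE_KEYS
        if key not in entry
    ]


# Hypothetical values, for illustration only.
software = {
    "name": "my_pipeline",
    "author": "Example Institute",
    "homepage": "https://example.org/my_pipeline",
    "version": "1.2.0",
}
incomplete = [software, {"name": "helper_tool", "version": "0.3"}]

complete_report = missing_software_keys(software)
incomplete_report = missing_software_keys(incomplete)
```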

add_schema_entry(position, new_schema)

Extend the model’s schema.

Place the given new_schema at the given dot-separated position in the tree.

Parameters:
position : str

Dot-separated string indicating the position, e.g. meta.instrument.name.

new_schema : dict

Schema tree.

Returns:
self : DataModel

The datamodel with the schema entry added.
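
One way to picture "placing a schema at a dot-separated position" is to wrap the new schema in one object-typed layer per path component. The sketch below does exactly that with plain dicts. The function name nest_schema is hypothetical, and the wrapping shown is an assumption about the general approach rather than a statement of the library's implementation.

```python
def nest_schema(position, new_schema):
    """Wrap new_schema in nested "properties" layers, innermost component first."""
    schema = new_schema
    for part in reversed(position.split(".")):
        schema = {"type": "object", "properties": {part: schema}}
    return schema


nested = nest_schema("meta.instrument.name", {"type": "string"})
leaf = nested["properties"]["meta"]["properties"]["instrument"]["properties"]["name"]
```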

static clone(target, source, deepcopy=False, memo=None)

Clone the contents of one model into another.

Parameters:
target : DataModel

The model to clone into.

source : DataModel

The model to clone from.

deepcopy : bool, optional

If True, perform a deep copy of the source model. If False, perform a shallow copy.

memo : dict, optional

A dictionary to use as a memoization table for deep copy.
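
The practical difference between the two settings of deepcopy is whether nested objects end up shared between source and target. The sketch below shows that difference with plain dicts and the standard copy module; it does not call DataModel.clone, and the tree contents are made up for illustration.

```python
import copy

source_tree = {"meta": {"filename": "a.fits"}, "data": [1, 2, 3]}

shallow = copy.copy(source_tree)      # top level is new, nested objects are shared
deep = copy.deepcopy(source_tree)     # nested objects are duplicated as well

# Mutate the source after copying to see which copy follows the change.
source_tree["data"].append(4)
source_tree["meta"]["filename"] = "b.fits"
```

After the mutation, the shallow copy reflects the new values because it still points at the same list and dict, while the deep copy keeps the values it had at copy time.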

close()

Close all file references.

copy(memo=None)

Return a deep copy of this model.

Parameters:
memo : dict, optional

A dictionary to use as a memoization table for deep copy.

extend_schema(new_schema)

Extend the model’s schema using the given schema, by combining it in an “allOf” array.

Parameters:
new_schema : dict

Schema tree.

Returns:
self : DataModel

The datamodel with its schema updated.
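
In JSON Schema, an "allOf" array requires an instance to satisfy every schema in the array, so combining schemas this way only ever adds constraints. The sketch below shows the combined structure and a toy check that handles the "required" keyword only. Both helper names are hypothetical, and this is not a real JSON Schema validator.

```python
def combine_all_of(current_schema, new_schema):
    """Combine two schemas so that an instance must satisfy both."""
    return {"allOf": [current_schema, new_schema]}


def missing_required(schema, instance):
    """Toy check: collect required keys that are absent, descending into allOf."""
    missing = [key for key in schema.get("required", []) if key not in instance]
    for sub_schema in schema.get("allOf", []):
        missing += missing_required(sub_schema, instance)
    return missing


base = {"type": "object", "required": ["data"]}
extension = {"type": "object", "required": ["dq"]}
combined = combine_all_of(base, extension)

only_data = missing_required(combined, {"data": [1, 2]})
both = missing_required(combined, {"data": [1, 2], "dq": [0, 0]})
```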

find_fits_keyword(keyword, return_result=True)

Find a reference to a FITS keyword in this model’s schema.

This is intended for interactive use, and not for use within library code.

Parameters:
keyword : str

A FITS keyword name.

Returns:
locations : list of str

If return_result is True, a list of the locations in the schema where this FITS keyword is used. Each element is a dot-separated path.
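
To make the "dot-separated path" result concrete, the sketch below walks a small nested schema and collects the paths whose entry names a given keyword. It assumes the keyword is stored under a "fits_keyword" property in each schema entry; that property name, the function find_keyword, and the toy schema are all assumptions for illustration.

```python
def find_keyword(schema, keyword, path=()):
    """Return dot-separated paths of entries whose "fits_keyword" equals keyword."""
    found = []
    if schema.get("fits_keyword") == keyword:
        found.append(".".join(path))
    for name, sub_schema in schema.get("properties", {}).items():
        found.extend(find_keyword(sub_schema, keyword, path + (name,)))
    return found


# Toy schema, not taken from any real data model.
toy_schema = {
    "properties": {
        "meta": {
            "properties": {
                "instrument": {
                    "properties": {
                        "name": {"type": "string", "fits_keyword": "INSTRUME"},
                    }
                },
                "filename": {"type": "string", "fits_keyword": "FILENAME"},
            }
        }
    }
}

locations = find_keyword(toy_schema, "INSTRUME")
no_match = find_keyword(toy_schema, "NOTHERE")
```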

get_crds_parameters()

Get the parameters used by CRDS to select references for this model.

Raises:
NotImplementedError

Subclasses should override this method to return a dict.

get_default(attr)

Retrieve the schema-defined default value of an attribute.

Parameters:
attr : str

The attribute whose default value is retrieved.

Returns:
object or None

The default value for the given attribute. If the attribute is schema-defined but has no default value in the schema, this will return None.

Raises:
AttributeError

If the given attribute is not defined in the schema.
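
The three outcomes described above (a default, None, or AttributeError) can be sketched against a plain nested schema dict. The function schema_default, the dotted-path lookup, and the toy schema are hypothetical stand-ins, not the library's code.

```python
def schema_default(schema, attr):
    """Look up attr (dot-separated) and return its "default", or None if it has none."""
    node = schema
    for part in attr.split("."):
        properties = node.get("properties", {})
        if part not in properties:
            raise AttributeError(f"{attr!r} is not defined in the schema")
        node = properties[part]
    return node.get("default")


toy_schema = {
    "properties": {
        "meta": {
            "properties": {
                "telescope": {"type": "string", "default": "JWST"},
                "filename": {"type": "string"},
            }
        }
    }
}

with_default = schema_default(toy_schema, "meta.telescope")
without_default = schema_default(toy_schema, "meta.filename")
try:
    schema_default(toy_schema, "meta.unknown")
    raised = False
except AttributeError:
    raised = True
```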

get_dtype(attr)

Retrieve the numpy dtype for an attribute, if defined.

Parameters:
attr : str

The attribute to retrieve the dtype for.

Returns:
numpy.dtype

The numpy dtype for the attribute.

Raises:
AttributeError

If the given attribute is not defined in the schema.

ValueError

If the given attribute is defined in the schema but has no datatype.

get_primary_array_name()

Retrieve the name of the “primary” array for this model.

The primary array controls the size of other arrays that are implicitly created. If the schema has the “data” property, then this method returns “data”. Otherwise, it returns an empty string. This is intended to be overridden in the subclasses if the primary array’s name is not “data”.

Returns:
primary_array_name : str

The name of the primary array.
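
The default rule and the override pattern described above can be sketched with stand-in classes. None of these classes come from stdatamodels; SketchModel only mimics the documented rule ("data" if the schema has it, otherwise an empty string), and the schemas are toy dicts.

```python
class SketchModel:
    """Stand-in base class that mimics the documented default rule."""

    schema = {"properties": {"data": {}, "dq": {}}}

    def get_primary_array_name(self):
        return "data" if "data" in self.schema.get("properties", {}) else ""


class SketchSpectrumModel(SketchModel):
    """Subclass whose primary array is not called "data", so it overrides."""

    schema = {"properties": {"flux": {}, "err": {}}}

    def get_primary_array_name(self):
        return "flux"


class SketchTableModel(SketchModel):
    """Subclass with no "data" property and no override."""

    schema = {"properties": {"table": {}}}


default_name = SketchModel().get_primary_array_name()
overridden_name = SketchSpectrumModel().get_primary_array_name()
empty_name = SketchTableModel().get_primary_array_name()
```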

getarray_noinit(attribute)

Retrieve array but without initialization.

Arrays are initialized on first direct reference if they have not been initialized already. This method circumvents that initialization and raises AttributeError instead.

Parameters:
attribute : str

The attribute to retrieve.

Returns:
value : object

The value of the attribute.

Raises:
AttributeError

If the attribute does not exist.
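
The contrast between a direct reference (which initializes) and getarray_noinit (which does not) can be sketched with a small stand-in class. LazyArrays is hypothetical: it uses nested lists instead of numpy arrays and will create an array for any public name, whereas the library only does so for arrays defined in the schema.

```python
class LazyArrays:
    """Stand-in model: arrays are created on first attribute access."""

    def __init__(self, shape):
        self._shape = shape
        self._arrays = {}

    def __getattr__(self, name):
        # Only reached when normal lookup fails; private names are never arrays.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._arrays:
            rows, cols = self._shape
            self._arrays[name] = [[0.0] * cols for _ in range(rows)]
        return self._arrays[name]

    def getarray_noinit(self, name):
        # Return the array only if it already exists; never create it.
        if name not in self._arrays:
            raise AttributeError(f"{name!r} has not been initialized")
        return self._arrays[name]


model = LazyArrays((2, 3))
try:
    model.getarray_noinit("dq")
    raised_before = False
except AttributeError:
    raised_before = True

created = model.dq                      # direct reference triggers initialization
fetched = model.getarray_noinit("dq")   # now succeeds and returns the same object
```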

hasattr(attr)

Check if the node has an attribute in its instance.

Parameters:
attr : str

The name of the attribute to check for.

Returns:
bool

True if the attribute is in the instance, False otherwise.

info(max_rows=24, max_cols=120, show_values=True, show_blocks=False)

Print a rendering of this file’s tree to stdout.

Parameters:
max_rows : int, tuple, or None, optional

Maximum number of lines to print. Nodes that cannot be displayed will be elided with a message. If int, constrain total number of displayed lines. If tuple, constrain lines per node at the depth corresponding to the tuple index. If None, display all lines.

max_cols : int or None, optional

Maximum length of line to print. Nodes that cannot be fully displayed will be truncated with a message. If int, constrain length of displayed lines. If None, line length is unconstrained.

show_values : bool, optional

Set to False to disable display of primitive values in the rendered tree.

show_blocks : bool, optional

Set to True to also print a table of block header fields for each block in the file.

items()

Iterate over all of the datamodel contents in a flat way.

Each element is a pair (key, value). Each key is a dot-separated name. For example, the schema element meta.observation.date will end up in the result as:

("meta.observation.date", "2012-04-22T03:22:05.432")

keys()

Iterate over all of the datamodel contents in a flat way.

Yields:
key : str

The key of the schema element. Each key is a dot-separated name. For example, the schema element meta.observation.date will end up in the result as the string “meta.observation.date”.

on_init(init)

Customize model attributes at the end of __init__.

Parameters:
init : object

First argument to __init__.

on_save(path=None)

Modify the model just before saving to disk.

This hook can be used, for example, to update values in the metadata that are based on the content of the data.

Override it in the subclass to make it do something, but don’t forget to “chain up” to the base class, since it does things there, too.

Parameters:
path : str

The path to the file that we’re about to save to.
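
The "chain up" advice above amounts to calling super().on_save(path) from the override. The sketch below uses stand-in classes to show the pattern; what SketchBase.on_save does here (recording the file name) is an assumption made so the effect is visible, not a description of the real base class.

```python
import os


class SketchBase:
    """Stand-in base class with its own on_save bookkeeping (assumed behavior)."""

    def __init__(self):
        self.meta = {}

    def on_save(self, path=None):
        if path is not None:
            self.meta["filename"] = os.path.basename(path)


class SketchImage(SketchBase):
    """Subclass that derives a metadata value from its data just before saving."""

    def __init__(self, data):
        super().__init__()
        self.data = data

    def on_save(self, path=None):
        super().on_save(path)               # chain up so base bookkeeping still runs
        self.meta["data_max"] = max(self.data)


image = SketchImage([3.0, 7.5, 1.2])
image.on_save(os.path.join("output", "image.fits"))
```

Leaving out the super() call would still set data_max, but the base class update would be skipped without any error.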

save(path, dir_path=None, *args, **kwargs)

Save to either a FITS or ASDF file, depending on the path.

Parameters:
path : str or callable

File path to save to. If a function, it takes one argument, which is model.meta.filename, and returns the full path string.

dir_path : str, optional

Directory to save to. If not None, this overrides any directory information in the path.

Returns:
output_path : str

The file path the model was saved in.
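
The interaction between a callable path and dir_path can be sketched with os.path alone. The helper resolve_output_path is hypothetical and covers only the path handling described above; it does not write any file or choose between FITS and ASDF.

```python
import os


def resolve_output_path(path, filename, dir_path=None):
    """Resolve the final path from a str or callable path and an optional directory."""
    if callable(path):
        path = path(filename)           # the callable receives model.meta.filename
    head, tail = os.path.split(path)
    if dir_path is not None:
        head = dir_path                 # dir_path replaces any directory in path
    return os.path.join(head, tail)


def add_suffix(filename):
    """Example callable: place the file under products/ with a _cal suffix."""
    root, ext = os.path.splitext(filename)
    return os.path.join("products", root + "_cal" + ext)


plain = resolve_output_path(os.path.join("out", "image.fits"), "image.fits")
redirected = resolve_output_path(
    os.path.join("out", "image.fits"), "image.fits", dir_path="archive"
)
from_callable = resolve_output_path(add_suffix, "image.fits")
```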

search(key=NotSet, type_=NotSet, value=NotSet, filter_=None)

Search this file’s tree.

Parameters:
key : NotSet, str, or any other object

Search query that selects nodes by dict key or list index. If NotSet, the node key is unconstrained. If str, the input is searched among keys/indexes as a regular expression pattern. If any other object, node’s key or index must equal the queried key.

type_ : NotSet, str, or builtins.type

Search query that selects nodes by type. If NotSet, the node type is unconstrained. If str, the input is searched among (fully qualified) node type names as a regular expression pattern. If builtins.type, the node must be an instance of the input.

value : NotSet, str, or any other object

Search query that selects nodes by value. If NotSet, the node value is unconstrained. If str, the input is searched among values as a regular expression pattern. If any other object, node’s value must equal the queried value.

filter_ : callable

Callable that filters nodes by arbitrary criteria. The callable accepts one or two arguments:

  • the node

  • the node’s list index or dict key (optional)

and returns True to retain the node, or False to remove it from the search results.

Returns:
asdf.search.AsdfSearchResult

The result of the search.
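
Two of the matching rules above lend themselves to a short sketch: a str key is treated as a regular expression while any other object must compare equal, and filter_ may accept either the node alone or the node plus its key. The helpers key_matches and call_filter are hypothetical and operate on a flat dict; they are not the asdf search implementation.

```python
import inspect
import re


def key_matches(query, key):
    """A str query is a regex searched in the key; anything else must equal the key."""
    if isinstance(query, str):
        return re.search(query, str(key)) is not None
    return query == key


def call_filter(filter_, node, key):
    """Call filter_ with (node) or (node, key), depending on how many it accepts."""
    if len(inspect.signature(filter_).parameters) == 1:
        return bool(filter_(node))
    return bool(filter_(node, key))


tree = {"name": "NIRCAM", "nints": 3, "ngroups": 5}

regex_hits = [key for key in tree if key_matches("^n.*s$", key)]
large_values = [
    key for key, node in tree.items()
    if call_filter(lambda node: isinstance(node, int) and node > 3, node, key)
]
by_key = [
    key for key, node in tree.items()
    if call_filter(lambda node, key: key.startswith("ni"), node, key)
]
```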

search_schema(substring)

Search the metadata schema for a particular phrase.

This is intended for interactive use, and not for use within library code.

The searching is case insensitive.

Parameters:
substring : str

The substring to search for.

Returns:
locations : list of tuples

The locations within the schema where the element is found.

to_asdf(init, *args, **kwargs)

Write a data model to an ASDF file.

Parameters:
init : file path or file object

The file to write to.

*args

Additional positional arguments passed to asdf.AsdfFile.write_to.

**kwargs

Any additional keyword arguments are passed along to asdf.AsdfFile.write_to.

to_fits(init, *args, **kwargs)

Write a data model to a FITS file.

Parameters:
init : file path or file object

The file to write to.

*args

Additional positional arguments passed to astropy.io.fits.writeto.

**kwargs

Additional keyword arguments passed to astropy.io.fits.writeto.

to_flat_dict(include_arrays=True)

Return a dictionary of all of the datamodel contents as a flat dictionary.

Each dictionary key is a dot-separated name. For example, the schema element meta.observation.date will end up in the dictionary as:

{"meta.observation.date": "2012-04-22T03:22:05.432"}

Parameters:
include_arrays : bool

If True, include arrays in the output. If False, exclude arrays. Default is True.

Returns:
flat_dict : dict

A dictionary of all of the datamodel contents as a flat dictionary.
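
The dot-separated flattening shared by items(), keys(), values(), and to_flat_dict() can be sketched with a recursive walk over a nested dict. The function flatten is hypothetical, and plain lists stand in for arrays here, which is a simplification of how the library distinguishes array values.

```python
def flatten(tree, prefix="", include_arrays=True):
    """Flatten a nested dict into {dot.separated.name: value}."""
    flat = {}
    for key, value in tree.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, include_arrays))
        elif isinstance(value, list):
            # Assumption: lists stand in for arrays in this sketch.
            if include_arrays:
                flat[name] = value
        else:
            flat[name] = value
    return flat


tree = {
    "meta": {
        "observation": {"date": "2012-04-22T03:22:05.432"},
        "filename": "a.fits",
    },
    "data": [1.0, 2.0],
}

with_arrays = flatten(tree)
without_arrays = flatten(tree, include_arrays=False)
```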

update(d, only=None, extra_fits=False)

Update this model with the metadata elements from another model.

update only assigns values to metadata elements that are defined in both this model’s schema and the schema of the source model d (if d is a datamodel). If extra_fits is True, it also updates from the extra_fits subtree. Attributes that do not meet these criteria are silently ignored. update skips arrays and, if present, the WCS object.

Parameters:
d : jwst.datamodels.DataModel or dictionary-like object

The model to copy the metadata elements from. Can also be a dictionary or dictionary of dictionaries or lists, or an stdatamodels.properties.ObjectNode.

only : str, list of str, or None

Update only the named HDU, e.g. only='PRIMARY'. Can be either a string or a list of HDU names. The default is to update all HDUs.

extra_fits : bool

Update from extra_fits. Default is False.
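
The selection rule described for update (copy only what both schemas define, and skip arrays) can be sketched on flat dicts keyed by dot-separated names. The function update_metadata, the use of key sets to represent each schema, and the use of lists as stand-ins for arrays are all assumptions for illustration; the sketch ignores the only and extra_fits options and the WCS case.

```python
def update_metadata(target, source, target_schema_keys, source_schema_keys):
    """Copy values from source into target for keys defined in both schemas."""
    shared = set(target_schema_keys) & set(source_schema_keys)
    for key, value in source.items():
        if key not in shared:
            continue                    # silently ignored, as described above
        if isinstance(value, list):
            continue                    # lists stand in for arrays, which are skipped
        target[key] = value
    return target


target = {"meta.filename": "old.fits", "meta.target": "M31"}
source = {"meta.filename": "new.fits", "meta.exposure": 10.0, "data": [1, 2]}

updated = update_metadata(
    target,
    source,
    target_schema_keys={"meta.filename", "meta.target", "data"},
    source_schema_keys={"meta.filename", "meta.exposure", "data"},
)
```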

validate()

Validate the model instance against its schema.

values()

Iterate over all of the datamodel contents in a flat way.

Yields:
value : object

The value of the schema element.