Working with models
Creating a data model from scratch
To create a new ImageModel,
just call its constructor. For example, to create a
new model with the primary array set to its default, simply
provide a shape as the first argument:
from stdatamodels.jwst.datamodels import ImageModel
with ImageModel((1024, 1024)) as im:
...
In this form, the memory for the primary array (typically “data”) is
allocated on init, but additional schema-defined arrays are not.
This is useful if, for example, you don’t
need a data quality array – the memory for such an array will not be
consumed. We can use the info() method to
see everything that’s currently in the model:
im.info()
root (AsdfObject)
├─data (ndarray)
│ ├─shape (tuple)
│ │ ├─[0] (int): 1024
│ │ └─[1] (int): 1024
│ └─dtype (Float32DType): float32
└─meta (dict)
├─model_type (str): ImageModel
└─date (str): 2026-02-26T14:56:12.278
Notice that only the primary array (“data”) is allocated, has the shape we put in,
and has a data type of float32 as defined by the ImageModel schema.
To set additional arrays to their default values, use the
get_default method:
im.dq = im.get_default("dq")
im.info()
root (AsdfObject)
├─data (ndarray)
│ ├─shape (tuple)
│ │ ├─[0] (int): 1024
│ │ └─[1] (int): 1024
│ └─dtype (Float32DType): float32
├─meta (dict)
│ ├─model_type (str): ImageModel
│ └─date (str): 2026-02-26T14:56:12.278
└─dq (ndarray)
├─shape (tuple)
│ ├─[0] (int): 1024
│ └─[1] (int): 1024
└─dtype (UInt32DType): uint32
As you can see, the dq array is now allocated with the same shape as the primary array.
Note
In previous versions of stdatamodels, allocating memory for an
attribute and setting it to its default value could be achieved using e.g.
im.dq = im.dq. This syntax took advantage of a bug wherein data
arrays were created on access, leading to unexpected behaviors. This construction
no longer sets dq to a default array.
Use get_default as shown above instead.
If you already have data in numpy arrays, you can also create a model using those arrays by passing it in as a data keyword argument:
data = np.empty((50, 50))
dq = np.empty((50, 50))
with ImageModel(data=data, dq=dq) as im:
...
Loading a data model from a file
The open() function is a convenient way to create a
model from a file on disk. It may be passed any of the following:
a path to a FITS file
a path to an ASDF file
a
astropy.io.fits.HDUListobjecta readable file-like object
The file will be opened, and based on the nature of the data in the
file, the correct data model class will be returned. For example, if
the file contains 2-dimensional data, an ImageModel instance will be
returned. You will generally want to instantiate a model using a
with statement so that the file will be closed automatically when
exiting the with block.:
from stdatamodels.jwst import datamodels
with datamodels.open("myimage.fits") as im:
assert isinstance(im, datamodels.ImageModel)
If you know the type of data stored in the file, or you want to ensure that what is being loaded is of a particular type, use the constructor of the desired concrete class. For example, if you want to ensure that the file being opened contains 2-dimensional image data:
from stdatamodels.jwst.datamodels import ImageModel
with ImageModel("myimage.fits") as im:
# raises exception if myimage.fits is not an image file
pass
This will raise an exception if the file contains data of the wrong shape.
Saving a data model to a file
Simply call the save() method on the model instance. The format to
save into will either be deduced from the filename (if provided) or
the format keyword argument:
im.save("myimage.fits")
Note
Unlike astropy.io.fits, save() always clobbers the output file.
Reading Metadata Only
The datamodels.open method loads the entire file into memory and validates it against
its schema. While this is a good thing in most cases, there are times when read-only
access to metadata is useful.
To access the metadata without loading the entire file, use the
datamodels.read_metadata method. For example, to access the s_region, use
the following code:
from stdatamodels.jwst.datamodels import read_metadata
meta = read_metadata("myfile.fits")
print(meta["meta.wcsinfo.s_region"])
Notice that the metadata is returned as a flat dictionary by default.
The keys are the dot-separated names of the metadata elements, and
the values are the corresponding values in the file. A nested dictionary
will be returned instead if the flatten keyword argument is set to False.
Warning
This method bypasses schema validation, so use it with caution.
It also only returns metadata that is mapped to FITS keywords,
so some useful items (e.g., meta.wcs) will be missing.
Looking at the contents of a model
Use info() to look at the contents of a data model. It renders
the underlying ASDF tree starting at the root or a specified node.
The number of displayed rows is controlled by the max_row argument:
im.info()
root.tree (AsdfObject)
├─asdf_library (Software)
│ ├─author (str): Space Telescope Science Institute
│ ├─homepage (str): http://github.com/spacetelescope/asdf
│ ├─name (str): asdf
│ └─version (str): 2.5.2a1.dev12+g12aa460
├─history (dict)
│ └─extensions (list) ...
├─data (ndarray): shape=(2048, 2048), dtype=float32
├─dq (ndarray): shape=(2048, 2048), dtype=uint32
├─err (ndarray): shape=(2048, 2048), dtype=float32
├─meta (dict)
│ ├─aperture (dict) ...
│ ├─bunit_data (str): DN/s
│ ├─bunit_err (str): DN/s
│ ├─cal_step (dict) ...
│ ├─calibration_software_revision (str): 3bfd782b
│ ├─calibration_software_version (str): 0.14.3a1.dev133+g3bfd782b.d20200216
│ ├─coordinates (dict) ...
│ └─28 not shown
├─var_poisson (ndarray): shape=(2048, 2048), dtype=float32
├─var_rnoise (ndarray): shape=(2048, 2048), dtype=float32
└─extra_fits (dict) ...
Some nodes not shown.
Searching a model
search() can be used to search the ASDF tree by key or
value:
im.search(key='filter')
root.tree (AsdfObject)
└─meta (dict)
├─instrument (dict)
│ └─filter (str): F170LP
└─ref_file (dict)
└─filteroffset (dict)
Data Arrays
Most datamodels have one or more data arrays as attributes.
For most datamodels containing science data, the primary data array is
accessible through the data attribute. Other commonly used data arrays
are dq (data quality) and err (uncertainty/error).
The data, err, dq, etc., attributes of most models are assumed to be
numpy.ndarray arrays, or at least objects that have some of the attributes
of these arrays. numpy is used explicitly to create these arrays in some
cases (e.g., when a default value is needed). The data and err arrays
are a floating point type, and the data quality arrays are an integer type.
Metadata
Metadata information associated with a data model is accessed through
its meta member. For example, to access the date that an
observation was made:
print(model.meta.observation.date)
Metadata values are automatically type-checked against the schema when
they are set. Therefore, setting a keyword which expects a number to a
string will raise a ValidationWarning:
>>> from stdatamodels.jwst.datamodels import ImageModel
>>> model = ImageModel()
>>> model.meta.target.ra = "foo"
Traceback (most recent call last):
...
ValidationWarning: While validating ra the following error occurred: 'foo' is not of type 'number'
Such validation warnings are typically promoted to errors in the JWST pipeline.
The set of available metadata elements is defined in a YAML Schema
that ships with stdatamodels.
There is also a utility method for finding elements in the metadata
schema. search_schema will search the schema for the given
substring in metadata names as well as their documentation. The
search is case-insensitive:
>>> from stdatamodels.jwst.datamodels import ImageModel
>>> # Create a model of the desired type
>>> model = ImageModel()
>>> # Call search_schema on it to find possibly related elements.
>>> model.search_schema('target')
meta.target
meta.target.catalog_name
meta.target.category
meta.target.dec
meta.target.dec_uncertainty
meta.target.description
meta.target.proper_motion_dec
meta.target.proper_motion_epoch
meta.target.proper_motion_ra
meta.target.proposer_dec
meta.target.proposer_name
meta.target.proposer_ra
meta.target.ra
meta.target.ra_uncertainty
meta.target.source_type
meta.target.source_type_apt
meta.target.type
meta.visit.internal_target
An alternative method to get and set metadata values is to use a dot-separated name as a dictionary lookup. This is useful for databases, such as CRDS, where the path to the metadata element is most conveniently stored as a string. The following two lines are equivalent:
print(model['meta.observation.date'])
print(model.meta.observation.date)
Copying a model
To create a new model based on another model, simply use its copy()
method. This will perform a deep-copy: that is, no changes to the
original model will propagate to the new model:
new_model = old_model.copy()
It is also possible to copy all of the known metadata from one
model into a new one using the update() method:
new_model.update(old_model)
History information
Models contain a list of history records, accessed through the
history attribute. This is just an ordered list of strings –
nothing more sophisticated.
To get to the history:
entries = model.history
for entry in entries:
pass
To add an entry to the history, first create the entry by calling
stdatamodels.util.create_history_entry and appending the entry to the model
history:
import stdatamodels
entry = stdatamodels.util.create_history_entry("Processed through the frobulator step")
model.history.append(entry)
These history entries are stored in HISTORY keywords when saving
to FITS format. As an option, history entries can contain a dictionary
with a description of the software used. The dictionary must have the
following keys:
name: The name of the softwareauthor: The author or institution that produced the softwarehomepage: A URI to the homepage of the softwareversion: The version of the software
The calling sequence to create a history entry with the software description is:
entry = stdatamodels.util.create_history_entry(description, software=software_dict)
where the second argument is the dictionary with the keywords mentioned.
Working with lists
Unlike ordinary Python lists, lists in the schema may be restricted to
only accept a certain set of values. Items may be added to lists in
two ways: by passing a dictionary containing the desired key/value
pairs for the object, or using the lists special method item to
create a metadata object and then assigning that to the list.
For example, suppose the metadata element meta.transformations is a
list of transformation objects, each of which has a type (string)
and a coeff (number) member. We can assign elements to the list in
the following equivalent ways:
trans = model.meta.transformations.item()
trans.type = 'SIN'
trans.coeff = 42.0
model.meta.transformations.append(trans)
model.meta.transformations.append({'type': 'SIN', 'coeff': 42.0})
When accessing the items of the list, the result is a normal metadata object where the attributes are type-checked:
>>> trans = model.meta.transformations[0]
>>> print(trans)
<jwst.datamodels.schema.Transformations object at 0x123a810>
>>> print(trans.type)
SIN
>>> trans.type = 42.0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "site-packages/jwst.datamodels/schema.py", line 672, in __setattr__
object.__setattr__(self, attr, val)
File "site-packages/jwst.datamodels/schema.py", line 490, in __set__
val = self.to_basic_type(val)
File "site-packages/jwst.datamodels/schema.py", line 422, in to_basic_type
raise ValueError(e.message)
ValueError: 42.0 is not of type u'string'
Environment Variables
There are a number of environment variables that affect how models are read.
- PASS_INVALID_VALUES
Used by
JwstDataModelwhen instantiating a model from a file. IfTrue, values that do not validate the schema will still be added to the metadata. IfFalse, they will be set toNone. Default isFalse.- STRICT_VALIDATION
Used by
JwstDataModelwhen instantiating a model from a file. IfTrue, schema validation errors will generate an exception. IfFalse, they will generate a warning. Default isFalse.
For flag or boolean variables, any value in ('true', 't', 'yes', 'y') or a
non-zero number, will evaluate as True. Any value in ('false', 'f', 'no',
'n', '0') will evaluate as False. The values are case-insensitive.
All of the environment variables have equivalent function arguments in the API for the relevant code. The environment variables are used only if explicit values had not been used in a script. In other words, values in code override environment variables.