Keyword Tool

The “keyword tool” (kwtool) compares the FITS keyword definitions in the data model schemas with the Keyword Dictionary.

This tool is only useful for data model schema developers and possibly maintainers of the keyword dictionary. As such it’s not considered part of the public API for this package and is subject to breaking changes at any point.

If a stable API is preferred please open an issue.

Usage

The primary interface is via the command line interface:

python -m stdatamodels.jwst._kwtool path_to_keyword_dictionary

This will generate a HTML report in the working directory that describes the differences.

Parsing

To make comparisons easier both keyword dictionary and data model schema FITS keyword descriptions are first parsed and converted to a common structure, a “keyword definition”.

Keyword definitions

Each FITS keyword is defined by a dictionary containing keys:

  • “scope”: The keyword dictionary top file or data model class name.

  • “path”: The “data model path”

  • “keyword”: The dictionary taken from either the keyword dictionary or data model schema

These definitions are collected from each source (described below) and stored in a dictionary with:

  • key: 2-length tuple of (FITS hdu, FITS keyword) (in upper case)

  • value: list of definitions

Value is a list since each source will list the same (FITS hdu, FITS keyword) multiple times (for each scope).

Keyword dictionary parsing

See Keyword Dictionary for an overview and the code for complete details.

Data model schema parsing

Datamodel schemas are parsed by:

  • Finding all subclasses of JwstDataModel (recursively)

  • Ignore some subclasses (for example ReferenceFileModel, see code full list).

  • For each subclass load the corresponding schema

  • Walk each schema using stdatamodels.schema.walk_schema

  • Construct a “keyword definition” for each dict node with a “fits_keyword” key

Comparison

After both sources are parsed the comparison code will check if the (FITS hdu, FITS keyword) keys match for the two sources. Keys missing from one source (but found in the other) will be reported as a difference.

Ignored keywords

However, some (FITS hdu, FITS keyword) keys are ignored.

Standard keywords

Keywords defined in the FITS standard (BITPIX, BUNIT, etc) are defined in the keyword dictionary but do not need to be defined in the data model schemas. These standard keywords will be removed from both sources prior to comparison (see the code for the corresponding regex).

Pattern keywords

Data model schemas contain “pattern” keywords. These are identified by searching for keywords starting with P_. These are not compared as they are only needed for reference files (to aid in generating rmaps for CRDS) and don’t need to be defined in the keyword dictionary.

Matching keys

If a (FITS hdu, FITS keyword) key is found in both sources the contents of the definitions will be compared. The following comparisons will be performed.

For all comparisons except “path” if a source is missing a required sub-definition (for example if the keyword definition does not have a “title”) then a MISSING_VALUE singleton will be added in place. If both sources are missing an item, both will have MISSING_VALUE and no difference will be reported.

Type

Both the keyword dictionary and data model schemas allow defining a keyword type. There are slight differences between how this is defined in each source (the keyword dictionary uses “float” whereas the data model schemas use “number”).

Enum

Comparison of “enum” definitions involves generating a set of possible values from each source and comparing the sets (so order is not compared). The generated sets include all found “scopes” (so the possible values from the keyword dictionary include possible enum values taken from all top files combined). This combination is done due to the data model schemas using a combined enum for all instruments and modes.

Title

The tool will report differences in “title” definitions between the two sources.

Path (“data model name”)

The paths at which each keyword definition is found is compared by constructing a set of paths for each source then comparing these sets. Sets are used here since each key might appear in multiple top files in the keyword dictionary and in multiple data model schemas.

There are a few instances where “path” won’t be compared. These are:

  • if the keyword dictionary entry does not have an archive destination

  • if the datamodel schema keyword definition is nested in an “items” array

Report format

The report has 3 sections:

  • Keywords in the keyword dictionary but NOT in the datamodel schemas

  • Keywords in the datamodel schemas but NOT in the keyword dictionary

  • Keywords in both with definition differences

Keywords that match (and report no difference) won’t be included in the report.

In each section, click an item to see details about the difference. A short-hand is used in the difference descriptions:

  • kwd: Keyword dictionary

  • dmd: Data model dictionary (derived from the data model schemas)