Instantiation is the process of reading or specifying information about a data field, such as
its storage type and values. To conserve system resources, instantiation is a
user-directed process: you tell the software to read values by running data through a Type
node.
Data with unknown types is referred to as uninstantiated. Data whose storage
type and values are unknown is displayed in the Measure column of the Type
node settings as Typeless.
When some information about a field's storage is known, such as whether it is string or numeric,
the data is called partially instantiated. Categorical and
Continuous are partially instantiated measurement levels. For example,
Categorical specifies that the field is symbolic, but it is not yet known whether
the field is nominal, ordinal, or flag.
When all of the details about a type are known, including the values, a fully
instantiated measurement level—nominal, ordinal, flag, or continuous—is displayed in this
column. Note that the continuous type is used for both partially instantiated and fully
instantiated data fields. Continuous data can be either integers or real numbers.
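The three states described above can be summarized in a small lookup. The following Python sketch is illustrative only and is not part of the product: the function name instantiation_state, the State enum, and the values_known flag are hypothetical names introduced here. It assumes that a Continuous field counts as fully instantiated only once its values are known.

```python
from enum import Enum


class State(Enum):
    UNINSTANTIATED = "uninstantiated"
    PARTIAL = "partially instantiated"
    FULL = "fully instantiated"


# Measurement levels that the text lists as fully instantiated.
FULL_LEVELS = {"nominal", "ordinal", "flag"}


def instantiation_state(measure: str, values_known: bool = False) -> State:
    """Map a value from the Measure column to an instantiation state.

    Hypothetical helper: values_known only matters for Continuous, which
    the text says is used for both partial and full instantiation.
    """
    level = measure.strip().lower()
    if level == "typeless":
        return State.UNINSTANTIATED
    if level == "categorical":
        # Symbolic, but not yet known to be nominal, ordinal, or flag.
        return State.PARTIAL
    if level == "continuous":
        return State.FULL if values_known else State.PARTIAL
    if level in FULL_LEVELS:
        return State.FULL
    raise ValueError(f"unknown measurement level: {measure!r}")


if __name__ == "__main__":
    for name in ("Typeless", "Categorical", "Continuous", "Flag"):
        print(name, "->", instantiation_state(name).value)
```

Continuous is the only level that needs the extra flag, because the same label is shown in both the partially and fully instantiated cases.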
When a data flow with a Type node runs, uninstantiated types immediately become partially
instantiated, based on the initial data values. After all of the data passes through the node, all
data becomes fully instantiated unless values were set to Pass. If the flow
run is interrupted, the data remains partially instantiated. After the Types settings are
instantiated, the values of a field are static at that point in the flow. This means that
upstream changes do not affect the values of a particular field, even if you rerun the flow. To
change or update the values based on new data or added manipulations, edit them in the
Types settings or set the value for a field to Read or
Extend.
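The effect of static values on a rerun can be modeled with a short Python sketch. This is a simplified illustration, not the product's implementation: the function run_type_node and the mode name "keep" (standing in for the default behavior of leaving stored values unchanged) are hypothetical. It also assumes that Read replaces the stored values with those in the incoming data, that Extend appends new values to the stored ones, and that Pass reads nothing.

```python
from typing import List, Optional, Sequence


def run_type_node(
    stored: Optional[List[str]],
    incoming: Sequence[str],
    mode: str = "keep",
) -> Optional[List[str]]:
    """Return the values held for one field after a flow run.

    stored   -- values from an earlier run, or None if uninstantiated
    incoming -- values arriving from upstream on this run
    mode     -- "keep" (hypothetical default), "read", "extend", or "pass"
    """
    # Distinct incoming values, in first-seen order.
    seen = list(dict.fromkeys(incoming))

    if mode == "pass":
        # Assumption: no values are read, so nothing changes.
        return stored
    if mode == "read":
        # Assumption: stored values are replaced by what the data contains.
        return seen
    if mode == "extend":
        # Assumption: new values are appended to the existing ones.
        return list(dict.fromkeys([*(stored or []), *seen]))
    if mode == "keep":
        # Once instantiated, values stay static even if upstream data changes.
        return stored if stored is not None else seen
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    values = run_type_node(None, ["a", "b", "a"])      # first run
    print(values)
    print(run_type_node(values, ["c"]))                # rerun, still static
    print(run_type_node(values, ["c"], "read"))        # replaced
    print(run_type_node(values, ["b", "c"], "extend")) # appended
```

In this model, a rerun with changed upstream data leaves the field untouched unless the mode is switched to read or extend, which mirrors the behavior described above.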
When to instantiate
Instantiating in a Type node is useful when:
The dataset is large, and the flow filters a subset prior to the Type node