Reclassifying the data
- Add a Data Asset node that points to drug_long_name.csv.
- Add a Type node after the Data Asset node. Double-click the Type node to open its properties, and select Cholesterol_long as the target.
- Add a Logistic Regression node after the Type node. Double-click the node and select the Binomial procedure (instead of the default Multinomial procedure).
- Hover over the Logistic Regression node and click the Run icon. An error message warns you that the Cholesterol_long string values are too long. When you encounter this type of message, follow the procedure described in the rest of this example to modify your data.
- Add a Reclassify node after the Type node and double-click it to open its properties.
- For the Reclassify Field, select Cholesterol_long and type Cholesterol for the new field name.
- Click Get values to add the Cholesterol_long values to the original value column.
- In the new value column, type High next to the original value of High level of cholesterol and Normal next to the original value of Normal level of cholesterol.
- Add a Filter node after the Reclassify node. Double-click the node, choose Filter the selected fields, and select the Cholesterol_long field.
- Add a Type node after the Filter node. Double-click the node and select Cholesterol as the target.
- Add a Logistic node after the Type node. Double-click the node and select the Binomial procedure.
You can now run the binomial Logistic node and generate a model without the string-length error that you encountered earlier.
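For readers who want to see the data transformation itself, the Reclassify and Filter steps can be approximated outside a flow in plain Python. This is only an illustrative sketch, not how the nodes are implemented: the sample rows, the Age column, and the `reclassify` helper are hypothetical, and only the field names and the two cholesterol values come from this example.

```python
import csv
import io

# Hypothetical sample standing in for drug_long_name.csv.
# The real file has more columns and rows; only Cholesterol_long matters here.
SAMPLE_CSV = """Age,Cholesterol_long
23,High level of cholesterol
47,Normal level of cholesterol
61,High level of cholesterol
"""

# Original value -> new value, as typed into the Reclassify node properties.
VALUE_MAP = {
    "High level of cholesterol": "High",
    "Normal level of cholesterol": "Normal",
}


def reclassify(rows, source_field, new_field, value_map):
    """Add new_field with the mapped short values and drop source_field.

    Dropping source_field mirrors the Filter node step. In this sketch an
    unmapped original value raises KeyError (an assumption made here for
    simplicity, not the node's behavior).
    """
    result = []
    for row in rows:
        # Copy every field except the long-string field being filtered out.
        new_row = {k: v for k, v in row.items() if k != source_field}
        new_row[new_field] = value_map[row[source_field]]
        result.append(new_row)
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
reclassified = reclassify(rows, "Cholesterol_long", "Cholesterol", VALUE_MAP)

for row in reclassified:
    print(row)
```

After this step, each row carries the short Cholesterol value (High or Normal) and no longer contains the Cholesterol_long field, which matches the data that reaches the second Type node in the flow.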
This example only shows part of a flow. For more information about the types of flows in which
you might need to reclassify long strings, see the following example:
- Auto Classifier node. See Automated modeling for a flag target.