Building the flow
- Add a Data Asset node that points to pm_customer_train1.csv.
- Add a Type node, and select response as the target field (Role = Target). Set the measure for this field to Flag.
- Set the role to None for the following fields: customer_id, campaign, response_date, purchase, purchase_date, product_id, Rowid, and X_random. These fields will be ignored when you are building the model.
- Click Read Values in the Type node to make sure that values are instantiated.
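For readers who think in code, the role assignments above can be sketched outside the flow editor. This is only an illustration in plain Python, not how the Type node is implemented: the sample rows and the age and income columns are hypothetical, and treating every remaining field as an Input is an assumption about the default role.

```python
import csv
import io

# Hypothetical sample rows standing in for pm_customer_train1.csv.
# The age and income columns are invented for illustration only.
SAMPLE_CSV = """customer_id,campaign,response,response_date,purchase,purchase_date,product_id,Rowid,X_random,age,income
7,2,0,,0,,,1,1,32,13362
13,2,0,,0,,,2,1,45,13453
15,2,1,2006-07-05,0,,,3,3,52,13110
"""

TARGET = "response"
# Fields the tutorial sets to Role = None.
IGNORED = {
    "customer_id", "campaign", "response_date", "purchase",
    "purchase_date", "product_id", "Rowid", "X_random",
}


def assign_roles(columns):
    """Map each column name to a role: Target, None, or Input (assumed default)."""
    roles = {}
    for name in columns:
        if name == TARGET:
            roles[name] = "Target"
        elif name in IGNORED:
            roles[name] = "None"
        else:
            roles[name] = "Input"
    return roles


def read_values(rows, column):
    """Collect the distinct values of a column, loosely like Read Values."""
    return sorted({row[column] for row in rows})


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
roles = assign_roles(rows[0].keys())
target_values = read_values(rows, TARGET)
# A flag field has exactly two distinct values.
is_flag = len(target_values) == 2
```

Scanning the distinct values first is what makes the Flag check possible, which is roughly why the values need to be instantiated before the model is built.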
As we saw earlier, our source data includes information about four different campaigns, each targeted to a different type of customer account. These campaigns are coded as integers in the data, so to make it easier to remember which account type each integer represents, let's define labels for each one.
- On the row for the campaign field, click the entry in the Value mode column.
- Choose Specify from the drop-down.
- Click the Edit icon in the column for the campaign field. Type the labels as shown for each of the four values.
- Click OK. Now the labels will be displayed in output windows instead of the integers.
- Attach a Table node to the Type node.
- Hover over the Table node and click the Run icon.
- In the Outputs panel, double-click the table output to open it.
- Click OK to close the output window.
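Conceptually, value labels are a lookup from the stored integer to a display string, applied only when output is rendered. The sketch below assumes nothing about how the product stores labels. It uses the single pairing this tutorial states (campaign 2 is the Premium account campaign) and leaves every other code unlabeled; the list of sample codes is hypothetical.

```python
# Only the pairing stated in the tutorial is included; the other three
# labels are not given here, so they are deliberately left out.
CAMPAIGN_LABELS = {2: "Premium account"}


def label_for(code):
    """Return the display label for a campaign code, or the code itself as text."""
    return CAMPAIGN_LABELS.get(code, str(code))


# Hypothetical campaign codes as they might appear in a few records.
sample_codes = [2, 3, 2]
displayed = [label_for(code) for code in sample_codes]
```

The underlying data stays an integer throughout; only the displayed value changes, so later steps that test the campaign code still compare against the number.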
Although the data includes information about four different campaigns, you will focus the analysis on one campaign at a time. Since the largest number of records fall under the Premium account campaign (coded campaign=2 in the data), you can use a Select node to include only these records in the flow.
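The effect of this selection can be sketched in plain Python as a simple filter on the campaign code. The records below are hypothetical and exist only to make the example runnable; the Select node itself is configured in the flow editor, not with this code.

```python
from collections import Counter

# Hypothetical records; the real data comes from pm_customer_train1.csv.
records = [
    {"customer_id": 7, "campaign": 2},
    {"customer_id": 13, "campaign": 2},
    {"customer_id": 15, "campaign": 2},
    {"customer_id": 16, "campaign": 3},
    {"customer_id": 23, "campaign": 1},
]

# Count records per campaign to see which one has the most.
counts = Counter(record["campaign"] for record in records)
largest_campaign, largest_count = counts.most_common(1)[0]

# Keep only the Premium account campaign (campaign = 2), discarding the rest.
selected = [record for record in records if record["campaign"] == 2]
```

Counting first confirms which campaign dominates before the other records are discarded from the rest of the flow.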