0 / 0
Data types STRING, TEXT, and VARCHAR in Snowflake tables are assigned CLOB data type in Data Virtualization
Last updated: Nov 26, 2024
Data types STRING, TEXT, and VARCHAR in Snowflake tables are assigned CLOB data type in Data Virtualization

When you virtualize tables in a Snowflake data source, Data Virtualization converts data with types STRING, TEXT, and VARCHAR to type CLOB instead of type VARCHAR when they exceed the maximum string length.

Symptoms

When you virtualize a Snowflake table that contains data types STRING, TEXT, and VARCHAR that exceed the default maximum string length of 32 K for the MaxStringSize parameter, the data type of the column is converted to CLOB.

Snowflake designates a maximum length of 16,777,216 for STRING, TEXT, and VARCHAR types if the maximum length is not user-defined. Therefore, Data Virtualization virtualizes the data type to a specific length to avoid truncation. However, you can adjust the maximum string length of these data types to avoid conversion to CLOB by setting the string size to a value less than 32 K.

The following example is from a Snowflake table with strings where you can adjust the column type to a length suitable for aggregation.

describe table TESTDATA.BASIC_STRING;

                                Data type                     Column
Column name                     schema    Data type name      Length     Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
c1                              SYSIBM    INTEGER                      4     0 Yes
c_chr5                          SYSIBM    CHARACTER                    5     0 Yes
c_vchr10                        SYSIBM    VARCHAR                     10     0 Yes
c_str                           SYSIBM    CLOB                     65535     0 Yes
  4 record(s) selected.
The following query fails because of the CLOB data type.
select "c_str", sum("c1") from TESTDATA.BASIC_STRING group by "c_str";
SQL0134N  Improper use of a string column, host variable, constant, or function "c_str".  SQLSTATE=42907

Resolving the problem

You can alter the column that causes the failure to reduce its size to something that can be used in aggregation. In the following example, VARCHAR(200) is used but you can pick what is appropriate for the maximum length of your data.
alter nickname TESTDATA.BASIC_STRING alter column "c_str" local type VARCHAR(200);
The same query now succeeds.
select "c_str", sum("c1") from TESTDATA.BASIC_STRING group by "c_str";

c_str                                                                                                                      2
------------------------------------------------------------------------------------------------------------ ---------------
                                                                                                                         112
                                                                                                                         382
something a little longer to test string, where we do support pushdown for aggregation                                   244
a                                                                                                                        126
9995-12-31 23:59:59.999999                                                                                               124
A STRING                                                                                                                 118
-                                                                                                                        114
0005-01-01 00:00:00.000001                                                                                               122
  8 record(s) selected.