Data type STRING in Hive tables is assigned CLOB data type in Watson Query
When you virtualize tables in a Hive data source, Watson Query converts data with type STRING to type CLOB instead of type Varchar when it exceeds the maximum string length.
Symptoms
When you virtualize a Hive table that contains data type STRING that exceeds the default maximum string length (MaxStringSize), the data type of the column is converted to CLOB.
Hive does not give a maximum length for STRING types; therefore, Watson Query virtualizes at a specific length to avoid truncation. However, you can adjust the maximum length.
The following example is from a Hive table with strings where you can adjust the column type to a length suitable for aggregation.
describe table TESTDATA.BASIC_STRING;
Data type Column
Column name schema Data type name Length Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
c1 SYSIBM INTEGER 4 0 Yes
c_chr5 SYSIBM CHARACTER 5 0 Yes
c_vchr10 SYSIBM VARCHAR 10 0 Yes
c_str SYSIBM CLOB 65535 0 Yes
4 record(s) selected.
The following query fails because of the CLOB data
type.
select "c_str", sum("c1") from TESTDATA.BASIC_STRING group by "c_str";
SQL0134N Improper use of a string column, host variable, constant, or function "c_str". SQLSTATE=42907
Resolving the problem
You can alter the column that causes the failure to reduce its size to something that can be used
in aggregation. In the following example,
VARCHAR(200)
is used but you can pick
what is appropriate for the maximum length of your
data.alter nickname TESTDATA.BASIC_STRING alter column "c_str" local type VARCHAR(200);
The same query now
succeeds.
select "c_str", sum("c1") from TESTDATA.BASIC_STRING group by "c_str";
c_str 2
------------------------------------------------------------------------------------------------------------ ---------------
112
382
something a little longer to test string, where we do support pushdown for aggregation 244
a 126
9995-12-31 23:59:59.999999 124
A STRING 118
- 114
0005-01-01 00:00:00.000001 122
8 record(s) selected.