Add grounding documents to a vector index that can be used to add contextual information to foundation model prompts for retrieval-augmented generation tasks.
- Required permissions: To create vector index assets and associate them with a prompt, you must have the Admin or Editor role in a project.
- Data format: Supported file types differ by vector store. See Supported grounding document file types.
- Data size: Maximum file sizes differ by file type. See Supported grounding document file types.
When you use foundation models for question-answering tasks, you can help the foundation model generate factual and up-to-date answers by adding contextual information to the foundation model prompt. When a foundation model is given factual information as input, it is more likely to incorporate that factual information in its output.
For more information, see Using vectorized text with retrieval-augmented generation tasks.
To make contextual information available to a prompt, first add grounding documents to a vector index asset, and then associate the vector index with a foundation model prompt.
The task of adding grounding documents to an index is depicted in the retrieval-augmented generation diagram by the preprocessing step, where company documents are vectorized.
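The two steps described above (vectorize the grounding documents, then use the index to add context to a prompt) can be sketched in a few lines of Python. This is a toy illustration only, not the product's implementation: the `embed` function is a bag-of-words stand-in for a real embedding model, and every function name, the sample documents, and the prompt wording are assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """Preprocessing step: store each document next to its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 1) -> list[str]:
    """Return the top_k documents whose vectors are closest to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(index: list[tuple[str, Counter]], question: str) -> str:
    """Add the retrieved text to the prompt as contextual information."""
    context = "\n".join(retrieve(index, question))
    return (
        "Answer using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical grounding documents.
docs = [
    "Employees accrue 20 vacation days per year.",
    "The cafeteria is open from 8 am to 3 pm on weekdays.",
]
index = build_index(docs)
prompt = build_prompt(index, "How many vacation days do employees get?")
print(prompt)
```

A real vector index replaces the word counts with dense vectors from an embedding model and the list scan with a vector store query, but the flow is the same.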
Supported vector stores
You can use one of the following vector stores to store your grounding documents:
- In memory: A Chroma database vector index that is associated with your project and provides temporary vector storage.
  Note: The in-memory vector index asset is created for you automatically; you don't need to set up the vector store.
- Elasticsearch: A third-party vector index that you set up and connect to your project.
- watsonx.data Milvus: A third-party vector index that you can set up in watsonx.data, and then connect to your project.
Choosing a vector store
When you create a vector index for your documents, you can choose the vector store to use. To determine the right vector store for your use case, consider the following factors:
- What types of files can the vector store index?
  The supported file types differ by vector store. For details, see Supported grounding document file types.
- What embedding models can be used with the vector store?
  The embedding models that you can use to vectorize documents that you add to the index differ by vector store. For details, see Embedding models and vectorization settings.
- How many grounding documents do you want to be able to search from your foundation model prompts?
When you connect to a third-party vector store, you can choose to do one of the following things:
- Add files to vectorize and store in a new vector index or collection in the vector store.
- Use vectorized data from an existing index or collection in the vector store.
Supported grounding document file types
When you add grounding documents to create a new vector index, you can upload files or connect to a data asset that contains files.
The following table lists the supported file types and maximum file sizes that you can add when you create a new vector index. The supported file types differ by vector store.
File types are listed in the first column. The maximum total file size that is allowed for each file type is listed in the second column. A checkmark (✓) indicates that the vector store that is named in the column header supports the file type that is listed in the first column.
File type | Maximum total file size | In-memory | Elasticsearch | Milvus |
---|---|---|---|---|
CSV | 5 MB | ✓ | ✓ | |
DOCX | 10 MB | ✓ | ✓ | ✓ |
HTML | 5 MB | ✓ | ✓ | |
JSON | 5 MB | ✓ | ✓ | |
PDF | 50 MB | ✓ | ✓ | ✓ |
PPTX | 300 MB | ✓ | ✓ | ✓ |
TXT | 5 MB | ✓ | ✓ | ✓ |
XLSX | 5 MB | ✓ | ✓ | |
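As a rough illustration of how the table can be applied before an upload, the following Python sketch checks a batch of files against the per-type limits for a chosen vector store. It is not part of the product: `LIMITS`, `check_upload`, and the store labels are hypothetical names, only some rows of the table are copied in for brevity, and the sketch assumes that 1 MB means 1024 × 1024 bytes and that the limit applies to the combined size of all files of one type.

```python
from collections import defaultdict
from pathlib import Path

MB = 1024 * 1024  # assumption: "MB" in the table means 1024 * 1024 bytes

# File extension -> (maximum total size in MB, vector stores with a checkmark).
# Only some rows of the table are copied here; the table is the authority.
LIMITS = {
    ".csv": (5, {"in-memory", "elasticsearch"}),
    ".docx": (10, {"in-memory", "elasticsearch", "milvus"}),
    ".html": (5, {"in-memory", "elasticsearch"}),
    ".json": (5, {"in-memory", "elasticsearch"}),
    ".pptx": (300, {"in-memory", "elasticsearch", "milvus"}),
    ".txt": (5, {"in-memory", "elasticsearch", "milvus"}),
    ".xlsx": (5, {"in-memory", "elasticsearch"}),
}


def check_upload(files: list[tuple[str, int]], store: str) -> list[str]:
    """Return a list of problems; an empty list means the batch passes.

    files is a list of (file name, size in bytes) pairs.
    """
    problems = []
    totals = defaultdict(int)  # combined size per file type
    for name, size in files:
        ext = Path(name).suffix.lower()
        if ext not in LIMITS:
            problems.append(f"{name}: file type is not listed in this sketch")
        elif store not in LIMITS[ext][1]:
            problems.append(f"{name}: {ext} is not supported by {store}")
        else:
            totals[ext] += size
    for ext, total in totals.items():
        limit_mb = LIMITS[ext][0]
        if total > limit_mb * MB:
            problems.append(f"{ext}: total size exceeds {limit_mb} MB")
    return problems


# Hypothetical batch: two text files and one CSV file, sent to Milvus.
batch = [("faq.txt", 1 * MB), ("policy.txt", 2 * MB), ("prices.csv", 100)]
for problem in check_upload(batch, "milvus"):
    print(problem)
```

In this batch the two TXT files pass, and the CSV file is reported because the table shows no checkmark for CSV in the Milvus column.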
Supported embedding models
When you upload grounding documents, an embedding model is used to calculate vectors that represent the document text. You can choose the embedding model to use.
For the in-memory and Milvus vector stores, the following embedding models are supported:
- all-MiniLM-L6-v2: Requires a smaller chunk size than the IBM Slate embedding models.
- all-MiniLM-L12-v2: Requires a smaller chunk size than the IBM Slate embedding models.
- granite-embedding-107m-multilingual: Standard sentence transformer model based on bi-encoders and part of the IBM Granite Embeddings suite.
- granite-embedding-278m-multilingual: Standard sentence transformer model based on bi-encoders and part of the IBM Granite Embeddings suite.
- slate-30m-english-rtrvr: IBM model that is faster than the 125m version.
- slate-125m-english-rtrvr: IBM model that is more precise than the 30m version.
- slate-30m-english-rtrvr-v2: Latest version of the IBM model that is faster than the 125m version.
- slate-125m-english-rtrvr-v2: Latest version of the IBM model that is more precise than the 30m version.
For more information about the IBM-provided embedding models, see Supported encoder models.
For the Elasticsearch vector store, ELSER (Elastic Learned Sparse EncodeR) embedding models are supported. For more information, see ELSER – Elastic Learned Sparse EncodeR.
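Because some embedding models need a smaller chunk size than others, it can help to see what chunking does to a document before it is vectorized. The following Python sketch splits text into overlapping chunks. It is an illustration under stated assumptions: the `chunk_text` function is hypothetical, chunk size is measured in characters here, and the sizes of 1000 and 2000 are made-up values, not documented limits for any model.

```python
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a chunk boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and less than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


# Hypothetical settings: a smaller chunk size for a MiniLM-style model,
# a larger one for a Slate-style model. The numbers are illustrative only.
sample = "word " * 1000  # 5000 characters of placeholder text
small_chunks = chunk_text(sample, chunk_size=1000, overlap=100)
large_chunks = chunk_text(sample, chunk_size=2000, overlap=200)
print(len(small_chunks), len(large_chunks))
```

A smaller chunk size produces more vectors for the same document, which is worth keeping in mind when you compare models for a large document set.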