Reviewing the auto-generated RAG notebooks
Last updated: Dec 12, 2024

When you save a retrieval-augmented generation (RAG) experiment pipeline, notebooks are automatically generated and saved to the project. Review the indexing and inferencing notebooks and learn how to use them in your RAG solutions.

Important: This feature is a beta release. It is not intended for production use.

Saving a RAG pattern

After you run an experiment, you can review the generated patterns, which are ranked in the leaderboard by their performance against the optimized metric. When you are satisfied with a pattern, you can save it. Saving a pattern generates one or two notebooks, which are saved as project assets.

The notebooks that are generated for a saved RAG pattern depend on the vector store used for the experiment, as follows:

  • The index notebook populates, updates, and maintains the vector index for the document collection. All AutoAI RAG patterns can generate an indexing notebook.
  • The inference notebook provides an endpoint for inferencing against a large language model with retrieval augmentation. Only experiments that use a Milvus database as a vector store generate an inferencing notebook.
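
The rule above can be summarized in a few lines of Python. This is only an illustrative sketch: the function name and the string labels are hypothetical and are not part of any product API.

```python
def notebooks_for_vector_store(vector_store: str) -> list[str]:
    """Return the kinds of notebooks generated for a saved pattern.

    Hypothetical helper that restates the rule in this topic; it does not
    call any product API.
    """
    # Every pattern can generate an indexing notebook.
    notebooks = ["indexing"]
    # Only patterns that use Milvus as the vector store also get an
    # inferencing notebook.
    if vector_store.strip().lower() == "milvus":
        notebooks.append("inference")
    return notebooks


milvus_notebooks = notebooks_for_vector_store("Milvus")
chroma_notebooks = notebooks_for_vector_store("Chroma")
print(milvus_notebooks, chroma_notebooks)
```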

Generating the indexing and inferencing notebooks

After you review your pipelines, follow these steps to save a pipeline and generate the associated notebooks.

  1. From the experiment leaderboard, click the name of a pipeline to view the details.
  2. Click Save. The panel lists the notebook or notebooks that are auto-generated. For example, the following image shows the save panel for a pattern created by using the in-memory Chroma database as a vector store.
     [Image: Saving a pattern as an auto-generated notebook]
  3. Click Create.
  4. Open the notebooks from the associated project to review or run the code. For example, the indexing notebook looks as follows:
     [Image: Viewing the auto-generated indexing notebook]

You can review the notebooks as they are. To run them, add your authentication credentials.

Reviewing the index notebook

The index notebook contains Python code for building the vector database index for your document collection.

The notebook is annotated so that you can review the steps and code for:

  • Retrieving the data to vectorize
  • Chunking the data
  • Creating the embeddings
  • Reading the benchmark data
  • Using the benchmark data to evaluate the quality of the retrieval
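
The steps above can be sketched in plain Python to show how they fit together. This is not the code from the generated notebook. It is a minimal sketch that assumes a toy bag-of-words vector in place of a real embedding model, an in-memory list in place of a vector store, and made-up documents and benchmark questions. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, chunk_size=8, overlap=2):
    """Chunk and embed every document; return (doc_id, chunk, vector) entries."""
    return [
        (doc_id, chunk, embed(chunk))
        for doc_id, text in documents.items()
        for chunk in chunk_text(text, chunk_size, overlap)
    ]


def retrieve(index, question, top_k=1):
    """Return the top_k index entries most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(query, entry[2]), reverse=True)
    return ranked[:top_k]


def hit_rate(index, benchmark, top_k=1):
    """Fraction of benchmark questions whose expected document is retrieved."""
    hits = 0
    for item in benchmark:
        found = {doc_id for doc_id, _, _ in retrieve(index, item["question"], top_k)}
        hits += item["expected_doc"] in found
    return hits / len(benchmark)


# Made-up data for illustration only.
documents = {
    "milvus.txt": "Milvus is a vector database that stores embeddings for similarity search",
    "chroma.txt": "Chroma is an in-memory vector store that is convenient for small experiments",
}
benchmark = [
    {"question": "Which database stores embeddings for similarity search?",
     "expected_doc": "milvus.txt"},
    {"question": "Which in-memory store is convenient for small experiments?",
     "expected_doc": "chroma.txt"},
]

index = build_index(documents)
score = hit_rate(index, benchmark)
print(f"retrieval hit rate: {score:.2f}")
```

The hit rate here is only one simple way to judge retrieval quality. The generated notebook may use different metrics and a different chunking strategy.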

Reviewing the inference notebook

The inference notebook contains Python code to:

  • Retrieve relevant passages from the indexed documents for each user query
  • Generate a response to each user query by passing the retrieved passages to a large language model as context

The notebook is annotated so that you can review the steps and code for:

  • Building the inference Python function by using the RAG pattern that was identified in the experiment
  • Deploying the function as the inference endpoint
  • Testing the retrieval of relevant passages as input for the generated response

Run the inferencing notebook to use the RAG pattern for retrieving and generating answers to questions.
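
The retrieve-then-generate flow can be sketched as a function that is built once and then called for each question. This is not the code from the generated notebook. It is a minimal sketch that assumes word overlap in place of vector search and a placeholder callable in place of a large language model, and it omits the deployment step. All names are hypothetical.

```python
import re


def make_inference_function(passages, generate, top_k=2):
    """Return a function that answers a question from retrieved passages.

    `passages` is a list of strings. `generate` is any callable that maps a
    prompt string to a response string (a placeholder for a model call).
    """

    def tokens(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def retrieve(question):
        # Rank passages by the number of words they share with the question.
        query = tokens(question)
        ranked = sorted(passages, key=lambda p: len(query & tokens(p)), reverse=True)
        return ranked[:top_k]

    def answer(question):
        context = retrieve(question)
        prompt = (
            "Answer the question using only the context below.\n\n"
            + "\n".join(f"- {passage}" for passage in context)
            + f"\n\nQuestion: {question}\nAnswer:"
        )
        return {"answer": generate(prompt), "passages": context}

    return answer


def echo_model(prompt):
    """Placeholder model: returns the first context line instead of calling an LLM."""
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return ""


# Made-up passages for illustration only.
passages = [
    "The indexing notebook builds the vector index for the document collection.",
    "The inference notebook deploys a function as the inference endpoint.",
    "Saved notebooks are stored as project assets.",
]

ask = make_inference_function(passages, echo_model, top_k=2)
result = ask("Which notebook deploys the inference endpoint?")
print(result["answer"])
```

Returning the retrieved passages alongside the answer makes it easier to test retrieval separately from generation, which mirrors the testing step listed above.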

Learn more

Use the indexed documents from this experiment in the Prompt Lab to ground prompts for a foundation model. See Using an AutoAI RAG index to chat with documents.

Parent topic: Creating a RAG experiment
