When you save a retrieval-augmented generation (RAG) experiment pipeline, notebooks are automatically generated and saved to the project. Review the indexing and inference notebooks and learn how to use them in your RAG solutions.
Saving a RAG pattern
After you run an experiment, you can review the generated patterns, which are ranked in the leaderboard according to performance against the optimized metric. When you are satisfied with a pattern, you can save it, which generates one or two notebooks as project assets.
The notebooks that are generated for a saved RAG pattern depend on the vector store used for the experiment, as follows:
- The indexing notebook populates, updates, and maintains the vector index for the document collection. All AutoAI RAG patterns can generate an indexing notebook.
- The inference notebook provides an endpoint for inferencing against a large language model that is augmented with the retrieved context. Only experiments that use a Milvus database as the vector store generate an inference notebook.
Generating the indexing and inference notebooks
After you review your pipelines, follow these steps to save a pipeline and generate the associated notebooks.
- From the experiment leaderboard, click the name of a pipeline to view the details.
- Click Save. The panel lists the notebook or notebooks that are auto-generated. For example, for a pattern created by using the in-memory Chroma database as the vector store, the panel lists only an indexing notebook.
- Click Create.
- Open the notebooks from the associated project to review or run the code.
You can review the notebooks as-is, or run them after you add your authentication credentials.
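For example, the following is a minimal sketch of adding credentials with the ibm-watsonx-ai Python SDK before you run a generated notebook. The endpoint URL, API key, and project ID are placeholders that you replace with your own values.

```python
# A minimal sketch of authenticating before you run a generated notebook.
# The endpoint URL, API key, and project ID are placeholders.
from ibm_watsonx_ai import APIClient, Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # your watsonx.ai endpoint
    api_key="YOUR_IBM_CLOUD_API_KEY",
)
client = APIClient(credentials)
client.set.default_project("YOUR_PROJECT_ID")
```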
Reviewing the indexing notebook
The indexing notebook contains Python code for building the vector database index for your document collection.
The notebook is annotated so that you can review the steps and code for the following tasks, which are sketched in the example after this list:
- Retrieving the data to vectorize
- Chunking the data
- Creating the embeddings
- Reading the benchmark data
- Using the benchmark data to evaluate the quality of the retrieval
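The exact code depends on your experiment settings and vector store. As a rough, illustrative sketch only (not the generated code), the flow resembles the following, which assumes an in-memory Chroma collection, a sentence-transformers embedding model, placeholder document paths, and a benchmark file that pairs each question with the documents that contain its answer:

```python
# An illustrative sketch of the indexing flow; library choices, chunk sizes,
# file paths, and the benchmark format are assumptions, not the generated code.
import json
import chromadb
from chromadb.utils import embedding_functions

def chunk(text, size=500, overlap=50):
    """Split text into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

# Retrieve the data to vectorize (placeholder paths).
docs = {path: open(path, encoding="utf-8").read() for path in ["doc1.txt", "doc2.txt"]}

# Create an in-memory Chroma collection with a sentence-transformers embedding model.
embed = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="all-MiniLM-L6-v2")
client = chromadb.Client()
collection = client.create_collection("rag_documents", embedding_function=embed)

# Chunk each document and add the embedded chunks to the vector index.
for path, text in docs.items():
    pieces = chunk(text)
    collection.add(
        documents=pieces,
        ids=[f"{path}-{i}" for i in range(len(pieces))],
        metadatas=[{"source": path}] * len(pieces),
    )

# Read the benchmark data (assumed format: each entry has a question and the
# documents that contain its answer) and evaluate retrieval as a simple hit rate.
benchmark = json.load(open("benchmark.json", encoding="utf-8"))
hits = 0
for item in benchmark:
    results = collection.query(query_texts=[item["question"]], n_results=3)
    retrieved_sources = {m["source"] for m in results["metadatas"][0]}
    if retrieved_sources & set(item["correct_answer_document_ids"]):
        hits += 1
print(f"Retrieval hit rate: {hits / len(benchmark):.2f}")
```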
Reviewing the inference notebook
The inference notebook contains Python code to:
- Retrieve relevant passages from the indexed documents for each user query
- Generate a response to each user query by passing the retrieved passages to a large language model to ground the generated response
The notebook is annotated so that you can review the steps and code for:
- Building the inference Python function by using the RAG pattern that was identified in the experiment
- Deploying the function as the inference endpoint
- Testing the retrieval of relevant passages as input for the generated response
Run the inference notebook to use the RAG pattern for retrieving and generating answers to questions, as illustrated in the sketch that follows.
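As an illustration only, the core retrieve-then-generate logic resembles the following sketch. For brevity it reuses the Chroma collection and credentials from the earlier sketches (the notebook that AutoAI generates queries a Milvus index instead), and the foundation model ID and prompt format are placeholder assumptions. The steps that wrap this logic in a deployable Python function, deploy it as an endpoint, and test it are omitted.

```python
# An illustrative retrieve-then-generate function, not the exact code that
# AutoAI generates. It reuses `collection` from the indexing sketch and
# `credentials` from the authentication sketch; the model ID is a placeholder.
from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id="ibm/granite-13b-instruct-v2",  # placeholder foundation model
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)

def answer(question: str, top_k: int = 3) -> str:
    """Retrieve relevant passages, then ask the model to answer from them."""
    # Retrieve the most relevant passages for the user query.
    results = collection.query(query_texts=[question], n_results=top_k)
    context = "\n\n".join(results["documents"][0])

    # Feed the retrieved passages to the model as grounding context.
    prompt = (
        "Answer the question by using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return model.generate_text(prompt=prompt)

print(answer("What does the indexing notebook do?"))
```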
Learn more
Use the indexed documents from this experiment in the Prompt Lab to ground prompts for a foundation model. See Using an AutoAI RAG index to chat with documents.
Parent topic: Creating a RAG experiment