Deploying AI assets
Last updated: Nov 21, 2024

Using IBM watsonx.ai Runtime, you can deploy machine learning models, scripts, functions, and prompt templates for generative AI models. After you create deployments, you can test and manage them, and prepare your assets for deployment into pre-production and production environments to generate predictions and insights.

Service: The administrator must provision the watsonx.ai Runtime service on the watsonx platform before you can use its capabilities.

Deployment process

The typical process for deploying an asset is as follows:

  1. Choose deployment type: Choose a deployment type for the asset type that you want to deploy.
  2. Create deployment: Depending on the asset type, you can create an online, batch, application, or detached deployment.
  3. Test deployment: You can test your deployments by entering test data, providing a JSON payload, or creating a batch job.
  4. Retrieve deployment endpoint: To use your deployment in an application, you must retrieve the endpoint of your deployment. The model endpoint provides access to an interface to invoke and manage model deployments.
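Steps 3 and 4 can be sketched as follows for an online deployment. This is a minimal illustration, not the definitive request format: the endpoint URL, the token, the field names, and the `input_data` payload layout are assumptions for illustration, so check the endpoint details of your own deployment before you use it. The helper name `build_scoring_request` is hypothetical.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, bearer_token, fields, rows):
    """Build (but do not send) a POST request for an online deployment endpoint."""
    # Assumed payload layout: one input set with column names and rows of values.
    payload = {"input_data": [{"fields": fields, "values": rows}]}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only; replace them with the endpoint
# that you retrieve for your deployment and a valid access token.
request = build_scoring_request(
    "https://example.invalid/ml/v4/deployments/DEPLOYMENT_ID/predictions",
    "YOUR_TOKEN",
    fields=["age", "income"],
    rows=[[42, 55000]],
)
# To send the request: urllib.request.urlopen(request), then parse the JSON response.
```

Building the request separately from sending it makes the JSON payload easy to inspect before any test data reaches the deployment.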

[Figure: The process for deploying AI assets]

Types of deployments

The most common types of deployments are as follows:

  • Online deployment: Create an online deployment to process input data in real time. To test an online deployment, you can, for example, submit new customer data to the deployment endpoint and receive a prediction immediately.

  • Batch deployment: Create a batch deployment to process a large batch of input data from a data source and write the output to a selected destination. You can configure the batch deployment job and run the job on a schedule or on demand.

Types of deployable assets

The type of asset that you deploy dictates the type of deployment that you can create. For example, Python functions, scripts, and models such as AutoAI or Decision Optimization models support online and batch deployments. However, you can create online deployments only for models that are imported from a file. The different types of deployable assets are as follows:

  • Foundation model assets: You can deploy foundation model assets such as tuned models, prompt templates, or custom foundation models with watsonx.ai.

  • watsonx.ai Runtime assets: You can deploy machine learning assets such as Python functions, R Shiny applications, NLP models, scripts, and more with watsonx.ai Runtime.

  • Decision Optimization models: You can deploy Decision Optimization models with watsonx.ai Runtime.

Ways to deploy assets

You can deploy and manage your assets in the following ways:

Deploying and managing assets in deployment spaces

Create a deployment space to collaborate with stakeholders and to deploy and manage your assets.

To manage your assets within a deployment space, you must promote your assets from a project to your deployment space. You can also import or export assets from your deployment space.

Use the deployments dashboard to get an aggregate view of your deployments and monitor deployment activity.

Deploying and managing assets programmatically

You can deploy and manage assets by using the Python client library or the watsonx.ai Runtime API. For more information, see Deploying AI assets programmatically.
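As a rough sketch of the programmatic route, the following helper creates an online deployment through a client object. It assumes that the client exposes `deployments.create` and `deployments.ConfigurationMetaNames` in the way the `ibm_watsonx_ai` Python client is understood to; verify the names against the client documentation for your version. The helper name `deploy_online` and all identifiers in the usage comments are hypothetical.

```python
def deploy_online(client, asset_id, name):
    """Create an online deployment for an asset in the deployment space.

    `client` is assumed to behave like an authenticated watsonx.ai Runtime
    Python client that is already scoped to a deployment space.
    """
    names = client.deployments.ConfigurationMetaNames
    meta_props = {
        names.NAME: name,
        names.ONLINE: {},  # assumption: an empty dict requests an online deployment
    }
    # Returns whatever details the client reports for the new deployment.
    return client.deployments.create(asset_id, meta_props=meta_props)


# Assumed usage (not executed here; requires the client library and credentials):
#   from ibm_watsonx_ai import APIClient, Credentials
#   client = APIClient(Credentials(url="YOUR_URL", api_key="YOUR_API_KEY"),
#                      space_id="YOUR_SPACE_ID")
#   details = deploy_online(client, "YOUR_ASSET_ID", "my online deployment")
```

Passing the client in as a parameter keeps the helper free of credentials and lets you exercise it with a stand-in object before you point it at a real deployment space.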

Managing frameworks and software specifications for deployments

Software specifications and frameworks define bundles of packages and the version of each package.

You can use predefined software specifications or create custom software specifications by adding new packages to existing frameworks, creating new packages, or updating package versions in software specifications.

Learn more

Deployment spaces
