Deploying AI assets
Using IBM Watson Machine Learning, you can deploy machine learning models, scripts, functions, and prompt templates for generative AI models. After you create deployments, you can test and manage them, and prepare your assets for deployment into pre-production and production environments to generate predictions and insights.
Service: The administrator must provision the Watson Machine Learning service on the Cloud Pak for Data as a Service platform to use its capabilities.
Deploying and managing assets in deployment spaces
Create a deployment space to collaborate with stakeholders and deploy and manage your AI assets. To manage your assets within a deployment space, you must promote your assets from a project to your deployment space. You can also import or export assets from your deployment space. For more information, see Deployment spaces.
The following graphic shows the typical activities to deploy AI assets:
Ways to deploy assets
You can deploy and manage your assets in the following ways:
-
Use a no-code approach: You can use a no-code approach to deploy and manage assets in a deployment space. For more information, see Deploying and managing assets in deployment spaces.
-
Use a custom-code approach: You can use a custom-code approach to deploy and manage assets programmatically.
For additional Cloud Pak for Data as a Service APIs, see Cloud Pak for Data APIs.
Types of deployments
Depending on your organization's needs, you can create an online, batch, or application deployment:
-
Online deployment: Create an online deployment to process input data in real time. For example, you can submit new customer data to the deployment endpoint and get a prediction immediately. You can test your online deployment by entering test data in a form or by submitting JSON code. For more information, see Creating online deployments in Watson Machine Learning.
-
Batch deployment: Create a batch deployment to process a large batch of input data from a data source and write the output to a selected destination. To test your batch deployment, you must create a batch deployment job. You can configure the batch deployment job by providing details about the input data, output file, and information about running the job on a schedule or on demand. For more information, see Creating batch deployments in Watson Machine Learning.
-
Application deployment: Create an application deployment to deploy your application assets, such as R Shiny applications. For more information, see Deploying Shiny apps in Watson Machine Learning.
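As a rough illustration of testing an online deployment with JSON, the following Python sketch builds a scoring payload from column names and rows of test data. The input_data, fields, and values layout is an assumption based on the commonly used Watson Machine Learning scoring format, and the function name, field names, and customer values are hypothetical. Check the input schema that your own deployment expects.

```python
import json


def build_scoring_payload(fields, rows):
    """Build a JSON-serializable scoring payload (assumed layout, not confirmed by this page)."""
    for row in rows:
        # Every row must supply exactly one value per field.
        if len(row) != len(fields):
            raise ValueError(
                f"row {row!r} has {len(row)} values, expected {len(fields)}"
            )
    return {
        "input_data": [
            {"fields": list(fields), "values": [list(row) for row in rows]}
        ]
    }


# Hypothetical customer data for illustration only.
payload = build_scoring_payload(
    ["age", "income", "tenure_months"],
    [[42, 58000, 18], [29, 41000, 3]],
)
print(json.dumps(payload, indent=2))
```

The printed JSON is the kind of text that you might paste into the JSON test input for an online deployment, provided that the field names match the schema of your model.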
Retrieving deployment endpoints
To use your deployed asset in applications for making predictions, retrieve the endpoint URL for your online or batch deployment. The model endpoint provides access to an interface to invoke and manage model deployments.
For more information, see Retrieving the endpoint for an online deployment or Retrieving the endpoint for a batch deployment.
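After you retrieve an endpoint URL, an application typically sends it an authenticated HTTP request. The following sketch, which uses only the Python standard library, prepares such a request without sending it. The URL, the bearer-token header, and the payload layout are placeholders and assumptions, not values taken from this page. Substitute the endpoint and credentials for your own deployment.

```python
import json
import urllib.request


def build_prediction_request(endpoint_url, token, payload):
    """Prepare (but do not send) a POST request to a deployment endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            # Bearer-token authentication is an assumption for this sketch.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_prediction_request(
    "https://example.invalid/deployments/DEPLOYMENT_ID/predictions",  # placeholder URL
    "ACCESS_TOKEN",  # placeholder token
    {"input_data": [{"fields": ["age"], "values": [[42]]}]},  # assumed payload layout
)
print(request.get_method(), request.full_url)
```

To send the request, you could pass the object to urllib.request.urlopen and parse the JSON response. That step is omitted here because it requires a live deployment.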
Types of deployable assets
Not every asset type supports every deployment type. For example, both online and batch deployments support the deployment of assets such as Python functions, scripts, and models, such as AutoAI or Decision Optimization models. However, you can create online deployments only for models that are imported from a file. The different types of deployable assets are as follows:
- Foundation model assets: You can deploy foundation model assets such as tuned models or prompt templates with watsonx.ai. For more information, see Deploying foundation model assets.
- Machine Learning assets: You can deploy Machine Learning assets such as Python functions, R Shiny applications, NLP models, scripts, and more with Watson Machine Learning. For more information, see Deploying Machine Learning assets.
- Decision Optimization models: You can deploy Decision Optimization models with Watson Machine Learning.
Managing deployments
You can access, update, scale, delete, and monitor the performance of your deployment in your deployment space:
- Accessing a deployment: You can access details that are related to your deployment, such as stage type, which describes whether the deployment space is for pre-production or production purposes.
- Updating a deployment: You can update your deployment details such as deployment name, software specification, and more. For more information, see Updating a deployment.
- Scaling a deployment: You can create multiple copies of your deployment to increase scalability and availability for a larger volume of scoring requests. For more information, see Scaling a deployment.
- Deleting a deployment: Delete your deployment when you no longer need it to free up resources. For more information, see Deleting a deployment.
- Monitoring deployment performance: You can evaluate your deployments to measure performance and understand model predictions by provisioning a Watson OpenScale instance and configuring monitors for fairness, quality, drift, and explainability.
Monitoring deployment activity
Use the deployments dashboard to get an aggregate view of your deployments and monitor deployment activity. You can use the dashboard to monitor the status of your batch deployment jobs, such as active runs and finished runs, based on the job schedule that you defined when you created the job. You can also get information about the number of successful and failed online deployments. For more information, see Deployments dashboard.
Managing runtime environments for deployments
Runtime environments provide the functions that are required to run your deployment.
You can use predefined runtime environments or create custom runtime environments to include more components, depending on your use case. To create a custom runtime environment for your deployment, you must create a Dockerfile and add a base image. You can then add the Docker commands to build the runtime environment for your deployment. For more information, see Customizing Watson Machine Learning deployment runtimes.
Managing frameworks and software specifications for deployments
Software specifications and frameworks bundle packages together with specific versions of those packages.
You can use predefined software specifications or create custom software specifications, depending on your use case. For example, you can add new packages to existing frameworks, create new packages, or replace package versions in the software specifications.
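To make the idea of customizing a software specification concrete, the following sketch treats a specification as a simple mapping of package names to versions and overlays custom choices on it. This is a conceptual model only. It does not reflect how Watson Machine Learning stores or applies software specifications, and the function name, package names, and versions are hypothetical.

```python
def customize_software_spec(base_packages, overrides):
    """Return a new package-to-version mapping with overrides applied.

    Packages in overrides that already exist in the base replace the base
    version; packages that do not exist in the base are added.
    """
    customized = dict(base_packages)  # copy so that the base stays unchanged
    customized.update(overrides)
    return customized


# Hypothetical predefined specification.
base = {"numpy": "1.23.5", "scikit-learn": "1.1.3"}

# Replace one package version and add one new package.
custom = customize_software_spec(
    base, {"scikit-learn": "1.2.2", "xgboost": "1.7.6"}
)

for name, version in sorted(custom.items()):
    print(f"{name}=={version}")
```

Copying the base mapping before the update keeps the predefined specification intact, which mirrors the distinction between predefined and custom specifications.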
When a framework is deprecated, support for it is removed in a future release. To ensure continued service, you must update your model's software specifications to the latest version after deprecation.
For more information, see Frameworks and software specifications in Watson Machine Learning.