You can deploy and run inference on machine learning models from PyTorch or TensorFlow that are saved in different formats and converted to the Open Neural Network Exchange (ONNX) format. ONNX is an open-source format for representing deep learning models. Developers can use the ONNX format to train their models in one framework, such as PyTorch or TensorFlow, and then export them to run in another environment with different performance characteristics. After you convert a machine learning model to the ONNX format, you can perform inferencing by using ONNX Runtime.
Benefits of converting models to ONNX Runtime
Converting a model to run on ONNX Runtime offers several benefits for machine learning and deep learning applications:
-
Cross-platform compatibility: ONNX provides a standard format for representing machine learning models, which makes it easier to deploy models across different frameworks such as PyTorch or TensorFlow. You can train models in one framework and deploy them in another framework that supports ONNX Runtime.
-
Improved performance: ONNX Runtime optimizes models for inferencing by applying various hardware-specific and software-specific optimizations, such as graph optimizations. It also supports execution on diverse hardware, such as CPUs and GPUs, ensuring efficient utilization of resources.
-
Interoperability: ONNX provides a way to train models in one framework, such as PyTorch, TensorFlow, or scikit-learn, and then export them to run in another environment, which streamlines workflows. It breaks down the barriers between different deep learning frameworks, allowing developers to leverage the strengths of different libraries without getting locked into a single ecosystem.
Supported frameworks for conversion
You can convert machine learning models that use the following frameworks to ONNX format:
- PyTorch
- TensorFlow
Converting PyTorch models to ONNX format
Follow this process to convert your trained PyTorch model to the ONNX format:
-
Import libraries: Start by importing the essential libraries, such as `onnxruntime` for running the model, `torch` for PyTorch functionalities, and other libraries required for your application.
-
Create or download PyTorch model: You can create a PyTorch model by using your own data set or use models provided by external open source model repositories like Hugging Face.
-
Convert PyTorch model to ONNX format: To convert the PyTorch model to ONNX format:
a. Prepare the model: Ensure that your PyTorch model is in evaluation mode by calling the `model.eval()` function. You may need a dummy input tensor that matches the input shape that the model expects.
b. Export the model: Use the `torch.onnx.export` function to convert the model to ONNX format.
-
Verify the conversion: After converting the model, verify that the model is functioning as expected by using the `onnx` library.
Converting TensorFlow models to ONNX format
Follow this process to convert your TensorFlow model to the ONNX format:
-
Import libraries: Start by importing the essential libraries, such as `tf2onnx` to facilitate conversion of TensorFlow models to ONNX, and other libraries required for your application.
-
Download TensorFlow model: You must download the externally created TensorFlow model and the data that is used for training the model.
-
Convert TensorFlow model to ONNX format: Use the `tf2onnx.convert` command to convert your TensorFlow model that is created in the `SavedModel` format to ONNX format. If you want to convert a TensorFlow Lite model, use the `--tflite` flag instead of the `--saved-model` flag.
`Keras` models and `tf` functions can be converted directly within Python.
-
Verify the conversion: After converting the model, verify that the model is functioning as expected by using the `onnx` library.
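The choice between the `--saved-model` and `--tflite` flags can be illustrated with a small helper that assembles the `tf2onnx.convert` command line. This is a sketch under stated assumptions: `build_tf2onnx_command` and the paths are hypothetical, and the `--output` and `--opset` flags come from the tf2onnx command-line interface rather than from this page. The sketch only builds and prints the command; it does not run the conversion.

```python
import shlex
import sys
from typing import List, Optional


def build_tf2onnx_command(
    model_path: str,
    output_path: str,
    tflite: bool = False,
    opset: Optional[int] = None,
) -> List[str]:
    """Build the argument list for the tf2onnx.convert command line."""
    # --saved-model expects a SavedModel directory; --tflite expects a .tflite file.
    source_flag = "--tflite" if tflite else "--saved-model"
    command = [
        sys.executable, "-m", "tf2onnx.convert",
        source_flag, model_path,
        "--output", output_path,
    ]
    if opset is not None:
        # Pin the ONNX opset instead of relying on the tf2onnx default.
        command += ["--opset", str(opset)]
    return command


# Hypothetical paths, for illustration only.
saved_model_cmd = build_tf2onnx_command("my_saved_model", "model.onnx")
tflite_cmd = build_tf2onnx_command("model.tflite", "model.onnx", tflite=True)

print(shlex.join(saved_model_cmd))
print(shlex.join(tflite_cmd))

# To run the conversion (requires tensorflow and tf2onnx to be installed):
# import subprocess
# subprocess.run(saved_model_cmd, check=True)
```

For Keras models, tf2onnx also exposes a Python API, `tf2onnx.convert.from_keras`, which avoids the command line.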
Additional considerations
Here are some additional considerations for converting your models to ONNX format:
-
Dynamic axes: A model can use dynamic axes to handle variable input shapes, such as dynamic batch sizes or sequence lengths, which is useful for models deployed in applications where the input dimensions may vary.
Dynamic axes can be applied to multiple inputs and outputs, so the model adapts to different shapes without being re-exported, which also reduces memory overhead. You can specify the dynamic axes during model export in PyTorch or TensorFlow.
-
Opset version: The opset version in ONNX determines the set of operations and their specifications that are supported by the model. It is a critical factor during model conversion and deployment.
Different ONNX runtimes and frameworks support specific opset versions. Older opset versions may lack features or optimizations present in newer versions. Incompatibility between a model's opset version and the ONNX runtime can cause errors during inferencing. You must ensure that the ONNX opset version that you choose is supported by your target runtime.
Deploying models converted to ONNX format
Use the `onnxruntime_opset_19` software specification to deploy your machine learning model converted to ONNX format. You must specify the software specification and model type when you store the model to the watsonx.ai Runtime repository.
For more information, see Supported software specifications.
To deploy models converted to ONNX format from the user interface, follow these steps:
-
In your deployment space, go to the Assets tab.
-
Find your model in the asset list, click the Menu icon, and select Deploy.
-
Select the deployment type for your model. Choose between online and batch deployment options.
-
Enter a name for your deployment and optionally enter a serving name, description, and tags.
Note:
- Use the Serving name field to specify a name for your deployment instead of the deployment ID.
- The serving name must be unique within the namespace.
- The serving name must contain only these characters: [a-z,0-9,_] and must be at most 36 characters long.
- In workflows where your custom foundation model is used periodically, consider assigning your model the same serving name each time you deploy it. This way, after you delete and then re-deploy the model, you can keep using the same endpoint in your code.
-
Select a hardware specification for your model.
-
Select a configuration and a software specification for your model.
-
Click Create.
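The serving name rules from the note in the steps above can be checked locally before you create the deployment. This is a minimal sketch: `is_valid_serving_name` is a hypothetical helper, and it assumes that the commas in the documented set `[a-z,0-9,_]` are separators rather than allowed characters. Uniqueness within the namespace cannot be checked this way.

```python
import re

# Assumption: the commas in the documented set [a-z,0-9,_] are separators,
# so the allowed characters are lowercase letters, digits, and underscores.
SERVING_NAME_PATTERN = re.compile(r"[a-z0-9_]{1,36}")


def is_valid_serving_name(name: str) -> bool:
    """Return True if the name meets the documented character and length rules."""
    return SERVING_NAME_PATTERN.fullmatch(name) is not None


# Hypothetical candidate names: one valid, one with disallowed characters, one too long.
for candidate in ["onnx_digits_v1", "ONNX-Digits", "a" * 37]:
    print(candidate, "->", is_valid_serving_name(candidate))
```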
Testing the model
Follow these steps to test your deployed models converted to ONNX format:
- In your deployment space, open the Deployments tab and click the deployment name.
- Click the Test tab to input prompt text and get a response from the deployed asset.
- Enter test data in one of the following formats, depending on the type of asset that you deployed:
- Text: Enter text input data to generate a block of text as output.
- JSON: Enter JSON input data to generate output in JSON format.
- Click Generate to get results that are based on your prompt.
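For the JSON option, the test data can be prepared ahead of time. The following is a minimal sketch that assumes the `input_data` and `values` layout of watsonx.ai Runtime scoring payloads. The single four-feature record is hypothetical and must be replaced with data that matches the input shape of your model.

```python
import json

# Hypothetical input: one record with four float features.
rows = [[0.1, 0.2, 0.3, 0.4]]

# Assumed payload layout: a list of input sets, each with a "values" array of records.
payload = {"input_data": [{"values": rows}]}

payload_text = json.dumps(payload, indent=2)
print(payload_text)
```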
Sample notebooks
The following sample notebooks demonstrate how to deploy machine learning models converted from PyTorch or TensorFlow to the ONNX format by using the Python client library:
Notebook | Framework | Description |
---|---|---|
Convert ONNX neural network from fixed axes to dynamic axes and use it with ibm-watsonx-ai | ONNX | Set up the environment. Create and export basic ONNX model. Convert model from fixed axes to dynamic axes. Persist converted ONNX model. Deploy and score ONNX model. Clean up. Summary and next steps. |
Use ONNX model converted from PyTorch with ibm-watsonx-ai | ONNX | Create PyTorch model with dataset. Convert PyTorch model to ONNX format. Persist converted model in Watson Machine Learning repository. Deploy model for online scoring using client library. Score sample records using client library. |
Use ONNX model converted from TensorFlow to recognize hand-written digits with ibm-watsonx-ai | ONNX | Download an externally trained TensorFlow model with dataset. Convert TensorFlow model to ONNX format. Persist converted model in Watson Machine Learning repository. Deploy model for online scoring using client library. Score sample records using client library. |