optimum-onnx documentation
ONNX Runtime Pipelines
optimum.onnxruntime.pipeline(
    task: str | None = None,
    model: str | ORTModel | None = None,
    config: str | PretrainedConfig | None = None,
    tokenizer: str | PreTrainedTokenizer | PreTrainedTokenizerFast | None = None,
    feature_extractor: str | FeatureExtractionMixin | None = None,
    image_processor: str | BaseImageProcessor | None = None,
    processor: str | ProcessorMixin | None = None,
    revision: str | None = None,
    use_fast: bool = True,
    token: str | bool | None = None,
    device: int | str | torch.device | None = None,
    trust_remote_code: bool | None = None,
    model_kwargs: dict[str, Any] | None = None,
    pipeline_class: Any | None = None,
    **kwargs: Any,
) → Pipeline
Parameters
- task (str) — The task defining which pipeline will be returned. Currently accepted tasks are:
  - "audio-classification": will return an AudioClassificationPipeline.
  - "automatic-speech-recognition": will return an AutomaticSpeechRecognitionPipeline.
  - "depth-estimation": will return a DepthEstimationPipeline.
  - "document-question-answering": will return a DocumentQuestionAnsweringPipeline.
  - "feature-extraction": will return a FeatureExtractionPipeline.
  - "fill-mask": will return a FillMaskPipeline.
  - "image-classification": will return an ImageClassificationPipeline.
  - "image-feature-extraction": will return an ImageFeatureExtractionPipeline.
  - "image-segmentation": will return an ImageSegmentationPipeline.
  - "image-text-to-text": will return an ImageTextToTextPipeline.
  - "image-to-image": will return an ImageToImagePipeline.
  - "image-to-text": will return an ImageToTextPipeline.
  - "mask-generation": will return a MaskGenerationPipeline.
  - "object-detection": will return an ObjectDetectionPipeline.
  - "question-answering": will return a QuestionAnsweringPipeline.
  - "summarization": will return a SummarizationPipeline.
  - "table-question-answering": will return a TableQuestionAnsweringPipeline.
  - "text2text-generation": will return a Text2TextGenerationPipeline.
  - "text-classification" (alias "sentiment-analysis" available): will return a TextClassificationPipeline.
  - "text-generation": will return a TextGenerationPipeline.
  - "text-to-audio" (alias "text-to-speech" available): will return a TextToAudioPipeline.
  - "token-classification" (alias "ner" available): will return a TokenClassificationPipeline.
  - "translation": will return a TranslationPipeline.
  - "translation_xx_to_yy": will return a TranslationPipeline.
  - "video-classification": will return a VideoClassificationPipeline.
  - "visual-question-answering": will return a VisualQuestionAnsweringPipeline.
  - "zero-shot-classification": will return a ZeroShotClassificationPipeline.
  - "zero-shot-image-classification": will return a ZeroShotImageClassificationPipeline.
  - "zero-shot-audio-classification": will return a ZeroShotAudioClassificationPipeline.
  - "zero-shot-object-detection": will return a ZeroShotObjectDetectionPipeline.
- model (str or ORTModel, optional) — The model that will be used by the pipeline to make predictions. This can be a model identifier or an actual instance of an ONNX Runtime model inheriting from ORTModel.
  If not provided, the default for the task will be loaded.
- config (str or PretrainedConfig, optional) — The configuration that will be used by the pipeline to instantiate the model. This can be a model identifier or an actual pretrained model configuration inheriting from PretrainedConfig.
  If not provided, the default configuration file for the requested model will be used. That means that if model is given, its default configuration will be used. However, if model is not supplied, this task’s default model’s config is used instead.
- tokenizer (str or PreTrainedTokenizer, optional) — The tokenizer that will be used by the pipeline to encode data for the model. This can be a model identifier or an actual pretrained tokenizer inheriting from PreTrainedTokenizer.
  If not provided, the default tokenizer for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default tokenizer for config is loaded (if it is a string). However, if config is also not given or not a string, then the default tokenizer for the given task will be loaded.
- feature_extractor (str or PreTrainedFeatureExtractor, optional) — The feature extractor that will be used by the pipeline to encode data for the model. This can be a model identifier or an actual pretrained feature extractor inheriting from PreTrainedFeatureExtractor.
  Feature extractors are used for non-NLP models, such as speech or vision models, as well as multi-modal models. Multi-modal models will also require a tokenizer to be passed.
  If not provided, the default feature extractor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default feature extractor for config is loaded (if it is a string). However, if config is also not given or not a string, then the default feature extractor for the given task will be loaded.
- image_processor (str or BaseImageProcessor, optional) — The image processor that will be used by the pipeline to preprocess images for the model. This can be a model identifier or an actual image processor inheriting from BaseImageProcessor.
  Image processors are used for vision models and multi-modal models that require image inputs. Multi-modal models will also require a tokenizer to be passed.
  If not provided, the default image processor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default image processor for config is loaded (if it is a string).
- processor (str or ProcessorMixin, optional) — The processor that will be used by the pipeline to preprocess data for the model. This can be a model identifier or an actual processor inheriting from ProcessorMixin.
  Processors are used for multi-modal models that require multi-modal inputs, for example, a model that requires both text and image inputs.
  If not provided, the default processor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default processor for config is loaded (if it is a string).
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed.
  If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- revision (str, optional, defaults to "main") — When passing a task name or a string model identifier: the specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
- use_fast (bool, optional, defaults to True) — Whether or not to use a fast tokenizer if possible (a PreTrainedTokenizerFast).
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- device (int or str or torch.device) — Defines the device (e.g., "cpu", "cuda:1", "mps", or a GPU ordinal rank like 1) on which this pipeline will be allocated.
- device_map (str or dict[str, Union[int, str, torch.device]], optional) — Sent directly as model_kwargs (just a simpler shortcut). When the accelerate library is present, set device_map="auto" to compute the most optimized device_map automatically (see the Accelerate documentation for more information).
  Do not use device_map AND device at the same time, as they will conflict.
- torch_dtype (str or torch.dtype, optional) — Sent directly as model_kwargs (just a simpler shortcut) to use the available precision for this model (torch.float16, torch.bfloat16, … or "auto").
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom code defined on the Hub in their own modeling, configuration, tokenization or even pipeline files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- model_kwargs (dict[str, Any], optional) — Additional dictionary of keyword arguments passed along to the model’s from_pretrained(..., **model_kwargs) function.
- kwargs (dict[str, Any], optional) — Additional keyword arguments passed along to the specific pipeline init (see the documentation for the corresponding pipeline class for possible values).
Returns
Pipeline
A suitable pipeline for the task.
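To make the link between the task string and the returned pipeline class concrete, here is a minimal sketch of alias resolution. It is an illustration only, not the library's implementation: pipeline_class_name and both dictionaries are hypothetical, the class table is a small excerpt of the task list, and the pattern used for "translation_xx_to_yy" assumes that xx and yy are alphabetic language codes.

```python
import re

# Aliases named in the task list of this page.
TASK_ALIASES = {
    "sentiment-analysis": "text-classification",
    "text-to-speech": "text-to-audio",
    "ner": "token-classification",
}

# Excerpt of the task -> pipeline class table (class names only).
PIPELINE_CLASS_NAMES = {
    "text-classification": "TextClassificationPipeline",
    "text-to-audio": "TextToAudioPipeline",
    "token-classification": "TokenClassificationPipeline",
    "translation": "TranslationPipeline",
    "question-answering": "QuestionAnsweringPipeline",
}

# Assumption: xx and yy are alphabetic language codes such as "en" or "fr".
_TRANSLATION_PATTERN = re.compile(r"^translation_[A-Za-z]+_to_[A-Za-z]+$")


def pipeline_class_name(task: str) -> str:
    """Hypothetical helper: map a task or alias to a pipeline class name."""
    task = TASK_ALIASES.get(task, task)
    if _TRANSLATION_PATTERN.match(task):
        task = "translation"
    if task not in PIPELINE_CLASS_NAMES:
        raise KeyError(f"Task {task!r} is not in this excerpt")
    return PIPELINE_CLASS_NAMES[task]


print(pipeline_class_name("ner"))  # TokenClassificationPipeline
```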
Utility factory method to build a Pipeline with an ONNX Runtime model, similar to transformers.pipeline.
A pipeline consists of:
- One or more components for pre-processing model inputs, such as a tokenizer, image_processor, feature_extractor, or processor.
- A model that generates predictions from the inputs.
- Optional post-processing steps to refine the model’s output, which can also be handled by processors.
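The three stages listed above can be sketched with a toy class. ToyPipeline, toy_model, and the tiny vocabulary are all hypothetical stand-ins: no tokenizer or ONNX Runtime model is involved, and the class is not the transformers Pipeline class. The point is only the flow from pre-processing to model to post-processing.

```python
from typing import Callable, Dict, List


class ToyPipeline:
    """Illustrative three-stage pipeline (hypothetical, not the real Pipeline)."""

    def __init__(self, vocab: Dict[str, int], model: Callable[[List[int]], int], labels: List[str]):
        self.vocab = vocab
        self.model = model
        self.labels = labels

    def preprocess(self, text: str) -> List[int]:
        # Stand-in for a tokenizer: known words map to ids, unknown words to 0.
        return [self.vocab.get(word, 0) for word in text.lower().split()]

    def postprocess(self, class_index: int) -> Dict[str, str]:
        # Turn the raw model output into a readable label.
        return {"label": self.labels[class_index]}

    def __call__(self, text: str) -> Dict[str, str]:
        return self.postprocess(self.model(self.preprocess(text)))


def toy_model(token_ids: List[int]) -> int:
    # Stand-in for a model: class 1 if the id for "good" (1) is present.
    return 1 if 1 in token_ids else 0


toy = ToyPipeline(vocab={"good": 1, "bad": 2}, model=toy_model, labels=["NEGATIVE", "POSITIVE"])
print(toy("A good film"))  # {'label': 'POSITIVE'}
```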
Examples:
>>> from transformers import AutoTokenizer
>>> from optimum.onnxruntime import ORTModelForTokenClassification, pipeline
>>> # Sentiment analysis pipeline
>>> analyzer = pipeline("sentiment-analysis")
>>> # Question answering pipeline, specifying the checkpoint identifier
>>> oracle = pipeline(
...     "question-answering", model="distilbert/distilbert-base-cased-distilled-squad", tokenizer="google-bert/bert-base-cased"
... )
>>> # Named entity recognition pipeline, passing in a specific model and tokenizer
>>> model = ORTModelForTokenClassification.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
>>> recognizer = pipeline("ner", model=model, tokenizer=tokenizer)
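The examples above rely mostly on defaults. When optional arguments such as device or revision are set from configuration, one possible approach is to assemble the keyword arguments first and check the documented device / device_map conflict before calling pipeline. Both build_pipeline_kwargs and create_pipeline are hypothetical helpers written for this sketch. It also assumes that device_map is forwarded through the keyword arguments as the parameter list describes.

```python
from typing import Any, Dict, Optional


def build_pipeline_kwargs(
    task: str,
    model: Optional[str] = None,
    device: Any = None,
    device_map: Any = None,
    revision: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect only the arguments that were explicitly set."""
    # The parameter notes say device and device_map conflict, so reject both.
    if device is not None and device_map is not None:
        raise ValueError("Pass either device or device_map, not both.")
    candidates = {"model": model, "device": device, "device_map": device_map, "revision": revision}
    kwargs: Dict[str, Any] = {"task": task}
    kwargs.update({name: value for name, value in candidates.items() if value is not None})
    return kwargs


def create_pipeline(**options: Any):
    """Hypothetical wrapper around optimum.onnxruntime.pipeline."""
    # Imported lazily so build_pipeline_kwargs works without optimum installed.
    from optimum.onnxruntime import pipeline

    return pipeline(**build_pipeline_kwargs(**options))
```

Calling create_pipeline(task="sentiment-analysis", device="cpu") would then behave like the first example above with an explicit device. It requires optimum with ONNX Runtime support to be installed and, when no model is given, access to the default model for the task.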