diff --git "a/README.md" "b/README.md" --- "a/README.md" +++ "b/README.md" @@ -31,1019 +31,964 @@ tags: - loss:MatryoshkaLoss - loss:MultipleNegativesRankingLoss widget: -- source_sentence: What is the error message related to the blob-container for the - azure-generic subscription in ZenML? +- source_sentence: What is the RESOURCE NAME for the kubernetes-cluster in the ZenML + documentation? sentences: - - '─────────────────────────────────────────────────┨┃ 🇦 azure-generic │ ZenML - Subscription ┃ + - ' ┃┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - ┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ + ┃ RESOURCE TYPES │ 🌀 kubernetes-cluster ┃ - ┃ 📦 blob-container │ 💥 error: connector authorization failure: the ''access-token'' - authentication method is not supported for blob storage resources ┃ + ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - ┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ + ┃ RESOURCE NAME │ arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster ┃ - ┃ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ + ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - ┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ + ┃ SECRET ID │ ┃ - ┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ + ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - ┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + ┃ SESSION DURATION │ N/A ┃ - zenml service-connector describe azure-session-token + ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - Example Command Output - - - Service connector ''azure-session-token'' of type ''azure'' with id ''94d64103-9902-4aa5-8ce4-877061af89af'' - is owned by user ''default'' and is ''private''. - - - ''azure-session-token'' azure Service Connector Details - - - ┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - - - ┃ PROPERTY │ VALUE ┃ - - - ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - - - ┃ ID │ 94d64103-9902-4aa5-8ce4-877061af89af ┃' - - '🪆Use the Model Control Plane - - - A Model is simply an entity that groups pipelines, artifacts, metadata, and other - crucial business data into a unified entity. A ZenML Model is a concept that more - broadly encapsulates your ML products business logic. You may even think of a - ZenML Model as a "project" or a "workspace" - - - Please note that one of the most common artifacts that is associated with a Model - in ZenML is the so-called technical model, which is the actually model file/files - that holds the weight and parameters of a machine learning training result. However, - this is not the only artifact that is relevant; artifacts such as the training - data and the predictions this model produces in production are also linked inside - a ZenML Model. 
- 

- Models are first-class citizens in ZenML and as such viewing and using them is
- unified and centralized in the ZenML API, client as well as on the ZenML Pro dashboard.

- A Model captures lineage information and more. Within a Model, different Model
- versions can be staged. For example, you can rely on your predictions at a specific
- stage, like Production, and decide whether the Model version should be promoted
- based on your business rules during training. Plus, accessing data from other
- Models and their versions is just as simple.

- The Model Control Plane is how you manage your models through this unified interface.
- It allows you to combine the logic of your pipelines, artifacts and crucial business
- data along with the actual ''technical model''.

- To see an end-to-end example, please refer to the starter guide.

- PreviousDisabling visualizations

+ ┃ EXPIRES IN │ 11h59m57s ┃

- NextRegistering a Model

+ ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨

- Last updated 12 days ago'
- - 'turns:

+ ┃ OWNER │ default ┃

- The Docker image repo digest or name.

+ ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨

- """This is a slimmed-down version of the base implementation which aims to highlight
- the abstraction layer. In order to see the full implementation and get the complete
- docstrings, please check the source code on GitHub .

+ ┃ WORKSPACE │ default ┃

- Build your own custom image builder

+ ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨

- If you want to create your own custom flavor for an image builder, you can follow
- the following steps:

+ ┃ SHARED │ ➖ ┃

- Create a class that inherits from the BaseImageBuilder class and implement the
- abstract build method. This method should use the given build context and build
- a Docker image with it. If additionally a container registry is passed to the
- build method, the image builder is also responsible for pushing the image there.

+ ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨

- If you need to provide any configuration, create a class that inherits from the
- BaseImageBuilderConfig class and adds your configuration parameters.

+ ┃ CREATED_AT │ 2023-06-16 10:17:46.931091 ┃

- Bring both the implementation and the configuration together by inheriting from
- the BaseImageBuilderFlavor class. Make sure that you give a name to the flavor
- through its abstract property.

+ ┠──────────────────┼─────────────────────────────────────────────────────────────────────┨

- Once you are done with the implementation, you can register it through the CLI.
- Please ensure you point to the flavor class via dot notation:

+ ┃ UPDATED_AT │ 2023-06-16 10:17:46.931094 ┃

- zenml image-builder flavor register

+ ┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

- For example, if your flavor class MyImageBuilderFlavor is defined in flavors/my_flavor.py,
- you''d register it by doing:

+ Configuration'
+ - 'urns it with the configuration of the cloud stack.Based on the stack info and
+ pipeline specification, the client builds and pushes an image to the container
+ registry. The image contains the environment needed to execute the pipeline and
+ the code of the steps.

- zenml image-builder flavor register flavors.my_flavor.MyImageBuilderFlavor

+ The client creates a run in the orchestrator. 
For example, in the case of the + Skypilot orchestrator, it creates a virtual machine in the cloud with some commands + to pull and run a Docker image from the specified container registry. - ZenML resolves the flavor class by taking the path where you initialized zenml - (via zenml init) as the starting point of resolution. Therefore, please ensure - you follow the best practice of initializing zenml at the root of your repository. + The orchestrator pulls the appropriate image from the container registry as it''s + executing the pipeline (each step has an image). - If ZenML does not find an initialized ZenML repository in any parent directory, - it will default to the current working directory, but usually it''s better to - not have to rely on this mechanism, and initialize zenml at the root. + As each pipeline runs, it stores artifacts physically in the artifact store. Of + course, this artifact store needs to be some form of cloud storage. - Afterward, you should see the new flavor in the list of available flavors:' -- source_sentence: Where can I find more information on configuring the Spark step - operator in ZenML? - sentences: - - 'upplied a custom value while creating the cluster.Run the following command. - - aws eks update-kubeconfig --name --region + As each pipeline runs, it reports status back to the ZenML server and optionally + queries the server for metadata. - Get the name of the deployed cluster. + Provisioning and registering a Skypilot orchestrator alongside a container registry - zenml stack recipe output gke-cluster-name\ + While there are detailed docs on how to set up a Skypilot orchestrator and a container + registry on each public cloud, we have put the most relevant details here for + convenience: - Figure out the region that the cluster is deployed to. By default, the region - is set to europe-west1, which you should use in the next step if you haven''t - supplied a custom value while creating the cluster.\ + In order to launch a pipeline on AWS with the SkyPilot orchestrator, the first + thing that you need to do is to install the AWS and Skypilot integrations: - Figure out the project that the cluster is deployed to. You must have passed in - a project ID while creating a GCP resource for the first time.\ + zenml integration install aws skypilot_aws -y - Run the following command. - gcloud container clusters get-credentials --region --project + Before we start registering any components, there is another step that we have + to execute. As we explained in the previous section, components such as orchestrators + and container registries often require you to set up the right permissions. In + ZenML, this process is simplified with the use of Service Connectors. For this + example, we need to use the IAM role authentication method of our AWS service + connector: - You may already have your kubectl client configured with your cluster. Check by - running kubectl get nodes before proceeding. + AWS_PROFILE= zenml service-connector register cloud_connector --type + aws --auto-configure - Get the name of the deployed cluster. + Once the service connector is set up, we can register a Skypilot orchestrator: - zenml stack recipe output k3d-cluster-name\ + zenml orchestrator register skypilot_orchestrator -f vm_aws' + - 'pose -f /path/to/docker-compose.yml -p zenml up -dYou need to visit the ZenML + dashboard at http://localhost:8080 to activate the server by creating an initial + admin account. 
You can then connect your client to the server with the web login + flow: - Set the KUBECONFIG env variable to the kubeconfig file from the cluster. + zenml connect --url http://localhost:8080 - export KUBECONFIG=$(k3d kubeconfig get )\ + Tearing down the installation is as simple as running: - You can now use the kubectl client to talk to the cluster. + docker-compose -p zenml down - Stack Recipe Deploy + Database backup and recovery - The steps for the stack recipe case should be the same as the ones listed above. - The only difference that you need to take into account is the name of the outputs - that contain your cluster name and the default regions. + An automated database backup and recovery feature is enabled by default for all + Docker deployments. The ZenML server will automatically back up the database in-memory + before every database schema migration and restore it if the migration fails. - Each recipe might have its own values and here''s how you can ascertain those - values. + The database backup automatically created by the ZenML server is only temporary + and only used as an immediate recovery in case of database migration failures. + It is not meant to be used as a long-term backup solution. If you need to back + up your database for long-term storage, you should use a dedicated backup solution. - For the cluster name, go into the outputs.tf file in the root directory and search - for the output that exposes the cluster name. + Several database backup strategies are supported, depending on where and how the + backup is stored. The strategy can be configured by means of the ZENML_STORE_BACKUP_STRATEGY + environment variable: - For the region, check out the variables.tf or the locals.tf file for the default - value assigned to it. + disabled - no backup is performed - PreviousTroubleshoot the deployed server + in-memory - the database schema and data are stored in memory. This is the fastest + backup strategy, but the backup is not persisted across container restarts, so + no manual intervention is possible in case the automatic DB recovery fails after + a failed DB migration. Adequate memory resources should be allocated to the ZenML + server container when using this backup strategy with larger databases. This is + the default backup strategy.' +- source_sentence: What are the benefits of deploying ZenML to a production environment? + sentences: + - 'graph that includes custom TRANSFORMER and ROUTER.If you are looking for a more + easy way to deploy your models locally, you can use the MLflow Model Deployer + flavor. - NextCustom secret stores + How to deploy it? - Last updated 10 months ago' - - 'ettings to specify AzureML step operator settings.Difference between stack component - settings at registration-time vs real-time + ZenML provides a Seldon Core flavor build on top of the Seldon Core Integration + to allow you to deploy and use your models in a production-grade environment. + In order to use the integration you need to install it on your local machine to + be able to register a Seldon Core Model deployer with ZenML and add it to your + stack: - For stack-component-specific settings, you might be wondering what the difference - is between these and the configuration passed in while doing zenml stack-component - register --config1=configvalue --config2=configvalue, etc. The answer is - that the configuration passed in at registration time is static and fixed throughout - all pipeline runs, while the settings can change. 
+ zenml integration install seldon -y - A good example of this is the MLflow Experiment Tracker, where configuration which - remains static such as the tracking_url is sent through at registration time, - while runtime configuration such as the experiment_name (which might change every - pipeline run) is sent through as runtime settings. + To deploy and make use of the Seldon Core integration we need to have the following + prerequisites: - Even though settings can be overridden at runtime, you can also specify default - values for settings while configuring a stack component. For example, you could - set a default value for the nested setting of your MLflow experiment tracker: - zenml experiment-tracker register --flavor=mlflow --nested=True + access to a Kubernetes cluster. This can be configured using the kubernetes_context + configuration attribute to point to a local kubectl context or an in-cluster configuration, + but the recommended approach is to use a Service Connector to link the Seldon + Deployer Stack Component to a Kubernetes cluster. - This means that all pipelines that run using this experiment tracker use nested - MLflow runs unless overridden by specifying settings for the pipeline at runtime. + Seldon Core needs to be preinstalled and running in the target Kubernetes cluster. + Check out the official Seldon Core installation instructions or the EKS installation + example below. - Using the right key for Stack-component-specific settings + models deployed with Seldon Core need to be stored in some form of persistent + shared storage that is accessible from the Kubernetes cluster where Seldon Core + is installed (e.g. AWS S3, GCS, Azure Blob Storage, etc.). You can use one of + the supported remote artifact store flavors to store your models as part of your + stack. For a smoother experience running Seldon Core with a cloud artifact store, + we also recommend configuring explicit credentials for the artifact store. The + Seldon Core model deployer knows how to automatically convert those credentials + in the format needed by Seldon Core model servers to authenticate to the storage + back-end where models are stored. - When specifying stack-component-specific settings, a key needs to be passed. This - key should always correspond to the pattern: . + Since the Seldon Model Deployer is interacting with the Seldon Core model server + deployed on a Kubernetes cluster, you need to provide a set of configuration parameters. + These parameters are:' + - 'S Secrets Manager accounts or regions may be used.Always make sure that the backup + Secrets Store is configured to use a different location than the primary Secrets + Store. The location can be different in terms of the Secrets Store back-end type + (e.g. internal database vs. AWS Secrets Manager) or the actual location of the + Secrets Store back-end (e.g. different AWS Secrets Manager account or region, + GCP Secret Manager project or Azure Key Vault''s vault). - For example, the SagemakerStepOperator supports passing in estimator_args. The - way to specify this would be to use the key step_operator.sagemaker + Using the same location for both the primary and backup Secrets Store will not + provide any additional benefits and may even result in unexpected behavior. 
- @step(step_operator="nameofstepoperator", settings= {"step_operator.sagemaker": - {"estimator_args": {"instance_type": "m7g.medium"}}}) + When a backup secrets store is in use, the ZenML Server will always attempt to + read and write secret values from/to the primary Secrets Store first while ensuring + to keep the backup Secrets Store in sync. If the primary Secrets Store is unreachable, + if the secret values are not found there or any otherwise unexpected error occurs, + the ZenML Server falls back to reading and writing from/to the backup Secrets + Store. Only if the backup Secrets Store is also unavailable, the ZenML Server + will return an error. - def my_step(): + In addition to the hidden backup operations, users can also explicitly trigger + a backup operation by using the zenml secret backup CLI command. This command + will attempt to read all secrets from the primary Secrets Store and write them + to the backup Secrets Store. Similarly, the zenml secret restore CLI command can + be used to restore secrets from the backup Secrets Store to the primary Secrets + Store. These CLI commands are useful for migrating secrets from one Secrets Store + to another. - ... + Secrets migration strategy - # Using the class + Sometimes you may need to change the external provider or location where secrets + values are stored by the Secrets Store. The immediate implication of this is that + the ZenML server will no longer be able to access existing secrets with the new + configuration until they are also manually copied to the new location. Some examples + of such changes include:' + - '🤔Deploying ZenML - @step(step_operator="nameofstepoperator", settings= {"step_operator.sagemaker": - SagemakerStepOperatorSettings(instance_type="m7g.medium")}) + Why do we need to deploy ZenML? - def my_step(): + Moving your ZenML Server to a production environment offers several benefits over + staying local: - ... + Scalability: Production environments are designed to handle large-scale workloads, + allowing your models to process more data and deliver faster results. - or in YAML: + Reliability: Production-grade infrastructure ensures high availability and fault + tolerance, minimizing downtime and ensuring consistent performance. - steps: + Collaboration: A shared production environment enables seamless collaboration + between team members, making it easier to iterate on models and share insights. - my_step:' - - '_operator + Despite these advantages, transitioning to production can be challenging due to + the complexities involved in setting up the needed infrastructure. - @step(step_operator=step_operator.name)def step_on_spark(...) -> ...: + ZenML Server - ... + When you first get started with ZenML, it relies with the following architecture + on your machine. - Additional configuration + The SQLite database that you can see in this diagram is used to store information + about pipelines, pipeline runs, stacks, and other configurations. Users can run + the zenml up command to spin up a local REST server to serve the dashboard. The + diagram for this looks as follows: - For additional configuration of the Spark step operator, you can pass SparkStepOperatorSettings - when defining or running your pipeline. Check out the SDK docs for a full list - of available attributes and this docs page for more information on how to specify - settings. + In Scenario 2, the zenml up command implicitly connects the client to the server. 
- PreviousAzureML + Currently the ZenML server supports a legacy and a brand-new version of the dashboard. + To use the legacy version simply use the following command zenml up --legacy - NextDevelop a Custom Step Operator + In order to move into production, the ZenML server needs to be deployed somewhere + centrally so that the different cloud stack components can read from and write + to the server. Additionally, this also allows all your team members to connect + to it and share stacks and pipelines. - Last updated 19 days ago' -- source_sentence: How can I register an Azure Service Connector for an ACR registry - in ZenML using the CLI? + Deploying a ZenML Server' +- source_sentence: What is the tenant_id value in the configuration section? sentences: - - 'ure Container Registry to the remote ACR registry.To set up the Azure Container - Registry to authenticate to Azure and access an ACR registry, it is recommended - to leverage the many features provided by the Azure Service Connector such as - auto-configuration, local login, best security practices regarding long-lived - credentials and reusing the same credentials across multiple stack components. - - - If you don''t already have an Azure Service Connector configured in your ZenML - deployment, you can register one using the interactive CLI command. You have the - option to configure an Azure Service Connector that can be used to access a ACR - registry or even more than one type of Azure resource: - - - zenml service-connector register --type azure -i - - - A non-interactive CLI example that uses Azure Service Principal credentials to - configure an Azure Service Connector targeting a single ACR registry is: - - - zenml service-connector register --type azure --auth-method service-principal - --tenant_id= --client_id= --client_secret= - --resource-type docker-registry --resource-id - - - Example Command Output - - - $ zenml service-connector register azure-demo --type azure --auth-method service-principal - --tenant_id=a79f3633-8f45-4a74-a42e-68871c17b7fb --client_id=8926254a-8c3f-430a-a2fd-bdab234d491e - --client_secret=AzureSuperSecret --resource-type docker-registry --resource-id - demozenmlcontainerregistry.azurecr.io - - - ⠸ Registering service connector ''azure-demo''... - - - Successfully registered service connector `azure-demo` with access to the following - resources: + - '─────────────────────────────────────────────────┨┃ OWNER │ default ┃ - ┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - - - ┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - - - ┠────────────────────┼───────────────────────────────────────┨ - - - ┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - - - ┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛' - - 'Default Container Registry - - - Storing container images locally. - - - The Default container registry is a container registry flavor that comes built-in - with ZenML and allows container registry URIs of any format. - - - When to use it - - - You should use the Default container registry if you want to use a local container - registry or when using a remote container registry that is not covered by other - container registry flavors. 
+ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - Local registry URI format + ┃ WORKSPACE │ default ┃ - To specify a URI for a local container registry, use the following format: + ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - localhost: + ┃ SHARED │ ➖ ┃ - # Examples: + ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - localhost:5000 + ┃ CREATED_AT │ 2023-06-20 19:16:26.802374 ┃ - localhost:8000 + ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - localhost:9999 + ┃ UPDATED_AT │ 2023-06-20 19:16:26.802378 ┃ - How to use it + ┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - To use the Default container registry, we need: + Configuration - Docker installed and running. + ┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - The registry URI. If you''re using a local container registry, check out + ┃ PROPERTY │ VALUE ┃ - the previous section on the URI format. + ┠───────────────┼──────────────────────────────────────┨ - We can then register the container registry and use it in our active stack: + ┃ tenant_id │ a79ff333-8f45-4a74-a42e-68871c17b7fb ┃ - zenml container-registry register \ + ┠───────────────┼──────────────────────────────────────┨ - --flavor=default \ + ┃ client_id │ 8926254a-8c3f-430a-a2fd-bdab234d491e ┃ - --uri= + ┠───────────────┼──────────────────────────────────────┨ - # Add the container registry to the active stack + ┃ client_secret │ [HIDDEN] ┃ - zenml stack update -c + ┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - You may also need to set up authentication required to log in to the container - registry. + Azure Access Token - Authentication Methods + Uses temporary Azure access tokens explicitly configured by the user or auto-configured + from a local environment.' + - ' should pick the one that best fits your use case.If you already have one or + more GCP Service Connectors configured in your ZenML deployment, you can check + which of them can be used to access generic GCP resources like the GCP Image Builder + required for your GCP Image Builder by running e.g.: - If you are using a private container registry, you will need to configure some - form of authentication to login to the registry. If you''re looking for a quick - way to get started locally, you can use the Local Authentication method. However, - the recommended way to authenticate to a remote private container registry is - through a Docker Service Connector. + zenml service-connector list-resources --resource-type gcp-generic - If your target private container registry comes from a cloud provider like AWS, - GCP or Azure, you should use the container registry flavor targeted at that cloud - provider. For example, if you''re using AWS, you should use the AWS Container - Registry flavor. These cloud provider flavors also use specialized cloud provider - Service Connectors to authenticate to the container registry.' 
- - 'egister gcp-demo-multi --type gcp --auto-configureExample Command Output + Example Command Output - ```text + The following ''gcp-generic'' resources can be accessed by service connectors + configured in your workspace: - Successfully registered service connector `gcp-demo-multi` with access to the - following resources: + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - ┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + ┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE + TYPE │ RESOURCE NAMES ┃ - ┃ RESOURCE TYPE │ RESOURCE NAMES ┃ + ┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────┼────────────────┨ - ┠───────────────────────┼─────────────────────────────────────────────────┨ + ┃ bfdb657d-d808-47e7-9974-9ba6e4919d83 │ gcp-generic │ 🔵 gcp │ 🔵 gcp-generic + │ zenml-core ┃ - ┃ 🔵 gcp-generic │ zenml-core ┃ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - ┠───────────────────────┼─────────────────────────────────────────────────┨ + After having set up or decided on a GCP Service Connector to use to authenticate + to GCP, you can register the GCP Image Builder as follows: - ┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ + zenml image-builder register \ - ┃ │ gs://zenml-core.appspot.com ┃ + --flavor=gcp \ - ┃ │ gs://zenml-core_cloudbuild ┃ + --cloud_builder_image= \ - ┃ │ gs://zenml-datasets ┃ + --network= \ - ┠───────────────────────┼─────────────────────────────────────────────────┨ + --build_timeout= - ┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ + # Connect the GCP Image Builder to GCP via a GCP Service Connector - ┠───────────────────────┼─────────────────────────────────────────────────┨ + zenml image-builder connect -i - ┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ + A non-interactive version that connects the GCP Image Builder to a target GCP + Service Connector: - ┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + zenml image-builder connect --connector - ``` + Example Command Output - **NOTE**: from this point forward, we don''t need the local GCP CLI credentials - or the local GCP CLI at all. The steps that follow can be run on any machine regardless - of whether it has been configured and authorized to access the GCP project. + $ zenml image-builder connect gcp-image-builder --connector gcp-generic - 4. find out which GCS buckets, GCR registries, and GKE Kubernetes clusters we - can gain access to. We''ll use this information to configure the Stack Components - in our minimal GCP stack: a GCS Artifact Store, a Kubernetes Orchestrator, and - a GCP Container Registry. 
+ Successfully connected image builder `gcp-image-builder` to the following resources: - ```sh + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓' + - 'gistry or even more than one type of AWS resource:zenml service-connector register + --type aws -i - zenml service-connector list-resources --resource-type gcs-bucket + A non-interactive CLI example that leverages the AWS CLI configuration on your + local machine to auto-configure an AWS Service Connector targeting an ECR registry + is: - ``` + zenml service-connector register --type aws --resource-type docker-registry + --auto-configure Example Command Output - ```text + $ zenml service-connector register aws-us-east-1 --type aws --resource-type docker-registry + --auto-configure - The following ''gcs-bucket'' resources can be accessed by service connectors configured - in your workspace:' -- source_sentence: What resources does the `gcp-demo-multi` service connector have - access to after registration? - sentences: - - 'Find out which configuration was used for a run + ⠸ Registering service connector ''aws-us-east-1''... - Sometimes you might want to extract the used configuration from a pipeline that - has already run. You can do this simply by loading the pipeline run and accessing - its config attribute. + Successfully registered service connector `aws-us-east-1` with access to the following + resources: - from zenml.client import Client + ┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - pipeline_run = Client().get_pipeline_run("") + ┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - configuration = pipeline_run.config + ┠────────────────────┼──────────────────────────────────────────────┨ - PreviousConfiguration hierarchy + ┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - NextAutogenerate a template yaml file + ┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - Last updated 15 days ago' - - 'onfig class and add your configuration parameters.Bring both the implementation - and the configuration together by inheriting from the BaseModelDeployerFlavor - class. Make sure that you give a name to the flavor through its abstract property. + Note: Please remember to grant the entity associated with your AWS credentials + permissions to read and write to one or more ECR repositories as well as to list + accessible ECR repositories. For a full list of permissions required to use an + AWS Service Connector to access an ECR registry, please refer to the AWS Service + Connector ECR registry resource type documentation or read the documentation available + in the interactive CLI commands and dashboard. The AWS Service Connector supports + many different authentication methods with different levels of security and convenience. + You should pick the one that best fits your use case. - Create a service class that inherits from the BaseService class and implements - the abstract methods. This class will be used to represent the deployed model - server in ZenML. + If you already have one or more AWS Service Connectors configured in your ZenML + deployment, you can check which of them can be used to access the ECR registry + you want to use for your AWS Container Registry by running e.g.: - Once you are done with the implementation, you can register it through the CLI. 
- Please ensure you point to the flavor class via dot notation:

- zenml model-deployer flavor register

- For example, if your flavor class MyModelDeployerFlavor is defined in flavors/my_flavor.py,
- you''d register it by doing:

- zenml model-deployer flavor register flavors.my_flavor.MyModelDeployerFlavor

- ZenML resolves the flavor class by taking the path where you initialized zenml
- (via zenml init) as the starting point of resolution. Therefore, please ensure
- you follow the best practice of initializing zenml at the root of your repository.

- If ZenML does not find an initialized ZenML repository in any parent directory,
- it will default to the current working directory, but usually, it''s better to
- not have to rely on this mechanism and initialize zenml at the root.

- Afterward, you should see the new flavor in the list of available flavors:

- zenml model-deployer flavor list

- It is important to draw attention to when and how these base abstractions are
- coming into play in a ZenML workflow.

- The CustomModelDeployerFlavor class is imported and utilized upon the creation
- of the custom flavor through the CLI.

- The CustomModelDeployerConfig class is imported when someone tries to register/update
- a stack component with this custom flavor. Especially, during the registration
- process of the stack component, the config will be used to validate the values
- given by the user. As Config objects are inherently pydantic objects, you can
- also add your own custom validators here.'
- - 'egister gcp-demo-multi --type gcp --auto-configureExample Command Output

- ```text

- Successfully registered service connector `gcp-demo-multi` with access to the
- following resources:

- ┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓

- ┃ RESOURCE TYPE │ RESOURCE NAMES ┃

- ┠───────────────────────┼─────────────────────────────────────────────────┨

- ┃ 🔵 gcp-generic │ zenml-core ┃

- ┠───────────────────────┼─────────────────────────────────────────────────┨

- ┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃

- ┃ │ gs://zenml-core.appspot.com ┃

- ┃ │ gs://zenml-core_cloudbuild ┃

- ┃ │ gs://zenml-datasets ┃

- ┠───────────────────────┼─────────────────────────────────────────────────┨

- ┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃

- ┠───────────────────────┼─────────────────────────────────────────────────┨

- ┃ 🐳 docker-registry │ gcr.io/zenml-core ┃

- ┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

- ```

- **NOTE**: from this point forward, we don''t need the local GCP CLI credentials
- or the local GCP CLI at all. The steps that follow can be run on any machine regardless
- of whether it has been configured and authorized to access the GCP project.

- 4. find out which GCS buckets, GCR registries, and GKE Kubernetes clusters we
- can gain access to. We''ll use this information to configure the Stack Components
- in our minimal GCP stack: a GCS Artifact Store, a Kubernetes Orchestrator, and
- a GCP Container Registry.

- ```sh

- zenml service-connector list-resources --resource-type gcs-bucket

- ```

- ```text

- The following ''gcs-bucket'' resources can be accessed by service connectors configured
- in your workspace:'
+- source_sentence: How can I customize the Docker settings for individual steps in
+  a ZenML pipeline?
+  sentences:
+  - '🌎Environment Variables

+ How to control ZenML behavior with environmental variables.

+ There are a few pre-defined environmental variables that can be used to control
+ the behavior of ZenML. See the list below with default values and options:

+ Logging verbosity

+ export ZENML_LOGGING_VERBOSITY=INFO

+ Choose from INFO, WARN, ERROR, CRITICAL, DEBUG.

+ Disable step logs

+ Usually, ZenML stores step logs in the artifact store, but this can sometimes
+ cause performance bottlenecks, especially if the code utilizes progress bars.

+ If you want to configure whether logged output from steps is stored or not, set
+ the ZENML_DISABLE_STEP_LOGS_STORAGE environment variable to true. Note that this
+ will mean that logs from your steps will no longer be stored and thus won''t be
+ visible on the dashboard anymore.

+ export ZENML_DISABLE_STEP_LOGS_STORAGE=false

+ ZenML repository path

+ To configure where ZenML will install and look for its repository, set the environment
+ variable ZENML_REPOSITORY_PATH.

+ export ZENML_REPOSITORY_PATH=/path/to/somewhere

+ Analytics

+ Please see our full page on what analytics are tracked and how you can opt out,
+ but the quick summary is that you can set this to false if you want to opt out
+ of analytics. 
- ┃ 🔵 gcp-generic │ zenml-core ┃ + export ZENML_ANALYTICS_OPT_IN=false - ┠───────────────────────┼─────────────────────────────────────────────────┨ + Debug mode - ┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ + Setting to true switches to developer mode: - ┃ │ gs://zenml-core.appspot.com ┃ + export ZENML_DEBUG=true - ┃ │ gs://zenml-core_cloudbuild ┃ + Active stack - ┃ │ gs://zenml-datasets ┃ + Setting the ZENML_ACTIVE_STACK_ID to a specific UUID will make the corresponding + stack the active stack: - ┠───────────────────────┼─────────────────────────────────────────────────┨ + export ZENML_ACTIVE_STACK_ID= - ┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ + Prevent pipeline execution - ┠───────────────────────┼──────���──────────────────────────────────────────┨ + When true, this prevents a pipeline from executing: - ┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ + export ZENML_PREVENT_PIPELINE_EXECUTION=false - ┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + Disable rich traceback - ``` + Set to false to disable the rich traceback: - **NOTE**: from this point forward, we don''t need the local GCP CLI credentials - or the local GCP CLI at all. The steps that follow can be run on any machine regardless - of whether it has been configured and authorized to access the GCP project. + export ZENML_ENABLE_RICH_TRACEBACK=true - 4. find out which GCS buckets, GCR registries, and GKE Kubernetes clusters we - can gain access to. We''ll use this information to configure the Stack Components - in our minimal GCP stack: a GCS Artifact Store, a Kubernetes Orchestrator, and - a GCP Container Registry. + Disable colourful logging - ```sh + If you wish to disable colourful logging, set the following environment variable: - zenml service-connector list-resources --resource-type gcs-bucket + ZENML_LOGGING_COLORS_DISABLED=true' + - 'pd.Series(model.predict(data)) - ``` + return predictionsHowever, this approach has the downside that if the step is + cached, then it could lead to unexpected results. You could simply disable the + cache in the above step or the corresponding pipeline. However, one other way + of achieving this would be to resolve the artifact at the pipeline level: - Example Command Output + from typing_extensions import Annotated - ```text + from zenml import get_pipeline_context, pipeline, Model - The following ''gcs-bucket'' resources can be accessed by service connectors configured - in your workspace:' -- source_sentence: What is the result of executing a Deepchecks test suite in ZenML? - sentences: - - 'urns: + from zenml.enums import ModelStages - Deepchecks test suite execution result + import pandas as pd - """# validation pre-processing (e.g. dataset preparation) can take place here + from sklearn.base import ClassifierMixin - data_validator = DeepchecksDataValidator.get_active_data_validator() + @step - suite = data_validator.data_validation( + def predict( - dataset=dataset, + model: ClassifierMixin, - check_list=[ + data: pd.DataFrame, - DeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION, + ) -> Annotated[pd.Series, "predictions"]: - DeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS, + predictions = pd.Series(model.predict(data)) - ], + return predictions - # validation post-processing (e.g. interpret results, take actions) can happen - here + @pipeline( - return suite + model=Model( - The arguments that the Deepchecks Data Validator methods can take in are the same - as those used for the Deepchecks standard steps. 
+ name="iris_classifier", - Have a look at the complete list of methods and parameters available in the DeepchecksDataValidator - API in the SDK docs. + # Using the production stage - Call Deepchecks directly + version=ModelStages.PRODUCTION, - You can use the Deepchecks library directly in your custom pipeline steps, and - only leverage ZenML''s capability of serializing, versioning and storing the SuiteResult - objects in its Artifact Store, e.g.: + ), - import pandas as pd + def do_predictions(): - import deepchecks.tabular.checks as tabular_checks + # model name and version are derived from pipeline context - from deepchecks.core.suite import SuiteResult + model = get_pipeline_context().model - from deepchecks.tabular import Suite + inference_data = load_data() - from deepchecks.tabular import Dataset + predict( - from zenml import step + # Here, we load in the `trained_model` from a trainer step - @step + model=model.get_model_artifact("trained_model"), - def data_integrity_check( + data=inference_data, - dataset: pd.DataFrame, + if __name__ == "__main__": - ) -> SuiteResult: + do_predictions() - """Custom data integrity check step with Deepchecks + Ultimately, both approaches are fine. You should decide which one to use based + on your own preferences. - Args: + PreviousLoad artifacts into memory - dataset: a Pandas DataFrame + NextVisualizing artifacts - Returns: + Last updated 15 days ago' + - 'Docker settings on a step - Deepchecks test suite execution result + You have the option to customize the Docker settings at a step level. - """ + By default every step of a pipeline uses the same Docker image that is defined + at the pipeline level. Sometimes your steps will have special requirements that + make it necessary to define a different Docker image for one or many steps. This + can easily be accomplished by adding the DockerSettings to the step decorator + directly. - # validation pre-processing (e.g. dataset preparation) can take place here + from zenml import step - train_dataset = Dataset( + from zenml.config import DockerSettings - dataset, + @step( - label=''class'', + settings={ - cat_features=[''country'', ''state''] + "docker": DockerSettings( - suite = Suite(name="custom") + parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime" - check = tabular_checks.OutlierSampleDetection( + def training(...): - nearest_neighbors_percent=0.01, + ... - extent_parameter=3, + Alternatively, this can also be done within the configuration file. 
- check.add_condition_outlier_ratio_less_or_equal( + steps: - max_outliers_ratio=0.007, + training: - outlier_score_threshold=0.5, + settings: - suite.add(check) + docker: - check = tabular_checks.StringLengthOutOfBounds( + parent_image: pytorch/pytorch:2.2.0-cuda11.8-cudnn8-runtime - num_percentiles=1000, + required_integrations: - min_unique_values=3, + gcp - check.add_condition_number_of_outliers_less_or_equal( + github - max_outliers=3,' - - 'ervice-principal + requirements: - ``` + zenml # Make sure to include ZenML for other parent images - Example Command Output + numpy - ```Successfully connected orchestrator `aks-demo-cluster` to the following resources: + PreviousDocker settings on a pipeline - ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + NextSpecify pip dependencies and apt packages - ┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE - │ RESOURCE TYPE │ RESOURCE NAMES ┃ + Last updated 19 days ago' +- source_sentence: How do I configure the Kubernetes Service Connector to connect + ZenML to Kubernetes clusters? + sentences: + - 'Kubernetes Service Connector - ┠──────────────────────────────────────┼─────────────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────┨ + Configuring Kubernetes Service Connectors to connect ZenML to Kubernetes clusters. - ┃ f2316191-d20b-4348-a68b-f5e347862196 │ azure-service-principal │ 🇦 azure │ - 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ + The ZenML Kubernetes service connector facilitates authenticating and connecting + to a Kubernetes cluster. The connector can be used to access to any generic Kubernetes + cluster by providing pre-authenticated Kubernetes python clients to Stack Components + that are linked to it and also allows configuring the local Kubernetes CLI (i.e. + kubectl). - ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + Prerequisites - ``` + The Kubernetes Service Connector is part of the Kubernetes ZenML integration. + You can either install the entire integration or use a pypi extra to install it + independently of the integration: - Register and connect an Azure Container Registry Stack Component to an ACR container - registry:Copyzenml container-registry register acr-demo-registry --flavor azure - --uri=demozenmlcontainerregistry.azurecr.io + pip install "zenml[connectors-kubernetes]" installs only prerequisites for the + Kubernetes Service Connector Type - Example Command Output + zenml integration install kubernetes installs the entire Kubernetes ZenML integration - ``` + A local Kubernetes CLI (i.e. kubectl ) and setting up local kubectl configuration + contexts is not required to access Kubernetes clusters in your Stack Components + through the Kubernetes Service Connector. - Successfully registered container_registry `acr-demo-registry`. 
+ $ zenml service-connector list-types --type kubernetes - ``` + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - ```sh + ┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH + METHODS │ LOCAL │ REMOTE ┃ - zenml container-registry connect acr-demo-registry --connector azure-service-principal + ┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────┼───────┼────────┨ - ``` + ┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ + ✅ │ ✅ ┃ - Example Command Output + ┃ │ │ │ token │ │ ┃ - ``` + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ - Successfully connected container registry `acr-demo-registry` to the following - resources: - + Resource Types - ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + The Kubernetes Service Connector only supports authenticating to and granting + access to a generic Kubernetes cluster. This type of resource is identified by + the kubernetes-cluster Resource Type.' + - 'to the container registry. - ┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE - │ RESOURCE TYPE │ RESOURCE NAMES ┃ + Authentication MethodsIntegrating and using an Azure Container Registry in your + pipelines is not possible without employing some form of authentication. If you''re + looking for a quick way to get started locally, you can use the Local Authentication + method. However, the recommended way to authenticate to the Azure cloud platform + is through an Azure Service Connector. This is particularly useful if you are + configuring ZenML stacks that combine the Azure Container Registry with other + remote stack components also running in Azure. - ┠──────────────────────────────────────┼─────────────────────────┼────────────────┼────────────────────┼───────────────────────────────────────┨ + This method uses the Docker client authentication available in the environment + where the ZenML code is running. On your local machine, this is the quickest way + to configure an Azure Container Registry. You don''t need to supply credentials + explicitly when you register the Azure Container Registry, as it leverages the + local credentials and configuration that the Azure CLI and Docker client store + on your local machine. However, you will need to install and set up the Azure + CLI on your machine as a prerequisite, as covered in the Azure CLI documentation, + before you register the Azure Container Registry. - ┃ f2316191-d20b-4348-a68b-f5e347862196 │ azure-service-principal │ 🇦 azure │ - 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃' - - 'r │ zenhacks-cluster ┃┠───────────────────────┼──────────────────────────────────────────────┨ + With the Azure CLI installed and set up with credentials, you need to login to + the container registry so Docker can pull and push images: - ┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ + # Fill your REGISTRY_NAME in the placeholder in the following command. 
- ┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + # You can find the REGISTRY_NAME as part of your registry URI: `.azurecr.io` - The Service Connector configuration shows long-lived credentials were lifted from - the local environment and the AWS Session Token authentication method was configured: + az acr login --name= - zenml service-connector describe aws-session-token - - Example Command Output + Stacks using the Azure Container Registry set up with local authentication are + not portable across environments. To make ZenML pipelines fully portable, it is + recommended to use an Azure Service Connector to link your Azure Container Registry + to the remote ACR registry.' + - 'he Post-execution workflow has changed as follows:The get_pipelines and get_pipeline + methods have been moved out of the Repository (i.e. the new Client ) class and + lie directly in the post_execution module now. To use the user has to do: - Service connector ''aws-session-token'' of type ''aws'' with id ''3ae3e595-5cbc-446e-be64-e54e854e0e3f'' - is owned by user ''default'' and is ''private''. + from zenml.post_execution import get_pipelines, get_pipeline - ''aws-session-token'' aws Service Connector Details + New methods to directly get a run have been introduced: get_run and get_unlisted_runs + method has been introduced to get unlisted runs. - ┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + Usage remains largely similar. Please read the new docs for post-execution to + inform yourself of what further has changed. - ┃ PROPERTY │ VALUE ┃ + How to migrate: Replace all post-execution workflows from the paradigm of Repository.get_pipelines + or Repository.get_pipeline_run to the corresponding post_execution methods. - ┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ + 📡Future Changes - ┃ ID │ c0f8e857-47f9-418b-a60f-c3b03023da54 ┃ + While this rehaul is big and will break previous releases, we do have some more + work left to do. However we also expect this to be the last big rehaul of ZenML + before our 1.0.0 release, and no other release will be so hard breaking as this + one. Currently planned future breaking changes are: - ┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ + Following the metadata store, the secrets manager stack component might move out + of the stack. - ┃ NAME │ aws-session-token ┃ + ZenML StepContext might be deprecated. - ┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ + 🐞 Reporting Bugs - ┃ TYPE │ 🔶 aws ┃ + While we have tried our best to document everything that has changed, we realize + that mistakes can be made and smaller changes overlooked. If this is the case, + or you encounter a bug at any time, the ZenML core team and community are available + around the clock on the growing Slack community. - ┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ + For bug reports, please also consider submitting a GitHub Issue. - ┃ AUTH METHOD │ session-token ┃ + Lastly, if the new changes have left you desiring a feature, then consider adding + it to our public feature voting board. Before doing so, do check what is already + on there and consider upvoting the features you desire the most. 
- ┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ + PreviousMigration guide - ┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry - ┃ + NextMigration guide 0.23.0 → 0.30.0 - ┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨' + Last updated 12 days ago' model-index: - name: zenml/finetuned-snowflake-arctic-embed-m results: @@ -1058,46 +1003,46 @@ model-index: value: 0.3614457831325301 name: Cosine Accuracy@1 - type: cosine_accuracy@3 - value: 0.6987951807228916 + value: 0.6024096385542169 name: Cosine Accuracy@3 - type: cosine_accuracy@5 - value: 0.7530120481927711 + value: 0.6987951807228916 name: Cosine Accuracy@5 - type: cosine_accuracy@10 - value: 0.8554216867469879 + value: 0.7831325301204819 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.3614457831325301 name: Cosine Precision@1 - type: cosine_precision@3 - value: 0.23293172690763048 + value: 0.2008032128514056 name: Cosine Precision@3 - type: cosine_precision@5 - value: 0.15060240963855417 + value: 0.1397590361445783 name: Cosine Precision@5 - type: cosine_precision@10 - value: 0.08554216867469877 + value: 0.07831325301204817 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.3614457831325301 name: Cosine Recall@1 - type: cosine_recall@3 - value: 0.6987951807228916 + value: 0.6024096385542169 name: Cosine Recall@3 - type: cosine_recall@5 - value: 0.7530120481927711 + value: 0.6987951807228916 name: Cosine Recall@5 - type: cosine_recall@10 - value: 0.8554216867469879 + value: 0.7831325301204819 name: Cosine Recall@10 - type: cosine_ndcg@10 - value: 0.6194049451779184 + value: 0.5756072832948543 name: Cosine Ndcg@10 - type: cosine_mrr@10 - value: 0.5427878179384205 + value: 0.5091365461847391 name: Cosine Mrr@10 - type: cosine_map@100 - value: 0.5472907234693755 + value: 0.5165480061197206 name: Cosine Map@100 - task: type: information-retrieval @@ -1107,49 +1052,49 @@ model-index: type: dim_256 metrics: - type: cosine_accuracy@1 - value: 0.3433734939759036 + value: 0.3674698795180723 name: Cosine Accuracy@1 - type: cosine_accuracy@3 - value: 0.6807228915662651 + value: 0.6144578313253012 name: Cosine Accuracy@3 - type: cosine_accuracy@5 - value: 0.7650602409638554 + value: 0.6987951807228916 name: Cosine Accuracy@5 - type: cosine_accuracy@10 - value: 0.8373493975903614 + value: 0.7710843373493976 name: Cosine Accuracy@10 - type: cosine_precision@1 - value: 0.3433734939759036 + value: 0.3674698795180723 name: Cosine Precision@1 - type: cosine_precision@3 - value: 0.2269076305220883 + value: 0.2048192771084337 name: Cosine Precision@3 - type: cosine_precision@5 - value: 0.15301204819277103 + value: 0.1397590361445783 name: Cosine Precision@5 - type: cosine_precision@10 - value: 0.08373493975903612 + value: 0.07710843373493974 name: Cosine Precision@10 - type: cosine_recall@1 - value: 0.3433734939759036 + value: 0.3674698795180723 name: Cosine Recall@1 - type: cosine_recall@3 - value: 0.6807228915662651 + value: 0.6144578313253012 name: Cosine Recall@3 - type: cosine_recall@5 - value: 0.7650602409638554 + value: 0.6987951807228916 name: Cosine Recall@5 - type: cosine_recall@10 - value: 0.8373493975903614 + value: 0.7710843373493976 name: Cosine Recall@10 - type: cosine_ndcg@10 - value: 0.602546157610675 + value: 0.5732430988480587 name: Cosine Ndcg@10 - type: cosine_mrr@10 - value: 0.525891661885638 + value: 0.509569229298145 name: Cosine Mrr@10 - type: cosine_map@100 - value: 
0.5310273317942533 + value: 0.5167702755195493 name: Cosine Map@100 - task: type: information-retrieval @@ -1159,49 +1104,49 @@ model-index: type: dim_128 metrics: - type: cosine_accuracy@1 - value: 0.3132530120481928 + value: 0.29518072289156627 name: Cosine Accuracy@1 - type: cosine_accuracy@3 - value: 0.6265060240963856 + value: 0.5542168674698795 name: Cosine Accuracy@3 - type: cosine_accuracy@5 - value: 0.7168674698795181 + value: 0.6506024096385542 name: Cosine Accuracy@5 - type: cosine_accuracy@10 - value: 0.7891566265060241 + value: 0.7469879518072289 name: Cosine Accuracy@10 - type: cosine_precision@1 - value: 0.3132530120481928 + value: 0.29518072289156627 name: Cosine Precision@1 - type: cosine_precision@3 - value: 0.20883534136546178 + value: 0.18473895582329317 name: Cosine Precision@3 - type: cosine_precision@5 - value: 0.1433734939759036 + value: 0.1301204819277108 name: Cosine Precision@5 - type: cosine_precision@10 - value: 0.0789156626506024 + value: 0.07469879518072288 name: Cosine Precision@10 - type: cosine_recall@1 - value: 0.3132530120481928 + value: 0.29518072289156627 name: Cosine Recall@1 - type: cosine_recall@3 - value: 0.6265060240963856 + value: 0.5542168674698795 name: Cosine Recall@3 - type: cosine_recall@5 - value: 0.7168674698795181 + value: 0.6506024096385542 name: Cosine Recall@5 - type: cosine_recall@10 - value: 0.7891566265060241 + value: 0.7469879518072289 name: Cosine Recall@10 - type: cosine_ndcg@10 - value: 0.5630057581169484 + value: 0.5199227959343978 name: Cosine Ndcg@10 - type: cosine_mrr@10 - value: 0.4893144004589788 + value: 0.44722939376553855 name: Cosine Mrr@10 - type: cosine_map@100 - value: 0.4960510164414996 + value: 0.4541483656933914 name: Cosine Map@100 - task: type: information-retrieval @@ -1211,49 +1156,49 @@ model-index: type: dim_64 metrics: - type: cosine_accuracy@1 - value: 0.25903614457831325 + value: 0.28313253012048195 name: Cosine Accuracy@1 - type: cosine_accuracy@3 - value: 0.5120481927710844 + value: 0.5180722891566265 name: Cosine Accuracy@3 - type: cosine_accuracy@5 - value: 0.6325301204819277 + value: 0.5843373493975904 name: Cosine Accuracy@5 - type: cosine_accuracy@10 - value: 0.7168674698795181 + value: 0.6746987951807228 name: Cosine Accuracy@10 - type: cosine_precision@1 - value: 0.25903614457831325 + value: 0.28313253012048195 name: Cosine Precision@1 - type: cosine_precision@3 - value: 0.17068273092369476 + value: 0.17269076305220882 name: Cosine Precision@3 - type: cosine_precision@5 - value: 0.12650602409638553 + value: 0.11686746987951806 name: Cosine Precision@5 - type: cosine_precision@10 - value: 0.07168674698795179 + value: 0.06746987951807228 name: Cosine Precision@10 - type: cosine_recall@1 - value: 0.25903614457831325 + value: 0.28313253012048195 name: Cosine Recall@1 - type: cosine_recall@3 - value: 0.5120481927710844 + value: 0.5180722891566265 name: Cosine Recall@3 - type: cosine_recall@5 - value: 0.6325301204819277 + value: 0.5843373493975904 name: Cosine Recall@5 - type: cosine_recall@10 - value: 0.7168674698795181 + value: 0.6746987951807228 name: Cosine Recall@10 - type: cosine_ndcg@10 - value: 0.48618223058871674 + value: 0.47987356927913916 name: Cosine Ndcg@10 - type: cosine_mrr@10 - value: 0.41233027347485207 + value: 0.4177519602218399 name: Cosine Mrr@10 - type: cosine_map@100 - value: 0.42094598177412385 + value: 0.4261749847732839 name: Cosine Map@100 --- @@ -1307,9 +1252,9 @@ from sentence_transformers import SentenceTransformer model = 
SentenceTransformer("zenml/finetuned-snowflake-arctic-embed-m") # Run inference sentences = [ - 'What is the result of executing a Deepchecks test suite in ZenML?', - 'urns:\n\nDeepchecks test suite execution result\n\n"""# validation pre-processing (e.g. dataset preparation) can take place here\n\ndata_validator = DeepchecksDataValidator.get_active_data_validator()\n\nsuite = data_validator.data_validation(\n\ndataset=dataset,\n\ncheck_list=[\n\nDeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION,\n\nDeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS,\n\n],\n\n# validation post-processing (e.g. interpret results, take actions) can happen here\n\nreturn suite\n\nThe arguments that the Deepchecks Data Validator methods can take in are the same as those used for the Deepchecks standard steps.\n\nHave a look at the complete list of methods and parameters available in the DeepchecksDataValidator API in the SDK docs.\n\nCall Deepchecks directly\n\nYou can use the Deepchecks library directly in your custom pipeline steps, and only leverage ZenML\'s capability of serializing, versioning and storing the SuiteResult objects in its Artifact Store, e.g.:\n\nimport pandas as pd\n\nimport deepchecks.tabular.checks as tabular_checks\n\nfrom deepchecks.core.suite import SuiteResult\n\nfrom deepchecks.tabular import Suite\n\nfrom deepchecks.tabular import Dataset\n\nfrom zenml import step\n\n@step\n\ndef data_integrity_check(\n\ndataset: pd.DataFrame,\n\n) -> SuiteResult:\n\n"""Custom data integrity check step with Deepchecks\n\nArgs:\n\ndataset: a Pandas DataFrame\n\nReturns:\n\nDeepchecks test suite execution result\n\n"""\n\n# validation pre-processing (e.g. dataset preparation) can take place here\n\ntrain_dataset = Dataset(\n\ndataset,\n\nlabel=\'class\',\n\ncat_features=[\'country\', \'state\']\n\nsuite = Suite(name="custom")\n\ncheck = tabular_checks.OutlierSampleDetection(\n\nnearest_neighbors_percent=0.01,\n\nextent_parameter=3,\n\ncheck.add_condition_outlier_ratio_less_or_equal(\n\nmax_outliers_ratio=0.007,\n\noutlier_score_threshold=0.5,\n\nsuite.add(check)\n\ncheck = tabular_checks.StringLengthOutOfBounds(\n\nnum_percentiles=1000,\n\nmin_unique_values=3,\n\ncheck.add_condition_number_of_outliers_less_or_equal(\n\nmax_outliers=3,', - "r │ zenhacks-cluster ┃┠───────────────────────┼──────────────────────────────────────────────┨\n\n┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃\n\n┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛\n\nThe Service Connector configuration shows long-lived credentials were lifted from the local environment and the AWS Session Token authentication method was configured:\n\nzenml service-connector describe aws-session-token\n\nExample Command Output\n\nService connector 'aws-session-token' of type 'aws' with id '3ae3e595-5cbc-446e-be64-e54e854e0e3f' is owned by user 'default' and is 'private'.\n\n'aws-session-token' aws Service Connector Details\n\n┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n\n┃ PROPERTY │ VALUE ┃\n\n┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨\n\n┃ ID │ c0f8e857-47f9-418b-a60f-c3b03023da54 ┃\n\n┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨\n\n┃ NAME │ aws-session-token ┃\n\n┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨\n\n┃ TYPE │ 🔶 aws 
┃\n\n┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨\n\n┃ AUTH METHOD │ session-token ┃\n\n┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨\n\n┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃\n\n┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨", + 'How do I configure the Kubernetes Service Connector to connect ZenML to Kubernetes clusters?', + 'Kubernetes Service Connector\n\nConfiguring Kubernetes Service Connectors to connect ZenML to Kubernetes clusters.\n\nThe ZenML Kubernetes service connector facilitates authenticating and connecting to a Kubernetes cluster. The connector can be used to access to any generic Kubernetes cluster by providing pre-authenticated Kubernetes python clients to Stack Components that are linked to it and also allows configuring the local Kubernetes CLI (i.e. kubectl).\n\nPrerequisites\n\nThe Kubernetes Service Connector is part of the Kubernetes ZenML integration. You can either install the entire integration or use a pypi extra to install it independently of the integration:\n\npip install "zenml[connectors-kubernetes]" installs only prerequisites for the Kubernetes Service Connector Type\n\nzenml integration install kubernetes installs the entire Kubernetes ZenML integration\n\nA local Kubernetes CLI (i.e. kubectl ) and setting up local kubectl configuration contexts is not required to access Kubernetes clusters in your Stack Components through the Kubernetes Service Connector.\n\n$ zenml service-connector list-types --type kubernetes\n\n┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓\n\n┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃\n\n┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────┼───────┼────────┨\n\n┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ ✅ │ ✅ ┃\n\n┃ │ │ │ token │ │ ┃\n\n┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛\n\nResource Types\n\nThe Kubernetes Service Connector only supports authenticating to and granting access to a generic Kubernetes cluster. This type of resource is identified by the kubernetes-cluster Resource Type.', + 'he Post-execution workflow has changed as follows:The get_pipelines and get_pipeline methods have been moved out of the Repository (i.e. the new Client ) class and lie directly in the post_execution module now. To use the user has to do:\n\nfrom zenml.post_execution import get_pipelines, get_pipeline\n\nNew methods to directly get a run have been introduced: get_run and get_unlisted_runs method has been introduced to get unlisted runs.\n\nUsage remains largely similar. Please read the new docs for post-execution to inform yourself of what further has changed.\n\nHow to migrate: Replace all post-execution workflows from the paradigm of Repository.get_pipelines or Repository.get_pipeline_run to the corresponding post_execution methods.\n\n📡Future Changes\n\nWhile this rehaul is big and will break previous releases, we do have some more work left to do. However we also expect this to be the last big rehaul of ZenML before our 1.0.0 release, and no other release will be so hard breaking as this one. 
Currently planned future breaking changes are:\n\nFollowing the metadata store, the secrets manager stack component might move out of the stack.\n\nZenML StepContext might be deprecated.\n\n🐞 Reporting Bugs\n\nWhile we have tried our best to document everything that has changed, we realize that mistakes can be made and smaller changes overlooked. If this is the case, or you encounter a bug at any time, the ZenML core team and community are available around the clock on the growing Slack community.\n\nFor bug reports, please also consider submitting a GitHub Issue.\n\nLastly, if the new changes have left you desiring a feature, then consider adding it to our public feature voting board. Before doing so, do check what is already on there and consider upvoting the features you desire the most.\n\nPreviousMigration guide\n\nNextMigration guide 0.23.0 → 0.30.0\n\nLast updated 12 days ago', ] embeddings = model.encode(sentences) print(embeddings.shape) @@ -1356,42 +1301,42 @@ You can finetune this model on your own dataset. | Metric | Value | |:--------------------|:-----------| | cosine_accuracy@1 | 0.3614 | -| cosine_accuracy@3 | 0.6988 | -| cosine_accuracy@5 | 0.753 | -| cosine_accuracy@10 | 0.8554 | +| cosine_accuracy@3 | 0.6024 | +| cosine_accuracy@5 | 0.6988 | +| cosine_accuracy@10 | 0.7831 | | cosine_precision@1 | 0.3614 | -| cosine_precision@3 | 0.2329 | -| cosine_precision@5 | 0.1506 | -| cosine_precision@10 | 0.0855 | +| cosine_precision@3 | 0.2008 | +| cosine_precision@5 | 0.1398 | +| cosine_precision@10 | 0.0783 | | cosine_recall@1 | 0.3614 | -| cosine_recall@3 | 0.6988 | -| cosine_recall@5 | 0.753 | -| cosine_recall@10 | 0.8554 | -| cosine_ndcg@10 | 0.6194 | -| cosine_mrr@10 | 0.5428 | -| **cosine_map@100** | **0.5473** | +| cosine_recall@3 | 0.6024 | +| cosine_recall@5 | 0.6988 | +| cosine_recall@10 | 0.7831 | +| cosine_ndcg@10 | 0.5756 | +| cosine_mrr@10 | 0.5091 | +| **cosine_map@100** | **0.5165** | #### Information Retrieval * Dataset: `dim_256` * Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) -| Metric | Value | -|:--------------------|:----------| -| cosine_accuracy@1 | 0.3434 | -| cosine_accuracy@3 | 0.6807 | -| cosine_accuracy@5 | 0.7651 | -| cosine_accuracy@10 | 0.8373 | -| cosine_precision@1 | 0.3434 | -| cosine_precision@3 | 0.2269 | -| cosine_precision@5 | 0.153 | -| cosine_precision@10 | 0.0837 | -| cosine_recall@1 | 0.3434 | -| cosine_recall@3 | 0.6807 | -| cosine_recall@5 | 0.7651 | -| cosine_recall@10 | 0.8373 | -| cosine_ndcg@10 | 0.6025 | -| cosine_mrr@10 | 0.5259 | -| **cosine_map@100** | **0.531** | +| Metric | Value | +|:--------------------|:-----------| +| cosine_accuracy@1 | 0.3675 | +| cosine_accuracy@3 | 0.6145 | +| cosine_accuracy@5 | 0.6988 | +| cosine_accuracy@10 | 0.7711 | +| cosine_precision@1 | 0.3675 | +| cosine_precision@3 | 0.2048 | +| cosine_precision@5 | 0.1398 | +| cosine_precision@10 | 0.0771 | +| cosine_recall@1 | 0.3675 | +| cosine_recall@3 | 0.6145 | +| cosine_recall@5 | 0.6988 | +| cosine_recall@10 | 0.7711 | +| cosine_ndcg@10 | 0.5732 | +| cosine_mrr@10 | 0.5096 | +| **cosine_map@100** | **0.5168** | #### Information Retrieval * Dataset: `dim_128` @@ -1399,21 +1344,21 @@ You can finetune this model on your own dataset. 
| Metric | Value | |:--------------------|:-----------| -| cosine_accuracy@1 | 0.3133 | -| cosine_accuracy@3 | 0.6265 | -| cosine_accuracy@5 | 0.7169 | -| cosine_accuracy@10 | 0.7892 | -| cosine_precision@1 | 0.3133 | -| cosine_precision@3 | 0.2088 | -| cosine_precision@5 | 0.1434 | -| cosine_precision@10 | 0.0789 | -| cosine_recall@1 | 0.3133 | -| cosine_recall@3 | 0.6265 | -| cosine_recall@5 | 0.7169 | -| cosine_recall@10 | 0.7892 | -| cosine_ndcg@10 | 0.563 | -| cosine_mrr@10 | 0.4893 | -| **cosine_map@100** | **0.4961** | +| cosine_accuracy@1 | 0.2952 | +| cosine_accuracy@3 | 0.5542 | +| cosine_accuracy@5 | 0.6506 | +| cosine_accuracy@10 | 0.747 | +| cosine_precision@1 | 0.2952 | +| cosine_precision@3 | 0.1847 | +| cosine_precision@5 | 0.1301 | +| cosine_precision@10 | 0.0747 | +| cosine_recall@1 | 0.2952 | +| cosine_recall@3 | 0.5542 | +| cosine_recall@5 | 0.6506 | +| cosine_recall@10 | 0.747 | +| cosine_ndcg@10 | 0.5199 | +| cosine_mrr@10 | 0.4472 | +| **cosine_map@100** | **0.4541** | #### Information Retrieval * Dataset: `dim_64` @@ -1421,21 +1366,21 @@ You can finetune this model on your own dataset. | Metric | Value | |:--------------------|:-----------| -| cosine_accuracy@1 | 0.259 | -| cosine_accuracy@3 | 0.512 | -| cosine_accuracy@5 | 0.6325 | -| cosine_accuracy@10 | 0.7169 | -| cosine_precision@1 | 0.259 | -| cosine_precision@3 | 0.1707 | -| cosine_precision@5 | 0.1265 | -| cosine_precision@10 | 0.0717 | -| cosine_recall@1 | 0.259 | -| cosine_recall@3 | 0.512 | -| cosine_recall@5 | 0.6325 | -| cosine_recall@10 | 0.7169 | -| cosine_ndcg@10 | 0.4862 | -| cosine_mrr@10 | 0.4123 | -| **cosine_map@100** | **0.4209** | +| cosine_accuracy@1 | 0.2831 | +| cosine_accuracy@3 | 0.5181 | +| cosine_accuracy@5 | 0.5843 | +| cosine_accuracy@10 | 0.6747 | +| cosine_precision@1 | 0.2831 | +| cosine_precision@3 | 0.1727 | +| cosine_precision@5 | 0.1169 | +| cosine_precision@10 | 0.0675 | +| cosine_recall@1 | 0.2831 | +| cosine_recall@3 | 0.5181 | +| cosine_recall@5 | 0.5843 | +| cosine_recall@10 | 0.6747 | +| cosine_ndcg@10 | 0.4799 | +| cosine_mrr@10 | 0.4178 | +| **cosine_map@100** | **0.4262** |
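The dimension-specific tables above reflect the Matryoshka training objective: the same embedding, truncated to fewer dimensions, degrades gracefully rather than abruptly. As a minimal sketch of how to exploit this at inference time (assuming a sentence-transformers release recent enough to support the `truncate_dim` constructor argument; the example query and passages are illustrative, not drawn from the evaluation split):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model with its output truncated to 128 dimensions. Retrieval
# quality should land roughly where the dim_128 table above does
# (cosine_ndcg@10 ~0.52, versus ~0.48 at 64 dimensions).
model = SentenceTransformer(
    "zenml/finetuned-snowflake-arctic-embed-m", truncate_dim=128
)

query = "How do I configure the Kubernetes Service Connector?"
passages = [
    "Configuring Kubernetes Service Connectors to connect ZenML to Kubernetes clusters.",
    "The Docker image repo digest or name.",
]

query_emb = model.encode(query)        # shape: (128,)
passage_embs = model.encode(passages)  # shape: (2, 128)
print(cos_sim(query_emb, passage_embs))  # the first passage should score higher
```

Smaller dimensions cut index size and search latency at a modest accuracy cost, which is the usual reason to prefer `truncate_dim=128` or `64` over the full embedding width.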
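The per-dimension tables themselves can in principle be reproduced with `InformationRetrievalEvaluator`, which in recent sentence-transformers releases also accepts a `truncate_dim` argument. The sketch below runs on a toy corpus; `queries`, `corpus`, and `relevant_docs` are placeholders standing in for the actual ZenML-docs evaluation split behind the numbers above:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("zenml/finetuned-snowflake-arctic-embed-m")

# Toy stand-ins for the real evaluation data (placeholders).
queries = {"q1": "How do I configure the Kubernetes Service Connector?"}
corpus = {
    "d1": "Configuring Kubernetes Service Connectors to connect ZenML to Kubernetes clusters.",
    "d2": "The Docker image repo digest or name.",
}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant doc ids

# One evaluator per Matryoshka dimension, mirroring the dim_256/128/64 tables.
for dim in (256, 128, 64):
    evaluator = InformationRetrievalEvaluator(
        queries=queries,
        corpus=corpus,
        relevant_docs=relevant_docs,
        name=f"dim_{dim}",
        truncate_dim=dim,
    )
    print(evaluator(model))
```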