Integrate a custom Job with Kueue
Kueue has built-in integrations for several Job types, including Kubernetes batch Job, MPIJob, RayJob and JobSet.
There are two options for adding an additional integration for a Job-like CRD with Kueue:
- As part of the Kueue repository
- Writing an external controller
This guide is for platform developers and describes how to build a new integration. Integrations should be built using the APIs provided by Kueue’s jobframework package. This will both simplify development and ensure that your controller will be properly structured to become a core built-in integration if your Job type is widely used by the community.
Overview of Requirements
Kueue uses the controller-runtime. We recommend becoming familiar with it and with Kubebuilder before starting to build a Kueue integration.
Whether you are building an external or built-in integration, your main tasks are the following:
- To work with Kueue, your custom Job CRD should have a suspend-like field, with semantics similar to the suspend field in a Kubernetes Job. The field needs to be in your CRD’s spec, not its status, to enable its value to be set from a webhook. Your CRD’s primary controller must respond to changes in the value of this field by suspending or unsuspending its owned resources.
- You will need to register the GroupVersionKind of your CRD with Kueue as an integration.
- You will need to instantiate various pieces of Kueue’s jobframework package for your CRD:
  - You will need to implement Kueue’s GenericJob interface for your CRD.
  - You will need to instantiate a ReconcilerFactory and register it with the controller runtime.
  - You will need to add a +kubebuilder:rbac directive for your CRD so Kueue will be permitted to manage it.
  - You will need to register webhooks that set the initial value of the suspend field in instances of your CRD and validate Kueue invariants on creation and update operations.
  - You will need to instantiate a Workload indexer for your CRD.
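To make the suspend requirement concrete, here is a minimal sketch of a suspend-like field and the decision a primary controller would derive from it. The type MyCRDSpec, its Replicas field, and the helper names are hypothetical and exist only for illustration; a real CRD would use its own types and reconcile actual owned resources.

```go
package main

// MyCRDSpec is a hypothetical spec for a Job-like custom resource.
type MyCRDSpec struct {
	// Suspend lives in spec (not status) so that a webhook can set it.
	// A nil value is treated as "not suspended", as in a Kubernetes Job.
	Suspend *bool `json:"suspend,omitempty"`
	// Replicas is the number of worker pods requested by the user (assumed field).
	Replicas int32 `json:"replicas"`
}

// IsSuspended reports whether the resource is currently suspended.
func IsSuspended(spec MyCRDSpec) bool {
	return spec.Suspend != nil && *spec.Suspend
}

// DesiredPods returns how many pods the primary controller should keep
// running: none while suspended, the requested count otherwise.
func DesiredPods(spec MyCRDSpec) int32 {
	if IsSuspended(spec) {
		return 0
	}
	return spec.Replicas
}
```

In this sketch, flipping Suspend from true to false is the only signal the primary controller needs in order to start creating its owned resources.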
Building a Built-in Integration
To get started, add a new folder in ./pkg/controller/jobs/ to host the implementation of the integration.
Here are completed built-in integrations you can learn from:
Registration
Add your framework name to .integrations.frameworks in controller_manager_config.yaml.
Add RBAC Authorization for your CRD using kubebuilder marker comments.
Write a Go func init() that invokes the jobframework RegisterIntegration() function.
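The registration steps above can be pictured with the following self-contained sketch. It does not call Kueue: the registry map and the register function are hypothetical stand-ins for the jobframework RegisterIntegration() function, whose actual signature and callback fields you should take from the jobframework package. The API group example.com, the resource mycrds, the framework name, and the verb list in the marker are placeholders as well.

```go
package main

import "fmt"

// RBAC marker in kubebuilder syntax. Group, resource and verbs are placeholders.
// +kubebuilder:rbac:groups=example.com,resources=mycrds,verbs=get;list;watch;update;patch

// FrameworkName is the (hypothetical) name that would also be listed under
// .integrations.frameworks in controller_manager_config.yaml.
const FrameworkName = "example.com/mycrd"

// registry is a stand-in for the framework's internal table of integrations.
var registry = map[string]bool{}

// register is a stand-in for jobframework's RegisterIntegration(); it only
// records the name and rejects duplicates.
func register(name string) error {
	if registry[name] {
		return fmt.Errorf("integration %q is already registered", name)
	}
	registry[name] = true
	return nil
}

// init runs when the package is loaded, so importing the package is enough
// to make the integration known before the manager starts.
func init() {
	if err := register(FrameworkName); err != nil {
		panic(err)
	}
}
```

Registering from init() keeps each integration self-contained in its own folder: nothing outside the package has to know how the integration is wired up.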
Job Framework
In the mycrd_controller.go file in your folder, implement the GenericJob interface, and other optional interfaces defined by the framework.
In the mycrd_webhook.go file in your folder, provide the webhook that invokes helper methods from the jobframework to set the initial suspend status of created jobs and validate invariants.
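As a rough illustration of what the webhook is responsible for, the sketch below models defaulting and one update check without importing Kueue. MyCRD, DefaultSuspend and ValidateUpdate are hypothetical names; in a real integration you would call the jobframework helpers rather than reimplementing this logic. The label key kueue.x-k8s.io/queue-name and the manageJobsWithoutQueueName setting are Kueue's; the specific invariant shown (queue name cannot change while the job is running) is used here as one example of the kind of rule the webhook enforces.

```go
package main

import "errors"

// QueueNameLabel is the label that assigns a job to a Kueue LocalQueue.
const QueueNameLabel = "kueue.x-k8s.io/queue-name"

// MyCRD is a hypothetical, heavily reduced view of a custom job object.
type MyCRD struct {
	Labels  map[string]string
	Suspend *bool
}

// DefaultSuspend models the defaulting step: a newly created job that Kueue
// should manage starts suspended, so it cannot run before it is admitted.
func DefaultSuspend(job *MyCRD, manageJobsWithoutQueueName bool) {
	if job.Labels[QueueNameLabel] == "" && !manageJobsWithoutQueueName {
		return // not managed by Kueue; leave the job untouched
	}
	suspended := true
	job.Suspend = &suspended
}

// ValidateUpdate models one invariant check on update: the queue name may
// only change while the job is suspended.
func ValidateUpdate(oldJob, newJob *MyCRD) error {
	suspended := newJob.Suspend != nil && *newJob.Suspend
	if !suspended && oldJob.Labels[QueueNameLabel] != newJob.Labels[QueueNameLabel] {
		return errors.New("queue name cannot change while the job is not suspended")
	}
	return nil
}
```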
Add testing files for both the controller and the webhook. You can check the test files in the other subfolders of ./pkg/controller/jobs/ to learn how to implement them.
Adjust build system
Add required dependencies to compile your code. For example, using go get github.com/kubeflow/mpi-operator@v0.4.0.
Update the Makefile for testing.
- Add commands which copy the CRD of your custom job to the Kueue project.
- Add your custom job operator CRD dependencies into test-integration.
Building an External Integration
Here are completed external integrations you can learn from:
Registration
Add your framework’s GroupVersionKind to .integrations.externalFrameworks in controller_manager_config.yaml.
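The entry is written as a single string. The helper below is a small sketch that builds such a string, assuming the Kind.version.group form; confirm the expected format against the Configuration API reference for your Kueue release. The GVK struct and the function name are hypothetical, and example.com, v1 and MyCRD are placeholders.

```go
package main

import "fmt"

// GVK is a hypothetical, minimal stand-in for a GroupVersionKind.
type GVK struct {
	Group   string
	Version string
	Kind    string
}

// ExternalFrameworkName renders a GVK in the assumed Kind.version.group form
// used for entries under .integrations.externalFrameworks.
func ExternalFrameworkName(gvk GVK) string {
	return fmt.Sprintf("%s.%s.%s", gvk.Kind, gvk.Version, gvk.Group)
}
```

For a CRD of kind MyCRD in group example.com at version v1, this yields MyCRD.v1.example.com.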
RBAC Augmentation
Kueue will need permission to get, list, and watch instances of your CRD.
If you are building a custom Kueue deployment, simply add a kubebuilder:rbac annotation to a source code file (for example in integrationmanager.go) and regenerate the manifests.
If you are deploying a Kueue release, modify either charts/kueue/templates/rbac/role.yaml or config/components/rbac/role.yaml to add the needed permissions.
Job Framework
Add a dependency on Kueue to your go.mod, import the jobframework and use it as described above to create your controller and webhook implementations. In the main function of your controller, instantiate the controller-runtime manager and register your webhook, indexer, and controller.
For a concrete example, consult these pieces of the AppWrapper controller: