# Inference Instances

### Get inference models

This endpoint lists the AI models available for deployment, helping you select the right model for your needs.

{% hint style="info" %}
Check model features to match your specific project requirements.
{% endhint %}

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/models" method="get" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
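As a minimal sketch, here is how the endpoint above could be called with Python's standard library. The base URL, API key, and bearer-token header are placeholders and assumptions on our part; check your Sesterce Cloud account and the OpenAPI reference above for the real values.

```python
import json
import urllib.request

BASE_URL = "https://api.sesterce.com"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"               # placeholder credential

# Build the GET /ai-inference/models request (nothing is sent yet).
req = urllib.request.Request(
    f"{BASE_URL}/ai-inference/models",
    method="GET",
    headers={"Authorization": f"Bearer {API_KEY}"},
)

# To actually send it and decode the JSON response:
# with urllib.request.urlopen(req) as resp:
#     models = json.load(resp)
```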

### Get inference hardware

Several hardware options are available through the AI inference feature of Sesterce Cloud. This endpoint lets you explore the hardware available for deploying AI instances, which is essential for planning resources and managing latency.

{% hint style="info" %}
Evaluate hardware capabilities to ensure optimal performance for your AI tasks.
{% endhint %}

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/hardwares" method="get" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Get regions available for inference instances

This endpoint lists the regions available for deploying AI instances, which matters for compliance and latency considerations.

{% hint style="info" %}
The region is a crucial parameter for hosting your inference endpoint: it determines the latency experienced by your end users. Choose regions that align with your data residency and latency needs.
{% endhint %}

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/regions" method="get" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Create a registry

A registry is required if you need to serve your own custom model that is not publicly available. [Click here to learn more about registries](https://docs.sesterce.com/ai-inference-instances/inference-instance-configuration#how-to-use-private-custom-model) on the Sesterce Cloud AI inference service!

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/registries" method="post" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
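For illustration, a registry could be created as sketched below with Python's standard library. The payload field names are hypothetical (the authoritative body schema is in the OpenAPI reference above), and the base URL and credentials are placeholders.

```python
import json
import urllib.request

BASE_URL = "https://api.sesterce.com"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"               # placeholder credential

# Illustrative payload only: these field names are assumptions,
# not the confirmed schema. Consult the OpenAPI reference above.
payload = {
    "name": "my-private-registry",
    "url": "registry.example.com",
    "username": "ci-user",
    "password": "REGISTRY_PASSWORD",
}

# POST /ai-inference/registries with a JSON body (not sent yet).
req = urllib.request.Request(
    f"{BASE_URL}/ai-inference/registries",
    data=json.dumps(payload).encode(),
    method="POST",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# with urllib.request.urlopen(req) as resp:
#     registry = json.load(resp)
```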

### Get the list of registries created

To manage your registries for storing and accessing AI models, use the following endpoint:

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/registries" method="get" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Update a registry

To modify registry details to ensure they meet current security and access needs, use the following endpoint:

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/registries/{id}" method="patch" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Delete a registry

The following endpoint allows you to remove outdated or unused registries to maintain a clean environment.

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/registries/{id}" method="delete" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Create an inference instance

The time has come! You can now deploy a new AI inference instance to scale your applications and services, or deploy an existing model to production. Use the following endpoint to perform this action.

{% hint style="warning" %}
To create an inference instance, make sure your credit balance is topped up. See [our payment and billing documentation](https://docs.sesterce.com/welcome-on-sesterce-cloud/payment-and-billing) to top up your balance.
{% endhint %}

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances" method="post" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
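A creation request could look like the sketch below. Every payload field name here is an assumption for illustration — the model, hardware flavor, region, and autoscaling values come from the discovery endpoints earlier on this page, and the exact body schema is in the OpenAPI reference above.

```python
import json
import urllib.request

BASE_URL = "https://api.sesterce.com"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"               # placeholder credential

# Illustrative payload: field names are assumptions, not the
# confirmed schema. Values would come from the models, hardware,
# and regions endpoints documented above.
payload = {
    "name": "my-inference-endpoint",
    "model": "MODEL_ID_FROM_MODELS_ENDPOINT",
    "hardware": "FLAVOR_FROM_HARDWARE_ENDPOINT",
    "region": "REGION_FROM_REGIONS_ENDPOINT",
    "autoscaling": {"min_replicas": 1, "max_replicas": 2},
}

# POST /ai-inference/instances (request is built, not sent).
req = urllib.request.Request(
    f"{BASE_URL}/ai-inference/instances",
    data=json.dumps(payload).encode(),
    method="POST",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# with urllib.request.urlopen(req) as resp:
#     instance = json.load(resp)
```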

### Start an inference instance

This endpoint allows you to activate an AI inference instance to begin processing tasks and data.

{% hint style="info" %}
You can monitor startup times to assess performance efficiency.
{% endhint %}

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances/{id}/start" method="post" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
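Starting an instance is a POST with the instance id in the path, as in this standard-library sketch. The base URL, credential, and id are placeholders; we assume no request body is needed, so check the OpenAPI reference above.

```python
import urllib.request

BASE_URL = "https://api.sesterce.com"   # hypothetical base URL
API_KEY = "YOUR_API_KEY"                # placeholder credential
INSTANCE_ID = "YOUR_INSTANCE_ID"        # id returned at creation time

# POST /ai-inference/instances/{id}/start (request built, not sent).
req = urllib.request.Request(
    f"{BASE_URL}/ai-inference/instances/{INSTANCE_ID}/start",
    method="POST",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# urllib.request.urlopen(req)
```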

### Get the list of your inference instances

Use the following endpoint to list your active AI instances and keep track of resources and performance.

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances" method="get" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Get details about a specific inference instance

Retrieve detailed information about a specific AI instance for management and troubleshooting.

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances/{id}" method="get" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}

### Preview AI instance pricing

This endpoint allows you to estimate costs for your running AI instances, which helps with budget planning.

{% hint style="info" %}
The Sesterce Cloud AI inference service is based on unlimited-token pricing: you are charged a flat hourly price, regardless of how much you use your dedicated endpoint.
{% endhint %}

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances/pricing" method="post" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
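A pricing preview could be requested as sketched here: POST a draft instance configuration and read the estimate back. The payload field names are assumptions for illustration (the real schema is in the OpenAPI reference above), as are the base URL and credential.

```python
import json
import urllib.request

BASE_URL = "https://api.sesterce.com"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"               # placeholder credential

# Illustrative draft configuration: field names are assumptions.
# With unlimited-token pricing, the estimate depends on hardware,
# region, and autoscaling limits rather than token volume.
payload = {
    "hardware": "FLAVOR_FROM_HARDWARE_ENDPOINT",
    "region": "REGION_FROM_REGIONS_ENDPOINT",
    "autoscaling": {"min_replicas": 1, "max_replicas": 2},
}

# POST /ai-inference/instances/pricing (request built, not sent).
req = urllib.request.Request(
    f"{BASE_URL}/ai-inference/instances/pricing",
    data=json.dumps(payload).encode(),
    method="POST",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# with urllib.request.urlopen(req) as resp:
#     estimate = json.load(resp)
```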

### Update an inference instance

This endpoint allows you to modify existing AI instances to adapt to changing project needs. This is particularly useful if you need to update your hardware flavor and/or autoscaling limits according to the usage of your dedicated endpoint:

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances/{id}" method="patch" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
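Raising the autoscaling ceiling on a busy endpoint could look like the PATCH sketch below. As before, the field names, base URL, credential, and id are placeholders and assumptions; a PATCH body would carry only the fields you want to change.

```python
import json
import urllib.request

BASE_URL = "https://api.sesterce.com"   # hypothetical base URL
API_KEY = "YOUR_API_KEY"                # placeholder credential
INSTANCE_ID = "YOUR_INSTANCE_ID"        # id of the instance to update

# Illustrative partial update: field names are assumptions, not the
# confirmed schema. Only the changed fields are included.
payload = {"autoscaling": {"min_replicas": 1, "max_replicas": 4}}

# PATCH /ai-inference/instances/{id} (request built, not sent).
req = urllib.request.Request(
    f"{BASE_URL}/ai-inference/instances/{INSTANCE_ID}",
    data=json.dumps(payload).encode(),
    method="PATCH",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# with urllib.request.urlopen(req) as resp:
#     instance = json.load(resp)
```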

### Stop an inference instance

If you need to pause an AI instance to conserve resources and manage costs, use the following endpoint:

{% openapi src="https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media&token=7cdfae61-f77a-40df-8a75-197d1ad217a1" path="/ai-inference/instances/{id}/stop" method="post" %}
[user-api-docs.json](https://3376774032-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoExYATuECEyEKGJ4cD8X%2Fuploads%2FPJS3wkW5AlGxKDItJyuE%2Fuser-api-docs.json?alt=media\&token=7cdfae61-f77a-40df-8a75-197d1ad217a1)
{% endopenapi %}
