💬AI Inference instances

What is the Sesterce AI Inference service?

We built our inference feature so that you can bring your ML model to life by deploying it in a dedicated production environment that anyone can reach through an endpoint.

You can use our inference service to deploy your own custom model and make it accessible to your users, or to run inference with the best models on the market, which you can then integrate into your applications.
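As an illustration of what integrating a deployed model into an application can look like, here is a minimal sketch that prepares an HTTP call to an inference endpoint using only the Python standard library. The endpoint URL, the bearer-token header, and the `{"prompt": ...}` payload are assumptions made for this example, not the documented Sesterce API; the actual values depend on how your inference instance is configured.

```python
import json
import urllib.request


def build_inference_request(endpoint_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an inference endpoint."""
    # Hypothetical payload shape: the real schema depends on the deployed model.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed bearer-token authentication; check your instance settings.
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    request = build_inference_request(
        "https://example.invalid/v1/predict",  # placeholder URL, not a real endpoint
        "YOUR_API_KEY",
        "Hello",
    )
    # Sending the request requires a real endpoint:
    # with urllib.request.urlopen(request, timeout=30) as response:
    #     print(json.load(response))
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the exact URL, headers, and body before pointing the code at a live endpoint.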

Classic compute instances are well suited to building and training your model. The AI inference feature complements them by letting you also manage the deployment of your ML model, as close as possible to your users.

Deploy your model as close as possible to your customers

Sesterce's AI inference feature lets you deploy your model as close as possible to your end users to keep latency minimal. Here's how:

Edge inference nodes distributed around the world

Processes data locally at the network's edge, minimizing latency and bandwidth usage for real-time applications.

Anycast endpoint set up automatically

Directs user requests to the nearest instance of a service, optimizing performance and reducing response time.

Smart Routing technology to 180 points of presence worldwide

End users' queries are routed to the closest active model, ensuring low latency and an improved user experience.
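The idea of routing a query to the closest active model can be sketched as follows. This is a simplified illustration, not Sesterce's routing implementation: real anycast routing is decided by the network rather than by geographic distance, and the `Node` type, the node names, and the coordinates below are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical point of presence hosting a copy of the model."""
    name: str
    lat: float
    lon: float
    active: bool


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_active_node(nodes: list[Node], lat: float, lon: float) -> Node:
    """Return the active node with the smallest distance to the user."""
    active = [node for node in nodes if node.active]
    if not active:
        raise ValueError("no active node available")
    return min(active, key=lambda node: haversine_km(lat, lon, node.lat, node.lon))


if __name__ == "__main__":
    nodes = [
        Node("paris", 48.86, 2.35, active=False),  # nearest to London, but inactive
        Node("frankfurt", 50.11, 8.68, active=True),
        Node("new-york", 40.71, -74.01, active=True),
    ]
    # A user in London is served by the nearest node that is currently active.
    print(closest_active_node(nodes, 51.51, -0.13).name)
```

Inactive nodes are filtered out before the distance comparison, so a nearby node that is not serving the model never receives traffic.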

Preloaded public models

Here is a non-exhaustive list of the models available on Sesterce Cloud. See the model catalog for the full list.

| Model | Type | Description |
| --- | --- | --- |
| distilbert-base | Text processing | A smaller, faster version of BERT used for natural language tasks. |
| stable-diffusion | Text-to-image | Generates images from text descriptions using deep learning techniques. |
| stable-cascade | Text-to-image | Enhances image generation with multiple refinement steps. |
| sdxl-lightning | Text-to-image | Optimized for fast image generation from text inputs. |
| ResNet-50 | Image classification | A convolutional neural network designed for image recognition tasks. |
| Llama-Pro-8b | Text generation | A large language model designed for generating human-like text. |
| Llama-3.2-3B-Instruct | Text generation | An instruction-tuned model for generating text with specific guidelines. |
| Mistral-Nemo-Instruct-2407 | Text generation | Tailored for creating text based on given instructions. |
| Llama-3.1-8B-Instruct | Text generation | An advanced model for generating text with detailed instructions. |
| Pixtral-12B-2409 | Image-text-to-text | A multimodal model that interprets images alongside text prompts and generates text. |
| Llama-3.2-1B-Instruct | Text generation | Focused on generating text according to user-provided instructions. |
| Mistral-7B-Instruct-v.0.3 | Text generation | Designed for generating guided text outputs with minimal latency. |
| Whisper-large-V3-turbo | Audio-to-text | Quickly transcribes audio into text with high accuracy. |
| Whisper-large-V3 | Audio-to-text | Transcribes spoken language into written text using deep learning. |


Last updated 7 months ago
