Select your Flavor

Last updated 2 months ago

When to choose GPU or CPU flavors?

Depending on your needs, you can choose between two options: CPU and GPU Flavors.

A GPU Flavor is ideal for inference of complex deep learning models, image processing, or multimedia content generation.

A CPU Flavor is better suited to lightweight tasks, such as simple text processing algorithms or structured data processing, that do not require very low latency.
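The decision criteria above can be sketched as a small helper. This is purely illustrative: the task categories and return values below are example names, not actual Sesterce Cloud API values.

```python
def recommend_flavor(task: str) -> str:
    """Map a workload type to a flavor family (illustrative only).

    Task names and flavor labels are hypothetical examples,
    not identifiers from the Sesterce Cloud API.
    """
    # Workloads that benefit from GPU acceleration
    gpu_tasks = {"deep-learning-inference", "image-processing", "media-generation"}
    # Lightweight workloads that tolerate higher latency
    cpu_tasks = {"text-processing", "structured-data"}

    if task in gpu_tasks:
        return "GPU"
    if task in cpu_tasks:
        return "CPU"
    raise ValueError(f"Unknown task type: {task}")


print(recommend_flavor("image-processing"))  # GPU
print(recommend_flavor("text-processing"))   # CPU
```

In practice, also weigh cost: if a CPU Flavor meets your latency target, it is usually the cheaper choice.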

Which GPU Flavors are available for AI Inference?


NVIDIA L40S

A GPU specialized for inference tasks, capable of accelerating multiple workloads.


NVIDIA H100 TensorCore

Accelerates LLM processing by up to 30×. Ideal for complex models with up to 30 billion parameters.


NVIDIA A100 TensorCore

The A100 delivers up to 20× higher performance than NVIDIA Volta with zero code changes, and an additional 2× boost with automatic mixed precision (FP16).