Which compute instance for AI models training and inference?
Why is the compute instance choice so important?
Which compute instance to choose for model training?
1. Large Language Models (LLMs) training
| Model Size | Server Type | VRAM | Recommended Offers |
| --- | --- | --- | --- |
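As a rule of thumb (an assumption, not a measured vendor figure), full fine-tuning with the Adam optimizer in mixed precision needs roughly 16 bytes of VRAM per parameter: fp16 weights (2) + fp16 gradients (2) + fp32 master weights (4) + two fp32 Adam moments (8). A minimal sketch:

```python
def training_vram_gb(params_billion: float, bytes_per_param: int = 16) -> float:
    """Rough VRAM needed for model states during full fine-tuning.

    16 bytes/param assumes Adam in mixed precision:
    fp16 weights (2) + fp16 gradients (2) + fp32 master weights (4)
    + two fp32 Adam moments (8). Activations are NOT included and
    can dominate at long sequence lengths.
    """
    return params_billion * 1e9 * bytes_per_param / 1e9

# A 7B model needs on the order of 112 GB for model states alone,
# hence multi-GPU servers or memory savers (LoRA, ZeRO, gradient checkpointing).
print(training_vram_gb(7))
```

LoRA fine-tuning or inference-only workloads need far less; full training is the upper bound to size for.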
2. Computer Vision models training
| Use Case | Model Examples | Resource Intensity | Batch |
| --- | --- | --- | --- |
Classification Models
| Data Volume | Training Time | Recommended Instance Type | Comment |
| --- | --- | --- | --- |
Segmentation Models
| Data Volume | Training Time | Recommended Instance Type | Comment |
| --- | --- | --- | --- |
Generative Models
| Data Volume | Training Time | Recommended Instance Type | Notes |
| --- | --- | --- | --- |
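For the Data Volume and Training Time columns above, a back-of-the-envelope estimate (the function name and the throughput figure are illustrative assumptions) relates dataset size, epoch count, and measured GPU throughput:

```python
def training_hours(n_images: int, epochs: int, images_per_sec: float) -> float:
    """Wall-clock training time from dataset size, epoch count, and the
    throughput your (GPU, model, batch size) combination actually sustains.
    Benchmark images_per_sec on a short run before trusting the estimate.
    """
    return n_images * epochs / images_per_sec / 3600

# 500k images, 50 epochs at 800 img/s -> roughly 8.7 hours on one GPU.
print(training_hours(500_000, 50, 800))
```

Doubling the GPUs roughly halves the time only while data loading and inter-GPU communication keep up.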
3. Audio/Speech models training
| Use Case | Input Data Type | Dataset Size | Model Examples | Resource Intensity |
| --- | --- | --- | --- | --- |
Speech Recognition Models
| Model Size (params) | VRAM/RAM Needed | Recommended Instance Type | Comment |
| --- | --- | --- | --- |
Text-to-Speech Models
| Model Size | VRAM/RAM Needed | Recommended Instance Type | Notes |
| --- | --- | --- | --- |
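Speech models are small compared with LLMs, so weight memory is rarely the bottleneck. A quick sketch for the VRAM/RAM Needed column (the Whisper parameter count is approximate):

```python
def model_weights_gb(params_million: float, bytes_per_param: int = 2) -> float:
    """VRAM for the weights alone at a given precision (fp16 by default).
    Leave headroom for activations, beam-search state, and audio buffers.
    """
    return params_million * 1e6 * bytes_per_param / 1e9

# Whisper large-v3 (~1550M params) in fp16: ~3.1 GB for weights,
# so it fits comfortably on any modern GPU.
print(model_weights_gb(1550))
```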
Which compute instance to choose for model inference?
1. Large Language Models (LLMs) inference
LLM Inference Sizing - Small Scale (1-50 concurrent users)
| Model Size | Concurrent Users | VRAM/RAM Needed | Latency Target | Recommended Instance Type | Estimated RPS* | Comment |
| --- | --- | --- | --- | --- | --- | --- |
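The Estimated RPS column follows from Little's law: sustained throughput equals concurrency divided by latency. A minimal sketch (worst case, no user think time):

```python
def estimated_rps(concurrent_users: int, avg_latency_s: float) -> float:
    """Little's law: throughput = concurrency / latency.
    Worst case: every user fires a new request the moment the previous
    one returns; real traffic with think time needs fewer RPS.
    """
    return concurrent_users / avg_latency_s

# 50 concurrent users with a 2 s latency target -> size for ~25 req/s.
print(estimated_rps(50, 2.0))
```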
LLM Inference Sizing - Medium Scale (51-200 concurrent users)
| Model Size | Concurrent Users | VRAM/RAM | Latency Target | Recommended Instance Type | Estimated RPS* | Notes |
| --- | --- | --- | --- | --- | --- | --- |
LLM Inference Sizing - Large Scale (201-1000+ concurrent users)
| Model Size | Concurrent Users | VRAM/RAM | Latency Target | Recommended Instance Type | Estimated RPS* | Notes |
| --- | --- | --- | --- | --- | --- | --- |
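At large scale the KV cache, not the weights, usually caps concurrency. Per-sequence cache size can be estimated from the model shape; the 7B-class dimensions below are assumptions for illustration:

```python
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                seq_len: int, batch: int, bytes_per_val: int = 2) -> float:
    """KV cache size in fp16: keys + values for every layer and token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_val  # K and V
    return per_token * seq_len * batch / 1e9

# Assumed 7B-class shape: 32 layers, 32 KV heads, head_dim 128.
# 64 concurrent sequences at 2048 tokens each ~= 69 GB of cache alone,
# which is why high-concurrency serving leans on grouped-query attention,
# quantized caches, or paged allocation (e.g. vLLM).
print(kv_cache_gb(32, 32, 128, 2048, 64))
```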
2. Image Generation Inference Sizing
Small Scale (1-50 concurrent users)
| Model Type | Concurrent Users | VRAM/RAM | Latency Target* | Recommended Instance Type | Images/Minute** | Notes |
| --- | --- | --- | --- | --- | --- | --- |
Medium Scale (51-200 concurrent users)
| Model Type | Concurrent Users | VRAM/RAM | Latency Target* | Recommended Instance Type | Images/Minute** | Notes |
| --- | --- | --- | --- | --- | --- | --- |
Large Scale (201-1000+ concurrent users)
| Model Type | Concurrent Users | VRAM/RAM | Latency Target* | Recommended Instance Type | Images/Minute** | Notes |
| --- | --- | --- | --- | --- | --- | --- |
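The Images/Minute column can be sanity-checked from per-image latency and parallelism. This is an upper bound under assumed perfect batching, which real queues never achieve:

```python
def images_per_minute(seconds_per_image: float, batch_size: int, n_gpus: int) -> float:
    """Upper-bound throughput: perfect batching, no queueing or I/O overhead."""
    return 60.0 / seconds_per_image * batch_size * n_gpus

# e.g. 4 s per image, batch of 4, two GPUs -> at most 120 images/minute.
print(images_per_minute(4.0, 4, 2))
```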