Chat with Endpoint

From Chat Playground

When your inference instance becomes active, an "Open Playground" button appears if the hosted model is OpenAI compatible.

Click the button to open the Chat Playground. There you can interact with your endpoint and adjust parameters such as Temperature, Top-p, Top-k, and repetition penalty.

From your Terminal

To interact with the endpoint directly from your terminal, follow the process below. Our endpoints follow OpenAI's specification, enabling seamless integration with existing tools and libraries.

Prerequisites

  • Your endpoint URL (format: https://<id>-<hash>.ai.sesterce.dev/)

  • Your API Secret (provided upon deployment)

  • Model ID (retrieved via API)

  • OpenAI-compatible client or SDK

Authentication

Every request must include your API Secret in the x-api-key header, as shown in the examples below.

Verifying connection

First, list the available models:

curl -H "x-api-key: <SECRET>" -X GET "<ENDPOINT>/v1/models"
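The same check can be done in Python with only the standard library. This is a minimal sketch: `<ENDPOINT>` and `<SECRET>` are placeholders for your own values, and the helper names (`list_model_ids`, `fetch_models`) are ours, not part of the API.

```python
import json
import urllib.request

ENDPOINT = "<ENDPOINT>"  # e.g. https://<id>-<hash>.ai.sesterce.dev
SECRET = "<SECRET>"      # your API Secret


def list_model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style /v1/models response.

    The response has the shape {"object": "list", "data": [{"id": ...}, ...]}.
    """
    return [model["id"] for model in payload.get("data", [])]


def fetch_models() -> dict:
    """GET /v1/models, authenticating with the x-api-key header."""
    req = urllib.request.Request(
        f"{ENDPOINT}/v1/models",
        headers={"x-api-key": SECRET},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Use the returned IDs as the `<MODEL_ID>` value in the requests below.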

Model types and endpoints

Make sure to replace <SECRET> with your own secret (available from the launched instance page), and <MODEL_ID> and <ENDPOINT> with your own values.

1. Text generation

Endpoint: /v1/chat/completions

curl -H "Content-Type: application/json" \
     -H "x-api-key: <SECRET>" \
     -X POST "<ENDPOINT>/v1/chat/completions" \
     -d '{
       "model": "<MODEL_ID>",
       "messages": [
         {
           "role": "user",
           "content": "Hello, how are you?"
         }
       ]
     }'

2. Multimodal (text + image)

Endpoint: /v1/chat/completions

curl -H "Content-Type: application/json" \
     -H "x-api-key: <SECRET>" \
     -X POST "<ENDPOINT>/v1/chat/completions" \
     -d '{
       "model": "<MODEL_ID>",
       "messages": [
         {
           "role": "user",
           "content": [
             {
               "type": "text",
                "text": "What is in this image?"
             },
             {
               "type": "image_url",
               "image_url": {
                 "url": "https://example.com/image.jpg"
               }
             }
           ]
         }
       ]
     }'

3. Automatic Speech Recognition (ASR)

Endpoint: /v1/audio/transcriptions

curl -H "x-api-key: <SECRET>" \
     -X POST "<ENDPOINT>/v1/audio/transcriptions" \
     -F file="@/path/to/audio.mp3" \
     -F model="<MODEL_ID>"

Do not set the Content-Type header manually here: with -F, curl sets multipart/form-data and the required boundary automatically.

OpenAI SDK integration

JavaScript/TypeScript

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "<SECRET>",
    baseURL: "<ENDPOINT>/v1"
});

// Text Generation
async function generateText() {
    const completion = await openai.chat.completions.create({
        model: "<MODEL_ID>",
        messages: [
            {
                role: "user",
                content: "Hello, how are you?"
            }
        ]
    });
    console.log(completion.choices[0].message.content);
}

// Multimodal
async function analyzeImage() {
    const response = await openai.chat.completions.create({
        model: "<MODEL_ID>",
        messages: [
            {
                role: "user",
                content: [
                    {
                        type: "text",
                        text: "What's in this image?"
                    },
                    {
                        type: "image_url",
                        image_url: {
                            url: "https://example.com/image.jpg"
                        }
                    }
                ]
            }
        ]
    });
    console.log(response.choices[0].message.content);
}

Python

from openai import OpenAI

client = OpenAI(
    api_key="<SECRET>",
    base_url="<ENDPOINT>/v1"
)

# Text Generation
response = client.chat.completions.create(
    model="<MODEL_ID>",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)

# Audio Transcription
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="<MODEL_ID>",
        file=audio_file
    )
print(transcript.text)

Best Practices

  1. Performance Optimization

    • Use batch processing for multiple requests

    • Implement caching when possible

    • Compress files before upload

    • Use URLs for large files

  2. Security

    • Never share your API Secret

    • Implement rate limiting

    • Monitor usage patterns

    • Secure stored credentials

  3. Integration Tips

    • Always validate model availability

    • Implement proper error handling

    • Use the SDK for robust integration

    • Keep dependencies updated

Supported Formats and Limitations

File Formats

  • Images: JPEG, PNG, WEBP, GIF

  • Audio: mp3, mp4, mpeg, mpga, m4a, wav, webm

Size Limits

  • Images: Maximum 20MB

  • Audio: Maximum 25MB

  • Audio Duration: Up to 4 hours
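The formats and size limits above can be checked client side before uploading, which avoids a round trip that would fail anyway. A minimal sketch, assuming the limits listed on this page; the constant and function names are ours, not part of the API:

```python
import os

# Limits from the documentation above (not enforced by this snippet's names).
MAX_IMAGE_BYTES = 20 * 1024 * 1024   # Images: 20 MB
MAX_AUDIO_BYTES = 25 * 1024 * 1024   # Audio: 25 MB
IMAGE_EXTS = {".jpeg", ".jpg", ".png", ".webp", ".gif"}
AUDIO_EXTS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_upload(path: str) -> None:
    """Raise ValueError if the file's type or size is outside the documented limits."""
    ext = os.path.splitext(path)[1].lower()
    if ext in IMAGE_EXTS:
        limit = MAX_IMAGE_BYTES
    elif ext in AUDIO_EXTS:
        limit = MAX_AUDIO_BYTES
    else:
        raise ValueError(f"unsupported file type: {ext}")
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"file too large: {size} bytes (limit {limit})")
```

Note the audio duration limit (4 hours) cannot be checked this way; it requires decoding the file.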

Error Handling

Common Error Codes

  • 401: Invalid API Secret

  • 404: Model not found

  • 429: Too many requests

  • 500: Server error

Error Handling example

try:
    response = client.chat.completions.create(...)
except Exception as e:
    if "file too large" in str(e):
        pass  # Handle size error
    elif "unsupported file type" in str(e):
        pass  # Handle format error
    else:
        raise  # Re-raise other errors
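For 429 (too many requests), retrying with exponential backoff is the usual pattern. A generic sketch, not part of the endpoint API; `with_retries` is a name we introduce here, and it matches on the status code appearing in the error message:

```python
import time


def with_retries(call, max_attempts=3, base_delay=1.0):
    """Retry a zero-argument callable on rate-limit (429) errors.

    Example: with_retries(lambda: client.chat.completions.create(...)).
    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as e:
            if "429" in str(e) and attempt < max_attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
            else:
                raise
```

Other errors (401, 404, 500) are re-raised immediately, since retrying will not fix an invalid secret or an unknown model.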

If you need support, please reach out to us at [email protected].
