
Find the complete code on GitHub

You can find the code used in this integration directly on OVHcloud's GitHub.

Hugging Face Inference Providers

Hugging Face Inference Providers offers streamlined, unified access to hundreds of machine learning models, powered by world-class inference partners.

Python

Before getting started, you'll need:

  1. Hugging Face Account: Sign up at huggingface.co
  2. Access Token: Create a token with "Make calls to Inference Providers" permissions at huggingface.co/settings/tokens

Installation

Install the required packages:

# With pip
pip install openai huggingface-hub

# Or with uv
uv add openai huggingface-hub

Setting Up Authentication

Set your Hugging Face token as an environment variable:

export HF_TOKEN="hf_your_token_here"

Or use a .env file:

HF_TOKEN=hf_your_token_here
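Whichever approach you use, it helps to read the token once at startup and fail fast with a clear message when it is missing. A minimal sketch (the `get_hf_token` helper is illustrative, not part of any SDK):

```python
import os

def get_hf_token() -> str:
    """Read the Hugging Face token from the environment, failing fast if absent."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it or load your .env file "
            "(e.g. with python-dotenv) before creating a client."
        )
    return token
```

You can then pass `get_hf_token()` as the `api_key` when constructing a client in the examples below.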

Usage

Chat Completion with LLMs

Use the OpenAI SDK for familiar, easy-to-use chat completions:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct:ovhcloud",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful AI assistant."
        },
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
    temperature=0.7,
    max_tokens=100,
)

print(completion.choices[0].message.content)
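For longer responses you can also stream tokens as they arrive: `stream=True` is standard OpenAI SDK behavior and works the same way against the Hugging Face router. A sketch, where the `stream_reply` helper is illustrative (it accepts any OpenAI-compatible client, so the SDK import lives in the entry point):

```python
def stream_reply(client, model: str, prompt: str) -> str:
    """Stream a chat completion, printing chunks as they arrive; returns the full text."""
    parts = []
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. the final one) carry no content
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)

if __name__ == "__main__":
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://router.huggingface.co/v1",
        api_key=os.environ["HF_TOKEN"],
    )
    stream_reply(
        client,
        "meta-llama/Llama-3.1-8B-Instruct:ovhcloud",
        "What is the capital of France?",
    )
```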

Using Hugging Face Hub Client

Alternatively, use the native Hugging Face Hub client:

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="ovhcloud",
    api_key=os.environ["HF_TOKEN"],
)

result = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "Explain quantum computing in simple terms."
        }
    ],
    temperature=0.7,
    max_tokens=200,
)

print(result.choices[0].message.content)

JavaScript / TypeScript

Installation

# With npm
npm install @huggingface/inference

# Or with Yarn
yarn add @huggingface/inference

Basic Chat Completion

import { InferenceClient } from '@huggingface/inference';

// The constructor takes the access token; the provider is set per request.
const client = new InferenceClient(process.env.HF_TOKEN);

const result = await client.chatCompletion({
  provider: 'ovhcloud',
  model: 'meta-llama/Llama-3.1-8B-Instruct',
  messages: [
    {
      role: 'user',
      content: 'What is the capital of France?',
    },
  ],
  max_tokens: 100,
});

console.log(result.choices[0].message.content);

Using OpenAI SDK

Install the OpenAI SDK (`npm install openai`) before running this program.

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://router.huggingface.co/v1',
  apiKey: process.env.HF_TOKEN,
});

const completion = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.1-8B-Instruct:ovhcloud',
  messages: [
    {
      role: 'user',
      content: 'Explain machine learning in simple terms.',
    },
  ],
  temperature: 0.7,
  max_tokens: 200,
});

console.log(completion.choices[0].message.content);

Provider Selection Strategies

Hugging Face offers flexible provider selection:

Automatic Selection (Default)

# Uses the first available provider based on your preference order
model="meta-llama/Llama-3.1-8B-Instruct"

Specific Provider

# Force OVHcloud provider
model="meta-llama/Llama-3.1-8B-Instruct:ovhcloud"

Performance-Based Selection

# Select fastest provider (highest throughput)
model="meta-llama/Llama-3.1-8B-Instruct:fastest"

# Select cheapest provider (lowest cost per token)
model="meta-llama/Llama-3.1-8B-Instruct:cheapest"

Setting Provider Preferences

Configure your preferred provider order at huggingface.co/settings/inference-providers.

Pricing and Billing

Hugging Face Inference Providers uses a pay-as-you-go model:

  • Billing: Usage is billed directly to your Hugging Face account
  • No Setup Costs: No infrastructure or commitment required
  • Transparent Pricing: View pricing details at huggingface.co/pricing
  • Cost Control: Monitor usage in your Hugging Face settings

You can also Bring Your Own Key (BYOK) to be billed directly by OVHcloud instead of Hugging Face. To do so, first add your OVHcloud AI Endpoints API key in your Hugging Face settings.

Going Further

Additional Resources

Community and Support