Youtu-Embedding

Youtu-Embedding Introduction

A unified text representation (embedding) model for enterprises and developers, covering retrieval, semantic similarity, clustering, re-ranking, classification, and other scenarios.

Overview

Youtu-Embedding is a general-purpose text representation model open-sourced by Tencent Youtu Lab. It can be used for various natural language processing tasks including information retrieval (IR), semantic textual similarity (STS), clustering, re-ranking, and classification, balancing performance and ease of use.

Why Choose Youtu-Embedding

  • Quick Deployment: The repository includes built-in test scripts and examples, enabling environment setup and inference testing within minutes.
  • Unified Representation Capability: Through a collaborative-differential learning framework, it balances discriminative ability and generalization across multiple tasks, mitigating negative transfer.
  • Engineering Friendly: Supports Hugging Face model loading and provides ecosystem examples like LangChain / LlamaIndex for easy integration into RAG/retrieval systems.
  • Open and Extensible: Open-source weights, inference and training code for convenient secondary development and customization.

Core Capabilities

  • Multi-scenario Adaptation: Supports unified vector representation for IR / STS / clustering / re-ranking / classification tasks.
  • High-performance Representation: Achieves leading results on authoritative benchmarks like CMTEB (as of 2025-09).
  • Multi-device Support: Automatic selection of CUDA / macOS MPS / CPU, easy for local and cloud deployment.
  • Ecosystem Integration: Built-in LangChain and LlamaIndex examples for quick integration into retrieval workflows.
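
The CUDA / macOS MPS / CPU fallback order described above can be sketched as a small helper. The function name `pick_device` and its boolean-flag interface are illustrative assumptions; in an actual script the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a runtime device with the priority CUDA > macOS MPS > CPU.

    Illustrative helper: in practice the flags would come from
    torch.cuda.is_available() and torch.backends.mps.is_available().
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

# On a machine with neither CUDA nor MPS, fall back to CPU.
device = pick_device(False, False)  # -> "cpu"
```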

Main Components

1. Inference

  • Methods:
    • Cloud API (using Tencent Cloud SDK for quick deployment)
    • Local self-hosting (transformers native or sentence-transformers)
  • Scripts and Examples:
    • test_transformers_online_cuda.py (CUDA)
    • test_transformers_online_macos.py (macOS MPS/CPU)
    • test_transformers_local.py (local model directory)
    • usage/infer_llm_embedding.py (custom wrapper class LLMEmbeddingModel)
    • usage/langchain_embedding.py (LangChain integration)
    • usage/llamaindex_embedding.py (LlamaIndex integration)

Visit the code repository to obtain the example script files.
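
Self-hosted transformers inference typically turns token-level hidden states into one sentence vector via attention-mask-aware mean pooling followed by L2 normalization. The sketch below shows only that post-processing step in NumPy; the exact pooling used by Youtu-Embedding may differ, so treat this as an assumption rather than the repository's implementation.

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    hidden_states: (batch, seq_len, dim) last-layer token embeddings
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

# Tiny example: batch of 1 with two real tokens and one padding token.
h = np.array([[[1.0, 0.0], [3.0, 0.0], [100.0, 100.0]]])
m = np.array([[1, 1, 0]])
emb = l2_normalize(mean_pool(h, m))  # padding token is excluded from the average
```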

2. Training

  • Location: training/CoDiEmb
  • Features:
    • Unified data structure covering IR / STS / classification / re-ranking
    • Task-differential loss functions (e.g., InfoNCE for IR with multiple positive and hard negative examples; ranking-aware optimization for STS)
    • Dynamic single-task sampling ensuring clean and stable gradient signals
  • Evaluation: See the evaluation/ directory in the repository.
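
As a concrete reference for the InfoNCE objective mentioned above, here is a minimal NumPy sketch for one query with a single positive and several negatives. The temperature value and the cosine-similarity scoring are illustrative assumptions, not the exact CoDiEmb configuration.

```python
import numpy as np

def info_nce(query: np.ndarray, positive: np.ndarray,
             negatives: np.ndarray, temperature: float = 0.05) -> float:
    """InfoNCE for one query: negative log-softmax of the positive's similarity.

    query: (dim,), positive: (dim,), negatives: (n, dim).
    Uses cosine similarity, so all vectors are unit-normalized first.
    """
    def norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    q, p, n = norm(query), norm(positive), norm(negatives)
    sims = np.concatenate([[q @ p], n @ q]) / temperature  # positive first
    sims -= sims.max()  # numerical stability before exponentiation
    probs = np.exp(sims) / np.exp(sims).sum()
    return float(-np.log(probs[0]))

q = np.array([1.0, 0.0])
pos = np.array([0.9, 0.1])    # close to the query -> low loss
neg = np.array([[0.0, 1.0]])  # orthogonal hard negative
loss = info_nce(q, pos, neg)
```

The loss shrinks as the positive moves closer to the query and grows as negatives become more similar to it, which is the pull/push behavior contrastive training relies on.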

Usage Methods

1) Cloud API

  • Use Tencent Cloud SDK and documentation for authentication and calls
  • Suitable for quick deployment and enterprise compliance

2) Local/Private Deployment

  • Directly load Hugging Face models or local directories
  • Suitable for data privacy-sensitive scenarios or those requiring deep customization
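
Once sentence embeddings are produced (locally or via the API), retrieval reduces to nearest-neighbor search over the vectors. Below is a minimal cosine-similarity sketch in NumPy; the corpus, vectors, and the `top_k` helper are illustrative and not part of the repository.

```python
import numpy as np

def top_k(query_vec: np.ndarray, corpus_vecs: np.ndarray, k: int = 2) -> list:
    """Return indices of the k corpus vectors most similar to the query.

    Assumes all rows are already L2-normalized, so a dot product
    is equivalent to cosine similarity.
    """
    scores = corpus_vecs @ query_vec
    return np.argsort(-scores)[:k].tolist()

# Illustrative 2-D "embeddings" (real model vectors are high-dimensional).
corpus = np.array([
    [1.0, 0.0],   # doc 0: aligned with the query
    [0.0, 1.0],   # doc 1: orthogonal
    [0.8, 0.6],   # doc 2: partially aligned
])
query = np.array([1.0, 0.0])
hits = top_k(query, corpus, k=2)  # doc 0 ranks first, then doc 2
```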

For detailed instructions, please refer to Quick Start.

Directory and Architecture Overview

  • usage: Inference and ecosystem integration examples (API / LangChain / LlamaIndex, etc.)
  • training: Collaborative-differential fine-tuning framework and scripts
  • evaluation: Reproducible evaluation scripts and results
  • youtu-model / Youtu-Embedding: Local model directory (pulled from Hugging Face or cloned)
  • test_transformers_*.py: Pre-built test scripts for quick validation in different runtime environments

Next Steps

After familiarizing yourself with the basic capabilities, proceed to Quick Start to complete inference and integration locally or via cloud.
