Google debuts a new Gemini-based text embedding model

Google on Friday added a new, experimental “embedding” model for text, Gemini Embedding, to its Gemini developer API.

Embedding models translate text inputs like words and phrases into numerical representations, known as embeddings, that capture the semantic meaning of the text. Embeddings are used in a range of applications, such as document retrieval and classification, in part because they can reduce costs and latency compared to processing raw text with a full language model.
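To illustrate how embeddings power retrieval, here is a minimal sketch in plain Python. The four-dimensional vectors are invented toy values (real embedding models emit hundreds or thousands of dimensions, and you would obtain them from an API rather than hard-code them); the point is only that semantically similar texts map to nearby vectors, so documents can be ranked by cosine similarity to a query.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" for a query and two documents.
query       = [0.9, 0.1, 0.0, 0.2]  # e.g. "bank interest rates"
doc_finance = [0.8, 0.2, 0.1, 0.3]  # a finance document
doc_cooking = [0.1, 0.9, 0.8, 0.0]  # an unrelated document

# Retrieval: rank documents by similarity to the query.
scores = {
    "finance": cosine_similarity(query, doc_finance),
    "cooking": cosine_similarity(query, doc_cooking),
}
best = max(scores, key=scores.get)  # the finance document ranks first
```

Classification works the same way: embed the input once, then compare it against embeddings of the candidate labels, which is far cheaper than running a full generative model per comparison.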

Companies including Amazon, Cohere, and OpenAI offer embedding models through their respective APIs. Google has offered embedding models before, but Gemini Embedding is its first trained on the Gemini family of AI models.

“Trained on the Gemini model itself, this embedding model has inherited Gemini’s understanding of language and nuanced context, making it applicable for a wide range of uses,” Google said in a blog post. “We’ve trained our model to be remarkably general, delivering exceptional performance across diverse domains, including finance, science, legal, search, and more.”

Google claims that Gemini Embedding surpasses the performance of its previous state-of-the-art embedding model, text-embedding-004, and achieves competitive performance on popular embedding benchmarks. Compared to text-embedding-004, Gemini Embedding can also accept larger chunks of text and code at once, and it supports twice as many languages (over 100).

Google notes that Gemini Embedding is in an “experimental phase” with limited capacity and is subject to change. “[W]e’re working towards a stable, generally available release in the months to come,” the company wrote in its blog post.
