Embeddings

GPT4All can generate high-quality embeddings for text documents of arbitrary length using a CPU-optimized, contrastively trained Sentence Transformer. For many tasks, these embeddings are comparable in quality to OpenAI's.

Quickstart

pip install gpt4all

Generating embeddings

The embedding model is downloaded automatically if it is not already installed.

from gpt4all import Embed4All

text = 'The quick brown fox jumps over the lazy dog'
embedder = Embed4All()  # downloads the embedding model on first use
output = embedder.embed(text)  # a list of floats
print(output)
[0.034696947783231735, -0.07192722707986832, 0.06923297047615051, ...]
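
A common next step is to compare two embeddings. The sketch below computes cosine similarity over plain Python lists, which is the type `embed` returns. The helper name `cosine_similarity` and the short stand-in vectors are illustrative assumptions and not part of the gpt4all package; in practice the inputs would come from `embedder.embed(...)`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hand-written stand-ins for embedder.embed(...) output (real vectors are much longer).
vec_a = [0.03, -0.07, 0.07]
vec_b = [0.03, -0.07, 0.07]
vec_c = [-0.07, -0.03, 0.0]

same = cosine_similarity(vec_a, vec_b)        # identical vectors
orthogonal = cosine_similarity(vec_a, vec_c)  # dot product is zero
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate unrelated directions.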

Speed of embedding generation

The following table lists embedding generation speed for a text document, measured on an Intel i9-13900HX CPU with DDR5-5600 memory, running 8 threads under stable load.

Tokens             128    512    2048   8192   16384
Wall time (s)      0.02   0.08   0.24   0.96   1.9
Tokens / second    6508   6431   8622   8509   8369
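
To take a comparable measurement on other hardware, something like the following sketch could be used. It times a single call and divides a token count by the wall time, which is how the throughput row relates to the other two. The helper names and `fake_embed` are hypothetical; the token count has to come from the model's tokenizer, which this sketch does not compute.

```python
import time
from typing import Callable, List


def tokens_per_second(n_tokens: int, wall_time_s: float) -> float:
    """Throughput: number of tokens divided by wall time in seconds."""
    if wall_time_s <= 0:
        raise ValueError("wall time must be positive")
    return n_tokens / wall_time_s


def time_embedding(embed_fn: Callable[[str], List[float]], text: str) -> float:
    """Return the wall time in seconds of a single embed_fn(text) call."""
    start = time.perf_counter()
    embed_fn(text)
    return time.perf_counter() - start


# Stand-in for Embed4All().embed so the sketch runs without a model download.
def fake_embed(text: str) -> List[float]:
    return [float(len(text))]


elapsed = time_embedding(fake_embed, "The quick brown fox jumps over the lazy dog")
rate = tokens_per_second(1000, 0.5)  # 1000 tokens in half a second
```

Passing `Embed4All().embed` in place of `fake_embed` would time the real model. Averaging several runs after a warm-up call is likely to give steadier numbers than a single measurement.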

API documentation

Embed4All

Python class that handles embeddings for GPT4All.

Source code in gpt4all/gpt4all.py
class Embed4All:
    """
    Python class that handles embeddings for GPT4All.
    """

    def __init__(
        self,
        n_threads: Optional[int] = None,
    ):
        """
        Constructor

        Args:
            n_threads: Number of CPU threads used by GPT4All. Defaults to None, in which case the number of threads is determined automatically.
        """
        self.gpt4all = GPT4All(model_name='ggml-all-MiniLM-L6-v2-f16.bin', n_threads=n_threads)

    def embed(self, text: str) -> List[float]:
        """
        Generate an embedding.

        Args:
            text: The text document to generate an embedding for.

        Returns:
            An embedding of your document of text.
        """
        return self.gpt4all.model.generate_embedding(text)
__init__(n_threads=None)

Constructor

Parameters:

  • n_threads (Optional[int], default: None ) –

    Number of CPU threads used by GPT4All. Defaults to None, in which case the number of threads is determined automatically.

embed(text)

Generate an embedding.

Parameters:

  • text (str) –

    The text document to generate an embedding for.

Returns:

  • List[float]

    An embedding of your document of text.

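
Because `embed` takes a single string, embedding a collection of documents means calling it once per document. The sketch below shows one possible wrapper. `Embedder`, `embed_documents`, and `FakeEmbedder` are hypothetical names introduced for illustration and are not part of the gpt4all package; the only assumption about the real class is the `embed(text) -> List[float]` signature documented above.

```python
from typing import Dict, Iterable, List, Protocol


class Embedder(Protocol):
    """Anything with an embed(text) -> List[float] method, such as Embed4All."""

    def embed(self, text: str) -> List[float]: ...


def embed_documents(embedder: Embedder, documents: Iterable[str]) -> Dict[str, List[float]]:
    """Embed each distinct document once and map its text to its vector."""
    vectors: Dict[str, List[float]] = {}
    for doc in documents:
        if doc not in vectors:  # skip documents that were already embedded
            vectors[doc] = embedder.embed(doc)
    return vectors


class FakeEmbedder:
    """Stand-in so the sketch runs without downloading a model."""

    def __init__(self) -> None:
        self.calls = 0

    def embed(self, text: str) -> List[float]:
        self.calls += 1
        return [float(len(text)), float(text.count(" "))]


fake = FakeEmbedder()
result = embed_documents(fake, ["hello world", "hi", "hello world"])
```

With the real library, `embed_documents(Embed4All(), docs)` should behave the same way, with each vector produced by the downloaded model instead of the stand-in.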