Chat Model API

The Chat Model API offers developers the ability to integrate AI-powered chat completion capabilities into their applications. It leverages pre-trained language models, such as GPT (Generative Pre-trained Transformer), to generate human-like responses to user inputs in natural language.

The API typically works by sending a prompt or partial conversation to the AI model, which then generates a completion or continuation of the conversation based on its training data and understanding of natural language patterns. The completed response is then returned to the application, which can present it to the user or use it for further processing.

The NestJS AI Chat Model API is designed to be a simple and portable interface for interacting with various AI Models, allowing developers to switch between different models with minimal code changes. This design aligns with NestJS’s philosophy of modularity and interchangeability.

With the help of companion classes such as Prompt for input encapsulation and ChatResponse for output handling, the Chat Model API unifies communication with AI Models. It manages the complexity of request preparation and response parsing, offering a direct and simplified API interaction.

You can find more about the available implementations in the Available Implementations section and a detailed comparison in the Chat Models Comparison section.

API Overview

This section provides a guide to the NestJS AI Chat Model API interface and associated classes.

ChatModel

Here is the ChatModel abstract class definition:

export abstract class ChatModel
  extends StreamingChatModel
  implements Model<Prompt, ChatResponse>
{
  call(message: string): Promise<string | null>;
  call(...messages: Message[]): Promise<string | null>;
  call(prompt: Prompt): Promise<ChatResponse>;

  protected abstract callPrompt(prompt: Prompt): Promise<ChatResponse>;

  get defaultOptions(): ChatOptions;
}

The call() method with a string parameter simplifies initial use, avoiding the complexities of the more sophisticated Prompt and ChatResponse classes. In real-world applications, it is more common to use the call() method that takes a Prompt instance and returns a ChatResponse.
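
As a rough illustration of how the three call() signatures relate, the sketch below normalizes each argument shape into a single Prompt. The Message and Prompt types here are reduced, hypothetical stand-ins rather than the NestJS AI classes, toPrompt is an invented helper, and wrapping a plain string in a user message is an assumption about the internals, not something this page confirms.

```typescript
// Reduced stand-ins for illustration only; the real classes have more members.
interface Message {
  readonly messageType: "user" | "system" | "assistant";
  readonly text: string | null;
}

class Prompt {
  readonly instructions: Message[];
  constructor(instructions: Message[]) {
    this.instructions = instructions;
  }
}

// Hypothetical helper: maps the three call() argument shapes onto one Prompt.
function toPrompt(first: string | Prompt | Message, ...rest: Message[]): Prompt {
  if (first instanceof Prompt) {
    return first; // already a Prompt, pass it through unchanged
  }
  if (typeof first === "string") {
    // Assumption: a bare string is treated as a single user message.
    return new Prompt([{ messageType: "user", text: first }]);
  }
  return new Prompt([first, ...rest]); // one or more Message objects
}

const fromText = toPrompt("Tell me a joke");
const fromMessages = toPrompt(
  { messageType: "system", text: "You are terse." },
  { messageType: "user", text: "Tell me a joke" },
);
console.log(fromText.instructions.length, fromMessages.instructions.length);
```

Under this reading, the string and Message overloads are conveniences layered over the Prompt overload, which is why the Prompt form is the one to reach for once options or metadata matter.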

StreamingChatModel

Here is the StreamingChatModel abstract class definition:

export abstract class StreamingChatModel
  implements StreamingModel<Prompt, ChatResponse>
{
  stream(message: string): Observable<string>;
  stream(...messages: Message[]): Observable<string>;
  stream(prompt: Prompt): Observable<ChatResponse>;

  protected abstract streamPrompt(prompt: Prompt): Observable<ChatResponse>;
}

The stream() method accepts the same string, Message, or Prompt arguments as ChatModel’s call() method, but it streams the responses using the reactive RxJS Observable API.

Prompt

The Prompt is a ModelRequest that encapsulates a list of Message objects and optional model request options. The following listing shows a truncated version of the Prompt class, excluding constructors and other utility methods:

export class Prompt implements ModelRequest<Message[]> {

  private readonly messages: Message[];
  private readonly chatOptions: ChatOptions | null;

  get options(): ChatOptions | null {...}

  get instructions(): Message[] {...}

  // constructors and utility methods omitted
}

Message

The Message interface encapsulates a Prompt’s textual content, a collection of metadata attributes, and a categorization known as MessageType.

The interface is defined as follows:

export interface Content {
  get text(): string | null;

  get metadata(): Record<string, unknown>;
}

export interface Message extends Content {
  get messageType(): MessageType;
}

The multimodal message types also implement the MediaContent interface, which provides a list of Media content objects.

export interface MediaContent extends Content {
  get media(): Media[];
}

The Message interface has various implementations that correspond to the categories of messages that an AI model can process.

Chat completion endpoints distinguish between message categories based on conversational roles, which MessageType maps directly.

For instance, OpenAI recognizes message categories for distinct conversational roles such as system, user, function, or assistant.

While the term MessageType might imply a specific message format, in this context it effectively designates the role a message plays in the dialogue.

For AI models that do not use specific roles, the UserMessage implementation acts as a standard category, typically representing user-generated inquiries or instructions. To understand the practical application and the relationship between Prompt and Message, especially in the context of these roles or message categories, see the detailed explanations in the Prompts section.
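
To make the role idea concrete, here is a minimal sketch in which messageType is the only thing that distinguishes one message from another. The MessageType values are taken from the role names mentioned above and are an assumption about the real type. SystemMessage and the textForRole helper are hypothetical, and the classes are reduced stand-ins, not the NestJS AI implementations.

```typescript
// Assumed role values, taken from the role names mentioned in the text above.
type MessageType = "user" | "assistant" | "system" | "function";

interface Message {
  readonly messageType: MessageType;
  readonly text: string | null;
  readonly metadata: Record<string, unknown>;
}

// Reduced stand-in for the UserMessage implementation named in the text.
class UserMessage implements Message {
  readonly messageType: MessageType = "user";
  readonly metadata: Record<string, unknown> = {};
  readonly text: string;
  constructor(text: string) {
    this.text = text;
  }
}

// Hypothetical counterpart carrying the system role.
class SystemMessage implements Message {
  readonly messageType: MessageType = "system";
  readonly metadata: Record<string, unknown> = {};
  readonly text: string;
  constructor(text: string) {
    this.text = text;
  }
}

const conversation: Message[] = [
  new SystemMessage("Answer in one sentence."),
  new UserMessage("What is an Observable?"),
];

// Hypothetical helper: selects message text by role, not by class.
function textForRole(messages: Message[], role: MessageType): string[] {
  return messages
    .filter((m) => m.messageType === role)
    .map((m) => m.text ?? "");
}

console.log(textForRole(conversation, "system"));
```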

Chat Options

ChatOptions represents the options that can be passed to the AI model. The interface extends ModelOptions and defines a small set of portable options. It is defined as follows:

export interface ChatOptions extends ModelOptions {
  model?: string | null;
  frequencyPenalty?: number | null;
  maxTokens?: number | null;
  presencePenalty?: number | null;
  stopSequences?: string[] | null;
  temperature?: number | null;
  topK?: number | null;
  topP?: number | null;

  copy(): ChatOptions;
  mutate(): ChatOptions.Builder;
}

Additionally, every model-specific ChatModel / StreamingChatModel implementation can have its own options that can be passed to the AI model. For example, the OpenAI Chat Completion model has its own options like logitBias, seed, and user.

NestJS AI lets you set default Chat Model configuration at start-up and override those settings on a per-request basis. This makes it straightforward to work with different AI models and adjust parameters as needed, all within the consistent interface provided by the NestJS AI framework.

When using ChatModel.call() / ChatModel.stream(), the passed prompt needs to contain a full set of options that will completely take precedence over options set in the model (or use null options in the Prompt to use the model’s defaults).
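
The precedence rule above can be sketched as a whole-object replacement, not a field-by-field merge. The ChatOptions shape below is a reduced subset of the interface listed earlier, resolveOptions is a hypothetical helper written for this illustration, and the model names are placeholders.

```typescript
// Reduced subset of the ChatOptions fields, for illustration only.
interface ChatOptions {
  model?: string | null;
  temperature?: number | null;
  maxTokens?: number | null;
}

// Hypothetical helper: the prompt's options win wholesale; null falls back to
// the model's start-up defaults. Nothing is merged field by field.
function resolveOptions(startUp: ChatOptions, runtime: ChatOptions | null): ChatOptions {
  return runtime ?? startUp;
}

const startUp: ChatOptions = { model: "model-a", temperature: 0.7, maxTokens: 256 };

// Only model is set here, so temperature and maxTokens are NOT inherited.
const overridden = resolveOptions(startUp, { model: "model-b" });

// Null options mean the start-up defaults apply unchanged.
const defaulted = resolveOptions(startUp, null);

console.log(overridden, defaulted);
```

This is why a Prompt passed directly to the model needs a full set of options: any field left out of the runtime options is simply absent, not filled in from the defaults.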

The ChatClient abstraction allows for an incremental approach where users can provide a "delta" customizer that

The following flow diagram illustrates how NestJS AI handles the configuration and execution of Chat Models:

Start-up Configuration - The ChatModel/StreamingChatModel is initialized with "Start-Up" Chat Options. These options are set during the ChatModel initialization and are meant to provide default configurations.
Runtime Configuration - For each request, the Prompt can contain Runtime Chat Options. These fully override the start-up options.
Input Processing - The "Convert Input" step transforms the input instructions into native, model-specific formats.
Output Processing - The "Convert Output" step transforms the model’s response into a standardized ChatResponse format.
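
The two conversion steps can be pictured with a small round trip. Everything provider-facing in this sketch is invented: NativeRequest, NativeResponse, the fallback values, and the fakeProvider function do not correspond to any real provider's wire format, and the portable types are reduced stand-ins for the NestJS AI ones.

```typescript
// Reduced stand-ins for the portable types.
interface Message {
  readonly messageType: "user" | "system" | "assistant";
  readonly text: string | null;
}

interface ChatOptions {
  model?: string | null;
  temperature?: number | null;
}

// Invented native shapes; a real provider defines its own.
interface NativeRequest {
  model: string;
  temperature: number;
  messages: { role: string; content: string }[];
}

interface NativeResponse {
  choices: { content: string }[];
}

interface SimpleChatResponse {
  results: { output: Message }[];
}

// "Convert Input": portable messages and options become the native request.
function convertInput(messages: Message[], options: ChatOptions): NativeRequest {
  return {
    model: options.model ?? "example-model", // placeholder fallback
    temperature: options.temperature ?? 1, // placeholder fallback
    messages: messages.map((m) => ({ role: m.messageType, content: m.text ?? "" })),
  };
}

// "Convert Output": the native response becomes the standardized shape.
function convertOutput(native: NativeResponse): SimpleChatResponse {
  return {
    results: native.choices.map((choice): { output: Message } => ({
      output: { messageType: "assistant", text: choice.content },
    })),
  };
}

// Fake provider that echoes the last message; a real one would call a remote API.
function fakeProvider(request: NativeRequest): NativeResponse {
  const last = request.messages[request.messages.length - 1];
  return { choices: [{ content: `echo(${request.model}): ${last?.content ?? ""}` }] };
}

const nativeRequest = convertInput([{ messageType: "user", text: "ping" }], { model: "model-b" });
const chatResponse = convertOutput(fakeProvider(nativeRequest));
console.log(chatResponse.results[0]?.output.text);
```

Keeping both conversions at the edge of the model implementation is what lets client code stay the same when the provider changes.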

ChatResponse

The structure of the ChatResponse class is as follows:

export class ChatResponse implements ModelResponse<Generation> {

  private readonly _generations: Generation[];
  private readonly _chatResponseMetadata: ChatResponseMetadata;

  get result(): Generation | null {...}

  get results(): Generation[] {...}

  get metadata(): ChatResponseMetadata {...}

  // other methods omitted
}

The ChatResponse class holds the AI Model’s output, with each Generation instance containing one of potentially multiple outputs resulting from a single prompt.

The ChatResponse class also carries a ChatResponseMetadata instance that holds metadata about the AI Model’s response.

Generation

Finally, the Generation class implements ModelResult to represent the model output (assistant message) and related metadata:

export class Generation implements ModelResult<AssistantMessage> {

  private readonly _assistantMessage: AssistantMessage;
  private readonly _chatGenerationMetadata: ChatGenerationMetadata;

  get output(): AssistantMessage {...}

  get metadata(): ChatGenerationMetadata {...}

  // other methods omitted
}
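
A short sketch of reading a response may help tie ChatResponse and Generation together. The classes below are reduced, hypothetical versions of the listings above with metadata left out, and the behavior of result (first generation, or null when there are none) is an assumption inferred from its Generation | null return type.

```typescript
// Reduced, hypothetical versions of the classes listed above (metadata omitted).
class AssistantMessage {
  readonly messageType = "assistant";
  readonly text: string | null;
  constructor(text: string | null) {
    this.text = text;
  }
}

class Generation {
  readonly output: AssistantMessage;
  constructor(output: AssistantMessage) {
    this.output = output;
  }
}

class ChatResponse {
  private readonly generations: Generation[];
  constructor(generations: Generation[]) {
    this.generations = generations;
  }

  // Assumption: result is the first generation, or null when there are none.
  get result(): Generation | null {
    return this.generations[0] ?? null;
  }

  get results(): Generation[] {
    return this.generations;
  }
}

const sampleResponse = new ChatResponse([
  new Generation(new AssistantMessage("Hello!")),
  new Generation(new AssistantMessage("Hi there!")),
]);

// Text of the primary generation, or null if the model returned nothing.
const firstText = sampleResponse.result?.output.text ?? null;

// Text of every generation produced for the prompt.
const allTexts = sampleResponse.results.map((g) => g.output.text);

// An empty response yields null instead of throwing.
const emptyText = new ChatResponse([]).result?.output.text ?? null;

console.log(firstText, allTexts, emptyText);
```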

Available Implementations

This diagram illustrates how the unified interfaces, ChatModel and StreamingChatModel, are used to interact with AI chat models from different providers, allowing easy integration and switching between AI services while maintaining a consistent API for the client application.

OpenAI Chat Completion (streaming, multi-modality & function-calling support)
Google GenAI Chat Completion (streaming, multi-modality & function-calling support)
Anthropic Chat Completion (streaming, multi-modality & function-calling support)
Ollama Chat Completion (streaming, multi-modality & function-calling support)

The following chat model integrations from Spring AI are not yet available in NestJS AI: Azure OpenAI, Amazon Bedrock, Mistral AI, DeepSeek, Groq, MiniMax, Moonshot AI, NVIDIA, Perplexity AI, and QianFan.

Find a detailed comparison of the available Chat Models in the Chat Models Comparison section.

Chat Model API

The NestJS AI Chat Model API is built on top of the NestJS AI Generic Model API, providing Chat-specific abstractions and implementations. This allows easy integration and switching between different AI services while maintaining a consistent API for the client application. The following class diagram illustrates the main classes and interfaces of the NestJS AI Chat Model API.