Tip
Concrete provider implementations are located in bee-agent-framework/adapters.
The base abstractions are located in bee-agent-framework/llms.
A Large Language Model (LLM) is an AI designed to understand and generate human-like text. Trained on extensive text data, LLMs learn language patterns, grammar, context, and basic reasoning to perform tasks like text completion, translation, summarization, and answering questions.
To unify differences between various provider APIs, the framework defines a common interface: a fixed set of actions that can be performed with any supported model, regardless of the provider behind it.
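To make the idea of a common interface concrete, here is a minimal sketch. The names `TextModel`, `TextOutput`, `EchoModel`, `ShoutModel`, and `ask` are hypothetical and do not come from the framework. The sketch is also synchronous to stay self-contained, whereas the framework's own `generate` is awaited. It only illustrates that calling code can depend on the shared interface rather than on a specific provider.

```typescript
// Hypothetical sketch: these types are illustrative, not the framework's API.
interface TextOutput {
  getTextContent(): string;
}

interface TextModel {
  readonly provider: string;
  generate(prompt: string): TextOutput;
}

// Two stand-in "providers" with different internals behind the same interface.
class EchoModel implements TextModel {
  readonly provider = "echo";
  generate(prompt: string): TextOutput {
    return { getTextContent: () => prompt };
  }
}

class ShoutModel implements TextModel {
  readonly provider = "shout";
  generate(prompt: string): TextOutput {
    const text = prompt.toUpperCase();
    return { getTextContent: () => text };
  }
}

// Calling code is written once, against the interface only.
function ask(model: TextModel, prompt: string): string {
  return `[${model.provider}] ${model.generate(prompt).getTextContent()}`;
}

const models: TextModel[] = [new EchoModel(), new ShoutModel()];
const answers = models.map((model) => ask(model, "hello"));
```

Swapping one provider for another then means changing only the line that constructs the model, which is the property the adapters in this framework aim for.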
Name | LLM | Chat LLM | Structured output (constrained decoding) |
---|---|---|---|
WatsonX | ✅ | ❌ | |
Ollama | ✅ | ✅ | |
OpenAI | ❌ | ✅ | |
Azure OpenAI | ❌ | ✅ | |
LangChain | ❌ | | |
Groq | ❌ | ✅ | |
AWS Bedrock | ❌ | ✅ | |
VertexAI | ✅ | ✅ | |
➕ Request | | | |
All providers' examples can be found in examples/llms/providers.
Are you interested in creating your own adapter? Jump to the adding a new provider section.
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { WatsonXLLM } from "bee-agent-framework/adapters/watsonx/llm";

const llm = new WatsonXLLM({
  modelId: "google/flan-ul2",
  projectId: process.env.WATSONX_PROJECT_ID,
  apiKey: process.env.WATSONX_API_KEY,
  region: process.env.WATSONX_REGION, // (optional) default is us-south
  parameters: {
    decoding_method: "greedy",
    max_new_tokens: 50,
  },
});

const reader = createConsoleReader();
const prompt = await reader.prompt();
const response = await llm.generate(prompt);
reader.write(`LLM 🤖 (text) : `, response.getTextContent());
reader.close();
Source: examples/llms/text.ts
Note
The generate method returns an instance of a class that extends the base BaseLLMOutput class.
This class lets you retrieve the response as text using the getTextContent method, along with other useful metadata.
Tip
You can enable streaming communication (internally) by passing { stream: true } as the second parameter to the generate method.
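One plausible reading of "internal" streaming is that the caller still receives a single complete result, while the model fetches it piece by piece under the hood. The sketch below illustrates that reading only. `SketchLLM`, `GenerateOptions`, and the simulated backend are hypothetical and are not the framework's implementation, and the sketch is synchronous to stay self-contained.

```typescript
// Hypothetical sketch of "internal" streaming; not the framework's implementation.
interface GenerateOptions {
  stream?: boolean;
}

class SketchLLM {
  // Number of pieces received from the simulated backend during the last call.
  chunksReceived = 0;

  // Simulated backend that yields the answer word by word.
  private *streamChunks(prompt: string): Generator<string> {
    for (const word of `You said: ${prompt}`.split(" ")) {
      yield word;
    }
  }

  // Simulated backend that returns the whole answer at once.
  private complete(prompt: string): string {
    return `You said: ${prompt}`;
  }

  // The return type is the same with or without streaming.
  generate(prompt: string, options: GenerateOptions = {}): string {
    this.chunksReceived = 0;
    if (!options.stream) {
      this.chunksReceived = 1;
      return this.complete(prompt);
    }
    const parts: string[] = [];
    for (const chunk of this.streamChunks(prompt)) {
      this.chunksReceived += 1;
      parts.push(chunk);
    }
    return parts.join(" ");
  }
}

const sketch = new SketchLLM();
const whole = sketch.generate("hi there");
const wholeChunks = sketch.chunksReceived;
const streamed = sketch.generate("hi there", { stream: true });
const streamedChunks = sketch.chunksReceived;
```

In this sketch both calls return the same text, and only the number of pieces received differs. That is why the option can be toggled without changing the calling code.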
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { BaseMessage, Role } from "bee-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "bee-agent-framework/adapters/ollama/chat";

const llm = new OllamaChatLLM();
const reader = createConsoleReader();

for await (const { prompt } of reader) {
  const response = await llm.generate([
    BaseMessage.of({
      role: Role.USER,
      text: prompt,
    }),
  ]);
  reader.write(`LLM 🤖 (txt) : `, response.getTextContent());
  reader.write(`LLM 🤖 (raw) : `, JSON.stringify(response.finalResult));
}
Source: examples/llms/chat.ts
Note
The generate method returns an instance of a class that extends the base ChatLLMOutput class.
This class lets you retrieve the response as text using the getTextContent method, along with other useful metadata.
To retrieve all messages (chunks), access the messages property (getter).
Tip
You can enable streaming communication (internally) by passing { stream: true } as the second parameter to the generate method.
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { BaseMessage, Role } from "bee-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "bee-agent-framework/adapters/ollama/chat";

const llm = new OllamaChatLLM();
const reader = createConsoleReader();

for await (const { prompt } of reader) {
  for await (const chunk of llm.stream([
    BaseMessage.of({
      role: Role.USER,
      text: prompt,
    }),
  ])) {
    reader.write(`LLM 🤖 (txt) : `, chunk.getTextContent());
    reader.write(`LLM 🤖 (raw) : `, JSON.stringify(chunk.finalResult));
  }
}
Source: examples/llms/chatStream.ts
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { BaseMessage, Role } from "bee-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "bee-agent-framework/adapters/ollama/chat";

const llm = new OllamaChatLLM();
const reader = createConsoleReader();

for await (const { prompt } of reader) {
  const response = await llm
    .generate(
      [
        BaseMessage.of({
          role: Role.USER,
          text: prompt,
        }),
      ],
      {},
    )
    .observe((emitter) =>
      emitter.match("*", (data, event) => {
        reader.write(`LLM 🤖 (event: ${event.name})`, JSON.stringify(data));
        // if you want to close the stream prematurely, just uncomment the following line
        // callbacks.abort()
      }),
    );
  reader.write(`LLM 🤖 (txt) : `, response.getTextContent());
  reader.write(`LLM 🤖 (raw) : `, JSON.stringify(response.finalResult));
}
Source: examples/llms/chatCallback.ts
import "dotenv/config.js";
import { z } from "zod";
import { BaseMessage, Role } from "bee-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "bee-agent-framework/adapters/ollama/chat";
import { JsonDriver } from "bee-agent-framework/llms/drivers/json";

const llm = new OllamaChatLLM();
const driver = new JsonDriver(llm);

const response = await driver.generate(
  z.union([
    z.object({
      firstName: z.string().min(1),
      lastName: z.string().min(1),
      address: z.string(),
      age: z.number().int().min(1),
      hobby: z.string(),
    }),
    z.object({
      error: z.string(),
    }),
  ]),
  [
    BaseMessage.of({
      role: Role.USER,
      text: "Generate a profile of a citizen of Europe.",
    }),
  ],
);
console.info(response);
Source: examples/llms/structured.ts
To use an inference provider that is not in the providers list above, feel free to create a request.
If the request is approved and you want to implement the adapter yourself, follow the steps below. Let's assume the name of your provider is Custom.
- Base location within the framework: bee-agent-framework/adapters/custom
  - Text LLM (filename): llm.ts (example implementation)
  - Chat LLM (filename): chat.ts (example implementation)
Important
If the target provider provides an SDK, use it.
Important
All provider-related dependencies (if any) must be included in both devDependencies and peerDependencies in the package.json.
Tip
To simplify work with the target REST API, feel free to use the helper RestfulClient class.
An example of the client's usage can be seen in the WatsonX LLM adapter.
Tip
Parsing environment variables should be done via the helper functions parseEnv, hasEnv, and getEnv.
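To illustrate why adapter configuration is usually routed through small helpers rather than read ad hoc, here is a minimal sketch of that pattern. The functions `hasValue`, `readValue`, and `parseValue` are hypothetical stand-ins written for this example. They are not the framework's parseEnv, hasEnv, or getEnv, whose signatures and behavior may differ. The `CUSTOM_*` variable names are also assumptions, and a plain object stands in for `process.env`.

```typescript
// Hypothetical stand-ins; the framework's helpers may differ in signature and behavior.
type Env = Record<string, string | undefined>;

// True when the variable is set to a non-blank value.
function hasValue(env: Env, key: string): boolean {
  const value = env[key];
  return value !== undefined && value.trim() !== "";
}

// Returns the trimmed value, or the fallback when the variable is unset or blank.
function readValue(env: Env, key: string, fallback?: string): string | undefined {
  return hasValue(env, key) ? env[key]!.trim() : fallback;
}

// Parses the value with the given function; throws when it is missing and no fallback exists.
function parseValue<T>(env: Env, key: string, parse: (raw: string) => T, fallback?: T): T {
  const raw = readValue(env, key);
  if (raw === undefined) {
    if (fallback === undefined) {
      throw new Error(`Missing required environment variable "${key}"`);
    }
    return fallback;
  }
  return parse(raw);
}

// A plain object standing in for process.env (assumed variable names).
const fakeEnv: Env = { CUSTOM_API_KEY: " secret ", CUSTOM_TIMEOUT_MS: "2500", CUSTOM_REGION: "" };

const apiKey = parseValue(fakeEnv, "CUSTOM_API_KEY", (raw) => raw);
const timeoutMs = parseValue(fakeEnv, "CUSTOM_TIMEOUT_MS", (raw) => {
  const parsed = Number(raw);
  if (!Number.isFinite(parsed)) throw new Error(`CUSTOM_TIMEOUT_MS is not a number: ${raw}`);
  return parsed;
});
const region = parseValue(fakeEnv, "CUSTOM_REGION", (raw) => raw, "us-south");
```

Centralizing the lookup gives every adapter the same handling of blank values, defaults, and error messages, which is presumably the motivation for the shared helpers.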