# Support for local models (+ other OpenAI-compatible endpoints) #32

Status: Open. Wants to merge 8 commits into `main`; this view shows changes from 6 commits.
46 changes: 41 additions & 5 deletions README.md

```diff
@@ -116,13 +116,49 @@ The CLI will prompt you to input instructions interactively:

 You can configure the demo by specifying the following parameters:

-- `--aggregator`: The primary model used for final response generation.
-- `--reference_models`: List of models used as references.
+- `--model`: The primary model used for final response generation.
+- `--reference-models`: Models used as references.
 - `--temperature`: Controls the randomness of the response generation.
-- `--max_tokens`: Maximum number of tokens in the response.
+- `--max-tokens`: Maximum number of tokens in the response.
 - `--rounds`: Number of rounds to process the input for refinement. (num rounds == num of MoA layers - 1)
-- `--num_proc`: Number of processes to run in parallel for faster execution.
-- `--multi_turn`: Boolean to toggle multi-turn interaction capability.
+- `--num-proc`: Number of processes to run in parallel for faster execution.
+- `--multi-turn`: Boolean to toggle multi-turn interaction capability.
```

**Review comment** (on the `--aggregator` line): Why are we renaming this parameter?

**@tijszwinkels** (author, Jul 26, 2024): The documentation is incorrect; Typer just gets the parameters from the `main` function, where `model` is used to define the primary model. I changed the code to use `--aggregator` instead of `--model`.

Specify `--reference-models` multiple times to use multiple models as references. For example:

```bash
# Specify multiple reference models
python bot.py --reference-models "mistralai/Mixtral-8x22B-Instruct-v0.1" --reference-models "Qwen/Qwen2-72B-Instruct"
```
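Repeated flags like this are collected into a list by the CLI layer. As an illustration of the same repeated-flag pattern, here is a minimal stdlib `argparse` sketch (illustrative only; bot.py actually uses Typer):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--model", default="Qwen/Qwen2-72B-Instruct")
# action="append" makes each occurrence of the flag add one item to a list
parser.add_argument("--reference-models", action="append", default=[],
                    dest="reference_models")

args = parser.parse_args([
    "--reference-models", "mistralai/Mixtral-8x22B-Instruct-v0.1",
    "--reference-models", "Qwen/Qwen2-72B-Instruct",
])
print(args.reference_models)
# ['mistralai/Mixtral-8x22B-Instruct-v0.1', 'Qwen/Qwen2-72B-Instruct']
```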

## Other OpenAI-compatible API endpoints

To use a different OpenAI-compatible API endpoint, set the `OPENAI_BASE_URL` and `OPENAI_API_KEY` environment variables (and clear `TOGETHER_API_KEY`, which would otherwise take precedence):

```bash
export TOGETHER_API_KEY=
export OPENAI_BASE_URL="https://your-api-provider.com/v1"
export OPENAI_API_KEY="your-api-key-here"
```

This way, any third-party OpenAI-compatible API can be used, such as OpenRouter, Groq, or local models.

### Ollama

For example, to run the bot using Ollama:

1. Set up the environment:

```bash
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_API_KEY=ollama
```

2. Run the bot command:

```bash
python bot.py --model llama3 --reference-models llama3 --reference-models mistral
```

## Evaluation

7 changes: 4 additions & 3 deletions bot.py

```diff
@@ -118,7 +118,7 @@ def main(

     model = Prompt.ask(
         "\n1. What main model do you want to use?",
-        default="Qwen/Qwen2-72B-Instruct",
+        default=model,
     )
     console.print(f"Selected {model}.", style="yellow italic")
     temperature = float(
@@ -199,8 +199,9 @@ def main(

     for chunk in output:
         out = chunk.choices[0].delta.content
-        console.print(out, end="")
-        all_output += out
+        if out is not None:
+            console.print(out, end="")
+            all_output += out
     print()

     if DEBUG:
```

**Review comment** (on `default=model`): Suggest reverting to "aggregator".

**@tijszwinkels** (author): Done!
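The `if out is not None` guard addresses a real property of OpenAI-style streaming: some chunks carry `delta.content = None` (e.g. a chunk holding only a role or a finish reason), and `all_output += None` raises `TypeError`. A self-contained illustration with stand-in chunk objects:

```python
from types import SimpleNamespace

def collect(chunks):
    """Concatenate streamed text, skipping chunks with no content."""
    all_output = ""
    for chunk in chunks:
        out = chunk.choices[0].delta.content
        if out is not None:  # skip role/finish-reason chunks with no text
            all_output += out
    return all_output

# Fake stream: two text chunks followed by a final content-less chunk
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hel", "lo", None]
]
print(collect(fake))  # Hello
```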
1 change: 1 addition & 0 deletions requirements.txt

```diff
@@ -4,3 +4,4 @@ loguru
 datasets
 typer
 rich
+cffi
```

**Review comment** (on `cffi`): Remove this.

**@tijszwinkels** (author, Jul 26, 2024): In my testing, this seems to be required through the `datasets` import. Without it:

```
conda create --name moa-test python=3.11
conda activate moa-test
pip install -r requirements.txt
DEBUG=1 python bot.py --model "groq/mixtral-8x7b-32768" --reference-models "groq/llama3-70b-8192" --reference-models "groq/mixtral-8x7b-32768" --reference-models "groq/gemma2-9b-it" --rounds 2
Traceback (most recent call last):
  File "/Users/tijs/os/MoA/bot.py", line 1, in <module>
    import datasets
  File "/opt/homebrew/anaconda3/envs/moa-test/lib/python3.11/site-packages/datasets/__init__.py", line 17, in <module>
    from .arrow_dataset import Dataset
  File "/opt/homebrew/anaconda3/envs/moa-test/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 75, in <module>
    from . import config
  File "/opt/homebrew/anaconda3/envs/moa-test/lib/python3.11/site-packages/datasets/config.py", line 145, in <module>
    importlib.import_module("soundfile").__libsndfile_version__
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/envs/moa-test/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tijs/.local/lib/python3.11/site-packages/soundfile.py", line 17, in <module>
    from _soundfile import ffi as _ffi
  File "/Users/tijs/.local/lib/python3.11/site-packages/_soundfile.py", line 2, in <module>
    import _cffi_backend
ModuleNotFoundError: No module named '_cffi_backend'
```

**@tijszwinkels** (author): Actually, you're right: this looks like a bug in whatever version of `datasets` I happened to pick up. I removed it. So: Done!
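The failure mode in the traceback is an optional-dependency probe: `datasets.config` imports `soundfile` just to read a version attribute, and `soundfile` in turn needs the compiled `_cffi_backend` module. A defensive version of such a probe (an illustrative sketch, not code from either library) returns `None` instead of crashing:

```python
import importlib

def probe_version(module_name):
    """Return a module's __libsndfile_version__ attribute, or None when the
    module (or its compiled backend) is missing, rather than raising."""
    try:
        mod = importlib.import_module(module_name)
    except (ImportError, OSError):
        # ImportError covers a missing _cffi_backend; OSError covers a
        # missing native library the module tries to load at import time.
        return None
    return getattr(mod, "__libsndfile_version__", None)

print(probe_version("definitely_not_installed_xyz"))  # None
```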

28 changes: 21 additions & 7 deletions utils.py

```diff
@@ -10,6 +10,19 @@

 DEBUG = int(os.environ.get("DEBUG", "0"))

+TOGETHER_API_KEY = os.environ.get("TOGETHER_API_KEY")
+OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
+EVAL_API_KEY = os.environ.get("EVAL_API_KEY")
+
+# If TOGETHER_API_KEY is set, use that one instead and use OpenAI for evaluations
+if TOGETHER_API_KEY:
+    OPENAI_API_KEY = TOGETHER_API_KEY
+    EVAL_API_KEY = os.environ.get("OPENAI_API_KEY")
+
+OPENAI_BASE_URL = os.environ.get("OPENAI_BASE_URL", "https://api.together.xyz/v1")
+EVAL_BASE_URL = os.environ.get("EVAL_BASE_URL", "https://api.openai.com/v1")
+
+

 def generate_together(
     model,
```
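The precedence rules introduced above can be restated as a small pure function, which makes them easier to test; this is an illustrative restatement of the hunk's logic, not code from the PR:

```python
def resolve_credentials(env):
    """Mirror the key-precedence rules added to utils.py: when
    TOGETHER_API_KEY is set it becomes the chat key, and the OpenAI key
    is reserved for evaluations."""
    openai_key = env.get("OPENAI_API_KEY")
    eval_key = env.get("EVAL_API_KEY")
    if env.get("TOGETHER_API_KEY"):
        openai_key = env["TOGETHER_API_KEY"]
        eval_key = env.get("OPENAI_API_KEY")
    base_url = env.get("OPENAI_BASE_URL", "https://api.together.xyz/v1")
    return openai_key, eval_key, base_url

# Together key present: it wins, and the OpenAI key moves to eval duty
print(resolve_credentials({"TOGETHER_API_KEY": "t", "OPENAI_API_KEY": "o"}))
# ('t', 'o', 'https://api.together.xyz/v1')
```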
```diff
@@ -21,12 +34,12 @@ def generate_together(

     output = None

+    endpoint = f"{OPENAI_BASE_URL}/chat/completions"
+
     for sleep_time in [1, 2, 4, 8, 16, 32]:

         try:

-            endpoint = "https://api.together.xyz/v1/chat/completions"
-
             if DEBUG:
                 logger.debug(
                     f"Sending messages ({len(messages)}) (last message: `{messages[-1]['content'][:20]}...`) to `{model}`."
@@ -41,7 +54,7 @@ def generate_together(
                     "messages": messages,
                 },
                 headers={
-                    "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY')}",
+                    "Authorization": f"Bearer {OPENAI_API_KEY}",
                 },
             )
             if "error" in res.json():
```
```diff
@@ -80,11 +93,10 @@ def generate_together_stream(
     max_tokens=2048,
     temperature=0.7,
 ):
-    endpoint = "https://api.together.xyz/v1"
     client = openai.OpenAI(
-        api_key=os.environ.get("TOGETHER_API_KEY"), base_url=endpoint
+        api_key=OPENAI_API_KEY, base_url=OPENAI_BASE_URL
     )
-    endpoint = "https://api.together.xyz/v1/chat/completions"
+    endpoint = f"{OPENAI_BASE_URL}/chat/completions"
     response = client.chat.completions.create(
         model=model,
         messages=messages,
```
```diff
@@ -104,7 +116,8 @@ def generate_openai(
 ):

     client = openai.OpenAI(
-        api_key=os.environ.get("OPENAI_API_KEY"),
+        api_key=EVAL_API_KEY,
+        base_url=EVAL_BASE_URL,
     )

     for sleep_time in [1, 2, 4, 8, 16, 32]:
@@ -179,3 +192,4 @@ def generate_with_references(
         temperature=temperature,
         max_tokens=max_tokens,
     )
```
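Both `generate_together` and `generate_openai` retry with the same hard-coded `[1, 2, 4, 8, 16, 32]` sleep schedule. Factored out, the pattern looks like this (a sketch of the idea, not code from the PR):

```python
import time

def with_backoff(fn, delays=(1, 2, 4, 8, 16, 32)):
    """Call fn(); on failure, sleep through the delay schedule and retry.
    Mirrors the inline retry loops in utils.py."""
    last_exc = None
    for delay in delays:
        try:
            return fn()
        except Exception as exc:  # in utils.py: request/API errors
            last_exc = exc
            time.sleep(delay)
    raise last_exc

# Simulate a call that fails twice, then succeeds
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient")
    return "ok"

print(with_backoff(flaky, delays=(0, 0, 0)))  # ok
```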