forked from mlc-ai/mlc-llm
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Refactor to prepare for parallel sampling (#100)
* wip
* wip
* wip
* fix
* fix
* fix
* refactor
* more refactor
* wip
* wip
* more refactor
* more refactor
* fixed
* fixed mypy
* minor
* msg clean
* fix missing finish_reason
* remove unnecessary type annot on defaultdict
* Return requests state from get_requests_to_process
* simplify typing
* reduced list concat
* remove dict add and lookup
* wrong comment
* Revert "remove dict add and lookup"
  This reverts commit 5382004.
* fix sampler test
* make it possible to disable prometheus metrics
* collect metrics only in staging engine
* return False in stop_by_length if request is already finished
* move check_stopping_sequences to engine_common.py
* add missing free_request method to Dummy cache manager
* update Dummy cache manager to operate on sequence
* fix request finish condition
Showing 12 changed files with 745 additions and 629 deletions.