HuggingFace Endpoint Inference Model Deployer #86

Merged
merged 32 commits into from
Jan 30, 2024

@dudeperf3ct (Contributor) commented Jan 16, 2024

This PR implements a custom model deployer that uses Hugging Face Inference Endpoints.


dagshub bot commented Jan 16, 2024

Contributor

@htahir1 htahir1 left a comment

Wow how good is this! I love it!

I think the next step would be to use this deployer in the deployment step... maybe we can return the service in the step and mark it as a deployment_artifact

Might also make sense to test this out on some sort of base model already available on the huggingface hub

@htahir1 htahir1 changed the base branch from main to feature/add-huggingface-deployer January 22, 2024 09:45
@htahir1 htahir1 changed the base branch from feature/add-huggingface-deployer to main January 22, 2024 09:46
Contributor

@htahir1 htahir1 left a comment

This is so awesome, and you put in such a fantastic effort, @dudeperf3ct! I left some comments, which are mostly nits and questions.

Comment on lines 9 to 16
endpoint_name: Optional[str] = None
repository: Optional[str] = None
framework: Optional[str] = None
accelerator: Optional[str] = None
instance_size: Optional[str] = None
instance_type: Optional[str] = None
region: Optional[str] = None
vendor: Optional[str] = None
Contributor

This is a nitpick, but since these are all optional now, the user has no real way of knowing what to pass in at the step level. Do you think it makes sense to rework this "BaseConfig" idea and just make separate configs for the deployer and the service, so we can mark things as optional or not?
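To make the suggestion concrete, here is a minimal sketch of what two separate configs could look like. It uses plain dataclasses rather than the project's actual base classes, and the class names, the helper `build_service_config`, and the choice of which fields are required are all assumptions for illustration, not what the PR implements.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class HFDeployerDefaultsSketch:
    """Hypothetical deployer-level config: shared defaults, all optional."""

    framework: Optional[str] = None
    accelerator: Optional[str] = None
    instance_size: Optional[str] = None
    instance_type: Optional[str] = None
    region: Optional[str] = None
    vendor: Optional[str] = None


@dataclass
class HFServiceConfigSketch:
    """Hypothetical service-level config: every field is required here
    (an assumption), so a missing value fails loudly at construction."""

    endpoint_name: str
    repository: str
    framework: str
    accelerator: str
    instance_size: str
    instance_type: str
    region: str
    vendor: str


def build_service_config(
    defaults: HFDeployerDefaultsSketch, **step_values: str
) -> HFServiceConfigSketch:
    """Merge deployer defaults with step-level values; step values win."""
    merged = {k: v for k, v in asdict(defaults).items() if v is not None}
    merged.update(step_values)
    # Raises TypeError if a required field is still missing after the merge.
    return HFServiceConfigSketch(**merged)
```

With this split, the step signature documents exactly what has to be supplied, and anything the deployer does not provide as a default surfaces as an error before an endpoint is requested.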


logger = get_logger(__name__)

POLLING_TIMEOUT = 1200
Contributor

Should this be a parameter somewhere for the user to control?

Contributor Author

This is tricky to pass through. We want some way to pass a timeout to the provision method, where POLLING_TIMEOUT is used. service.start(timeout=timeout) calls the provision method underneath but does not provide a way to pass the timeout along.

https://github.com/zenml-io/zenml/blob/42baca0115eca3293f98468fdd7025ce20d02262/src/zenml/services/service.py#L376-L369

We can pass the same timeout constant, POLLING_TIMEOUT, when we call the start function here. Right now it uses DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT as the default value.

https://github.com/dudeperf3ct/zenml-projects/blob/feature/zencoder-huggingface-model-deployer/llm-finetuning/huggingface/hf_model_deployer.py#L93
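One possible way around the missing argument, sketched below, is to keep the timeout on the service config so the provisioning code can read it without any change to the start signature. This is only an illustration of the idea: `ServiceConfigSketch`, its `polling_timeout` and `polling_interval` fields, and `wait_until_running` are hypothetical names, and the loop stands in for whatever the real provision method does.

```python
import time
from dataclasses import dataclass
from typing import Callable

POLLING_TIMEOUT = 1200  # seconds; the module-level default from the snippet above


@dataclass
class ServiceConfigSketch:
    """Hypothetical config; both field names are assumptions."""

    polling_timeout: float = POLLING_TIMEOUT
    polling_interval: float = 5.0


def wait_until_running(
    is_running: Callable[[], bool],
    config: ServiceConfigSketch,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `is_running` until it returns True or the configured timeout passes.

    Returns True if the endpoint came up in time, False otherwise.
    """
    deadline = clock() + config.polling_timeout
    while True:
        if is_running():
            return True
        if clock() >= deadline:
            return False
        # Never sleep past the deadline.
        sleep(min(config.polling_interval, max(0.0, deadline - clock())))
```

A caller would construct the config with a user-supplied `polling_timeout` and let the provisioning code call `wait_until_running`, raising an error of its choice when the result is False.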

)
else:
raise NotImplementedError(
"Tasks other than text-generation is not implemented."
Contributor

How much work is it to actually implement other tasks? This list here gives so many tasks and I am wondering whether we can easily support them? https://huggingface.co/docs/inference-endpoints/supported_tasks
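As a rough idea of what broader task support might look like, the if/else branch could become a lookup table. The sketch below assumes the client object exposes one method per task, named after the task with hyphens replaced by underscores; the three table entries and the `predict` helper are illustrative choices, not the PR's code, and each task would still need its own input and output handling checked.

```python
from typing import Any, Dict

# Task name -> name of the client method assumed to handle it.
SUPPORTED_TASKS: Dict[str, str] = {
    "text-generation": "text_generation",
    "text-classification": "text_classification",
    "summarization": "summarization",
}


def predict(client: Any, task: str, inputs: str, **kwargs: Any) -> Any:
    """Dispatch `inputs` to the client method registered for `task`."""
    try:
        method_name = SUPPORTED_TASKS[task]
    except KeyError:
        raise NotImplementedError(
            f"Task {task!r} is not implemented."
        ) from None
    return getattr(client, method_name)(inputs, **kwargs)
```

Adding a task then means adding one table entry, provided the client really has a matching method and the step can handle that task's return type.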

@htahir1 htahir1 marked this pull request as ready for review January 26, 2024 09:42
@htahir1 htahir1 merged commit c254ccc into zenml-io:main Jan 30, 2024
1 of 3 checks passed

2 participants