Remote Models

Remote Model is a signature feature of the HepAI (HEP AI) framework: any software program can be wrapped as a model, mounted in a Worker, and deployed to a cloud server; paired with the HepAI client, it can then be called remotely with low latency in a distributed fashion.

Requires `hepai>=1.1.15`; install or upgrade with `pip install hepai --upgrade`.
1 Launching a Custom Remote Model

`custom_remote_model.py`:

```python
from hepai import HRModel

class CustomWorkerModel(HRModel):  # Define a custom worker model inheriting from HRModel.
    def __init__(self, name: str = "hepai/custom-model", **kwargs):
        super().__init__(name=name, **kwargs)

    @HRModel.remote_callable  # Decorate the function to enable remote calls.
    def custom_method(self, a: int = 1, b: int = 2) -> int:
        """Define your custom method here."""
        return a + b

if __name__ == "__main__":
    CustomWorkerModel.run()  # Run the custom worker model.
```
- Run `python custom_remote_model.py` to start the model; run `python custom_remote_model.py -h` to see the startup arguments.
- Once started, open http://localhost:4260 to view the API documentation.
2 Calling the Remote Model Locally

To call a remote model locally, you need the remote model's name `name` and the address of the server it is deployed on, `base_url`. In this example, the model name is `hepai/custom-model` and the server address is `http://localhost:4260/apiv2`.
```python
from hepai import HRModel

model = HRModel.connect(
    name="hepai/custom-model",
    base_url="http://localhost:4260/apiv2"
)

funcs = model.functions()  # Get all remotely callable functions.
print(f"Remote callable funcs: {funcs}")

# Call the remote model's custom_method
output = model.custom_method(a=1, b=2)
assert isinstance(output, int), f"output: type: {type(output)}, {output}"
print(f"Output of custom_method: {output}, type: {type(output)}")

# Test the streaming response
stream = model.get_stream(stream=True)  # Note: you must set `stream=True` to get a stream.
print("Output of get_stream:")
for x in stream:
    print(f"{x}, type: {type(x)}", flush=True)
```
3 Advanced Features

3.1 Configuring Parameters

A remote model runs inside a HepAI Worker, so there are two groups of parameters: model parameters and Worker parameters, set via `HModelConfig` and `HWorkerConfig` respectively. Full code:
```python
from typing import Dict, Union
from dataclasses import dataclass, field
import json

import hepai as hai
from hepai import HRModel, HModelConfig, HWorkerConfig, HWorkerAPP

@dataclass
class CustomModelConfig(HModelConfig):
    name: str = field(default="hepai/custom-model", metadata={"help": "Model's name"})
    permission: Union[str, Dict] = field(default=None, metadata={"help": "Model's permissions, separated by ';', e.g., 'groups: all; users: a, b; owner: c'; inherited from the worker's permissions if not set"})
    version: str = field(default="2.0", metadata={"help": "Model's version"})

@dataclass
class CustomWorkerConfig(HWorkerConfig):
    host: str = field(default="0.0.0.0", metadata={"help": "Worker's address; set to `0.0.0.0` to allow external access, otherwise only localhost can access"})
    port: int = field(default=4260, metadata={"help": "Worker's port; `None` means auto-selection starting from `auto_start_port`"})
    auto_start_port: int = field(default=42602, metadata={"help": "Worker's starting port, only used when port is set to `auto`"})
    route_prefix: str = field(default="/apiv2", metadata={"help": "Route prefix for the worker"})
    permissions: str = field(default="users: admin", metadata={"help": "Worker's permissions, separated by ';', e.g., 'groups: default; users: a, b; owner: c'"})
    description: str = field(default="This is a demo worker of the HEP AI framework (HepAI)", metadata={"help": "Worker's description"})
    author: str = field(default=None, metadata={"help": "Worker's author"})
    daemon: bool = field(default=False, metadata={"help": "Run as a daemon"})

class CustomWorkerModel(HRModel):  # Define a custom worker model inheriting from HRModel.
    def __init__(self, config: HModelConfig):
        super().__init__(config=config)

    @HRModel.remote_callable  # Decorate the function to enable remote calls.
    def custom_method(self, a: int = 1, b: int = 2) -> int:
        """Define your custom method here."""
        return a + b

    @HRModel.remote_callable
    def get_stream(self):
        for x in range(10):
            yield f"data: {json.dumps(x)}\n\n"

if __name__ == "__main__":
    import uvicorn
    from fastapi import FastAPI

    model_config, worker_config = hai.parse_args((CustomModelConfig, CustomWorkerConfig))
    model = CustomWorkerModel(model_config)  # Instantiate the custom worker model.
    app: FastAPI = HWorkerAPP(model, worker_config=worker_config)  # Instantiate the app, a FastAPI application.
    print(app.worker.get_worker_info(), flush=True)
    # Start the service
    uvicorn.run(app, host=app.host, port=app.port)
```
Notes:
- Because arguments are parsed with `hai.parse_args()`, both model and Worker parameters can be set from the command line, e.g. `python custom_remote_model.py --port None`; run `python custom_remote_model.py -h` to see all parameters, their descriptions, and default values.
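To make the config-dataclass pattern above concrete, here is a minimal sketch of how fields and their `help` metadata can be mapped to command-line options using only the standard library. This is an illustration of the general pattern, not the actual implementation of `hai.parse_args()`; the helper `parse_dataclass_args` is a hypothetical name.

```python
import argparse
from dataclasses import dataclass, field, fields

@dataclass
class DemoConfig:
    name: str = field(default="hepai/custom-model", metadata={"help": "Model's name"})
    port: int = field(default=4260, metadata={"help": "Worker's port"})

def parse_dataclass_args(cls, argv=None):
    """Build an argparse parser from a config dataclass and return a populated instance."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        parser.add_argument(
            f"--{f.name}",
            type=f.type,               # the field annotation doubles as the converter
            default=f.default,
            help=f.metadata.get("help", ""),
        )
    ns = parser.parse_args(argv)
    return cls(**vars(ns))

config = parse_dataclass_args(DemoConfig, ["--port", "8080"])
print(config.name, config.port)  # hepai/custom-model 8080
```

Each field's default is used when the flag is absent, so the config object is always fully populated.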
3.2 Streaming Output

A remote model's streaming output must follow the SSE (Server-Sent Events) format: each chunk starts with `data: ` and ends with `\n\n`, with the payload converted to a JSON string via `json.dumps()`. For example:
```python
import json
from hepai import HRModel

class CustomWorkerModel(HRModel):
    @HRModel.remote_callable
    def get_stream(self):
        for x in range(10):
            yield f"data: {json.dumps(x)}\n\n"
```
Notes:
- `yield` returns a generator; the payload `x` can be of any type. The HepAI server encodes it automatically, and the client decodes it back to the original data and type.
- When calling from the client, you must additionally pass the reserved parameter `stream=True`, e.g. `model.get_stream(stream=True)`.
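To make the round trip concrete, the sketch below shows how payloads can be encoded into SSE chunks and decoded back, mirroring the format described above. The helpers `encode_sse` and `decode_sse` are illustrative names, not part of the HepAI API.

```python
import json

def encode_sse(payload):
    # Serialize any JSON-compatible payload into one SSE chunk.
    return f"data: {json.dumps(payload)}\n\n"

def decode_sse(chunk):
    # Strip the "data: " prefix and the trailing blank line, then parse the JSON body.
    body = chunk.removeprefix("data: ").rstrip("\n")
    return json.loads(body)

chunks = [encode_sse(x) for x in range(3)]
values = [decode_sse(c) for c in chunks]
print(values)  # [0, 1, 2]
```

Because the body is JSON, any serializable type (numbers, strings, lists, dicts) survives the round trip with its original type.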
3.3 Hosting Multiple Models in One Worker

As shown in 3.1, simply pass multiple model instances to `HWorkerAPP`:
```python
from hepai import HRModel, HWorkerAPP

model1 = HRModel("hepai/hr-model1")
model2 = HRModel("hepai/hr-model2")

app = HWorkerAPP(models=[model1, model2])
app.run()  # Run the app.
```
3.4 Setting Model Access Permissions

- The `permission` parameter of `HModelConfig` sets a model's access permissions. For example, `permission="groups: default; users: a, b; owner: c"` means only users in the `default` group and the users `a` and `b` can access the model, and user `c` is the model's owner.
- The `permissions` parameter of `HWorkerConfig` sets the Worker's access permissions.
- One Worker can host multiple models, and each model can have its own access permissions. A model that sets no permissions inherits the Worker's.
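The permission-string format can be illustrated with a small parser. This is a plain-Python sketch of how such a string could be interpreted; `parse_permission` is a hypothetical helper, not part of the HepAI API.

```python
def parse_permission(perm: str) -> dict:
    """Parse 'groups: default; users: a, b; owner: c' into a dict of name lists."""
    result = {}
    for part in perm.split(";"):
        if not part.strip():
            continue  # tolerate trailing semicolons
        key, _, value = part.partition(":")
        result[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return result

perm = parse_permission("groups: default; users: a, b; owner: c")
print(perm)  # {'groups': ['default'], 'users': ['a', 'b'], 'owner': ['c']}
```

An access check would then test whether the requesting user appears in `users`/`owner` or belongs to one of the listed `groups`.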
3.5 Distributed Deployment and Load Balancing

- Deploying multiple Workers on different servers, combined with the unified HepAI gateway controller, enables distributed deployment and automatic load balancing.
- Start the HepAI gateway controller first; in this example its address is http://localhost:42601.
- Set the controller parameter via `HWorkerConfig`, e.g. `controller="http://localhost:42601"`, and set `no-register=False`; the Worker will then register itself with the controller automatically.
- On the client side, set `base_url` to the controller's address, e.g. `base_url="http://localhost:42601/apiv2"`; the controller automatically forwards requests to the appropriate Worker and model.
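Conceptually, the controller keeps a registry of which Workers serve which models and picks one per request. The toy sketch below illustrates this idea with round-robin selection; it is not the actual HepAI controller logic, and `MiniController` is a hypothetical name.

```python
import itertools

class MiniController:
    """Toy gateway: maps model names to registered worker addresses."""

    def __init__(self):
        self.workers = {}  # model name -> list of worker base URLs
        self.cycles = {}   # model name -> round-robin iterator

    def register(self, model_name, worker_url):
        # A worker announces that it serves the given model.
        self.workers.setdefault(model_name, []).append(worker_url)
        self.cycles[model_name] = itertools.cycle(self.workers[model_name])

    def route(self, model_name):
        # Pick the next worker in round-robin order for this model.
        return next(self.cycles[model_name])

ctrl = MiniController()
ctrl.register("hepai/custom-model", "http://worker1:4260/apiv2")
ctrl.register("hepai/custom-model", "http://worker2:4260/apiv2")
print([ctrl.route("hepai/custom-model") for _ in range(3)])
```

With the registry in place, the client only ever talks to the controller's `base_url`, and the choice of Worker is invisible to the caller.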
4 API Endpoints

By default the remote model starts on port 4260; visit http://localhost:4260 to view the model's API endpoints.