Remote Model

Remote Model is one of the signature features of the HepAI framework: by wrapping a model, or any other program, in a Worker and deploying it to a cloud server, the HepAI client can invoke it remotely with low latency and in a distributed fashion.


Requires hepai>=1.1.15; install or upgrade with pip install hepai --upgrade.

1 Launching a Custom Remote Model

custom_remote_model.py:

from hepai import HRModel

class CustomWorkerModel(HRModel):  # Define a custom worker model inheriting from HRModel.
    def __init__(self, name: str = "hepai/custom-model", **kwargs):
        super().__init__(name=name, **kwargs)

    @HRModel.remote_callable  # Decorate the function to enable remote call.
    def custom_method(self, a: int = 1, b: int = 2) -> int:
        """Define your custom method here."""
        return a + b

if __name__ == "__main__":
    CustomWorkerModel.run()  # Run the custom worker model.
  • Run python custom_remote_model.py to start the model. Run python custom_remote_model.py -h to see the startup options.
  • Once started, open http://localhost:4260 to view the API documentation.

2 Calling a Remote Model Locally

To call a remote model locally, you need the remote model's name (name) and the address of the server it is deployed on (base_url). In this example, the model name is hepai/custom-model and the server address is http://localhost:4260/apiv2.

from hepai import HRModel

model = HRModel.connect(
    name="hepai/custom-model",
    base_url="http://localhost:4260/apiv2"
)

funcs = model.functions()  # Get all remote callable functions.
print(f"Remote callable funcs: {funcs}")

# Call the remote model's custom_method method
output = model.custom_method(a=1, b=2)
assert isinstance(output, int), f"output: type: {type(output)}, {output}"
print(f"Output of custom_method: {output}, type: {type(output)}")

# Test the streaming response (get_stream is a remote-callable method on the worker)
stream = model.get_stream(stream=True)  # Note: You should set `stream=True` to get a stream.
print("Output of get_stream:")
for x in stream:
    print(f"{x}, type: {type(x)}", flush=True)

3 Advanced Features

3.1 Configuring Parameters

The remote model runs inside a HepAI Worker, so there are two groups of parameters: model parameters and Worker parameters, configured via HModelConfig and HWorkerConfig respectively. Complete example:

from typing import Dict, Union, Literal
from dataclasses import dataclass, field
import json
import hepai as hai
from hepai import HRModel, HModelConfig, HWorkerConfig, HWorkerAPP

@dataclass
class CustomModelConfig(HModelConfig):
    name: str = field(default="hepai/custom-model", metadata={"help": "Model's name"})
    permission: Union[str, Dict] = field(default=None, metadata={"help": "Model's permission, separated by ;, e.g., 'groups: all; users: a, b; owner: c', will inherit from worker permissions if not set"})
    version: str = field(default="2.0", metadata={"help": "Model's version"})

@dataclass
class CustomWorkerConfig(HWorkerConfig):
    host: str = field(default="0.0.0.0", metadata={"help": "Worker's address, enable to access from outside if set to `0.0.0.0`, otherwise only localhost can access"})
    port: int = field(default=4260, metadata={"help": "Worker's port; set to None to auto-select a free port starting from `auto_start_port`"})
    auto_start_port: int = field(default=42602, metadata={"help": "Worker's start port, only used when `port` is not set explicitly"})
    route_prefix: str = field(default="/apiv2", metadata={"help": "Route prefix for worker"})

    permissions: str = field(default='users: admin', metadata={"help": "Model's permissions, separated by ;, e.g., 'groups: default; users: a, b; owner: c'"})
    description: str = field(default='This is a demo worker of HEP AI framework (HepAI)', metadata={"help": "Model's description"})
    author: str = field(default=None, metadata={"help": "Model's author"})
    daemon: bool = field(default=False, metadata={"help": "Run as daemon"})

class CustomWorkerModel(HRModel):  # Define a custom worker model inheriting from HRModel.
    def __init__(self, config: HModelConfig):
        super().__init__(config=config)

    @HRModel.remote_callable  # Decorate the function to enable remote call.
    def custom_method(self, a: int = 1, b: int = 2) -> int:
        """Define your custom method here."""
        return a + b

    @HRModel.remote_callable
    def get_stream(self):
        for x in range(10):
            yield f"data: {json.dumps(x)}\n\n"

if __name__ == "__main__":

    import uvicorn
    from fastapi import FastAPI
    model_config, worker_config = hai.parse_args((CustomModelConfig, CustomWorkerConfig))
    model = CustomWorkerModel(model_config)  # Instantiate the custom worker model.
    app: FastAPI = HWorkerAPP(model, worker_config=worker_config)  # Instantiate the APP, which is a FastAPI application.

    print(app.worker.get_worker_info(), flush=True)
    # 启动服务
    uvicorn.run(app, host=app.host, port=app.port)

Note: since hai.parse_args() is used to parse the arguments, both model and Worker parameters can be set from the command line, e.g. python custom_remote_model.py --port None; run python custom_remote_model.py -h to see all parameters, their descriptions, and default values.
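The mapping from dataclass fields to command-line flags can be pictured with plain argparse. The sketch below is illustrative only, not the actual hai.parse_args() implementation; DemoConfig is a made-up config class in the style of HWorkerConfig:

```python
import argparse
from dataclasses import dataclass, field, fields

@dataclass
class DemoConfig:
    # Made-up config class mirroring the style of HWorkerConfig.
    port: int = field(default=4260, metadata={"help": "Worker's port"})
    route_prefix: str = field(default="/apiv2", metadata={"help": "Route prefix"})

# Build one CLI flag per dataclass field, reusing the `help` metadata.
parser = argparse.ArgumentParser()
for f in fields(DemoConfig):
    parser.add_argument(f"--{f.name}", type=f.type, default=f.default,
                        help=f.metadata.get("help", ""))

args = parser.parse_args(["--port", "4261"])
config = DemoConfig(**vars(args))  # config.port is now 4261
```

Every field declared on the dataclass thus becomes an overridable flag, which is why `-h` lists all parameters with their help texts and defaults.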

3.2 Streaming Output

For streaming output, the remote model must emit data in SSE (Server-Sent Events) format: each chunk starts with data: and ends with \n\n, and the payload is serialized to a JSON string with json.dumps(), for example:

import json
from hepai import HRModel

class CustomWorkerModel(HRModel):
    @HRModel.remote_callable
    def get_stream(self):
        for x in range(10):
            yield f"data: {json.dumps(x)}\n\n"

Notes: - yield returns a generator object; the payload x can be of any type, and the HepAI server encodes it automatically while the client decodes it back to the original data and type. - When calling from the client, you must additionally pass the reserved parameter stream=True, e.g. model.get_stream(stream=True).
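The SSE framing itself is easy to verify in isolation. A minimal sketch (the sse_events helper is hypothetical, not part of HepAI) showing that any JSON-serializable payload fits the format:

```python
import json

def sse_events(items):
    # Hypothetical helper: wrap each JSON-serializable payload in an SSE
    # frame, i.e. prefix it with "data: " and terminate with a blank line.
    for item in items:
        yield f"data: {json.dumps(item)}\n\n"

frames = list(sse_events([{"step": 1}, "done"]))
# frames[0] is 'data: {"step": 1}\n\n'
# frames[1] is 'data: "done"\n\n'
```

Because each payload is a JSON string, the client can recover both the value and its type on decode, which is what makes the "any type" behavior above work.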

3.3 Hosting Multiple Models in a Single Worker

As shown in 3.1, simply pass multiple model instances to HWorkerAPP:

from hepai import HRModel, HWorkerAPP

model1 = HRModel("hepai/hr-model1")
model2 = HRModel("hepai/hr-model2")
app = HWorkerAPP(models=[model1, model2])
app.run()  # Run the APP.

3.4 Setting Model Access Permissions

  • The permission parameter in HModelConfig sets the model's access permissions, e.g. permission="groups: default; users: a, b; owner: c" means only users in the default group and users a and b can access the model, and user c is the model's owner.
  • The permissions parameter in HWorkerConfig sets the Worker's access permissions.

  • A single Worker can host multiple models, and each model can have its own access permissions. If a model has no permissions set, it inherits the Worker's.
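The permission string format described above can be illustrated with a small parser. This is a hypothetical sketch of how such a spec might be interpreted, not HepAI's actual parsing code:

```python
def parse_permission(spec: str) -> dict:
    # Hypothetical parser: split "groups: default; users: a, b; owner: c"
    # into {"groups": ["default"], "users": ["a", "b"], "owner": ["c"]}.
    result = {}
    for part in spec.split(";"):
        if ":" not in part:
            continue
        key, values = part.split(":", 1)
        result[key.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return result

perm = parse_permission("groups: default; users: a, b; owner: c")
```

Each semicolon-separated clause names a role (groups, users, owner) followed by a comma-separated list of principals, which is what the examples in HModelConfig and HWorkerConfig follow.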

3.5 Distributed Deployment and Multi-Model Load Balancing

  • Deploying multiple Workers on different servers, together with the unified HepAI gateway controller, enables distributed deployment and automatic load balancing.
  • Start the HepAI gateway controller first; its address is http://localhost:42601.
  • Set the controller parameter in HWorkerConfig, e.g. controller="http://localhost:42601", and set no-register=False so the Worker registers itself with the controller automatically.
  • On the client side, set base_url to the controller's address, e.g. base_url="http://localhost:42601/apiv2"; the controller forwards each request to the matching Worker and model.

4 API Endpoints

By default the remote model serves on port 4260, so you can visit http://localhost:4260 to view the model's API endpoints.