feature: add hunyuan llm support for plugins/ai-proxy #1018
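What do you mean by "the test does not go through"? What error is reported, and what does the gateway container output?
Are you using the latest code? See the documentation in this PR: #1005
Yes, the code is up to date. Thanks for the reminder this morning; I will try this build approach first.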
Ⅰ. Describe what this PR did

# File generated by hgctl. Modify as required.
admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: moonshot
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "hunyuan",
                                  "hunyuanAuthKey": "VuR92ugGi04yr0EezLe7lm0FiKzrw27N",
                                  "apiTokens": [
                                    "sk-YGeSIaMRA2oSaDa86NCBVPGKdaiSuQ0YSOGI3nEkfvSb4HdT"
                                  ],
                                  "hunyuanAuthId": "AKID2669UvMvTMJF86HbuMnB1rmdZTEvY2KQ",
                                  "timeout": 1200000,
                                  "modelMapping": {
                                    "*": "hunyuan-lite"
                                  }
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: moonshot
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: moonshot
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: hunyuan.tencentcloudapi.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "hunyuan.tencentcloudapi.com"

Start the plugin with the following docker-compose file:

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.0
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; a production deployment defaults to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./out/plugin.wasm:/etc/envoy/plugin.wasm
  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"
networks:
  wasmtest: {}

A sample request:

curl --location 'http://127.0.0.1:10000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3",
  "messages": [
    {
      "role": "system",
      "content": "你是一个名专业的开发人员!"
    },
    {
      "role": "user",
      "content": "你好,你是谁?"
    }
  ],
  "temperature": 0.3,
  "stream": false
}'
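The sample request asks for the model "gpt-3", while the plugin configuration maps "*" to "hunyuan-lite". The sketch below illustrates how such a wildcard mapping might resolve a requested model name. It is a simplified assumption (exact match first, then the "*" fallback, otherwise pass-through), not the actual ai-proxy implementation, and `resolve_model` is a hypothetical helper name.

```python
def resolve_model(requested, mapping):
    """Return the upstream model name for a requested model.

    Assumed lookup order (illustrative only, not the plugin's real logic):
    1. an exact key match in the mapping,
    2. the "*" wildcard entry,
    3. the requested name unchanged.
    """
    if requested in mapping:
        return mapping[requested]
    return mapping.get("*", requested)


# Same mapping as in the plugin configuration of this PR.
mapping = {"*": "hunyuan-lite"}
print(resolve_model("gpt-3", mapping))
```

Under this assumption, any model name sent by an OpenAI-style client ends up as "hunyuan-lite" unless a more specific key is added to the mapping.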
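Hi, I have made the changes as suggested. Please review again ^v^ @CH3CHO
Hi, I have made the changes as suggested. Could you please take another look? ^v^ @CH3CHO
Please sign the CLA as instructed above. @xychen5
Done, thanks for the review~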
I have provided the configuration file for testing Qwen locally, but the test does not go through. The request is:
curl --location 'http://127.0.0.1:10000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-4-turbo",
  "messages": [
    {
      "role": "user",
      "content": "你是谁?"
    }
  ],
  "temperature": 0.3
}'
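When a request like this fails, it helps to capture the HTTP status and response body alongside the gateway log. The following is a minimal sketch in Python of the same request. The gateway URL is an assumption taken from the "10000:10000" port mapping in the docker-compose file, and `build_chat_request` and `send` are hypothetical helper names introduced only for illustration.

```python
import json
import urllib.error
import urllib.request

# Assumed address: the envoy listener port published by docker-compose.
GATEWAY_URL = "http://127.0.0.1:10000/v1/chat/completions"


def build_chat_request(model, user_content, temperature=0.3, url=GATEWAY_URL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
        "temperature": temperature,
    }
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return (status, body).

    HTTP error responses are returned rather than raised, so the error
    body produced by the gateway can be inspected.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


# Usage (requires the gateway container to be running):
#   status, body = send(build_chat_request("gpt-4-turbo", "你是谁?"))
#   print(status, body)
```

Posting the printed status and body together with the envoy container output gives reviewers the error details they asked for.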