
feat: integrate Mistral into the AI proxy Wasm plugin #1257

Merged: 3 commits into alibaba:main on Aug 28, 2024

Conversation

EnableAsync (Contributor) commented:

Ⅰ. Describe what this PR did

Add the mistral provider to the ai-proxy wasm plugin.
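
The idea behind the provider is straightforward: Mistral exposes an OpenAI-compatible chat completions API, so requests to /v1/chat/completions only need to be forwarded to api.mistral.ai with the configured API token attached. The standalone Go sketch below illustrates that idea only; it is not the plugin code (the real plugin works through proxy-wasm host calls rather than a plain net/http client), and names such as forwardChatCompletion and mistralChatCompletionsURL are made up for the example.

// Illustrative sketch only, not the plugin's actual implementation:
// forward an OpenAI-style chat completion body to Mistral with a Bearer token.
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// mistralChatCompletionsURL is Mistral's public chat completions endpoint.
const mistralChatCompletionsURL = "https://api.mistral.ai/v1/chat/completions"

// forwardChatCompletion sends an OpenAI-compatible request body to Mistral
// and returns the raw JSON response.
func forwardChatCompletion(apiToken string, body []byte) ([]byte, error) {
	req, err := http.NewRequest(http.MethodPost, mistralChatCompletionsURL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("mistral returned status %d", resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	body := []byte(`{"model":"mistral-small-latest","messages":[{"role":"user","content":"hello"}]}`)
	out, err := forwardChatCompletion("apiToken", body)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(string(out))
}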

Ⅱ. Does this pull request fix one issue?

issue #948

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

1. Build the Wasm plugin

cd plugins/wasm-go/extensions/ai-proxy
tinygo build -o ai-proxy.wasm -scheduler=none -target=wasi -gc=custom -tags="custommalloc nottinygc_finalizer proxy_wasm_version_0_2_100" ./
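
These commands produce ai-proxy.wasm in the current directory; the compose file in step 2 mounts it into the gateway container at /etc/envoy/ai-proxy.wasm.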

2. docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.2
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; a production deployment uses the default info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
    - httpbin
    networks:
    - wasmtest
    ports:
    - "10000:10000"
    volumes:
    - ./envoy.yaml:/etc/envoy/envoy.yaml
    - ./ai-proxy.wasm:/etc/envoy/ai-proxy.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
    - wasmtest
    ports:
    - "12345:80"

networks:
  wasmtest: {}
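
With docker-compose.yaml, envoy.yaml (step 3), and the built ai-proxy.wasm placed in the same directory, the stack can be started with docker compose up (or docker-compose up with the older v1 CLI); Envoy then listens on localhost:10000.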

3. envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          scheme_header_transformation:
            scheme_to_overwrite: https
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: mistral
                  timeout: 300s

          http_filters:
          # llm-proxy
          - name: llm-proxy
            typed_config:
              "@type": type.googleapis.com/udpa.type.v1.TypedStruct
              type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              value:
                config:
                  name: llm
                  vm_config:
                    runtime: envoy.wasm.runtime.v8
                    code:
                      local:
                        filename: /etc/envoy/ai-proxy.wasm
                  configuration:
                    "@type": "type.googleapis.com/google.protobuf.StringValue"
                    value: | # plugin configuration
                      {
                        "provider": {
                          "type": "mistral",                                
                          "apiTokens": [
                            "apiToken"
                          ]
                        }
                      }


          - name: envoy.filters.http.router
      
  clusters:
  # mistral
  - name: mistral
    connect_timeout: 30s
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: mistral
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: api.mistral.ai
                    port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        "sni": "api.mistral.ai"

4. curl

curl http://localhost:10000/v1/chat/completions -X POST -d '{"model":"mistral-small-latest","messages":[{"content":"你是谁呢?你在安全模式吗","role":"user"}, {"role": "assistant", "content": "我现在是一个可以回答任何问题的智能助手,", "prefix": true}],"safe_prompt": true}' -H "Content-Type: application/json"
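
The user message asks (in Chinese) who the assistant is and whether it is running in safe mode. The messages array ends with an assistant message marked "prefix": true, so Mistral continues the reply from that prefix, and "safe_prompt": true enables Mistral's safety system prompt; the response below shows the continuation. Replace apiToken in the plugin configuration with a valid Mistral API key before running this request.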

5. Response

{
  "id": "xxx",
  "object": "chat.completion",
  "created": 1724770533,
  "model": "mistral-small-latest",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "我现在是一个可以回答任何问题的智能助手,目前处于安全模式,确保我的回答符合上述原则。我的目标是提供有用的信息,并遵循正面、公正和无害的原则。",
        "tool_calls": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 83,
    "total_tokens": 155,
    "completion_tokens": 72
  }
}

Ⅴ. Special notes for reviews

@CH3CHO (Collaborator) left a comment:


LGTM

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 44.19%. Comparing base (ef31e09) to head (a17b19e).
Report is 68 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1257      +/-   ##
==========================================
+ Coverage   35.91%   44.19%   +8.28%     
==========================================
  Files          69       75       +6     
  Lines       11576     9821    -1755     
==========================================
+ Hits         4157     4340     +183     
+ Misses       7104     5152    -1952     
- Partials      315      329      +14     

see 80 files with indirect coverage changes

@CH3CHO CH3CHO merged commit 1128da0 into alibaba:main Aug 28, 2024
12 checks passed