
feat: integrate Mistral into the AI proxy Wasm plugin #1257

Merged: 3 commits into alibaba:main on Aug 28, 2024

Conversation

EnableAsync (Contributor) commented:

Ⅰ. Describe what this PR did

Add the mistral provider to the ai-proxy wasm plugin.
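
The idea behind the provider is straightforward: Mistral exposes an OpenAI-compatible chat completions API, so requests to /v1/chat/completions only need to be forwarded to api.mistral.ai with the configured API token attached. The standalone Go sketch below illustrates that idea only; it is not the plugin code (the real plugin works through proxy-wasm host calls rather than a plain net/http client), and names such as forwardChatCompletion and mistralChatCompletionsURL are made up for the example.

// Illustrative sketch only, not the plugin's actual implementation:
// forward an OpenAI-style chat completion body to Mistral with a Bearer token.
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// mistralChatCompletionsURL is Mistral's public chat completions endpoint.
const mistralChatCompletionsURL = "https://api.mistral.ai/v1/chat/completions"

// forwardChatCompletion sends an OpenAI-compatible request body to Mistral
// and returns the raw JSON response.
func forwardChatCompletion(apiToken string, body []byte) ([]byte, error) {
	req, err := http.NewRequest(http.MethodPost, mistralChatCompletionsURL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("mistral returned status %d", resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	body := []byte(`{"model":"mistral-small-latest","messages":[{"role":"user","content":"hello"}]}`)
	out, err := forwardChatCompletion("apiToken", body)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(string(out))
}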

Ⅱ. Does this pull request fix one issue?

issue #948

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

1. Build the Wasm plugin

cd plugins/wasm-go/extensions/ai-proxy
tinygo build -o ai-proxy.wasm -scheduler=none -target=wasi -gc=custom -tags="custommalloc nottinygc_finalizer proxy_wasm_version_0_2_100" ./
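
These commands produce ai-proxy.wasm in the current directory; the compose file in step 2 mounts it into the gateway container at /etc/envoy/ai-proxy.wasm.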

2. docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.2
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; a production deployment uses the default info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
    - httpbin
    networks:
    - wasmtest
    ports:
    - "10000:10000"
    volumes:
    - ./envoy.yaml:/etc/envoy/envoy.yaml
    - ./ai-proxy.wasm:/etc/envoy/ai-proxy.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
    - wasmtest
    ports:
    - "12345:80"

networks:
  wasmtest: {}
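
With docker-compose.yaml, envoy.yaml (step 3), and the built ai-proxy.wasm placed in the same directory, the stack can be started with docker compose up (or docker-compose up with the older v1 CLI); Envoy then listens on localhost:10000.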

3. envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          scheme_header_transformation:
            scheme_to_overwrite: https
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: mistral
                  timeout: 300s

          http_filters:
          # llm-proxy
          - name: llm-proxy
            typed_config:
              "@type": type.googleapis.com/udpa.type.v1.TypedStruct
              type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              value:
                config:
                  name: llm
                  vm_config:
                    runtime: envoy.wasm.runtime.v8
                    code:
                      local:
                        filename: /etc/envoy/ai-proxy.wasm
                  configuration:
                    "@type": "type.googleapis.com/google.protobuf.StringValue"
                    value: | # plugin configuration
                      {
                        "provider": {
                          "type": "mistral",                                
                          "apiTokens": [
                            "apiToken"
                          ]
                        }
                      }


          - name: envoy.filters.http.router
      
  clusters:
  # mistral
  - name: mistral
    connect_timeout: 30s
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: mistral
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: api.mistral.ai
                    port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        "sni": "api.mistral.ai"

4. curl

curl http://localhost:10000/v1/chat/completions -X POST -d '{"model":"mistral-small-latest","messages":[{"content":"你是谁呢?你在安全模式吗","role":"user"}, {"role": "assistant", "content": "我现在是一个可以回答任何问题的智能助手,", "prefix": true}],"safe_prompt": true}' -H "Content-Type: application/json"
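
The user message asks (in Chinese) who the assistant is and whether it is running in safe mode. The messages array ends with an assistant message marked "prefix": true, so Mistral continues the reply from that prefix, and "safe_prompt": true enables Mistral's safety system prompt; the response below shows the continuation. Replace apiToken in the plugin configuration with a valid Mistral API key before running this request.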

5. Response

{
  "id": "xxx",
  "object": "chat.completion",
  "created": 1724770533,
  "model": "mistral-small-latest",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "我现在是一个可以回答任何问题的智能助手,目前处于安全模式,确保我的回答符合上述原则。我的目标是提供有用的信息,并遵循正面、公正和无害的原则。",
        "tool_calls": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 83,
    "total_tokens": 155,
    "completion_tokens": 72
  }
}

Ⅴ. Special notes for reviews

@CH3CHO (Collaborator) left a comment:


LGTM

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 44.19%. Comparing base (ef31e09) to head (a17b19e).
Report is 68 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1257      +/-   ##
==========================================
+ Coverage   35.91%   44.19%   +8.28%     
==========================================
  Files          69       75       +6     
  Lines       11576     9821    -1755     
==========================================
+ Hits         4157     4340     +183     
+ Misses       7104     5152    -1952     
- Partials      315      329      +14     

see 80 files with indirect coverage changes

@CH3CHO CH3CHO merged commit 1128da0 into alibaba:main Aug 28, 2024
12 checks passed