
feat: support Cloudflare Workers AI #1068

Merged · 9 commits · Jul 8, 2024

Conversation

imp2002 (Contributor) commented Jun 28, 2024

Ⅰ. Describe what this PR did

Support Cloudflare Workers AI. API documentation: https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/
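Per the linked documentation, Cloudflare exposes its OpenAI-compatible API under an account-scoped base path, which is what the provider's `cloudflareAccountId` config is needed for. A minimal sketch of the URL layout (the account ID below is a placeholder):

```shell
# Cloudflare's OpenAI-compatible base path is account-scoped.
# ACCOUNT_ID is a placeholder; use your real Cloudflare account ID.
ACCOUNT_ID="your-account-id"
BASE_URL="https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1"

# Chat completions then live at:
echo "${BASE_URL}/chat/completions"
```

The gateway rewrites requests onto this path, so clients only need the gateway address.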

Ⅱ. Does this pull request fix one issue?

ref: #962

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

  • envoy.yaml:
admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: cloudflare
                http_filters:
                  - name: wasmdemo
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmdemo
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "cloudflare",
                                  "cloudflareAccountId": "******",
                                  "apiTokens": [
                                    "******"
                                  ]
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: cloudflare
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: cloudflare
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.cloudflare.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.cloudflare.com"
  • Use llama-3-8b-instruct

Request

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "@cf/meta/llama-3-8b-instruct",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": "Who are you?"
        }
    ]
}'

Response

{
    "id": "id-1719588499808",
    "object": "chat.completion",
    "created": 1719588499,
    "model": "@cf/meta/llama-3-8b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I'm a large language model trained on a massive dataset of text from the internet, which allows me to generate human-like responses to a wide range of topics and questions.\n\nI'm not a human, but rather a computer program designed to simulate conversation and answer questions to the best of my ability based on my training. I can assist with tasks such as:\n\n* Answering questions on a wide range of topics\n* Generating text based on a prompt or topic\n* Translation between languages\n* Summarizing long pieces of text\n* Offering suggestions and ideas\n\nI'm constantly learning and improving my responses based on the interactions I have with users like you, so please bear with me if I make any mistakes. I'm here to help and provide information in a fun and engaging way!"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ]
}
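As a quick sanity check on the response shape, the assistant's reply sits under `choices[0].message` in this OpenAI-compatible format. A minimal client-side sketch (the JSON below is an abbreviated copy of the response above):

```python
import json

# Abbreviated copy of the gateway response shown above.
raw = """
{
    "id": "id-1719588499808",
    "object": "chat.completion",
    "created": 1719588499,
    "model": "@cf/meta/llama-3-8b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "I am LLaMA, an AI assistant developed by Meta AI..."},
            "logprobs": null,
            "finish_reason": "stop"
        }
    ]
}
"""
response = json.loads(raw)

# OpenAI-compatible responses keep the reply text here:
reply = response["choices"][0]["message"]["content"]
print(response["model"])  # @cf/meta/llama-3-8b-instruct
print(reply[:10])         # I am LLaMA
```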

Ⅴ. Special notes for reviews

CLAassistant commented Jun 28, 2024

CLA assistant check
All committers have signed the CLA.

johnlanni (Collaborator)

cc @cr7258

@johnlanni johnlanni requested a review from cr7258 July 2, 2024 13:23
cr7258 (Collaborator) commented Jul 3, 2024

Also, I noticed the response body doesn't include token usage. Is there any way for Cloudflare Workers AI to support it? For example:

"usage": {
    "prompt_tokens": 16,
    "completion_tokens": 126,
    "total_tokens": 142
}

This matters because the ai-statistics plugin can use it to count tokens, and the ai-token-ratelimit plugin can rate-limit based on those counts.
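For reference, `total_tokens` in a usage object of this shape is just the sum of the other two fields, which is the number a token-based rate limiter would meter. A minimal check (illustrative only, not Higress plugin code):

```python
import json

# The usage object shape from the example above.
usage = json.loads('{"prompt_tokens": 16, "completion_tokens": 126, "total_tokens": 142}')

# total_tokens should equal prompt + completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(usage["total_tokens"])  # 142
```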

imp2002 (Contributor, Author) commented Jul 6, 2024

> The response body doesn't include token usage

It seems that isn't possible.

@imp2002 imp2002 requested a review from cr7258 July 6, 2024 15:39
cr7258 (Collaborator) left a comment

LGTM 🐉

cr7258 (Collaborator) commented Jul 7, 2024

cr7258 (Collaborator) commented Jul 8, 2024

Hello @imp2002, I meant also updating the documentation in this repo: https://github.com/higress-group/higress-group.github.io

johnlanni (Collaborator) left a comment

LGTM

@johnlanni johnlanni merged commit b9f5c4d into alibaba:main Jul 8, 2024
11 checks passed
2456868764 pushed a commit to 2456868764/higress that referenced this pull request Jul 9, 2024