
feat: support Cloudflare Workers AI #1068

Merged · 9 commits · Jul 8, 2024

Conversation

imp2002 (Contributor) commented Jun 28, 2024

Ⅰ. Describe what this PR did

Support Cloudflare Workers AI. API documentation: https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/
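Per the linked documentation, Cloudflare exposes its OpenAI-compatible API under an account-scoped base path, which is what the provider's `cloudflareAccountId` config is needed for. A minimal sketch of the URL layout (the account ID below is a placeholder):

```shell
# Cloudflare's OpenAI-compatible base path is account-scoped.
# ACCOUNT_ID is a placeholder; use your real Cloudflare account ID.
ACCOUNT_ID="your-account-id"
BASE_URL="https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1"

# Chat completions then live at:
echo "${BASE_URL}/chat/completions"
```

The gateway rewrites requests onto this path, so clients only need the gateway address.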

Ⅱ. Does this pull request fix one issue?

ref: #962

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

  • envoy.yaml:
admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: cloudflare
                http_filters:
                  - name: wasmdemo
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmdemo
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "cloudflare",
                                  "cloudflareAccountId": "******",
                                  "apiTokens": [
                                    "******"
                                  ]
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: cloudflare
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: cloudflare
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.cloudflare.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.cloudflare.com"
  • Use llama-3-8b-instruct

Request

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "@cf/meta/llama-3-8b-instruct",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": "Who are you?"
        }
    ]
}'

Response

{
    "id": "id-1719588499808",
    "object": "chat.completion",
    "created": 1719588499,
    "model": "@cf/meta/llama-3-8b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I'm a large language model trained on a massive dataset of text from the internet, which allows me to generate human-like responses to a wide range of topics and questions.\n\nI'm not a human, but rather a computer program designed to simulate conversation and answer questions to the best of my ability based on my training. I can assist with tasks such as:\n\n* Answering questions on a wide range of topics\n* Generating text based on a prompt or topic\n* Translation between languages\n* Summarizing long pieces of text\n* Offering suggestions and ideas\n\nI'm constantly learning and improving my responses based on the interactions I have with users like you, so please bear with me if I make any mistakes. I'm here to help and provide information in a fun and engaging way!"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ]
}
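As a quick sanity check on the response shape, the assistant's reply sits under `choices[0].message` in this OpenAI-compatible format. A minimal client-side sketch (the JSON below is an abbreviated copy of the response above):

```python
import json

# Abbreviated copy of the gateway response shown above.
raw = """
{
    "id": "id-1719588499808",
    "object": "chat.completion",
    "created": 1719588499,
    "model": "@cf/meta/llama-3-8b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "I am LLaMA, an AI assistant developed by Meta AI..."},
            "logprobs": null,
            "finish_reason": "stop"
        }
    ]
}
"""
response = json.loads(raw)

# OpenAI-compatible responses keep the reply text here:
reply = response["choices"][0]["message"]["content"]
print(response["model"])  # @cf/meta/llama-3-8b-instruct
print(reply[:10])         # I am LLaMA
```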

Ⅴ. Special notes for reviews

CLAassistant commented Jun 28, 2024

CLA assistant check
All committers have signed the CLA.

johnlanni (Collaborator)

cc @cr7258

@johnlanni johnlanni requested a review from cr7258 July 2, 2024 13:23
cr7258 (Collaborator) commented Jul 3, 2024

Also, I noticed the response body doesn't include token usage. Is there any way for Cloudflare Workers AI to support it? For example:

"usage": {
    "prompt_tokens": 16,
    "completion_tokens": 126,
    "total_tokens": 142
}

This matters because the ai-statistics plugin can use it to count tokens, and the ai-token-ratelimit plugin can rate-limit based on those counts.
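For reference, `total_tokens` in a usage object of this shape is just the sum of the other two fields, which is the number a token-based rate limiter would meter. A minimal check (illustrative only, not Higress plugin code):

```python
import json

# The usage object shape from the example above.
usage = json.loads('{"prompt_tokens": 16, "completion_tokens": 126, "total_tokens": 142}')

# total_tokens should equal prompt + completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(usage["total_tokens"])  # 142
```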

imp2002 (Contributor, Author) commented Jul 6, 2024

> The response body doesn't include token usage

It seems that isn't possible.

@imp2002 imp2002 requested a review from cr7258 July 6, 2024 15:39
cr7258 (Collaborator) left a comment

LGTM 🐉

cr7258 (Collaborator) commented Jul 7, 2024

cr7258 (Collaborator) commented Jul 8, 2024

Hello @imp2002, I meant also updating the documentation in this repo: https://github.com/higress-group/higress-group.github.io

johnlanni (Collaborator) left a comment

LGTM

@johnlanni johnlanni merged commit b9f5c4d into alibaba:main Jul 8, 2024
11 checks passed
2456868764 pushed a commit to 2456868764/higress that referenced this pull request Jul 9, 2024