[ai-cache] Implement a WASM plugin for LLM result retrieval based on vector similarity #1290
update
update: note that TLS must not be used with the HTTP protocol
update: add lobechat
add: makefile for ai-proxy
fix bugs
fix bugs
fix: redis connection
fix: dashvector and dashscope cluster
fix: change vdb collection
feat: add chroma logic
docs: add API description
update: no callback version
fix: change to callback
fix: finish chrome
remove: key
update: gitignore
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1290      +/-   ##
==========================================
+ Coverage   35.91%   43.49%   +7.57%
==========================================
  Files          69       76       +7
  Lines       11576    12320     +744
==========================================
+ Hits         4157     5358    +1201
+ Misses       7104     6626     -478
- Partials      315      336      +21
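## Configuration
The configuration has three parts: the vector database (vector), the text embedding interface (embedding), and the cache database (cache). It also provides fine-grained parameters for extracting content from LLM requests and responses.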
I think that works. Setting it to disabled just means the ai-cache plugin stops working, right? @johnlanni, what do you think?
So far so good.
Added support for the configuration format of the currently released version:
func (c *PluginConfig) FromJson(json gjson.Result) {
c.vectorProviderConfig.FromJson(json.Get("vector"))
c.embeddingProviderConfig.FromJson(json.Get("embedding"))
c.cacheProviderConfig.FromJson(json.Get("cache"))
if json.Get("redis").Exists() {
// compatible with legacy config
c.cacheProviderConfig.ConvertLegacyJson(json.Get("redis"))
    }
The ConvertLegacyJson function is defined as:
func (c *ProviderConfig) ConvertLegacyJson(json gjson.Result) {
c.FromJson(json)
c.typ = "redis"
}
The test file is:
{
"redis": {
"serviceName": "redis_cluster",
"timeout": 2000
}
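}
The test passes.
Second, added support for the other, non-redis legacy configuration items.
The current approach is to add convertLegacyMapFields(c, json, log) as the last line of the FromJson function in config.go. The corresponding implementation is:
func convertLegacyMapFields(c *PluginConfig, json gjson.Result, log wrapper.Log) {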
keyMap := map[string]string{
"cacheKeyFrom.requestBody": "cacheKeyFrom",
"cacheValueFrom.requestBody": "cacheValueFrom",
"cacheStreamValueFrom.requestBody": "cacheStreamValueFrom",
"returnResponseTemplate": "responseTemplate",
"returnStreamResponseTemplate": "streamResponseTemplate",
}
for oldKey, newKey := range keyMap {
if json.Get(oldKey).Exists() {
log.Debugf("[convertLegacyMapFields] mapping %s to %s", oldKey, newKey)
setField(c, newKey, json.Get(oldKey).String(), log)
} else {
log.Debugf("[convertLegacyMapFields] %s not exists", oldKey)
}
}
}
func setField(c *PluginConfig, fieldName string, value string, log wrapper.Log) {
switch fieldName {
case "cacheKeyFrom":
c.CacheKeyFrom = value
case "cacheValueFrom":
c.CacheValueFrom = value
case "cacheStreamValueFrom":
c.CacheStreamValueFrom = value
case "responseTemplate":
c.ResponseTemplate = value
case "streamResponseTemplate":
c.StreamResponseTemplate = value
}
log.Debugf("[setField] set %s to %s", fieldName, value)
}
However, this approach only works for returnStreamResponseTemplate and returnResponseTemplate, the two configuration keys that contain no dot. It has no effect on the other keys.
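A possible cause, not confirmed in this thread, is that a dotted key such as cacheKeyFrom.requestBody can be stored either as one flat key or as a requestBody field nested under cacheKeyFrom, and a path-based lookup matches only one of the two. The sketch below is not the plugin's code: lookupLegacy is a hypothetical helper, it uses the standard library's encoding/json rather than gjson, and the sample value messages.0.content is made up. It tries the flat key first and then walks the nested path.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookupLegacy is a hypothetical helper (not part of the plugin).
// It resolves a legacy key that may be stored flat or nested.
func lookupLegacy(cfg map[string]any, key string) (string, bool) {
	// 1. Flat form: the whole dotted name is a single key.
	if v, ok := cfg[key].(string); ok {
		return v, true
	}
	// 2. Nested form: treat each dot as one level of nesting.
	var cur any = cfg
	for _, part := range strings.Split(key, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return "", false
		}
		cur, ok = m[part]
		if !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

func main() {
	// Both inputs are illustrative; the value is a made-up extraction path.
	inputs := []string{
		`{"cacheKeyFrom": {"requestBody": "messages.0.content"}}`,
		`{"cacheKeyFrom.requestBody": "messages.0.content"}`,
	}
	for _, raw := range inputs {
		var cfg map[string]any
		if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
			panic(err)
		}
		v, ok := lookupLegacy(cfg, "cacheKeyFrom.requestBody")
		fmt.Println(v, ok)
	}
}
```

Both inputs resolve to the same value, so the mapping step no longer depends on how the legacy key was written.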
@@ -0,0 +1,225 @@
package config
Escape the string and trim the double quotes on both ends so that it stays compatible with the existing templates.
// Escape the response to ensure consistent formatting
escapedResponse := strings.Trim(strconv.Quote(response), "\"")
ctx.SetContext(CACHE_KEY_CONTEXT_KEY, nil)
if stream {
proxywasm.SendHttpResponseWithDetail(200, "ai-cache.hit", [][2]string{{"content-type", "text/event-stream; charset=utf-8"}}, []byte(fmt.Sprintf(c.StreamResponseTemplate, escapedResponse)), -1)
} else {
proxywasm.SendHttpResponseWithDetail(200, "ai-cache.hit", [][2]string{{"content-type", "application/json; charset=utf-8"}}, []byte(fmt.Sprintf(c.ResponseTemplate, escapedResponse)), -1)
}
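As a sanity check on the escaping step, the sketch below compares the quoted approach with a variant that removes exactly one quote from each end. It is a standalone illustration, not the plugin's code: responseTemplate is a made-up template, and escapeSlice and rendersValidJSON are hypothetical helpers. The point it illustrates is that strings.Trim strips every trailing character in the cutset, so a response that itself ends with a double quote loses the quote belonging to its final escape sequence.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// responseTemplate is a made-up template with one %s slot inside a JSON string.
const responseTemplate = `{"choices":[{"message":{"content":"%s"}}]}`

// escapeTrim mirrors the snippet above: quote, then trim every '"' at both ends.
func escapeTrim(s string) string {
	return strings.Trim(strconv.Quote(s), "\"")
}

// escapeSlice (hypothetical) removes exactly one quote from each end.
func escapeSlice(s string) string {
	q := strconv.Quote(s)
	return q[1 : len(q)-1]
}

// rendersValidJSON reports whether the filled template parses as JSON.
func rendersValidJSON(escape func(string) string, response string) bool {
	return json.Valid([]byte(fmt.Sprintf(responseTemplate, escape(response))))
}

func main() {
	for _, resp := range []string{"line1\nline2", `He said "hi"`} {
		fmt.Printf("%q trim=%v slice=%v\n", resp,
			rendersValidJSON(escapeTrim, resp), rendersValidJSON(escapeSlice, resp))
	}
}
```

For the second input the Trim variant yields a body that no longer parses, while the slicing variant stays valid. Separately, strconv.Quote follows Go string-literal syntax, so some control characters are written as \x escapes, which JSON does not accept; a JSON encoder may be a safer choice if such input is possible.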
// Check if the ResponseBody field exists
if !responseBody.Exists() {
    // Return an empty string if we cannot extract the content
    log.Warnf("[%s] [processSSEMessage] cannot extract content from message: %s", PLUGIN_NAME, message)
When stopReason is non-empty, won't execution enter this branch?
Indeed. Changed this to log.Warn so that no error is returned.
Then won't a warning be logged every time...
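I took a closer look at the SSE chunk flow. An empty string can still be extracted when finish_reason is null or stop; parsing fails only when the incoming data is just the "data: [DONE]" marker that signals the end of the SSE stream. I added a special case to the processSSEMessage function: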
if strings.TrimSpace(bodyJson) == "[DONE]" {
return "", nil
}
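Updated the response handling logic: when content cannot be extracted from the current responseBody but cached content already exists, only a debug-level log is written and an empty string is returned. If nothing has been cached yet, an error is returned and the remaining response processing is skipped.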
if ctx.GetContext(CACHE_CONTENT_CONTEXT_KEY) != nil {
log.Debugf("[%s] [processSSEMessage] unable to extract content from message; cache content is not nil: %s", PLUGIN_NAME, message)
return "", nil
}
return "", fmt.Errorf("[%s] [processSSEMessage] unable to extract content from message; cache content is nil: %s", PLUGIN_NAME, message)
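The two rules above can be sketched end to end as follows. This is a minimal standalone illustration, not the plugin's implementation: extractContent is a hypothetical stand-in for processSSEMessage, the hasCached flag replaces the ctx.GetContext(CACHE_CONTENT_CONTEXT_KEY) check, the OpenAI-style choices[0].delta.content layout is an assumption, and encoding/json is used instead of gjson.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// chunk models only the fields this sketch needs (assumed OpenAI-style layout).
type chunk struct {
	Choices []struct {
		Delta struct {
			Content *string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// extractContent is a hypothetical stand-in for processSSEMessage.
// hasCached says whether any content was accumulated from earlier chunks.
func extractContent(message string, hasCached bool) (string, error) {
	body := strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(message), "data:"))
	// The end-of-stream marker carries no JSON, so handle it first.
	if body == "[DONE]" {
		return "", nil
	}
	var c chunk
	if err := json.Unmarshal([]byte(body), &c); err == nil &&
		len(c.Choices) > 0 && c.Choices[0].Delta.Content != nil {
		return *c.Choices[0].Delta.Content, nil
	}
	// Nothing extractable: tolerate it only if earlier chunks produced content.
	if hasCached {
		return "", nil
	}
	return "", fmt.Errorf("unable to extract content from message; cache content is nil: %s", message)
}

func main() {
	stream := []string{
		`data: {"choices":[{"delta":{"content":"Hel"}}]}`,
		`data: {"choices":[{"delta":{"content":"lo"}}]}`,
		`data: {"choices":[{"delta":{},"finish_reason":"stop"}]}`,
		`data: [DONE]`,
	}
	var cached strings.Builder
	for _, msg := range stream {
		part, err := extractContent(msg, cached.Len() > 0)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		cached.WriteString(part)
	}
	fmt.Println(cached.String())
}
```

Checking for [DONE] before any parsing keeps the normal end of a stream out of the warning and error paths, which addresses the concern about logging a warning on every response.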
LGTM
LGTM
Ⅰ. Describe what this PR did
Adds to the ai-cache plugin the ability to retrieve cached results based on text vector similarity.
Ⅱ. Does this pull request fix one issue?
fixes #1040