You need to enable JavaScript to run this app.
导航
使用记忆库算子实现画像精准抽取
最近更新时间:2025.09.30 15:22:30首次发布时间:2025.09.25 14:19:15
复制全文
我的收藏
有用
有用
无用
无用

为什么需要算子?

在使用大语言模型做记忆抽取时,模型擅长语义理解和推理,但在数值计算(如计数、平均值、最大值等)方面往往容易产生幻觉,导致结果不够精准。
为了解决这一问题,记忆库提供了算子(Operator)机制。算子可以在模型调用前后,对记忆数据进行可靠的数值处理或智能聚合,确保记忆结果既准确有洞察力

算子能力概览

记忆库的算子主要分为两类:

  1. 数值计算类算子

这些算子保证数值计算结果的准确性,避免依赖模型的“心算”。

  • SUM:对数值字段求和,返回总和结果。
  • COUNT:统计记录数量,返回计数结果。
  • MAX:获取数值字段中的最大值,返回最大结果。
  • AVG:对数值字段求平均值,返回平均结果。

应用示例:在教育场景中,利用 SUM 和 COUNT 可以统计某个用户对知识点的答对次数、答错次数,从而评估掌握情况。

  1. 大模型处理类算子

这类算子发挥大模型的语义理解能力,对非结构化或复杂数据进行聚合。

  • LLM_MERGE:通过大模型对画像字段进行智能合并与归纳。利用语义压缩、思维链等方法,对记忆进行修订和融合。

应用示例:在用户兴趣画像中,LLM_MERGE 可以将用户关于“数学兴趣”的不同表达方式(如“喜欢几何”“对代数有兴趣”)合并为更统一、更有洞察力的描述。

算子能力最佳实践

下面以一个完整的教育场景案例为例,演示如何记录用户的学习行为,并通过 MAXAVG 算子评估学生在某知识点上的平均得分最高得分,从而实现对知识点掌握情况的个性化记录。

创建记忆库

您可以通过控制台或数据面 API 创建记忆库,具体操作步骤如下:

创建一个名为 education_memory_base 的记忆库。
Image
请参考下图,在记忆抽取配置中编辑事件规则,以确保能够准确捕获学习行为数据。
Image
在记忆抽取配置中,请参考下图编辑画像规则。添加“最高分”和“平均分”字段时,点击更多配置,将关联事件 english_study 中的 rating_score 作为聚合对象,并分别选择 MAXAVG 算子作为聚合操作符。
Image
Image
配置好事件和画像抽取规则后,点击“创建记忆库”即可完成创建。

添加会话

在添加会话时,需预先定义 profile_scope 字段。这样,大模型就能将用户的学习行为精准归属到对应的知识点,确保画像更新的准确性与可追溯性。

import json
import requests
from volcengine.base.Request import Request
from volcengine.Credentials import Credentials
from volcengine.auth.SignerV4 import SignerV4
import os

AK = os.getenv("VOLC_ACCESSKEY")
SK = os.getenv("VOLC_SECRETKEY")
Domain = "api-knowledgebase.mlp.cn-beijing.volces.com"

def prepare_request(method, path, ak, sk, data=None):
  r = Request()
  r.set_shema("http")
  r.set_method(method)
  r.set_host(Domain)
  r.set_path(path)

  if data is not None:
    r.set_body(json.dumps(data))
  credentials = Credentials(ak, sk, 'air', 'cn-north-1')
  SignerV4.sign(r, credentials)
  return r

def internal_request(method, api, payload, params=None):
  req = prepare_request(
                        method = method,
                        path = api,
                        ak = AK,
                        sk = SK,
                        data = payload)

  r = requests.request(method=req.method,
          url="{}://{}{}".format(req.schema, req.host, req.path),
          headers=req.headers,
          data=req.body,
          params=params,
      )
  return r

messages = [
  {"role": "assistant", "content": "Let’s practice comparatives. How do you say: “我的房间比你的大”?"},
  {"role": "user", "content": "My room more big than yours."},
  {"role": "assistant", "content": "Almost! Correct is: “My room is bigger than yours.”"},

  {"role": "assistant", "content": "How would you say: “她比我聪明”?"},
  {"role": "user", "content": "She is more smart than me."},
  {"role": "assistant", "content": "Good try! Correct is: “She is smarter than me.”"},

  {"role": "assistant", "content": "Translate: “今天比昨天冷”。"},
  {"role": "user", "content": "Today is colder than yesterday."},
  {"role": "assistant", "content": "Perfect! That’s correct."},

  {"role": "assistant", "content": "How do you say: “这是我最喜欢的电影”?"},
  {"role": "user", "content": "This is my favorite movie."},
  {"role": "assistant", "content": "Excellent! That’s the superlative form."},

  {"role": "assistant", "content": "Translate: “这是我们班里最高的男孩”。"},
  {"role": "user", "content": "He is the most tall boy in our class."},
  {"role": "assistant", "content": "Almost! Better is: “He is the tallest boy in our class.”"},

  {"role": "assistant", "content": "Make your own sentence with a superlative."},
  {"role": "user", "content": "Mount Everest is the highest mountain in the world."},
  {"role": "assistant", "content": "Perfect! Very natural usage."},

  {"role": "assistant", "content": "Now let’s practice articles. How do you say: “我有一只狗”?"},
  {"role": "user", "content": "I have dog."},
  {"role": "assistant", "content": "Almost! It should be: “I have a dog.”"},

  {"role": "assistant", "content": "Translate: “月亮在天上”。"},
  {"role": "user", "content": "The moon is in the sky."},
  {"role": "assistant", "content": "Perfect! We use 'the' for unique things."},

  {"role": "assistant", "content": "How would you say: “我需要一杯水”?"},
  {"role": "user", "content": "I need water."},
  {"role": "assistant", "content": "Correct! No article with uncountable nouns like 'water'."},

  {"role": "assistant", "content": "How do you say: “她是老师”?"},
  {"role": "user", "content": "She is teacher."},
  {"role": "assistant", "content": "Almost! Correct is: “She is a teacher.”"},

  {"role": "assistant", "content": "Translate: “太阳正在照耀”。"},
  {"role": "user", "content": "Sun is shining."},
  {"role": "assistant", "content": "Close! Correct is: “The sun is shining.”"},

  {"role": "assistant", "content": "Make your own sentence with 'an'."},
  {"role": "user", "content": "I ate an apple."},
  {"role": "assistant", "content": "Perfect! That’s exactly right."},

  {"role": "assistant", "content": "Let’s try countable vs uncountable nouns. Translate: “我买了一些米”。"},
  {"role": "user", "content": "I bought some rices."},
  {"role": "assistant", "content": "Not quite. 'Rice' is uncountable: “I bought some rice.”"},

  {"role": "assistant", "content": "How do you say: “我需要一些信息”?"},
  {"role": "user", "content": "I need some informations."},
  {"role": "assistant", "content": "Almost! Correct is: “I need some information.”"},

  {"role": "assistant", "content": "Translate: “桌子上有三本书”。"},
  {"role": "user", "content": "There are three books on the table."},
  {"role": "assistant", "content": "Perfect! 'Books' are countable."},

  {"role": "assistant", "content": "How do you say: “我喝了两杯水”?"},
  {"role": "user", "content": "I drank two water."},
  {"role": "assistant", "content": "Close! Correct is: “I drank two glasses of water.”"},

  {"role": "assistant", "content": "Make your own sentence with an uncountable noun."},
  {"role": "user", "content": "I have some money."},
  {"role": "assistant", "content": "Excellent! That’s correct."},

  {"role": "assistant", "content": "Now let’s practice passive voice. Translate: “这本书是他写的”。"},
  {"role": "user", "content": "This book he wrote."},
  {"role": "assistant", "content": "Almost! In passive voice: “This book was written by him.”"},

  {"role": "assistant", "content": "How would you say: “英语在全世界被使用”?"},
  {"role": "user", "content": "English is used all over the world."},
  {"role": "assistant", "content": "Perfect! That’s correct."},

  {"role": "assistant", "content": "Translate: “这些苹果已经被吃掉了”。"},
  {"role": "user", "content": "These apples are eaten."},
  {"role": "assistant", "content": "Close! Correct is: “These apples have been eaten.”"},

  {"role": "assistant", "content": "How do you say: “这座桥是去年建的”?"},
  {"role": "user", "content": "This bridge built last year."},
  {"role": "assistant", "content": "Almost! Correct is: “This bridge was built last year.”"},

  {"role": "assistant", "content": "Make your own passive sentence."},
  {"role": "user", "content": "The homework was done by me."},
  {"role": "assistant", "content": "Correct! That’s a good example."},

  {"role": "assistant", "content": "Finally, let’s try conditionals. How would you say: “如果下雨,我就带伞”。"},
  {"role": "user", "content": "If it rains, I will take an umbrella."},
  {"role": "assistant", "content": "Perfect! That’s a first conditional."},

  {"role": "assistant", "content": "Translate: “如果我是你,我就会去”。"},
  {"role": "user", "content": "If I was you, I would go."},
  {"role": "assistant", "content": "Almost! Better is: “If I were you, I would go.”"},

  {"role": "assistant", "content": "How do you say: “如果他努力,他就会成功”。"},
  {"role": "user", "content": "If he works hard, he will succeed."},
  {"role": "assistant", "content": "Excellent! That’s correct."},

  {"role": "assistant", "content": "Translate: “如果她昨天来了,我们会很开心”。"},
  {"role": "user", "content": "If she came yesterday, we will happy."},
  {"role": "assistant", "content": "Close! Correct is: “If she had come yesterday, we would have been happy.”"},

  {"role": "assistant", "content": "Make your own conditional sentence."},
  {"role": "user", "content": "If I study hard, I will pass the exam."},
  {"role": "assistant", "content": "Perfect! Very natural usage."}
]

path = "/api/memory/messages/add"
playload = {
  "collection_name": "education_memory_base",
  "session_id": "message0",
  "messages": messages,
  "metadata": {
    "default_user_id": "user1",
    "default_assistant_id": "assistant1",
    "time": 1678886400000,
    "group_id": "english_class_101"
  },
  "profiles": [
    {
      "profile_type": "english_knowledge_point",
      "profile_scope": [
        {
          "id": 301,
          "knowledge_point_name": "比较级与最高级",
          "good_evaluation_criteria": "能够正确使用比较级和最高级表达进行比较和描述"
        },
        {
          "id": 302,
          "knowledge_point_name": "冠词用法",
          "good_evaluation_criteria": "能够正确使用a、an和the表达泛指、特指以及唯一事物"
        },
        {
          "id": 303,
          "knowledge_point_name": "可数与不可数名词",
          "good_evaluation_criteria": "能够区分可数和不可数名词并正确搭配量词和数词"
        },
        {
          "id": 304,
          "knowledge_point_name": "被动语态",
          "good_evaluation_criteria": "能够正确使用被动语态描述动作的承受者和动作的执行情况"
        },
        {
          "id": 305,
          "knowledge_point_name": "条件句",
          "good_evaluation_criteria": "能够根据语境正确使用不同类型的条件句(零条件句、第一、第二和第三条件句)"
        }
      ]
    }
  ]
}
rsp = internal_request('POST', path, playload)
print(rsp.json())

结果查看

在记忆库控制台中,可以直接查看经过算子计算后的结果,包括用户在某知识点上的平均得分最高得分。同时,也可以在事件记录中查看每一次问答的具体得分,便于溯源和分析。
Image

附:算子的字段限制

算子

int64事件字段

float32事件字段

string事件字段

bool事件字段

list事件字段

list事件字段

SUM

✅(画像字段类型=int64时,有精度损失)

COUNT

MAX

✅(画像字段类型=int64时,有精度损失)

AVG

✅(画像字段类型=int64时,有精度损失)

✅(画像字段类型=int64时,有精度损失)

LLM_MERGE