25.10.29
Issue resolved: the vllm instance has been reverted to 0.10.2, and the problem has not recurred since.
This is the same underlying issue as the earlier MFocus quality degradation, and I still have no complete explanation; my best guess is that it arises from the combination of sm120 GPUs and vllm's attention backend.
I'm not able to fix an issue at this layer of vllm myself, so it has been reported at https://github.com/vllm-project/vllm/issues/26930 . Multiple instances of this issue have been observed.
The service is back online, and you should no longer encounter this problem. Please note that we have force-reset roughly ten sessions that were corrupted or at risk of corruption.
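For operators hitting the same issue, the rollback above can be reproduced with a standard pip version pin. This is a minimal sketch, not the exact deployment commands used here: the model name is a placeholder, and whether overriding the attention backend actually avoids the bug on sm120 GPUs is an untested assumption.

```shell
# Pin vllm back to the known-good release mentioned above.
pip install "vllm==0.10.2"

# Assumption: while the upstream issue is open, forcing a different
# attention backend via VLLM_ATTENTION_BACKEND may sidestep the
# suspected sm120/backend interaction. Which value (if any) helps
# has not been verified.
VLLM_ATTENTION_BACKEND=XFORMERS vllm serve <your-model-name>
```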