
According to Tether, the latest QVAC SDK integrates Google Research’s TurboQuant to support local AI on laptops and phones with up to 5x KV cache compression and minimal quality impact.
Tether AI said the latest QVAC SDK integrates Google Research’s TurboQuant, a memory compression technology that reduces KV cache usage by up to 5x while keeping output quality nearly unchanged. The company said the update is aimed at improving local AI performance on laptops and phones, and that it supports its broader goal of reducing reliance on centralized cloud systems.