Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts
ZSE is an open-source LLM inference engine aimed at two long-standing problems: memory efficiency and slow cold starts. It reports running 32B models in 19.3 GB of VRAM and a 3.9s cold start for 7B models, which makes it a fit for serverless and autoscaling use cases. Its pre-quantized .zse format is designed for fast loading and integration with existing infrastructure.
guid
https://news.ycombinator.com/item?id=47160526
source_url
https://github.com/Zyora-Dev/zse
author_name
zyoralabs
uid: E5FkW
insdate: 2026-02-26 03:05:05
title: Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts
additional: ZSE is an open-source LLM inference engine aimed at two long-standing problems: memory efficiency and slow cold starts. It reports running 32B models in 19.3 GB of VRAM and a 3.9s cold start for 7B models, which makes it a fit for serverless and autoscaling use cases. Its pre-quantized .zse format is designed for fast loading and integration with existing infrastructure.
category: Hacker News
md5:
guid: https://news.ycombinator.com/item?id=47160526
source_url: https://github.com/Zyora-Dev/zse
updated:
image:
author_name: zyoralabs
author_link:
