Tags

Tags let you mark specific points in history as important.

b2268
269de86b · llama : fix Gemma rope type (#5691) · Feb 26, 2024

b2267
c3937339 · flake.lock: Update · Feb 25, 2024

b2266
e3965cf3 · server: tests - slow inference causes timeout on the CI (#5715) · Feb 25, 2024

b2264
bf08e006 · llama : refactor k-shift implementation + KV defragmentation (#5691) · Feb 25, 2024

b2263
f7625019 · server : fix crash when system prompt is bigger than batch size (#5714) · Feb 25, 2024

b2262
abbabc5e · ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility (#5711) · Feb 25, 2024

b2261
f1a98c52 · make : fix nvcc version is empty (#5713) · Feb 25, 2024

b2259
930b1780 · server: logs - unified format and --log-format option (#5700) · Feb 25, 2024

b2258
d52d7819 · server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708) · Feb 25, 2024

b2257
12894088 · cmake : fix compilation for Android armeabi-v7a (#5702) · Feb 25, 2024

b2256
ab336a9d · code : normalize enum names (#5697) · Feb 25, 2024

b2254
9e359a4f · server: continue to update other slots on embedding concurrent request (#5699) · Feb 24, 2024

b2253
4c4cb307 · IQ3_S: a much better alternative to Q3_K (#5676) · Feb 24, 2024

b2252
525213d2 · server: init functional tests (#5566) · Feb 24, 2024

b2251
fd43d66f · server : add KV cache quantization options (#5684) · Feb 23, 2024

b2249
15499eb9 · mpt : do not duplicate token_embd.weight on disk (#5670) · Feb 22, 2024

b2248
96633eec · gemma : use more bits for the token_embd.weight tensor (#5650) · Feb 22, 2024

b2247
847eedbd · py : add Gemma conversion from HF models (#5647) · Feb 22, 2024

b2246
7e4f339c · ggml : always define ggml_fp16_t as uint16_t (#5666) · Feb 22, 2024

b2245
334f76fa · sync : ggml · Feb 22, 2024