Tags
Tags mark specific points in the repository history as important.
b2268 · 269de86b · llama : fix Gemma rope type (#5691) · Feb 26, 2024
b2267 · c3937339 · flake.lock: Update · Feb 25, 2024
b2266 · e3965cf3 · server: tests - slow inference causes timeout on the CI (#5715) · Feb 25, 2024
b2264 · bf08e006 · llama : refactor k-shift implementation + KV defragmentation (#5691) · Feb 25, 2024
b2263 · f7625019 · server : fix crash when system prompt is bigger than batch size (#5714) · Feb 25, 2024
b2262 · abbabc5e · ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility (#5711) · Feb 25, 2024
b2261 · f1a98c52 · make : fix nvcc version is empty (#5713) · Feb 25, 2024
b2259 · 930b1780 · server: logs - unified format and --log-format option (#5700) · Feb 25, 2024
b2258 · d52d7819 · server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708) · Feb 25, 2024
b2257 · 12894088 · cmake : fix compilation for Android armeabi-v7a (#5702) · Feb 25, 2024
b2256 · ab336a9d · code : normalize enum names (#5697) · Feb 25, 2024
b2254 · 9e359a4f · server: continue to update other slots on embedding concurrent request (#5699) · Feb 24, 2024
b2253 · 4c4cb307 · IQ3_S: a much better alternative to Q3_K (#5676) · Feb 24, 2024
b2252 · 525213d2 · server: init functional tests (#5566) · Feb 24, 2024
b2251 · fd43d66f · server : add KV cache quantization options (#5684) · Feb 23, 2024
b2249 · 15499eb9 · mpt : do not duplicate token_embd.weight on disk (#5670) · Feb 22, 2024
b2248 · 96633eec · gemma : use more bits for the token_embd.weight tensor (#5650) · Feb 22, 2024
b2247 · 847eedbd · py : add Gemma conversion from HF models (#5647) · Feb 22, 2024
b2246 · 7e4f339c · ggml : always define ggml_fp16_t as uint16_t (#5666) · Feb 22, 2024
b2245 · 334f76fa · sync : ggml · Feb 22, 2024