Tags
Tags mark specific points in a repository's history as important.
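Release tags like the ones listed below are ordinary git tags pointing at specific commits. As a minimal sketch, here is how such a tag can be created and inspected with git; the temporary repository, tag name, and messages are illustrative only, not taken from the listing.

```shell
# Create a throwaway repo with one commit, then tag and inspect it.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "initial commit"

git tag -a b0001 -m "release b0001"   # annotated tag pointing at HEAD
git tag -n                            # list tags with their messages
git rev-parse --short "b0001^{commit}"  # short hash the tag resolves to
```

Annotated tags (`-a`) are full objects carrying a tagger and message, which is why each entry below can show both a tag name and the commit it points to.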
b2135 · 895407f3 · ggml-quants : fix compiler warnings (shadow variable) (#5472) · Feb 13, 2024
b2134 · 099afc62 · llama : fix quantization when tensors are missing (#5423) · Feb 12, 2024
b2133 · df334a11 · swift : package no longer use ggml dependency (#5465) · Feb 12, 2024
b2131 · 43fe07c1 · ggml-sycl: Replace 3d ops with macro (#5458) · Feb 12, 2024
b2130 · 4a46d2b7 · llava : remove prog parameter from ArgumentParser (#5457) · Feb 12, 2024
b2129 · 3b169441 · sync : ggml (#5452) · Feb 12, 2024
b2128 · 3bdc4cd0 · CUDA: mul_mat_vec_q tiling, refactor mul mat logic (#5434) · Feb 11, 2024
b2127 · 2891c8aa · Add support for BERT embedding models (#5423) · Feb 11, 2024
b2125 · c88c74f9 · vulkan: only use M-sized matmul on Apple GPUs (#5412) · Feb 11, 2024
b2124 · a803333a · common : use enums for sampler types (#5418) · Feb 11, 2024
b2123 · 68478014 · server : allow to specify tokens as strings in logit_bias (#5003) · Feb 11, 2024
b2122 · 85910c5b · main : ctrl+C print timing in non-interactive mode (#3873) · Feb 11, 2024
b2121 · 139b62a8 · common : fix compile warning · Feb 11, 2024
b2119 · a07d0fee · ggml : add mmla kernels for quantized GEMM (#4966) · Feb 11, 2024
b2118 · e4640d8f · lookup: add print for drafting performance (#5450) · Feb 11, 2024
b2117 · 907e08c1 · server : add llama2 chat template (#5425) · Feb 11, 2024
b2116 · f026f812 · metal : use autoreleasepool to avoid memory leaks (#5437) · Feb 10, 2024
b2114 · 43b65f5e · sync : ggml · Feb 10, 2024
b2112 · 4b7b38be · vulkan: Set limit for task concurrency (#5427) · Feb 09, 2024
b2110 · 7c777fcd · server : fix prompt caching for repeated prompts (#5420) · Feb 09, 2024