CUDA: faster k-quant mul_mat_q kernels (#2525) · f514d1b306 - aditya/llama.cpp

aditya/llama.cpp

mirror of https://git.adityakumar.xyz/llama.cpp.git synced 2025-02-22 23:50:01 +00:00

Johannes Gäßler, 2023-08-05 18:20:44 +02:00, committed by GitHub

parent 332311234a

commit f514d1b306

No known key found for this signature in database

GPG key ID: 4AEE18F83AFDEB23

1 changed file with 518 additions and 371 deletions

877

ggml-cuda.cu

File diff suppressed because it is too large.