Commit graph

  • 8944a13296
    Add NVIDIA cuBLAS support (#1044) slaren 2023-04-19 11:22:45 +0200
  • 6667401238
    Multi-threaded ggml_cpy (#1035) slaren 2023-04-19 00:53:24 +0200
  • 77a73403ca
    ggml : add new Q4_2 quantization (ARM only) (#1046) Georgi Gerganov 2023-04-18 23:54:57 +0300
  • 50a8a2af97
    ggml : scratch that - vmlaq_n_f32 is always better Georgi Gerganov 2023-04-18 23:11:23 +0300
  • 4caebf6d40
    gitignore : vdot Georgi Gerganov 2023-04-18 23:00:08 +0300
  • dcdd65e296
    ggml : optimize ggml_vec_dot_q4_0_q8_0() using vectorized accumulators Georgi Gerganov 2023-04-18 22:59:17 +0300
  • 5ecff35151
    Adding a simple program to measure speed of dot products (#1041) Kawrakow 2023-04-18 21:00:14 +0200
  • 7faa7460f0
    readme : update hot topics about new LoRA functionality Georgi Gerganov 2023-04-18 20:10:26 +0300
  • 5af8e32238
    ci : do not run on drafts Georgi Gerganov 2023-04-17 18:00:10 +0300
  • 42747220b4
    Do not close file after mmap (Windows version) (#1034) Ivan Komarov 2023-04-18 03:15:50 +0200
  • e9298af389
    readme : add Ruby bindings (#1029) Atsushi Tatsuma 2023-04-18 04:34:35 +0900
  • 4ad73137a1
    add 4_0 to default outfile namestr dict (#1031) Cameron 2023-04-17 11:26:23 -0700
  • 315a95a4d3
    Add LoRA support (#820) slaren 2023-04-17 17:28:55 +0200
  • efd05648c8
    llama : well-defined static initialization of complex objects (#927) Arik Poznanski 2023-04-17 17:41:53 +0300
  • eb17a026fd
    quantize-stats : fix bug in --type argument Georgi Gerganov 2023-04-17 17:31:06 +0300
  • 69b740289f
    ggml : avoid using ggml_fp16_to_fp32() and ggml_fp32_to_fp16() in ggml.c Georgi Gerganov 2023-04-17 16:16:23 +0300
  • f266259ad9
    Speedup the AVX-512 implementation of ggml_vec_dot_q4_0() (#933) Ivan Komarov 2023-04-17 15:10:57 +0200
  • 47f61aaa5f
    Fix: do not close file on mmap (#1017) slaren 2023-04-16 21:27:38 +0200
  • 3173a62eb9
    stdout : vertical align outputs for better readability Georgi Gerganov 2023-04-16 13:58:48 +0300
  • 489537e6cf
    examples: add missing <ctime> include for time() (#1011) Pavol Rusnak 2023-04-16 12:13:00 +0200
  • 2d3481c721
    Fix msys2 build error and warnings (#1009) nanahi 2023-04-16 17:13:42 +0800
  • 74f5899df4
    convert.py: Fix loading safetensors and ggml format on Windows (#991) comex 2023-04-15 14:53:21 -0700
  • 2f7c8e014e
    Fix potential int8 overflow in non-SIMD vec_dot (#986) Stephan Walter 2023-04-15 18:28:56 +0000
  • 0ad964631f
    Refactor ggml.c for future tensor types (#1001) Stephan Walter 2023-04-15 16:25:38 +0000
  • e95b6554b4
    ggml : add Q8_0 quantization for intermediate results (#951) Georgi Gerganov 2023-04-15 17:53:22 +0300
  • aa485cee33
    ggml : use posix_memalign on non-Windows env Georgi Gerganov 2023-04-15 14:25:45 +0300
  • c12b14b77f
    benchmark : fix result validation in benchmark-q4_0-matmult (#987) Ivan Komarov 2023-04-15 07:51:54 +0200
  • 106faaf297
    cmake : add finding the OpenBLAS header file (#992) katsu560 2023-04-15 14:51:11 +0900
  • c85e03d12e
    Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" (#982) Pavol Rusnak 2023-04-14 21:58:43 +0200
  • 489093548c
    py : bump sentencepiece to 0.1.98 to support Python 3.11 (#976) Pavol Rusnak 2023-04-14 21:46:49 +0200
  • 93265e988a
    make : fix dependencies, use auto variables (#983) Stephan Walter 2023-04-14 19:39:48 +0000
  • c56b715269
    Expose type name from ggml (#970) Pavol Rusnak 2023-04-14 20:05:37 +0200
  • f4d277ae17
    main : alternative instruct mode (Vicuna support, etc.) (#863) Tomáš Pazdiora 2023-04-14 17:19:17 +0200
  • c9a59b70a5
    ggml : add unary and binary map operations (#874) Kerfuffle 2023-04-14 08:43:55 -0600
  • a32f7acc9f
    py : cleanup dependencies (#962) Pavol Rusnak 2023-04-14 15:37:11 +0200
  • 43ffdefb74
    py : fix flake8 and isort nitpicks (#960) Pavol Rusnak 2023-04-14 14:23:21 +0200
  • 1623a6e9b4
    ggml : minor Georgi Gerganov 2023-04-14 13:31:29 +0300
  • c14e0d2f23
    ggml : always allocate buffers with size multiple of GGML_MEM_ALIGN Georgi Gerganov 2023-04-14 13:31:15 +0300
  • 723dac55fa
    py : new conversion script (#545) comex 2023-04-14 00:03:03 -0700
  • 0f07cacb05
    ggml : fix q4_1 dot product types Georgi Gerganov 2023-04-14 09:45:42 +0300
  • c5d70f5c9e
    ggml : optimize rope function to avoid call powf in the tight loop (#807) Howard Su 2023-04-14 14:24:52 +0800
  • be87b6ed20
    perplexity : add support for batch size to --perplexity (#407) Gary Linscott 2023-04-13 14:50:42 -0700
  • 0e07e6a839
    common : remove unnecessary includes (#947) CRD716 2023-04-13 10:39:25 -0500
  • a3a2a0eda8
    ggml : add GGML_DEFAULT_N_THREADS Georgi Gerganov 2023-04-13 18:36:40 +0300
  • d990e3fffc
    ggml : speed-up ggml_vec_dot_q4_1() ARM_NEON + 32-bit ARM support (#900) Georgi Gerganov 2023-04-13 18:32:36 +0300
  • 9190e8eac8
    llama : merge llama_internal.h into llama.h Georgi Gerganov 2023-04-13 18:04:45 +0300
  • c85980acd0
    gitignore : benchmark Georgi Gerganov 2023-04-13 18:01:22 +0300
  • 6232f2d7fd
    ggml : optimize non-SIMD Q4_0 vector dot product (#703) Stephan Walter 2023-04-13 14:59:50 +0000
  • 6c248707f5
    ggml : introduce GGML_ALIGNED_MALLOC/GGML_ALIGNED_FREE macros (#884) Pavol Rusnak 2023-04-13 16:08:32 +0200
  • 8cda5c981d
    fix whitespace (#944) CRD716 2023-04-13 09:03:57 -0500
  • ec29272175
    readme : remove python 3.10 warning (#929) CRD716 2023-04-13 08:59:53 -0500
  • 7e941b95eb
    readme : llama node binding (#911) Genkagaku.GPT 2023-04-13 21:54:27 +0800
  • c729ff730a
    flake.nix: add all binaries from bin (#848) Pavol Rusnak 2023-04-13 15:49:05 +0200
  • 4579af95e8
    zig : update build.zig (#872) Judd 2023-04-13 21:43:22 +0800
  • 8c3ffc2f04
    ggml : update cblas_sgemm columns var to be more reasonable (#838) Vladimir 2023-04-13 15:24:30 +0200
  • 107980d970
    examples : add -n to alpaca and gpt4all scripts (#706) niansa/tuxifan 2023-04-13 15:03:39 +0200
  • 585d91a156
    cmake : add explicit F16C option (x86) (#576) anzz1 2023-04-13 15:48:21 +0300
  • 95ea26f6e9
    benchmark : add tool for timing q4_0 matrix multiplication (#653) SebastianApel 2023-04-13 14:46:23 +0200
  • 82d146df9b
    do not force the prompt file to end with a new line (#908) Pavol Rusnak 2023-04-13 11:33:16 +0200
  • e7f6997f89
    Don't crash on ftype (formerly f16) == 4 (#917) Stephan Walter 2023-04-12 15:06:16 +0000
  • f76cb3a34d
    readme : change "GPU support" link to discussion Georgi Gerganov 2023-04-12 14:48:57 +0300
  • 782438070f
    readme : update hot topics with link to "GPU support" issue Georgi Gerganov 2023-04-12 14:31:12 +0300
  • 4dbbd40750
    readme: link to sha256sums file (#902) Nicolai Weitkemper 2023-04-12 08:46:20 +0200
  • 8b679987cd
    Fix whitespace, add .editorconfig, add GitHub workflow (#883) Pavol Rusnak 2023-04-11 21:45:44 +0200
  • 3e6e70d8e8
    Add enum llama_ftype, sync ggml_type to model files (#709) Stephan Walter 2023-04-11 15:03:51 +0000
  • 2663d2c678
    Windows fixes (#890) comex 2023-04-11 06:19:54 -0700
  • a0caa34b16
    Add BAIR's Koala to supported models (#877) qouoq 2023-04-11 04:41:53 +0800
  • 461ba9e66e
    ggml : fix WASM build Georgi Gerganov 2023-04-10 23:20:01 +0300
  • c3ac702e5e
    ggml : add ggml_cont() + optimize ggml_cpy() for contiguous dst Georgi Gerganov 2023-04-10 22:40:28 +0300
  • 9d634ef452
    ggml : remove trailing whitespaces Georgi Gerganov 2023-04-10 19:32:45 +0300
  • d9a239c410
    Simplify to include lower-case windows.h always, fix compile on mingw32 (#747) Marco Matthies 2023-04-10 19:57:59 +0200
  • 684da25926
    ggml : fix quantize_row_q4_1() ARM_NEON (close #876) Georgi Gerganov 2023-04-10 19:29:48 +0300
  • 180b693a47
    Print model version. comex 2023-04-08 13:08:21 -0700
  • f963b63afa
    Rewrite loading code to try to satisfy everyone: comex 2023-04-08 12:24:37 -0700
  • aaf3b23deb
    fix for windows utf-8 input (#840) Tomáš Pazdiora 2023-04-08 17:49:39 +0200
  • f2d1c47294
    cmake should link openblas properly with -lopenblas like how it's done in the makefile (#839) eiery 2023-04-08 07:15:17 -0400
  • 317fb12fbd
    Add new binaries to flake.nix (#847) lon 2023-04-08 07:04:23 -0300
  • 62cfc54f77
    Add quantize-stats command for testing quantization (#728) unbounded 2023-04-08 00:09:18 +0200
  • 698f7b5d63
    make : add libllama.so target for llama-cpp-python (#797) bhubbb 2023-04-08 02:11:58 +1000
  • c1950c3431
    zig : don't link examples/common.cpp for non-example (#814) iacore 2023-04-07 16:05:29 +0000
  • 4953e9007f
    llama : always sort logits before nucleus sampling (#812) Ivan Stepanov 2023-04-07 19:02:12 +0300
  • cc9cee8e9e
    Do not crash when it has nothing to say. (#796) Sergey Alirzaev 2023-04-06 17:59:11 +0200
  • d2beca95dc
    Make docker instructions more explicit (#785) Pavol Rusnak 2023-04-06 08:56:58 +0200
  • eeaa7b0492
    ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781) Georgi Gerganov 2023-04-05 22:11:03 +0300
  • 986b6ce9f9
    ggml, llama : avoid heavy V transpose + improvements (#775) Georgi Gerganov 2023-04-05 22:07:33 +0300
  • 3416298929
    Update README.md Georgi Gerganov 2023-04-05 19:54:30 +0300
  • 5a8c4f6240
    llama : define non-positive top_k; top_k range check (#779) Ivan Stepanov 2023-04-05 19:20:05 +0300
  • ff05d05c96
    miku.sh : add executable bit (#780) at8u 2023-04-05 15:59:13 +0000
  • 62b3e81aae
    media : add logos and banners Georgi Gerganov 2023-04-05 18:58:06 +0300
  • 8d10406d6e
    readme : change logo + add bindings + add uis + add wiki Georgi Gerganov 2023-04-05 18:56:20 +0300
  • ed1c214e66
    zig : add build.zig (#773) iacore 2023-04-05 15:06:02 +0000
  • 0c44427df1
    make : missing host optimizations in CXXFLAGS (#763) Ivan Stepanov 2023-04-05 17:38:37 +0300
  • 594cc95fab
    readme : update with CMake and windows example (#748) Adithya Balaji 2023-04-05 16:36:12 +0200
  • 88ed5761b8
    examples : add Miku.sh (#724) at8u 2023-04-05 14:32:42 +0000
  • 58c438cf7d
    Add Accelerate/BLAS when using Swift (#765) Andrew Duffy 2023-04-05 11:44:24 +0100
  • 53dbba7695
    Windows: reactivate sigint handler after each Ctrl-C (#736) mgroeber9110 2023-04-03 18:00:55 +0200
  • 437e77855a
    10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654) SebastianApel 2023-04-03 09:52:28 +0200
  • cd7fa95690
    Define non-positive temperature behavior (#720) Ivan Stepanov 2023-04-03 03:19:04 +0300
  • a0c0516416
    Remove torch GPU dependencies from the Docker.full image (#665) bsilvereagle 2023-04-02 15:13:03 -0700
  • d8d4e865cd
    Add a missing step to the gpt4all instructions (#690) Thatcher Chamberlin 2023-04-02 06:48:57 -0400