Commit graph

  • 265db9834e
    ggml : output 3d sizes in ggml_graph_dump_dot() Georgi Gerganov 2023-05-21 11:56:23 +0300
  • fab49c685e
    ggml : update WASM SIMD Georgi Gerganov 2023-05-20 20:00:41 +0300
  • b8ee340abe
    feature : support blis and other blas implementation (#1536) Zenix 2023-05-20 23:58:31 +0900
  • 9ecb30f959
    OpenCL: Fixes for older devices. (#1435) Henri Vasserman 2023-05-20 17:57:39 +0300
  • 29cf5596fe
    llama : define magic numbers as integer constants (#1518) (#1520) Juuso Alasuutari 2023-05-20 15:58:15 +0300
  • 3de84b2606
    ggml : add ggml_clamp() (#1539) Georgi Gerganov 2023-05-20 15:34:45 +0300
  • affc76edfd
    cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483) Johannes Gäßler 2023-05-20 14:19:28 +0200
  • ea600071cb
    Revert "feature : add blis and other BLAS implementation support (#1502)" Georgi Gerganov 2023-05-20 12:03:48 +0300
  • 07e9ace0f9
    feature : add blis and other BLAS implementation support (#1502) Zenix 2023-05-20 18:02:48 +0900
  • ec2e10c444
    llama : add llama_init_backend() API (close #1527) Georgi Gerganov 2023-05-20 11:06:11 +0300
  • d2c59b8ba4
    Fix for mingw (#1462) DannyDaemonic 2023-05-20 00:40:02 -0700
  • 503db28849
    llama : fix name shadowing and C4146 (#1526) Maxime 2023-05-20 09:22:37 +0200
  • 8a203f9fa1
    llama : fix compile warnings in llama_set_state_data() Georgi Gerganov 2023-05-20 10:14:31 +0300
  • 4fd3e29297
    ggml : fix scalar implementation of Q4_1 dot Georgi Gerganov 2023-05-20 10:13:19 +0300
  • 2d5db48371
    ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508) Georgi Gerganov 2023-05-19 22:17:18 +0300
  • 6986c7835a
    tests : add missing header Georgi Gerganov 2023-05-19 21:17:28 +0300
  • 943e6081cc
    examples : add persistent chat (#1495) Evan Jones 2023-05-19 13:39:51 -0400
  • 7694b52b9a
    main : make reverse prompt option act as a stop token in non-interactive mode (#1032) Jason McCartney 2023-05-19 10:24:59 -0700
  • 79e3efb0e9
    readme : adds WizardLM to the list of supported models (#1485) David Kennedy 2023-05-19 13:16:30 -0400
  • 4b7e245adf
    minor : fix compile warnings Georgi Gerganov 2023-05-19 20:14:51 +0300
  • 5ea4339273
    make kv_f16 the default for api users (#1517) Erik Scholz 2023-05-18 19:31:01 +0200
  • ee9654138a
    Fixes #1511 lambda issue for w64devkit (mingw) (#1513) DannyDaemonic 2023-05-18 10:30:40 -0700
  • dc271c52ed
    Remove unused n_parts parameter (#1509) Stephan Walter 2023-05-17 22:12:01 +0000
  • c238b5873a
    benchmark-matmul: Print the average of the test results (#1490) rankaiyx 2023-05-17 22:47:58 +0800
  • 2b2646931b
    convert.py: Support models which are stored in a single pytorch_model.bin (#1469) Tom Jobbins 2023-05-16 23:04:35 +0100
  • 42627421ec
    ~7% faster Q5_1 AVX2 code (#1477) Ilya Kurdyukov 2023-05-17 01:36:47 +0700
  • 9560655409
    define default model path once, sync path with readme (#1366) András Salamon 2023-05-16 16:46:34 +0100
  • 2a5ee023ad
    Add alternate include path for openblas (#1476) sandyiscool 2023-05-16 14:00:15 +0530
  • 63d20469b8
    fix get_num_physical_cores() (#1436) zrm 2023-05-14 22:25:42 -0400
  • b5c9295eef
    benchmark-matmul: fix clang-tidy issues, report results in GFLOPS (#1458) slaren 2023-05-14 22:46:00 +0200
  • eb363627fd
    cuda : deduplicated dequantization code (#1453) Johannes Gäßler 2023-05-14 20:53:23 +0200
  • 79b2d5b69d
    ggml : alternative fix for race condition bug in non-inplace ggml_compute_forward_diag_mask_f32 (#1454) xaedes 2023-05-14 17:55:02 +0200
  • 13c351ad72
    ggml : various fixes (#1450) Georgi Gerganov 2023-05-14 18:22:50 +0300
  • 60f8c361ca
    ggml : add AVX support based on AVX2 code (#1430) katsu560 2023-05-14 19:03:51 +0900
  • 601a033475
    ggml : add GGML_QNT_VERSION to track quantization format changes Georgi Gerganov 2023-05-14 10:20:19 +0300
  • 08737ef720
    cuda : fix convert function (#1412) Georgi Gerganov 2023-05-13 17:40:58 +0300
  • bda4d7c215
    make : fix PERF build with cuBLAS Georgi Gerganov 2023-05-13 17:25:09 +0300
  • 5a5aeb1e91
    llama : fix unused warning Georgi Gerganov 2023-05-13 16:55:14 +0300
  • 66841fdb0e
    ggml : multi-thread mul and diag_mask ops (#1428) Georgi Gerganov 2023-05-13 16:48:03 +0300
  • 905d87b70a
    ggml : GPU-accelerated token generation (#1412) Johannes Gäßler 2023-05-13 15:38:36 +0200
  • f954edda93
    ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360) xaedes 2023-05-13 14:56:40 +0200
  • f048af0230
    ggml : sync alibi fix from ggml repo Georgi Gerganov 2023-05-13 11:54:33 +0300
  • ac0cd259d5
    Adding SSE instructions to ggml_vec_dot_q4_0_q8_0 (#1413) 3ooabkhxtn 2023-05-13 10:43:33 +0200
  • 0cd22e190a
    llama : fix various warnings Georgi Gerganov 2023-05-13 11:23:15 +0300
  • 6456a4eb9f
    embedding : remove unused code (#1426) Rinne 2023-05-13 15:24:20 +0800
  • cdd5350892
    readme : update Q4_0 perplexities Georgi Gerganov 2023-05-13 09:12:44 +0300
  • 738ace394a
    llama : free ggml context in set / copy state data (close #1425) Georgi Gerganov 2023-05-13 09:08:52 +0300
  • 699b1ad7fe
    opencl : fix kernels for the new formats (#1422) Henri Vasserman 2023-05-13 09:01:15 +0300
  • fb62f92433
    llama : fix --mtest option (close #1414) Georgi Gerganov 2023-05-12 21:44:20 +0300
  • 773ee249fb
    CLI args use - instead of _, backwards compatible (#1416) Johannes Gäßler 2023-05-12 16:34:55 +0200
  • 553fd4d4b5
    Add clang-tidy reviews to CI (#1407) slaren 2023-05-12 15:40:53 +0200
  • 089b1c93ba
    readme : add C#/.NET bindings repo (#1409) Rinne 2023-05-12 13:39:40 +0800
  • b9fd7eee57
    ggml : remove bit shuffling (#1405) Georgi Gerganov 2023-05-12 00:23:08 +0300
  • b608b55a3e
    prompts : model agnostic DAN (#1304) CRD716 2023-05-11 10:10:19 -0500
  • cf348a60e0
    main : add option to save full output to session (#1338) Evan Jones 2023-05-10 11:37:14 -0400
  • e6a46b0ed1
    Locale fix for Windows (#1379) DannyDaemonic 2023-05-09 10:53:28 -0700
  • 9f8dbc4787
    use pause asm insn in busyloop to run the CPU (13600K) 10 °C cooler (#1314) Sami Farin 2023-05-09 15:29:20 +0300
  • 41654efea8
    Interface improvements and --multiline-input (previously --author-mode) (#1040) DannyDaemonic 2023-05-08 19:45:48 -0700
  • 56551bc11f
    readme : add notice about upcoming breaking change Georgi Gerganov 2023-05-08 22:52:18 +0300
  • fe60904eef
    readme : add TOC and Pygmalion instructions (#1359) AlpinDale 2023-05-08 21:03:30 +0430
  • 003ba2fb43
    llama : fix hparams shadow (#1367) Pavol Rusnak 2023-05-08 16:48:21 +0200
  • f9a6364912
    llama : require first token to be BOS (#1303) Georgi Gerganov 2023-05-08 17:41:54 +0300
  • 95078cc554
    convert: add ability to convert safetensors files (#1276) ubik2 2023-05-08 04:54:26 -0700
  • 1f48b0abcf
    Documented CUDA reproducibility, added warning (#1346) Johannes Gäßler 2023-05-08 02:42:01 +0200
  • e1295513a4
    CI: add Windows CLBlast and OpenBLAS builds (#1277) Henri Vasserman 2023-05-07 14:20:09 +0300
  • 1b0fd45465
    ggml : Allow usage of CLBlast alongside Accelerate.framework (#1336) swittk 2023-05-07 10:03:23 +0700
  • 3924088512
    Remove default arguments from sampling functions (#1343) Jed Fox 2023-05-06 17:01:47 -0400
  • 173d0e6419
    makefile: automatic Arch Linux detection (#1332) DaniAndTheWeb 2023-05-05 23:57:14 +0200
  • a3b85b28da
    ci : add cublas to windows release (#1271) Erik Scholz 2023-05-05 22:56:09 +0200
  • 921dcee00a
    readme: add missing info (#1324) Pavol Rusnak 2023-05-05 16:43:36 +0200
  • 2d13786e91
    Fix for OpenCL / clbast builds on macOS. (#1329) Ionoclast Laboratories 2023-05-05 08:18:21 -0400
  • a90e96b266
    Convert.py @staticmethod (#1327) Benjamin Lecaillon 2023-05-05 02:17:07 +0200
  • 94c5652fc0
    quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301) slaren 2023-05-05 00:58:56 +0200
  • 34d9f22f44
    Wrap exceptions in std::exception to verbose output on exception. (#1316) Ivan Stepanov 2023-05-04 19:56:27 +0300
  • d3e8093e9b
    convert: support DT_BF16 tensors (#1309) Ivan Stepanov 2023-05-04 19:54:37 +0300
  • 360cfe5bec
    readme : add OpenBuddy link (#1321) 44670 2023-05-05 00:33:31 +0800
  • 2edbdb0f99
    main : add --in-suffix option (#1318) 44670 2023-05-04 23:41:12 +0800
  • 20fbf2a2a0
    ggml : change immintrin.h to intrin.h for compatibility (#1307) Ron Jailall 2023-05-04 11:05:59 -0400
  • db1080876a
    Only escape prompts when used with -e (#1311) DannyDaemonic 2023-05-04 05:08:25 -0700
  • c65a7fbfa9
    Update main's README.md with new features (#1296) DannyDaemonic 2023-05-04 03:02:59 -0700
  • f647ce040f
    fix #1224 reverse prompt and multi line (#1297) Tomas 2023-05-04 17:02:30 +0700
  • 799fdc1b5d
    ggml : vectorize Q8_0 quantization Georgi Gerganov 2023-05-03 23:24:20 +0300
  • 6daa09d879
    examples : read chat prompts from a template file (#1196) khimaros 2023-05-03 10:58:11 -0700
  • bca9ad938a
    minor : fix whitespaces (#1302) Georgi Gerganov 2023-05-03 20:09:42 +0300
  • e2a937ca6a
    minor : fix trailing whitespaces Georgi Gerganov 2023-05-03 18:43:23 +0300
  • b0c71c7b6d
    scripts : platform independent script to verify sha256 checksums (#1203) KASR 2023-05-03 17:31:28 +0200
  • a8a2efdc81
    examples : various prompt and example fixes (#1298) CRD716 2023-05-03 10:26:47 -0500
  • e216aa0463
    llama : only copy used KV cache in get / set state (#1272) Evan Jones 2023-05-02 22:26:13 -0400
  • 2485d7a4d3
    Process escape sequences given in prompts (#1173) DannyDaemonic 2023-05-02 18:46:20 -0700
  • 13b0c68ed7
    Handle signals properly on Windows (#1123) DannyDaemonic 2023-05-02 18:01:57 -0700
  • 55bc5f0900
    Call sh on build-info.sh (#1294) DannyDaemonic 2023-05-02 17:52:35 -0700
  • 9daff419f6
    fix build-info.h for git submodules (#1289) kuvaus 2023-05-03 03:43:43 +0300
  • bf4b22ffe4
    fix missing parameters in llama_init_from_gpt_params (#1293) slaren 2023-05-03 01:36:45 +0200
  • 67c77799e0
    examples : add llama_init_from_gpt_params() common function (#1290) Ron Evans 2023-05-02 22:39:51 +0200
  • 0e6cbff1b7
    llama : fix compile warnings Georgi Gerganov 2023-05-02 23:09:08 +0300
  • 5d5817ca60
    ggml : fix 32-bit ARM Georgi Gerganov 2023-05-02 22:14:50 +0300
  • 8c9be35ff9
    examples : improve vertical alignment of a few variables (#1286) Ron Evans 2023-05-02 19:53:52 +0200
  • cc0bb7235c
    ggml : fix ppc64le build error and make cmake detect Power processors (#1284) Marvin Gießing 2023-05-02 18:42:16 +0200
  • 2bb992f034
    llama : allow 0 as a seed number. (#1275) Robert Brisita 2023-05-02 12:23:44 -0400
  • e2cd506999
    main : switch input_noecho to input_echo to remove negation (#979) Ron Evans 2023-05-02 18:13:26 +0200