Michael Yang | e40145a39d lint | 7 months ago |
Daniel Hiltgen | 34b9db5afc Request and model concurrency | 9 months ago |
Daniel Hiltgen | de2fbdec99 Merge pull request #1819 from dhiltgen/multi_variant | 1 year ago |
Daniel Hiltgen | 39928a42e8 Always dynamically load the llm server library | 1 year ago |
Fabian Preiß | 3bc8b9832b fix gpu_test.go Error (same type) uint64->uint32 (#1921) | 1 year ago |
Jeffrey Morgan | c336693f07 calculate overhead based number of gpu devices (#1875) | 1 year ago |
Daniel Hiltgen | a2ad952440 Fix windows system memory lookup | 1 year ago |
Daniel Hiltgen | d966b730ac Switch windows build to fully dynamic | 1 year ago |
Daniel Hiltgen | 35934b2e05 Adapted rocm support to cgo based llama.cpp | 1 year ago |