Daniel Hiltgen | f6f759fc5f Detect CUDA OS Overhead | 6 months ago
Daniel Hiltgen | 9929751cc8 Disable concurrency for AMD + Windows | 6 months ago
Daniel Hiltgen | da3bf23354 Workaround gfx900 SDMA bugs | 7 months ago
Daniel Hiltgen | 6f351bf586 review comments and coverage | 7 months ago
Daniel Hiltgen | 4e2b7e181d Refactor intel gpu discovery | 7 months ago
Daniel Hiltgen | 6fd04ca922 Improve multi-gpu handling at the limit | 7 months ago
Daniel Hiltgen | 43ed358f9a Refine GPU discovery to bootstrap once | 8 months ago
Daniel Hiltgen | 8727a9c140 Record more GPU information | 8 months ago
Daniel Hiltgen | 34b9db5afc Request and model concurrency | 9 months ago
Michael Yang | 7e33a017c0 partial offloading | 9 months ago
Michael Yang | 91b3e4d282 update memory calcualtions | 9 months ago
Daniel Hiltgen | 6d84f07505 Detect AMD GPU info via sysfs and block old cards | 11 months ago
Daniel Hiltgen | 8da7bef05f Support multiple variants for a given llm lib type | 1 year ago
Jeffrey Morgan | c336693f07 calculate overhead based number of gpu devices (#1875) | 1 year ago
Daniel Hiltgen | a2ad952440 Fix windows system memory lookup | 1 year ago
Daniel Hiltgen | d966b730ac Switch windows build to fully dynamic | 1 year ago
Daniel Hiltgen | 7555ea44f8 Revamp the dynamic library shim | 1 year ago
Daniel Hiltgen | 35934b2e05 Adapted rocm support to cgo based llama.cpp | 1 year ago