# llm-perf-leaderboard / hardware.yaml
- machine: 1xA10
  description: A10-24GB-150W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
  backends:
    - pytorch
- machine: 1xA100
  description: A100-80GB-275W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch
- machine: 1xT4
  description: T4-16GB-70W 🖥️
  hardware_provider: nvidia
  hardware_type: cuda
  subsets:
    - unquantized
    - awq
    - bnb
    - gptq
    - torchao
  backends:
    - pytorch
- machine: 32vCPU-C7i
  description: Intel-Xeon-SPR-385W 🖥️
  detail: |
    We tested the [32vCPU AWS C7i](https://aws.amazon.com/ec2/instance-types/c7i/) instance for this benchmark. The reported memory requirement is the peak RAM consumption during the decode phase.
  hardware_provider: intel
  hardware_type: cpu
  subsets:
    - unquantized
  backends:
    - pytorch
    - openvino
    - onnxruntime
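
# A minimal sketch of how a consumer (e.g. a leaderboard frontend) might read
# this file to look up which quantization subsets are benchmarked on a given
# machine. The helper name `subsets_for` is illustrative and not part of the
# actual repository; the snippet assumes PyYAML is installed and inlines a
# trimmed copy of the config so it runs standalone.
#
# ```python
# import yaml
#
# # Trimmed inline copy of hardware.yaml (illustrative; normally read from disk).
# HARDWARE_YAML = """\
# - machine: 1xA10
#   subsets: [unquantized, awq, bnb, gptq]
#   backends: [pytorch]
# - machine: 32vCPU-C7i
#   subsets: [unquantized]
#   backends: [pytorch, openvino, onnxruntime]
# """
#
# def subsets_for(machine: str, config: list) -> list:
#     """Return the quantization subsets benchmarked on the given machine."""
#     for entry in config:
#         if entry["machine"] == machine:
#             return entry["subsets"]
#     raise KeyError(machine)
#
# config = yaml.safe_load(HARDWARE_YAML)
# print(subsets_for("1xA10", config))
# ```
#
# Written as YAML comments so the file itself stays valid YAML.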