vllm bench sweep serve

JSON CLI Arguments

When passing JSON CLI arguments, the following sets of arguments are equivalent:

  • --json-arg '{"key1": "value1", "key2": {"key3": "value2"}}'
  • --json-arg.key1 value1 --json-arg.key2.key3 value2

Additionally, list elements can be passed individually using +:

  • --json-arg '{"key4": ["value3", "value4", "value5"]}'
  • --json-arg.key4+ value3 --json-arg.key4+='value4,value5'
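
As a further illustration, assuming a recent vLLM in which vllm serve accepts a JSON-valued --speculative-config flag (used here purely as an example of such an argument), the same rules would make these two invocations equivalent:

  • vllm serve <model> --speculative-config '{"method": "ngram", "num_speculative_tokens": 5}'
  • vllm serve <model> --speculative-config.method ngram --speculative-config.num_speculative_tokens 5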

Options

--serve-cmd

The command used to run the server: vllm serve ...

Default: None

--bench-cmd

The command used to run the benchmark: vllm bench serve ...

Default: None

--after-bench-cmd

After a benchmark run is complete, invoke this command instead of the default ServerWrapper.clear_cache().

Default: None

--show-stdout

If set, logs the standard output of subcommands. Useful for debugging but can be quite spammy.

Default: False

--serve-params

Path to a JSON file containing a list of parameter combinations for the vllm serve command. If both serve_params and bench_params are given, this script will iterate over their Cartesian product.

Default: None
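
As a sketch of what such a file can look like (the swept flags max_num_seqs and max_num_batched_tokens are arbitrary examples, not required keys), each element of the list is one combination of vllm serve arguments:

  [
    {"max_num_seqs": 32,  "max_num_batched_tokens": 2048},
    {"max_num_seqs": 64,  "max_num_batched_tokens": 4096},
    {"max_num_seqs": 128, "max_num_batched_tokens": 8192}
  ]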

--bench-params

Path to a JSON file containing a list of parameter combinations for the vllm bench serve command. If both serve_params and bench_params are given, this script will iterate over their Cartesian product.

Default: None
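
A bench_params.json file follows the same shape; the keys below (request_rate, num_prompts) are again only illustrative vllm bench serve arguments. Combined with the three serve combinations sketched above, the sweep would cover 3 x 2 = 6 parameter combinations, each executed --num-runs times:

  [
    {"request_rate": 1,  "num_prompts": 200},
    {"request_rate": 10, "num_prompts": 1000}
  ]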

-o, --output-dir

The directory to which results are written.

Default: results

--num-runs

Number of runs per parameter combination.

Default: 3

--dry-run

If set, prints the commands to run, then exits without executing them.

Default: False
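
Putting the pieces together, a dry run such as the following (the model name and the dataset flags inside --bench-cmd are placeholders, not required values) only prints the expanded command matrix:

  vllm bench sweep serve \
    --serve-cmd 'vllm serve meta-llama/Llama-3.1-8B-Instruct' \
    --bench-cmd 'vllm bench serve --model meta-llama/Llama-3.1-8B-Instruct --dataset-name random' \
    --serve-params serve_params.json \
    --bench-params bench_params.json \
    -o results \
    --num-runs 1 \
    --dry-run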

--resume

Set this to the name of a timestamped directory under output_dir to resume a previous execution of this script; only parameter combinations that do not yet have output files will be run.

Default: None
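
For example, if an earlier sweep wrote its results under results/2025-01-01_00-00-00 (the timestamp directory name here is purely hypothetical), a resuming invocation that reuses the same commands and parameter files might look like:

  vllm bench sweep serve \
    --serve-cmd 'vllm serve meta-llama/Llama-3.1-8B-Instruct' \
    --bench-cmd 'vllm bench serve --model meta-llama/Llama-3.1-8B-Instruct --dataset-name random' \
    --serve-params serve_params.json \
    --bench-params bench_params.json \
    -o results \
    --resume 2025-01-01_00-00-00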