Configurations
RaNNC’s runtime configurations can be set in the following two ways:

- Config file: RaNNC automatically loads a configuration file at ~/.pyrannc/rannc_conf.toml. Names of configuration items must be in lowercase. The path to the configuration file can be set with the environment variable RANNC_CONF_DIR.
- Environment variables: You can override a configuration item by setting an environment variable. Variable names follow the pattern RANNC_<CONF_ITEM_NAME> in uppercase. For example, you can set mem_margin in the following table with the variable RANNC_MEM_MARGIN.
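The naming rule and the override behavior described above can be sketched in Python. This is only an illustration, not RaNNC's own code: `env_var_name` and `resolve` are hypothetical helpers, and the precedence shown (environment variable, then config file, then default) is an assumption based on the statement that environment variables override a configuration item.

```python
import os

PREFIX = "RANNC_"  # prefix taken from the RANNC_<CONF_ITEM_NAME> pattern


def env_var_name(item_name: str) -> str:
    """Map a lowercase configuration item name to its environment variable name."""
    return PREFIX + item_name.upper()


def resolve(item_name, file_values, default, environ=os.environ):
    """Hypothetical lookup: environment variable first, then file value, then default.

    The environment value is returned as a raw string; type conversion is
    deliberately left out of this sketch.
    """
    env_value = environ.get(env_var_name(item_name))
    if env_value is not None:
        return env_value
    return file_values.get(item_name, default)


if __name__ == "__main__":
    print(env_var_name("mem_margin"))
    # Value from a (pretend) config file, no environment override.
    print(resolve("mem_margin", {"mem_margin": 0.2}, 0.1, environ={}))
    # The same item overridden through the environment.
    print(resolve("mem_margin", {"mem_margin": 0.2}, 0.1,
                  environ={"RANNC_MEM_MARGIN": "0.3"}))
```

In practice the variable would usually be exported in the shell or set in `os.environ` before the training process starts.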
Name | Default | Description |
---|---|---|
show_config_items | false | Show configurations on startup if set to true. |
mem_limit_gb | 0 | Set a memory limit per device in GB if a positive number is given. |
mem_margin | 0.1 | Memory margin for model partitioning. |
save_deployment | true | Save deployment of a partitioned model if set to true. |
load_deployment | false | Load deployment of a partitioned model if set to true. |
deployment_file | | Path of the deployment file to save/load. |
partition_num | 0 | Force the number of partitions for model parallelism if a positive value is set. |
min_pipeline | 1 | Minimum number of microbatches for pipeline parallelism. |
max_pipeline | 32 | Maximum number of microbatches for pipeline parallelism. |
opt_param_factor | 2 | Factor used to estimate the memory used by an optimizer. For example, set this item to 2 for Adam because the optimizer keeps two internal tensors, v and s, each the same size as the parameter tensors. |
trace_events | false | Trace internal events if set to true. Event tracing significantly degrades performance. |
event_trace_file | | Path to an event trace file. |
profile_by_acc | false | Estimate computation times and memory usage by accumulating the values of finer-grained subgraphs. This drastically reduces the time for partitioning, but the estimation becomes less accurate. |
sync_allreduce | false | Synchronize allreduce across all stages in pipeline parallelism. |
partitioning_dry_run_np | 0 | Perform a dry run to determine model partitioning if a positive number is given. |
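To make the opt_param_factor description concrete, here is a rough arithmetic sketch. It assumes the factor simply multiplies the parameter memory, which is one plausible reading of the description above and not a statement of how RaNNC computes its estimate. The function name and the 4-bytes-per-parameter (fp32) default are illustrative assumptions.

```python
def estimate_optimizer_state_bytes(num_params: int,
                                   bytes_per_param: int = 4,
                                   opt_param_factor: int = 2) -> int:
    """Assumed model: optimizer state = factor * (parameter count * bytes per parameter)."""
    return opt_param_factor * num_params * bytes_per_param


if __name__ == "__main__":
    num_params = 1_000_000               # hypothetical model size
    param_bytes = num_params * 4         # fp32 parameters
    state_bytes = estimate_optimizer_state_bytes(num_params)
    # With factor 2 (Adam's v and s), the state is twice the parameter memory.
    print(param_bytes, state_bytes)
```

Under this reading, an optimizer that keeps no per-parameter state would use a factor of 0, and one that keeps a single buffer per parameter would use 1.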
The following is an example of the configuration file (~/.pyrannc/rannc_conf.toml):
opt_param_factor=2
mem_margin=0.1
min_pipeline=1
max_pipeline=32
save_deployment=true
load_deployment=false
trace_events=false
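A file like the one above can also be generated from a script. The sketch below is a minimal illustration: `render_conf` is a hypothetical helper (not part of RaNNC) that handles only flat booleans, numbers, and strings, and it writes to a temporary directory so that a real ~/.pyrannc/rannc_conf.toml is not touched.

```python
import tempfile
from pathlib import Path


def render_conf(items: dict) -> str:
    """Render flat key/value pairs as TOML lines with lowercase item names."""
    lines = []
    for name, value in items.items():
        if isinstance(value, bool):          # check bool before int: bool is an int subclass
            text = "true" if value else "false"
        elif isinstance(value, (int, float)):
            text = repr(value)
        else:                                # anything else is written as a quoted string
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            text = f'"{escaped}"'
        lines.append(f"{name.lower()}={text}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    conf = {
        "opt_param_factor": 2,
        "mem_margin": 0.1,
        "min_pipeline": 1,
        "max_pipeline": 32,
        "save_deployment": True,
        "load_deployment": False,
        "trace_events": False,
    }
    with tempfile.TemporaryDirectory() as conf_dir:
        path = Path(conf_dir) / "rannc_conf.toml"
        path.write_text(render_conf(conf))
        print(path.read_text())
```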