# Instance options
The key/value configuration is namespaced. The following namespaces are currently supported:
- `boot` (boot related options, timing, dependencies, …)
- `cloud-init` (cloud-init configuration)
- `environment` (environment variables)
- `image` (copy of the image properties at time of creation)
- `limits` (resource limits)
- `nvidia` (NVIDIA and CUDA configuration)
- `raw` (raw instance configuration overrides)
- `security` (security policies)
- `user` (storage for user properties, searchable)
- `volatile` (used internally by LXD to store internal data specific to an instance)
The currently supported keys are:
| Key | Type | Default | Live update | Condition | Description |
| --- | --- | --- | --- | --- | --- |
| `agent.nic_config` | bool | `false` | no | virtual machine | Set the name and MTU of the default network interfaces to be the same as the instance devices (this is automatic for containers) |
| `boot.autostart` | bool | - | no | - | Always start the instance when LXD starts (if not set, restore last state) |
| `boot.autostart.delay` | integer | `0` | no | - | Number of seconds to wait after the instance started before starting the next one |
| `boot.autostart.priority` | integer | `0` | no | - | What order to start the instances in (starting with highest) |
| `boot.host_shutdown_timeout` | integer | `30` | yes | - | Seconds to wait for instance to shutdown before it is force stopped |
| `boot.stop.priority` | integer | `0` | no | - | What order to shutdown the instances (starting with highest) |
| `cloud-init.network-config` | string | `DHCP on eth0` | no | - | Cloud-init network-config, content is used as seed value |
| `cloud-init.user-data` | string | `#cloud-config` | no | - | Cloud-init user-data, content is used as seed value |
| `cloud-init.vendor-data` | string | `#cloud-config` | no | - | Cloud-init vendor-data, content is used as seed value |
| `cluster.evacuate` | string | `auto` | no | - | What to do when evacuating the instance (`auto`, `migrate`, `live-migrate` or `stop`) |
| `environment.*` | string | - | yes (exec) | - | key/value environment variables to export to the instance and set on exec |
| `limits.cpu` | string | - | yes | - | Number or range of CPUs to expose to the instance (defaults to 1 CPU for VMs) |
| `limits.cpu.allowance` | string | `100%` | yes | container | How much of the CPU can be used. Can be a percentage (e.g. `50%`) for a soft limit or a chunk of time (`25ms/100ms`) for a hard limit |
| `limits.cpu.priority` | integer | `10` | yes | container | CPU scheduling priority compared to other instances sharing the same CPUs (overcommit) (integer between 0 and 10) |
| `limits.disk.priority` | integer | `5` | yes | - | When under load, how much priority to give to the instance's I/O requests (integer between 0 and 10) |
| `limits.hugepages.64KB` | string | - | yes | container | Fixed value in bytes (various suffixes supported, see Units for storage and network limits) to limit number of 64 KB huge pages (Available huge-page sizes are architecture dependent.) |
| `limits.hugepages.1MB` | string | - | yes | container | Fixed value in bytes (various suffixes supported, see Units for storage and network limits) to limit number of 1 MB huge pages (Available huge-page sizes are architecture dependent.) |
| `limits.hugepages.2MB` | string | - | yes | container | Fixed value in bytes (various suffixes supported, see Units for storage and network limits) to limit number of 2 MB huge pages (Available huge-page sizes are architecture dependent.) |
| `limits.hugepages.1GB` | string | - | yes | container | Fixed value in bytes (various suffixes supported, see Units for storage and network limits) to limit number of 1 GB huge pages (Available huge-page sizes are architecture dependent.) |
| `limits.kernel.*` | string | - | no | container | This limits kernel resources per instance (e.g. number of open files) |
| `limits.memory` | string | - | yes | - | Percentage of the host's memory or fixed value in bytes (various suffixes supported, see Units for storage and network limits) (defaults to `1GiB` for VMs) |
| `limits.memory.enforce` | string | `hard` | yes | container | If `hard`, the instance can't exceed its memory limit. If `soft`, the instance can exceed its memory limit when extra host memory is available |
| `limits.memory.hugepages` | bool | `false` | no | virtual machine | Controls whether to back the instance using huge pages rather than regular system memory |
| `limits.memory.swap` | bool | `true` | yes | container | Controls whether to encourage/discourage swapping less used pages for this instance |
| `limits.memory.swap.priority` | integer | `10` | yes | container | The higher this is set, the less likely the instance is to be swapped to disk (integer between 0 and 10) |
| `limits.network.priority` | integer | `0` | yes | - | When under load, how much priority to give to the instance's network requests (integer between 0 and 10) |
| `limits.processes` | integer | - (max) | yes | container | Maximum number of processes that can run in the instance |
| `linux.kernel_modules` | string | - | yes | container | Comma-separated list of kernel modules to load before starting the instance |
| `linux.sysctl.*` | string | - | no | container | Allows modifying sysctl settings |
| `migration.incremental.memory` | bool | `false` | yes | container | Incremental memory transfer of the instance's memory to reduce downtime |
| `migration.incremental.memory.goal` | integer | `70` | yes | container | Percentage of memory to have in sync before stopping the instance |
| `migration.incremental.memory.iterations` | integer | `10` | yes | container | Maximum number of transfer operations to go through before stopping the instance |
| `migration.stateful` | bool | `false` | no | virtual machine | Allow for stateful stop/start and snapshots. This will prevent the use of some features that are incompatible with it |
| `nvidia.driver.capabilities` | string | `compute,utility` | no | container | What driver capabilities the instance needs (sets `libnvidia-container` `NVIDIA_DRIVER_CAPABILITIES`) |
| `nvidia.runtime` | bool | `false` | no | container | Pass the host NVIDIA and CUDA runtime libraries into the instance |
| `nvidia.require.cuda` | string | - | no | container | Version expression for the required CUDA version (sets `libnvidia-container` `NVIDIA_REQUIRE_CUDA`) |
| `nvidia.require.driver` | string | - | no | container | Version expression for the required driver version (sets `libnvidia-container` `NVIDIA_REQUIRE_DRIVER`) |
| `raw.apparmor` | blob | - | yes | - | AppArmor profile entries to be appended to the generated profile |
| `raw.idmap` | blob | - | no | unprivileged container | Raw idmap configuration (e.g. `both 1000 1000`) |
| `raw.lxc` | blob | - | no | container | Raw LXC configuration to be appended to the generated one |
| `raw.qemu` | blob | - | no | virtual machine | Raw QEMU configuration to be appended to the generated command line |
| `raw.qemu.conf` | blob | - | no | virtual machine | Addition/override to the generated `qemu.conf` file |
| `raw.seccomp` | blob | - | no | container | Raw Seccomp configuration |
| `security.devlxd` | bool | `true` | no | - | Controls the presence of `/dev/lxd` in the instance |
| `security.devlxd.images` | bool | `false` | no | container | Controls the availability of the `/1.0/images` API over `devlxd` |
| `security.idmap.base` | integer | - | no | unprivileged container | The base host ID to use for the allocation (overrides auto-detection) |
| `security.idmap.isolated` | bool | `false` | no | unprivileged container | Use an idmap for this instance that is unique among instances with `isolated` set |
| `security.idmap.size` | integer | - | no | unprivileged container | The size of the idmap to use |
| `security.nesting` | bool | `false` | yes | container | Support running LXD (nested) inside the instance |
| `security.privileged` | bool | `false` | no | container | Runs the instance in privileged mode |
| `security.protection.delete` | bool | `false` | yes | - | Prevents the instance from being deleted |
| `security.protection.shift` | bool | `false` | yes | container | Prevents the instance's file system from being UID/GID shifted on startup |
| `security.agent.metrics` | bool | `true` | no | virtual machine | Controls whether the `lxd-agent` is queried for state information and metrics |
| `security.secureboot` | bool | `true` | no | virtual machine | Controls whether UEFI secure boot is enabled with the default Microsoft keys |
| `security.syscalls.allow` | string | - | no | container | A `\n` separated list of syscalls to allow (mutually exclusive with `security.syscalls.deny*`) |
| `security.syscalls.deny` | string | - | no | container | A `\n` separated list of syscalls to deny |
| `security.syscalls.deny_compat` | bool | `false` | no | container | On `x86_64` this enables blocking of `compat_*` syscalls, it is a no-op on other architectures |
| `security.syscalls.deny_default` | bool | `true` | no | container | Enables the default syscall deny |
| `security.syscalls.intercept.bpf` | bool | `false` | no | container | Handles the `bpf` system call |
| `security.syscalls.intercept.bpf.devices` | bool | `false` | no | container | Allows `bpf` programs for the devices cgroup in the unified hierarchy to be loaded |
| `security.syscalls.intercept.mknod` | bool | `false` | no | container | Handles the `mknod` and `mknodat` system calls (allows creation of a limited subset of char/block devices) |
| `security.syscalls.intercept.mount` | bool | `false` | no | container | Handles the `mount` system call |
| `security.syscalls.intercept.mount.allowed` | string | - | yes | container | Specify a comma-separated list of file systems that are safe to mount for processes inside the instance |
| `security.syscalls.intercept.mount.fuse` | string | - | yes | container | Whether to redirect mounts of a given file system to their FUSE implementation (e.g. `ext4=fuse2fs`) |
| `security.syscalls.intercept.mount.shift` | bool | `false` | yes | container | Whether to mount `shiftfs` on top of file systems mounted with `mount` syscall interception |
| `security.syscalls.intercept.sched_setscheduler` | bool | `false` | no | container | Handles the `sched_setscheduler` system call (allows increasing process priority) |
| `security.syscalls.intercept.setxattr` | bool | `false` | no | container | Handles the `setxattr` system call (allows setting a limited subset of restricted extended attributes) |
| `security.syscalls.intercept.sysinfo` | bool | `false` | no | container | Handles the `sysinfo` system call (to provide cgroup-based resource usage information) |
| `snapshots.schedule` | string | - | no | - | Cron expression (`<minute> <hour> <day-of-month> <month> <day-of-week>`), or empty to disable automatic snapshots (the default) |
| `snapshots.schedule.stopped` | bool | `false` | no | - | Controls whether to automatically snapshot stopped instances |
| `snapshots.pattern` | string | `snap%d` | no | - | Pongo2 template string which represents the snapshot name (used for scheduled snapshots and unnamed snapshots) |
| `snapshots.expiry` | string | - | no | - | Controls when snapshots are to be deleted (expects expression like `1M 2H 3d 4w 5m 6y`) |
| `user.*` | string | - | no | - | Free form user key/value storage (can be used in search) |
The following volatile keys are currently internally used by LXD:
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `volatile.apply_template` | string | - | The name of a template hook which should be triggered upon next startup |
| `volatile.apply_nvram` | string | - | Whether or not to regenerate VM NVRAM on next start |
| `volatile.base_image` | string | - | The hash of the image the instance was created from, if any |
| `volatile.cloud-init.instance-id` | string | - | The `instance-id` (UUID) exposed to cloud-init |
| `volatile.evacuate.origin` | string | - | The origin (cluster member) of the evacuated instance |
| `volatile.idmap.base` | integer | - | The first ID in the instance's primary idmap range |
| `volatile.idmap.current` | string | - | The idmap currently in use by the instance |
| `volatile.idmap.next` | string | - | The idmap to use next time the instance starts |
| `volatile.last_state.idmap` | string | - | Serialized instance UID/GID map |
| `volatile.last_state.power` | string | - | Instance state as of last host shutdown |
| `volatile.vsock_id` | string | - | Instance `vsock` ID used as of last start |
| `volatile.uuid` | string | - | Instance UUID (globally unique across all servers and projects) |
| `volatile.<name>.apply_quota` | string | - | Disk quota to be applied on next instance start |
| `volatile.<name>.ceph_rbd` | string | - | RBD device path for Ceph disk devices |
| `volatile.<name>.host_name` | string | - | Network device name on the host |
| `volatile.<name>.hwaddr` | string | - | Network device MAC address (when no `hwaddr` property is set on the device itself) |
| `volatile.<name>.last_state.created` | string | - | Whether or not the network device physical device was created (`true` or `false`) |
| `volatile.<name>.last_state.mtu` | string | - | Network device original MTU used when moving a physical device into an instance |
| `volatile.<name>.last_state.hwaddr` | string | - | Network device original MAC used when moving a physical device into an instance |
| `volatile.<name>.last_state.vf.id` | string | - | SR-IOV Virtual function ID used when moving a VF into an instance |
| `volatile.<name>.last_state.vf.hwaddr` | string | - | SR-IOV Virtual function original MAC used when moving a VF into an instance |
| `volatile.<name>.last_state.vf.vlan` | string | - | SR-IOV Virtual function original VLAN used when moving a VF into an instance |
| `volatile.<name>.last_state.vf.spoofcheck` | string | - | SR-IOV Virtual function original spoof check setting used when moving a VF into an instance |
Additionally, the following user keys have become common with images (support isn't guaranteed):
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `user.meta-data` | string | - | Cloud-init meta-data, content is appended to seed value |
Note that while a type is defined above as a convenience, all values are stored as strings and should be exported over the REST API as strings (which makes it possible to support any extra values without breaking backward compatibility).
Those keys can be set using the `lxc` tool with:

`lxc config set <instance> <key> <value>`
Volatile keys can’t be set by the user and can only be set directly against an instance.
The raw keys allow direct interaction with the backend features that LXD itself uses. Setting them may very well break LXD in non-obvious ways and should be avoided whenever possible.
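For example (the instance name `c1` below is just a placeholder):

```
# Limit the instance to 2GiB of RAM
lxc config set c1 limits.memory 2GiB

# Read the value back
lxc config get c1 limits.memory

# Remove the limit again
lxc config unset c1 limits.memory
```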
## CPU limits

The CPU limits are implemented through a mix of the `cpuset` and `cpu` cgroup controllers.
`limits.cpu` results in CPU pinning through the `cpuset` controller. A set of CPUs (e.g. `1,2,3`) or a CPU range (e.g. `0-3`) can be specified.
When a number of CPUs is specified instead (e.g. `4`), LXD will do dynamic load-balancing of all instances that aren't pinned to specific CPUs, trying to spread the load on the machine. Instances will then be re-balanced every time an instance starts or stops, as well as whenever a CPU is added to the system.

To pin to a single CPU, you have to use the range syntax (e.g. `1-1`) to differentiate it from a number of CPUs.
`limits.cpu.allowance` drives either the CFS scheduler quotas when passed a time constraint, or the generic CPU shares mechanism when passed a percentage value.

The time constraint (e.g. `20ms/50ms`) is relative to one CPU worth of time, so to restrict to two CPUs worth of time, something like `100ms/50ms` should be used.
When using a percentage value, the limit will only be applied when under load and will be used to calculate the scheduler priority for the instance, relative to any other instance which is using the same CPU(s).
`limits.cpu.priority` is another knob which is used to compute that scheduler priority score when a number of instances sharing a set of CPUs have the same percentage of CPU assigned to them.
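As a quick sketch of how these interact (using a hypothetical container named `c1`):

```
# Pin the container to the first two CPUs
lxc config set c1 limits.cpu 0-1

# Soft limit: 20% of the shared CPUs, only enforced under load
lxc config set c1 limits.cpu.allowance 20%

# Hard limit: 25ms of CPU time per 100ms period (a quarter of one CPU)
lxc config set c1 limits.cpu.allowance 25ms/100ms
```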
## VM CPU topology
LXD virtual machines default to having just one vCPU allocated, which shows up as matching the host CPU vendor and type, but has a single core and no threads.

When `limits.cpu` is set to a single integer, this will cause multiple vCPUs to be allocated and exposed to the guest as full cores. Those vCPUs will not be pinned to specific physical cores on the host.
The number of vCPUs can be updated while the VM is running.
When `limits.cpu` is set to a range or comma-separated list of CPU IDs (as provided by `lxc info --resources`), then the vCPUs will be pinned to those physical cores. In this scenario, LXD will check whether the CPU configuration lines up with a realistic hardware topology and, if it does, it will replicate that topology in the guest.
When doing CPU pinning, it is not possible to change the configuration while the VM is running.
This means that if the pinning configuration includes 8 threads, with each pair of threads coming from the same core and an even number of cores spread across two CPUs, LXD will have the guest show two CPUs, each with two cores and each core with two threads. The NUMA layout is similarly replicated, and in this scenario the guest would most likely end up with two NUMA nodes, one for each CPU socket.
In such an environment with multiple NUMA nodes, the memory will similarly be divided across NUMA nodes and be pinned accordingly on the host and then exposed to the guest.
All this allows for very high performance operations in the guest as the guest scheduler can properly reason about sockets, cores and threads as well as consider NUMA topology when sharing memory or moving processes across NUMA nodes.
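A minimal sketch of both modes, assuming a VM named `vm1`:

```
# Allocate 4 vCPUs, load-balanced by LXD rather than pinned
lxc config set vm1 limits.cpu 4

# Pin to four specific host CPUs instead (IDs as shown by `lxc info --resources`);
# LXD will replicate the matching host topology in the guest
lxc config set vm1 limits.cpu 0,1,2,3
```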
## Huge page limits via `limits.hugepages.[size]`
LXD allows limiting the number of huge pages available to a container through the `limits.hugepages.[size]` key. Limiting huge pages is done through the `hugetlb` cgroup controller. This means the host system needs to expose the `hugetlb` controller in the legacy or unified cgroup hierarchy for these limits to apply.
Note that architectures often expose multiple huge-page sizes, and the set of available sizes differs between architectures.
Limiting huge pages is especially useful when LXD is configured to intercept the `mount` syscall for the `hugetlbfs` file system in unprivileged containers. When LXD intercepts a `hugetlbfs` `mount` syscall, it will mount the `hugetlbfs` file system for a container with correct `uid` and `gid` values as mount options. This makes it possible to use huge pages from unprivileged containers. However, it is recommended to limit the number of huge pages available to the container through `limits.hugepages.[size]` to stop the container from being able to exhaust the huge pages available to the host.
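A sketch of this combination for a hypothetical unprivileged container `c1`:

```
# Intercept mount syscalls and mark hugetlbfs as safe to mount
lxc config set c1 security.syscalls.intercept.mount true
lxc config set c1 security.syscalls.intercept.mount.allowed hugetlbfs

# Cap the container at 1GiB worth of 2MB huge pages
lxc config set c1 limits.hugepages.2MB 1GiB
```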
## Resource limits via `limits.kernel.[limit name]`
LXD exposes a generic namespaced key `limits.kernel.*` which can be used to set resource limits for a given instance. It is generic in the sense that LXD will not perform any validation on the resource that is specified following the `limits.kernel.*` prefix. LXD cannot know about all the possible resources that a given kernel supports. Instead, LXD will simply pass down the corresponding resource key after the `limits.kernel.*` prefix and its value to the kernel. The kernel will do the appropriate validation. This allows users to specify any supported limit on their system. Some common limits are:
| Key | Resource | Description |
| --- | --- | --- |
| `limits.kernel.as` | `RLIMIT_AS` | Maximum size of the process's virtual memory |
| `limits.kernel.core` | `RLIMIT_CORE` | Maximum size of the process's coredump file |
| `limits.kernel.cpu` | `RLIMIT_CPU` | Limit in seconds on the amount of CPU time the process can consume |
| `limits.kernel.data` | `RLIMIT_DATA` | Maximum size of the process's data segment |
| `limits.kernel.fsize` | `RLIMIT_FSIZE` | Maximum size of files the process may create |
| `limits.kernel.locks` | `RLIMIT_LOCKS` | Limit on the number of file locks that this process may establish |
| `limits.kernel.memlock` | `RLIMIT_MEMLOCK` | Limit on the number of bytes of memory that the process may lock in RAM |
| `limits.kernel.nice` | `RLIMIT_NICE` | Maximum value to which the process's nice value can be raised |
| `limits.kernel.nofile` | `RLIMIT_NOFILE` | Maximum number of open files for the process |
| `limits.kernel.nproc` | `RLIMIT_NPROC` | Maximum number of processes that can be created for the user of the calling process |
| `limits.kernel.rtprio` | `RLIMIT_RTPRIO` | Maximum value on the real-time priority that may be set for this process |
| `limits.kernel.sigpending` | `RLIMIT_SIGPENDING` | Maximum number of signals that may be queued for the user of the calling process |
A full list of all available limits can be found in the manpages for the `getrlimit(2)`/`setrlimit(2)` system calls. To specify a limit within the `limits.kernel.*` namespace, use the resource name in lowercase without the `RLIMIT_` prefix, e.g. `RLIMIT_NOFILE` should be specified as `nofile`.
A limit is specified as two colon-separated values which are either numeric or the word `unlimited` (e.g. `limits.kernel.nofile=1000:2000`). A single value can be used as a shortcut to set both soft and hard limit to the same value (e.g. `limits.kernel.nofile=3000`). A resource with no explicitly configured limitation will be inherited from the process starting up the instance. Note that this inheritance is not enforced by LXD but by the kernel.
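For example, applied to a hypothetical container `c1`:

```
# Soft limit of 1000 and hard limit of 2000 open files
lxc config set c1 limits.kernel.nofile 1000:2000

# Single value shortcut: soft and hard limit both set to unlimited
lxc config set c1 limits.kernel.memlock unlimited
```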
## Snapshot scheduling and configuration
LXD supports scheduled snapshots which can be created at most once every minute. There are three configuration options:

- `snapshots.schedule` takes a shortened cron expression: `<minute> <hour> <day-of-month> <month> <day-of-week>`. If this is empty (the default), no snapshots will be created.
- `snapshots.schedule.stopped` controls whether to automatically snapshot stopped instances. It defaults to `false`.
- `snapshots.pattern` takes a Pongo2 template string to format the snapshot name. To name snapshots with time stamps, the Pongo2 context variable `creation_date` can be used. Be aware that you should format the date (e.g. use `{{ creation_date|date:"2006-01-02_15-04-05" }}`) in your template string to avoid forbidden characters in the snapshot name. Another way to avoid name collisions is to use the placeholder `%d`. If a snapshot with the same name (excluding the placeholder) already exists, all existing snapshot names will be taken into account to find the highest number at the placeholder's position. This number will be incremented by one for the new name. The starting number if no snapshot exists will be `0`. The default behavior of `snapshots.pattern` is equivalent to a format string of `snap%d`.
Example of using Pongo2 syntax to format snapshot names with timestamps:

`lxc config set INSTANCE snapshots.pattern "{{ creation_date|date:'2006-01-02_15-04-05' }}"`

This results in snapshots named `{date/time of creation}` down to the precision of a second.
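A fuller scheduling sketch (again with a hypothetical instance `c1`):

```
# Snapshot every day at 06:00, including while the instance is stopped
lxc config set c1 snapshots.schedule "0 6 * * *"
lxc config set c1 snapshots.schedule.stopped true

# Expire each snapshot two weeks after creation
lxc config set c1 snapshots.expiry 2w
```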