Welcome to Knot Resolver’s documentation!
Knot Resolver is an open-source implementation of a caching, validating DNS resolver.
Its modular architecture keeps the core tiny and efficient, and it also provides a state-machine-like API for extensions.
If you are a new user, please start with the Getting Started chapter.
As a first step, configure your system to use upstream repositories which have
the latest version of Knot Resolver. Follow the instructions below for your
distribution.
Note
Please note that the packages available in distribution repositories of Debian and Ubuntu are outdated. Make sure to follow these steps to use our upstream repositories.
The main way to run Knot Resolver is to use the provided systemd integration.
$ sudo systemctl start knot-resolver.service
See logs and status of the running instance with the systemctl status knot-resolver.service command.
For more information about systemd integration see man knot-resolver.systemd.
Warning
knot-resolver.service is not enabled by default, thus Knot Resolver won’t start automatically after reboot.
To start and enable the service in one command, use systemctl enable --now knot-resolver.service
Unfortunately, in some cases (typically Docker and minimalistic systems) systemd is not available, and it is therefore not possible to use knot-resolver.service.
If you have this problem, look at the usage without systemd section.
Note
If for some reason you need to use Knot Resolver as it was before version 6, check out the usage without the manager section.
Otherwise, it is recommended to stick to this chapter.
After installation and first startup, Knot Resolver's default configuration accepts queries on the loopback interface. This allows you to test that the installation and service startup were successful before continuing with configuration.
For instance, you can use the DNS lookup utility kdig to send DNS queries. The kdig command is provided by the following packages:
Distribution    package with kdig
Arch            knot
CentOS          knot-utils
Debian          knot-dnsutils
Fedora          knot-utils
OpenSUSE        knot-utils
Ubuntu          knot-dnsutils
The following query should return a list of the root name servers:
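For example (a minimal check, assuming the resolver is listening on localhost, which is the default):

$ kdig @localhost . NS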
The easiest way to configure Knot Resolver is to put YAML configuration in the /etc/knot-resolver/config.yml file.
You can start exploring the configuration by continuing in this chapter or look at the complete configuration documentation.
Complete examples of configuration files can be found here.
Examples are also installed as documentation files, typically in /usr/share/doc/knot-resolver/examples/ directory (location may be different based on your Linux distribution).
Tip
You can use the kresctl utility to validate your configuration before pushing it into the running resolver.
It should help prevent many typos in the configuration.
$ kresctl validate /etc/knot-resolver/config.yml
If you update the configuration file while Knot Resolver is running, you can force the resolver to reload it by invoking a systemd reload command.
$ systemctl reload knot-resolver.service
Note
Reloading configuration can fail even when your configuration is valid, because some options cannot be changed while running. You can always find an explanation of the error in the log, accessed by the journalctl -eu knot-resolver command.
The first thing you will probably want to configure are the network interfaces to listen to.
The following example instructs the resolver to receive standard unencrypted DNS queries on 192.0.2.1 and 2001:db8::1 IP addresses.
Encrypted DNS queries using DNS-over-TLS protocol are accepted on all IP addresses of eth0 network interface, TCP port 853.
network:
  listen:
    - interface: ['192.0.2.1', '2001:db8::1'] # port 53 is default
    - interface: 'eth0'
      port: 853
      kind: 'dot' # DNS-over-TLS
For more details look at the network configuration.
Warning
On machines with multiple IP addresses on the same interface avoid listening on wildcards 0.0.0.0 or ::.
Knot Resolver could answer from different IP addresses if the network address ranges overlap, and clients would refuse such a response.
An internal-only domain is a domain not accessible from the public Internet.
In order to resolve internal-only domains, a query policy has to be added to forward queries to the correct internal server.
The following configuration forwards the two listed domains to a DNS server with IP address 192.0.2.44, as sketched below.
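A minimal sketch of such a configuration; the two domain names are hypothetical placeholders, only the server address 192.0.2.44 comes from the description above:

forward:
  # hypothetical internal-only domains
  - subtree: company.example.net
    servers: ['192.0.2.44']
  - subtree: internal.example.net
    servers: ['192.0.2.44']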
The following configuration is typical for Internet Service Providers who offer DNS resolver
service to their own clients in their own network. Please note that running a public DNS resolver
is more complicated and not covered by this example.
With the exception of public resolvers, a DNS resolver should resolve only queries sent by clients in its own network. This restriction limits the attack surface on the resolver itself and also on the rest of the Internet.
In situations where access to the DNS resolver is not limited using an IP firewall, you can implement access restrictions which combine query source information with policy rules.
The following configuration allows only queries from clients in subnet 192.0.2.0/24 and refuses all the rest; a sketch follows.
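A sketch of such a restriction, assuming the views section of the declarative configuration:

views:
  # refuse everything that hasn't matched a more specific rule
  - subnets: [0.0.0.0/0, '::/0']
    answer: refused
  # allow clients from the local subnet
  - subnets: [192.0.2.0/24]
    answer: allow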
Clients increasingly demand secure transport for DNS queries between the client machine and the DNS resolver.
The recommended way to achieve this is to start a DNS-over-TLS server and also accept encrypted queries.
First step is to enable TLS on listening interfaces:
network:
  listen:
    - interface: ['192.0.2.1', '2001:db8::1']
      kind: 'dot' # DNS-over-TLS, port 853 is default
By default a self-signed certificate is generated.
The second step is then obtaining and configuring your own TLS certificates signed by a trusted CA.
Once the certificate is obtained, paths to the certificate files can be specified, as sketched below.
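A sketch with hypothetical file paths; adjust them to wherever your certificate and key actually live:

network:
  tls:
    cert-file: '/etc/knot-resolver/server-cert.pem'
    key-file: '/etc/knot-resolver/server-key.pem'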
DNS queries can be used to gather data about user behavior.
Knot Resolver can be configured to forward DNS queries elsewhere,
and to protect them from eavesdropping by TLS encryption.
Warning
Recent research has shown that encrypting DNS traffic is not sufficient to protect the privacy of users.
For this reason we recommend all users to use a full VPN instead of encrypting just DNS queries.
The following configuration is provided only for users who cannot encrypt all their traffic.
For more information please see following articles:
Simran Patil and Nikita Borisov. 2019. What can you learn from an IP? (slides, the article itself)
Forwarding over the TLS protocol protects DNS queries sent out by the resolver.
It can be configured using TLS forwarding, which provides methods for authentication; a sketch follows.
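A sketch in the declarative configuration; the CZ.NIC ODVR addresses and hostname serve only as an example upstream, substitute your chosen provider:

forward:
  - subtree: .
    servers:
      - address: ['2001:148f:fffe::1', '185.43.135.1']
        transport: tls
        hostname: odvr.nic.cz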
See the list of DNS Privacy Test Servers supporting DNS-over-TLS to test your configuration.
With the use of the slice function, it is possible to split the
entire DNS namespace into distinct “slices”. When used in conjunction with
TLS forwarding, it is possible to forward different queries to different
remote resolvers. As a result, no single remote resolver will get a complete list
of all queries performed by this client.
Warning
Beware that this method has not been scientifically tested and there might be
types of attacks which will allow remote resolvers to infer more information about the client.
Again: if possible, encrypt all your traffic, not just DNS queries!
Knot Resolver’s cache contains data clients queried for.
If you are concerned about attackers who are able to get access to your
computer system in a powered-off state, and your storage device is not secured by
encryption, you can move the cache to tmpfs.
See chapter Persistence.
The configuration file is by default named /etc/knot-resolver/config.yml.
A different configuration file can be loaded using the command line option
-c/--config.
The configuration has to pass a validation step before being used. The validation mainly
checks for conformance to our configuration-schema.
Tip
Whenever a configuration is loaded and the validation fails, we attempt to log a detailed
error message explaining what the problem was. For example, it could look like the following:
If you happen to find a rejected configuration with unhelpful or confusing error message, please report it as a bug.
Tip
An easy way to see the complete configuration structure is to look at the JSON schema representation.
The raw JSON schema is available at this link (valid only for the version of resolver this documentation was generated for).
For better readability of the schema, a graphical visualizer can be used, for example this one.
The configuration schema describes the structure of accepted configuration files (or objects via the API). While originally specified in Python source code, it can be visualized as a JSON schema.
The JSON schema can be obtained from a running Resolver by sending a HTTP GET request to the path /schema on the management socket (by default a Unix socket at /var/run/knot-resolver/manager.sock).
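For instance, a sketch of fetching it with curl over the Unix socket (assuming the default socket path mentioned above):

$ curl --unix-socket /var/run/knot-resolver/manager.sock http://localhost/schema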
The kresctl schema command outputs the schema of the currently installed version as well. It does not require a running resolver.
JSON schema for the most recent Knot Resolver version can be downloaded here.
As mentioned above, the JSON schema is NOT used to validate the configuration in Knot Resolver. It's the other way around: the validation process can generate a JSON schema that helps you understand the configuration structure. Some validation steps are however dynamic (for example, resolving of interface names), cannot be expressed using a JSON schema, and cannot even be completed without running the full resolver.
Note
When using the API to change configuration at runtime, your change can be rejected by the validation step even though Knot Resolver would start just fine with the given changed configuration. Some validation steps within the resolver are dynamic and depend on both your previous configuration and the new one. For example, if you try to change the management socket, the validation will fail even though the newly provided address is perfectly valid. Changing the management socket while running is not supported.
Most of the validation is however static, and you can use the kresctl validate command to check your configuration file for most errors before actually running the resolver.
Below you can find a flattened textual representation of the JSON schema. It is not meant to be read top to bottom, but it can be used as a quick lookup reference.
The first thing you will probably need to configure are the network interfaces to listen to.
The following configuration instructs Knot Resolver to receive standard unencrypted DNS queries on IP addresses 192.0.2.1 and 2001:db8::1.
Encrypted DNS queries are accepted using DNS-over-TLS protocol on all IP addresses configured on network interface eth0, TCP port 853.
network:
  listen:
    - interface: ['192.0.2.1', '2001:db8::1'] # unencrypted DNS on port 53 is default
    - interface: 'eth0'
      port: 853
      kind: 'dot'
Network interfaces to listen on and supported protocols are configured using the net.listen() function.
-- unencrypted DNS on port 53 is default
net.listen('192.0.2.1')
net.listen('2001:db8::1')
net.listen(net.eth0, 853, { kind = 'tls' })
Warning
On machines with multiple IP addresses on the same interface avoid listening on wildcards 0.0.0.0 or ::.
Knot Resolver could answer from different IP addresses if the network address ranges
overlap, and clients would refuse such a response.
Knot Resolver can be configured declaratively by using YAML files or the YAML/JSON HTTP API. However, there is another option. The actual worker processes (the kresd executable) speak a different configuration language: internally they use the Lua runtime and are configured in Lua.
Essentially, the declarative configuration is only used for validation and as an external interface. After validation, a Lua configuration is generated and passed into the individual kresd instances. You can see the generated configuration files within the resolver's working directory, or you can manually run the conversion of declarative configuration with the kresctl convert command, as sketched below.
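A sketch of such a conversion; whether the file path is passed as a positional argument is an assumption here, consult kresctl convert --help for the exact interface:

$ kresctl convert /etc/knot-resolver/config.yml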
Warning
While there are no plans of ever removing the Lua configuration, we do not guarantee the absence of backwards-incompatible changes. Starting with Knot Resolver version 6, we consider the Lua interface internal and subject to change. While we don't have any breaking changes planned for the foreseeable future, they might come.
Therefore, use this only when you don't have any other option. And please let us know about it; we might try to accommodate your use case in the declarative configuration.
The following configuration file snippet starts listening for unencrypted and also encrypted DNS queries on IP address 192.0.2.1, and sets the cache size.
-- this is a comment: listen for unencrypted queries
net.listen('192.0.2.1')
-- another comment: listen for queries encrypted using TLS on port 853
net.listen('192.0.2.1', 853, { kind = 'tls' })
-- 10 MB cache is suitable for a very small deployment
cache.size = 10 * MB
Tip
When copy&pasting examples from this manual please pay close
attention to brackets and also line ordering - order of lines matters.
The configuration language is in fact Lua script, so you can use full power
of this programming language. See article
Learn Lua in 15 minutes for a syntax overview.
When you modify the configuration file on disk, restart the resolver process for the
changes to take effect. See chapter Zero-downtime restarts if even short
outages are not acceptable for your deployment.
Besides a text configuration file, Knot Resolver also supports interactive and dynamic configuration using scripts or external systems, which is described in chapter Run-time reconfiguration. Throughout this manual we present examples for both usage types - static configuration in a text file (see above) and the interactive mode.
The interactive prompt is denoted by >, so all examples starting with > character are transcripts of user (or script) interaction with Knot Resolver and resolver’s responses. For example:
> -- this is a comment entered into interactive prompt
> -- comments have no effect here
> -- the next line shows a command entered interactively and its output
> log_level()
'notice'
> -- the previous line without > character is output from log_level() command
The following example demonstrates how to interactively list all currently loaded modules, and includes multi-line output:
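A sketch of such a listing; modules.list() is assumed here, and the multi-line output (one loaded module name per line) is elided since it depends on your configuration:

> modules.list()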
Knot Resolver functionality consists of separate modules, which allow you
to mix-and-match features you need without slowing down operation
by features you do not use.
This practically means that you need to load a module before using the features contained in it, for example:
-- load module and make dnstap features available
modules.load('dnstap')
-- configure dnstap features
dnstap.config({ socket_path = "/tmp/dnstap.sock" })
Ordering matters: you have to load a module first and configure it only after it is loaded.
Here is the full reference manual for module configuration:
This section describes configuration of network interfaces
and protocols. Please keep in mind that DNS resolvers act
as DNS server and DNS client at the same time,
and that these roles require different configuration.
This picture illustrates the different actors involved in the DNS resolution process
and the supported protocols, and clarifies what we call server configuration
and client configuration.
Attribution: Icons by Bernar Novalyi from the Noun Project
For resolver’s clients the resolver itself acts as a DNS server.
After receiving a query, the resolver will attempt to find the
answer in its cache. If the data requested by the resolver's
client is not available in the resolver's cache (a so-called cache miss),
the resolver will attempt to obtain the data from servers upstream
(closer to the source of information). At this point the resolver
itself acts like a DNS client and sends DNS queries to other servers.
By default the Knot Resolver works in recursive mode, i.e.
the resolver will contact authoritative servers on the Internet.
Optionally it can be configured in forwarding mode,
where cache-miss queries are forwarded to another DNS resolver
for processing.
Listen on addresses; port and flags are optional.
The addresses can be specified as a string or device.
Port 853 implies kind='tls' but it is always better to be explicit.
Freebind allows binding to a non-local or not yet available address.
net.listen('::1')
net.listen(net.lo, 53)
net.listen(net.eth0, 853, { kind = 'tls' })
net.listen('192.0.2.1', 53, { freebind = true })
net.listen({'127.0.0.1', '::1'}, 53, { kind = 'dns' })
net.listen('::', 443, { kind = 'doh2' })
net.listen('::', 8453, { kind = 'webmgmt' }) -- see http module
net.listen('/tmp/kresd-socket', nil, { kind = 'webmgmt' }) -- http module supports AF_UNIX
net.listen('eth0', 53, { kind = 'xdp' })
net.listen('192.0.2.123', 53, { kind = 'xdp', nic_queue = 0 })
Warning
On machines with multiple IP addresses avoid listening on wildcards
0.0.0.0 or ::. Knot Resolver could answer from different IP
addresses if the network address ranges overlap,
and clients would probably refuse such a response.
Knot Resolver supports proxies that utilize the PROXYv2 protocol
to identify clients.
A PROXY header contains the IP address of the original client who sent a query.
This allows the resolver to treat queries as if they actually came from
the client’s IP address rather than the address of the proxy they came through.
For example, Views and ACLs are able to work properly when
PROXYv2 is in use.
Allowing usage of the PROXYv2 protocol for all clients would be a security
vulnerability, because clients would then be able to spoof their IP addresses via
the PROXYv2 header. The resolver therefore requires you to specify explicitly which clients
are allowed to send PROXYv2 headers, via the net.proxy_allowed() function.
PROXYv2 queries from clients who are not explicitly allowed to use this protocol
will be discarded.
Allow usage of the PROXYv2 protocol headers by clients on the specified
addresses. It is possible to permit whole networks to send PROXYv2 headers
by specifying the network mask using the CIDR notation
(e.g. 172.22.0.0/16). IPv4 as well as IPv6 addresses are supported.
If you wish to allow all clients to use PROXYv2 (e.g. because you have this
kind of security handled on another layer of your network infrastructure),
you can specify a netmask of /0. Please note that this setting is
address-family-specific, so this needs to be applied to both IPv4 and IPv6
separately.
Subsequent calls to the function overwrite the effects of all previous calls.
Providing a table of strings as the function parameter allows multiple
distinct addresses to use the PROXYv2 protocol.
When called without arguments, net.proxy_allowed returns a table of all
addresses currently allowed to use the PROXYv2 protocol and does not change
the configuration.
Examples:
net.proxy_allowed('172.22.0.1')                     -- allows '172.22.0.1' specifically
net.proxy_allowed('172.18.1.0/24')                  -- allows everyone at '172.18.1.*'
net.proxy_allowed({'172.22.0.1', '172.18.1.0/24'})  -- allows both of the above at once
net.proxy_allowed({'fe80::/10'})                    -- allows everyone at IPv6 link-local
net.proxy_allowed({'::/0', '0.0.0.0/0'})            -- allows everyone
net.proxy_allowed('::/0')                           -- allows all IPv6 (but no IPv4)
net.proxy_allowed({})                               -- prevents everyone from using PROXYv2
net.proxy_allowed()                                 -- returns a list of all currently allowed addresses
It is important to understand limits of encrypting only DNS traffic.
Relevant security analysis can be found in article
Simran Patil and Nikita Borisov. 2019. What can you learn from an IP?
See slides
or the article itself.
DoT and DoH encrypt DNS traffic with the Transport Layer Security (TLS) protocol
and thus protect DNS traffic from certain types of attacks.
You can learn more about DoT and DoH and their implementation in Knot Resolver
in this article.
Knot Resolver currently offers two DoH implementations. It is
recommended to use the newer implementation described here, which is more reliable and scalable
and has fewer dependencies. Make sure to use the doh2 kind in
net.listen() to select this implementation.
DNS-over-HTTPS server (RFC 8484) can be configured using doh2 kind in
net.listen().
This implementation supports HTTP/2 (RFC 7540). Queries can be sent to the
/dns-query endpoint, e.g.:
$ kdig @127.0.0.1 +https www.knot-resolver.cz AAAA
Only TLS version 1.3 (or higher) is supported with DNS-over-HTTPS. The
additional considerations for TLS 1.2 required by HTTP/2 are not implemented
(RFC 7540#section-9.2).
Warning
Take care when configuring your server to listen on a well-known
HTTPS port. If an unrelated HTTPS service is running on the same port with
REUSEPORT enabled, you will end up with both services malfunctioning.
As specified by RFC 8484, the resolver responds with status 200 OK whenever
it can produce a valid DNS reply for a given query, even in cases where the DNS
rcode indicates an error (like NXDOMAIN, SERVFAIL, etc.).
For DoH queries malformed at the HTTP level, the resolver may respond with
the following status codes:
400 Bad Request for a generally malformed query, like one not containing
a valid DNS packet
404 Not Found when an incorrect HTTP endpoint is queried - the only
supported ones are /dns-query and /doh
413 Payload Too Large when the DNS query exceeds its maximum size
415 Unsupported Media Type when the query’s Content-Type header
is not application/dns-message
431 Request Header Fields Too Large when a header in the query is too
large to process
501 Not Implemented when the query uses a method other than
GET, POST, or HEAD
These settings affect both DNS-over-TLS and DNS-over-HTTPS (except
the legacy implementation).
A self-signed certificate is generated by default. For serious deployments
it is strongly recommended to configure your own TLS certificates signed
by a trusted CA. This is done using function net.tls().
The certificate files aren’t automatically reloaded on change. If
you update the certificate files, e.g. using ACME, you have to either
restart the service(s) or call this function again using
Control sockets.
Set the secret for TLS session resumption via tickets, as per RFC 5077.
The server-side key is rotated roughly once per hour.
By default or if called without secret, the key is random.
That is good for long-term forward secrecy, but multiple kresd instances
won’t be able to resume each other’s sessions.
If you provide the same secret to multiple instances, they will be able to resume
each other’s sessions without any further communication between them.
This synchronization works only among instances having the same endianness
and time_t structure and size (sizeof(time_t)).
For good security the secret must have enough entropy to be hard to guess,
and it should still be occasionally rotated manually and securely forgotten,
to reduce the scope of privacy leak in case the
secret leaks eventually.
Warning
Setting the secret is probably too risky with TLS <= 1.2 and
GnuTLS < 3.7.5. GnuTLS 3.7.5 adds an option to disable resumption via
tickets for TLS <= 1.2, enabling them only for protocols that do guarantee
PFS. Knot Resolver makes use of this new option when linked
against GnuTLS >= 3.7.5.
Get/set EDNS(0) padding of answers to queries that arrive over TLS
transport. If set to true (the default), it will use a sensible
default padding scheme, as implemented by libknot if available at
compile time. If set to a numeric value >= 2, it will pad the
answers to the nearest padding boundary; e.g. if set to 64, the
answer will have a size that is a multiple of 64 (64, 128, 192, …). If
set to false (or a number < 2), it will disable padding entirely.
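For example (a minimal sketch of the three modes described above):

net.tls_padding(true)   -- use the default padding scheme (the default)
net.tls_padding(64)     -- pad answers to a multiple of 64 bytes
net.tls_padding(false)  -- disable padding entirely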
Selects the headers to be exposed. These headers and their values are
available in request.qsource.headers. Comparison
is case-insensitive and pseudo-headers are supported as well.
The following snippet can be used in a Lua module to access the headers
:method and user-agent:
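The header selection itself might look like this (a sketch; it assumes the net.doh_headers() setter, whose exact name and argument form should be verified against your version's reference):

net.doh_headers({':method', 'user-agent'})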
In most distributions, the http module is available from a
separate package knot-resolver-module-http. The module isn’t packaged
for openSUSE.
This module does the heavy lifting to provide an HTTP and HTTP/2 enabled
server which provides a few built-in services and also allows other
modules to export RESTful APIs and websocket streams.
One example is the statistics module, which can stream live metrics on the website
or publish metrics on request for a Prometheus scraper.
By default this module provides two kinds of endpoints,
and an unlimited number of “user-defined kinds” can be added in the configuration.
Each network address and port combination can be configured to expose
one kind of endpoint. This is done using the same mechanisms as
network configuration for plain DNS and DNS-over-TLS,
see chapter Networking and protocols for more details.
Warning
Management endpoint (webmgmt) must not be directly exposed
to untrusted parties. Use reverse-proxy like Apache
or Nginx if you need to authenticate API clients
for the management API.
By default all endpoints share the same configuration for TLS certificates etc.
This can be changed using http.config() configuration call explained below.
This section shows how to configure the HTTP module itself. For information on how
to configure the HTTP server's IP addresses and ports, please see chapter
Networking and protocols.
-- load HTTP module with defaults (self-signed TLS cert)
modules.load('http')
-- optionally load geoIP database for server map
http.config({
	geoip = 'GeoLite2-City.mmdb',
	-- e.g. https://dev.maxmind.com/geoip/geoip2/geolite2/
	-- and install mmdblua library
})
Now you can reach the web services and APIs, done!
By default, the web interface starts HTTPS/2 on the specified port using an ephemeral
TLS certificate that is valid for 90 days and is automatically renewed. It is of
course self-signed. Why not use something like
Let's Encrypt?
Warning
If you use package luaossl < 20181207, the intermediate certificate is not sent to clients,
which may cause problems with validating the connection in some cases.
You can disable unencrypted HTTP and enforce HTTPS by passing
tls=true option for all HTTP endpoints:
http.config({
	tls = true,
})
It is also possible to provide a different configuration for each
kind of endpoint, e.g. to enforce TLS and use a custom certificate only for DoH, as sketched below:
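A sketch, assuming the legacy DoH endpoint kind and hypothetical certificate paths:

http.config({
	tls = true,
	cert = '/etc/knot-resolver/mycert.crt',
	key  = '/etc/knot-resolver/mykey.key',
}, 'doh_legacy')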
It is also possible to disable HTTPS altogether by passing tls=false option.
Plain HTTP gets handy if you want to use reverse-proxy like Apache or Nginx
for authentication to API etc.
(Unencrypted HTTP could be fine for localhost tests as, for example,
Safari doesn’t allow WebSockets over HTTPS with a self-signed certificate.
Major drawback is that current browsers won’t do HTTP/2 over insecure connection.)
Warning
If you use multiple Knot Resolver instances with these automatically maintained ephemeral certificates,
they currently won’t be shared.
It’s assumed that you don’t want a self-signed certificate for serious deployments anyway.
The legacy DoH implementation using http module (kind='doh_legacy')
is deprecated. It has known performance and stability issues that won’t be fixed.
Use new DNS-over-HTTPS (DoH) implementation instead.
This was an experimental implementation of RFC 8484. It can be configured using
doh_legacy kind in net.listen(). Its configuration (such as certificates)
takes place in http.config().
Queries were served on /doh and /dns-query endpoints.
The following chapters describe basic configuration of how the resolver retrieves data from other (upstream) servers. Data processing is also affected by configured policies; see chapter Policy, access control, data manipulation for more advanced usage.
The following settings affect the client part of the resolver,
i.e. communication between the resolver itself and other DNS servers.
IPv4 and IPv6 protocols are used by default. For performance reasons it is
recommended to explicitly disable protocols which are not available
on your system, though the impact of an IPv6 outage is lessened since release 5.3.0.
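A sketch of disabling one protocol family via the net properties:

net.ipv4 = true   -- enabled by default
net.ipv6 = false  -- don't contact upstream servers over IPv6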
Forwarding configuration instructs the resolver to forward cache-miss queries from clients to manually specified DNS resolvers (upstream servers). In other words, the forwarding mode does the exact opposite of the default recursive mode, because a resolver in recursive mode automatically selects which servers to ask.
Main use-cases are:
Building a tree structure of DNS resolvers to improve performance (by improving cache hit rate).
Accessing domains which are not available using recursion (e.g. if internal company servers return different answers than public ones).
Forwarding through a central DNS traffic filter.
Forwarding implementation in Knot Resolver has the following properties:
Answers from upstream servers are cached.
Answers from upstream servers are locally DNSSEC-validated, unless policy.STUB() is used.
Resolver automatically selects which IP address from given set of IP addresses will be used (based on performance characteristics).
We strongly discourage use of “fake top-level domains” like corp. because these made-up domains are indistinguishable from an attack, so DNSSEC validation will prevent such domains from working. If you really need a variant of forwarding which does not DNSSEC-validate received data, please see chapter Replacing part of the DNS tree. In the long term it is better to migrate data into legitimate, properly delegated domains which do not suffer from these security problems.
Simple examples for unencrypted forwarding:
-- forward all traffic to specified IP addresses (selected automatically)
policy.add(policy.all(policy.FORWARD({'2001:db8::1', '192.0.2.1'})))
-- forward only queries for names under domain example.com to a single IP address
policy.add(policy.suffix(policy.FORWARD('192.0.2.1'), {todname('example.com.')}))
Get/set maximum EDNS payload size advertised in DNS packets. Different values can be configured for communication downstream (towards clients) and upstream (towards other DNS servers). Set and also get operations use values in this order.
Minimal value allowed by standard RFC 6891 is 512 bytes, which is equal to DNS packet size without Extension Mechanisms for DNS. Value 1220 bytes is minimum size required by DNSSEC standard RFC 4035.
Example output:
-- set downstream and upstream bufsize to value 4096
> net.bufsize(4096)
-- get configured downstream and upstream bufsizes, respectively
> net.bufsize()
4096 -- result # 1
4096 -- result # 2
-- set downstream bufsize to 4096 and upstream bufsize to 1232
> net.bufsize(4096, 1232)
-- get configured downstream and upstream bufsizes, respectively
> net.bufsize()
4096 -- result # 1
1232 -- result # 2
This module works around resolver behavior on specific broken sub-domains.
Currently it mainly disables case randomization.
For DNS resolvers, the most important parameter from performance perspective
is cache hit rate, i.e. percentage of queries answered from resolver’s cache.
Generally the higher cache hit rate the better.
It is also recommended to run Multiple instances (even on a
single machine!) because this allows utilizing multiple CPU threads and
increases overall resiliency.
Other features described in this section can be used for fine-tuning
performance and resiliency of the resolver, but they generally have a much smaller
impact than cache settings and the number of instances.
Cache in Knot Resolver is stored on disk and also shared between
Multiple instances so resolver doesn’t lose the cached data on
restart or crash.
To improve performance even further the resolver implements so-called aggressive caching
for DNSSEC-validated data (RFC 8198), which improves performance and also protects
against some types of Random Subdomain Attacks.
For personal and small office use-cases a cache size around 100 MB is more than enough.
For large deployments we recommend running Knot Resolver on a dedicated machine
and allocating 90% of the machine's free memory for the resolver's cache.
Note
Choosing a cache size that can fit into RAM is important even if the
cache is stored on disk (default). Otherwise, the extra I/O caused by disk
access for missing pages can cause performance issues.
For example, imagine you have a machine with 16 GB of memory.
After machine restart, use the free -m command to determine the
amount of free memory (without swap):
$ free -m
              total        used        free
Mem:          15907         979       14928
Now you can configure the cache size to be 90 % of the free memory of 14 928 MB, i.e. roughly 13 435 MB:
-- 90 % of free memory after machine restart
cache.size = 13435 * MB
It is also possible to set the cache size based on the file system size. This is useful
if you use a dedicated partition for the cache (e.g. non-persistent tmpfs). It is recommended
to leave some free space for special files, such as locks:
cache.size = cache.fssize() - 10*MB
Note
The Garbage Collector can be used to periodically trim the
cache. It is enabled and configured by default when running kresd with
systemd integration.
Using tmpfs for cache improves performance and reduces disk I/O.
By default the cache is saved on a persistent storage device,
so the content of the cache persists across system reboots.
This usually leads to smaller latency after restart, but
in certain situations non-persistent cache storage might be preferred, e.g.:
Resolver handles high volume of queries and I/O performance to disk is too low.
Threat model includes attacker getting access to disk content in power-off state.
Disk has limited number of writes (e.g. flash memory in routers).
If a non-persistent cache is desired, configure the cache directory to be on a
tmpfs filesystem, a temporary in-memory file storage.
The cache content will then be kept in memory, providing faster access,
and will be lost on power-off or reboot.
Note
On most Unix-like systems, /tmp and /var/run are
commonly mounted as tmpfs. While it is technically possible to move the
cache to an existing tmpfs filesystem, it is not recommended, since the
path to the cache is configured in multiple places.
Mounting the cache directory as tmpfs is the recommended approach. Make sure
to use an appropriate size= option and don't forget to adjust the size in the
config file as well; an example follows.
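A sketch of an /etc/fstab entry; the 2G size, the uid/gid, and the directory path are deployment-specific assumptions:

tmpfs  /var/cache/knot-resolver  tmpfs  rw,size=2G,uid=knot-resolver,gid=knot-resolver,nosuid,nodev,noexec,mode=0700  0  0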
Open cache with a size limit. The cache will be reopened if already open.
Note that the max_size cannot be lowered, only increased due to how cache is implemented.
Tip
Use kB, MB, GB constants as a multiplier, e.g. 100*MB.
The URI lmdb://path allows you to change the cache directory.
For now there is only one backend implementation, even though the APIs are ready for different (synchronous) backends.
The cache supports runtime-changeable backends, using the optional RFC 3986 URI, where the scheme
represents backend protocol and the rest of the URI backend-specific configuration. By default, it
is a lmdb backend in working directory, i.e. lmdb://.
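For example (a sketch of both the default and a custom cache directory):

-- open a 100 MB cache in the default directory
cache.open(100 * MB)
-- open a 100 MB cache in a custom directory using the lmdb backend
cache.open(100 * MB, 'lmdb:///var/cache/knot-resolver')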
Return table with low-level statistics for internal cache operation and storage.
This counts each access to cache and does not directly map to individual
DNS queries or resource records.
For query-level statistics see stats module.
Cache operation read_leq (read less-or-equal, i.e. range search) was requested 9 times,
and 4 out of the 9 operations finished with a cache miss.
The cache contains 6187 internal entries which occupy 15.625 % of the cache size.
ttl (number) – minimum TTL in seconds (default: 5 seconds)
Returns:
current minimum TTL
Get or set lower TTL bound applied to all received records.
Forcing TTL higher than specified violates DNS standards, so use higher values with care.
TTL still won’t be extended beyond expiration of the corresponding DNSSEC signature.
Note
The ttl value must be in range <0, max_ttl).
-- Get minimum TTL
> cache.min_ttl()
0
-- Set minimum TTL
> cache.min_ttl(5)
5
Get or set time interval for which a nameserver address will be ignored after determining that it doesn’t return (useful) answers.
The intention is to avoid waiting if there’s little hope; instead, kresd can immediately SERVFAIL or immediately use stale records (with serve_stale module).
Warning
This setting applies only to the current kresd process.
Purge cache records matching specified criteria. There are two specifics:
To reliably remove negative cache entries you need to clear subtree with the whole zone. E.g. to clear negative cache entries for (formerly non-existing) record www.example.com. A you need to flush whole subtree starting at zone apex, e.g. example.com. [1].
This operation is asynchronous and might not be yet finished when call to cache.clear() function returns. Return value indicates if clearing continues asynchronously or not.
Parameters:
name (string) – subtree to purge; if the name isn’t provided, whole cache is purged
(and any other parameters are disregarded).
exact_name (bool) – if set to true, only records with the same name are removed;
default: false.
rr_type (kres.type) – you may additionally specify the type to remove,
but that is only supported with exact_name==true; default: nil.
chunk_size (integer) – the number of records to remove in one round; default: 100.
The purpose is not to block the resolver for long.
The default callback repeats the command after one millisecond
until all matching data are cleared.
callback (function) – a custom code to handle result of the underlying C call.
Its parameters are copies of those passed to cache.clear() with one additional
parameter rettable containing table with return value from current call.
count field contains a return code from kr_cache_remove_subtree().
prev_state (table) – return value from previous run (can be used by callback)
Return type:
table
Returns:
count key is always present. Other keys are optional and their presence indicate special conditions.
count (integer) - number of items removed from cache by this call (can be 0 if no entry matched the criteria)
not_apex - cleared subtree is not cached as zone apex; proofs of non-existence were probably not removed
subtree (string) - hint where the zone apex lies (this is an estimate from cache content and might not be accurate)
chunk_limit - more than chunk_size items need to be cleared, clearing will continue asynchronously
Examples:
-- Clear whole cache
> cache.clear()
[count] => 76

-- Clear records at and below 'com.'
> cache.clear('com.')
[chunk_limit] => chunk size limit reached; the default callback will continue asynchronously
[not_apex] => to clear proofs of non-existence call cache.clear('com.')
[count] => 100
[round] => 1
[subtree] => com.
> worker.sleep(0.1)
[cache] asynchronous cache.clear('com', false) finished

-- Clear only 'www.example.com.'
> cache.clear('www.example.com.', true)
[round] => 1
[count] => 1
[not_apex] => to clear proofs of non-existence call cache.clear('example.com.')
[subtree] => example.com.
This section describes the usage of kresd when running under systemd.
For other uses, please refer to usage-without-systemd.
Knot Resolver can utilize multiple CPUs by running multiple independent instances (processes), where each process utilizes at most a single CPU core on your machine. If your machine handles a lot of DNS traffic, run multiple instances.
All instances typically share the same configuration and cache, and incoming queries are automatically distributed by the operating system among all instances.
The advantage of using multiple instances is that a problem in a single instance will not affect the others, so a single instance crash will not bring the whole DNS resolver service down.
Tip
For maximum performance, there should be as many kresd processes as
there are available CPU threads.
To run multiple instances, use a different identifier after the @ sign for each instance, for
example:
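A sketch starting two instances named 1 and 2 (assuming the kresd@N.service naming used below):

$ systemctl start kresd@1.service
$ systemctl start kresd@2.service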
Resolver restart normally takes just milliseconds, and cache content is persistent to avoid a performance drop
after restart. If you want true zero-downtime restarts, use multiple instances and do a rolling
restart, i.e. restart only one resolver process at a time.
On a system with 4 instances run these commands sequentially:
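For example (assuming the default kresd@N.service instance naming):

$ systemctl restart kresd@1.service
$ systemctl restart kresd@2.service
$ systemctl restart kresd@3.service
$ systemctl restart kresd@4.service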
The instance name is subsequently exposed to kresd via the environment variable
SYSTEMD_INSTANCE. This can be used to tell the instances apart, e.g. when
using the Name Server Identifier (NSID) module with per-instance configuration:
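A sketch; the nsid module and its nsid.name() setter are assumed from the NSID module's reference:

modules.load('nsid')
-- derive a per-instance NSID string from the systemd instance name
nsid.name('instance-' .. os.getenv('SYSTEMD_INSTANCE'))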
The predict module helps to keep the cache hot by prefetching records.
It can utilize two independent mechanisms to select the records which should be refreshed:
expiring records and prediction.
This mechanism is always active when the predict module is loaded, and it is not configurable.
Any time the resolver answers with records that are about to expire,
they get refreshed (see is_expiring()).
That improves latency for records which get frequently queried, relative to their TTL.
The predict module can also learn usage patterns and repetitive queries,
though this mechanism is a prototype and not recommended for use in production or with high traffic.
For example, if a client makes a query every day at 18:00,
the resolver expects that the record will be needed by that time and prefetches it ahead of time.
This is helpful to minimize the perceived latency and keeps the cache hot.
You can disable prediction by configuring period=0.
Otherwise it will load the required stats module if not present,
and it will use its stats.frequent() table and clear it periodically.
Tip
The tracking window and period length determine memory requirements. If you have a server with relatively fast query turnover, keep the period low (hour for start) and shorter tracking window (5 minutes). For personal slower resolver, keep the tracking window longer (i.e. 30 minutes) and period longer (a day), as the habitual queries occur daily. Experiment to get the best results.
Reconfigure the predictor to given tracking window and period length. Both parameters are optional.
Window length is in minutes, period is a number of windows that can be kept in memory.
e.g. if a window is 15 minutes, a period of “24” means 6 hours.
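For example (a sketch matching the 15-minute window, 6-hour period above):

modules.load('predict')
predict.config({ window = 15, period = 24 })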
This module provides the ability to periodically prefill the DNS cache by importing root zone data obtained over HTTPS.
Intended users of this module are big resolver operators, who will benefit from decreased latencies and a smaller amount of traffic towards DNS root servers.
The configuration sketched below downloads the zone file from URL https://www.internic.net/domain/root.zone and imports it into the cache every 86400 seconds (1 day). The HTTPS connection is authenticated using a CA certificate from the file /etc/pki/tls/certs/ca-bundle.crt and the signed zone content is validated using DNSSEC.
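A sketch reconstructed from that description:

modules.load('prefill')
prefill.config({
	['.'] = {
		url = 'https://www.internic.net/domain/root.zone',
		interval = 86400,  -- seconds (1 day)
		ca_file = '/etc/pki/tls/certs/ca-bundle.crt',
	}
})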
The root zone to be imported must be signed using DNSSEC and the resolver must have a valid DNSSEC configuration.
Parameter   Description
ca_file     path to CA certificate bundle used to authenticate the HTTPS connection (optional, system-wide store will be used if not specified)
interval    number of seconds between zone data refresh attempts
A demo module that allows using timed-out records in case kresd is
unable to contact upstream servers.
By default it allows staleness of up to one day,
after roughly four seconds of trying to contact the servers.
It’s quite configurable/flexible; see the beginning of the module source for details.
See also the RFC draft (not fully followed) and cache.ns_tout.
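Loading the module might look like this (a sketch; ordering relative to other modules may matter, see the module source for details):

modules.load('serve_stale')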
Knot Resolver developers think that literal implementation of RFC 7706
(“Decreasing Access Time to Root Servers by Running One on Loopback”)
is a bad idea so it is not implemented in the form envisioned by the RFC.
You can get the very similar effect without its downsides by combining
Cache prefilling and Serve stale modules with Aggressive Use
of DNSSEC-Validated Cache (RFC 8198) behavior which is enabled
automatically together with DNSSEC validation.
The module for Initializing a DNS Resolver with Priming Queries is implemented
according to RFC 8109. The purpose of the module is to keep an up-to-date list of
root DNS servers and associated IP addresses.
The result of a successful priming query replaces the root hints distributed with
the resolver software. Unlike other DNS resolvers, Knot Resolver caches the
result of the priming query on disk and keeps the data between restarts until
the TTL expires.
This module is enabled by default; you may disable it by adding
modules.unload('priming') to your configuration.
The edns_keepalive module implements RFC 7828 for clients
connecting to Knot Resolver via TCP and TLS.
The module only allows clients to discover the connection timeout;
client connections are always timed out the same way regardless
of whether the client sends the EDNS option.
When connecting to servers, Knot Resolver does not send this EDNS option.
It still attempts to reuse established connections intelligently.
This module is loaded by default. For debugging purposes it can be
unloaded using standard means:
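For example:

modules.unload('edns_keepalive')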
As of version 5.2.0, XDP support in Knot Resolver is considered
experimental. The impact on overall throughput and performance may not
always be beneficial.
Using XDP allows significant speedup of UDP packet processing in recent Linux kernels,
especially with some network drivers that implement good support.
The basic idea is that for selected packets the Linux networking stack is bypassed,
and some drivers can even directly use the user-space buffers for reading and writing.
Bypassing the network stack has significant implications, such as bypassing the firewall
and monitoring solutions.
Make sure you’re familiar with the trade-offs before using this feature.
Read more in Limitations.
Linux kernel 4.18+ (5.x+ is recommended for optimal performance) compiled with
the CONFIG_XDP_SOCKETS=y option. XDP isn’t supported in other operating systems.
libknot compiled with XDP support
A multiqueue network card with native XDP support is highly recommended,
otherwise the performance gain will be much lower and you may encounter
issues due to XDP emulation.
Successfully tested cards:
Intel series 700 (driver i40e), maximum number of queues per interface is 64.
Intel series 500 (driver ixgbe), maximum number of queues per interface is 64.
The number of CPUs available has to be at most 64!
The CAP_SYS_RESOURCE capability is only needed on Linux < 5.11.
You want the same number of kresd instances as network queues on your card;
you can use ethtool -L before the services start.
With XDP this is more important than with vanilla UDP, as we only support one instance
per queue, and unclaimed queues will fall back to vanilla UDP.
Ideally you can set these numbers as high as the number of CPUs that you want kresd to use.
Modification of /etc/knot-resolver/kresd.conf may often be quite simple, for example:
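A sketch (the interface name eth2 and the address are hypothetical placeholders):

-- XDP on all queues of a whole network interface
net.listen('eth2', 53, { kind = 'xdp' })
-- keep vanilla DNS to service TCP and any fallback UDP
net.listen('203.0.113.53', 53, { kind = 'dns' })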
Note that you want to also keep the vanilla DNS line to service TCP
and possibly any fallback UDP (e.g. from unclaimed queues).
XDP listening is in principle done on queues of whole network interfaces
and the target addresses of incoming packets aren’t checked in any way,
but you are still allowed to specify interface by an address
(if it’s unambiguous at that moment):
The default selection of queues is tailored for the usual naming convention:
kresd@1.service, kresd@2.service, …
but you can still specify them explicitly, e.g. the default is effectively the same as:
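A sketch of the explicit equivalent; this assumes the env.SYSTEMD_INSTANCE variable exposed to kresd and the hypothetical eth2 interface from above:

net.listen('eth2', 53, { kind = 'xdp', nic_queue = env.SYSTEMD_INSTANCE - 1 })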
Features in this section allow you to configure which clients can get access to which
DNS data, i.e. DNS data filtering and manipulation.
Query policies specify global policies applicable to all requests,
e.g. for blocking access to a particular domain. Views and ACLs allow
specifying per-client policies, e.g. blocking or unblocking access
to a domain only for a subset of clients.
It is also possible to modify data returned to clients, either by providing
Static hints (answers with statically configured IP addresses),
DNS64 translation, or IP address renumbering.
Finally, the module DNS Application Firewall provides an HTTP API for run-time policy
modification, and generally just offers a different interface to the previously
mentioned features.
This module can block, rewrite, or alter inbound queries based on user-defined policies. It does not affect queries generated by the resolver itself, e.g. when following CNAME chains etc.
Each policy rule has two parts: a filter and an action. A filter selects which queries will be affected by the policy, and an action modifies the queries matching the associated filter.
Typically a rule is defined as follows: filter(action(action parameters), filter parameters). For example, a filter can be suffix, which matches queries whose suffix part is in a specified set, and one of the possible actions is policy.DENY, which denies resolution. These are combined together into policy.suffix(policy.DENY, {todname('badguy.example.')}). The rule becomes effective when it is added into the rule table using policy.add(); please see the examples below.
This module is enabled by default because it implements mandatory RFC 6761 logic.
When no rule applies to a query, built-in rules for special-use and locally-served domain names are applied.
These rules can be overridden by action policy.PASS. For debugging purposes you can also add modules.unload('policy') to your config to unload the module.
For speed this filter requires domain names in DNS wire format, not textual representation, so each label in the name must be prefixed with its length. Always use convenience function policy.todnames() for automatic conversion from strings! For example:
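A sketch of the conversion; the wire-format strings shown in the comment follow the length-prefixed label encoding described above:

policy.todnames({'example.com', 'me.cz'})
-- returns: { '\7example\3com\0', '\2me\2cz\0' }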
Note
Non-ASCII is not supported.
Knot Resolver does not provide any convenience support for IDN.
Therefore everywhere (all configuration, logs, RPZ files) you need to deal with the
xn-- forms
of domain name labels, instead of directly using unicode characters.
common_suffix – common suffix of entries in suffix_table
Like policy.suffix() match, but you can also provide a common suffix of all matches for faster processing (nil otherwise).
This function is faster for small suffix tables (in the order of “hundreds”).
It is also possible to define a custom filter function with any name.
An action function or nil if filter did not match.
Typically a filter function is generated by another function, which allows easy parametrization - this technique is called a closure. A practical example of such a filter generator is:
function match_query_type(action, target_qtype)
	return function (state, query)
		if query.stype == target_qtype then
			-- filter matched the query, return action function
			return action
		else
			-- filter did not match, continue with next filter
			return nil
		end
	end
end
This custom filter can be used as any other built-in filter.
For example this applies our custom filter and executes action policy.DENY on all queries of type HINFO:
-- custom filter which matches HINFO queries, action is policy.DENY
policy.add(match_query_type(policy.DENY, kres.type.HINFO))
Deny existence of a given domain and add explanatory message. NXDOMAIN reply
contains an additional explanatory message as TXT record in the additional
section.
You may override the extended DNS error to provide the user with more
information. By default, BLOCKED is returned to indicate the domain is
blocked due to the internal policy of the operator. Other suitable error
codes are CENSORED (for externally imposed policy reasons) or
FILTERED (for blocking requested by the client). For more information,
please refer to RFC 8914.
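A sketch of denying a hypothetical domain with an explanatory message (DENY_MSG takes the message as its argument):

policy.add(policy.suffix(policy.DENY_MSG('Blocked by policy of the operator'), policy.todnames({'badguy.example.'})))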
Terminate query resolution and do not return any answer to the requestor.
Warning
During normal operation, an answer should always be returned.
Deliberate query drops are indistinguishable from packet loss and may
cause problems as described in RFC 8906. Only use NO_ANSWER
on very specific occasions, e.g. as a defense mechanism during an attack,
and prefer other actions (e.g. DROP or REFUSE) for normal
operation.
Force the requestor to use TCP. It sets the truncated bit (TC) in the response if the request came through UDP, which forces standard-compliant clients to retry the request over TCP.
Reroute IP addresses in the response matching a given subnet to a given target, e.g. {['192.0.2.0/24'] = '127.0.0.0'} will rewrite '192.0.2.55' to '127.0.0.55'; see the renumber module for more information. See policy.add() and do not forget to specify that this is a postrule. Quick example:
-- this policy is enforced on answers
-- therefore we have to use 'postrule'
-- (the "true" at the end of policy.add)
policy.add(policy.all(policy.REROUTE({['192.0.2.0/24'] = '127.0.0.0'})), true)
Overwrite Resource Records in responses with specified values.
type
- RR type to be replaced, e.g. [kres.type.A] or a numeric value.
rdata
- RR data in DNS wire format, i.e. binary form specific for the given RR type. A set of multiple RRs can be specified as a table {rdata1, rdata2, ...}. Use the helper function kres.str2ip() to generate wire format for A and AAAA records. Wire format for other record types can be generated with kres.parse_rdata().
ttl
- TTL in seconds. Default: 1 second.
nodata
- If the type requested by the client is not configured in this policy:
true: Return an empty answer (NODATA).
false: Ignore this policy and continue processing other rules.
Default: false.
-- policy to change IPv4 address and TTL for example.com
policy.add(
	policy.domains(
		policy.ANSWER({ [kres.type.A] = { rdata = kres.str2ip('192.0.2.7'), ttl = 300 } }),
		{ todname('example.com') }))
-- policy to generate two TXT records (specified in binary format) for example.net
policy.add(
	policy.domains(
		policy.ANSWER({ [kres.type.TXT] = { rdata = {'\005first', '\006second'}, ttl = 5 } }),
		{ todname('example.net') }))
Parse string representation of RTYPE and RDATA into RDATA wire format. Expects
a table of string(s) and returns a table of wire data.
-- create wire format RDATA that can be passed to policy.ANSWER
kres.parse_rdata({'SVCB 1 resolver.example. alpn=dot'})
kres.parse_rdata({
	'SVCB 1 resolver.example. alpn=dot ipv4hint=192.0.2.1 ipv6hint=2001:db8::1',
	'SVCB 2 resolver.example. mandatory=key65380 alpn=h2 key65380=/dns-query{?dns}',
})
More complex non-chain actions are described in their own chapters, namely:
Send a copy of incoming DNS queries to a given IP address using DNS-over-UDP and continue resolving them as usual. This is useful for sanity testing new versions of DNS resolvers.
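For example (mirroring all queries to a local analysis endpoint; the address is a placeholder):

policy.add(policy.all(policy.MIRROR('127.0.0.2')))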
Set and/or clear some flags for the query. There can be multiple flags to set/clear. You can just pass a single flag name (string) or a set of names. Flag names correspond to kr_qflags structure. Use only if you know what you are doing.
These are also “chain” actions, i.e. they don't stop processing the policy rule list.
Similarly to other actions, they apply during the whole processing of the client's request,
i.e. including any sub-queries.
The log lines from these policy actions are tagged with an extra [reqdbg] prefix,
and they are produced regardless of your log_level() setting.
They are marked as debug level, so e.g. with the journalctl command you can use -p info to skip them.
Warning
Beware of producing too many logs.
These actions are not suitable for use on a large fraction of the resolver's requests.
The extra logs have a significant performance impact and might also overload your logging system
(or get rate-limited by it).
You can use Filters to further limit on which requests this happens.
Print debug-level logging for this request.
That also includes messages from client (REQTRACE), upstream servers (QTRACE), and stats about interesting records at the end.
-- debug requests that ask for flaky.example.net or below
policy.add(policy.suffix(policy.DEBUG_ALWAYS,
                         policy.todnames({'flaky.example.net'})))
Same as DEBUG_ALWAYS but only if the request required information which was not available locally, i.e. requests which forced resolver to ask upstream server(s).
Intended usage is for debugging problems with remote servers.
test_function – Function with single argument of type kr_request which returns true if debug logs for that request should be generated and false otherwise.
Same as DEBUG_ALWAYS but only logs if the test_function says so.
Note
test_function is evaluated only when the request is finished.
As a result, all debug logs for this request must be collected,
and at the end they get either printed or thrown away.
Example usage which gathers verbose logs for all requests in the subtree dnssec-failed.org. and prints debug logs for those finishing in a different state than kres.DONE (most importantly kres.FAIL, see kr_layer_state):
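A sketch of that usage, composed from the pieces documented above:

policy.add(policy.suffix(
	policy.DEBUG_IF(function (req)
		-- log only requests that did not finish in the DONE state
		return (req.state ~= kres.DONE)
	end),
	policy.todnames({'dnssec-failed.org.'})))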
Pretty-print DNS responses from upstream servers (or cache) into logs.
It’s useful for debugging weird DNS servers.
If you do not use QTRACE in combination with DEBUG*,
you additionally need either log_groups({'iterat'}) (possibly with other groups)
or log_level('debug') to see the output in logs.
Pretty-print DNS requests from clients into the verbose log. It's useful for debugging weird DNS clients.
It makes most sense together with Views and ACLs (enabling it per-client)
and probably with verbose logging of those requests (e.g. use DEBUG_ALWAYS instead).
Log how the request arrived.
Most notably, this includes the client’s IP address, so beware of privacy implications.
-- example usage in configuration
policy.add(policy.all(policy.IPTRACE))
-- you might want to combine it with some other logs, e.g.
policy.add(policy.all(policy.DEBUG_ALWAYS))
-- example log lines from IPTRACE:
[reqdbg][policy][57517.00] request packet arrived from ::1#37931 to ::1#00853 (TCP + TLS)
[reqdbg][policy][65538.00] request packet arrived internally
request – Current DNS request as kr_request structure.
Returns:
Returning a new kr_layer_state prevents evaluating other policy rules. Returning nil creates a chain action and allows evaluation of other rules to continue.
This is a real example of an action function:
-- Custom action which generates fake A record
local ffi = require('ffi')
local function fake_A_record(state, req)
    local answer = req:ensure_answer()
    if answer == nil then return nil end
    local qry = req:current()
    if qry.stype ~= kres.type.A then
        return state
    end
    ffi.C.kr_pkt_make_auth_header(answer)
    answer:rcode(kres.rcode.NOERROR)
    answer:begin(kres.section.ANSWER)
    answer:put(qry.sname, 900, answer:qclass(), kres.type.A, '\192\168\1\3')
    return kres.DONE
end
This custom action can be used as any other built-in action.
For example, this applies our fake A record action and executes it on all queries in the subtree example.net:
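(A sketch; it assumes the fake_A_record function from the previous snippet is defined in your configuration.)

policy.add(policy.suffix(fake_A_record, policy.todnames({'example.net'})))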
The action function can implement arbitrary logic so it is possible to implement complex heuristics, e.g. to deflect Slow drip DNS attacks or gray-list resolution of misbehaving zones.
Warning
The policy module currently only looks at whole DNS requests. The rules won’t be re-applied e.g. when following CNAMEs.
Forwarding actions alter behavior for cache-miss events. If information is missing in the local cache, the resolver will forward the query to another DNS resolver for resolution (instead of contacting authoritative servers directly). DNS answers from the remote resolver are then processed locally and sent back to the original client.
Actions policy.FORWARD(), policy.TLS_FORWARD() and policy.STUB() accept up to four IP addresses at once and the resolver will automatically select the IP address which statistically responds the fastest.
Forward cache-miss queries to specified IP addresses (without encryption), DNSSEC validate received answers and cache them. Target IP addresses are expected to be DNS resolvers.
-- Forward all queries to public resolvers https://www.nic.cz/odvr
policy.add(policy.all(policy.FORWARD(
    {'2001:148f:fffe::1', '2001:148f:ffff::1',
     '185.43.135.1', '193.14.47.1'})))
Similar to policy.FORWARD() but without attempting DNSSEC validation.
Each request may be either answered from cache or simply sent to one of the IPs with proxying back the answer.
-- Answers for reverse queries about the 192.168.1.0/24 subnet
-- are to be obtained from IP address 192.0.2.1 port 5353.
-- This disables DNSSEC validation!
policy.add(policy.suffix(
    policy.STUB('192.0.2.1@5353'),
    {todname('1.168.192.in-addr.arpa')}))
Limiting forwarding actions by filters (e.g. policy.suffix()) may have unexpected consequences.
Notably, forwarders can inject any records into your cache
even if you “restrict” them to an insignificant DNS subtree –
except in cases where DNSSEC validation applies, of course.
The behavior is probably best understood through the fact
that filters and actions are completely decoupled.
The forwarding actions have no clue about why they were executed,
e.g. that the user wanted to restrict the forwarder only to some subtree.
The action just selects some set of forwarders to process this whole request from the client,
and during that processing it might need some other “sub-queries” (e.g. for validation).
Some of those might not have passed the intended filter,
but the policy rule-set only applies once per client’s request.
Same as policy.FORWARD() but sends queries over the DNS-over-TLS protocol (encrypted).
Each target IP address needs explicit configuration of how to validate the
TLS certificate, so each IP address is configured as a pair:
{ip_address, authentication}. See sections below for more details.
Queries affected by policy.TLS_FORWARD() will always be resolved over TLS connection. Knot Resolver does not implement fallback to non-TLS connection, so if TLS connection cannot be established or authenticated according to the configuration, the resolution will fail.
Some public DNS-over-TLS providers may apply rate-limiting which
makes their service incompatible with Knot Resolver’s TLS forwarding.
Notably, Google Public DNS doesn’t
work as of 2019-07-10.
When multiple servers are specified, the one with the lowest round-trip time is used.
Traditional PKI authentication requires the server to present a certificate with the specified hostname, issued by one of the trusted CAs. Example policy is:
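(A sketch; the address, hostname and CA path are placeholders.)

policy.TLS_FORWARD({
    {'2001:DB8::d0c', hostname='res.example.com', ca_file='/etc/knot-resolver/tlsca.crt'}})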
hostname must be a valid domain name matching server’s certificate. It will also be sent to the server as SNI.
ca_file optionally contains a path to a CA certificate (or certificate bundle) in PEM format.
If you omit that, the system CA certificate store will be used instead (usually sufficient).
A list of paths is also accepted, but all of them must be valid PEMs.
Instead of CAs, you can specify hashes of accepted certificates in pin_sha256.
They are in the usual format – base64 from sha256.
You may still specify hostname if you want SNI to be sent.
modules = { 'policy' }
-- forward all queries over TLS to the specified server
policy.add(policy.all(policy.TLS_FORWARD({{'192.0.2.1', pin_sha256='YQ=='}})))
-- for brevity, other TLS examples omit policy.add(policy.all())
-- single server authenticated using its certificate pin_sha256
policy.TLS_FORWARD({{'192.0.2.1', pin_sha256='YQ=='}})  -- pin_sha256 is base64-encoded
-- single server authenticated using hostname and system-wide CA certificates
policy.TLS_FORWARD({{'192.0.2.1', hostname='res.example.com'}})
-- single server using non-standard port
policy.TLS_FORWARD({{'192.0.2.1@443', pin_sha256='YQ=='}})  -- use @ or # to specify port
-- single server with multiple valid pins (e.g. anycast)
policy.TLS_FORWARD({{'192.0.2.1', pin_sha256={'YQ==', 'Wg=='}}})
-- multiple servers, each with its own authenticator
policy.TLS_FORWARD({ -- please note that { here starts the list of servers
    {'192.0.2.1', pin_sha256='Wg=='},
    -- server must present certificate issued by specified CA and hostname must match
    {'2001:DB8::d0c', hostname='res.example.com', ca_file='/etc/knot-resolver/tlsca.crt'}
})
With the use of policy.slice() function, it is possible to split the
entire DNS namespace into distinct slices. When used in conjunction with
policy.TLS_FORWARD(), it’s possible to forward different queries to
different targets.
slice_func – slicing function that returns index based on query
action – action to be performed for the slice
This function splits the entire domain space into multiple slices (determined
by the number of provided actions). A slice_func is called to determine
which slice a query belongs to. The corresponding action is then executed.
The function initializes and returns a slicing function, which
deterministically assigns query to a slice based on the query name.
It utilizes the Public Suffix List to ensure domains under the same
registrable domain end up in a single slice. (see example below)
seed can be used to re-shuffle the slicing algorithm when the slicing
function is initialized. By default, the assignment is re-shuffled after one
week (i.e. when the resolver restarts or reloads its config after that time). To force a stable
distribution, pass a fixed value. To re-shuffle on every resolver restart,
use os.time().
The following example demonstrates a distribution among 3 slices:
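(A sketch; the three forwarding targets are placeholders - any three actions would do.)

policy.add(policy.slice(
    policy.slice_randomize_psl(),
    policy.TLS_FORWARD({{'192.0.2.1', hostname='dot1.example.net'}}),
    policy.TLS_FORWARD({{'192.0.2.2', hostname='dot2.example.net'}}),
    policy.TLS_FORWARD({{'192.0.2.3', hostname='dot3.example.net'}})))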
These two functions can be used together to forward queries for names
in different parts of DNS name space to different target servers:
policy.add(policy.slice(
    policy.slice_randomize_psl(),
    policy.TLS_FORWARD({{'192.0.2.1', hostname='res.example.com'}}),
    policy.TLS_FORWARD({
        -- multiple servers can be specified for a single slice
        -- the one with lowest round-trip time will be used
        {'193.17.47.1', hostname='odvr.nic.cz'},
        {'185.43.135.1', hostname='odvr.nic.cz'},
    })))
Note
The privacy implications of using this feature aren’t clear. Since
websites often make requests to multiple domains, these might be forwarded
to different targets. This could result in decreased privacy (e.g. when the
remote targets are both logging or otherwise processing your DNS traffic).
The intended use-case is to use this feature with semi-trusted resolvers
which claim to do no logging (such as those listed on dnsprivacy.org), to
decrease the potential exposure of your DNS data to a malicious resolver
operator.
The following procedure applies only to domains which have different content
publicly and internally. For example, this applies to “your own” top-level domain
example., which does not exist in the public (global) DNS namespace.
Dealing with these internal-only domains requires extra configuration because
DNS was designed as “single namespace” and local modifications like adding
your own TLD break this assumption.
Warning
Use of internal names which are not delegated from the public DNS
causes technical problems with caching and DNSSEC validation
and generally makes DNS operation more costly.
We recommend against using these non-delegated names.
To make such an internal domain available in your resolver it is necessary to
graft your domain onto the public DNS namespace,
but grafting creates new issues:
These grafted domains will be rejected by DNSSEC validation
because such domains are technically indistinguishable from a spoofing attack
against the public DNS.
Therefore, if you trust the remote resolver which hosts the internal-only domain,
and you trust your link to it, you need to use the policy.STUB() policy
instead of policy.FORWARD() to disable DNSSEC validation for those
grafted domains.
Example configuration grafting domains onto public DNS namespace¶
extraTrees = policy.todnames(
    {'faketldtest.',
     'sld.example.',
     'internal.example.com.',
     '2.0.192.in-addr.arpa.' -- this applies to reverse DNS tree as well
    })
-- Beware: the rule order is important, as policy.STUB is not a chain action.
-- Flags: for "dumb" targets disabling EDNS can help (below) as DNSSEC isn't
-- validated anyway; in some of those cases adding 'NO_0X20' can also help,
-- though it also lowers defenses against off-path attacks on communication
-- between the two servers.
-- With kresd <= 5.5.3 you also needed 'NO_CACHE' flag to avoid unintentional
-- NXDOMAINs that could sometimes happen due to aggressive DNSSEC caching.
policy.add(policy.suffix(policy.FLAGS({'NO_EDNS'}), extraTrees))
policy.add(policy.suffix(policy.STUB({'2001:db8::1'}), extraTrees))
There is no published Internet Standard for RPZ and implementations vary.
At the moment Knot Resolver supports a limited subset of the RPZ format and deviates
from the implementation in BIND. Nevertheless, it is good enough
for blocking large lists of spam or advertising domains.
The RPZ file format is basically a DNS zone file with very special semantics.
For example:
; left hand side            ; TTL and class  ; right hand side
; encodes RPZ trigger       ; ignored        ; encodes action
; (i.e. filter)
blocked.domain.example      600 IN            CNAME .  ; block main domain
*.blocked.domain.example    600 IN            CNAME .  ; block subdomains
The only “trigger” supported in Knot Resolver is query name,
i.e. left hand side must be a domain name which triggers the action specified
on the right hand side.
A subset of possible RPZ actions is supported, namely:
action – the default action for match in the zone; typically you want policy.DENY
path – path to zone file
watch – boolean, if true, the file will be reloaded on file change
Enforce RPZ rules. This can be used in conjunction with published blocklist feeds.
The RPZ operation is well described in this Jan-Piet Mens’s post,
or the Pro DNS and BIND book.
For example, we can store the example snippet with domain blocked.domain.example
(above) into file /etc/knot-resolver/blocklist.rpz and configure resolver to
answer with NXDOMAIN plus the specified additional text to queries for this domain:
policy.add(
    policy.rpz(policy.DENY_MSG('domain blocked by your resolver operator'),
               '/etc/knot-resolver/blocklist.rpz',
               true))
Resolver will reload RPZ file at run-time if the RPZ file changes.
Recommended RPZ update procedure is to store new blocklist in a new file
(newblocklist.rpz) and then rename the new file to the original file name
(blocklist.rpz). This avoids problems where resolver might attempt
to re-read an incomplete file.
rule – added rule, i.e. policy.pattern(policy.DENY,'[0-9]+\2cz')
postrule – boolean, if true the rule will be evaluated on answer instead of query
Returns:
rule description
Add a new policy rule that is executed either on queries or on answers, depending on the postrule parameter. You can then use the returned rule description to get information and a unique identifier for the rule, as well as its match count.
-- mirror all queries, keep handle so we can retrieve information later
local rule = policy.add(policy.all(policy.MIRROR('127.0.0.2')))
-- we can print statistics about this rule any time later
print(string.format('id: %d, matched queries: %d', rule.id, rule.count))
Returns table of domain names in wire format converted from strings.
-- Convert single name
assert(todname('example.com') == '\7example\3com\0')
-- Convert table of names
policy.todnames({'example.com', 'me.cz'})
-- returns {'\7example\3com\0', '\2me\2cz\0'}
The policy module implements policies for global query matching, e.g. solves “how to react to a certain query”.
This module combines it with query source matching, e.g. “who asked the query”. This allows you to create personalized blacklists, filters and ACLs.
There are two identification mechanisms:
addr – identifies the client based on its subnet
tsig – identifies the client based on a TSIG key name (only for testing purposes, the TSIG signature is not verified!)
View module allows you to combine query source information with policy rules.
This example will force a given client subnet to use TCP for names in the example.com subtree:
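(A sketch; the subnet is a placeholder, and policy.TC is the truncate-answer action, forcing clients to retry over TCP.)

view:addr('10.0.0.0/8',
    policy.suffix(policy.TC, policy.todnames({'example.com'})))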
You can combine view selectors with RPZ to create personalized filters for example.
Warning
Beware that cache is shared by all requests. For example, it is safe
to refuse answer based on who asks the resolver, but trying to serve
different data to different clients will result in unexpected behavior.
Setups like split-horizon which depend on isolated DNS caches
are explicitly not supported.
-- Load modules
modules = { 'view' }
-- Whitelist queries identified by TSIG key
view:tsig('\5mykey', policy.all(policy.PASS))
-- Block local IPv4 clients (ACL like)
view:addr('127.0.0.1', policy.all(policy.DENY))
-- Block local IPv6 clients (ACL like)
view:addr('::1', policy.all(policy.DENY))
-- Drop queries with suffix match for remote client
view:addr('10.0.0.0/8', policy.suffix(policy.DROP, policy.todnames({'xxx'})))
-- RPZ for subset of clients
view:addr('192.168.1.0/24', policy.rpz(policy.PASS, 'whitelist.rpz'))
-- Do not try this - it will pollute cache and surprise you!
-- view:addr('10.0.0.0/8', policy.all(policy.FORWARD('2001:DB8::1')))
-- Drop all IPv4 that hasn't matched
view:addr('0.0.0.0/0', policy.all(policy.DROP))
The current implementation is best understood as three separate rule chains:
vanilla policy.add, view:tsig and view:addr.
For each request the rules in these chains get tried one by one until a non-chain policy action gets executed.
By default the policy module acts before the view module, because policy is loaded by default. If you want to intermingle universal rules with view:addr, you may simply wrap the universal policy rules in a view closure.
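A sketch of such wrapping, with a placeholder deny rule applied to all IPv4 and IPv6 clients alike:

view:addr('0.0.0.0/0', policy.suffix(policy.DENY, policy.todnames({'blocked.example.'})))
view:addr('::/0',      policy.suffix(policy.DENY, policy.todnames({'blocked.example.'})))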
This is a module providing static hints for forward records (A/AAAA) and reverse records (PTR).
The records can be loaded from /etc/hosts-like files and/or added directly.
You can also use the module to change the root hints; they are used as a safety belt or if the root NS
drops out of cache.
Tip
For blocking large lists of domains please use policy.rpz()
instead of creating a huge list of domains with IP address 0.0.0.0.
-- Load hints after iterator (so hints take precedence before caches)
modules = { 'hints > iterate' }
-- Add a custom hosts file
hints.add_hosts('hosts.custom')
-- Override the root hints
hints.root({
    ['j.root-servers.net.'] = { '2001:503:c27::2:30', '192.58.128.30' }
})
-- Add a custom hint
hints['foo.bar'] = '127.0.0.1'
Note
The policy module applies before hints,
so your hints might get surprisingly shadowed by even default policies.
That most often happens for RFC 6761#section-6 names, e.g.
localhost and test or with PTR records in private address ranges.
To unblock the required names, you may use an explicit policy.PASS action.
This .PASS workaround isn’t ideal. To improve some cases,
we recommend to move these .PASS lines to the end of your rule list.
The point is that applying any non-chain action
(e.g. forwarding actions or .PASS itself)
stops processing any later policy rules for that request (including the default block-rules).
You probably don’t want this .PASS to shadow any other rules you might have;
and on the other hand, if any other non-chain rule triggers,
additional .PASS would not change anything even if it were somehow force-executed.
pair (string) – hostname and address, i.e. "localhost 127.0.0.1"
Returns:
{result:bool}
Add a hostname–address pair hint.
Note
If multiple addresses have been added for a name (in separate hints.set() commands),
all are returned in a forward query.
If multiple names have been added to an address, the last one defined is returned
in a corresponding PTR query.
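For example, in an interactive session (the name and address are placeholders):

> hints.set('localhost 127.0.0.1')  -- returns { result = true } on success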
A good rule of thumb is to select only a few fastest root hints. The server learns RTT and NS quality over time, and thus tries all servers available. You can help it by preselecting the candidates.
The module for RFC 6147 DNS64 AAAA-from-A record synthesis is used to enable client-server communication between an IPv6-only client and an IPv4-only server. See the well-written introduction in the PowerDNS documentation.
If no address is passed (i.e. nil), the well-known prefix 64:ff9b:: is used.
-- Load the module with default settings
modules = { 'dns64' }
-- Reconfigure later
dns64.config({ prefix = '2001:db8::aabb:0:0' })
Warning
The module currently won’t work well with policy.STUB().
Also, the IPv6 prefix passed in configuration is assumed to be /96.
Tip
The A record sub-requests will be DNSSEC secured, but the synthetic AAAA records can’t be. Make sure the last mile between stub and resolver is secure to avoid spoofing.
You can specify a set of IPv6 subnets that are disallowed in answer.
If they appear, they will be replaced by AAAAs generated from As.
dns64.config({
    prefix = '2001:db8:3::',
    exclude_subnets = { '2001:db8:888::/48', '::ffff/96' },
})
-- You could even pass '::/0' to always force using generated AAAAs.
In case you don’t want dns64 for all clients,
you can set DNS64_DISABLE flag via the view module.
modules = { 'dns64', 'view' }
-- disable dns64 for all IPv4 source addresses
view:addr('0.0.0.0/0', policy.all(policy.FLAGS('DNS64_DISABLE')))
-- disable dns64 for all IPv6 source addresses
view:addr('::/0', policy.all(policy.FLAGS('DNS64_DISABLE')))
-- re-enable dns64 for two IPv6 subnets
view:addr('2001:db8:11::/48', policy.all(policy.FLAGS(nil, 'DNS64_DISABLE')))
view:addr('2001:db8:93::/48', policy.all(policy.FLAGS(nil, 'DNS64_DISABLE')))
The module renumbers addresses in answers to a different address space,
e.g. you can redirect malicious addresses to a blackhole, or use private address ranges
in local zones that will be remapped to real addresses by the resolver.
Warning
While requests are still validated using DNSSEC, the signatures
are stripped from final answer. The reason is that the address synthesis
breaks signatures. You can see whether an answer was valid or not based on
the AD flag.
modules = {
    renumber = {
        -- Source subnet, destination subnet
        {'10.10.10.0/24', '192.168.1.0'},
        -- Remap /16 block to localhost address range
        {'166.66.0.0/16', '127.0.0.0'},
        -- Remap /26 subnet (64 ip addresses)
        {'166.55.77.128/26', '127.0.0.192'},
        -- Remap a /32 block to a single address
        {'2001:db8::/32', '::1!'},
    }
}
Certain clients are “dumb” and always connect to the first IP address or name found
in a DNS answer received from the resolver instead of picking randomly.
As a workaround for such broken clients, it is possible to randomize
the order of records in DNS answers sent by the resolver:
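(A minimal sketch using the reorder_RR() switch; pass false to turn it off again.)

reorder_RR(true)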
This module provides protection from DNS Rebinding attack by blocking
answers which contain IPv4 or IPv6 addresses for private use
(or some other special-use addresses).
To enable this module insert following line into your configuration file:
modules.load('rebinding < iterate')
Please note that this module does not offer a stable configuration interface
yet. For this reason it is suitable mainly for public resolver operators
who do not need to whitelist certain subnets.
Warning
DNS Blacklists (RFC 5782) often use 127.0.0.0/8 to blacklist
a domain. Using the rebinding module prevents DNSBL from functioning
properly.
This module ensures that all queries without the RD (recursion desired) bit set
are answered with REFUSED. This prevents snooping on the resolver’s cache contents.
The module is loaded by default. If you’d like to disable this behavior, you can
unload it:
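(The module’s name in upstream documentation is refuse_nord, so unloading it looks like this.)

modules.unload('refuse_nord')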
This module is a high-level interface for other powerful filtering modules and DNS views. It provides an easy interface to apply and monitor DNS filtering rules and persistent memory for them. It also provides a RESTful service interface and an HTTP interface.
Firewall rules are declarative and consist of filters and actions. Filters use field operator operand notation (e.g. qname = example.com), and may be chained using AND/OR keywords. Actions may or may not have parameters after the action name.
-- Let's write some daft rules!
modules = { 'daf' }
-- Block all queries with QNAME = example.com
daf.add('qname = example.com deny')
-- Filters can be combined using AND/OR...
-- Block all queries with QNAME match regex and coming from given subnet
daf.add('qname ~ %w+.example.com AND src = 192.0.2.0/24 deny')
-- We also can reroute addresses in response to alternate target
-- This reroutes 192.0.2.1 to localhost
daf.add('src = 127.0.0.0/8 reroute 192.0.2.1-127.0.0.1')
-- Subnets work too, this reroutes a whole subnet
-- e.g. 192.0.2.55 to 127.0.0.55
daf.add('src = 127.0.0.0/8 reroute 192.0.2.0/24-127.0.0.0')
-- This rewrites all A answers for 'example.com' from
-- whatever the original address was to 127.0.0.2
daf.add('src = 127.0.0.0/8 rewrite example.com A 127.0.0.2')
-- Mirror queries matching given name to DNS logger
daf.add('qname ~ %w+.example.com mirror 127.0.0.2')
daf.add('qname ~ example-%d.com mirror 127.0.0.3@5353')
-- Forward queries from subnet
daf.add('src = 127.0.0.1/8 forward 127.0.0.1@5353')
-- Forward to multiple targets
daf.add('src = 127.0.0.1/8 forward 127.0.0.1@5353,127.0.0.2@5353')
-- Truncate queries based on destination IPs
daf.add('dst = 192.0.2.51 truncate')
-- Disable a rule
daf.disable(2)
-- Enable a rule
daf.enable(2)
-- Delete a rule
daf.del(2)
-- Delete all rules and start from scratch
daf.clear()
Warning
Only the first matching rule’s action is executed. Defining
additional actions for the same matching rule, e.g. src=127.0.0.1/8,
will have no effect.
If you’re not sure what firewall rules are in effect, see daf.rules:
If you have HTTP/2 loaded, the firewall automatically loads as a snippet.
You can create, track, suspend and remove firewall rules from the web interface.
If you load both modules, you have to load daf after http.
To read service logs use commands usual for your distribution.
E.g. on distributions using systemd-journald use the command journalctl -u kresd@* -f.
Knot Resolver supports 6 logging levels - crit, err, warning,
notice, info, debug. All levels have the same meaning as defined
in syslog.h. It is possible to change the logging level using the
log_level() function.
log_level('debug')  -- too verbose for normal usage
Logging level notice is set by default after start,
so logs from Knot Resolver should contain only a couple of lines a day.
For debugging purposes it is possible to use the very verbose debug level,
but that is generally not usable unless restricted in some way (see below).
In addition to levels, logging is also divided into groups. All groups
are logged by default, but you can enable debug level for selected groups using the
log_groups() function. Other groups are logged at the level
set by log_level().
It is also possible to enable debug logging level for particular requests,
with policies or as an HTTP service.
Deprecated since version 5.4.0: Use log_level() instead.
Param:
true enable debug level, false switch to default level (notice).
Returns:
boolean true when debug level is enabled.
Toggle between debug and notice log level. Use only for debugging purposes.
On busy systems verbose logging can produce several MB of logs per
second and will slow down operation.
Knot Resolver logs to standard error stream by default,
but typical systemd units change that to 'syslog'.
That setting logs directly through systemd’s facilities
(if available) to preserve more meta-data.
Use to turn-on debug logging for the selected groups regardless of the global
log level. Calling with no argument lists the currently active log groups. To
remove all log groups, call the function with an empty table.
log_groups({'io', 'tls'})  -- turn on debug logging for io and tls groups
log_groups()               -- list active log groups
log_groups({})             -- remove all log groups
Various statistics for monitoring purposes are available in Statistics collector module, including export to central systems like Graphite, Metronome, InfluxDB, or Prometheus format.
Resolver Watchdog is tool to detect and recover from potential bugs that cause the resolver to stop responding properly to queries.
Additional monitoring and debugging methods are described below. If none of these options fits your deployment or if you have special needs you can configure your own checks and exports using Asynchronous events.
This module logs a message for each DNSSEC validation failure (at notice level).
It is meant to provide hint to operators which queries should be
investigated using diagnostic tools like DNSViz.
Add the following line to your configuration file to enable it:
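(In upstream documentation this module is named bogus_log.)

modules.load('bogus_log')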
Module stats gathers various counters from the query resolution
and server internals, and offers them as a key-value storage.
These metrics can be either exported to Graphite/InfluxDB/Metronome,
exposed as Prometheus metrics endpoint, or processed using user-provided script
as described in chapter Asynchronous events.
Note
Please remember that each Knot Resolver instance keeps its own
statistics, and instances can be started and stopped dynamically. This might
affect your data postprocessing procedures if you are using
Multiple instances.
Outputs a list of recent upstreams and their RTT. It is sorted by time and stored in a ring buffer of
a fixed size. This means it’s not aggregated and is readable by multiple consumers, but also that
you may lose entries if you don’t read quickly enough. The default ring size is 512 entries, and may be overridden at compile time by -DUPSTREAMS_COUNT=X.
Outputs a list of the most frequent iterative queries as a JSON array. The queries are sampled probabilistically,
and include subrequests. The list’s maximum size is 5000 entries; make diffs if you want to track it over time.
The graphite module sends statistics over the Graphite protocol to either Graphite, Metronome, InfluxDB or any compatible storage. This allows powerful visualization over metrics collected by Knot Resolver.
Tip
The Graphite server is challenging to get up and running; InfluxDB combined with Grafana is much easier and provides a richer set of options and available front-ends. Metronome by PowerDNS alternatively provides a mini-graphite server for much simpler setups.
Example configuration:
Only the host parameter is mandatory.
By default the module uses UDP, so it doesn’t guarantee delivery; set tcp = true to enable Graphite over TCP. If the TCP consumer goes down or the connection with Graphite is lost, the resolver will periodically attempt to reconnect to it.
modules = {
    graphite = {
        prefix = hostname() .. worker.id, -- optional metric prefix
        host = '127.0.0.1',               -- graphite server address
        port = 2003,                      -- graphite server port
        interval = 5 * sec,               -- publish interval
        tcp = false                       -- set to true if you want TCP mode
    }
}
The module supports sending data to multiple servers at once.
The HTTP module exposes a /metrics endpoint that serves metrics
from Statistics collector in Prometheus text format.
You can use it as soon as the HTTP module is configured:
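(A sketch; it assumes the HTTP module listens on its typical web-management port 8453 - adjust to your net.listen configuration.)

$ curl -k https://localhost:8453/metrics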
You can namespace the metrics in configuration, using http.prometheus.namespace attribute:
modules.load('http')
-- Set Prometheus namespace
http.prometheus.namespace = 'resolver_'
You can also add custom metrics or rewrite existing metrics before they are returned to Prometheus client.
modules.load('http')
-- Add an arbitrary metric to Prometheus
http.prometheus.finalize = function(metrics)
    table.insert(metrics, 'build_info{version="1.2.3"} 1')
end
Worker is a service over the event loop that tracks and schedules outstanding queries;
you can see its statistics or schedule new queries. It also contains information about
the configured worker count and process rank.
Module nsid provides server-side support for RFC 5001,
which allows DNS clients to request the resolver to send back its NSID
along with the reply to a DNS request.
This is useful for debugging larger resolver farms
(e.g. when using Multiple instances, anycast or load balancers).
NSID value can be configured in the resolver’s configuration file:
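(A sketch using the nsid.name() function; the value is a placeholder.)

modules.load('nsid')
nsid.name('instance 1')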
The http module provides a /trace endpoint which allows tracing various
aspects of request execution. The basic mode allows you to resolve a query
and trace debug-level logs for it (and messages received):
This module cooperates with Systemd watchdog to restart the process in case
the internal event loop gets stuck. The upstream Systemd unit files are configured
to use this feature, which is turned on with the WatchdogSec= directive
in the service file.
As an optional feature, this module can also do an internal DNS query to check if the resolver
answers correctly. To use this feature you must configure a DNS name and type to query for:
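(A sketch using watchdog.config(); the domain is a placeholder - pick a very reliable one.)

watchdog.config({ qname = 'nic.cz.', qtype = kres.type.A })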
Each single query from the watchdog must result in an answer with
RCODE = NOERROR or NXDOMAIN. Any other result will terminate the resolver
(with SIGABRT) to allow the supervisor process to do cleanup, gather a coredump
and restart the resolver.
It is recommended to use a name with a very short TTL to make sure the watchdog
is testing all parts of the resolver and not only its cache. Obviously this check
makes sense only when used with very reliable domains; otherwise a failure
on the authoritative side will shut down the resolver!
WatchdogSec specifies the deadline after which the supervisor will kill the process.
Watchdog queries are executed every WatchdogSec / 2 seconds.
This implies that half of the WatchdogSec interval must be long enough for a
normal DNS query to succeed, so do not forget to add two or three seconds
for random network timeouts etc.
The module is loaded by default. If you’d like to disable it you can unload it:
modules.unload('watchdog')
Beware that unloading the module without disabling watchdog feature in supervisor
will lead to infinite restart loop.
The dnstap module supports logging DNS requests and responses to a unix
socket in dnstap format using fstrm framing library.
This logging is useful if you need to effectively log all DNS traffic.
The unix socket and the socket reader must be present before starting resolver instances.
Also it needs appropriate filesystem permissions;
the typical user and group of the daemon are called knot-resolver.
Tunables:
socket_path: the unix socket file where dnstap messages will be sent
identity: identity string as typically returned by an “NSID” (RFC 5001) query, empty by default
version: version string of the resolver, defaulting to “Knot Resolver major.minor.patch”
client.log_queries: if true queries from downstream in wire format will be logged
client.log_responses: if true responses to downstream in wire format will be logged
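A sketch wiring the tunables above together (the socket path is a placeholder):

modules = {
    dnstap = {
        socket_path = '/tmp/dnstap.sock',
        client = {
            log_queries = true,
            log_responses = true,
        },
    }
}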
The module ta_sentinel implements A Root Key Trust Anchor Sentinel for DNSSEC
according to standard RFC 8509.
This feature allows users of DNSSEC validating resolver to detect which root keys
are configured in resolver’s chain of trust. The data from such
signaling are necessary to monitor the progress of the DNSSEC root key rollover
and to detect potential breakage before it affects users. One example of research enabled by this module is available here.
This module is enabled by default and we urge users not to disable it.
If it is absolutely necessary you may add modules.unload('ta_sentinel')
to your configuration to disable it.
The module for Signaling Trust Anchor Knowledge in DNSSEC Using Key Tag Query,
implemented according to RFC 8145#section-5.
This feature allows validating resolvers to signal to authoritative servers
which keys are referenced in their chain of trust. The data from such
signaling allow zone administrators to monitor the progress of rollovers
in a DNSSEC-signed zone.
This mechanism serves to measure the acceptance and use of new DNSSEC
trust anchors and key signing keys (KSKs). This signaling data can be
used by zone administrators as a gauge to measure the successful deployment
of new keys. This is of particular interest for the DNS root zone in the event
of key and/or algorithm rollovers that rely on RFC 5011 to automatically
update a validating DNS resolver’s trust anchor.
Attention
Experience from root zone KSK rollover in 2018 shows that this mechanism
by itself is not sufficient to reliably measure acceptance of the new key.
Nevertheless, some DNS researchers found it is useful in combination
with other data so we left it enabled for now. This default might change
once more information is available.
This module is enabled by default. You may use modules.unload('ta_signal_query')
in your configuration to disable it.
This module compares local system time with inception and expiration time
bounds in DNSSEC signatures for the root zone’s (.) NS records. If the local system time is
outside of these bounds, it is likely a misconfiguration which will cause
all DNSSEC validation (and resolution) to fail.
In case of mismatch, a warning message will be logged to help with
further diagnostics.
Warning
Information printed by this module can be forged by a network attacker!
System administrator MUST verify values printed by this module and
fix local system time using a trusted source.
This module is useful for debugging purposes. It runs only once during resolver
start and does nothing after that. It is enabled by default.
You may disable the module by appending
modules.unload('detect_time_skew') to your configuration.
This module detects discontinuous jumps in the system time while the resolver
is running. It clears the cache when a significant backward time jump occurs.
Time jumps are usually caused by NTP time changes or by admin intervention.
These changes can affect cache records, as they store timestamps and TTLs in real
time.
If you want to preserve cache during time travel you should disable
this module by modules.unload('detect_time_jump').
Due to the way monotonic system time works on typical systems,
suspend-resume cycles will be perceived as forward time jumps,
but this direction of shift does not have the risk of using records
beyond their intended TTL, so forward jumps do not cause erasing the cache.
In case the resolver crashes, it is often helpful to collect a coredump from
the crashed process. Configuring the system to collect coredump from crashed
process is out of the scope of this documentation, but some tips can be found
here.
Kresd uses its own mechanism for assertions. They are checks that should always
pass and indicate some weird or unexpected state if they don’t. In such cases,
they show up in the log as errors. By default, the process recovers from those
states if possible, but the behaviour can be changed with the following options
to aid further debugging.
int (default: 5 minutes in meson’s release mode, 0 otherwise)
If a process should be aborted, it can be done in two ways. When this is
set to nonzero (default), a child is forked and aborted to obtain a coredump,
while the parent process recovers and keeps running. This can be useful to
debug a rare issue that occurs in production, since it doesn’t affect the
main process.
As the dumping can be costly, the value is a lower bound on delay between
consecutive coredumps of each process. It is randomized by +-25% each time.
Good news! Knot Resolver uses secure configuration by default, and this configuration
should not be changed unless absolutely necessary, so feel free to skip over this section.
Warning
Options in this section are intended only for expert users and
normally should not be needed.
Since version 4.0, DNSSEC validation is enabled by default.
If you really need to turn DNSSEC off and are okay with lowering security of your
system by doing so, add the following snippet to your configuration file.
-- turns off DNSSEC validation
trust_anchors.remove('.')
The resolver supports DNSSEC including RFC 5011 automated DNSSEC TA updates
and RFC 7646 negative trust anchors. Depending on your distribution, DNSSEC
trust anchors should be either maintained in accordance with the distro-wide
policy, or automatically maintained by the resolver itself.
In practice this means that you can forget about it and your favorite Linux
distribution will take care of it for you.
The following functions allow you to modify the DNSSEC configuration if you really have to:
readonly – if true, do not attempt to update the file.
The format is standard zone file, though additional information may be persisted in comments.
Either DS or DNSKEY records can be used for TAs.
If the file does not exist, bootstrapping of root TA will be attempted.
If you want to use bootstrapping, install lua-http library.
Each file can only contain records for a single domain.
The TAs will be updated according to RFC 5011 and persisted in the file (if allowed).
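For example (the path is a typical location; adjust as needed):

trust_anchors.add_file('/etc/knot-resolver/root.keys')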
Remove specified trust anchor from trusted key set. Removing trust anchor for the root zone effectively disables DNSSEC validation (unless you configured another trust anchor).
> trust_anchors.remove('.')
true
If you want to disable DNSSEC validation for a particular domain but keep it enabled for the rest of DNS tree, use trust_anchors.set_insecure().
Modify RFC5011 refresh timer to given value (not set by default), this will force trust anchors
to be updated every N seconds periodically instead of relying on RFC5011 logic and TTLs.
Intended only for testing purposes.
Example: 10*sec
How many Removed keys should be held in history (and key file) before being purged.
Note: all Removed keys will be purged from key file after restarting the process.
nta_list (table) – List of domain names (text format) representing NTAs.
When you use a domain name as a negative trust anchor (NTA), DNSSEC validation will be turned off at/below these names.
Each function call replaces the previous NTA set. You can find the current active set in trust_anchors.insecure variable.
If you want to disable DNSSEC validation completely use trust_anchors.remove() function instead.
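For example (the names are placeholders):

trust_anchors.set_insecure({ 'bad.example.', 'internal.example.com.' })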
rr_string (string) – DS/DNSKEY records in presentation format (e.g. ". 3600 IN DS 19036 8 2 49AAC11...")
Inserts DS/DNSKEY record(s) into current keyset. These will not be managed or updated, use it only for testing
or if you have a specific use case for not using a keyfile.
Note
Static keys are very error-prone and should not be used in production. Use trust_anchors.add_file() instead.
Example:
> trust_anchors.add('. 3600 IN DS 19036 8 2 49AAC11...')
New checking level specified as string (optional).
Returns:
Current checking level.
Get or change resolver strictness checking level.
By default, the resolver runs in normal mode. There are possibly many small adjustments
hidden behind the mode settings, but the main idea is that in permissive mode, the resolver
tries to resolve a name with as few lookups as possible, while in strict mode it spends much
more effort resolving and checking the referral path. However, if the majority of the traffic is covered
by DNSSEC, some of the strict checking actions are counter-productive.
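For example, in an interactive session (a sketch; valid levels are 'normal', 'strict' and 'permissive'):

> mode('strict')
true
> mode()
'strict'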
Knot Resolver offers several ways to modify its configuration at run-time:
Using control socket driven by an external system
Using Lua program embedded in Resolver’s configuration file
Both ways can also be combined: For example the configuration file can contain
a little Lua function which gathers statistics and returns them in JSON string.
This can be used by an external system which uses control socket to call this
user-defined function and to retrieve its results.
Control socket acts like “an interactive configuration file” so all actions
available in configuration file can be executed interactively using the control
socket. One possible use-case is reconfiguring the resolver instances from
another program, e.g. a maintenance script.
Note
Each instance of Knot Resolver exposes its own control socket. Take
that into account when scripting deployments with
Multiple instances.
When Knot Resolver is started using Systemd (see section
Upgrading to 6.0.0 from 5.x.x) it creates a control socket in path
/run/knot-resolver/control/$ID. Connection to the socket can
be made from command line using e.g. socat:
$ socat - UNIX-CONNECT:/run/knot-resolver/control/1
When successfully connected to a socket, the command line should change to
something like >. Then you can interact with kresd to see configuration or
set a new one. There are some basic commands to start with.
> help()              -- shows help
> net.interfaces()    -- lists available interfaces
> net.list()          -- lists running network services
The direct output of commands sent over socket is captured and sent back,
which gives you an immediate response on the outcome of your command.
The commands and their output are also logged in contrl group,
on debug level if successful or warning level if failed
(see around log_level()).
Control sockets are also a way to enumerate and test running instances, the
list of sockets corresponds to the list of processes, and you can test the
process for liveness by connecting to the UNIX socket.
Executes the provided string as lua code on every running resolver instance
and returns the results as a table.
Key n is always present in the returned table and specifies the total
number of instances the command was executed on. The table also contains
results from each instance accessible through keys 1 to n
(inclusive). If any instance returns nil, it is not explicitly part of
the table, but you can detect it by iterating through 1 to n.
> map('worker.id')  -- return an ID of every active instance
{
    '2',
    '1',
    ['n'] = 2,
}
> map('worker.id == "1" or nil')  -- example of `nil` return value
{
    [2] = true,
    ['n'] = 2,
}
The order of instances isn’t guaranteed or stable. When you need to identify
the instances, you may use kluautil.kr_table_pack() function to return multiple
values as a table. It uses similar semantics with n as described above
to allow nil values.
If the command fails on any instance, an error is returned and the execution
is in an undefined state (the command might not have been executed on all
instances). When using the map() function to execute any code that might
fail, your code should be wrapped in pcall() to avoid this
issue.
> map('require("kluautil").kr_table_pack(pcall(net.tls, "cert.pem", "key.pem"))')
{
    {
        true,                  -- function succeeded
        true,                  -- function return value(s)
        ['n'] = 2,
    },
    {
        false,                 -- function failed
        'error occurred...',   -- the returned error message
        ['n'] = 2,
    },
    ['n'] = 2,
}
As mentioned in the section Syntax, the Resolver’s configuration
file contains a program in the Lua programming language. This allows you to write
dynamic rules and helps you avoid the repetitive templating that is unavoidable
with static configuration. For example, parts of the configuration can depend on the
hostname() of the machine:
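(A sketch; the hostname value and interfaces are placeholders.)

if hostname() == 'hidden' then
    net.listen(net.eth0, 5353)
else
    net.listen('127.0.0.1')
end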
Some users observed a considerable, close to 100%, performance gain in
Docker containers when they bound the daemon to a single interface:ip
address pair. One may expand the aforementioned example by iterating over
the available addresses:
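A sketch, under the assumption that net.interfaces() exposes each interface’s addresses in its addr list:

-- listen on every address of eth0 individually
local eth0 = net.interfaces()['eth0'] or {}
for _, addr in ipairs(eth0.addr or {}) do
    net.listen(addr)
end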
You can also use third-party Lua libraries (available for example through
LuaRocks), as in this example which downloads the cache from a parent resolver
to avoid a cold-cache start.
local http = require('socket.http')
local ltn12 = require('ltn12')
local cache_size = 100 * MB
local cache_path = '/var/cache/knot-resolver'
cache.open(cache_size, 'lmdb://' .. cache_path)
if cache.count() == 0 then
    cache.close()
    -- download cache from parent
    http.request {
        url = 'http://parent/data.mdb',
        sink = ltn12.sink.file(io.open(cache_path .. '/data.mdb', 'w'))
    }
    -- reopen cache with 100M limit
    cache.open(cache_size, 'lmdb://' .. cache_path)
end
If called with a parameter, it will set kresd’s internal
hostname. If called without a parameter, it will return kresd’s
internal hostname, or the system’s POSIX hostname (see
gethostname(2)) if kresd’s internal hostname is unset.
This also affects ephemeral (self-signed) certificates generated by kresd
for DNS over TLS.
class (number) – Query class (optional) (e.g. kres.class.IN)
options (strings) – Resolution options (see kr_qflags)
finish (function) – Callback to be executed when resolution completes (e.g. function cb (pkt, req) end). The callback gets a packet containing the final answer and doesn’t have to return anything.
init (function) – Callback to be executed with the kr_request before resolution starts.
Returns:
boolean, true if resolution was started
The function can also be executed with a table of arguments instead. This is
useful if you’d like to skip some arguments, for example:
Lua language used in configuration file allows you to script actions upon
various events, for example publish statistics each minute. Following example
uses built-in function event.recurrent() which calls user-supplied
anonymous function:
local ffi = require('ffi')
modules.load('stats')

-- log statistics every second
local stat_id = event.recurrent(1 * second, function(evid)
    log_info(ffi.C.LOG_GRP_STATISTICS, table_print(stats.list()))
end)

-- stop printing statistics after first minute
event.after(1 * minute, function(evid)
    event.cancel(stat_id)
end)
Note that each scheduled event is identified by a number valid for the duration
of the event, you may use it to cancel the event at any time.
To persist state between two invocations of a function Lua uses concept called
closures. In the following example function speed_monitor() is a closure
function, which provides persistent variable called previous.
local ffi = require('ffi')
modules.load('stats')

-- make a closure, encapsulating counter
function speed_monitor()
    local previous = stats.list()
    -- monitoring function
    return function(evid)
        local now = stats.list()
        local total_increment = now['answer.total'] - previous['answer.total']
        local slow_increment = now['answer.slow'] - previous['answer.slow']
        if slow_increment / total_increment > 0.05 then
            log_warn(ffi.C.LOG_GRP_STATISTICS, 'WARNING! More than 5 %% of queries was slow!')
        end
        previous = now -- store current value in closure
    end
end

-- monitor every minute
local monitor_id = event.recurrent(1 * minute, speed_monitor())
Another type of actionable event is activity on a file descriptor. This allows
you to embed other event loops or monitor open files and then fire a callback
when an activity is detected. This allows you to build persistent services
like monitoring probes that cooperate well with the daemon internal operations.
See event.socket().
Filesystem watchers are possible with worker.coroutine() and cqueues;
see the cqueues documentation for more information. Here is a simple example:
local notify = require('cqueues.notify')
local watcher = notify.opendir('/etc')
watcher:add('hosts')  -- Watch changes to /etc/hosts

worker.coroutine(function()
    for flags, name in watcher:changes() do
        for flag in notify.flags(flags) do
            -- print information about the modified file
            print(name, notify[flag])
        end
    end
end)
The timer represents exactly the thing described in the examples - it allows you to execute closures
after a specified time, or even recurrent events. Time is always described in milliseconds,
but there are convenient variables that you can use - sec, minute, hour.
For example, 5 * hour represents five hours, or 5*60*60*1000 milliseconds.
Reschedule a running event, it has no effect on canceled events.
New events may reuse the event_id, so the behaviour is undefined if the function
is called after another event is started.
Example:
local interval = 1 * minute
event.after(1 * minute, function(ev)
    print('Good morning!')
    -- Halve the interval for each iteration
    interval = interval / 2
    event.reschedule(ev, interval)
end)
Cancel running event, it has no effect on already canceled events.
New events may reuse the event_id, so the behaviour is undefined if the function
is called after another event is started.
Watch for file descriptor activity. This allows embedding other event loops or simply
firing events when a pipe endpoint becomes active. In other words, asynchronous
notifications for the daemon.
cb – closure or callback to execute when fd becomes active
Returns:
event id
Executes the given function when there is activity on the file descriptor; the closure
is called with the event id as the first parameter, status as the second and the number of events as the third.
The event package provides a very basic means for non-blocking execution - it allows running code when activity on a file descriptor is detected, and when a certain amount of time passes. It doesn’t however provide an easy-to-use abstraction for non-blocking I/O. This is instead exposed through the worker package (if the cqueues Lua package is installed in the system).
Start a new coroutine with given function (closure). The function can do I/O or run timers without blocking the main thread. See cqueues for documentation of possible operations and synchronization primitives. The main limitation is that you can’t wait for a finish of a coroutine from processing layers, because it’s not currently possible to suspend and resume execution of processing layers.
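For example (a sketch; the loop body is arbitrary):

worker.coroutine(function()
    for i = 0, 10 do
        print('executing', i)
        worker.sleep(1)
    end
end)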
Pause execution of current function (asynchronously if running inside a worker coroutine).
Example:
function async_print(testname, sleep)
    log(testname .. ': system time before sleep ' .. tostring(os.time()))
    worker.sleep(sleep)  -- other coroutines continue execution now
    log(testname .. ': system time AFTER sleep ' .. tostring(os.time()))
end

worker.coroutine(function() async_print('call #1', 5) end)
worker.coroutine(function() async_print('call #2', 3) end)
Output from this example demonstrates that both calls to function async_print were executed asynchronously:
call #2: system time before sleep 1578065073
call #1: system time before sleep 1578065073
call #2: system time AFTER sleep 1578065076
call #1: system time AFTER sleep 1578065078
The etcd module connects to etcd peers and watches
for configuration changes. By default, the module watches the subtree under
/knot-resolver directory, but you can change this in the
etcd library configuration.
The subtree structure corresponds to the configuration variables in the declarative style.
This experimental module provides automatic discovery of authoritative servers’ support for DNS-over-TLS.
The module uses magic NS names to detect the SPKI fingerprint, which is very similar to the dnscurve mechanism.
Warning
This protocol and module are experimental and can be changed or removed at any time. Use at your own risk; the security properties have not been analyzed!
For example, the module will automatically discover that the example.com NS supports DoT with the base64-encoded SPKI digest of m+12GgMFIiheEhKvUcOynjbn3WYQUp5tVGDh7Snwj/Q=
and will associate it with the IPs of dot-tpwxmgqdaurcqxqsckxvdq5sty3opxlgcbjj43kumdq62kpqr72a.example.com.
In that example, the base32 encoded (no padding) version of the sha256 PIN is tpwxmgqdaurcqxqsckxvdq5sty3opxlgcbjj43kumdq62kpqr72a, which when
converted to base64 translates to m+12GgMFIiheEhKvUcOynjbn3WYQUp5tVGDh7Snwj/Q=.
The module relies on seeing the reply of the NS query and as such will not work
if Knot Resolver uses data from its cache. You may need to delete the cache before starting kresd to work around this.
The module also assumes that the NS query answer will return both the NS targets in the Authority section as well as the glue records in the Additional section.
Knot Resolver can be started with the command knot-resolver. You can provide an optional argument --config path/to/config.yml to load a configuration file other than the default.
The resolver does not have any external runtime dependencies and it should be able to run in most environments. It should be possible to wrap it with any container technology.
The only limitation for running multiple instances of Knot Resolver is that all instances must have a different runtime directory. There are however safeguards in place that should prevent accidental runtime directory conflicts.
It is possible to share cache between multiple instances, just make sure that all instances have the same cache config and there is only a single garbage collector running (disable it in all but one config file).
Before version 6, our Docker images were not meant to be used in production. This is no longer the case and with the introduction of kres-manager, Knot Resolver runs in containers without any issues.
An official Docker image can be found on Docker Hub. The image contains Knot Resolver as if it was installed from our official distro packages.
docker run --rm -ti -P docker.io/cznic/knot-resolver
The configuration file is located at /etc/knot-resolver/config.yml and the cache is at /var/cache/knot-resolver. We recommend configuring a persistent cache across container restarts.
Warning
While the container image contains normal installation of Knot Resolver and there shouldn’t be any differences between running it natively and in a container, we (the developers) do not have any experience using the Docker image in production. Especially, beware of running the DNS resolver with a software defined network (i.e. in Kubernetes). There will likely be some performance penalties for doing so. We haven’t done any measurements comparing different types of installations so we don’t know the performance differences. If you have done some measurements yourself, please reach out to us and we will share it here with everyone else.
This page is intended for experienced users only. If you follow these
instructions, you are not protected from footguns eliminated with the
introduction of the kres-manager. However, if you want to continue
using Knot Resolver the same way as before version 6.0.0, this is the chapter
for you.
For new and less experienced users, we recommend using the newer approach
starting in the Getting Started chapter.
The older way to start Knot Resolver is to run a single instance of its resolving daemon manually using the kresd@ systemd integration.
The daemon is a single-threaded process.
$ sudo systemctl start kresd@1.service
Tip
For more information about systemd integration see man kresd.systemd.
You can configure kresd by pasting your Lua code into the /etc/knot-resolver/kresd.conf configuration script.
The resolver’s daemon is preconfigured to load this script when using the kresd@ systemd integration.
Note
The configuration language is in fact Lua script, so you can use full power
of this programming language. See article
Learn Lua in 15 minutes for a syntax overview.
The first thing you need to configure are the network interfaces to listen to.
The following example instructs the resolver to receive standard unencrypted DNS queries on IP addresses 192.0.2.1 and 2001:db8::1.
Encrypted DNS queries are accepted using DNS-over-TLS protocol on all IP addresses configured on network interface eth0, TCP port 853.
-- unencrypted DNS on port 53 is default
net.listen('192.0.2.1')
net.listen('2001:db8::1')
net.listen(net.eth0, 853, { kind = 'tls' })
Complete configuration file examples can be found here.
The example configuration files are also installed as documentation files, typically in the directory /usr/share/doc/knot-resolver/examples/ (their location may be different based on your Linux distribution).
Note
When copy&pasting examples please pay close
attention to brackets and also line ordering - order of lines matters.
Our upstream packages use systemd integration, which is the recommended
way to run kresd. This section is only relevant if you choose to use kresd
without systemd integration.
kresd is designed to be a single process without the use of threads.
While the cache is shared, the individual processes are independent. This
approach has several benefits, but it also comes with a few downsides, in
particular:
Without the use of threads or forking (deprecated, see #529), multiple
processes aren’t managed in any way by kresd.
There is no maintenance thread and these tasks have to be handled by separate
daemon(s) (such as Garbage Collector).
To offset these disadvantages without implementing process management in
kresd (and reinventing the wheel), Knot Resolver provides integration with
systemd, which is widely used across GNU/Linux distributions.
If your use-case doesn’t support systemd (e.g. using macOS, FreeBSD, Docker,
OpenWrt, Turris), this section describes the differences and things to keep in
mind when configuring and running kresd without systemd integration.
The following should be taken into consideration when running without systemd:
To utilize multiple CPUs, kresd has to be executed as several independent
processes.
Maintenance daemon(s) have to be executed separately.
If a process crashes, it might be useful to restart it.
Using some mechanism similar to Watchdog might be desirable to
recover in case a process becomes unresponsive.
Please note, systemd isn’t the only process manager and other solutions exist,
such as supervisord. Configuring these is out of the scope of this
document. Please refer to their respective documentation.
It is also possible to use kresd without any process management at all, which
may be suitable for some purposes (such as low-traffic local / home network resolver,
testing, development or debugging).
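For illustration, a single instance can be run directly in the foreground; the config path and rundir below are illustrative, check kresd's --help for the exact flags:
$ kresd -n -c /etc/knot-resolver/kresd.conf /var/run/knot-resolver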
When using systemd, kres-cache-gc.service is enabled by default
and does not need any manual configuration.
Knot Resolver employs a separate garbage collector daemon which periodically
trims the cache to keep its size below the size limit configured using
cache.size.
To execute the daemon manually, you can use the following command to run it
every second:
$ kres-cache-gc -c /var/cache/knot-resolver -d 1000
The most secure and recommended way is to use capabilities and execute kresd as
an unprivileged user:
CAP_NET_BIND_SERVICE: required to bind to well-known ports.
CAP_SETPCAP: when this capability is available, kresd drops any extra
capabilities after the daemon successfully starts, when running as
a non-root user.
The user() configuration call drops privileges and starts running as the given user (and group, if provided).
Tip
Note that you should bind to required network addresses before
changing user. At the same time, you should open the cache AFTER you
change the user (so it remains accessible). A good practice is to divide
the configuration into two parts:
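A minimal sketch of this split (the addresses and cache size are illustrative):
-- privileged part: bind to sockets while still root
net.listen('0.0.0.0', 53)
-- drop privileges to the given user (and group)
user('knot-resolver', 'knot-resolver')
-- unprivileged part: open the cache only after the user change
cache.size = 100 * MB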
You can use the HTTP API to dynamically change the configuration of an already running Knot Resolver.
By default the API is exposed as a UNIX domain socket manager.sock located in the resolver’s rundir (typically /run/knot-resolver/).
This socket is also what the kresctl utility uses by default.
The API settings can be changed only in the /etc/knot-resolver/config.yml configuration file:
management:
  interface: 127.0.0.1@5000
  # or use a unix socket instead of an interface
  # unix-socket: /my/new/socket.sock
The first version of the configuration API is available on the /v1/config HTTP endpoint.
The configuration API supports the following HTTP request methods:
HTTP request methods and their operations:
GET /v1/config[/path]: returns current configuration with an ETag
PUT /v1/config[/path]: upsert (try to update; if the item does not exist, insert); appends to an array
DELETE /v1/config[/path]: deletes an existing property or list item at the given index
Note
The management API has other useful endpoints (metrics, schema, …); see the detailed API documentation.
path:
Determines a specific configuration option or a configuration subtree on that path.
Items in lists and dictionaries are reachable using indexes /list-name/{index}/ and keys /dict-name/{key}/.
payload:
JSON or YAML encoding is used for configuration payload.
Note
Some configuration options cannot be configured via the API, for stability and security reasons (e.g. the API configuration itself).
An attempt to configure such an option is rejected.
The Knot Resolver Manager is capable of dynamically changing its configuration via an HTTP API or by reloading its config file. Both methods are equivalent in terms of their capabilities. The kresctl utility uses the HTTP API and provides a convenient command-line interface.
To reload the configuration file, send the SIGHUP signal to the Manager process. The original configuration file will be read again and validated, and if there are no errors, the changes will be applied.
Note: You can also send SIGHUP to the top-level process, supervisord. Normally, supervisord would stop all processes and reload its configuration when it receives SIGHUP. However, we have eliminated this footgun in order to prevent anyone from accidentally shutting down the whole resolver; instead, the signal is only forwarded to the Manager.
By default, the Manager exposes its HTTP API on the manager.sock Unix socket in the resolver’s rundir (typically /run/knot-resolver/). However, you can change where it listens by changing the management.interface config option. To use kresctl, you have to tell it this value.
Note: The v1 version qualifier is there for future-proofing. We currently don’t have any plans to change the API. If that happens, we will support both the old and new API versions for some transition period.
The API by default expects JSON, but can also parse YAML when the Content-Type header is set to application/yaml or text/vnd.yaml. The return value is always a JSON with Content-Type:application/json. The schema of input and output is always a subtree of the configuration data model which is described by the JSON schema exposed at /schema.
The API can operate on any configuration subtree by specifying a JSON pointer in the URL path (property names and list indices joined with /). For example, to get the number of worker processes, you can send a GET request to /v1/config/workers.
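For example, over the default management socket (with curl’s --unix-socket option the URL host is arbitrary):
$ curl --unix-socket /run/knot-resolver/manager.sock http://localhost/v1/config/workers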
The different HTTP methods perform different modifications of the configuration:
GET: returns a subtree of the current configuration
PUT: sets a property
DELETE: removes the given property or list item at the given index
To prevent race conditions when changing configuration from multiple clients simultaneously, every response from the Manager has an ETag header set. Requests then accept If-Match and If-None-Match headers with the latest ETag value; when the value does not match, request processing fails with HTTP error code 412 (Precondition Failed).
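A sketch of such a guarded update (the socket path, option and payload are illustrative):
$ etag=$(curl -sI --unix-socket /run/knot-resolver/manager.sock http://localhost/v1/config | sed -n 's/^[Ee][Tt]ag: //p' | tr -d '\r')
$ curl --unix-socket /run/knot-resolver/manager.sock -X PUT -H "If-Match: $etag" -H 'Content-Type: application/json' -d '4' http://localhost/v1/config/workers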
kresctl is a command-line utility that helps communicate with the management API.
It also provides tooling to work with the declarative configuration (validate, convert).
Most commands require a connection to the management API.
With the default Knot Resolver configuration, kresctl should communicate with the resolver without the need to specify the --socket option.
If not, this option must be set for each command.
The following positional arguments determine what kind of command will be executed.
Only one of these arguments can be selected during the execution of a single kresctl command.
Tells the resolver to reload the YAML configuration file.
Old processes are replaced by new ones (with the updated configuration) using rolling restarts,
so there will be no DNS service unavailability during the reload operation.
Requires connection to the management API.
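For example, with the default management socket no extra options are needed; otherwise point kresctl at your socket:
$ kresctl reload
$ kresctl --socket /run/knot-resolver/manager.sock reload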
Version 6 of Knot Resolver brings one significant change: it introduces the Knot Resolver Manager, a new way of interacting with Knot Resolver. The Manager brings several new features:
new declarative configuration
HTTP API to change configuration on the fly without downtime
it hides complexities of running multiple instances of kresd
Now, you might be worried about the future of kresd. No worries: you can use kresd directly the same way you did before, nothing changes there right now. However, in the long run we might make breaking changes in the way kresd is configured, and using it directly is from now on considered advanced usage.
With the release of version 6, there is a new way to configure and control your running kresd instances,
so that you don’t have to configure multiple systemd services; the new Knot Resolver Manager handles that for you.
In the table below, you can find a comparison of how things were done before and how they can be done now.
This section summarizes steps required when upgrading to newer Knot Resolver versions.
We advise users to also read Release notes for respective versions.
Section Module changes is relevant only for users who develop or use third-party modules.
The following section provides information about selected changes in not-yet-released versions.
We advise users to prepare for these changes sooner rather than later to make it easier to upgrade to
newer versions when they are released.
Function cache.zone_import was removed;
you can use ffi.C.zi_zone_import instead (different API).
When using PROXYv2 protocol, the meaning of qsource.flags and qsource.comm_flags
in kr_request changes so that flags describes the original client
communicating with the proxy, while comm_flags describes the proxy communicating
with the resolver. When there is no proxy, flags and comm_flags are the same.
DoH over HTTP/1 and unencrypted transports is still available in
the legacy http module (kind='doh').
This module will not receive any more bugfixes and will eventually be removed.
Users of Control sockets API need to terminate each command sent to resolver with newline
character (ASCII \n). Correct usage: cache.stats()\n.
Newline terminated commands are accepted by all resolver versions >= 1.0.0.
DNS Flag Day 2020 is now effective and Knot Resolver limits the
maximum size of UDP answers to 1232 bytes. Please double-check your firewall;
it has to allow DNS traffic on UDP and also TCP port 53.
Human readable output in interactive mode and from Control sockets was improved and,
as a consequence, its format slightly changed. Users who need machine readable output for scripts
should use Lua function tojson() to convert Lua values into standard JSON format instead
of attempting to parse the human readable output.
For example API call tojson(cache.stats())\n will return JSON string with cache.stats()
results represented as dictionary.
Function tojson() is available in all resolver versions >= 1.0.0.
Lua variable worker.id is now a string with either the systemd instance name or the PID
(instead of a number). If your custom configuration uses the worker.id value, please
check your scripts.
Modules which were using kr_ranked_rrarray_add() should note that on success it no longer returns exclusively zero but index into the array (non-negative). Error states are unchanged (negative).
-f / --forks command-line option is deprecated.
In case you just want to trigger non-interactive mode, there’s new -n / --noninteractive.
This forking style was not ergonomic;
with independent kresd processes you can better utilize a process manager (e.g. systemd).
Network interfaces are now configured in kresd.conf with
net.listen() instead of systemd sockets (#485). See
the following examples.
Tip
You can find suggested network interface settings based on your
previous systemd socket configuration in
/var/lib/knot-resolver/.upgrade-4-to-5/kresd.conf.net which is created
during the package update to version 5.x.
In case you wrote your own module which directly calls function
kr_ranked_rrarray_add(), you need to additionally call function
kr_ranked_rrarray_finalize() after each batch (before changing
the added memory regions). For a specific example see changes in dns64 module.
In case you are using your own custom modules, move them to the new module
location. The exact location depends on your distribution. Generally, modules previously
in /usr/lib/kdns_modules should be moved to /usr/lib/knot-resolver/kres_modules.
trust_anchors.file, trust_anchors.config() and trust_anchors.negative
aliases were removed to avoid duplication and confusion. Migration table:
trust_anchors.file = path → trust_anchors.add_file(path)
trust_anchors.config(path, readonly) → trust_anchors.add_file(path, readonly)
trust_anchors.negative = nta_set → trust_anchors.set_insecure(nta_set)
trust_anchors.keyfile_default is no longer accessible and can be set
only at compile time. To turn off DNSSEC, use trust_anchors.remove(). Migration table:
trust_anchors.keyfile_default = nil → trust_anchors.remove('.')
Network for HTTP endpoints is now configured using the same mechanism as for normal DNS endpoints;
please refer to the chapter Networking and protocols.
meson build system is now used for compiling the project. For instructions, see
the Building from sources. Packagers should pay attention to section Packaging
for information about systemd unit files and trust anchors.
Embedding LMDB is no longer supported, lmdb is now required as an external dependency.
Trust anchors file from upstream is installed and used as default unless you
override keyfile_default during build.
Default module location has changed from {libdir}/kdns_modules to
{libdir}/knot-resolver/kres_modules. Modules are now in the lua namespace
kres_modules.*.
kr_straddr_split() API has changed.
C modules defining *_layer or *_props symbols need to use a different style, but it’s typically a trivial change.
Instead of exporting the corresponding symbols, the module should assign pointers to its static structures inside its *_init() function. For an example migration, see the bogus_log module.
Module Static hints has option hints.use_nodata() enabled by default,
which is what most users expect. Add hints.use_nodata(false) to your config
to revert to the old behavior.
Modules cookie and version were removed.
Please remove relevant configuration lines with modules.load() and modules=
from configuration file.
Valid configuration must open cache using cache.open() or cache.size=
before executing cache operations like cache.clear().
(Older versions were silently ignoring such cache operations.)
Version number format is major.minor.patch.
Knot Resolver does not use semantic versioning even though the version number looks similar.
The leftmost number that changed signals what to expect when upgrading:
Major version
Manual upgrade steps might be necessary, please follow instructions in Upgrading section.
Major releases may contain significant changes including changes to configuration format.
We might release a new major also when internal implementation details change significantly.
Minor version
Configuration stays compatible with the previous version, except for undocumented or very obscure options.
Upgrade should be seamless for users who use modules shipped as part of Knot Resolver distribution.
Incompatible changes in internal APIs are allowed in minor versions. Users who develop or use custom modules
(i.e. modules not distributed together with Knot Resolver) need to double check their modules for incompatibilities.
Upgrading section should contain hints for module authors.
Patch version
Everything should be compatible with the previous version.
API for modules should be stable on best effort basis, i.e. API is very unlikely to break in patch releases.
Custom modules might need to be recompiled, i.e. ABI compatibility is not guaranteed.
This definition is not applicable to versions older than 5.2.0.
avoid excessive TCP reconnections in some cases (!1380)
For example, a DNS server that just closes connections without answer
could cause lots of work for the resolver (and itself, too).
The number of connections could be up to around 100 per client’s query.
We thank Xiang Li from NISL Lab, Tsinghua University,
and Xuesong Bai and Qifan Zhang from DSP Lab, UCI.
validator: fix 5.3.1 regression on over-limit NSEC3 edge case (!1169)
Assertion might be triggered by query/answer, potentially DoS.
CVE-2021-40083 was later assigned.
new cache garbage collector is available and enabled by default (#257)
This improves cache efficiency on big installations.
DNS-over-HTTPS: unknown HTTP parameters are ignored to improve compatibility
with non-standard clients (!832)
DNS-over-HTTPS: answers include access-control-allow-origin: * (!823)
which allows JavaScript to use DoH endpoint.
http module: support named AF_UNIX stream sockets (again)
aggressive caching is disabled on minimal NSEC* ranges (!826)
This improves cache effectiveness with DNSSEC black lies and also accidentally
works around a bug in proofs-of-nonexistence from F5 BIG-IP load-balancers.
aarch64 support, even kernels with ARM64_VA_BITS >= 48 (#216, !797)
This is done by working around a LuaJIT incompatibility. Please report bugs.
lua tables for C modules are more strict by default, e.g. nsid.foo
will throw an error instead of returning nil (!797)
systemd: basic watchdog is now available and enabled by default (#275)
cache: fail lua operations if cache isn’t open yet (!639)
By default cache is opened after reading the configuration,
and older versions were silently ignoring cache operations.
Valid configuration must open cache using cache.open() or cache.size =
before executing cache operations like cache.clear().
libknot >= 2.7.1 is required, which also brings larger API changes
cache server unavailability to prevent flooding unreachable servers
(please note that the caching algorithm needs further optimization
and will change in future versions, but we need to gather operational
experience first)
systemd: change unit files to allow running multiple instances,
deployments with single instance now must use kresd@1.service
instead of kresd.service; see kresd.systemd(7) for details
systemd: the directory for cache is now /var/cache/knot-resolver
unify default directory and user to knot-resolver
directory with trust anchor file specified by -k option must be writeable
policy module is now loaded by default to enforce RFC 6761;
see documentation for policy.PASS if you use locally-served DNS zones
drop support for alternative cache backends memcached, redis,
and for Lua bindings for some specific cache operations
REORDER_RR option is not implemented (temporarily)
aggressive caching of validated records (RFC 8198) for NSEC zones;
thanks to ICANN for sponsoring this work.
forwarding over TLS, authenticated by SPKI pin or certificate.
policy.TLS_FORWARD pipelines queries out-of-order over shared TLS connection
Beware: Some resolvers do not support out-of-order query processing.
TLS forwarding to such resolvers will lead to slower resolution or failures.
trust anchors: you may specify a read-only file via -K or --keyfile-ro
trust anchors: at build-time you may set KEYFILE_DEFAULT (read-only)
ta_sentinel module implements draft ietf-dnsop-kskroll-sentinel-00,
enabled by default
serve_stale module is a prototype, subject to change
fix CVE-2018-1000002: insufficient DNSSEC validation, allowing
attackers to deny existence of some data by forging packets.
Some combinations pointed out in RFC 6840 sections 4.1 and 4.3
were not taken into account.
This is an experimental release meant for testing aggressive caching.
It contains some regressions and might (theoretically) be even vulnerable.
The current focus is to minimize queries into the root zone.
lua: query flag-sets are no longer represented as plain integers.
kres.query.* no longer works, and kr_query_t lost trivial methods
‘hasflag’ and ‘resolved’.
You can instead write code like qry.flags.NO_0X20 = true.
Fix a critical DNSSEC flaw. Signatures might be accepted as valid
even if the signed data was not in bailiwick of the DNSKEY used to
sign it, assuming the trust chain to that DNSKEY was valid.
Refactor handling of AD flag and security status of resource records.
In some cases it was possible for secure domains to get cached as
insecure, even for a TLD, leading to disabled validation.
It also fixes answering with non-authoritative data about nameservers.
major feature: support for forwarding with validation (#112).
The old policy.FORWARD action now does that; the previous non-validating
mode is still available as policy.STUB, except that it also uses caching (#122).
command line: specify ports via @ but still support # for compatibility
policy: recognize 100.64.0.0/10 as local addresses
layer/iterate: do retry repeatedly if REFUSED, as we can’t yet easily
retry with other NSs while avoiding retrying with those who REFUSED
modules: allow changing the directory where modules are found,
and do not search the default library path anymore.
trust anchors: Improve trust anchors storage format (#167)
trust anchors: support non-root TAs, one domain per file
policy.DENY: set AA flag and clear AD flag
lib/resolve: avoid unnecessary DS queries
lib/nsrep: don’t treat servers with NOIP4 + NOIP6 flags as timed out
layer/iterate: During packet classification (answer vs. referral)
don’t analyze AUTHORITY section in authoritative answer if ANSWER
section contains records that have been requested
Under certain conditions, a cached negative answer from a CD query
would be reused to construct response for non-CD queries, resulting
in Insecure status instead of Bogus. Only 1.2.0 release was affected.
As mentioned in the getting started section, Knot Resolver is split into several components, namely the manager, kresd and the garbage collector. In addition to these custom components, we also rely on supervisord.
There are two different control structures in place. Semantically, the manager controls every other component in Knot Resolver: it processes configuration and passes it on to every other component. As a user, you will always interact with the manager (or kresd). At the same time though, the manager is not the root of the process hierarchy; supervisord sits at the top of the process tree and runs everything else.
Note
The rationale for this inverted process hierarchy is mainly stability. Supervisord sits at the top because it is reliable and stable software we can depend upon. It also does not process user input and is therefore shielded from data processing bugs. This way, any component in Knot Resolver can crash and restart without impacting the rest of the system.
The inverted process hierarchy complicates the resolver’s launch procedure. You might notice it when reading the manager’s logs just after startup. What happens on a cold start is:
The manager starts, reads its configuration and generates a new supervisord configuration. Then it starts supervisord by using exec.
Supervisord loads its configuration, loads our extensions and starts a new instance of the manager.
The manager starts again, this time as a child of supervisord. As this is the desired state, it loads the configuration once more and instructs supervisord to start new instances of kresd.
Knot Resolver is designed to handle failures automatically. Anything except supervisord will restart automatically. If a failure is irrecoverable, all processes will stop and nothing will be left behind in a half-broken state. While a total failure like this should never happen, it is possible, and you should not rely on a single instance of Knot Resolver for a highly-available system.
Note
The ability to restart most of the components without downtime means that Knot Resolver is able to transparently apply updates while running.
This guide is intended for advanced users and developers. You don’t have to know and understand any of this to use Knot Resolver.
The manager is a component written in Python, with a bit of C used for native extension modules. The main goal of the manager is to ensure the system is set up according to a given configuration and to provide a user-friendly interface; performance is secondary to correctness.
The manager is mostly modelled around a config processing pipeline:
The API server is implemented using aiohttp. This framework provides the application skeleton and manages the application runtime. The manager is actually a normal web application, with the slight difference that we don’t save the data in a database but rather modify the state of other processes.
The code of the API server is located in a single source file. It also contains a description of the manager’s startup procedure.
From the web framework, we receive data as simple strings and we need to parse and validate them. Due to packaging issues in distros, we rolled our own solution, not dissimilar to the Python library Pydantic.
Our tool lets us model config schema similarly to how Python’s native dataclasses are constructed. As input, it takes Python’s dicts taken from PyYAML or JSON parser. The dict is mapped onto predefined Python classes while enforcing typing rules. If desired, the mapping step is performed multiple times onto different classes, which allows us to process intermediary values such as auto.
There are two relevant places in the source code - our generic modelling tools and the actual configuration data model. Just next to the data model in the templates directory, there are Jinja2 templates for generating Lua code from the configuration.
The actual core of the whole application is originally named the manager. It keeps a high-level view of the systems state and performs all necessary operations to change the state to the desired one. In other words, manager is the component handling rolling restarts, config update logic and more.
Let’s make a sidestep and talk about abstractions. The manager component mentioned above interacts with a generic backend (or, as we sometimes call it, a subprocess manager). The idea is that the interactions with the backend do not depend on the backend’s implementation, and we can choose which one we want to use. Historically, we had two different backend implementations: systemd and supervisord. However, systemd turned out to be inappropriate, as it did not fit our needs, so we removed it. The abstraction remains though, and it should be possible to implement a different subprocess manager if that turns out to be useful. Please note, though, that the abstraction might be somewhat leaky in practice, as there is only one implementation.
Communication with supervisord happens on pretty much all possible levels. We edit its configuration file, we use its XMLRPC API, we use Unix signals and we even attach to it from within its Python runtime. The interface is honestly a bit messy and we had to use all we could to make it user friendly.
First, we generate supervisord’s configuration file. The configuration file sets stage for further communication by specifying location of the pidfile and API Unix socket. It prepares configuration for subprocesses and most significantly, it loads our custom extensions.
The extensions don’t use a lot of code. There are four of them: the simplest one provides a speedier XMLRPC API for starting processes, removing delays that are not necessary for our use case. Another one implements systemd’s sd_notify() API for supervisord, so we can track the lifecycle of kresd processes more precisely. Another extension changes the way logging works, and the last extension monitors the lifecycle of the manager and forwards some signals.
Note
The extensions mentioned above use monkeypatching to achieve their design goals. We settled for this approach, because supervisord’s codebase appears mostly stable. The code we patch has not been changed for years. Other option would be forking supervisord and vendoring it. We decided against that mainly due to packaging complications it would cause with major Linux distributions.
For executing subprocesses, we don’t actually change the configuration file; we only use the XMLRPC API and tell supervisord to start already configured programs. For one specific call, though, we use our extension instead of the built-in method of starting processes, as it is significantly faster.
The garbage collector is a simple component which keeps the shared cache from overfilling.
Every second it estimates cache usage and if over 80%, records get deleted in order to free 10%. (Parameters can be configured.)
The freeing happens in a few passes. First all items are classified by their estimated usefulness, in a simple way based on remaining TTL, type, etc.
From this histogram it’s computed which “level of usefulness” will become the threshold, so that roughly the planned total size gets freed.
Then all items are passed to collect the set of keys to delete, and finally the deletion is performed.
As longer transactions can cause issues in LMDB, all passes are split into short batches.
Knot Resolver is written for UNIX-like systems using modern C standards.
Beware that some 64-bit systems with LuaJIT 2.1 may be affected by
a problem
– Linux on x86_64 is unaffected but Linux on aarch64 is.
Knot Resolver uses the apkg tool for upstream packaging.
It allows building packages locally for supported distributions, which it can then install.
apkg also takes care of dependencies itself.
First, you need to install and setup apkg.
Tip
Install apkg with pipx to avoid version conflicts.
$ pip3 install apkg
$ apkg system-setup
Clone the knot-resolver git repository and change into its directory.
The apkg status command can be used to find out useful information, such as whether the current distribution is supported.
When apkg is ready, a package can be built and installed.
# takes care of dependencies
apkg build-dep
# build package
apkg build
# (build and) install package; builds the package when it is not already built
apkg install
Knot Resolver uses Meson Build system.
Shell snippets below should be sufficient for basic usage
but users unfamiliar with Meson might want to read introductory
article Using Meson.
Additional dependencies are needed to build and run Knot Resolver with manager:
All dependencies are also listed in pyproject.toml which is our authoritative source.
On reasonably new systems most of the dependencies can be resolved from packages,
here’s an overview for several platforms.
Debian/Ubuntu - Current stable doesn’t have new enough Meson
and libknot. Use repository above or build them yourself. Fresh list of dependencies can be found in Debian control file in our repo, search for “Build-Depends”.
CentOS/Fedora/RHEL/openSUSE - Fresh list of dependencies can be found in RPM spec file in our repo, search for “BuildRequires”.
FreeBSD - when installing from ports, all dependencies will install
automatically, corresponding to the selected options.
Mac OS X - the dependencies can be obtained from Homebrew formula.
The following meson command creates a new build directory named build_dir, configures the installation path to /tmp/kr and enables a static build (to allow installation to a non-standard path).
It also sets one of the Build options, in this case enabling the manager, which is disabled by default.
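A sketch of such an invocation (the exact option names are assumptions based on the build options described below):
$ meson setup build_dir --prefix=/tmp/kr --default-library=static -Dmanager=enabled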
It’s possible to change the compilation with build options. These are useful to
packagers or developers who wish to customize the daemon behaviour, run
extended test suites etc. By default, these are all set to sensible values.
For complete list of build options create a build directory and run:
$ meson setup build_dir
$ meson configure build_dir
To customize project build options, use -Doption=value when creating
a build directory:
$ meson setup build_dir -Ddoc=enabled
… or change options in an already existing build directory:
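For example (presumably mirroring the option above):
$ meson configure build_dir -Ddoc=enabled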
The following tests require a working installation of kresd. The
binary kresd found in $PATH will be tested. When testing through meson,
$PATH is modified automatically and you just need to make sure to install
kresd first.
Config tests utilize the kresd’s lua config file to execute arbitrary tests,
typically testing various modules, their API etc.
To enable these tests, specify -Dconfig_tests=enabled option for meson.
Multiple dependencies are required (refer to meson’s output when configuring
the build dir).
The extra tests require a large set of additional dependencies and executing
them outside of upstream development is probably redundant.
To enable these tests, specify -Dextra_tests=enabled option for meson.
Multiple dependencies are required (refer to meson’s output when configuring
the build dir). Enabling extra_tests automatically enables config tests as
well.
Integration tests
The integration tests use Deckard, the DNS test harness. The tests simulate specific DNS
scenarios, including authoritative servers and their responses. These tests rely
on Linux namespaces; refer to the Deckard documentation for more info.
The pytest suite is designed to spin up a kresd instance, acquire a connected
socket, and then perform tests on it. These tests are used to test, for
example, TCP, TLS and their connection management.
To check for documentation dependencies and allow its installation, use
-Ddoc=enabled. The documentation doesn’t build automatically; instead, the
target doc must be called explicitly.
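For example, with ninja (assuming the build directory created above):
$ ninja -C build_dir doc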
It’s recommended to use the upstream system unit files. If any customizations
are required, drop-in files should be used, instead of patching/changing the
unit files themselves.
To install systemd unit files, use the -Dsystemd_files=enabled build option.
To support enabling services after boot, you must also link kresd.target to
multi-user.target.wants:
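For example (the unit directory path is illustrative and depends on the distro):
$ ln -s ../kresd.target /usr/lib/systemd/system/multi-user.target.wants/kresd.target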
If the target distro has externally managed (read-only) DNSSEC trust anchors
or root hints use this:
-Dkeyfile_default=/usr/share/dns/root.key
-Droot_hints=/usr/share/dns/root.hints
-Dmanaged_ta=disabled
In case you want to have automatically managed DNSSEC trust anchors instead,
set -Dmanaged_ta=enabled and make sure both keyfile_default file and
its parent directories are writable by kresd process (after package installation!).
The library as described provides basic services for name resolution, which should cover most usage;
examples are in the resolve API documentation.
Tip
If you’re migrating from getaddrinfo(), see the “synchronous” API; the library also offers an iterative API, e.g. to plug it into your event loop.
The resolution process starts with the functions in resolve.c, they are responsible for:
reacting to state machine state (i.e. calling consume layers if we have an answer ready)
interacting with the library user (i.e. asking caller for I/O, accepting queries)
fetching assets needed by layers (i.e. zone cut)
This is the driver. The driver is not meant to know “how” the query resolves, but rather “when” to execute “what”.
On the other side are layers. They are responsible for dissecting the packets and informing the driver about the results. For example, a produce layer generates query, a consume layer validates answer.
Tip
Layers are executed asynchronously by the driver. If you need some asset beforehand, you can signal the driver by returning a state or by setting current query flags. For example, setting the AWAIT_CUT flag forces the driver to fetch zone cut information before the packet is consumed; setting the RESOLVED flag makes it pop a query after the current set of layers is finished; returning the FAIL state makes it fail the current query.
Layers can also change the course of resolution, for example by appending additional queries.
This doesn’t block the currently processed query, and the newly created sub-request will start as soon as the driver finishes processing the current one. In some cases you might need to issue a sub-request and process it before continuing with the current one; e.g. the validator may need a DNSKEY before it can validate signatures. In this case, layers can yield and resume afterwards.
consume = function (state, req, answer)
	if state == kres.YIELD then
		print('continuing yielded layer')
		return kres.DONE
	else
		if answer:qtype() == kres.type.NS then
			local qry = req:push(answer:qname(), kres.type.SOA, kres.class.IN)
			qry.flags.AWAIT_CUT = true
			print('planned SOA query, yielding')
			return kres.YIELD
		end
		return state
	end
end
The YIELD state is a bit special. When a layer returns it, it interrupts the current walk through the layers. When the layer receives it,
it means that it yielded before and is now resumed. This is useful in situations where you need a sub-request to determine whether the current answer is valid or not.
FIXME: this dev-docs section is outdated! Better see comments in files instead, for now.
The resolver library leverages the processing API from the libknot to separate packet processing code into layers.
Note
This is only crash-course in the library internals, see the resolver library documentation for the complete overview of the services.
The library offers the following services:
Cache - MVCC cache interface for retrieving/storing resource records.
Resolution plan - Query resolution plan, a list of partial queries (with hierarchy) sent in order to satisfy original query. This contains information about the queries, nameserver choice, timing information, answer and its class.
Nameservers - Reputation database of nameservers, this serves as an aid for nameserver choice.
A processing layer is going to be called by the query resolution driver for each query,
so you’re going to work with struct kr_request as your per-query context.
This structure contains pointers to resolution context, resolution plan and also the final answer.
This is only passive processing of the incoming answer. If you want to change the course of resolution, say satisfy a query from a local cache before the library issues a query to the nameserver, you can use states (see the Static hints for example).
int produce(kr_layer_t *ctx, knot_pkt_t *pkt)
{
	struct kr_request *req = ctx->req;
	struct kr_query *qry = req->current_query;

	/* Query can be satisfied locally. */
	if (can_satisfy(qry)) {
		/* This flag makes the resolver move the query
		 * to the "resolved" list. */
		qry->flags.RESOLVED = true;
		return KR_STATE_DONE;
	}

	/* Pass-through. */
	return ctx->state;
}
It is possible to not only act during the query resolution, but also to view the complete resolution plan afterwards. This is useful for analysis-type tasks, or “per answer” hooks.
int finish(kr_layer_t *ctx)
{
	struct kr_request *req = ctx->req;
	struct kr_rplan *rplan = req->rplan;

	/* Print the query sequence with start time. */
	char qname_str[KNOT_DNAME_MAXLEN];
	struct kr_query *qry = NULL;
	WALK_LIST(qry, rplan->resolved) {
		knot_dname_to_str(qname_str, qry->sname, sizeof(qname_str));
		printf("%s at %u\n", qname_str, qry->timestamp);
	}
	return ctx->state;
}
The APIs in the Lua world try to mirror the C APIs using LuaJIT FFI, with several differences and enhancements.
There is no comprehensive guide on the API yet, but you can have a look at the bindings file.
The packet is the data structure that you’re going to see in layers very often. It consists of a header and four sections: QUESTION, ANSWER, AUTHORITY, ADDITIONAL. The first section is special, as it contains the query name, type, and class; the rest of the sections contain RRSets.
First you need to convert it to a type known to FFI and check basic properties. Let’s start with a snippet of a consume layer.
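A minimal sketch of such a layer; the RCODE check is illustrative:
consume = function (state, req, pkt)
	-- Inspect basic properties of the incoming answer
	if pkt:rcode() == kres.rcode.NXDOMAIN then
		print('NXDOMAIN for', kres.dname2str(pkt:qname()))
	end
	return state
end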
During produce or begin, you might want to write to the packet. Keep in mind that you have to write packet sections in sequence;
e.g. you can’t write to ANSWER after writing AUTHORITY. It’s like stages where you can’t go back.
pkt:rcode(kres.rcode.NXDOMAIN)

-- Clear answer and write QUESTION
pkt:recycle()
pkt:question('\7blocked', kres.class.IN, kres.type.SOA)

-- Start writing data
pkt:begin(kres.section.ANSWER)
-- Nothing in answer
pkt:begin(kres.section.AUTHORITY)
local soa = { owner = '\7blocked', ttl = 900, class = kres.class.IN, type = kres.type.SOA, rdata = '...' }
pkt:put(soa.owner, soa.ttl, soa.class, soa.type, soa.rdata)
The request holds information about currently processed query, enabled options, cache, and other extra data.
You primarily need to retrieve currently processed query.
consume = function (state, req, pkt)
	print(req.options)
	print(req.state)

	-- Print information about current query
	local current = req:current()
	print(kres.dname2str(current.owner))
	print(current.stype, current.sclass, current.id, current.flags)
end
In layers that either begin or finalize, you can walk the list of resolved queries.
local last = req:resolved()
print(last.stype)
As described in the layers, you can not only retrieve information about current query, but also push new ones or pop old ones.
-- Push new query
local qry = req:push(pkt:qname(), kres.type.SOA, kres.class.IN)
qry.flags.AWAIT_CUT = true

-- Pop the query, this will erase it from resolution plan
req:pop(qry)
struct knot_rdataset: field names were renamed to .count and .rdata
some functions got inlined from headers, but you can use their kr_* clones:
kr_rrsig_sig_inception(), kr_rrsig_sig_expiration(), kr_rrsig_type_covered().
Note that these functions now accept knot_rdata_t* instead of a pair
knot_rdataset_t* and size_t - you can use knot_rdataset_at() for that.
knot_rrset_add_rdata() doesn’t take TTL parameter anymore
knot_rrset_init_empty() was inlined, but in lua you can use the constructor
knot_rrset_ttl() was inlined, but in lua you can use :ttl() method instead
knot_pkt_qname(), _qtype(), _qclass(), _rr(), _section() were inlined,
but in lua you can use methods instead, e.g. myPacket:qname()
knot_pkt_free() takes knot_pkt_t* instead of knot_pkt_t**, but from lua
you probably didn’t want to use that; constructor ensures garbage collection.
This section is generated with doxygen and breathe. Due to their
limitations, some symbols may be incorrectly described or missing entirely.
For exhaustive and accurate reference, refer to the header files instead.
The library provides a “consumer-producer”-like iterative interface that enables the user to plug it into an existing event loop or I/O code.
Example usage of the iterative API:
// Create request and its memory pool
struct kr_request req = {
	.pool = {
		.ctx = mp_new(4096),
		.alloc = (mm_alloc_t) mp_alloc
	}
};

// Setup and provide input query
int state = kr_resolve_begin(&req, ctx);
state = kr_resolve_consume(&req, query);

// Generate answer
while (state == KR_STATE_PRODUCE) {
	// Additional query generated; do the I/O and pass back the answer
	state = kr_resolve_produce(&req, &addr, &type, query);
	while (state == KR_STATE_CONSUME) {
		int ret = sendrecv(addr, proto, query, resp);
		// If I/O fails, make "resp" empty
		state = kr_resolve_consume(&req, addr, resp);
		knot_pkt_clear(resp);
	}
	knot_pkt_clear(query);
}

// "state" is either DONE or FAIL
kr_resolve_finish(&req, state);
The rank meaning consists of one independent flag - KR_RANK_AUTH, and the rest have meaning of values where only one can hold at any time. You can use one of the enums as a safe initial value, optionally | KR_RANK_AUTH; otherwise it’s best to manipulate ranks via the kr_rank_* functions.
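A short sketch of manipulating ranks with the kr_rank_* helpers; the chosen flags are illustrative:
/* Safe initial value, marked authoritative. */
uint8_t rank = KR_RANK_INITIAL | KR_RANK_AUTH;
/* Replace the value part; the independent AUTH flag is kept as it was. */
kr_rank_set(&rank, KR_RANK_INSECURE);
if (kr_rank_test(rank, KR_RANK_AUTH)) {
	/* still authoritative */
}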
we have a chain of trust from TAs that cryptographically denies the possibility of existence of a positive chain of trust from the TAs to the record. Or it may be covered by a closer negative TA.
Ensure that request->answer is usable, and return it (for convenience).
It may return NULL, in which case it marks ->state with _FAIL and no answer will be sent. Only use this when it’s guaranteed that there will be no delay before sending it. You don’t need to call this in places where “resolver knows” that there will be no delay, but even there you need to check if the ->answer is NULL (unless you check for _FAIL anyway).
If the CONSUME is returned then dst, type and packet will be filled with appropriate values and caller is responsible to send them and receive answer. If it returns any other state, then content of the variables is undefined.
Parameters:
request – request state (in PRODUCE state)
dst – [out] possible address of the next nameserver
type – [out] possible used socket type (SOCK_STREAM, SOCK_DGRAM)
packet – [out] packet to be filled with additional query
The error is set only if it has a higher or the same priority as the one already assigned. The provided extra_text may be NULL, or a string that is allocated either statically, or on the request’s mempool. To clear any error, call it with KNOT_EDNS_EDE_NONE and NULL as extra_text.
To facilitate debugging, we include a unique base32 identifier at the start of the extra_text field for every call of this function. To generate such an identifier, you can use the command: $ base32 /dev/random | head -c 4
Parameters:
request – request state
info_code – extended DNS error code
extra_text – optional string with additional information
Keeps information about current query processing between calls to processing APIs, i.e. current resolved query, resolution plan, … Use this instead of the simple interface if you want to implement multiplexing or custom I/O.
Note
All data for this request must be allocated from the given pool.
Values from kr_rank, currently just KR_RANK_SECURE and _INITIAL. Only read this in finish phase and after validator, please. Meaning of _SECURE: all RRs in answer+authority are _SECURE, including any negative results implied (NXDOMAIN, NODATA).
The structure most importantly holds the original query, answer and the list of pending queries required to resolve the original query. It also keeps a notion of current zone cut.
if nonzero is returned, there’s a big problem - you probably want to abort(), perhaps except for kr_error(EAGAIN) which probably indicates transient errors.
We store xNAME at NS type to lower the number of searches in closest_NS(). CNAME is only considered for equal name, of course. We also store NSEC* parameters at NS type.
Based on passed choices, choose the next transport.
Common function to both implementations (iteration and forwarding). The *_choose_transport functions from selection_*.h preprocess the input for this one.
Parameters:
choices – Options to choose from, see struct above
unresolved – Array of names that can be resolved (i.e. no A/AAAA record)
timeouts – Number of timeouts that occurred in this query (used for exponential backoff)
mempool – Memory context of current request
tcp – Force TCP as transport protocol
choice_index – [out] Optionally index of the chosen transport in the choices array.
Returns:
Chosen transport (on mempool) or NULL when no choice is viable
Note that this opens a cache transaction which is usually closed by calling put_rtt_state, i.e. callee is responsible for its closing (e.g. calling kr_cache_commit).
Timeout was capped to a maximum value based on the other candidates when choosing this transport.
The timeout therefore can be much lower than what we expect it to be. We basically probe the server for a sudden network change but we expect it to timeout in most cases. We have to keep this in mind when noting the timeout in cache.
Specifies an API for selecting transports and giving feedback on the choices.
The function pointers are to be used throughout resolver when some information about the transport is obtained. E.g. RTT in worker.c or RCODE in iterate.c,…
Finished successfully or a special case: in CONSUME phase this can be used (by iterator) to do a transition to PRODUCE phase again, in which case the packet wasn’t accepted for some reason.
Finalises the outbound query packet with the knowledge of the IP addresses.
The checkout layer doesn’t persist the state, so canceled subrequests don’t affect the resolution or rest of the processing. Lua API: call is omitted iff (state & KR_STATE_FAIL).
If the check fails, optionally fork()+abort() to generate coredump and continue running in parent process. Return value must be handled to ensure safe recovery from error. Use kr_require() for unrecoverable checks. The errno variable is not mangled, e.g. you can: if (kr_fails_assert(…)) return errno;
That’s normally not lower-cased. However, when receiving packets from upstream we xor-apply the secret during packet-parsing, so it would get lower-cased after that point if the case was right.
Write string representation for given address as “<addr>#<port>”.
It’s the same as kr_inaddr_str(), but the input address is input in native format like for inet_ntop() (4 or 16 bytes) and port must be separate parameter.
The semantics is “the same” as for memcmp(). The partial byte is considered with more-significant bits first, so this is e.g. suitable for comparing IP prefixes.
The specified number of bits in a from the left (network order) will remain their original value, while the rest will be set to zero. This is useful for storing network addresses in a trie.
How often kr_assert() should fork the process before issuing abort (if configured).
This can be useful for debugging rare edge-cases in production. if (kr_debug_assertion_abort && kr_debug_assertion_fork), it is possible to both obtain a coredump (from forked child) and recover from the non-fatal error in the parent process.
== 0 (false): no forking
> 0: minimum delay between forks (in milliseconds, each instance separately, randomized ±25%)
< 0: no rate-limiting (not recommended)
This small collection of “generics” was born out of frustration that I couldn’t find
anything like it for C: existing options are either bloated, have a poor interface, lack null-checking,
or don’t allow a custom allocation scheme. BSD-licensed (or compatible) code is allowed here,
as long as it comes with a test case in tests/test_generics.c.
array - a set of simple macros to make working with dynamic arrays easier.
A set of simple macros to make working with dynamic arrays easier.
MIN(array_push(arr, val), other)
May evaluate the code twice, leading to unexpected behaviour. This is a price to pay for the absence of proper generics.
Example usage:
array_t(const char *) arr;
array_init(arr);

// Reserve memory in advance
if (array_reserve(arr, 2) < 0) {
	return ENOMEM;
}

// Already reserved, cannot fail
array_push(arr, "princess");
array_push(arr, "leia");

// Not reserved, may fail
if (array_push(arr, "han") < 0) {
	return ENOMEM;
}

// It does not hide what it really is
for (size_t i = 0; i < arr.len; ++i) {
	printf("%s\n", arr.at[i]);
}

// Random delete
array_del(arr, 0);
Note
C has no generics, so this is implemented mostly using macros. Be aware of that, as direct usage of the macros inside other evaluating macros may lead to unexpected results:
Both the head and tail of the queue can be accessed and pushed to, but only the head can be popped from.
Example usage:
// define new queue type, and init a new queue instance
typedef queue_t(int) queue_int_t;
queue_int_t q;
queue_init(q);

// do some operations
queue_push(q, 1);
queue_push(q, 2);
queue_push(q, 3);
queue_push(q, 4);
queue_pop(q);
kr_require(queue_head(q) == 2);
kr_require(queue_tail(q) == 4);

// you may iterate
typedef queue_it_t(int) queue_it_int_t;
for (queue_it_int_t it = queue_it_begin(q); !queue_it_finished(it); queue_it_next(it)) {
	++queue_it_val(it);
}
kr_require(queue_tail(q) == 5);

queue_push_head(q, 0);
++queue_tail(q);
kr_require(queue_tail(q) == 6);

// free it up
queue_deinit(q);

// you may use dynamic allocation for the type itself
queue_int_t *qm = malloc(sizeof(queue_int_t));
queue_init(*qm);
queue_deinit(*qm);
free(qm);
Note
The implementation uses a singly linked list of blocks (“chunks”) where each block stores an array of values (for better efficiency).
Initialize a queue iterator at the head of the queue.
If you use this in assignment (instead of initialization), you will unfortunately need to add corresponding type-cast in front. Beware: there’s no type-check between queue and iterator!
A length-prefixed list of objects, also an array list.
Each object is prefixed by item length, unlike array this structure permits variable-length data. It is also equivalent to forward-only list backed by an array.
Todo:
If some mistake happens somewhere, the access may end up in an infinite loop. (equality comparison on pointers)
// Define new LRU type
typedef lru_t(int) lru_int_t;

// Create LRU
lru_int_t *lru;
lru_create(&lru, 5, NULL, NULL);

// Insert some values
int *pi = lru_get_new(lru, "luke", strlen("luke"), NULL);
if (pi)
	*pi = 42;
pi = lru_get_new(lru, "leia", strlen("leia"), NULL);
if (pi)
	*pi = 24;

// Retrieve values
int *ret = lru_get_try(lru, "luke", strlen("luke"), NULL);
if (!ret)
	printf("luke dropped out!\n");
else
	printf("luke's number is %d\n", *ret);

char *enemies[] = {"goro", "raiden", "subzero", "scorpion"};
for (int i = 0; i < 4; ++i) {
	int *val = lru_get_new(lru, enemies[i], strlen(enemies[i]), NULL);
	if (val)
		*val = i;
}

// We're done
lru_free(lru);
Note
The implementation tries to keep frequent keys and avoid others, even if “used recently”, so it may refuse to store it on lru_get_new(). It uses hashing to split the problem pseudo-randomly into smaller groups, and within each it tries to approximate relative usage counts of several most frequent keys/hashes. This tracking is done for more keys than those that are actually stored.
The X corresponds to the module name; if the module name is hints, the prefix for the constructor would be hints_init().
More details are in docs for the kr_module and kr_layer_api structures.
Note
The modules get ordered – by default in the same as the order in which they were loaded. The loading command can specify where in the order the module should be positioned.
Probably the most convenient way of writing modules is Lua, since you can use already installed modules
from the system and have first-class access to the scripting engine. You can also tap into all the events that
the C API has access to, but keep in mind that transitioning from a C to a Lua function is slower than
the other way round, especially when JIT compilation is taken into account.
Note
The Lua functions receive an additional first parameter compared to their C counterparts: a “state”.
Most useful C functions and structures have Lua FFI wrappers, sometimes with extra sugar.
The modules follow the Lua way, where the module interface is returned in a named table.
--- @module Count incoming queries
local counter = {}

function counter.init(module)
	counter.total = 0
	counter.last = 0
	counter.failed = 0
end

function counter.deinit(module)
	print('counted', counter.total, 'queries')
end

-- @function Run the q/s counter with given interval.
function counter.config(conf)
	-- We can use the scripting facilities here
	if counter.ev then
		event.cancel(counter.ev)
	end
	counter.ev = event.recurrent(conf.interval, function ()
		print(counter.total - counter.last, 'q/s')
		counter.last = counter.total
	end)
end

return counter
The created module can be then loaded just like any other module, except it isn’t very useful since it
doesn’t provide any layer to capture events. The Lua module can however provide a processing layer, just
like its C counterpart.
-- Notice it isn't a function, but a table of functions
counter.layer = {
	begin = function (state, data)
		counter.total = counter.total + 1
		return state
	end,
	finish = function (state, req, answer)
		if state == kres.FAIL then
			counter.failed = counter.failed + 1
		end
		return state
	end
}
There is currently an additional “feature” in comparison to C layer functions:
some functions do not get called at all if state==kres.FAIL;
see docs for details: kr_layer_api.
Since the modules are like any other Lua modules, you can interact with them through the CLI and any other interface.
Tip
Module discovery: kres_modules. is prepended to the module name and the Lua search path is used on that.
As almost all the functions are optional, the minimal module looks like this:
#include"lib/module.h"/* Convenience macro to declare module ABI. */KR_MODULE_EXPORT(mymodule)
Let’s define an observer thread for the module as well. It’s going to be a stub for the sake of brevity,
but you can, for example, create a condition variable and notify the thread from query processing by declaring
a module layer (see Writing layers).
static void *observe(void *arg)
{
	/* ... do some observing ... */
}

int mymodule_init(struct kr_module *module)
{
	/* Create a thread and start it in the background. */
	pthread_t thr_id;
	int ret = pthread_create(&thr_id, NULL, &observe, NULL);
	if (ret != 0) {
		return kr_error(errno);
	}

	/* Keep it in the thread */
	module->data = thr_id;
	return kr_ok();
}

int mymodule_deinit(struct kr_module *module)
{
	/* ... signalize cancellation ... */
	void *res = NULL;
	pthread_t thr_id = (pthread_t) module->data;
	int ret = pthread_join(thr_id, res);
	if (ret != 0) {
		return kr_error(errno);
	}

	return kr_ok();
}
This example shows how a module can run in the background; this enables you to, for example, observe
and publish data about query resolution.
A module can offer a NULL-terminated list of properties; each property is essentially a callable with free-form JSON input/output.
JSON was chosen as an interchange format that doesn’t require any schema beforehand, so you can do two things: query the module properties
from external applications, or between modules (e.g. the statistics module can query the cache module for memory usage).
JSON was chosen not because it’s the most efficient protocol, but because it’s easy to read and write and to interface with the outside world.
Note
The void *env is a generic module interface. Since we’re implementing daemon modules, the pointer can be cast to struct engine *.
This is guaranteed by the implemented API version (see Writing a module in C).
Here’s an example how a module can expose its property:
char *get_size(void *env, struct kr_module *m, const char *args)
{
	/* Get cache from engine. */
	struct engine *engine = env;
	struct kr_cache *cache = &engine->resolver.cache;

	/* Read item count */
	int count = (cache->api)->count(cache->db);
	char *result = NULL;
	asprintf(&result, "{ \"result\": %d }", count);

	return result;
}

struct kr_prop *cache_props(void)
{
	static struct kr_prop prop_list[] = {
		/* Callback, Name, Description */
		{ &get_size, "get_size", "Return number of records." },
		{ NULL, NULL, NULL }
	};
	return prop_list;
}

KR_MODULE_EXPORT(cache)
Once you load the module, you can call the module property from the interactive console.
Note: the JSON output will be transparently converted to Lua tables.
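For example, with the module above loaded (the printed form of the table is illustrative):
> cache.get_size()
-- prints a Lua table, e.g. { result = 42 }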
This chapter describes how to create custom HTTP services inside Knot Resolver.
Please read HTTP module basics in chapter Other HTTP services before continuing.
Each network address+protocol+port combination configured using net.listen()
is associated with a kind of endpoint, e.g. doh_legacy or webmgmt.
Each of these kind names is associated with a table of HTTP endpoints,
and the default table can be replaced using the http.config() configuration call,
which allows you to provide your own HTTP endpoints.
Items in the table of HTTP endpoints are small tables describing a triplet - {mime, on_serve, on_websocket}.
In order to register a new service in the webmgmt kind of HTTP endpoint,
add the new endpoint description to the respective table:
-- custom function to handle HTTP /health requests
local on_health = {'application/json',
    function (h, stream)
        -- API call, return a JSON table
        return { state = 'up', uptime = 0 }
    end,
    function (h, ws)
        -- Stream current status every second
        local ok = true
        while ok do
            ok = ws:send(tojson({'up'}))
            require('cqueues').sleep(1)
        end
        -- Finalize the WebSocket
        ws:close()
    end
}

modules.load('http')
-- copy all existing webmgmt endpoints
my_mgmt_endpoints = http.configs._builtin.webmgmt.endpoints
-- add custom endpoint to the copy
my_mgmt_endpoints['/health'] = on_health
-- use custom HTTP configuration for webmgmt
http.config({ endpoints = my_mgmt_endpoints }, 'webmgmt')
Then you can query the API endpoint, or tail the WebSocket using curl.
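For example, assuming webmgmt listens on 127.0.0.1 port 8453 (adjust to your net.listen() configuration; the response shown is illustrative):

$ curl -k https://localhost:8453/health
{"state": "up", "uptime": 0}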
Since the stream handlers are effectively coroutines, you are free to keep state
and yield using the cqueues library.
This is especially useful for WebSockets, as you can stream content in a simple loop instead of
chains of callbacks.
The last thing you can publish from modules is “snippets”. Snippets are plain pieces of HTML code
that are rendered at the end of the built-in webpage. The snippets can be extended with JS code to talk to already
exported RESTful APIs and subscribe to WebSockets.
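For instance, a snippet can be registered through the http module’s snippets table (a minimal sketch; the table name follows older kresd releases):

-- Render a small HTML fragment at the end of the built-in webpage
http.snippets['/health'] = { 'Health service', '<p>UP!</p>' }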
A RESTful service is likely to respond differently to different methods and requests,
and there are three things that you can do in a service handler to send back results.
First is to just send whatever you want to send back; it has to respect the MIME type that the service
declared in the endpoint definition. The response code would then be 200 OK, and any non-string
responses will be packed to JSON. Alternatively, you can respond with a number corresponding to
the HTTP response code, or send headers and body yourself.
-- Our upvalue
local value = 42

-- Expose the service
local service = {'application/json',
    function (h, stream)
        -- Get request method and deal with it properly
        local m = h:get(':method')
        local path = h:get(':path')
        log('method %s path %s', m, path)
        -- Return table, response code will be '200 OK'
        if m == 'GET' then
            return { key = path, value = value }
        -- Save body, perform check and either respond with 500 or 200 OK
        elseif m == 'POST' then
            local data = stream:get_body_as_string()
            if not tonumber(data) then
                return 500, 'Not a good request'
            end
            value = tonumber(data)
        -- Unsupported method, return 405 Method not allowed
        else
            return 405, 'Cannot do that'
        end
    end
}

modules.load('http')
http.config({ endpoints = { ['/service'] = service } }, 'myservice')
-- do not forget to create socket of new kind using
-- net.listen(..., { kind = 'myservice' })
-- or configure systemd socket kresd-myservice.socket
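Assuming a matching listener, e.g. net.listen('127.0.0.1', 8454, { kind = 'myservice' }) (the address and port are illustrative), the service can then be exercised with curl; the responses shown are indicative:

$ curl -k https://localhost:8454/service
{"key": "/service", "value": 42}
$ curl -k -X POST -d 41 https://localhost:8454/service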
In some cases you might need to send back your own headers instead of the defaults provided by the HTTP handler.
You can do this, but then you have to return false to notify the handler that it shouldn’t try to generate
a response.
local headers = require('http.headers')

function (h, stream)
    -- Send back headers
    local hsend = headers.new()
    hsend:append(':status', '200')
    hsend:append('content-type', 'binary/octet-stream')
    assert(stream:write_headers(hsend, false))
    -- Send back data
    local data = 'binary-data'
    assert(stream:write_chunk(data, true))
    -- Disable default handler action
    return false
end