* perf(proxy): use mutex-based connection pool
The proxy package (used, for example, by the forward plugin) utilized
an actor model where a single connManager goroutine managed
connection pooling via unbuffered channels (dial, yield, ret). This
design serialized all connection acquisition and release operations
through a single goroutine, creating a bottleneck under high
concurrency. This was observable as a performance degradation when
using a single upstream backend compared to multiple backends
(which sharded the bottleneck).
Changes:
- Removed dial, yield, and ret channels from the Transport struct.
- Removed the connManager goroutine's request processing loop.
- Implemented Dial() and Yield() using a sync.Mutex to protect the
connection slice, allowing for fast concurrent access without
context switching.
- Downgraded connManager to a simple background cleanup loop that
only handles connection expiration on a ticker.
- Updated plugin/pkg/proxy/connect.go to use direct method calls
instead of channel sends.
- Updated tests to reflect the removal of internal channels.
Benchmarks show that this change eliminates the single-backend
bottleneck. Now a single upstream backend performs on par with
multiple backends, and overall throughput is improved.
The implementation aligns with standard Go patterns for connection
pooling (e.g., net/http.Transport).
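A minimal sketch of the resulting shape (simplified; type and field
names assumed, protocol selection and error handling omitted):

```go
package proxy

import (
	"sync"
	"time"

	"github.com/miekg/dns"
)

// persistConn is a pooled connection; the real struct carries more state.
type persistConn struct {
	c    *dns.Conn
	used time.Time
}

// Transport owns the pool. The mutex replaces the dial/yield/ret channels.
type Transport struct {
	sync.Mutex
	conns []*persistConn
}

// Yield returns a connection to the pool.
func (t *Transport) Yield(pc *persistConn) {
	t.Lock()
	t.conns = append(t.conns, pc)
	t.Unlock()
}

// Dial pops a pooled connection if one is available; on a miss the
// caller dials a fresh connection.
func (t *Transport) Dial() (*persistConn, bool) {
	t.Lock()
	defer t.Unlock()
	if n := len(t.conns); n > 0 {
		pc := t.conns[n-1]
		t.conns = t.conns[:n-1]
		return pc, true
	}
	return nil, false
}
```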
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
* fix: address PR review for persistent.go
- Name the mutex field instead of embedding it, so Transport does not
  expose Lock() and Unlock() (see the sketch below)
- Move stop check outside of lock in Yield()
- Close() without a separate goroutine
- Change stop channel to struct
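Roughly, the reviewed shape (names assumed):

```go
// Transport after review: the mutex is a named field, so Lock/Unlock
// are no longer part of Transport's method set; stop is signal-only.
type Transport struct {
	connMu sync.Mutex
	conns  []*persistConn
	stop   chan struct{} // closed on shutdown, never sent on
}

func (t *Transport) Yield(pc *persistConn) {
	select {
	case <-t.stop: // stop check happens before taking the lock
		pc.c.Close()
		return
	default:
	}
	t.connMu.Lock()
	t.conns = append(t.conns, pc)
	t.connMu.Unlock()
}
```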
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
* fix: address code review feedback for conn pool
- Switch from LIFO to FIFO connection selection for source port
diversity, reducing DNS cache poisoning risk (RFC 5452).
- Remove "clear entire cache" optimization as it was LIFO-specific.
FIFO naturally iterates and skips expired connections.
- Remove all goroutines for closing connections; collect connections
  while holding the lock, then close them synchronously after releasing
  it, as sketched below.
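A sketch of the FIFO selection under these rules, continuing the earlier
sketch (an `expire` duration field on Transport is assumed):

```go
func (t *Transport) Dial() (*persistConn, bool) {
	t.connMu.Lock()
	var expired []*persistConn
	var pc *persistConn
	for len(t.conns) > 0 {
		first := t.conns[0] // FIFO: the oldest connection first
		t.conns = t.conns[1:]
		if time.Since(first.used) > t.expire {
			expired = append(expired, first) // collect under the lock
			continue                         // skip expired connections
		}
		pc = first
		break
	}
	t.connMu.Unlock()
	for _, e := range expired {
		e.c.Close() // close synchronously, after the lock is released
	}
	return pc, pc != nil
}
```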
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
* fix: remove unused error consts
No longer used after refactoring away the channel-based approach.
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
* feat(forward): add max_idle_conns option
Add configurable connection pool limit for the forward plugin via
the max_idle_conns Corefile option.
Changes:
- Add SetMaxIdleConns to proxy
- Add maxIdleConns field to Forward struct
- Add max_idle_conns parsing in forward plugin setup
- Apply setting to each proxy during configuration
- Update forward plugin README with new option
By default the value is 0 (unbounded). When set, excess
connections returned to the pool are closed immediately
rather than cached.
Also add a Yield-related test.
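Illustrative Corefile usage (the upstream address and the limit are
example values):

```
forward . 8.8.8.8 {
    max_idle_conns 64
}
```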
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
* chore(proxy): simplify Dial by closing conns inline
Remove the toClose slice collection; instead, close expired connections
directly while iterating. This reduces complexity with negligible
impact on lock hold time.
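The simplified shape, continuing the earlier sketch:

```go
func (t *Transport) Dial() (*persistConn, bool) {
	t.connMu.Lock()
	for len(t.conns) > 0 {
		pc := t.conns[0]
		t.conns = t.conns[1:]
		if time.Since(pc.used) > t.expire {
			pc.c.Close() // closed inline while iterating; no toClose slice
			continue
		}
		t.connMu.Unlock()
		return pc, true
	}
	t.connMu.Unlock()
	return nil, false
}
```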
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
* chore: fewer explicit Unlock calls
Cleaner, and less chance of forgetting to unlock on newly added code
paths.
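The same Dial as above, rewritten with a single deferred unlock
(illustrative):

```go
func (t *Transport) Dial() (*persistConn, bool) {
	t.connMu.Lock()
	defer t.connMu.Unlock() // covers every return path, current and future
	for len(t.conns) > 0 {
		pc := t.conns[0]
		t.conns = t.conns[1:]
		if time.Since(pc.used) > t.expire {
			pc.c.Close()
			continue
		}
		return pc, true
	}
	return nil, false
}
```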
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
---------
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
Previously the parsing logic in the forward plugin setup failed to
recognise when NOERROR was used as a failover RCODE criterion. The
check was in the wrong code branch. This PR fixes it and adds
validation tests. Also updates the plugin README.
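For illustration, a configuration like the following should now parse,
with NOERROR accepted as a criterion (syntax per the forward plugin's
next option; the values are examples):

```
forward . 8.8.8.8 {
    next NOERROR NXDOMAIN
}
```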
Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
Allows the forward plugin to execute the next plugin based on the
return code, similar to the externally maintained alternate plugin:
https://github.com/coredns/alternate
Based on the idea of chrisohaver@ in #6549 (comment)
Also incorporated the request to rename the `alternate` option to `next`.
I am having issues adding a proper test for the functionality.
Primarily, I do not know the code base well enough, and having multiple
`dnstest.NewServer` instances with a ResponseWriter does not work. From
my testing these appear to be singletons, and only the last defined
response writer is used for all servers.
Signed-off-by: Jasper Bernhardt <jasper.bernhardt@live.de>
* plugin/forward: convert the specified domain of health_check to Fqdn
* plugin/forward: update readme for health check
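The gist of the conversion, shown as a hypothetical helper around
dns.Fqdn from github.com/miekg/dns:

```go
package forward

import "github.com/miekg/dns"

// normalizeDomain shows the idea: dns.Fqdn appends the trailing dot
// only when it is missing, so "example.org" becomes "example.org.".
func normalizeDomain(domain string) string {
	return dns.Fqdn(domain)
}
```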
Signed-off-by: vanceli <vanceli@tencent.com>
* trap unsupported FROM cidr notations
Signed-off-by: Chris O'Haver <cohaver@infoblox.com>
* make it a warning
Signed-off-by: Chris O'Haver <cohaver@infoblox.com>
* plugin/forward Add rcode and rtype to request_duration_seconds metric
Signed-off-by: Maxime Ginters <maxime.ginters@shopify.com>
* Control the cardinality of query type
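One common way to bound the cardinality of the type label added above,
sketched as a hypothetical helper (not necessarily the exact set of
types the commit keeps):

```go
package forward

import "github.com/miekg/dns"

// typeLabel folds uncommon query types into "other" so the metric's
// type label stays small and bounded.
func typeLabel(qtype uint16) string {
	switch qtype {
	case dns.TypeA, dns.TypeAAAA, dns.TypeCNAME, dns.TypeMX, dns.TypeNS,
		dns.TypePTR, dns.TypeSOA, dns.TypeSRV, dns.TypeTXT:
		return dns.TypeToString[qtype]
	}
	return "other"
}
```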
Signed-off-by: Maxime Ginters <maxime.ginters@shopify.com>
sed -i 's/Also See/See Also/' plugin/**/README.md
Some plugins did already use 'See Also', so it's all consistent now.
Fixes: #4196
Signed-off-by: Miek Gieben <miek@miek.nl>
We health check every 0.5s; doing exponential backoff would create a
large gap in the ability to re-use an upstream. And an exponential
backoff capped at (say) 3s isn't really exponential backoff either.
Remove the wording from the documentation.
Signed-off-by: Miek Gieben <miek@miek.nl>
Clean up a variety of metric issues.
* Eliminate department of redundancy "count_total" naming.
* Use the plural of the unit when appropriate. (ex, "requests")
* Remove label names from metric names where appropriate. (ex, "rcode")
* Simplify request metrics by consolidating the type label into the
  base request counter.
* Re-generate man pages.
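Applied to a counter like `coredns_dns_request_count_total`, these
rules yield `coredns_dns_requests_total`.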
Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
* Make the RD flag in health checks in the forward plugin configurable
Introduces a new configuration flag: `health_check_non_recursive`. This
flag makes the health-checker do non-recursive requests when checking
the health of upstream servers.
Signed-off-by: Geir Haugom <ghagit@haugom.org>
Signed-off-by: Christian Tryti <ctryti@gmail.com>
* Changes after feedback from reviewer
* Better tests of health-checks with and without recursion
* Removed the health_check_non_recursive configuration in favor of
extending the existing health_check configuration. Now supports an
optional `no_rec` argument.
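Illustrative Corefile usage of the extended option (values are
examples):

```
forward . 8.8.8.8 {
    health_check 0.5s no_rec
}
```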
Signed-off-by: Christian Tryti <ctryti@gmail.com>
* Add new test that checks setup of health_check.
Signed-off-by: Christian Tryti <ctryti@gmail.com>
* plugin/pkg/up: make default intervals shorter
I think 15 min is too high; make this lower to react faster.
Signed-off-by: Miek Gieben <miek@miek.nl>
* Update README
Signed-off-by: Miek Gieben <miek@miek.nl>
Move exponential backoff initialization to Start()
Signed-off-by: RickyRajinder <singh.sangh@gmail.com>
- Move comment
- Increase max interval and update README
- Remove trailing whitespace
- Change Start() param name back to interval
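The backoff step in question is the usual doubling with a ceiling; a
minimal sketch, not the actual plugin/pkg/up code (the helper name is
an assumption):

```go
package up

import "time"

// nextInterval doubles the probe interval up to a ceiling; the caller
// resets to the base interval once the upstream answers again.
func nextInterval(current, ceiling time.Duration) time.Duration {
	if next := current * 2; next < ceiling {
		return next
	}
	return ceiling
}
```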
Caught my eye: we still name things 'directive', especially when
talking about the prometheus *plugin*. Rename everything that needs to
be plugin to 'plugin'. Also make sure Metrics is an H2 section (not H1).
Signed-off-by: Miek Gieben <miek@miek.nl>
* plugin/forward: make Yield not block
Yield may block when we're very busy creating (and looking for)
connections. Set a small timeout on Yield, so we skip putting the
connection back in the queue rather than blocking.
Use persistConn throughout the socket handling code to be more
consistent.
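A sketch of the non-blocking Yield against the channel-based pool of
this era (names assumed; the timeout value is illustrative):

```go
import "time"

func (t *Transport) Yield(pc *persistConn) {
	pc.used = time.Now()
	select {
	case t.yield <- pc:
	case <-time.After(25 * time.Millisecond):
		// pool is busy; skip putting the connection back in the queue
	}
}
```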
Signed-off-by: Miek Gieben <miek@miek.nl>
Don't do
Signed-off-by: Miek Gieben <miek@miek.nl>
* Set used in Yield
This gives one central place where we update `used` in the persistConns
Signed-off-by: Miek Gieben <miek@miek.nl>
After several experiments at SoundCloud we found that the current
minimum read timeout of 10ms is too low. A single request against a
slow/unavailable authoritative server can cause all TCP connections to
get closed. We record a 50th percentile forward/proxy latency of <5ms,
and a 99th percentile latency of 60ms. Using a minimum timeout of 200ms
seems to be a fair trade-off between avoiding unnecessarily high
connection churn and reacting to upstream failures in a timely manner.
This change also renames hcDuration to hcInterval to reflect its usage,
and removes the duplicated timeout constant to make code comprehension
easier.
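In code terms, the change amounts to raising the floor used when
clamping the dynamically computed read timeout; a sketch with an
assumed helper name and signature:

```go
package forward

import "time"

const minTimeout = 200 * time.Millisecond // raised from 10ms

// limitTimeout clamps a dynamically computed read timeout to the floor.
func limitTimeout(computed time.Duration) time.Duration {
	if computed < minTimeout {
		return minTimeout
	}
	return computed
}
```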