20 Commits

Author SHA1 Message Date
Ville Vesilehto
f3983c1111 perf(proxy): use mutex-based connection pool (#7790)
* perf(proxy): use mutex-based connection pool

The proxy package (used for example by the forward plugin) utilized
an actor model where a single connManager goroutine managed
connection pooling via unbuffered channels (dial, yield, ret). This
design serialized all connection acquisition and release operations
through a single goroutine, creating a bottleneck under high
concurrency. This was observable as a performance degradation when
using a single upstream backend compared to multiple backends
(which sharded the bottleneck).

Changes:
- Removed dial, yield, and ret channels from the Transport struct.
- Removed the connManager goroutine's request processing loop.
- Implemented Dial() and Yield() using a sync.Mutex to protect the
  connection slice, allowing for fast concurrent access without
  context switching.
- Downgraded connManager to a simple background cleanup loop that
  only handles connection expiration on a ticker.
- Updated plugin/pkg/proxy/connect.go to use direct method calls
  instead of channel sends.
- Updated tests to reflect the removal of internal channels.

Benchmarks show that this change eliminates the single-backend
bottleneck. Now a single upstream backend performs on par with
multiple backends, and overall throughput is improved.

The implementation aligns with standard Go patterns for connection
pooling (e.g., net/http.Transport).
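
A minimal sketch of the mutex-guarded pool described above, for illustration only. The names mutexPool, poolConn, and created are hypothetical and are not taken from the proxy package; only Dial and Yield mirror method names mentioned in the change list.

```go
package main

import (
	"fmt"
	"sync"
)

// poolConn is a hypothetical stand-in for a pooled network connection.
type poolConn struct{ id int }

// mutexPool protects its idle slice with a named mutex, so callers
// acquire and release connections directly instead of sending requests
// to a manager goroutine over channels.
type mutexPool struct {
	mu      sync.Mutex
	idle    []*poolConn
	created int
}

// Dial returns a cached connection when one is available; otherwise it
// creates a new one. The boolean reports whether the cache was hit.
func (p *mutexPool) Dial() (*poolConn, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.idle) > 0 {
		c := p.idle[0]
		p.idle = p.idle[1:]
		return c, true
	}
	p.created++
	return &poolConn{id: p.created}, false
}

// Yield hands a connection back to the pool for reuse.
func (p *mutexPool) Yield(c *poolConn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.idle = append(p.idle, c)
}

func main() {
	p := &mutexPool{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c, _ := p.Dial()
			p.Yield(c)
		}()
	}
	wg.Wait()
	// The count depends on scheduling: between 1 and 8 connections.
	fmt.Println("connections created:", p.created)
}
```

Each caller holds the lock only for a slice operation, so no single goroutine sits on the path of every acquire and release.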

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

* fix: address PR review for persistent.go

- Named mutex field instead of embedding, to not expose
  Lock() and Unlock()
- Move stop check outside of lock in Yield()
- Close() without a separate goroutine
- Change stop channel to struct

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

* fix: address code review feedback for conn pool

- Switch from LIFO to FIFO connection selection for source port
  diversity, reducing DNS cache poisoning risk (RFC 5452).
- Remove "clear entire cache" optimization as it was LIFO-specific.
  FIFO naturally iterates and skips expired connections.
- Remove all goroutines for closing connections; collect connections
  while holding lock, close synchronously after releasing lock.
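
The FIFO selection and the close-after-unlock step could look roughly like the sketch below. It assumes a hypothetical fifoPool type with a take method and a pooledConn whose Close only sets a flag; none of these names come from the actual code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pooledConn is a hypothetical connection; Close only records the call.
type pooledConn struct {
	id     int
	used   time.Time
	closed bool
}

func (c *pooledConn) Close() { c.closed = true }

// fifoPool hands out the oldest cached connection first.
type fifoPool struct {
	mu     sync.Mutex
	conns  []*pooledConn
	expire time.Duration
}

// take walks the slice from the front (FIFO), skipping expired entries.
// Expired connections are only collected under the lock and are closed
// synchronously after the lock has been released.
func (p *fifoPool) take(now time.Time) *pooledConn {
	var stale []*pooledConn
	var found *pooledConn

	p.mu.Lock()
	for len(p.conns) > 0 && found == nil {
		c := p.conns[0]
		p.conns = p.conns[1:]
		if now.Sub(c.used) > p.expire {
			stale = append(stale, c)
			continue
		}
		found = c
	}
	p.mu.Unlock()

	for _, c := range stale {
		c.Close()
	}
	return found
}

// examplePool builds a pool with one expired and two fresh connections.
func examplePool() (*fifoPool, []*pooledConn, time.Time) {
	now := time.Now()
	all := []*pooledConn{
		{id: 1, used: now.Add(-30 * time.Second)}, // older than expire
		{id: 2, used: now.Add(-5 * time.Second)},
		{id: 3, used: now.Add(-1 * time.Second)},
	}
	p := &fifoPool{expire: 10 * time.Second, conns: append([]*pooledConn(nil), all...)}
	return p, all, now
}

func main() {
	p, all, now := examplePool()
	c := p.take(now)
	fmt.Println(c.id, all[0].closed, len(p.conns)) // prints "2 true 1"
}
```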

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

* fix: remove unused error consts

No longer utilised after refactoring the channel based approach.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

* feat(forward): add max_idle_conns option

Add configurable connection pool limit for the forward plugin via
the max_idle_conns Corefile option.

Changes:
- Add SetMaxIdleConns to proxy
- Add maxIdleConns field to Forward struct
- Add max_idle_conns parsing in forward plugin setup
- Apply setting to each proxy during configuration
- Update forward plugin README with new option

By default the value is 0 (unbounded). When set, excess
connections returned to the pool are closed immediately
rather than cached.
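
As an illustration of the limit semantics described above (0 means unbounded, excess connections are closed on return), here is a small sketch. The boundedPool and idleConn types are hypothetical; only the SetMaxIdleConns name is borrowed from the change list, and the real signature may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// idleConn is a hypothetical connection; Close only records the call.
type idleConn struct{ closed bool }

func (c *idleConn) Close() { c.closed = true }

// boundedPool caches returned connections up to maxIdleConns.
type boundedPool struct {
	mu           sync.Mutex
	idle         []*idleConn
	maxIdleConns int // 0 means unbounded
}

// SetMaxIdleConns sets the cache limit; call it before the pool is used.
func (p *boundedPool) SetMaxIdleConns(n int) { p.maxIdleConns = n }

// Yield caches the connection, or closes it immediately when the pool
// already holds maxIdleConns idle connections.
func (p *boundedPool) Yield(c *idleConn) {
	p.mu.Lock()
	full := p.maxIdleConns > 0 && len(p.idle) >= p.maxIdleConns
	if !full {
		p.idle = append(p.idle, c)
	}
	p.mu.Unlock()

	if full {
		c.Close() // closed outside the lock
	}
}

func main() {
	p := &boundedPool{}
	p.SetMaxIdleConns(2)
	conns := []*idleConn{{}, {}, {}}
	for _, c := range conns {
		p.Yield(c)
	}
	fmt.Println(len(p.idle), conns[2].closed) // prints "2 true"
}
```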

Also add a yield related test.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

* chore(proxy): simplify Dial by closing conns inline

Remove the toClose slice collection and instead close expired
connections directly while iterating. This reduces complexity with
negligible lock-time impact.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

* chore: fewer explicit Unlock calls

Cleaner, with less chance of forgetting to unlock on newly added
code paths.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>

---------

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
2026-01-13 17:49:46 -08:00
Syed Azeez
7b38eb8625 plugin: fix gosec G115 integer overflow warnings (#7799)
Fix integer overflow conversion warnings (G115) by adding appropriate
suppressions where values are provably bounded.

Fixes: https://github.com/coredns/coredns/issues/7793

Changes:
- Updated 56 G115 annotations to use consistent // #nosec G115 format
- Added 2 //nolint:gosec suppressions for conditional expressions
- Removed G115 exclusion from golangci.yml (now explicitly handled per-line)

Suppressions justify why each conversion is safe (e.g., port numbers
are bounded 1-65535, DNS TTL limits, pool lengths, etc.)

Signed-off-by: Azeez Syed <syedazeez337@gmail.com>
2026-01-01 10:20:29 +02:00
Ville Vesilehto
05efeb0a7e fix(test): prevent race condition in dial test (#7770)
The test "TestDial_TransportStoppedDuringRetWait" replaced
tr.dial and tr.ret with test-controlled channels, then called
tr.Start(). Since connManager reads from t.dial, both the test
and connManager were racing to read from the same channel.
Remove tr.Start() since the test manually simulates connManager
behavior.

Also changed some test log formatting to align with other tests.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
2025-12-15 19:30:56 -08:00
Ville Vesilehto
fe7335e634 perf(proxy): avoid unnecessary alloc in Yield (#7708) 2025-11-24 08:20:30 -08:00
Endre Szabo
d68cbedbb1 plugin/forward: added support for per-nameserver TLS SNI (#7633) 2025-10-27 08:43:30 -07:00
Ville Vesilehto
39abf5aeba chore(lint): modernize Go (#7536)
Use modern Go constructs through the modernize analyzer from the
golang.org/x/tools package.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
2025-09-10 13:08:27 -07:00
Ville Vesilehto
11774d9e98 fix(proxy): flaky dial tests (#7349) 2025-06-04 14:36:59 -07:00
Ville Vesilehto
8cac83dfb5 lint: enable wastedassign linter (#7340) 2025-06-01 16:30:41 -07:00
Ville Vesilehto
19a6ae4983 lint: enable intrange linter (#7331)
Enable intrange linter to enforce modern Go range syntax over
traditional for loops, by converting:

for i := 0; i < n; i++

to:

for i := range n

Add type conversions where needed for compatibility
with existing uint64 parameters.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
2025-05-28 17:50:55 -07:00
Ville Vesilehto
0a48523083 fix(proxy): avoid Dial hang after Transport stopped (#7321)
Ensure Dial exits early or returns error when Transport has been
stopped, instead of blocking on the dial or ret channels. This removes
a potential goroutine leak where callers could pile up waiting
forever under heavy load.

Add select guards before send and receive, and propagate clear error
values so callers can handle shutdown gracefully.
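
The guarded send and receive can be sketched as follows, for the channel-based Dial this commit refers to. The chanTransport type, its channel element types, and errStopped are hypothetical simplifications and do not reproduce the real Transport.

```go
package main

import (
	"errors"
	"fmt"
)

// errStopped is a hypothetical error value returned after shutdown.
var errStopped = errors.New("transport stopped")

// chanTransport is a simplified stand-in with dial, ret, and stop channels.
type chanTransport struct {
	dial chan string
	ret  chan int
	stop chan struct{}
}

func newChanTransport() *chanTransport {
	return &chanTransport{
		dial: make(chan string),
		ret:  make(chan int),
		stop: make(chan struct{}),
	}
}

// Dial never blocks forever: every channel operation is paired with a
// case on stop, which is closed when the transport shuts down.
func (tr *chanTransport) Dial(proto string) (int, error) {
	// Exit early if the transport has already been stopped.
	select {
	case <-tr.stop:
		return 0, errStopped
	default:
	}
	// Guarded send of the request.
	select {
	case tr.dial <- proto:
	case <-tr.stop:
		return 0, errStopped
	}
	// Guarded receive of the reply.
	select {
	case c := <-tr.ret:
		return c, nil
	case <-tr.stop:
		return 0, errStopped
	}
}

func main() {
	tr := newChanTransport()
	close(tr.stop) // no manager goroutine is running
	_, err := tr.Dial("udp")
	fmt.Println(err) // prints "transport stopped"
}
```

Without the stop cases, the send on an unbuffered channel with no reader would block the caller indefinitely.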

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
2025-05-28 06:58:48 -07:00
Manuel Rüger
76ba39ffe9 chore: Upgrade to golangci-lint v2 (#7236)
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2025-04-04 14:27:39 -04:00
Ben Kochie
0d6e113f90 Enable Prometheus native histograms (#6524)
Add a NativeHistogramBucketFactor parameter to the use of
`NewHistogramVec` in order to enable use of Prometheus Native
Histograms.

This will store automatically computed sparse buckets in CoreDNS.
If a compatible Prometheus requests native histograms, this data will
be returned instead of the static buckets.

The default factor of 1.05 should provide high quality resolution data.

Signed-off-by: SuperQ <superq@gmail.com>
2024-03-11 16:09:09 -04:00
Tom Thorogood
b541b4ea49 Use the correct root domain name in the proxy plugin's TestHealthX tests (#6395)
When packing the empty domain name, miekg/dns can end up creating
corrupt DNS messages. With some planned unpacking changes, this now
trips an error condition and causes these tests to fail. Correct this
by using the root domain explicitly as this gets correctly encoded on
the wire.

Signed-off-by: Tom Thorogood <me+github@tomthorogood.net>
2023-11-08 12:00:32 -08:00
Sri Harsha
4c69549832 Handle UDP responses that overflow with TC bit with test case (#6277)
Signed-off-by: SriHarshaBS001 <SriHarshaBS009@gmail.com>
2023-09-07 15:01:45 -04:00
Chris O'Haver
678d0333af Revert "plugin/forward: Continue waiting after receiving malformed responses (#6014)" (#6270)
This reverts commit 604a902e2c.
2023-08-14 20:33:37 -04:00
Pat Downey
ea293da1d6 Fix forward metrics for backwards compatibility (#6178) 2023-07-04 16:35:55 +02:00
Chris O'Haver
604a902e2c plugin/forward: Continue waiting after receiving malformed responses (#6014)
* forward: continue waiting after malformed responses

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* add test

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* fix test

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* clean up

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* clean up

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* move test to /test/. Add build tag.

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* install libpcap-dev for e2e tests

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* sudo the test

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* remove stray err check

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* disable the test

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* use -exec flag to run test binary as root

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* run new test by itself in a new workflow

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* fix test name

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* only for udp

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* remove libpcap test workflow action

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* remove test, since it can't run in CI

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

* and remove gopacket package

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>

---------

Signed-off-by: Chris O'Haver <cohaver@infoblox.com>
2023-04-29 11:52:00 +02:00
cui fliter
ee3999303d fix some comments (#6052)
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-04-25 11:25:07 -04:00
Vancl
7db1d4f6e9 Prevent the fail counter of a proxy from overflowing (#5990)
Signed-off-by: vanceli <vanceli@tencent.com>
Signed-off-by: Vance Li <vncl@YingyingM1.local>
Co-authored-by: vanceli <vanceli@tencent.com>
2023-04-16 16:08:56 +02:00
Pat Downey
f823825f8a plugin/forward: Allow Proxy to be used outside of forward plugin. (#5951)
* plugin/forward: Move Proxy into plugin/pkg/proxy, to allow forward.Proxy to be used outside of forward plugin.

Signed-off-by: Patrick Downey <patrick.downey@dioadconsulting.com>
2023-03-24 08:55:51 -04:00