iperf3 tests need control over bind address to support tests with NAT'd hosts #1476

Open
jprorama opened this issue Oct 4, 2024 · 6 comments

Comments

@jprorama

jprorama commented Oct 4, 2024

Using pscheduler to test throughput against servers running in NAT'd environments, like cloud hosting, requires the external test participant to address the instance by its public IP while the NAT'd server itself uses the instance's internal (often private) IP.

The pscheduler task subcommand and throughput test have some support for specifying bind addresses, but these do not get translated to the correct bind behavior when running iperf3.

An easy option is to avoid specifying an iperf3 bind parameter "-B" for command invocations. This causes iperf3 to bind on any interface and allows the test to proceed. Unfortunately, the only way to cause iperf3 to be called in this way is to submit pscheduler throughput tasks with only a destination host parameter, e.g. pscheduler task throughput -d <destination>. These test specifications work to test throughput to and from a NAT'd host but require that the test is submitted from a shell on the source host (so that the source address is implied).

This works because the code is written to not use the -B bind parameter to iperf3 if there is no source host provided in the test specification, in run_client() and run_server().
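
A minimal sketch of that behavior, not the actual pscheduler code: the helper name build_iperf3_argv and its parameters are hypothetical, and only the iperf3 flags -s, -c and -B come from iperf3 itself. It illustrates that -B is added only when a bind address is supplied.

```python
from typing import List, Optional


def build_iperf3_argv(role: str,
                      peer: Optional[str] = None,
                      bind: Optional[str] = None) -> List[str]:
    """Build an iperf3 command line (hypothetical helper, not pscheduler's)."""
    if role == "server":
        argv = ["iperf3", "-s"]
    elif role == "client":
        if peer is None:
            raise ValueError("client role needs a peer address")
        argv = ["iperf3", "-c", peer]
    else:
        raise ValueError(f"unknown role: {role!r}")

    # Only pass -B when a bind address is known; without it iperf3
    # binds to any local interface.
    if bind:
        argv += ["-B", bind]
    return argv
```

With no bind address, build_iperf3_argv("server") yields a plain "iperf3 -s", which is the case that works on a NAT'd host.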

This doesn't work with throughput tests that specify both a source and a destination host. In those cases, the -B parameter is passed to both the server and the client iperf3 invocations and defaults to the public IP of the NAT'd host. Because that address is not assigned to any local interface, the iperf3 bind fails and the command dies. The pscheduler task result then reports a timeout and the test fails. Specifying the various bind parameters of the task subcommand or throughput test doesn't work either, because these do not make it through to the iperf3 command construction in a way that makes sense for the NAT'd use case.

Since scheduled tests and third-party tests specify both source and destination hosts in the test specification, the existing code prevents these tests from working with hosts that are behind NAT. This prevents regular testing of hosts in cloud environments.
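
The bind failure can be reproduced without iperf3. The sketch below is only an illustration: try_bind is a hypothetical helper, and 203.0.113.5 is a documentation address (RFC 5737) standing in for the public IP of a NAT'd host. On a host where that address is not configured locally, the bind is expected to fail (typically with EADDRNOTAVAIL on Linux), while 0.0.0.0 succeeds.

```python
import errno
import socket


def try_bind(address: str) -> int:
    """Return 0 if a TCP socket can bind to address, else the errno."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, 0))  # port 0: let the OS pick a free port
        except OSError as exc:
            return exc.errno
    return 0


if __name__ == "__main__":
    # 203.0.113.5 is a placeholder for a public IP that is NOT assigned
    # to any local interface (an assumption of this sketch).
    for addr in ("0.0.0.0", "203.0.113.5"):
        code = try_bind(addr)
        print(addr, "ok" if code == 0 else errno.errorcode.get(code, code))
```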

jprorama added a commit to jprorama/pscheduler that referenced this issue Oct 4, 2024
Add support for the bind_address parameter to be read
from the iperf3.conf file and use it to override
bind address interpretation from the test spec.

This enables pscheduler throughput tests against
hosts that are deployed in NAT environments where
their public IP used by clients is different from
their local network address.  The config provides
the local knowledge to use the correct bind address.

Proposed fix for perfsonar#1476
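
The idea in this commit message can be sketched roughly as follows. This is not the code from the commit: the function name effective_bind, the INI layout and the section name "iperf3" are assumptions made for illustration; only the bind_address option name comes from the commit message.

```python
import configparser
from typing import Optional


def effective_bind(spec_bind: Optional[str], conf_text: str) -> Optional[str]:
    """Pick the bind address: config override first, then the test spec."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Section name "iperf3" is an assumption for this sketch; the real
    # iperf3.conf layout may differ.
    override = parser.get("iperf3", "bind_address", fallback=None)
    # An empty or missing override falls back to the address from the spec.
    return override or spec_bind
```

For example, with a config containing bind_address = 10.9.8.7, a spec that names the public IP would still produce 10.9.8.7 as the local bind address.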
@jprorama jprorama changed the title from "iperf3 tests need control over bind address to support tests from NAT'd hosts" to "iperf3 tests need control over bind address to support tests with NAT'd hosts" Oct 5, 2024
@pllopis

pllopis commented Oct 10, 2024

Thanks for creating this issue and for the PR. I'm experiencing the same issue, so would also be interested in a resolution for this issue.
In our case it's a Kubernetes cluster with perfSONAR sitting behind a LoadBalancer service, and while I haven't tested this patch, indeed it looks like it would resolve the issue.

@pllopis

pllopis commented Oct 10, 2024

Worth noting that this seems to be a duplicate of #1323, for which a solution had already been merged (it allowed setting --dest-bind and --source-bind separately), but that functionality was removed in 045bc00 as part of #256.

@mfeit-internet2
Member

The better way to deal with this is:

  • Put an entry in the NATted host's /etc/hosts that points the outside host's name at the inside address (e.g., 10.9.8.7 outside.cloud.example.org).
  • Make sure the resolver is configured to query hosts before dns.
  • Always refer to the host by its FQDN and never by its outside IP when configuring tests.
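
One way to check that the steps above took effect on the NAT'd host is to ask the system resolver what the name maps to. This is a small sketch with a hypothetical helper name (resolved_ipv4); it assumes a platform where socket.getaddrinfo goes through the system resolver so that /etc/hosts is consulted.

```python
import socket
from typing import List


def resolved_ipv4(name: str) -> List[str]:
    """Return the IPv4 addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # sockaddr for AF_INET is (address, port); keep unique addresses in order.
    return list(dict.fromkeys(info[4][0] for info in infos))
```

If the hosts entry from the example is active, resolved_ipv4("outside.cloud.example.org") should return the inside address (10.9.8.7 in the example) rather than the public one.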

@pllopis

pllopis commented Oct 10, 2024

Thanks @mfeit-internet2 , that's indeed a valid workaround that works. What also works is to add the options -B 0.0.0.0 --reverse, which even skips having to touch /etc/hosts (since it's obviously just reversing the roles..)

However I'd still be interested in a more permanent solution going forward that doesn't require further manual adjustments.
Do you know why the --dest-bind and --source-bind options were removed? It seems like an elegant way to make this work, especially if -B becomes an alias of --source-bind, as I believe it was already implemented :)
(.. or the config file override as implemented in #1477)

@jprorama
Author

jprorama commented Oct 10, 2024 via email

@pllopis

pllopis commented Oct 17, 2024

Hi again, perhaps this should go in a separate issue (let me know if you'd like me to open a new one), but since it's closely related I'll post here first:

There's the same issue affecting owampd, where it will try to bind to the address specified as the destination.

Unfortunately in this case the workaround of using /etc/hosts does not seem to work. With that /etc/hosts edit, iperf3 works just fine, and getent hosts gives back the local IP address for the DNS name that I use with --dest, but owampd seems to try to bind to the public IP anyway.

Is this expected?
Any idea about whether there's another potential workaround when sitting behind a NAT/LB?

If there isn't and changing the code is required, would you accept a patch that works similarly to the one in this PR, where it's possible to optionally override the listen address for UDP tests? I could look into writing it if you'd like to see a patch. (Having a config option in this sense I think would make sense, but let me know if you have a specific opinion on how this should be done)

Thanks!

Labels
None yet
Projects
Status: Ready
Development

No branches or pull requests

3 participants