-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
timeout error running SoapySDRUtil --probe on LimeSDR #29
Comments
There were cases where we got locked up forever waiting for system tcp timeouts when hosts had been removed from the network suddenly. Its possible that these timeouts are too strict: 0ede4c4 Its probably the case that some calls can take a very long time. Can you try increasing this value above 3 seconds? https:/pothosware/SoapyRemote/blob/master/common/SoapyRPCUnpacker.cpp#L23 The default timeout for TCP seems to be 1 or 2 minutes in which case the OS closes the connection. Obviously thats too high, we need to find something in the middle :-) |
I tried to improve the timeout check so we actually check if the server is down to allow for longer blocking calls: https:/pothosware/SoapyRemote/tree/timeout_improved Give it try, thanks! |
Thanks, Josh. I tried the new tree, and initially the behavior against the two servers (Raspian and openSUSE) was the same as before, with the same error messages. I then upped the big time limit from 30000000 to 35000000, at which point the command With the new tree and modified time limit, the behavior against the openSUSE server did not change; same errors that I had listed originally. |
|
Thanks again. The "hang up" code does the trick- no error messages running "--probe" against the Raspian server. The error messages I got running against openSUSE seem to have a completely different origin. |
I merged that change btw. I'm not sure about the second set of errors on opensuse. I'm curious if you see the issue on the previous release 0.4.0 (https:/pothosware/SoapyRemote/tree/soapy-remote-0.4.0). That would help pinpoint if there an additional issue with the timeout additions in 0.4.1. |
Just tested 0.4.0- it works like a charm with the Raspian server, and gives same old same old with the openSUSE server (no surprises there). cheers, |
I seem to be getting a related problem with SoapyLogAcceptor. When running a rate test with SoapySDRUtil, a single timeout occurs about 30 seconds into a run, once only, with the rate test continuing to run after the error with no apparent errors. But when SoapySDRUtil is stopped with Ctrl-C, the remote server also closes and needs to be restarted. If i put a sleep(40) immediately after make device in SoapySDRRateTest, the error occurs roughly 30seconds into the sleep, then continues clean until stopped with Ctrl-C resulting in the remote exiting also. SoapySDRUtil --rate=39062.5 --channels=0 --direction=RX [INFO] SoapyRemote::setupRxStream(remoteFormat=CS32, localFormat=CS32, scaleFactor=5.53648e+07, mtu=1500, window=44040192) |
Thanks, that pointed me in the right direction. the log acceptor also needed to recv() forever without a timeout. The timeout was only meant for the control packets and not the listener loops which can be hung up on. I made a change on the timeout_improved branch. Didnt test it yet, but its straightforward. |
No errors so far. Very Nice. |
Anyways sorry for the bugs, thanks for testing. I will try to get this tagged as well since its a pretty annoying bug to have in a release. @rkb68 let me know if the current master branch resolves your opensuse issues. Perhaps it was also the timeout bug in the logger. If not I can make the suggestion that the connection timeout of 50 ms could be too strict in some cases. So I offer the following patch to simply increase it and see what happens:
Thanks! |
Thanks again; when I increased SOAPY_REMOTE_SOCKET_TIMEOUT_US to 500 ms, --probe still did not work, but --find did, though it would fail after the first run. However, after increasing the timeout to a full second, I can now run both --find and --probe with no errors. Perhaps my openSUSE installation has greater latency somewhere? Anyway, thanks for looking into this. |
@rkb68 1 second is an incredible amount of time for a socket to connect on another host on a local network. Thats a bit worry some all by itself. Is the openSUSE in a remote location or can you think of some oddity with the network configuration otherwise? Howabout sshing into the openSUSE box, is there a noticeable lag? Sometimes there can be issues with the DNS that cause the client to reverse lookup an IP address and timeout on every lookup -- something else to double check anyway. I'm thinking of making the timeout configurable with an input argument or environment variable just to be safe and future proof things. But there is probably something else going on here. |
I was able to back off the timeout to 700 ms, but yes, that is still a very long time. I installed openSUSE on the RPi3 using their card image, did not tweak any settings, merely enabled wifi with DHCP. And yes, there is often (not always) a lag when ssh'ing in from elsewhere on the network (which does not happen with the Raspian image). |
I added a remote timeout option but I forgot to mention it. Remote timeout it should hit all of the timeout values on the client that were previously using the default. I think this should not be needed usually, but in case its useful: https:/pothosware/SoapyRemote/wiki#the-remote-timeout-key |
With current release, a timeout error appears that did not appear in previous release; behavior is different depending on whether the host, a Raspberry Pi 3 (yeah, I know), is running Raspian (in which case at least the connection succeeds) or openSUSE (in which case it does not).
..................................................................................................................................
Raspian:
[server]
$ SoapySDRServer --bind
7701212d-a2ea-14cc-8567-0325007f0101
Launching the server... tcp://[::]:55132
Server bound to [::]:55132
Launching discovery server...
Press Ctrl+C to stop the server
SoapyServerListener::accept([::ffff:192.168.1.9]:50506)
SoapyServerListener::accept([::ffff:192.168.1.9]:50507)
SoapyServerListener::accept([::ffff:192.168.1.9]:50508)
SoapyServerListener::close()
SoapyServerListener::close()
SoapyServerListener::accept([::ffff:192.168.1.9]:50509)
SoapyServerListener::close()
SoapyServerListener::handlerLoop() FAIL: SoapyRPCUnpacker::recv(header) FAIL:
SoapyServerListener::close()
[client]
$ SoapySDRUtil --probe="remote=192.168.1.10"
Probe device remote=192.168.1.10
[INFO] Make connection: 'LimeSDR-USB [USB 2.0] 9060B00472212'
[INFO] Estimated reference clock 30.7198 MHz
[INFO] Selected reference clock 30.720 MHz
[INFO] Device name: LimeSDR-USB
[INFO] Reference: 30.72 MHz
[INFO] Init LMS7002M(0)
[INFO] LMS7002M cache /home/pi/.limesuite/LMS7002M_cache_values.db
[INFO] Ver=7, Rev=1, Mask=1
[INFO] LMS7002M calibration values caching Enable
Error probing device: SoapyRPCUnpacker::recv() TIMEOUT:
..................................................................................................................................
openSUSE:
[server]
$ SoapySDRServer --bind
79ac27fb-a35e-14cc-8567-04ada8c00a01
Launching the server... tcp://[::]:55132
Server bound to [::]:55132
Launching discovery server...
Press Ctrl+C to stop the server
SoapyServerListener::accept([::ffff:192.168.1.9]:50542)
[ERROR] ~SoapyRPCUnpacker: Unconsumed payload bytes -4
SoapyServerListener::handlerLoop() FAIL: SoapyRPCUnpacker::recv(header) FAIL:
SoapyServerListener::close()
[client]
$ SoapySDRUtil --probe="remote=192.168.1.10"
Probe device remote=192.168.1.10
[ERROR] SoapyRemote::find() -- connect(tcp://192.168.1.10:55132) FAIL: connect(tcp://192.168.1.10:55132) [110: Connection timed out]
[INFO] Make connection: ''
Error probing device: Failed to make connection with ''
The text was updated successfully, but these errors were encountered: