Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No blocking on the main thread #2159

Open
srcejon opened this issue Jun 10, 2024 · 17 comments
Open

No blocking on the main thread #2159

srcejon opened this issue Jun 10, 2024 · 17 comments
Assignees

Comments

@srcejon
Copy link
Collaborator

srcejon commented Jun 10, 2024

For the WASM port, we need to ensure that there is no blocking on the main thread, otherwise SDRangel will hang.

This is described more in the Qt docs here https://doc.qt.io/qt-6/wasm.html under "Multithreading" and here in the Emscripten docs https://emscripten.org/docs/porting/pthreads.html#blocking-on-the-main-browser-thread

So far, I see two main areas that need to change in SDRangel:

  • DSPDeviceSourceEngine::startAcquisition/stopAcquistion

When a device start button is pressed, DSPDeviceSourceEngine::startAcquisition is eventually called and is executed on the main/GUI thread. Its code blocks the thread until the device is started:

DSPAcquisitionStart cmd;
return m_syncMessenger.sendWait(cmd) == StRunning;

However, on WASM, USB devices will unable to start, because libusb calls get deferred to the main thread, so we get a deadlock.

As a workaround, it seems the blocking can be removed with:

DSPAcquisitionStart *cmd = new DSPAcquisitionStart();
m_inputMessageQueue.push(cmd);
return true;

and the RTLSDR driver (and a few others I've quickly looked at) don't appear to require it to block.

Removing this blocking would potentially have a small benefit for all platforms - as it would remove the small GUI freeze that can sometimes be seen when one device is already running and the start button is pressed on another device that takes a little while to initialize.

The above might not be the best way to implement this change - it's just the first thing I tried - and similar would also need to be implemented for TX and MIMO devices - and also for the stopAcquistion method.

  • DeviceInput::applySettings

Currently DeviceInput::applySettings is also often called from the main/GUI thread, and the libusb calls result in a hang. I worked around this in RTLSDRInput::applySettings by moving all of the rtlsdr driver calls in to a separate thread. Similar would need to be done for other devices.

Again, this change is probably beneficial to all platforms, as discussed a while back in this issue: #1522

@srcejon
Copy link
Collaborator Author

srcejon commented Jun 12, 2024

DSPAcquisitionInit can hang as well, the second time time an RTL SDR is started. Similar workaround applies.

And forgot to mention, there needs to be coded added to DSPDeviceSourceEngine::handleInputMessages() to handle those messages.

(RTL SDR driver crash on stop has been worked around by using rtlsdr_read_sync instead of rtlsdr_read_async)

@srcejon
Copy link
Collaborator Author

srcejon commented Jun 13, 2024

So, it appears there are cases where we do need to wait some of these DSPDeviceSourceEngine commands to complete.

One such case is in void MainWindow::removeDeviceSet. It starts by calling deviceEngine->stopAcquistion(); which results in RTLSDRInput::stop being called. It later on calls deleteSampleSourcePluginInstanceInput, which calls ~RTLSDRInput which also calls RTLSDRInput::stop(). In this instance, it's on the main thread, so it will hang if the previous stop hasn't completed.

So, we need a way for DSPDeviceSourceEngine and friends to indicate when they have completed, but without blocking. As an experiment, I've tried adding an acquistionStopped signal to DSPDeviceSourceEngine. MainWindow::removeDeviceSet then needs to be broken up in to two parts: First calling stopAcquistion - then when it gets the acquistionStopped signal, it can continue the rest of the device set deletion. Seems to work.

@srcejon
Copy link
Collaborator Author

srcejon commented Jun 14, 2024

It does get a little trickier though. In mainwindow.cpp, there are several places where all device sets are removed as part of some other sequential code, such as:

// Do X...
// Remove device sets
while (m_deviceUIs.size() > 0) {
    removeLastDeviceSet();
}
// Do Y...

This needs to be broken up in to separate steps, with a non-blocking wait for a signal after each device set is removed, before trying to remove the next.

Trying to think of the cleanest way to write this code, I came across Qt's State Machine API: https://doc.qt.io/qt-5/statemachine-api.html - which seems like a reasonable way to sequence different bits of code using signals. I will give it a try.

Any thoughts on any of this?

@f4exb
Copy link
Owner

f4exb commented Jun 15, 2024

I only had a quick look at the Qt state machine and I find it quite interesting. When the events and event handlers start to be rather complex it probably helps keep things tidy. State Machine is a well known design pattern.

@f4exb
Copy link
Owner

f4exb commented Aug 14, 2024

I see that DSPDeviceSourceEngine inherits from QThread. This is not the recommended way to work with QThreads. Instead the thread should be managed by the DSPEngine and DSPDeviceSourceEngine should be moved to the thread. This may involve quite a bit of rework but is probably a necessary preliminary action. It may also solve some corner case issues.

The start method of the DSPDeviceSourceEngine is always called after the addDeviceSourceEngine of the DSPEngine and similarly for the "remove" operations thus thread management could be moved to DSPEngine.

@f4exb
Copy link
Owner

f4exb commented Aug 14, 2024

initDeviceEngine and startDeviceEngine should be combined into startDeviceEngine because we always have a sequence like:

            if (m_deviceAPI->initDeviceEngine(0)) {
                m_deviceAPI->startDeviceEngine(0);
            }  

And thus only a DSPAcquisitionStart message (removing or neutralizing DSPAcquisitionInit)

@srcejon
Copy link
Collaborator Author

srcejon commented Aug 21, 2024

Hi. Haven't had time to work on this lately. But I should mention that I did manage to work around most of the blocking issues by using Qt FSMs. Some of the more complicated methods in MainWindow needed to broken up into separate methods, driven by a FSM.

E.g. Here's how the FSM is defined for MainWindow::loadConfiguration which was broken up in to a number of different steps:

class LoadConfigurationFSM : public QStateMachine {
public:
    LoadConfigurationFSM(MainWindow *mainWindow, const Configuration *configuration, QProgressDialog *waitBox);
private:
    MainWindow *m_mainWindow;
    const Configuration *m_configuration;
    QProgressDialog *m_waitBox;

    void clearWorkspace();
    void createWorkspaces();
    void loadDeviceSets();
    void loadDeviceSetSettings();
    void loadFeatureSets();
    void restoreGeometry();
};


MainWindow::LoadConfigurationFSM::LoadConfigurationFSM(MainWindow *mainWindow, const Configuration *configuration, QProgressDialog *waitBox) :
    m_mainWindow(mainWindow),
    m_configuration(configuration),
    m_waitBox(waitBox)
{
    QState *s1 = new QState();
    QState *s2 = new QState();
    QState *s3 = new QState();
    QState *s4 = new QState();
    QState *s5 = new QState();
    QFinalState *s6 = new QFinalState();

    s1->addTransition(&m_mainWindow->m_removeAllWorkspacesFSM, &RemoveAllWorkspacesFSM::finished, s2);
    s2->addTransition(s3);
    s3->addTransition(m_mainWindow, &MainWindow::allDeviceSetsAdded, s4);
    s4->addTransition(s5);
    s5->addTransition(s6);

    // We use unconditional transitions between states, to allow event handling / progress dialog to be updated
    connect(s1, &QState::entered, this, &MainWindow::LoadConfigurationFSM::clearWorkspace);
    connect(s2, &QState::entered, this, &MainWindow::LoadConfigurationFSM::createWorkspaces);
    connect(s3, &QState::entered, this, &MainWindow::LoadConfigurationFSM::loadDeviceSets);
    connect(s4, &QState::entered, this, &MainWindow::LoadConfigurationFSM::loadDeviceSetSettings);
    connect(s5, &QState::entered, this, &MainWindow::LoadConfigurationFSM::loadFeatureSets);
    connect(s6, &QState::entered, this, &MainWindow::LoadConfigurationFSM::restoreGeometry);

    addState(s1);
    addState(s2);
    addState(s3);
    addState(s4);
    addState(s5);
    addState(s6);
    setInitialState(s1);
}

With similar for initialisation, addSampleSource and so on.

I should now hopefully have a bit of free time so I can submit some of the Emscripten patches, so you can build it too.

@f4exb
Copy link
Owner

f4exb commented Aug 27, 2024

The essential of these commits is to make the various device engines truly asynchronous which shouldn't hurt. As part of it they are now using the recommended way to work with QThreads using a thread + worker concept instead of inheriting from QThread and thus they call the delete_later() method on the thread and worker by connecting the appropriate signals. Also it was found that the the destroy() method of ChannelGUI was hiding QWidget destroy() method. With the proper way to handle virtual destructors this may simply be replaced by a delete of the ChannelGUI object and the destroy() method be completely removed.

On the top of it some code issues pointed out by Sonar were fixed.

@f4exb
Copy link
Owner

f4exb commented Aug 29, 2024

I have noticed occasional issues when closing SDRangel while RTLSDR is still running. This may be due to the RTLSDR plugin not implementing the recommended way of using threads so that the deleteLater() method is not called on the thread and the worker causing it to be still running while RTLSDRInput is deleted.

@f4exb
Copy link
Owner

f4exb commented Aug 30, 2024

This destroy() also hides QWidget::destroy() :

class RTLSDRGui : public DeviceGUI {
	Q_OBJECT

public:
	explicit RTLSDRGui(DeviceUISet *deviceUISet, QWidget* parent = nullptr);
	~RTLSDRGui() final;
	void destroy() final;

@srcejon
Copy link
Collaborator Author

srcejon commented Sep 11, 2024

While the blocking in DSPDeviceSourceEngine has been removed, I think we now have a race condition relating to the processing of messages in DSPDeviceSourceEngine.

In MainWindow::sampleSourceCreate, we indirectly call DSPDeviceSourceEngine::setSource, which creates and sends a DSPSetSource msg to itself that is handled on its own thread (which sets m_deviceSampleSource).
Without waiting for that message to be processed, MainWindow::sampleSourceCreate then creates the device GUI, which tries to get the sampleSource via m_deviceUISet->m_deviceAPI->m_deviceSourceEngine->m_deviceSampleSource; This can sometimes return null, if the DSPDeviceSourceEngine has not yet processed the message (and then lots will go will obviously go wrong).

To address this, I had added some signals in DSPDeviceSourceEngine, such as sourceSet and acquistionStopped, that it would emit when the corresponding messages had been processed. The FSMs I'd added in MainWindow would then wait on these signals (which does so without blocking the main thread), before continuing with the next part of creating / removing a device. I need to merge these changes to work with your latest updates. Any other ideas to address it?

@f4exb
Copy link
Owner

f4exb commented Sep 12, 2024

Any other ideas to address it?

Having the connect as QtQueuedConnection but I haven´t checked it yet

Edit: well it is already a Qt::QueuedConnection so it seems it does not take it into account if on the same thread. It is not very clear but perhaps one should let Qt decide.

@srcejon
Copy link
Collaborator Author

srcejon commented Sep 12, 2024

Qt::QueuedConnection - The slot is invoked when control returns to the event loop of the receiver's thread. The slot is executed in the receiver's thread.

Qt::DirectConnection - The slot is invoked immediately when the signal is emitted. The slot is executed in the signalling thread.

It sounds like we want Qt::DirectConnection, if we want to process the message immediately. I'll give it a try.

@srcejon
Copy link
Collaborator Author

srcejon commented Sep 12, 2024

Qt::DirectConnection does seem to cause the message to be handled on the senders thread - however, it's probably not what we want:

a) what's the point of having a thread if we're not going to use it
b) for other messages, we don't want the message to be handled on the senders thread, as that can then block the main thread

@srcejon
Copy link
Collaborator Author

srcejon commented Sep 12, 2024

The following contains MainWindow and DSPDeviceSourceEngine updated to use signals and a FSM for adding a sample source (with a few other changes for Emscripten)

src.zip

(You may need to add StateMachine package in CMakeLists.txt for Qt6.)

Please take a look and let me know if you think we should pursue this, as other parts of MainWindow require additional FSMs. Cheers.

@f4exb
Copy link
Owner

f4exb commented Sep 23, 2024

It looks fine for me

@srcejon
Copy link
Collaborator Author

srcejon commented Sep 23, 2024

Ok, great, I will update the other FSMs as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants