SHM Acks degrade over time #1194
Comments
Current (workaround?) branch is here: Currently, this branch only increases the internal timeout, making it less likely that the timeout is missed.
No idea why, but this is what happened. It could be the testing tool.
As always, the issue appeared by itself after some time. The test was performed with the current HEAD of https:/eclipse-ecal/ecal/tree/hotfix/shm_data_drop
1 x publisher
4 x 10ms subscriber:
1 x 10ms subscriber that behaves substantially differently and suddenly received more than what was published!
1 x 100ms subscriber
eCAL Mon shows a slightly different picture:
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5279
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool pub topic 25 1024000"
pmemory: 435613696
usrptime: -1
datawrite: 14336000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5290
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool sub topic 10"
pmemory: 585547776
usrptime: -1
dataread: 14336000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5303
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool sub topic 10"
pmemory: 510046208
usrptime: -1
dataread: 14336000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5315
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool sub topic 10"
pmemory: 510046208
usrptime: -1
dataread: 14336000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5327
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool sub topic 10"
pmemory: 510046208
usrptime: -1
dataread: 14336000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5482
hname: "florian-ubuntu20"
pid: 5353
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool sub topic 10"
pmemory: 510046208
usrptime: -1
dataread: 14336000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5373
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
pparam: "./ecal_perftool sub topic 100"
pmemory: 510046208
usrptime: -1
dataread: 6144000
state {
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 183
component_init_info: "pub|sub|srv|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
processes {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5436
pname: "/usr/bin/ecal_mon_gui-5.12.0"
uname: "eCALMon"
pparam: "/usr/bin/ecal_mon_gui"
pmemory: 934182912
usrptime: -1
state {
severity: proc_sev_healthy
info: "Running"
}
tsync_state: tsync_realtime
tsync_mod_name: "\"ecaltime-localtime\""
component_init_state: 191
component_init_info: "pub|sub|srv|mon|log|time"
ecal_runtime_version: "v5.12.0-nightly2-71-gac103991e"
hgname: "florian-ubuntu20"
}
topics {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5279
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1624945319207"
tname: "topic"
direction: "publisher"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 6
dclock: 65106
dfreq: 14000
hgname: "florian-ubuntu20"
tdatatype {
}
}
topics {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5290
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1628043986590"
tname: "topic"
direction: "subscriber"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 1
dclock: 65003
dfreq: 14000
hgname: "florian-ubuntu20"
tdatatype {
}
}
topics {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5303
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1629142832302"
tname: "topic"
direction: "subscriber"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 1
dclock: 64976
dfreq: 14000
hgname: "florian-ubuntu20"
tdatatype {
}
}
topics {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5315
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1630083352274"
tname: "topic"
direction: "subscriber"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 1
dclock: 64952
dfreq: 14000
hgname: "florian-ubuntu20"
tdatatype {
}
}
topics {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5327
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1631091796732"
tname: "topic"
direction: "subscriber"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 1
dclock: 64943
dfreq: 14000
hgname: "florian-ubuntu20"
tdatatype {
}
}
topics {
rclock: 5482
hname: "florian-ubuntu20"
pid: 5353
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1645633128470"
tname: "topic"
direction: "subscriber"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 1
dclock: 64574
dfreq: 14000
hgname: "florian-ubuntu20"
tdatatype {
}
}
topics {
rclock: 5483
hname: "florian-ubuntu20"
pid: 5373
pname: "/home/florian/Projects/_build/release/ecal_perftool"
uname: "ecal-perftool"
tid: "1654095462610"
tname: "topic"
direction: "subscriber"
tlayer {
type: tl_ecal_shm
confirmed: true
}
tsize: 1024000
connections_loc: 1
message_drops: 33478
dclock: 44831
dfreq: 6000
hgname: "florian-ubuntu20"
tdatatype {
}
}
So I now have a way of immediately triggering the issue. The setup is as follows:
The only thing we have to do is make the subscriber miss the ack timeout once. After missing it once, the ack feature breaks down entirely and never recovers.
Now watch the eCAL ack feature breaking. I think we can say that SHM acks are pretty much useless on Ubuntu at the moment, as a single missed ack destroys the entire feature. I don't have a solution yet, but Windows doesn't show the same effect, so there must be some solution.
Problem Description
eCAL does not seem to handle SHM acknowledgements properly. Even in a system that has enough resources, everything seems to degrade over time and the publisher runs into SHM ack timeouts.
A first analysis shows that if the subscriber is not able to get memfile access within 5 ms (hardcoded!), it omits sending the SHM ack and lets the publisher wait for the timeout.
ecal/ecal/core/src/io/ecal_memfile_pool.cpp
Line 166 in 95480ad
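The behavior described above can be sketched roughly as follows. This is not the eCAL implementation: `MemFile`, `ReadAndAck` and `ReadWhileLockIsHeld` are hypothetical names, and a `std::timed_mutex` stands in for the real memfile lock. It only illustrates how a failed timed lock attempt leads to a skipped ack.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a shared memory file guarded by a timed lock.
struct MemFile {
  std::timed_mutex lock;
  int payload = 0;
};

// Returns true if the payload was read and an ack would be sent,
// false if the lock was not acquired in time and the ack is skipped.
bool ReadAndAck(MemFile& file, std::chrono::milliseconds timeout) {
  assert(timeout.count() >= 0);
  // This constructor calls try_lock_for(timeout) on the mutex.
  std::unique_lock<std::timed_mutex> guard(file.lock, timeout);
  if (!guard.owns_lock()) {
    return false;  // no access in time: no read, no ack
  }
  volatile int value = file.payload;  // stands in for processing the data
  (void)value;
  return true;  // the ack would be sent here
}

// Holds the lock in a second thread while the reader tries with `timeout`.
bool ReadWhileLockIsHeld(std::chrono::milliseconds timeout) {
  MemFile file;
  std::promise<void> locked;
  std::promise<void> release;
  std::thread holder([&] {
    std::lock_guard<std::timed_mutex> hold(file.lock);
    locked.set_value();            // tell the reader the lock is taken
    release.get_future().wait();   // keep it until the reader gave up
  });
  locked.get_future().wait();
  const bool acked = ReadAndAck(file, timeout);
  release.set_value();
  holder.join();
  return acked;
}
```

Calling `ReadWhileLockIsHeld(std::chrono::milliseconds(5))` returns `false` in this model: the reader gives up after 5 ms and nothing signals the publisher, which is then left waiting for its own ack timeout.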
Also, the list of events held by the publisher is never cleaned up, so when subscribers go away, the publisher still tries to receive their events.
ecal/ecal/core/src/readwrite/ecal_writer_shm.cpp
Lines 193 to 205 in 95480ad
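One possible direction for the cleanup is sketched below, under the assumption that the publisher can obtain the set of currently registered subscriber IDs. `AckEvent`, `EventMap` and `PruneStaleEvents` are hypothetical names and do not come from the eCAL sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical handle for a per-subscriber acknowledge event.
struct AckEvent {
  int handle = 0;
};

// Hypothetical map from subscriber ID to its acknowledge event.
using EventMap = std::unordered_map<std::string, AckEvent>;

// Removes every event whose subscriber is no longer in `alive`.
// Returns the number of removed entries.
std::size_t PruneStaleEvents(EventMap& events,
                             const std::unordered_set<std::string>& alive) {
  const std::size_t before = events.size();
  for (auto it = events.begin(); it != events.end();) {
    if (alive.count(it->first) == 0) {
      it = events.erase(it);  // erase returns the next valid iterator
    } else {
      ++it;
    }
  }
  assert(events.size() <= before);
  return before - events.size();
}
```

With such a pruning step run before each write, the publisher would only wait on events of subscribers that still exist. Whether closing the underlying event handles needs extra care is not covered here.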
How to reproduce
ecal.ini
The publishing frequency should be 10 Hz (due to the 100 ms subscriber, which slows everything down), but after some time it drops to 2 Hz, which corresponds to the ack timeout.
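The two frequencies can be reproduced with a small calculation. The model is an assumption: the publisher is taken to send the next sample only after the slowest acknowledgement (or the ack timeout) has elapsed. The 500 ms timeout is inferred from the 2 Hz figure, the 40 ms period is just an example of a publisher faster than the slow subscriber, and `EffectiveFrequencyHz` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>

// Effective publish frequency in Hz when each send blocks until the
// slowest acknowledgement (or the ack timeout) has elapsed.
double EffectiveFrequencyHz(double publish_period_ms, double slowest_ack_ms) {
  assert(publish_period_ms > 0.0 && slowest_ack_ms >= 0.0);
  return 1000.0 / std::max(publish_period_ms, slowest_ack_ms);
}
```

In this model, `EffectiveFrequencyHz(40.0, 100.0)` gives 10 Hz while the 100 ms subscriber still acknowledges, and `EffectiveFrequencyHz(40.0, 500.0)` gives 2 Hz once every send runs into the assumed 500 ms timeout.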
How did you get eCAL?
Ubuntu PPA (apt-get)
Environment
eCAL 5.11.5
Ubuntu 20.04
eCAL System Information