Issues with yarprobotinterface, wholeBodyDynamics, and gravityCompensator #1574

Closed
paliasgh opened this issue Jun 6, 2023 · 31 comments

Comments

@paliasgh

paliasgh commented Jun 6, 2023

Device name 🤖

iCubWaterloo01

Request/Failure description

We are facing a number of issues regarding motors, especially the arms, of our iCub robot after a recent update.

  1. Changing the control mode of any joint in robot's arm to torque control mode will make the joint unavailable due to hardware fault.
  2. If we stop yarprobotinterface and then, start it again, the arms will not get calibrated and stay in the idle mode.
  3. yarprobotinterface produces lots of errors regarding cables under one board are not attached while nothing has been physically changed.

Detailed context

We have recently updated iCubWaterloo01 systems to Distro v2023.02.2 (and icub-firmware-build to v1.34.1 consequently) as per this long discussion. Regarding the issues listed above (log files are attached next):

Changing the control mode of any joint in robot's arm to torque control mode, either through setControlMode(i,VOCAB_CM_TORQUE) or directly in yarpmotorgui, will make the joint unavailable due to hardware fault (code 7):

image

As a result, gravityCompensator does not seem to provide any torque to hold the arms against gravity:

241812112-5b13a486-7658-48c2-9f29-23cba4590c4b.mov

We have run some tests on ports related to yarprobotinterface, wholeBodyDynamics, and gravityCompensator:

icub@icub26:~$ yarp read ... /wholeBodyDynamics/left_arm/FT:i
[INFO] |yarp.os.Port|/tmp/port/1| Port /tmp/port/1 active at tcp://10.0.0.26:10003/
[INFO] |yarp.os.impl.PortCoreInputUnit|/tmp/port/1| Receiving input from /wholeBodyDynamics/left_arm/FT:i to /tmp/port/1 using tcp

icub@icub26:~$ yarp read ... /wholeBodyDynamics/left_arm/Torques:o
[INFO] |yarp.os.Port|/tmp/port/1| Port /tmp/port/1 active at tcp://10.0.0.26:10003/
[INFO] |yarp.os.impl.PortCoreInputUnit|/tmp/port/1| Receiving input from /wholeBodyDynamics/left_arm/Torques:o to /tmp/port/1 using tcp
1 1.56717448240744028354 -0.943850984098839274061 0.151011971455497440164 -0.680200214314772400037 0.0569468937549454587432 -0.0241318946280751102373 0.393043435128048235239
1 1.56125732848709164458 -0.942025830216480564161 0.151010507049799097556 -0.711019595750900679221 0.0568831249057451041051 -0.0238640538652060384128 0.346560790373746152593
icub@icub26:~$ yarp read ... /icub/left_arm/analog:o
[INFO] |yarp.os.Port|/tmp/port/1| Port /tmp/port/1 active at tcp://10.0.0.26:10003/
[INFO] |yarp.os.impl.PortCoreInputUnit|/tmp/port/1| Receiving input from /icub/left_arm/analog:o to /tmp/port/1 using tcp
49.346923828125 -75.0732421875 -34.97314453125 -2.5634765625 -3.753662109375 0.3662109375
49.713134765625 -73.79150390625 -33.69140625 -2.471923828125 -3.753662109375 0.3662109375
50.1708984375 -74.981689453125 -34.881591796875 -2.5634765625 -3.753662109375 0.3662109375
icub@icub26:~$ yarp name list | grep -i analog
registration name /iCub/left_hand/analog:o ip 10.0.0.26 port 10376 type tcp
registration name /iCub/left_hand/analog:o/rpc:i ip 10.0.0.26 port 10375 type tcp
registration name /iCub/right_hand/analog:o ip 10.0.0.26 port 10378 type tcp
registration name /iCub/right_hand/analog:o/rpc:i ip 10.0.0.26 port 10377 type tcp
registration name /icub/left_arm/analog:o ip 10.0.0.2 port 10099 type tcp
registration name /icub/left_arm/analog:o/rpc:i ip 10.0.0.2 port 10066 type tcp
registration name /icub/left_foot/analog:o ip 10.0.0.2 port 10104 type tcp
registration name /icub/left_foot/analog:o/rpc:i ip 10.0.0.2 port 10071 type tcp
registration name /icub/left_hand/analog:o ip 10.0.0.2 port 10084 type tcp
registration name /icub/left_hand/analog:o/rpc:i ip 10.0.0.2 port 10051 type tcp
registration name /icub/left_leg/analog:o ip 10.0.0.2 port 10102 type tcp
registration name /icub/left_leg/analog:o/rpc:i ip 10.0.0.2 port 10069 type tcp
registration name /icub/right_arm/analog:o ip 10.0.0.2 port 10100 type tcp
registration name /icub/right_arm/analog:o/rpc:i ip 10.0.0.2 port 10067 type tcp
registration name /icub/right_foot/analog:o ip 10.0.0.2 port 10103 type tcp
registration name /icub/right_foot/analog:o/rpc:i ip 10.0.0.2 port 10070 type tcp
registration name /icub/right_hand/analog:o ip 10.0.0.2 port 10085 type tcp
registration name /icub/right_hand/analog:o/rpc:i ip 10.0.0.2 port 10052 type tcp
registration name /icub/right_leg/analog:o ip 10.0.0.2 port 10101 type tcp
registration name /icub/right_leg/analog:o/rpc:i ip 10.0.0.2 port 10068 type tcp
icub@icub26:~$ yarp ping /wholeBodyDynamics/left_arm/FT:i
This is "/wholeBodyDynamics/left_arm/FT:i" at "tcp://10.0.0.26:10161/"
There are no outgoing connections
There is an input connection from "/icub/left_arm/analog:o" to "/wholeBodyDynamics/left_arm/FT:i" using tcp
There is an input connection from "<ping>" to "/wholeBodyDynamics/left_arm/FT:i" using text_ack
icub@icub26:~$ yarp ping /wholeBodyDynamics/right_arm/FT:i
This is "/wholeBodyDynamics/right_arm/FT:i" at "tcp://10.0.0.26:10162/"
There are no outgoing connections
There is an input connection from "/icub/right_arm/analog:o" to "/wholeBodyDynamics/right_arm/FT:i" using tcp
There is an input connection from "<ping>" to "/wholeBodyDynamics/right_arm/FT:i" using text_ack

If we stop yarprobotinterface through yarpmanager and then, start it again, the arms will not get calibrated:

243501777-70145bb5-7658-4cb1-8c99-7a1d6e3457df.mov

In fact, the main joints stay in idle mode (I can switch them back to position control, though) and some others report a fault, as you can see in the screenshot below.

image

This means that every time we want to restart yarprobotinterface, we have to manually shut down the motors (by pressing the button on the robot's backpack) and start them again. It was not like this before.

Based on the yarprobotinterface output that contains following messages,

[ERROR] from BOARD 10.0.1.11 (right_leg-eb11-skin), src CAN1, adr 0, time 582s 881m 129u: (code 0x00000019, par16 0x4001 par64 0x0000000000000000) ->
SYS: EOtheCANservice could not tx frames on CAN bus. In par16 there is: on msb the size of txfifo, on lsb a code. + .

[ERROR] from BOARD 10.0.1.11 (right_leg-eb11-skin), src CAN2, adr 0, time 582s 884m 128u: (code 0x00000019, par16 0x3f01 par64 0x0000000000000000) ->
SYS: EOtheCANservice could not tx frames on CAN bus. In par16 there is: on msb the size of txfifo, on lsb a code. + .

we know that "the cables on CAN1 and CAN2 under board 10.0.1.11 are not attached". In fact, no board is discovered via FirmwareUpdater under 10.0.1.11, nor under 10.0.1.20 and 10.0.1.21. Could you please help us check the cables? This is strange, since nothing has happened to our robot physically in recent times that could have caused a cable to detach.
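
For reference, the par16 field in the two messages above can be unpacked as the log text describes ("on msb the size of txfifo, on lsb a code"). This is only a sketch: it assumes that "msb" and "lsb" mean the most and least significant byte of the 16-bit value, and the function name is made up for illustration.

```python
def decode_can_tx_par16(par16):
    """Split par16 into (txfifo_size, code).

    Assumption: the most significant byte holds the txfifo size and the
    least significant byte holds the code, as the log message suggests.
    """
    txfifo_size = (par16 >> 8) & 0xFF
    code = par16 & 0xFF
    return txfifo_size, code


if __name__ == "__main__":
    # par16 values copied from the CAN1 and CAN2 error lines above.
    for bus, par16 in (("CAN1", 0x4001), ("CAN2", 0x3F01)):
        size, code = decode_can_tx_par16(par16)
        print(f"{bus}: txfifo size = {size}, code = {code}")
```

Under that reading, both buses report the same code, with 64 and 63 frames in the transmit FIFO respectively.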

Thanks so much for your help!

Additional context

Here are console outputs of yarprobotinterface, wholeBodyDynamics, and gravityCompensator after a fresh start of the robot:

yarprobotinterface.log (The log may contain errors related to this issue, which still exists; we are going to get back to that one next!)

gravityCompensator.log
wholeBodyDynamics.log

After closing yarprobotinterface, rerunning it produces this output (when arms are not calibrated):

yarprobotinterface_second_run.log

How does it affect you?

No response

@pattacini
Member

pattacini commented Jun 7, 2023

Thank you @paliasgh for the thorough description of the problems.

Just for our internal debug, I'll start reporting below some considerations.

🔲 WBD (aka wholeBodyDynamics)

The following error caught my attention:

[ERROR] |yarp.os.Network| Failure: could not find source port /icub/inertial

This is an old name related to a deprecated way of dealing with inertial data.
At any rate, inspecting the code, I managed to track it down at this line:

Network::connect(string("/"+robot_name+"/inertial").c_str(), string("/"+local_name+"/unfiltered/inertial:i").c_str(),"tcp",false);

There's no port named /wholeBodyDynamics/unfiltered/inertial:i; thus, this should be just a leftover we could clean up 🧹
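
To make the link between the error message and the quoted line explicit, the sketch below mirrors the string concatenation of that Network::connect call in Python. The values "icub" and "wholeBodyDynamics" are inferred from the port names in this thread, and the helper name is hypothetical.

```python
def inertial_connection(robot_name, local_name):
    """Build the (source, destination) port names the same way the quoted C++ line does."""
    source = "/" + robot_name + "/inertial"
    destination = "/" + local_name + "/unfiltered/inertial:i"
    return source, destination


if __name__ == "__main__":
    # Assumed values, taken from the port names that appear in the logs.
    print(inertial_connection("icub", "wholeBodyDynamics"))
```

With those values the source port is /icub/inertial, which is the name reported in the error.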

@Nicogene, did I get this right?

👉🏻 However, the real problem is the timeout that occurs on the MAS device, still related to the IMU, I suppose:

[ERROR] |yarp.device.multipleanalogsensorsclient| No data received in the last 0.104186 seconds, timeout enabled.

Once in timeout, WBD stops its activity causing the robot to go HF 🔴 when asked to switch to torque mode.
Presumably, TBC.

@Nicogene
Member

Nicogene commented Jun 7, 2023

Thank you @paliasgh for the thorough description of the problems.

Just for our internal debug, I'll start reporting below some considerations.

🔲 WBD (aka wholeBodyDynamics)

The following error caught my attention:

[ERROR] |yarp.os.Network| Failure: could not find source port /icub/inertial

This is an old name related to a deprecated way of dealing with inertial data. At any rate, inspecting the code, I managed to track it down at this line:

Network::connect(string("/"+robot_name+"/inertial").c_str(), string("/"+local_name+"/unfiltered/inertial:i").c_str(),"tcp",false);

There's no port named /wholeBodyDynamics/unfiltered/inertial:i; thus, this should be just a leftover we could clean up 🧹

@Nicogene, did I get this right?

It seems that unfiltered/inertial:i is an old but harmless leftover; in any case, @pattacini removed it in robotology/icub-main@18b7803. We do not need it, since we are using the MASclient to read the IMU.

👉🏻 However, the real problem is the timeout that occurs on the MAS device, still related to the IMU, I suppose:

[ERROR] |yarp.device.multipleanalogsensorsclient| No data received in the last 0.104186 seconds, timeout enabled.

Once in timeout, WBD stops its activity causing the robot to go HF 🔴 when asked to switch to torque mode. Presumably, TBC.

Yes, this should be the problem: 100 ms of delay is a lot, and this

[ERROR] |yarp.device.multipleanalogsensorsclient| Sensor of type ThreeAxisLinearAccelerometers with index 0 has non-MAS_OK status.

and this:

[WARNING] network delays detected (1/10)

are symptoms that something is not working on the IMU side.

Some checks that could be done are:

  • Check if the IMU is streaming data consistently by reading from /icub/head/inertials/measures:o
  • Try to run wbd from icub-head and see if it is still problematic.
  • Check the version of the rfe board w/ FirmwareUpdater

If on the icub-head wbd runs fine, it means that there are network issues between icub-head and the laptop.
If you cannot discover the rfe via FirmwareUpdater it means that there are issues in the CAN connection.

I recall that in the past we had issues with the board freezing, but this problem should be fixed in the most recent rfe firmware.

cc @marcoaccame

@paliasgh
Author

paliasgh commented Jun 7, 2023

Hi @pattacini and @Nicogene , thank you very much for looking into the problem.

The port you noted streams data:

icub@icub26:~$ yarp read ... /icub/head/inertials/measures:o
[INFO] |yarp.os.Port|/tmp/port/1| Port /tmp/port/1 active at tcp://10.0.0.26:10003/
[INFO] |yarp.os.impl.PortCoreInputUnit|/tmp/port/1| Receiving input from /icub/head/inertials/measures:o to /tmp/port/1 using tcp
(((0.0625 0.0 0.0) 1686171416.2559928894)) (((-0.200000000000000011102 -0.359999999999999986677 -9.80000000000000071054) 1686171416.25599217415)) (((1.45624999999999999101e-05 1.71875000000000010354e-05 4.87499999999999991175e-05) 1686171416.25599265099)) () () () () () () ()
(((0.0 0.0625 0.0625) 1686171416.2660138607)) (((-0.22000000000000000111 -0.349999999999999977796 -9.81000000000000049738) 1686171416.26601314545)) (((1.45624999999999999101e-05 1.71875000000000010354e-05 4.87499999999999991175e-05) 1686171416.26601338387)) () () () () () () ()
(((0.0 0.0 0.0) 1686171416.27603173256)) (((-0.22000000000000000111 -0.340000000000000024425 -9.81000000000000049738) 1686171416.27603125572)) (((1.45624999999999999101e-05 1.71875000000000010354e-05 4.87499999999999991175e-05) 1686171416.27603149414)) () () () () () () ()
(((0.0 -0.1875 0.0) 1686171416.28613686562)) (((-0.22000000000000000111 -0.349999999999999977796 -9.81000000000000049738) 1686171416.28613615036)) (((1.45624999999999999101e-05 1.71875000000000010354e-05 4.87499999999999991175e-05) 1686171416.2861366272)) () () () () () () ()
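
As a rough way to check how regularly the port streams, the timestamps of consecutive samples can be compared. The sketch below is an illustration only: it keeps just the first sensor group of each of the four samples above, and the regular expression assumes the nesting shown in this output.

```python
import re

# First sensor group of each sample printed above (the rest of each line is omitted).
SAMPLES = [
    "(((0.0625 0.0 0.0) 1686171416.2559928894))",
    "(((0.0 0.0625 0.0625) 1686171416.2660138607))",
    "(((0.0 0.0 0.0) 1686171416.27603173256))",
    "(((0.0 -0.1875 0.0) 1686171416.28613686562))",
]

# Matches the timestamp that closes a "((x y z) timestamp))" group.
TIMESTAMP_RE = re.compile(r"\)\s+(\d+\.\d+)\)\)")


def first_timestamp(line):
    """Return the first timestamp found in a line, or None if there is none."""
    match = TIMESTAMP_RE.search(line)
    return float(match.group(1)) if match else None


def sample_intervals(lines):
    """Return the time differences, in seconds, between consecutive samples."""
    stamps = [t for t in map(first_timestamp, lines) if t is not None]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for interval in sample_intervals(SAMPLES):
        print(f"{interval * 1000:.2f} ms")
```

For these four samples the intervals are all close to 10 ms, far below the gap of about 0.1 s reported by the multipleanalogsensorsclient error.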

It seems like wbd runs fine in icub-head (log is attached, those sensor errors are gone). I am able to enable torque control this time when wbd is running. Just to mention, in the wbd logs, anything after line 321, including the sensor errors, was logged after the closing command was issued (through yarpmanager in the first log and by Ctrl+C in this log).

wholeBodyDynamics_icub-head.log

Screenshot_2023-06-07_16-48-50

Running gravityCompensator in icub-head still gives [WARNING] network delays detected (1/10) as the last output before closing the module in line 83.

gravityCompensator_icub-head.log

We do not seem to yet have any gravity compensation in action.

Lastly, I think there are some unusual things going on regarding rfe boards. In the initial issue, I mentioned that no board is discovered via FirmwareUpdater under 10.0.1.11 (tactile sensors of one of the legs, I think), nor under 10.0.1.20 and 10.0.1.21 (I did not know what those could be). Just now, I could discover one of the rfe boards, with version 1.3.0, under 10.0.1.21. Still nothing can be discovered under 10.0.1.20 (possibly another rfe?).

Screenshot_2023-06-07_17-09-20

@pattacini
Member

Running gravityCompensator in icub-head still gives [WARNING] network delays detected (1/10) as the last output before closing the module in line 83.

It shouldn't happen but it's not operationally critical yet as the gravityCompensator stops working when it reaches 10 events of this type. See https://github.com/robotology/icub-main/blob/master/src/modules/gravityCompensator/gravityThread.cpp#L662-L666.
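
To see how close a run gets to that limit, the counter can be read straight from the log. The sketch below is not taken from gravityCompensator itself: it only parses the warning text quoted in this thread and assumes that "(1/10)" means "events so far / limit". The function name is made up.

```python
import re

# Assumed format, based on the warning quoted in this thread.
DELAY_RE = re.compile(r"network delays detected \((\d+)/(\d+)\)")


def delay_status(log_lines):
    """Return (count, limit) from the last delay warning, or None if none was logged."""
    status = None
    for line in log_lines:
        match = DELAY_RE.search(line)
        if match:
            status = (int(match.group(1)), int(match.group(2)))
    return status


if __name__ == "__main__":
    log = ["[WARNING] network delays detected (1/10)"]
    status = delay_status(log)
    if status is None:
        print("no network delay warnings")
    else:
        count, limit = status
        print(f"{count} of {limit} delay events; limit reached: {count >= limit}")
```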

Looking at the gravityCompensator log, it seems all good, aside from that warning.

We do not seem to yet have any gravity compensation in action.

What exactly have you been seeing? Are the arms falling down when switched in torque mode, although WBD is ok?

@Nicogene
Member

Nicogene commented Jun 8, 2023

Hi @paliasgh!

The rfe fw version is the latest one, so should be fine on that side.

It seems like wbd runs fine in icub-head (log is attached, those sensor errors are gone).

This probably means that there are issues in the network connection between the laptop and icub-head; you can check with Linux commands such as ping, iperf, etc.

We do not seem to yet have any gravity compensation in action.

Unfortunately, wbd in icub-main is a very old executable with several hardcoded dynamics parameters that have probably become obsolete; this is probably why the gravity compensation does not work very well.

We plan to migrate to the iDynTree::WholeBodyDynamics (see robotology/robots-configuration#202), which instead takes the robot's dynamic/kinematic characteristics from the URDF. With this, gravity compensation should work better.

cc @traversaro

@pattacini
Member

2. If we stop yarprobotinterface and then, start it again, the arms will not get calibrated and stay in the idle mode.

I've seen that the second log contains lots of MC: hard limit reached for several limbs, which can prevent the robot from starting.

I don't know why simply power cycling the boards makes these issues disappear. Any clue @marcoaccame?

@paliasgh, did you try to put the robot in a more "comfortable" configuration [1] before starting it the second time? Just as a quick test.

Footnotes

  1. iCub is fully idle so you should keep it by hand as much as possible within the joints bounds.

@paliasgh
Author

paliasgh commented Jun 8, 2023

Hello @pattacini and @Nicogene, thanks for the feedback!

It shouldn't happen but it's not operationally critical yet as the gravityCompensator stops working when it reaches 10 events of this type.

gravityCompensator only generated one of those warnings in my test today, so no serious issue here.

What exactly have you been seeing? Are the arms falling down when switched in torque mode, although WBD is ok?

Yes, they fall down. Having arm joints in the torque control mode with wbd and gc running is not any different (in terms of ease of moving the arm, arm falling down, etc.) from having them in the idle mode without wbd and gc running. It now behaves like this in torque control mode:

IMG_3301.mov

@paliasgh, did you try to put the robot in a more "comfortable" configuration [1] before starting it the second time? Just as a quick test.

Yes, even without any change in the arms position, yarprobotinterface will not start the arms after a second start:

IMG_3302.mov

@pattacini
Member

Yes, they fall down. Having arm joints in the torque control mode with wbd and gc running is not any different (in terms of ease of moving the arm, arm falling down, etc.) from having them in the idle mode without wbd and gc running. It now behaves like this in torque control mode:

Hi @paliasgh,
We have planned to test the gc's functionality (point 1) in our next sprint (next 2 weeks). Stay tuned.

We will also organize our activities to cover points 2 and 3.

@AntonioConsilvio
Contributor

Hi @paliasgh, regarding point 3:
Board 10.0.1.11 (EB11) is located in the right leg of the robot. It should be on the outer side of the leg (obviously under the covers), as you can see from the following photos:

image
EB11 board

The EB11 board has 2 CAN connectors (CAN1 and CAN2) to which the MTB4 boards (which are screwed on the cover) must be connected.

So, when you reach the board, you need to find the cables called 11N1 and 11N2.

The cable 11N1 connects the EB11 board to an MTB4 on the thigh cover.

The cable 11N2 connects the EB11 board to an MTB4 on the ankle cover.

Please remember to give feedback and if you have any questions, write to us.

@paliasgh However, I don't quite understand whether you have the same error on the CAN of boards 10.0.1.20 and 10.0.1.21 or not?

cc @sgiraz @pattacini

@paliasgh
Author

paliasgh commented Jun 9, 2023

Hi @AntonioConsilvio , thanks for the help!

I opened the thigh cover and looked at the EB11 board. 11N1 was connected to one of the MTB4s in the front cover and then, to the back cover of the thigh. 11N2 was also connected to EB11, but I did not check the other end of its connection as it was going into the ankle. We do not move the robot's leg, so I think it is not very likely that a cable is disconnected. Can you please help us instead disable the leg tactile sensor, for now, to get rid of the error?

IMG_3312

IMG_3311

IMG_3313

@paliasgh However, I don't quite understand whether you have the same error on the CAN of boards 10.0.1.20 and 10.0.1.21 or not?

I do not know either. The unusual thing that caught my attention was that nothing is discovered under 10.0.1.20 (see screenshot below). Should there normally be something there?

image

@AntonioConsilvio
Contributor

Hi @paliasgh,

  • If you want to solve the problem, you should test the continuity of the cables (11N1 and 11N2, just the part connecting the EB11 to the first MTB4) with a tester.

    If the continuity of the cables is good, perhaps the problem is inside the EB11 board.

    There is another test you can do: you can switch on the robot and check whether the MTB4s inside the leg have their LEDs
    switched on or not. If they are not lit, it means that the problem is the black or red cable.

  • If you prefer to disable the tactile leg sensor, you can go into the robots-configuration files and comment out these two lines inside icub_all.xml: first and second.

    I suggest adding another comment explaining why you commented out these lines.

    This should be sufficient to disable the tactile sensors on the right leg.

Please remember to give feedback and if you have any questions, write to us.

Regarding board 10.0.1.20, it has no CAN board under it, so it is normal that FirmwareUpdater shows nothing.

However, if there are CAN boards detached, the logger will show errors very similar to the errors of the EB11 board:

[ERROR] from BOARD 10.0.1.11 (right_leg-eb11-skin), src CAN1, adr 0, time 582s 881m 129u: (code 0x00000019, par16 0x4001 par64 0x0000000000000000) ->
SYS: EOtheCANservice could not tx frames on CAN bus. In par16 there is: on msb the size of txfifo, on lsb a code. + .

[ERROR] from BOARD 10.0.1.11 (right_leg-eb11-skin), src CAN2, adr 0, time 582s 884m 128u: (code 0x00000019, par16 0x3f01 par64 0x0000000000000000) ->
SYS: EOtheCANservice could not tx frames on CAN bus. In par16 there is: on msb the size of txfifo, on lsb a code. + .

So, the board 10.0.1.20 should have no problems.

cc @sgiraz

@martinaxgloria

Hi @paliasgh,

in the past few days, we did some internal tests in order to replicate your first issue (i.e. Changing the control mode of any joint in robot's arm to torque control mode will make the joint unavailable due to hardware fault.). In particular, we managed to run WBD and gc on icub-head without errors, and from yarpmotorgui we were able to switch the joints into torque control.
We tried to put the arm in different configurations while in this control mode and we logged some values in order to understand whether anything happens when gc is enabled. We found that something does happen, because the values are not zero, but we still have to investigate the amount of compensation being sent to the joints. Stay tuned!

@martinaxgloria

Moreover, while doing the tests mentioned above, I also observed problem no. 2 (i.e. If we stop yarprobotinterface and then, start it again, the arms will not get calibrated and stay in the idle mode.). We are going to investigate this problem in more depth in the coming days as well.

cc @valegagge

@valegagge
Member

Today I performed some tests related to the arms not getting calibrated when yarprobotinterface starts without the motors having been switched off and on.

I did the following steps:

  • switch on the motor

  • start yarprobotinterface (without calibration to speedup the tests.)

  • calibrate finger joints 11 to 15 with yarpmotorgui

  • move all calibrated joints with yarpmotorgui.
    ==> until now all is ok
    Screenshot from 2023-07-05 10-23-19

  • stop yarprobotinterface

  • start again yarprobotinterface without the calibration

  • ==> now the finger joints are in hw fault
    Screenshot from 2023-07-05 10-52-42

Due to the finger faults, the calibration of the whole arm cannot be achieved.

Checking the figures above, it seems that when the controller checks the motor positions of the fingers, it considers them out of the range defined in the configuration, even though this is not true; moreover, the motor positions are almost the same as in the first situation, where everything is fine.

These are the motor limits used.
Screenshot from 2023-07-05 10-58-36

This is a firmware bug. We are going to fix it.

cc @MSECode

@MSECode

MSECode commented Jul 26, 2023

In relation to the comment above, here are the PRs that solve the issue of the joints going into hw fault after yarprobotinterface is restarted.

cc: @valegagge

@paliasgh
Author

paliasgh commented Jul 28, 2023

Hello @MSECode and @valegagge,

Thanks so much for working on the issue. Today, I updated the firmware of the boards to pull #95. For our robot, this meant updating ems4 boards from 3.65 to 3.70, mc4plus boards from 3.68 to 3.73, and strain2 boards from 2.1.0 to 2.2.0 (this last one was not part of the fix, but I assumed it is also needed for us to have the same configuration as what you tested on). There were no mc2plus, amc_wr, or amc_wl boards identified to update, as the title suggested. After the update, we now have:

79D2EE64-00DD-4AAC-93DD-745774966AE7_1_102_o

DE784E32-CB7B-4490-813D-1D67AD384DDC_1_102_o

32F7D132-EB3D-486B-A79F-675C0124A11A_1_102_o

During the update, something happened that caused another, even more serious, issue! One strain2 board, under 10.0.1.6, disappeared during the update (similar to this issue with another board). You can compare the photo below, taken before updating the strain2 boards, with the second photo above. I tried restarting everything, but FirmwareUpdater still no longer sees it.

35FD7D02-5825-499A-BBBD-23150657DB3C_1_201_a

Now, yarprobotinterface can not even run due to errors regarding that board, mainly:

[ERROR]  from BOARD 10.0.1.6 (left_leg-eb6-j0_3), src LOCAL, adr 0, time 86s 654m 427u: (code 0x00000013, par16 0x0001 par64 0x0000000000000000) -> SYS: the EOtheInfoDispatcher could not accept a eOmn_info_properties_t item inside its transmitting queue. In par16 there is the number of lost items. + . 
[ERROR]  from BOARD 10.0.1.6 (left_leg-eb6-j0_3), src CAN2, adr 0, time 86s 621m 128u: (code 0x00000019, par16 0x4001 par64 0x0000000000000000) -> SYS: EOtheCANservice could not tx frames on CAN bus. In par16 there is: on msb the size of txfifo, on lsb a code. + . 

Here is a complete log:

yarprunlog_28_07_2023_14_28_17.log

Thanks again for your support.

--
Update: I noticed in the attached log file that we were also getting errors like wrong eobrd_strain2 BOARD 10.0.1.1:CAN2:13 because it has: WRONG APPLICATION VERSION for all strain2 boards that were updated to 2.2.0. I tried downgrading the strain2 boards back to 2.1.0. During the process, another strain2, under 10.0.1.9, also disappeared! We are now missing at least two strain2 boards. For now, I commented out everything leg-related in our robot configuration file, so yarprobotinterface can start with only the upper body of our robot.

The previous issue, i.e., joints going in hw fault after yarprobotinterface restarting, still exists.

@MSECode

MSECode commented Jul 28, 2023

cc: @marcoaccame @sgiraz @mfussi66
If I'm not mistaken the problem reported above and reported here:

"During the update, something happened that caused another, even more serious, issue! One strain2 board, under 10.0.1.6, disappeared during the update"

seems similar to one we had updating the strain2.

@paliasgh
Author

Hello @pattacini, @MSECode, @valegagge, and everyone,

I wanted to follow up on the status of this issue. As I reported here, the fix did not solve the issue of the joints going in hw fault after yarprobotinterface restarting in our robot. Is there anything we need to check again? Thanks a lot.

@MSECode

MSECode commented Oct 2, 2023

That's unexpected, since a couple of days ago I updated a robot, which should have the same sw and hw architecture as yours, to the latest Distro and firmware, and we had no issues.
Anyway, @valegagge @pattacini, should we organize a call together to dig into the issue and solve it asap? Does that make sense to you?

@valegagge
Member

valegagge commented Oct 2, 2023

Anyway, @valegagge @pattacini, should we organize a call together to dig into the issue and solve it asap? Does that make sense to you?

Hi @paliasgh,
as the first thing, I suggest updating the entire robot to the new distro 2023.08.0.

If you still have the issue related to the strain boards, please tell us. thank you!!

@paliasgh
Author

paliasgh commented Oct 2, 2023

Hello @valegagge,

I updated our systems and the robot to v2023.08.0. We still have an issue with strain2 boards. When updating the firmware to v1.36.0 today, the strain2 boards under 10.0.0.3 and 10.0.1.7 also disappeared. It seems like 10.0.0.3 is for the robot's right arm, so we no longer even have the upper body of the robot working. Basically, there are only two strain2 boards left that can be seen. The rest are all gone!

Screenshot_2023-10-02_13-38-07

--
Update: I can only start the robot when both the right and left arms are commented out in the icub_all.xml file. For the right arm, I believe the reason is that the strain2 board under 10.0.0.3 cannot be seen (as reported above). As for not being able to start the left arm, the reason is this error in yarprobotinterface, I think:

[WARNING] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 544u : CFG: CANdiscovery has detected  eobrd_strain2 board in CAN2 addr 13 with can protocol ver 2.0 and application ver 2.2.0 Search time was 0 ms
[ERROR] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 655u : CFG: CANdiscovery detected 1 invalid eobrd_strain2 boards in CAN2:
	 1 of 1: wrong eobrd_strain2 because it has: WRONG APPLICATION VERSION 
[ERROR] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 774u : CFG: EOtheSTRAIN cannot be configured. CANdiscovery fails for board at addr:13 and port:1 with can protocol ver 2.0 and application ver 2.1. Strain number is:0

However, the version of strain2 under 10.0.0.1 is 2.2.0, as in Firmware v1.36.0. Isn't that correct? Is it expecting version 2.1.0 for strain2?
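
The mismatch in these three log lines can be pulled out programmatically. The sketch below is only an illustration: the regular expressions are written against the exact wording of the lines above, and it assumes that the version in the "CANdiscovery fails" line (2.1) is the one requested by the robot configuration, which the log does not state explicitly.

```python
import re

# The three log lines reported above, copied verbatim.
LOG = [
    "[WARNING] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 544u : CFG: CANdiscovery has detected  eobrd_strain2 board in CAN2 addr 13 with can protocol ver 2.0 and application ver 2.2.0 Search time was 0 ms",
    "[ERROR] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 655u : CFG: CANdiscovery detected 1 invalid eobrd_strain2 boards in CAN2:",
    "[ERROR] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 774u : CFG: EOtheSTRAIN cannot be configured. CANdiscovery fails for board at addr:13 and port:1 with can protocol ver 2.0 and application ver 2.1. Strain number is:0",
]

VERSION = r"(\d+(?:\.\d+)*)"
DETECTED_RE = re.compile(r"has detected\s+\w+ board in CAN\d addr \d+ .*application ver " + VERSION)
EXPECTED_RE = re.compile(r"fails for board at addr:\d+ .*application ver " + VERSION)


def parse_version(text):
    """Turn a dotted version string such as '2.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def find_version_mismatch(lines):
    """Return (detected, expected, mismatch) for the application version in the log."""
    detected = expected = None
    for line in lines:
        found = DETECTED_RE.search(line)
        if found:
            detected = parse_version(found.group(1))
        wanted = EXPECTED_RE.search(line)
        if wanted:
            expected = parse_version(wanted.group(1))
    if detected is None or expected is None:
        return detected, expected, False
    # Compare only as many fields as the (assumed) expected version specifies.
    mismatch = detected[:len(expected)] != expected
    return detected, expected, mismatch


if __name__ == "__main__":
    print(find_version_mismatch(LOG))
```

Under that assumption, the board reports 2.2.0 while 2.1 is requested, which would explain the WRONG APPLICATION VERSION error.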

@MSECode

MSECode commented Oct 4, 2023

Hi @paliasgh,
for the strain2 under address 10.0.1.1 on CAN2:13, does the version defined in the robots-configuration files match what FirmwareUpdater is seeing? Specifically, is the application version set there to 2.2.0? I suppose that, for that strain2, there is a mismatch between the configuration files and the data sent over CAN after the discovery signal.
As for the strain2 boards that are not seen at all, we will check.

@valegagge
Member

Hi @paliasgh,
for sure you need to update the configuration by setting 2.2.0 as the fw version, as @MSECode said before.

Anyway, we can verify that the update to the new distro works fine without the strain and then fix the strain issue.

So, you need to remove the strain boards configurations and their wrappers from the configuration file:
image

I prepared waterloo_icub_all_no_strain.txt for you, which starts all devices except the strain. Could you try running yarprobotinterface with it? Everything should be ok.
waterloo_icub_all_no_strain.xml.txt

Obviously, you cannot run wholeBodyDynamics without strain.

@paliasgh
Author

paliasgh commented Oct 4, 2023

Hello @MSECode and @valegagge,

  • Regarding the version conflict, I set 2.2.0 in all the xml files in iCubWaterloo01/hardware/FT. This should be everything that needed to be changed, right? That fixed the left arm. Thanks.

  • I could start the robot, entirely, with waterloo_icub_all_no_strain.xml configuration. The new distro works fine. In fact, one of the original problems (If we stop yarprobotinterface and then, start it again, the arms will not get calibrated and stay in the idle mode.) is solved with the update! Thanks a lot.

Here is a log file when running yarprobotinterface.
yarprunlog_04_10_2023_11_48_34.log

@MSECode

MSECode commented Oct 4, 2023

Hi @paliasgh,

  • first point is correct. That should be all. All the data related to the CAN configuration of the strain should be there.
  • good news for the second point. That's great.
  • We are investigating why the strain cannot be discovered under some conditions.

Thanks for the update

@Nicogene
Member

Nicogene commented Oct 6, 2023

Hi @paliasgh,

in the meantime, you could run wholeBodyDynamics without FT sensors by using the option --dummy_ft; see https://github.com/robotology/icub-main/blob/bd74c24d0aae5918d62a97ad3354ed5793a0b0f7/src/modules/wholeBodyDynamics/main.cpp#L529C34-L529C34

@MSECode

MSECode commented Oct 19, 2023

Hello @valegagge,

I updated our systems and the robot to v2023.08.0. We still have an issue with strain2 boards. When updating the firmware to v1.36.0 today, the strain2 boards under 10.0.0.3 and 10.0.1.7 also disappeared. It seems like 10.0.0.3 is for the robot's right arm, so we no longer even have the upper body of the robot working. Basically, there are only two strain2 boards left that can be seen. The rest are all gone!

Screenshot_2023-10-02_13-38-07

-- Update: I can only start the robot when both the right and left arms are commented out in the icub_all.xml file. For the right arm, I believe the reason is that the strain2 board under 10.0.0.3 cannot be seen (as reported above). As for not being able to start the left arm, the reason is this error in yarprobotinterface, I think:

[WARNING] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 544u : CFG: CANdiscovery has detected  eobrd_strain2 board in CAN2 addr 13 with can protocol ver 2.0 and application ver 2.2.0 Search time was 0 ms
[ERROR] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 655u : CFG: CANdiscovery detected 1 invalid eobrd_strain2 boards in CAN2:
	 1 of 1: wrong eobrd_strain2 because it has: WRONG APPLICATION VERSION 
[ERROR] from BOARD 10.0.1.1 (left_arm-eb1-j0_3) time=2205s 973m 774u : CFG: EOtheSTRAIN cannot be configured. CANdiscovery fails for board at addr:13 and port:1 with can protocol ver 2.0 and application ver 2.1. Strain number is:0

However, the version of strain2 under 10.0.0.1 is 2.2.0 as in Firmware v1.36.0. Isn't it correct? Is it expecting version 2.1.0 for strain2 ?

Hi @paliasgh,
regarding the discovery and update of CAN boards that cannot be found in the usual way, such as the strain2: when a CAN board cannot be found, the discovery message should be saved in the bootloader of the board. To do that, you need to trigger a restart of the Ethernet board and, right after that (within 5 seconds), trigger the Discovery command in FirmwareUpdater. This will allow you to discover the "hidden" CAN boards, as illustrated in the new section of the documentation:

https://icub-tech-iit.github.io/documentation/icub_firmware/firmwareupdater/firmwareupdater/#discover-hidden-can-boards

Please check it out; I hope this helps, and sorry for the late reply.

cc: @pattacini @valegagge @sgiraz @Nicogene

@paliasgh
Author

Hello @MSECode,

Many thanks! It seems to be working. Two things to check, just to make sure:

  • I could not discover any strain2 boards under 10.0.1.5. Is this normal?
Screenshot 2023-10-19 at 6 26 47 PM

  • Doing the procedure for boards 10.0.1.10 and 10.0.1.11 led to the discovery of 13 mtb4 boards under each address, which are not seen normally. Should I update all these as well? It seems like I need to do it one by one, since they get lost in case I update them all at once.

image

@MSECode

MSECode commented Oct 20, 2023

Hi @paliasgh,

  • regarding the first point, it is normal that you do not have strain2 boards under the ETH board 10.0.1.5, since it is the torso board and, as you can see from the image below, there is no FT sensor configured for it. So that's fine, no strain2 for 10.0.1.5

image

  • regarding the mtb4 CAN boards, it is not needed to update them.

cc: @pattacini @valegagge @sgiraz @Nicogene

@sgiraz
Contributor

sgiraz commented Nov 29, 2023

Hi @paliasgh,

Do you have any news on this issue?
If everything is okay, I will proceed to close it.

@paliasgh
Author

Hello @sgiraz,

Thanks for checking with us. I was busy with another project during the past months, so I was slow to respond here. I did some tests today and everything with yarprobotinterface looks fine! In fact, all 3 original problems are resolved.

Thanks again @MSECode @pattacini @valegagge @sgiraz @Nicogene @martinaxgloria @AntonioConsilvio for all the support!

@sgiraz sgiraz closed this as completed Dec 4, 2023