[Accton] AS4630-54TE support SystemHealthMonitor #8183

Merged

seanwu-ec
Contributor

seanwu-ec commented Jul 15, 2021

  1. Implement FanDrawer-Fan hierarchy.
  2. Enable thermalctld, disable pcied.
  3. Implement SystemLED in Chassis.
  4. Correct Fan direction.
  5. Implement required Fan APIs for SystemHealthMonitoring.
  6. Handle non-ASCII characters while reading PSU model/serial number.
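
Item 6 can be illustrated with a minimal sketch, assuming the PSU model and serial fields arrive as raw bytes that may carry padding such as `0xFF` or NUL. The helper name `clean_eeprom_field` and the sample value are hypothetical and are not taken from this PR's code:

```python
import string

# Characters treated as safe to display in a model/serial field (assumption).
_PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")


def clean_eeprom_field(raw: bytes) -> str:
    """Return a printable ASCII string from raw PSU EEPROM bytes.

    Hypothetical helper: non-ASCII bytes are dropped during decoding,
    control characters are filtered out, and surrounding spaces are trimmed.
    """
    text = raw.decode("ascii", errors="ignore")
    return "".join(ch for ch in text if ch in _PRINTABLE).strip()


if __name__ == "__main__":
    # Made-up sample value with trailing padding bytes.
    print(clean_eeprom_field(b"PSU-MODEL-01\xff\xff\x00\n"))
```

Filtering after decoding keeps the function total: it returns a string for any input instead of raising `UnicodeDecodeError` on a bad byte.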

Why I did it

To support the system health monitoring feature.

How I did it

The main change is a new config file, system_health_monitoring_config.json. I also implemented the SystemLED part in Chassis, which HealthChecker manipulates. The remaining changes correct the platform information that HardwareChecker consumes.
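
For orientation, here is a rough sketch of what such a config could contain, written as Python that renders the JSON. The field names follow the SONiC system health monitor config format as I understand it. The ignore entries and the AMBER/GREEN colors are taken from the transcripts in this PR, while the blinking color name and `polling_interval` value are assumptions. The file actually shipped in this PR may differ:

```python
import json

# Illustrative contents only; not copied from the file added by this PR.
config = {
    # Matches the "skip container_checker" scenario in the verification log.
    "services_to_ignore": ["container_checker"],
    # Matches the ignore list shown by `show system-health detail`.
    "devices_to_ignore": ["asic", "psu.temperature"],
    "user_defined_checkers": [],
    "polling_interval": 60,  # placeholder value (assumption)
    "led_color": {
        "fault": "STATUS_LED_COLOR_AMBER",
        "normal": "STATUS_LED_COLOR_GREEN",
        "booting": "STATUS_LED_COLOR_GREEN_BLINKING",  # name inferred
    },
}


def render_config(cfg: dict) -> str:
    """Serialize the config the way it would be stored on disk."""
    return json.dumps(cfg, indent=4)


if __name__ == "__main__":
    print(render_config(config))
```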

How to verify it

Check that system-health passes its checks and that the SystemLED shows the correct state.


///////// booting, DIAG_LED = GREEN_BLINKING /////////

root@sonic:/tmp# show system-health detail 
System is currently booting...
root@sonic:/tmp# cat /sys/class/leds/diag/brightness 
5


///////// container_checker fail, DIAG_LED = AMBER /////////

root@sonic:/sys/bus/i2c/devices# show system-health detail
System status summary

  System status LED  STATUS_LED_COLOR_AMBER
  Services:
    Status: OK
  Hardware:
    Status: Not OK
    Reasons: container_checker is not Status ok

System services and devices monitor list

Name                        Status    Type
--------------------------  --------  ----------
container_checker           Not OK    Program
sonic                       OK        System
rsyslog                     OK        Process
root-overlay                OK        Filesystem
var-log                     OK        Filesystem
routeCheck                  OK        Program
diskCheck                   OK        Program
container_memory_telemetry  OK        Program
FAN-1F                      OK        Fan
FAN-1R                      OK        Fan
FAN-2F                      OK        Fan
FAN-2R                      OK        Fan
FAN-3F                      OK        Fan
FAN-3R                      OK        Fan
PSU-1 FAN-1                 OK        Fan
PSU-2 FAN-1                 OK        Fan
PSU 1                       OK        PSU
PSU 2                       OK        PSU

System services and devices ignore list

Name             Status    Type
---------------  --------  ------
asic             Ignored   Device
psu.temperature  Ignored   Device

///////// skip container_checker, DIAG_LED = GREEN /////////

root@sonic:/sys/bus/i2c/devices# vi /usr/share/sonic/device/x86_64-accton_as4630_54te-r0/system_health_monitoring_config.json 
root@sonic:/sys/bus/i2c/devices# 
root@sonic:/sys/bus/i2c/devices# 
root@sonic:/sys/bus/i2c/devices# show system-health detail
System status summary

  System status LED  STATUS_LED_COLOR_GREEN
  Services:
    Status: OK
  Hardware:
    Status: OK

System services and devices monitor list

Name                        Status    Type
--------------------------  --------  ----------
sonic                       OK        System
rsyslog                     OK        Process
root-overlay                OK        Filesystem
var-log                     OK        Filesystem
routeCheck                  OK        Program
diskCheck                   OK        Program
container_memory_telemetry  OK        Program
FAN-1F                      OK        Fan
FAN-1R                      OK        Fan
FAN-2F                      OK        Fan
FAN-2R                      OK        Fan
FAN-3F                      OK        Fan
FAN-3R                      OK        Fan
PSU-1 FAN-1                 OK        Fan
PSU-2 FAN-1                 OK        Fan
PSU 1                       OK        PSU
PSU 2                       OK        PSU

System services and devices ignore list

Name               Status    Type
-----------------  --------  -------
container_checker  Ignored   Service
psu.temperature    Ignored   Device
asic               Ignored   Device
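
The fan names in the monitor lists above (FAN-1F through FAN-3R) suggest three drawers, each holding a front and a rear fan. A minimal sketch of that FanDrawer-Fan hierarchy follows, using stand-in dataclasses rather than the real `sonic_platform_base` classes. The drawer names and the default drawer count are assumptions:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fan:
    """Stand-in for a platform Fan object (not the real SONiC class)."""
    name: str


@dataclass
class FanDrawer:
    """Stand-in for a platform FanDrawer that owns its fans."""
    name: str
    fans: List[Fan] = field(default_factory=list)


def build_fan_drawers(num_drawers: int = 3) -> List[FanDrawer]:
    """Create drawers that each hold one front (F) and one rear (R) fan."""
    drawers = []
    for i in range(1, num_drawers + 1):
        fans = [Fan(f"FAN-{i}F"), Fan(f"FAN-{i}R")]
        # "FanTray{i}" is a hypothetical drawer name.
        drawers.append(FanDrawer(f"FanTray{i}", fans))
    return drawers


if __name__ == "__main__":
    for drawer in build_fan_drawers():
        print(drawer.name, [fan.name for fan in drawer.fans])
```

Grouping fans under drawers lets a checker walk drawers first and then the fans inside each one, instead of reading one flat fan list.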

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106

Description for the changelog

[Accton] AS4630-54TE support SystemHealthMonitor

A picture of a cute animal (not mandatory but encouraged)

1. Implement FanDrawer-Fan hierarchy.
2. Enable thermalctld, disable pcied.
3. Implement SystemLED in Chassis.
4. Correct Fan direction.
5. Implement required Fan APIs for SystemHealthMonitoring.
6. Handle non-ASCII characters while reading PSU model/serial number.

Signed-off-by: Sean Wu <[email protected]>
seanwu-ec requested a review from jleveque as a code owner July 15, 2021 02:00
Collaborator

lguohan left a comment

:shipit:

lguohan merged commit fed8957 into sonic-net:master Jul 24, 2021
seanwu-ec deleted the as4630_54te_support_system_health branch July 26, 2021 01:03
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021