Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SONiC NPU MDIO access support and gbsyncd docker enhancement HLD #1045

Merged
merged 8 commits into from
Oct 8, 2022

Conversation

jiahua-wang
Copy link
Contributor

@jiahua-wang jiahua-wang commented Aug 3, 2022

SONiC NPU MDIO access support and gbsyncd docker enhancement HLD

Repo PR title State
SAI Add switch api for clause 22 mdio access
sonic-buildimage Add sai_mdio_access_clause22=1 in td3x2-a720dt-48s-flex.config.bcm
sonic-sairedis Add support of mdio IPC server class using sai switch api and unix socket
sonic-sairedis Add mdio IPC client library
sonic-sairedis another code PR of VendorPai class TBD

@jiahua-wang jiahua-wang changed the title Initial version of Sonic gbsyncd/PAI mdio access HLD SONiC NPU MDIO access support and gbsyncd docker enhancement HLD Aug 3, 2022

There are 2 mdio access modes: clause 45 and clause 22. Some external PHY uses clause 45 mdio while other external PHY uses clause 22 mdio. The switch NPU sai switch api should distinguish the 2 modes.

When a configured platform target is built, there is only one syncd docker as the SAI library for the configured platform will cover all the device with switch NPU in the same configured platform class. Although the gbsyncd docker is very much similar to the syncd docker, there could be many gbsyncd docker required as there are different PAI library for different PHY and MDIO access method. Our goal is to build a single gbsyncd docker to cover all PHY and MDIO access method.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In switch hardware arch of sonic, one realistic assumption is that multiple external PHYs connecting to one ASIC/NPU, are same chip type. This simplifies software design. Like one syncd docker having one SAI library, one gbsyncd docker includes one PAI library for multiple PHYs.

Copy link
Contributor Author

@jiahua-wang jiahua-wang Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary true. Two counterexamples come to mind: Dell 3248pxe and Accton as4630-54NPE.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the PHYs from one Vendor? If yes, can be driven by one PAI libary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dell 3248pxe has PHYs from 2 different vendors.


### Architecture Design

There are many choices for the IPC mechanism between the syncd daemon and gbsyncd daemon. One performance requirement is that it should finsh firmware download within a reasonable time. Our design choice is to use the Unix socket as the IPC mechanism. Our design has the MDIO IPC server in the syncd daemon with its own thread. A new syncd class MdioIpcServer is added to start a new thread, to create an unix socket, to listen on the socket, to accept connection and to read/reply IPC messages.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we consider using connectionless unix socket for avoiding open/accept/close connection?

Copy link
Contributor Author

@jiahua-wang jiahua-wang Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the design goals is to easily accommodate multiple clients at the same time. The unix socket type used in the design helps to achieve this goal.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unix socket with SOCK_DGRAM could do same with epoll/select. Clients are able to be identified by client addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be done by either socket types. It is easier to track clients using connection.


There is a corresponding MDIO access IPC client code in the form of dynamic link library which provides the flexiblity to load the library at runtime. Assuming the MDIO access library for sysfs is also in the form of dynamic library, gbsyncd can select the MDIO access library at runtime based on some configuration in the gearbox\_config.json file.

The same gearbox\_config.json file already has the information of the PAI library name. The information can be used to dynamically load the PAI library at runtime.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to load MDIO client library and PAI library at runtime looks over-designed. The current design of gbsyncd having a dynamical linked PHY library is simple and straightforward. It also eases the gbsyncd docker build.

Copy link
Contributor Author

@jiahua-wang jiahua-wang Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our design is to have a single gbsyncd container. When a new PAI/external PHY type is added, only the new library is added to the gbsyncd container, but not to create another gbsyncd container.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each PAI/PHY type may also need specific psai.profile for initialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design does not aggregate PHYs. Without extra software components/changes, each PHY is still treated as individual PHY.

Copy link

@jimmyzhai jimmyzhai Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a new PAI/external PHY type is added, only the new library is added to the gbsyncd container,

The new library may depend on 1/ env setup 2/ init PHY config file 3/ other lib/sdk, etc, as to work and manage new PHY. Would you describe the details of how to handle them, while loading the library with dlopen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a new platform with external PHY is added to the SONiC image. The sonic device data directories will have the new init PHY config file, gearbox config file and PHY config file in the new "platform" and "hwsku" directories. The new PAI/PHY driver libraries need to be added to the 'gbsyncd' docker when the SONiC image is built. dlopen()/dlsym() do not require LD_LIBRARY_PATH or ldconfig setup. As long as the new library has a distinguish path/name, it can be open and loaded at runtime.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see dlopen()/dlsym() do not require LD_LIBRARY_PATH. My point is the library might need its runtime context setup in gbsyncd docker. How to dynamically make it ready?

  • In docker init, some shell scripts could be run to help PAI
  • How to let each library know the value of SAI_INIT_CONFIG_FILE
  • If the library depends on other libs, how to load them automatically in dlopen? Looks they must be under standard lib directory or LD_LIBRARY_PATH?
  • The library name, as well as the related script, needs distinguished

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that the syncd needs a profile name passed by -p option, e.g.
command=/usr/bin/syncd -p /etc/sai.d/pai.profile -x /usr/share/sonic/hwsku/context_config.json -g 1
in the supervisord.conf. The profile has the environment SAI_INIT_CONFIG_FILE which is PHY specific. There are many different ways to solve the issue. Because the changed VendorSai class will parse the gearbox_config.json file and the key "sai_init_config_file": exists in the file, the VendorSai class will override the value of SAI_INIT_CONFIG_FILE from the profile.
The PAI library is packaged as a debian file and the gbsyncd docker build uses the debian package, the PAI library debian package has the dependence built in at package build time. When it is installed in the gbsyncd docker, all dependence package will be installed too.
The PAI library name is in the gearbox_config.json file with the key "lib_name": . We can put any related script under /usr/share/sonic/hwsku or platform module which are platform specific.

![syncd gbsyncd MDIO IPC](images/MDIO-IPC.png)

The VendorSai class constructor is overloaded to handle the dynamic linking of the libraries. The syncd class VendorSai currently only has a constructor with no argument. The linked SAI library path is determined at syncd build time. New constructors with arguments for the VendorSai class is introduced as the overloading constructors. The new constructors can either have the SAI/PAI library path as the argument or have PHY id as one of the arguments. When the constructor with PHY id as the argument is used, the syncd can index into the "phys" section of gearbox configuration file "gearbox\_config.json" to get the PAI library path and MDIO access library path at run time.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, overloading VendorSai should not be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is the same as above.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it should not be called Dll, and just by simple VendorPai

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed class name to VendorPai

...
"phy_access": "mdio",
+ "phy_access_lib_name": "/usr/lib/libgbsyncdaccess.so",
+ "mdio_cl22_only": true,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a boolean key, I prefer to explicitly specify mdio access method, like

phy_access_sw ::= "ipc_mdio_cl22" | "ipc_mdio_cl45"

Copy link
Contributor Author

@jiahua-wang jiahua-wang Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because MDIO clause 22 could access clause 45 register but not vice versa. The MDIO clause 22 register is often referred as clause 22 access only register.

- The VendoSai class overrides the mdio clause 45 read/write and mdio clause 22 read/write virtual functions.
- New keys are added to the gearbox configuration file "gearbox\_config.json".

When the VendorSai class constructor with PHY id argument is used in syncd.cpp, we can assume a simple case where syncd context number is the same as the PHY id.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now one syncd process associates with one context entry in context_config.json. Will it launch multiple syncd processes in new design?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design does not aggregate PHYs. It also does not block other software components/changes to manage multiple PHYs in a single context.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume a simple case where syncd context number is the same as the PHY id

Still do not understand the meaning of the assumption. There is only one context for gbsyncd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original gearbox design has one context for each PHY. That is why there is ongoing effort to aggregate multiple PHY in a single context. Our design is based on what is currently in the community sonic. There is different way to aggregate PHYs into a single context. It is outside the scope of this design document.

...
}

For the case that multiple PHY ids aggregate in a single syncd context, it is a new separate topic. We don't discuss the topic in this HLD.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we are in the case of "multiple PHY ids aggregate in a single syncd context" by default. Need consider it for backwards compatibility. Even different cases could coexist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design does not block other software components/changes to manage multiple PHYs in a single context.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If commandLineOptions->m_globalContext != 0, will it only create one vendorSai? In the design, looks a vendorSai will dynamically load a PAI library for one PHY. How to load multiple PAI libs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design does not cover the PHY aggregation issue. When there is no PHY aggregation, each PHY will have its own context and each PHY is controlled by one gbsyncd process. Each gbsyncd process will have a different context.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently Broncos/Broadcom PAI for PikeZ is working with PHY aggregation, i.e. one gbsyncd process with one context manages multiple PHYs. This design should cover it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refer to PR sonic-net/sonic-buildimage#11362, file device/arista/x86_64-arista_720dt_48s/Arista-720DT-48S/context_config.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only see that context of a json file but I don't see the code processing the context. In order for us to adapt our design for the phy aggregation, we need to know how the context_config.json is processed and used.
We only can design based on what we know. Our original design has listed context limitation in the HLD. We can have a follow up HLD to overcome the limitation in this HLD when we see the design of the PHY aggregation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure about idea passing globalContext to VendorSai since global context is concept of syncd and sairedis not VendorSai

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it would be good idea do ass PaiSai.cpp that will inherit VendorSai.cpp, and use selection in syncd_main.cpp where vendoisai would load dynamic SAI apis' and PaiSai would override those and prepare arguments in create switch function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may add another command line options to specify PHY id. I will double check if global context is needed in VendorSai. Our existing code use VendorPaiDll class which inherits from VendorSai class. We found that we need to override most member functions but with a small change in each member function. We come to the conclusion that change VendorSai itself makes more sense.

...
m_access_lib_handle = dlopen (gbsyncd_get_phy_access_lib_name().c_str(), RTLD_LAZY);
m_mdio_cl22_only = gbsyncd_get_mdio_cl22_only();
...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One MDIO access client lib looks OK, since only having a MDIO IPC server. Is it necessary to dynamically load it?

Copy link
Contributor Author

@jiahua-wang jiahua-wang Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Broncos/Broadcom PAI as an example, the MDIO IPC client is part of the PAI. We think the MDIO IPC client and PAI should be separate libraries. With a separate PAI library which does not include the MDIO access method, we need another mechanism to specify what kind of MDIO access method to use, e.g. IPC vs. SYSFS. dlopen()/dlsym() is one of the ways to provide the mechanism.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like MDIO IPC server in syncd, MDIO IPC client could be a part of gbsyncd. This is common and mdio read/write is provided to PAI with switch attr SAI_SWITCH_ATTR_REGISTER_READ/SAI_SWITCH_ATTR_REGISTER_WRITE.

PAI itself knows MIDIO access method. If it is SYSFS, it can implement its mdio read/write with SYSFS and ignore attr SAI_SWITCH_ATTR_REGISTER_READ/SAI_SWITCH_ATTR_REGISTER_WRITE.

Further more, we could create a common IPC channel between syncd and gbsyncd. MDIO access is a service on top of it. In the future new service could be implemented on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MDIO SYSFS interface may depend on FPGA/CPLD based MDIO kernel driver implementation. There will be many different hardware implementation of the FPGA/CPLD hence the SYSFS interface could differ from vendor to vendor. The easiest way to add new MDIO access function is to add the function through library.
The current common IPC mechanism in SONiC is RedisDB. We have tried to use RedisDB, the performance is very bad. When any common IPC channel is used, it puts a limit on using a specific thread to process MDIO only, which affects the performance of firmward download.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MDIO SYSFS access is one part of PAI lib, which is vendor specific.

For common IPC mechanism, I means your IPC client/server for MDIO access.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IPC channel is dedicated to MDIO access. If the channel is shared with other IPC tasks, we could see performance issue.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where gbsyncd_get_mdio_cl22_only is implemented?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where gbsyncd_get_pai_lib_name implemented? name of the lib should be passed to VendorSai constructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will change to pass above values to VendorSai constructor.

Comment on lines +179 to +184
if (objectType == SAI_OBJECT_TYPE_SWITCH)
{
std::vector<sai_attribute_t> attr_copy;
sai_attribute_t attr;

for (auto i = 0; i < attr_count; i++)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure it this should be handled here, or in syncd itself, since vendor sai should be just thin layer on SAI apis, and not interfering with attributes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will change to handle in syncd itself.

attr.value.ptr = (void *) m_mdio_read;
attr_copy.push_back(attr);
}
else if (attr_list[i].id == SAI_SWITCH_ATTR_REGISTER_WRITE)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should here be also && m_context !=0 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually SAI_SWITCH_ATTR_REGISTER_WRITE and SAI_SWITCH_ATTR_REGISTER_READ attributes are only passed by orchagent for PAI but not for SAI. As my answer to another comment, we are going to move this piece of code from VendorSai to syncd itself.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe there should be ParSay inheriting VendorSai to keep this separate from Syncd.cpp class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will create VendorPaiDll inheriting VendorSai. We still need a small change inside VendorSai so that VendorPaiDll does not have to override most the member function.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

int syncd_main(int argc, char **argv)
{
...
if (commandLineOptions->m_globalContext != 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the condition commandLineOptions->m_globalContext != 0, suggest explicitly define a new command line option such as m_useVendorPaiDll. If it is true, it will create VendorPaiDll object.

This would be a backward compatible behavior. It does not impact the current case of one gbsyncd process managing multiple same-type PHYs, where it is also a truth of m_globalContext != 0.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, this was one of the proposal on the meeting, and i think also it is good approach

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also that json configuration should be passed to that object in the constructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add commandLineOptions->m_paiInstance. It severs 2 purposes, to differentiate VendorPai from VendorSai and to index into the "phys" section array of "gearbox_config.json". The json configuration name is already in commandLineOptions->m_contextConfig which will be passed the VendorPai constructor. The constructor can get all information by parsing the json configuration.

@jimmyzhai jimmyzhai merged commit deb3188 into sonic-net:master Oct 8, 2022
@zhangyanzhao
Copy link
Collaborator

We still have another code PR of VendorPai class which is part of the [https://pull/1045] HLD. We will try to make it as part of 202311 release.

Thanks,

Jiahua

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants