Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch flash_map to resolve devices at compile-time instead of using device_get_binding #33346

Closed

Conversation

de-nordic
Copy link
Collaborator

The PR contains series of commits that switch flash_map from using device_get_binding on flash_area.fa_dev_name
to using new filed flash_area.fa_dev, that is filled at compile time with a pointer to the flash device the flash_area is tied to.
This PR also contains commit that makes the filed flash_area.fa_dev_name optional, as it is no longer used within flash map API, but is left available for external code that relies on existence of this field.

Set to DNM due to ongoing tests.

@de-nordic de-nordic added area: Flash DNM This PR should not be merged (Do Not Merge) labels Mar 15, 2021
@github-actions github-actions bot added area: API Changes to public APIs area: Storage Storage subsystem labels Mar 15, 2021
@@ -21,6 +21,19 @@ config FLASH_MAP_SHELL
help
This enables shell commands to list and test flash maps.


config FLASH_MAP_DEV_NAME_IN_FLASH_AREA
Copy link
Collaborator

@nvlsianpu nvlsianpu Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for backward compatibility this option should enforce device binding behavior in the code?

#if defined(CONFIG_FLASH_MAP_DEV_NAME_IN_FLASH_AREA) && defined(FLASH_MAP_DEV_NAME_IN_FLASH_AREA)
	if (fa->fa_dev == NULL) {
		flash_dev = device_get_binding(fa->fa_dev_name);
		if (flash_dev == NULL) {
			return -ENODEV;	
		}
	}
#endif

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have such variant of the PR if desired. I have thought about the options and I now think, since that we allow custom flash_map to be developed, that user may fail to deliver device bindings so the option to use the new compile time resolution should be default for non custom maps.

Copy link
Collaborator

@nvlsianpu nvlsianpu Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you.
I was about to write comment with suggestion to make he new compile time resolution should be default.

Copy link
Collaborator

@nvlsianpu nvlsianpu Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#if defined(CONFIG_FLASH_MAP_DEV_NAME_IN_FLASH_AREA) && defined(FLASH_MAP_DEV_NAME_IN_FLASH_AREA)
	if (fa->fa_dev == NULL) {
		flash_dev = device_get_binding(fa->fa_dev_name);
		if (flash_dev == NULL) {
			return -ENODEV;	
		}
	}
#endif

@de-nordic de-nordic force-pushed the flash_map-no-more-binding branch 2 times, most recently from 3de5c86 to ecf557c Compare March 16, 2021 15:06
Copy link
Collaborator

@nvlsianpu nvlsianpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLASH_MAP_DEV_IN_FLASH_AREA=y
FLASH_MAP_USE_DEV_IN_FLASH_AREA=n

makes fa_dev field meaningless. This doesn't make sense.

@de-nordic
Copy link
Collaborator Author

FLASH_MAP_DEV_IN_FLASH_AREA=y
FLASH_MAP_USE_DEV_IN_FLASH_AREA=n

makes fa_dev field meaningless. This doesn't make sense.

That is in case when user wants to use fa_dev for something other than flash map API.

@nvlsianpu
Copy link
Collaborator

That is in case when user wants to use fa_dev for something other than flash map API.

Is there any user case? I understand that user might want either fa_dev and fa_dev_name. Worth to add possibility to use fa_dev_name if fa_dev is null (backward compatibility, and dynamic drivers).

I feel that so simple subsystem have to much option with this patch.

@de-nordic
Copy link
Collaborator Author

de-nordic commented Mar 17, 2021

Is there any user case?

I don't know. I will default the one choice: if fa_dev is enable it is used.

I understand that user might want either fa_dev and fa_dev_name. Worth to add possibility to use fa_dev_name if fa_dev is null (backward compatibility, and dynamic drivers).

I am not sure I want to add such option: the selection should go either way, if use the fa_dev, then use fa_dev if not, then use fa_dev_name. We will end up in situations where user gets fa_dev null due to error in generating map and fa_dev_name points to something else than expected. We can only control the default flash map. No harm unless some writes happen.

I feel that so simple subsystem have to much option with this patch.

Will reduce the _USE_ one.

@nvlsianpu
Copy link
Collaborator

@de-nordic Looks ok to me now.

Need to align a few flash_area structure usage from the codebase,

@github-actions github-actions bot added area: File System area: Samples Samples area: Settings Settings subsystem area: Tests Issues related to a particular existing or missing test labels Mar 22, 2021
@de-nordic
Copy link
Collaborator Author

Figure we can save a bit on the ifdef'ery :).

We will probably dump the fa_dev_name completely but we have to deprecate it so that users that generate customs flash maps get used to generating pointers to device objects instead of device names.

@Laczen
Copy link
Collaborator

Laczen commented Mar 25, 2021

@de-nordic, @nvlsianpu, can't we get rid of flash_map? Since all information can be received from the dts, I would expect this to be replaced by a flash_partition_write, flash_partition_read and flash_partition_erase that would additionally do a check to avoid doing a flash operation outside the area. A flash partition cfg could be defined that contains the offset and size and points to the flash cfg for erase_value, write_size and erase_block_size.

@de-nordic
Copy link
Collaborator Author

@de-nordic, @nvlsianpu, can't we get rid of flash_map? Since all information can be received from the dts, I would expect this to be replaced by a flash_partition_write, flash_partition_read and flash_partition_erase that would additionally do a check to avoid doing a flash operation outside the area. A flash partition cfg could be defined that contains the offset and size and points to the flash cfg for erase_value, write_size and erase_block_size.

Cannot yet. The msuboot and mcumgr use integer IDs for figuring which partition to operate on. Within Zephyr, internally we probably could do that.

@Laczen
Copy link
Collaborator

Laczen commented Mar 25, 2021

Cannot yet. The msuboot and mcumgr use integer IDs for figuring which partition to operate on. Within Zephyr, internally we probably could do that.

If required by mcuboot/mcumgr, these things should move into mcuboot/mcumgr code.

Copy link
Contributor

@mbolivar-nordic mbolivar-nordic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 for now because I'm not comfortable with this at first blush and want to take a longer look at this today

Comment on lines 39 to 41
The option is selected for default flash map (FLASH_MAP_CUSTOM=n); if your custom map
populates the flash_area.fa_dev, with pointers to devices that are tied to proper flash
areas, you may want to unselect this option.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this.

The option is selected

Nothing I can see selects this option if FLASH_MAP_CUSTOM=n, it's just default y in that case. That is not the same thing.

if your custom map [...] you may want to unselect this option.

I don't understand what this is trying to say, either.

First, I think using the word "unselect" in this context is unclear because Kconfig has a select keyword that you cannot undo.

Second, it seems to me that the only way to both have a custom map (CONFIG_FLASH_MAP_CUSTOM=y) and populate fa_dev pointers is if the user manually sets them up, so why would they want to make sure CONFIG_FLASH_MAP_DEV_IN_FLASH_AREA=n?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix the comments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that I have merged comments from two different options.

@mbolivar-nordic
Copy link
Contributor

mbolivar-nordic commented Mar 29, 2021

@de-nordic @nvlsianpu I'm confused about why this is being done this way. I feel like we've been carrying a bunch of legacy stuff from the mynewt operating system in Zephyr's mcuboot port that I don't understand the need for and this seems to be adding on top of it instead of fixing the underlying issue.

From what I recall, struct flash_area is used in mcuboot basically as a copy/paste from the mynewt RTOS HAL. And now when I look at what's inside the mcuboot repository today, I see that each flash backend is responsible for defining the structure according to its own needs.

Grepping a bit more, I only see fa_device_id mentioned in 2 places in bootutil itself:

Those are write-only, just putting the value into the struct boot_rsp. And back in https:/mcu-tools/mcuboot/blob/579b30c29999860f9f7d843a25ff5453b6542cce/boot/zephyr/main.c, it doesn't look like we're doing anything with that value at all.

So it seems to me we could do this in a couple of different ways that would have lower footprint.

Option 1 Stick with the mynewt HAL names, but change fa_device_id to a uintptr_t and put the device pointer in there. We'd need to rearrange a bit to minimize padding bytes, so the results would look like this:

struct flash_area {
	uintptr_t fa_device_id;
	off_t fa_off;
	size_t fa_size;
	uint8_t fa_id;
};

Then set .fa_device_id = (uintptr_t)DEVICE_DT_GET(...) in the C macro magic that initializes this at build time.

Zephyr is already using differently sized types for the fa_size and fa_off members (we use size_t and off_t, but MCUboot's PORTING.md says they are uint32_t, which isn't the same: size_t is bigger on 64 bit systems, and off_t is signed). So there's already some skew here in the Zephyr code, and we should be able to use a differently sized integer type for fa_device_id if we want to IMO.

Option 2 Take this to upstream MCUboot and change it so this bootloader doesn't force design decisions made by the mynewt RTOS onto other operating systems.

From what I can tell, that would basically mean changing the br_flash_dev_id in struct boot_rsp to be an opaque typedef instead,

struct boot_rsp {
    const struct image_header *br_hdr;
    flash_dev_id_t br_flash_dev_id;
    uint32_t br_image_off;
};

which Zephyr's flash map backend could handle like this:

typedef const struct device* flash_dev_id_t;

Then our flash_area structure can use a device pointer here too, with nicer results since there will be less casting to and from const struct device*:

struct flash_area {
	flash_dev_id_t fa_device_id;
	off_t fa_off;
	size_t fa_size;
	uint8_t fa_id;
};

Am I missing something here? Why do we need to bloat this structure even more?

@mbolivar-nordic
Copy link
Contributor

Am I missing something here? Why do we need to bloat this structure even more?

To head off a potential objection, yes, I know that what I am proposing is an API change and that Flash map is stable.

But I think we all agree that the current situation is just technical debt we inherited from another RTOS. We have a process for changing stable APIs and I think a change like this in particular is long overdue, especially now that we can avoid using device_get_binding() at all.

@nvlsianpu
Copy link
Collaborator

@mbolivar This PR objective is to use compile time device binding instead of runtime binding. Legacy fa_dev_name field is kept optional only for backward compatibility.

And yes, we should improve this code as you elaborated (I found both options usefully) and change the API.

We want to limit this PR to already small change. API change we would do in very next PR.

@mbolivar-nordic
Copy link
Contributor

We want to limit this PR to already small change. API change we would do in very next PR.

I guess I don't understand why this PR should be merged if it's adding a new Kconfig option that will be removed in the next PR. If the path forward to using compile-time struct devices is agreed upon -- it seems like you agree, would be good to know @de-nordic's thoughts too -- then why not pursue it?

@de-nordic
Copy link
Collaborator Author

de-nordic commented Apr 1, 2021

The mcuboot should not directly peek into flash_area, I have thought that we are giving it flash_map header and it blindly uses flash_area_* API calls. I think that was the original idea as it that flash_area defined multiple times in the mcuboot code, for example for other platforms.

We could probably get rid of the fa_device_id, you may post your comment (#33346 (comment)) as an issue.
I do not think that changing type of fa_device_id to let it take const struct device * casted to uintptr_t is good idea.

The ifdefery and Kconfig options are provided because we let users to link custom flash maps, which means that we have let them adapt what they generate and actually give them option whether they want to adapt at all (at least for now).
The objective of the PR has been to get rid of device_get_binding usage within Zephyr, internally yet allow alternative.

I guess I don't understand why this PR should be merged if it's adding a new Kconfig option that will be removed in the next PR.

I do not think that the Kconfig will be removed in next PR. The fa_device_id might be, Addition of data placeholders, like this PR does are quite safe, removals not always.
Yes we consider changes to flash_map API but we do not have consensus yet whether this will be change to existing API or alternative API that will phase out the original within Zephyr so that we could later on deprecate the original one.

@mbolivar-nordic
Copy link
Contributor

The mcuboot should not directly peek into flash_area, I have thought that we are giving it flash_map header and it blindly uses flash_area_* API calls.

No, that's not correct. It needs the structure definition provided by a platform port. This is covered in https:/mcu-tools/mcuboot/blob/master/docs/PORTING.md. You can see in flash_map.h for Zephyr that fa_device_id is provided directly for MCUboot compatibility; we don't use it anywhere in the tree except to print it out in the shell.

It's not just documentation either. The mcuboot code looks directly into the flash_area structure in several places in bootutil. By grep, it directly looks at fa_id, fa_size, fa_device_id, fa_off -- all of the members it mentions in struct flash_area in PORTING.md.

We could probably get rid of the fa_device_id

I wish!

But since mcuboot is getting this definition from here, we can't do that without breaking mcuboot for the reason explained above.

Try it and see; if you comment it out in Zephyr and build mcuboot, you will get this:

/home/mbolivar/zp/bootloader/mcuboot/boot/bootutil/src/loader.c:1947:58: error: 'const struct flash_area' has no member named 'fa_device_id'
 1947 |     rsp->br_flash_dev_id = BOOT_IMG_AREA(state, BOOT_PRIMARY_SLOT)->fa_device_id;

@de-nordic
Copy link
Collaborator Author

de-nordic commented Apr 2, 2021

No, that's not correct. It needs the structure definition provided by a platform port. This is covered in https:/mcu-tools/mcuboot/blob/master/docs/PORTING.md. You can see in flash_map.h for Zephyr that fa_device_id is provided directly for MCUboot compatibility; we don't use it anywhere in the tree except to print it out in the shell.

Yes, it also states that you may need to implement counters to protect flash area from closing it when it has been opened more than one time.
So it is bad design. The object of flash area should be only accessed an manipulated via API or macros that hide the internals of the structure. if you want something from it it should be taken by the API and the flash area structure should be implemented.

We could probably get rid of the fa_device_id

I wish!

But since mcuboot is getting this definition from here, we can't do that without breaking mcuboot for the reason explained above.

Then we should change mcuboot code to access the structure via API/macros only and make it take our flash area definition when compiled with Zephry

The mcuboot should not directly peek into flash_area, I have thought that we are giving it flash_map header and it blindly uses flash_area_* API calls.

No, that's not correct. It needs the structure definition provided by a platform port. This is covered in https:/mcu-tools/mcuboot/blob/master/docs/PORTING.md. You can see in flash_map.h for Zephyr that fa_device_id is provided directly for MCUboot compatibility; we don't use it anywhere in the tree except to print it out in the shell.

That was my wishful thinking.

It's not just documentation either. The mcuboot code looks directly into the flash_area structure in several places in bootutil. By grep, it directly looks at fa_id, fa_size, fa_device_id, fa_off -- all of the members it mentions in struct flash_area in PORTING.md.

We could probably get rid of the fa_device_id

I wish!

But since mcuboot is getting this definition from here, we can't do that without breaking mcuboot for the reason explained above.

So again that is not a job for simple PR that does switch from one API to other. There is more rework needed for both the mcuboot and the flash map API.
I want to improve the current solution til we figure out what to do next and how will we approach the problem.

@de-nordic de-nordic marked this pull request as draft April 2, 2021 07:41
de-nordic and others added 6 commits April 2, 2021 09:25
The commit adds Kconfig option FLASH_MAP_DEV_IN_FLASH_AREA that
enables addition of const struct device * type fa_dev field to
the flash_area structure.
Enabling this option will also populate fa_dev of each entry of
the default flash map with a pointer to a device object that flash
area described by the entry is tied to.
The option switches flash map API to use the flash_area.fa_dev
instead of obtaining device object by name, flash_area.fa_dev_name,
by default.

Signed-off-by: Dominik Ermel <[email protected]>
The flash subsystem API is switching from using fa_dev_name to fa_dev,
of flash_area structure, making the fa_dev_name optional filed,
controlled by Kconfig option CONFIG_FLASH_MAP_DEV_NAME_IN_FLASH_AREA.
Using fa_dev_name for logging makes it required field, so this commit
removed the usage from log to remove dependency on the Kconfig option.

Signed-off-by: Dominik Ermel <[email protected]>
The flash subsystem API is switching from using fa_dev_name to fa_dev,
of flash_area structure, making the fa_dev_name optional filed,
controlled by Kconfig option CONFIG_FLASH_MAP_DEV_NAME_IN_FLASH_AREA.
Using fa_dev_name for logging makes it required field, so this commit
removed the usage from log to remove dependency on the Kconfig option.

Signed-off-by: Dominik Ermel <[email protected]>
The flash subsystem API is switching from using fa_dev_name to fa_dev,
of flash_area structure, making the fa_dev_name optional filed,
controlled by Kconfig option CONFIG_FLASH_MAP_DEV_NAME_IN_FLASH_AREA.
Using fa_dev_name for device binding makes it required field,
so this commit introduce binding using flash_area_get_device() which
removes dependency on the Kconfig option.

Signed-off-by: Andrzej Puzdrowski <[email protected]>
The flash subsystem API is switching from using fa_dev_name to fa_dev,
of flash_area structure, making the fa_dev_name optional filed,
controlled by Kconfig option CONFIG_FLASH_MAP_DEV_NAME_IN_FLASH_AREA.
Using fa_dev_name for device binding makes it required field,
so this commit introduce binding using flash_area_get_device() which
removes dependency on the Kconfig option.

Signed-off-by: Andrzej Puzdrowski <[email protected]>
With switch to compile time resolved device object and usage of
flash_area.fa_dev, the filed flash_area.fa_dev_name becomes optional
as there are no more device_get_binding calls within flash_map
that rely on that filed, or any other references to that field.

The added Kconfig option, FLASH_MAP_DEV_NAME_IN_FLASH_AREA, allows
to remove the flash_area.fa_dev_name field, when disabled.
The option is set to y by default to support external code that may
relay on the existence of flash_area.fa_dev_name.

Configurations for samples and tests that require the fa_dev_name
have been altered to have proper Kconfig options enabled.

Signed-off-by: Dominik Ermel <[email protected]>
@de-nordic de-nordic marked this pull request as ready for review April 2, 2021 11:40
@mbolivar-nordic
Copy link
Contributor

I want to improve the current solution til we figure out what to do next and how will we approach the problem.

I think this is not a good idea for a stable API. I made a pretty detailed proposal above and it seemed like we agreed on at least one of the options; can we figure out the right thing to do instead of just merging this?

We seem to all agree that the inclusion of struct flash_area in the zephyr flash_map.h API is "leaking" mynewt implementation details used by a single application (mcuboot) into something that is supposed to be a generic API. I understand the goal of resolving the devices at build time, but I disagree with the proposal to do that by further mixing zephyr and mynewt by extending this structure in this way.

@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Jul 14, 2021
@github-actions github-actions bot closed this Jul 29, 2021
@de-nordic de-nordic deleted the flash_map-no-more-binding branch December 20, 2021 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: File System area: Flash area: Samples Samples area: Settings Settings subsystem area: Storage Storage subsystem area: Tests Issues related to a particular existing or missing test DNM This PR should not be merged (Do Not Merge) Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants