Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parsing header files succeeds when it should not, reports mysterious error when it fails #7001

Open
TodAmon opened this issue Oct 3, 2024 · 1 comment
Assignees
Labels
Feature: C-Parser Status: Internal This is being tracked internally by the Ghidra team

Comments

@TodAmon
Copy link

TodAmon commented Oct 3, 2024

Describe the bug

When parsing header files, the parser ignores #error when it should not. When it correctly does fail due to an #error it fails to identify the cause of the problem.

To Reproduce

There are four very simple header files in the attached tar file. To reproduce, simply process the header files using two different orders:
bbb.h aaa.h ccc.h and bbb.h aaa.h ccc-fixed.h. You can use the Ghidra UI, or you can achieve the same results using CParserUtils.parseHeaderFiles

Expected Behavior

Ghidra improperly reports success when the parse order is bbb.h aaa.h ccc.h although it does have a message with a useful warning (but it should have failed). Alternatively, Ghidra properly reports failure if the parse order is bbb.h aaa.h ccc-fixed.h but then it provides meaningless information about the actual error.

The only difference between ccc.h and ccc-fixed.h is the inclusion of a typedef before the #error Why should this affect the success or failure of parsing? The files were designed such that the only valid parse order is aaa.h bbb.h ccc.h

Screenshots

parse-succeeds-but-should-fail
parse-fails-with-wrong-message

Attachments

ghidrabug.tar.gz
a gzipped tar file containing four files

Environment

Ghidra version 11.2 using headless analyzer and a custom script,
Ghidra version 11.03 using UI

@ryanmkurtz ryanmkurtz added Feature: C-Parser Status: Triage Information is being gathered labels Oct 4, 2024
@emteere
Copy link
Contributor

emteere commented Oct 16, 2024

There are two phases to the parsing just as in normal header parsing.
The first phase is the Pre-processor that handles macro expansion. That would catch the error and fail.
The second phase the CParser will parse the output of the pre-processor.

My suspicion is that in both cases the second phase parse is getting to run when the first phase encountered an error.

  • In the first case the output has stopped at the error, but is still parseable C, so the CParser can parse it. The error isn't stopping the second phase.
  • In the second case where it fails and provides an error, it has allowed the second phase to occur but the output fed to the second phase is only pre-processor parsed up to the error, and thus the CParser also fails with a parse error. The CParser shouldn't get kicked off if the PreProcessor fails.

It might have been as designed originally to allow processing so one could extract as much type information as possible. This probably isn't the best behavior, or at least should be an option to push through any error conditions that is off by default. I've seen some #error conditions that can be ignored in special cases, I think when parsing AVR8 header files, but this probably isn't the norm.

I'll take a look and address the issue.

@emteere emteere added Status: Internal This is being tracked internally by the Ghidra team and removed Status: Triage Information is being gathered labels Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature: C-Parser Status: Internal This is being tracked internally by the Ghidra team
Projects
None yet
Development

No branches or pull requests

3 participants