Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First cut #1

Closed
egberts opened this issue Nov 20, 2019 · 15 comments
Closed

First cut #1

egberts opened this issue Nov 20, 2019 · 15 comments

Comments

@egberts
Copy link
Owner

egberts commented Nov 20, 2019

@ptmcg

Here's the first cut, complete with the most comprehensive test cases I've got (because I absolutely needed it to ensure nothing got broken during development).

I use the JetBrain PyCharm IDE so the (.idea hidden dotfile should be there).

Enjoy.

Steve

@egberts
Copy link
Owner Author

egberts commented Nov 22, 2019

Turns out that bind9_parser was not packagable. Just made it packagble and provided a try-me.sh script in example subdirectory.

Was in the JetBrain PyCharm IDE for so long (doing unit tests from within there) that I had forgotten about the packagability of this.

Now in latest master branch.

@ptmcg
Copy link
Collaborator

ptmcg commented Nov 23, 2019

I recently struggled with this too, with the new restructuring of pyparsing into folders and modules. Did you test out your package using the test pypi? I still need to do that - I won't really be sure I've got all the pieces in place until I can pip install from there.

@egberts
Copy link
Owner Author

egberts commented Nov 23, 2019

Pip is next on my list. I’m learning the entire pip ecosystem

@egberts
Copy link
Owner Author

egberts commented Nov 23, 2019

Done.

~/work/python/parsing/bind9-parser$ twine upload dist/*
Uploading distributions to https://upload.pypi.org/legacy/
Enter your username: egberts
/home/steve/.local/lib/python3.7/site-packages/twine/auth.py:68: UserWarning: Failed to unlock the keyring!
  warnings.warn(str(exc))
Enter your password: 
Uploading bind9_parser-0.9.8-py2.7.egg
100%|████████████████████████████████████████| 101k/101k [00:02<00:00, 42.5kB/s]
Uploading bind9_parser-0.9.8-py3.7.egg
100%|████████████████████████████████████████| 108k/108k [00:01<00:00, 73.2kB/s]

View at:
https://pypi.org/project/bind9-parser/0.9.8/
~/work/python/parsing/bind9-parser$ 

@egberts
Copy link
Owner Author

egberts commented Nov 23, 2019

Seems like I broke something at a PyPi-level... but this pip install command works.

python3 -m pip install --pre --upgrade bind9_parser==0.9.8

EDITED: pip v18.1 was obsoleted for this reason. Upgraded to v19.3.1. It's all good.

@ptmcg
Copy link
Collaborator

ptmcg commented Nov 24, 2019

There are some online instructions at the Python docs on how to use PyPi, including how to twine upload to a test site instead of the live site.

@egberts
Copy link
Owner Author

egberts commented Nov 24, 2019

Turns out that I'm running an old Debian 'python[3]-pip' v18.1 package (on Debian Buster 9 distro). Fixed it using this-link to bring it up to at least pip v19.3.1

python3 -m pip install --upgrade --force pip && python3 -m pip install --force requests urllib3 distro
apt remove python-pip python3-pip

It works now.

@egberts
Copy link
Owner Author

egberts commented Nov 24, 2019

Didn't have to use the test.pypi.org.... https://test.pypi.org/project/bind9-parser/

Only thing left to do is Tox/coverity, et. al.

@egberts
Copy link
Owner Author

egberts commented Nov 24, 2019

Tox/coverity happens when I take this project out of private mdoe.

Major changes to all tests/test_*.py import statement to properly reflect bind9_parser.isc_*.

We can now execute those tests/test_*.py directly from PyCharm with no related project configuration settings.

I can also execute those test scripts directly from command line as well (after package install)

I cannot make it work with pytest (it would only let me work against the installed version OR local package, but not both; pytest default is installed package).

@ptmcg
Copy link
Collaborator

ptmcg commented Nov 27, 2019

Sorry not to respond earlier - I got wiped out this week with something very flu-like, despite getting a flu shot just a few weeks ago. This is a huge project of yours, and I'll try to give some feedback over the holiday weekend.

@ptmcg
Copy link
Collaborator

ptmcg commented Nov 28, 2019

Some notes:

exclamation = Char('!')

The purpose of Char is to simplify Word(some_character_set, exact=1), or oneOf(list(some_character_set)). When there is only one character in the character set, just use Literal.

parse_me

Seems to be a reinvention of runTests.

CaselessLiteral

Should probably be CaselessKeyword throughout. Also, if pyparsing matches a CaselessLiteral or CaselessKeyword, it will return not the parsed string, but the string used to define the expression:

CaselessKeyword("xyz").parseString("XyZ") -> "xyz"

so there is no need to do extra .lower() conversion calls in parse actions, you can just test against the string used to define the expression.

Parse actions

An quick and easy debugger is to use pyparsing's @traceParseAction decorator to see what is being passed to a parse action, and what is coming back.

Also, defensive coding like "if t[0]:" should be necessary only if the associated expression would match an empty list. Otherwise, don't do these kind of checks - rely on your parser to only call the parse action when there actually is a match (similar to don't call .lower() in previous note).

Example:

integer = Word(nums)

def convert_integer(t):
    if t[0].is_numeric():     #  unnecessary - integer expression will only match a numeric string
        try:
            return int(t[0])
        except ValueError:  # this try-except is also unnecessary
            raise

def convert_integer(t):
    return int(t[0])    # no other validation required, we never get here without parsing a valid int

Module formats

A number of modules include internal unit-style tests, with find_pattern and parse_me testing methods. Move all of this test code to the bottom of each module in a "if name == 'main':" section, so that the module can be imported as a library module without actually running all the tests. This will also help clarify how much in each module is actual parser content vs. testing.

It also feels like you have too many modules, but I am often accused of writing modules that are too big. But an entire module for a single isc_boolean expression seems a bit much.

Reformatting isc_boolean:

from pyparsing import CaselessKeyword, Keyword

isc_boolean = (
    CaselessKeyword('true')
    | CaselessKeyword('false')
    | CaselessKeyword('yes')
    | CaselessKeyword('no')
    | Keyword('1')
    | Keyword('0')
).addParseAction(lambda t: t[0] in ['true', 'yes', '1'])

if __name__ == "__main__":

    isc_boolean.runTests("""
        TRUE
        True
        true
        yes
        Yes
        YES
        1
        FALSE
        False
        false
        no
        No
        NO
        0
        bogus
        wrong
        righto
        yeah
        nope
        nah
        12345
        """)

Anyway, the way your parser is broken up, it is very difficult to see the big picture. Perhaps an overall BNF?

@egberts
Copy link
Owner Author

egberts commented Nov 28, 2019

One topic/comment at a time.

BNF? Overall BNF? For the entirety of ISC Bind9 configuration file? I've extracted it from multiple versions of ISC Bind manuals and re-edited/finalized it from their source code; they were horribly documented (has at least 100+ errors in latest 9.15.1, and sometime do not even reflect its code).

If and when I do reconstruct the correct BNF (and often do so), I put them before each function as a comment block. Perhaps, I should complete this step to 100%.

Might be useful (to folks unlike me). But is it relevant to me at this point and stage? To me; Less so now, than before. To others, probably more so.

@egberts
Copy link
Owner Author

egberts commented Nov 28, 2019

Parser is broken up. It sure is.

Yes. And the "parser appears broken up" is a product of individual SCRUM/AGILE approach. Small pieces at a time; small BNF syntax at a time; make it work firstly and solidly. And build my way upward carefully without breakage. Run unit tests repeatedly to ensure no breakage (PyCharm does this automatically auto-unittest after major code changes).

Individual test cases solidify each parser.

It is important to note that all parsers were done BEFORE the .asDict() came into the picture. Now, I am pretty solid on .asDict() and am in "fuzz" stage (scramble the ordering of statement, split grouping of same statements, ...) and am fixing those fuzzings.

Often times the Bind9 'clause' can be included inside another clause(s); statement be inside another statement(s); syntax be inside another syntax(es). It didn't make sense (after a deep code review) to have duplication of parsers because it's neatly 'nested'.

The way things are broken up is simply as:

  • clause
    • statement
      • keyword/value

The above terms are what a typical ISC Bind9 engineer would understand.

As for trouble shooting, yes, big picture is not solid but I am quite positive that I got those clauses down 100%. It is the mixture of clauses that I am struggling at the moment:

  • options clause can only exist once,
  • masters clause can appear 0-*
  • keys clauses can appear 0-*
  • zone clauses can appear 1-*

PyParsing, as I've programmed with, can only take the same clause as a contiguous group of clause and not interleaved (or spread across or stand apart) between other clauses.

I am making effort to reframe the terms so that a PyParsing engineer can solve it and it's not perfect either.

@egberts
Copy link
Owner Author

egberts commented Nov 28, 2019

Module formats;

If I understood you correctly, you want to relocate those test_ functions from its own standalone test Python script file and into their respective library module source file? And using these test code but under the if __name__ == "__main__": code block?

If that is so ...why introduce more Python bytecodes to add into your Python library by the mere act of merging the unit test files into their respective module source file? I like my Python library lean and light.

It's bad enough when a Python script takes up to 2.5 GB of memory space.

Is that it?

(PS: isc_boolean_logic was the first module I wrote, I could fold that into the generic isc_utils.py library module.)

@egberts
Copy link
Owner Author

egberts commented Nov 28, 2019

Parse_Me

I'm making that go away... parse_me is legacy and needs to just disappear.

@egberts egberts closed this as completed Sep 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants