Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Produce AST prior to render #576

Merged
merged 6 commits into from
Mar 31, 2018

Conversation

aidantwoods
Copy link
Collaborator

@aidantwoods aidantwoods commented Mar 21, 2018

An experimental go at refactoring the main to produce a complete AST prior to converting it into a HTML string. This could allow us/extensions to traverse the AST and perform structural transformations to parsing results.
Highly experimental, definitely not finished.


Closes #338
Closes #413

This also splits 'text' into 'text', 'elements', and
'element' to hopefully better communicate structure
@aidantwoods
Copy link
Collaborator Author

aidantwoods commented Mar 21, 2018

So far this aims to allow the entire AST to be generated prior to its rendering by separating handler delegation from the structure – in the integrated AST-handler model, the AST is generated in layers and then rendered before generating the next later (i.e. generate, render, recurse via handler, ...).

Previously handlers would be required to return HTML, and would be inserted directly within an element. In order to successfully decouple this process and obtain a meaningful AST end result, handlers therefore need to be able to return AST data.
Previously the handler was also used to generate the HTML for nested elements, since this aims to decouple the handler from the AST structure there are now two new keys in addition to text within an element: elements, and element. The other keys allow for nested structures. The introduction of these special keys for the AST structure to eliminate the need for use of elements as a handler during the rendering process. Now only text should be put in text and it no longer serves the dual purpose of "argument to handler".

The handler is now an array that contains information only relevant to handler delegation, in which the following keys should all be specified: function, argument, destination. function is the function which should be called during the handling process, argument is the argument that should be given to the function, destination is the key to which the result of this call should be written to within the element.

IMO this also majorly disambiguates the handling process, in particular the use of the text key to contain: text, elements, an element, and lines in an li, ... are a little confusing to say the least :p

This has caused confusion in the past: e.g. #507 (comment)

I just tried your code and it works[...] I'm also trying to understand how you did it but I'm having a hard time[...]

Also see #523

It's definitely worth considering the implications of all these changes before deciding which version this might belong to should it ever be merged. This is a bit of an experiment, so there is potentially room for improvement with regard to breaking changes.

@aidantwoods
Copy link
Collaborator Author

aidantwoods commented Mar 21, 2018

Also note that (since the handler key is now mapping to a different type), we could in theory implement a compatibility layer using rawHtml (probably without opting into the safe-mode exception) by writing something like:

    protected function handle(array $Element)
    {
        if (isset($Element['handler']))
        {
            if (!isset($Element['nonNestables']))
            {
                $Element['nonNestables'] = array();
            }

            if (is_string($Element['handler']))
            {
                $function = $Element['handler'];
                $argument = $Element['text'];
                $destination = 'rawHtml';
            }
            else
            {
                $function = $Element['handler']['function'];
                $argument = $Element['handler']['argument'];
                $destination = $Element['handler']['destination'];
            }

            $Element[$destination] = $this->{$function}($argument, $Element['nonNestables']);
        }

        unset($Element['handler']);

        return $Element;
    }

Just to give an example of doing this with minimal BC breaks if desired.

@Sommerregen
Copy link

@aidantwoods Since you switch to an AST structure, is there any chance to register node visitors in order to extend functions or add new functionalities? I think something in the direction of #333 .

@aidantwoods
Copy link
Collaborator Author

@Sommerregen would you mind opening a separate issue to track this? I'll try take a look at this when I can, though probably separately to this PR :)

@aidantwoods
Copy link
Collaborator Author

cc: @PhrozenByte :)
Any thoughts on this – as far as I can see BC breakage should be very minimal (only breakage I can foresee would be in extensions that utilise custom (their own) handlers and expect safe mode to be enabled by their users – the old way of doing handlers that return HTML now goes via conditional escaping in rawHtml as a handler destination. There is a pretty easy routes to fix in this case anyway without using the AST (though of course that is preferable).

@PhrozenByte
Copy link
Contributor

I'll definitly look into this, unfortunately my time is very limited right now. I hope I'll find some tme in the next couple of days 😃

@aidantwoods
Copy link
Collaborator Author

No worries, just for when you get the chance :)

@aidantwoods aidantwoods added this to the 1.8.0 milestone Mar 28, 2018
@aidantwoods
Copy link
Collaborator Author

Even though it is now possible to generate the AST prior to rendering – element will still delegate work to handlers if they are present. This means it is not required to generate the AST in entirety prior to rendering if we don't need to.
If we do wish to see the entire AST then we only need call one of the added recursive methods on the AST to ensure that all handlers are called. The point here was to separate the two processes, we won't always need it though – in these cases we can (and probably should) delay traversal until render.

@aidantwoods
Copy link
Collaborator Author

I think I'd consider this pretty much done now, bar any improvements @PhrozenByte may suggest :)

@aidantwoods aidantwoods merged commit ce073c9 into erusev:master Mar 31, 2018
@aidantwoods aidantwoods deleted the enhancement/moar-ast branch March 31, 2018 22:11
@hamedgasemi200
Copy link

I hope you finish this soon, I check this page every day !

@sbrl
Copy link

sbrl commented Jan 3, 2020

Is there a hook I can register against when subclassing Parsedown to grab a hold of that AST? It would be great for things like generating tables of contents (ref: sbrl/Pepperminty-Wiki#155)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remove markdown Run a completion block after text() has processed lines()
5 participants