Documentation

Lean.PrettyPrinter.Parenthesizer

The parenthesizer inserts parentheses into a Syntax object where syntactically necessary, usually as an intermediary step between the delaborator and the formatter. While the delaborator outputs structurally well-formed syntax trees that can be re-elaborated without post-processing, this tree structure is lost in the formatter and thus needs to be preserved by proper insertion of parentheses.

The abstract problem & solution #

The Lean 4 grammar is unstructured and extensible with arbitrary new parsers, so in general it is undecidable whether parentheses are necessary or even allowed at any point in the syntax tree. Parentheses for different categories, e.g. terms and levels, might not even have the same structure. In this module, we focus on the correct parenthesization of parsers defined via Lean.Parser.prattParser, which includes both aforementioned built-in categories. Custom parenthesizers can be added for new node kinds, but the data collected in the implementation below might not be appropriate for other parenthesization strategies.

Usages of a parser defined via prattParser in general have the form p prec, where prec is the minimum precedence or binding power. Recall that a Pratt parser greedily runs a leading parser with precedence at least prec (otherwise it fails) followed by zero or more trailing parsers with precedence at least prec; the precedence of a parser is encoded in the call to leadingNode/trailingNode, respectively. Thus we should parenthesize a syntax node stx supposedly produced by p prec if

  1. the leading/any trailing parser involved in stx has precedence < prec (because without parentheses, p prec would not produce all of stx), or
  2. the trailing parser parsing the input to the right of stx, if any, has precedence >= prec (because without parentheses, p prec would have parsed it as well and made it a part of stx). We also check that the two parsers are from the same syntax category.

Note that in case 2, it is also sufficient to parenthesize a parent node as long as the offending parser is still to the right of that node. For example, imagine the tree structure of (f fun x => x) y without parentheses. We need to insert some parentheses between x and y since the lambda body is parsed with precedence 0, while the identifier parser for y has precedence maxPrec. But we need to parenthesize the $ node anyway since the precedence of its RHS (0) again is smaller than that of y. So it's better to only parenthesize the outer node than ending up with (f $ (fun x => x)) y.

Implementation #

We transform the syntax tree and collect the necessary precedence information for that in a single traversal. The traversal is right-to-left to cover case 2. More specifically, for every Pratt parser call, we store as monadic state the precedence of the left-most trailing parser and the minimum precedence of any parser (contPrec/minPrec) in this call, if any, and the precedence of the nested trailing Pratt parser call (trailPrec), if any. If stP is the state resulting from the traversal of a Pratt parser call p prec, and st is the state of the surrounding call, we parenthesize if prec > stP.minPrec (case 1) or if stP.trailPrec <= st.contPrec (case 2).

The traversal can be customized for each [*Parser] parser declaration c (more specifically, each SyntaxNodeKind c) using the [parenthesizer c] attribute. Otherwise, a default parenthesizer will be synthesized from the used parser combinators by recursively replacing them with declarations tagged as [combinator_parenthesizer] for the respective combinator. If a called function does not have a registered combinator parenthesizer and is not reducible, the synthesizer fails. This happens mostly at the Parser.mk decl, which is irreducible, when some parser primitive has not been handled yet.

The traversal over the Syntax object is complicated by the fact that a parser does not produce exactly one syntax node, but an arbitrary (but constant, for each parser) amount that it pushes on top of the parser stack. This amount can even be zero for parsers such as checkWsBefore. Thus we cannot simply pass and return a Syntax object to and from visit. Instead, we use a Syntax.Traverser that allows arbitrary movement and modification inside the syntax tree. Our traversal invariant is that a parser interpreter should stop at the syntax object to the left of all syntax objects its parser produced, except when it is already at the left-most child. This special case is not an issue in practice since if there is another parser to the left that produced zero nodes in this case, it should always do so, so there is no danger of the left-most child being processed multiple times.

Ultimately, most parenthesizers are implemented via three primitives that do all the actual syntax traversal: maybeParenthesize mkParen prec x runs x and afterwards transforms it with mkParen if the above condition for p prec is fulfilled. visitToken advances to the preceding sibling and is used on atoms. visitArgs x executes x on the last child of the current node and then advances to the preceding sibling (of the original current node).

  • cat : Lean.Name

    We need to store this categoryParser argument to deal with the implicit Pratt parser call in trailingNode.parenthesizer.

  • forceParens : Bool

    Whether to add parentheses regardless of any other conditions. This is cached from the pp.parens option.

Instances For
    Instances For
      Equations
      • One or more equations did not get rendered due to their size.
      Instances For
        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
          Equations
          • One or more equations did not get rendered due to their size.
          Instances For
            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Execute x at the right-most child of the current node, if any, then advance to the left.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For

                Run x and parenthesize the result using mkParen if necessary. If canJuxtapose is false, we assume the category does not have a token-less juxtaposition syntax a la function application and deactivate rule 2.

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For

                  Adjust state and advance.

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For
                    @[extern lean_mk_antiquot_parenthesizer]
                    @[extern lean_pretty_printer_parenthesizer_interpret_parser_descr]
                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For
                      @[implemented_by Lean.PrettyPrinter.Parenthesizer.parenthesizerForKindUnsafe]
                      Equations
                      • One or more equations did not get rendered due to their size.
                      Instances For
                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For
                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For
                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For
                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For
                                Equations
                                • One or more equations did not get rendered due to their size.
                                Instances For

                                  Wraps the term stx in parentheses and then moves its SourceInfo to the result. The purpose of this is to move synthetic delaborator positions from the stx node to the parentheses node, which causes the info view to view the node with parentheses as referring to the parenthesized expression. If we did not move info, the info view would consider the parentheses to belong to the outer term. Note: we do not do withRef stx because that causes the "(" and ")" tokens to have source info as well, causing the info view to highlight each parenthesis as an independent expression.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For
                                    Equations
                                    • One or more equations did not get rendered due to their size.
                                    Instances For
                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For
                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Instances For
                                          Equations
                                          • One or more equations did not get rendered due to their size.
                                          Instances For
                                            Equations
                                            • One or more equations did not get rendered due to their size.
                                            Instances For
                                              Equations
                                              • One or more equations did not get rendered due to their size.
                                              Instances For
                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For
                                                  Equations
                                                  • One or more equations did not get rendered due to their size.
                                                  Instances For
                                                    Equations
                                                    • One or more equations did not get rendered due to their size.
                                                    Instances For
                                                      Equations
                                                      • One or more equations did not get rendered due to their size.
                                                      Instances For
                                                        @[macro_inline]
                                                        Equations
                                                        Instances For

                                                          Adds necessary parentheses in stx parsed by parser.

                                                          Equations
                                                          • One or more equations did not get rendered due to their size.
                                                          Instances For

                                                            Adds necessary parentheses to the syntax in the given category (for example, term, tactic, or command).

                                                            Equations
                                                            Instances For