### Perl Foundation News: Grant Committee - Request for Members

We are looking for new members to join the Grant Committee.

We have a few members who are ready to retire, and so we have a few positions to consider filling.

Voting members review proposals every two months, taking community feedback into account, and vote on whether to approve and fund each grant.

Grant Managers (who can be voting members, but don't have to be) work with approved grant recipients to ensure monthly status reports are produced and, at the end of the grant, recommend that the committee consider the work complete.

At the last TPC, several people approached me with an interest in serving in one of these capacities, but we didn't make any moves at that time. Because so much time has passed, I'm essentially starting the process over.

If you are interested in becoming part of the committee, please read through https://www.perlfoundation.org/grants-committee1.html and also send me an email (coke at CPAN).

If you would like to nominate someone, please ask them to send an email themselves; that way I don't have to double-check that they are willing.

Thanks to everyone in the community for making this work fulfilling.

### Perl Foundation News: Grant Proposals Mar/Apr 2019

The Grants Committee has received the following grant proposal for the March/April 2019 round.

Before the Committee members vote on any proposal, we like to solicit feedback from the Perl community.

Review the proposals at their individual links and please comment there by March 22nd, 2019. The Committee members will start the voting process following that and the conclusion will be announced shortly after.

### Perl Foundation News: Grant Proposal: Create a complete course of the Perl 6 programming language

The Grants Committee has received the following grant proposal for the March/April 2019 round. Before the Committee members vote, we would like to solicit feedback from the Perl community on the proposal.

Review the proposal below and please comment here by March 22nd, 2019. The Committee members will start the voting process following that.

# A Complete (Interactive) Perl 6 Course with Exercises

• Name

Andrew Shitov

• Amount Requested

USD 9,996

Task: Create a complete course of the Perl 6 programming language.

## Abstract

I want to create a complete course with exercises covering all aspects of Perl 6. It will be aimed at everyone who is familiar with programming. The goal is to make a course that can be used for self-study or as a platform in the classroom.

## Target audience

Perl 6 is a language that many people find extremely attractive. The efforts of our activists during recent years show that there are people from outside of the Perl community who also want to start learning Perl 6. There are two groups of potential users: those with and without a Perl 5 background. As Perl 6 differs significantly from Perl 5, both target groups can benefit from a single Perl 6 course.

## Unique points

How is this different from what already exists: the vast online documentation, books, etc.? The proposed course is a step-by-step flow that begins with simple things, which makes it different from the documentation. Unlike the books, the main focus will be on small lessons with many exercises. There are also a few video lectures and introductions, but again with very little homework. Nor do I want to produce an extended version of the documentation (see learn-perl.org/en/Operators for an example of a tutorial with long lists of features). My idea is closer to interactive courses such as perltuts.com (not completed for Perl 6) and www.snakify.org/en, which I used myself to learn and teach Python.

## Content

The course contains approximately 15 sections (roughly following the chapters of "Perl 6 Deep Dive"). Each section includes 15-40 lessons. Each lesson covers a single topic (such as accessing array elements, using different variants of multi-methods, a regex quantifier, or an element of concurrent code) and includes 2-4 exercises, with correct solutions displayed on request.

## Timeline and deliverable chunks

The project needs about six months to complete all 15 sections; independent sections can be published earlier, upon completion. The work is divided into four parts. The first three parts are about content; the fourth is to make it interactive.

1. Materials and exercises for the chapters devoted to the aspects of Perl 6 as a general programming language (variables, functions, file operations, object-oriented programming, etc.).
2. Chapters about regular expressions and grammars.
3. Everything else. This part covers topics such as concurrent, functional, and reactive programming; in other words, everything beyond the general-language material of point 1.
4. Implement the exercises from the three parts above as interactive online pages.

The result of the first three parts is a GitHub repository with lessons in Markdown format.

## Interactivity

We will need to host it on a subdomain of perl6.org and/or link to it from the Resources page of the site. The technical implementation could use JJ's Docker image to spawn a compiler for each user. I suggest postponing the question of who pays for the hosting. Potentially we can re-use perltuts's engine. In a minimal form, the exercises do not have to be truly interactive, which would save both time and money on programming the server side. Additionally or alternatively, this course could be ported to sites such as stepik.org (but with no video lectures), which host interactive tutorials for other programming languages. If we choose this route, it will still require a lot of work under Chunk 4 from the list above.
## About me

I have been a Perl 6 enthusiast since around 2000. I have run a few Perl 6 blogs (perl6.ru, perl6.online), wrote three Perl 6 books, gave talks about Perl 6 at different events including FOSDEM 2019, and organised a number of events dedicated to the Perl programming languages.

## Financial

The requested amount is US$2,499.00 for each of the four parts, thus US$9,996.00 in total.

### Perl Foundation News: February report of the Perl 6 Development Grant of Jonathan Worthington

Jonathan writes:

The majority of my Perl 6 grant time during February was spent on the escape analysis and scalar replacement work. Happily, the first round of work on this analysis and optimization reached the point of being complete and stable enough to merge into MoarVM master, so Perl 6 users can now benefit from it. I also made allocation profiling aware of scalar replacement, meaning profiling does not block the optimization and collects statistics about how many scalar replacements took place.

I also started on the next round of escape analysis work, which handles transitive object relationships. This means that if we have, for instance, a Scalar that does not escape, and a Num which also does not escape is assigned into it, then we can scalar-replace both. The basic algorithm currently in master would conservatively consider the Num as having escaped. Further work is needed with regard to deoptimization of such cases before this work can be merged.

I also worked on enabling more aggressive optimization of inlined code. Sometimes, knowledge of the context the code is being inlined into allows for significant further optimization of the inlinee. While profiling a hash benchmark, I spotted some duplicated sanity checks during hash access and eliminated those for a speedup, and started looking into a means to speed up attribute access in the non-mixin case. Finally, I fixed a recent regression, and did assorted bits of code review, replying to issues, and merging PRs.

9:45    Fix remaining issues with basic escape analysis and scalar
        replacement; merge the branch
1:26    Integrate allocation profiling and scalar replacement
5:09    Work on transitive object handling in escape analysis
3:50    More aggressively optimize inlined code after inlining
1:43    Look into a bug involving BEGIN, closures, and $/; fix it
1:29    Eliminate duplicate checks on hash access, giving ~5% off
        a hash access benchmark
3:55    Work on speeding up attribute access in non-mixin cases
1:48    Assorted code review, replying to issues, merging PRs

Total: 29:05

Remaining funding-approved hours on current grant period: 131:46
Remaining community-approved hours on current grant period: 297:46


### Perl.com: What's new on CPAN - February 2019

Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!

## Science & Mathematics

### Perl Foundation News: Maintaining Perl 5 (Tony Cook): February 2019 Grant Report

This is a monthly report by Tony Cook on his grant under the Perl 5 Core Maintenance Fund. We thank the TPF sponsors for making this grant possible.

Approximately 37 tickets were reviewed, and 9 patches were applied.

[Hours]         [Activity]
0.85          #108276 re-test, apply to blead
12.83          #124203 reproduce, debugging (mostly seems to be lock() at
start of DB::sub), try to bisect
#124203 bisect some more, review results, notice change in
bug and bisect for that change
#124203 debugging, appears to be locking up in Cwd::CLONE,
try to disable DB::sub calls while cloning, get a bit
further, locks in threads::join() (holding the $DBGR lock) and main::sub1, trying to get the $DBGR lock
#124203 re-work DB::sub to avoid holding the lock,
testing, clean up and polish, work on a regression test
#124203 debug test failures, fixes, testing, polish,
comment with patch
#124203 look for similar issues, work up a test case, fix
and comment with patch
6.50          #130585 debugging
#130585 try to re-work parsing to avoid the bug
#130585 testing, polish, comment with patch
#130585 re-work patch a little, comment with updated patch
1.95          #131683 review, testing, comment
#131683 review original discussion, testing, minor fixes,
0.82          #132964 review and comment
#132964 review new patch, testing, apply to blead
0.75          review security queue, comments, make #133334, #133335
public
0.30          #133462 work on fix, testing, apply to blead
0.72          #133523 (sec) reply
0.60          #133638 review and comment, mail perldoc.perl.org
maintainer
1.03          #133660 work up a test and comment with patch
#133660 re-test and apply to blead
0.10          #133670 review and close
0.28          #133695 review and comment
0.70          #133771 review, research and comment
2.08          #133778 debugging, comment with patch
#133778 retest and apply to blead
0.62          #133781 review and comment
2.14          #133795 research and comment
#133795 review toke.c, perly.y to try to find a solution
1.20          #133803 review, comment, prepare to try a test
0.83          #133810 test setup, testing, propose fixes for
backporting, close #133760
1.39          #133816 review apparent bisect, testing
#133816 reproduce bisect, ask khw about it
1.78          alpine linux tests, open #133818 and #133819
#133818 discussion with khw on providing access to alpine
linux
2.10          #133822 debugging
#133822 debugging, comment with patch, also comment on
similar #133823
0.15          #133827 comment
1.75          #133830 research, testing, comment, some setlocale
discussion with khw
0.87          #133838 review weekend commits, review patch, testing,
0.40          #133846 review, test, apply to blead, research ERRSV
history
4.02          #133850 debugging, work on fix
#133850 testing, check a similar case, comment with patch,
search for other bugs with the same cause, comment with
some similar cases
0.27          #133851 briefly comment, debugging
2.32          #133853 research, adhoc testing, work up a test case,
testing, apply to blead
0.10          check for un-perldeltaed changes
1.07          more perlsecpolicy, open a security ticket for discussion
0.68          perlsecpolicy
3.03          perlsecpolicy:
0.30          perlsecpolicy: comment
1.57          perlsecpolicy: respond to the only comment, updates
1.75          perlsecpolicy: spell check with some work, fixes, comment
with updated pod
0.27          research and briefly comment on "RFC: Adding \p{foo=/re/}"
0.77          respond to toddr's email
======
58.89 hours total

## Ruby Slippers

In Marpa, a "Ruby Slippers" symbol is one which does not actually occur in the input. Ruby Slippers parsing is new with Marpa, and made possible because Marpa is left-eidetic. By left-eidetic, I mean that Marpa knows, in full detail, about the parse to the left of its current position, and can provide that information to the parsing app. This implies that Marpa also knows which tokens are acceptable to the parser at the current location, and which are not.

Ruby Slippers parsing enables a very important trick which is useful in "liberal" parsing -- parsing where certain elements might be in some sense "missing". With the Ruby Slippers you can design a "liberal" parser with a "fascist" grammar. This is, in fact, how the Haskell 2010 Report's context-free grammar is designed -- the official syntax requires explicit layout, but Haskell programmers are encouraged to omit most of the explicit layout symbols, and Haskell implementations are required to "dummy up" those symbols in some way. Marpa's method for doing this is left-eideticism and Ruby Slippers parsing.

The term "Ruby Slippers" refers to a widely-known scene in the "Wizard of Oz" movie. Dorothy is in the fantasy world of Oz, desperate to return to Kansas. But, particularly after a shocking incident in which orthodox Oz wizardry is exposed as an affable fakery, she is completely at a loss as to how to escape. The "good witch" Glenda appears and tells Dorothy that in fact she's always had what she's been wishing for. The Ruby Slippers, which she had been wearing all through the movie, can return her to Kansas. All Dorothy needs to do is wish.

In Ruby Slippers parsing, the "fascist" grammar "wishes" for lots of things that may not be in the actual input. Procedural logic here plays the part of a "good witch" -- it tells the "fascist" grammar that what it wants has been there all along, and supplies it. To do this, the procedural logic has to have a reliable way of knowing what the parser wants. Marpa's left-eideticism provides this.

## Ruby Slippers combinators

This brings us to a question I've postponed -- how do we know which combinator to call when? The answer is Ruby Slippers parsing. First, here are some lexer rules for "unicorn" symbols. We use unicorns when symbols need to appear in Marpa's lexer, but must never be found in actual input.


:lexeme ~ L0_unicorn
L0_unicorn ~ unicorn
unicorn ~ [^\d\D]
ruby_i_decls ~ unicorn
ruby_x_decls ~ unicorn [14]



<unicorn> is defined to match [^\d\D]. This pattern consists of all the symbols which are not digits and not non-digits -- in other words, it is impossible for this pattern ever to match any character. The rest of the statements declare other unicorn lexemes that we will need. <unicorn> and <L0_unicorn> are separate, because we need to use <unicorn> on the RHS of some lexer rules, and a Marpa lexeme can never occur on the RHS of a lexer rule.[15]
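As a quick sanity check (a throwaway script of mine, not part of the post), we can confirm that the pattern can never match:

use strict;
use warnings;

# Every character is either a digit (\d) or a non-digit (\D), so the
# complement of their union is empty and this pattern must never match.
for my $codepoint (0 .. 255) {
    my $ch = chr $codepoint;
    die 'unexpected match' if $ch =~ /[^\d\D]/;
}
print "[^\\d\\D] matched none of the first 256 characters\n";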

In the Marpa rule below,

• <decls> is the symbol from the 2010 Report;
• <ruby_i_decls> is a Ruby Slippers symbol for a block of declarations with implicit layout;
• <ruby_x_decls> is a Ruby Slippers symbol for a block of declarations with explicit layout;
• <laidout_decls> is a symbol (not in the 2010 Report) covering all the possibilities for a block of declarations.

laidout_decls ::= ('{') ruby_x_decls ('}')
| ruby_i_decls
| L0_unicorn decls L0_unicorn [16]


It is the expectation of a <laidout_decls> symbol that causes child combinators to be invoked. Because <L0_unicorn> will never be found in the input, the <decls> alternative will never match -- it is there for documentation and debugging reasons.[17] Therefore, when Marpa wants a <laidout_decls>, it will look for a <ruby_x_decls> if an open curly brace is read, and for a <ruby_i_decls> otherwise. Neither <ruby_x_decls> nor <ruby_i_decls> will ever be found in the input, so Marpa will reject the input, causing a "rejected" event.

## Rejected events

In this code, as often, the "good witch" of Ruby Slippers does her work through "rejected" events. These events can be set up to happen when, at some parse location, none of the tokens that Marpa's internal lexer finds are acceptable.

In the "rejected" event handler, we can use Marpa's left eideticism to find out what lexemes Marpa would consider acceptable. Specifically, there is a terminals_expected() method which returns a list of the symbols acceptable at the current location.


my @expected =
    grep { /^ruby_/xms } @{ $recce->terminals_expected() }; [18]


Once we "grep" out all but the symbols with the "ruby_" prefix, there are only 4 non-overlapping possibilities:

• Marpa expects a <ruby_i_decls> lexeme;
• Marpa expects a <ruby_x_decls> lexeme;
• Marpa expects a <ruby_semicolon> lexeme;
• Marpa does not expect any of the Ruby Slippers lexemes.

If Marpa does not expect any of the Ruby Slippers lexemes, there was a syntax error in the Haskell code.[19]

If a <ruby_i_decls> or a <ruby_x_decls> lexeme is expected, a child combinator is invoked. The Ruby Slippers symbol determines whether the child combinator looks for implicit or explicit layout. In the case of implicit layout, the location of the rejection determines the block indent.[20]

If a <ruby_semicolon> is expected, then the parser is at the point where a new block item could start, but none was found. Whether the block was implicit or explicit, this indicates we have reached the end of the block, and should return control to the parent combinator.[21]

To explain why <ruby_semicolon> indicates end-of-block, we look at both cases. In the case of an explicit layout combinator, the rejection should have been caused by a closing curly brace, and we return to the parent combinator and retry it. In the parent combinator, the closing curly brace will be acceptable.

If we experience a "rejected" event while expecting a <ruby_semicolon> in an implicit layout combinator, it means we did not find an explicit semicolon; and we also never found the right indent for creating a Ruby semicolon. In other words, the indentation is telling us that we are at the end of the block. We therefore return control to the parent combinator.
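Pulling the three cases together, a handler for the "rejected" event might dispatch along these lines. This is a sketch of mine, not the post's code: terminals_expected() is real Marpa API, but $recce, parse_explicit_block() and parse_implicit_block() are hypothetical stand-ins for the combinator machinery.

sub on_rejected {
    my ($recce, $rejection_column) = @_;
    my %expected = map { $_ => 1 }
        grep { /^ruby_/xms } @{ $recce->terminals_expected() };

    # Explicit layout: an open curly brace led Marpa to expect it.
    return parse_explicit_block($recce) if $expected{ruby_x_decls};

    # Implicit layout: the column of the rejection fixes the indent.
    return parse_implicit_block($recce, $rejection_column)
        if $expected{ruby_i_decls};

    # Neither a semicolon nor a properly indented item was found:
    # end of block, so hand control back to the parent combinator.
    return if $expected{ruby_semicolon};

    die 'Syntax error in the Haskell input';
}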

## Conclusion

With this, we've covered the major points of this Haskell prototype parser. It produces an AST whose structure and node names are those of the 2010 Report. (The Marpa grammar introduces non-standard node names and rules, but these are pruned from the AST in post-processing.)

In the code, the grammars from the 2010 Report are included for comparison, so a reader can easily determine what syntax we left out. It might be tedious to add the rest, but I believe it would be unproblematic, with one interesting exception: fixity. To deal with fixity, we may haul out the Ruby Slippers again.

## The code, comments, etc.

A permalink to the full code and a test suite for this prototype, as described in this blog post, is on Github. I expect to update this code, and the latest commit can be found here. Links for specific lines of code in this post are usually static permalinks to earlier commits.

To learn more about Marpa, a good first stop is the semi-official web site, maintained by Ron Savage. The official, but more limited, Marpa website is my personal one. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

## Footnotes

1. Graham Hutton and Erik Meijer, Monadic parser combinators, Technical Report NOTTCS-TR-96-4. Department of Computer Science, University of Nottingham, 1996, pp 30-35. http://eprints.nottingham.ac.uk/237/1/monparsing.pdf. Accessed 19 August 2018.

2. I use whitespace-significant parsing as a convenient example for this post, for historical reasons and for reasons of level of complexity. This should not be taken to indicate that I recommend it as a language feature.

3. Simon Marlow, Haskell 2010 Language Report, 2010. Online version accessed 21 August 2018. For layout, see in particular section 2.7 (pp. 12-14) and section 10.3 (pp. 131-134).

4. 2010 Report. The short examples are on p. 13 and p. 134. The long examples are on p. 14.

5. Paul Hudak, John Peterson and Joseph Fasel, Gentle Introduction To Haskell, version 98. Revised June 2000 by Reuben Thomas. Online version accessed 21 August 2018. The examples are in section 4.6, which is on pp. 20-21 of the October 1999 PDF.

9. Single-line comments are dealt with properly by lexing them as a different token and discarding them separately. Handling multi-line comments is not yet implemented -- it is easy in principle but tedious in practice and the examples drawn from the Haskell literature did not provide any test cases.

15. The reason for this is that by default a Marpa grammar determines which of its symbols are lexemes using the presence of those symbols on the LHS and RHS of the rules in its lexical and context-free grammars. A typical Marpa grammar requires a minimum of explicit lexeme declarations. (Lexeme declarations are statements with the :lexeme pseudo-symbol on their LHS.) As an aside, the Haskell 2010 Report is not always careful about the lexer/context-free boundary, and adopting its grammar required more use of Marpa's explicit lexeme declarations than usual.

17. Specifically, the presence of a <decls> alternative silences the usual warnings about symbols inaccessible from the start symbol. These warnings can be silenced in other ways, but at the prototype stage it is convenient to check that all symbols supposed to be accessible through <decls> are in fact accessible. There is a small startup cost to allowing the extra symbols in the grammars, but the runtime cost is probably not measurable.

19. Currently the handling of these is simplistic. A practical implementation of this method would want better reporting. In fact, Marpa's left eideticism allows some interesting things to be done in this respect.

## Language popularity

Github's linguist is seen as the most trustworthy tool for estimating language popularity[1], in large part because it reports its result as the proportion of code in a very large dataset, rather than in terms of web hits or searches.[2] It is ironic, in this context, that linguist avoids looking at the code, preferring to use metadata -- the file name and the vim and shebang lines. Scanning the actual code is linguist's last resort.[3]
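For instance, here is the kind of metadata linguist prefers, shown in a tiny Perl file of my own invention: the .pl file name, the shebang line, and a vim modeline each identify the language without reading the code.

#!/usr/bin/perl
# vim: set filetype=perl :
use strict;
use warnings;
print "classified as Perl from the name and the two lines above\n";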

How accurate is this? For files that are mostly in a single programming language -- currently the majority of them -- linguist's methods are probably very accurate.

But literate programming often requires mixing languages. It is perhaps an extreme example, but much of the code used in this blog post comes from a Markdown file, which contains both C and Lua. This code is "untangled" from the Markdown by ad-hoc scripts[4]. In my codebase, linguist identifies this code simply as Markdown.[5] linguist then ignores it, as it does all documentation files.[6]

Currently, this kind of homegrown literate programming may be so rare that it is not worth taking into account. But if literate programming becomes more popular, that trend might well slip under linguist's radar. And even those with a lot of faith in linguist's numbers should be happy to know they could be confirmed by more careful methods.

## Token-by-token versus line-by-line

linguist avoids reporting results based on looking at the code, because careful line counting for multiple languages cannot be done with traditional parsing methods.[7] To do careful line counting, a parser must be able to handle ambiguity in several forms -- ambiguous parses, ambiguous tokens, and overlapping variable-length tokens.

The ability to deal with "overlapping variable-length tokens" may sound like a bizarre requirement, but it is not. Line-by-line languages (BASIC, FORTRAN, JSON, .ini files, Markdown) and token-by-token languages (C, Java, Javascript, HTML) are both common, and even today commonly occur in the same file (POD and Perl, Haskell's Bird notation, Knuth's CWeb).

Deterministic parsing can switch back and forth, though at the cost of some very hack-ish code. But for careful line counting, you need to parse line-by-line and token-by-token simultaneously. Consider this example:


int fn () { /* for later
\begin{code}
*/ int fn2(); int a = fn2();
int b = 42;
return  a + b; /* for later
\end{code}
*/ }


A reader can imagine that this code is part of a test case using code pulled from a LaTeX file. The programmer wanted to indicate the copied portion of code, and did so by commenting out its original LaTeX delimiters. GCC compiles this code without warnings.

It is not really the case that LaTeX is a line-by-line language. But in literate programming systems[8], it is usually required that the \begin{code} and \end{code} delimiters begin at column 0, and that the code block between them be a set of whole lines. So, for our purposes in this post, we can treat LaTeX as line-by-line. For LaTeX, our parser finds


L1c1-L1c29 LaTeX line: "    int fn () { /* for later"
L2c1-L2c13 \begin{code}
L3c1-L5c31 [A CODE BLOCK]
L6c1-L6c10 \end{code}
L7c1-L7c5 LaTeX line: "*/ }"[9]


Note that in the LaTeX parse, line alignment is respected perfectly: The first and last are ordinary LaTeX lines, the 2nd and 6th are commands bounding the code, and lines 3 through 5 are a code block.

The C tokenization, on the other hand, shows no respect for lines. Most tokens are a small part of their line, and the two comments start in the middle of a line and end in the middle of one. For example, the first comment starts at column 17 of line 1 and ends at column 5 of line 3.[10]

What language is our example in? Our example is long enough to justify classification, and it compiles as C code. So it seems best to classify this example as C code[11]. Our parses give us enough data for a heuristic to make a decision capturing this intuition.[12]
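As an illustration only (my sketch, not the post's heuristic), the decision rule could be as simple as comparing how many lines each parse claims as code:

use strict;
use warnings;

# Toy classifier: %claimed maps a language to the number of lines its
# parse claims.  Pick the language with the larger claim.  The counts
# in the call below are illustrative, not measurements.
sub classify_by_line_claims {
    my %claimed = @_;
    my ($winner) = sort { $claimed{$b} <=> $claimed{$a} } keys %claimed;
    return $winner;
}

print classify_by_line_claims( C => 5, LaTeX => 2 ), "\n";    # prints "C"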

## Earley/Leo parsing and combinators

In a series of previous posts[13], I have been developing a parsing method that integrates Earley/Leo parsing and combinator parsing. Everything in my previous posts is available in Marpa::R2, which has been in Debian stable since jessie.

The final piece, added in this post, is the ability to use variable length subparsing[14], which I have just added to Marpa::R3, Marpa::R2's successor. Releases of Marpa::R3 pass a full test suite, and the documentation is kept up to date, but R3 is alpha, and the usual cautions[15] apply.

Earley/Leo parsing is linear for a superset of the LR-regular grammars, which includes all other grammar classes in practical use, and Earley/Leo allows the equivalent of infinite lookahead.[16] When the power of Earley/Leo gives out, Marpa allows combinators (subparsers) to be invoked. The subparsers can be anything, including other Earley/Leo parsers, and they can be called recursively[17]. Rare will be the grammar of practical interest that cannot be parsed with this combination of methods.

## The example

The code that ran this example is available on Github. In previous posts, we gave larger examples[18], and our tools and techniques have scaled. We expect that the variable-length subparsing feature will also scale -- while it was not available in Marpa::R2, it is not in itself new. Variable-length tokens have been available in other Marpa interfaces for years, and they were described in Marpa's theory paper.[19]

The grammars used in the example of this post are minimal. Only enough LaTeX is implemented to recognize code blocks, and only enough C syntax to recognize comments.

## The code, comments, etc.

To learn more about Marpa, a good first stop is the semi-official web site, maintained by Ron Savage. The official, but more limited, Marpa website is my personal one. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

## Footnotes

1. The Github repo for linguist is https://github.com/github/linguist/.

2. Their methodology is often left vague, but it seems safe to say the careful line-by-line counting discussed in this post goes well beyond the techniques used in the widely-publicized lists of "most popular programming languages".

In fact, it seems likely these measures do not use line counts at all, but instead report the sum of blob sizes. Github's linguist does give a line count but Github does not vouch for its accuracy: "if you really need to know the lines of code of an entire repo, there are much better tools for this than Linguist." (Quoted from the resolution of Github linguist issue #1331.) The Github API's list-languages command reports language sizes in bytes. The API documentation is vague, but it seems the counts are the sum of blob sizes, with each blob classed as one and only one language.
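To make this concrete, the languages endpoint is easy to query directly. Here is a small sketch of mine in Perl using only core modules (the repo chosen is arbitrary, and https support requires IO::Socket::SSL to be installed):

use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(decode_json);

# Fetch per-language byte counts for a repo; each blob is counted
# under exactly one language, as described above.
my $url = 'https://api.github.com/repos/github/linguist/languages';
my $response = HTTP::Tiny->new->get($url);
die "request failed: $response->{status}\n" unless $response->{success};
my $bytes = decode_json( $response->{content} );
printf "%-12s %12d bytes\n", $_, $bytes->{$_}
    for sort { $bytes->{$b} <=> $bytes->{$a} } keys %$bytes;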

Some tallies seem even more coarsely grained than this -- they are not even blob-by-blob, but assign entire repos to the "primary language". For more, see Jon Evans' TechCrunch article; and Ben Frederickson's project.

3. linguist's methodology is described in its README.md (permalink as of 30 September 2018).

4. This custom literate programming system is not documented or packaged, but those who cannot resist taking a look can find the Markdown file it processes here, and its own code here (permalinks accessed 2 October 2018).

5. For those who care about getting linguist as accurate as possible, there is a workaround: the linguist-language git attribute. This still requires that each blob be reported as containing lines of only one language.
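As a one-line illustration (the path pattern and the language choice are mine), a .gitattributes entry like the following would make linguist count the literate Markdown files as Lua:

*.md linguist-language=Lua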

6. For the treatment of Markdown, see linguist README.md (permalink accessed as of 30 September 2018).

7. Another possibility is a multi-scan approach -- one pass per language. But that is likely to be expensive. At last count there were 381 languages in linguist's database. Worse, it won't solve the problem: "liberal" recognition even of a single language requires more power than is available from traditional parsers.

8. For example, these line-alignment requirements match those in Section 10.4 of the 2010 Haskell Language Report.

9. Adapted from test code in Github repo, permalink accessed 2 October 2018.

10. See the test file on GitHub.

11. Some might think the two LaTeX lines should be counted as LaTeX and, using subparsing of comments, that heuristic can be implemented.

12. To be sure, a useful tool would want to include considerably more of C's syntax. It is perhaps not necessary to be sure that a file compiles before concluding it is C. And we might want to class a file as C in spite of a fleeting failure to compile. But we do want to lower the probability of a false positive.

14. There is documentation of the interface, but it is not a good starting point for a reader who has just started to look at the Marpa::R3 project. Once a user is familiar with Marpa::R3's standard DSL-based interface, they can start to learn about its alternatives here.

15. Specifically, since Marpa::R3 is alpha, its features are subject to change without notice, even between micro releases, and changes are made without concern for backward compatibility. This makes R3 unsuitable for a production application. Add to this that, while R3 is tested, it has seen much less usage and testing than R2, which has been very stable for some time.

16. Technically, a grammar is LR-regular if it can be parsed deterministically using a regular set as its lookahead. A "regular set" is a set of regular expressions. The regular set itself must be finite, but the regular expressions it contains can match lookaheads of arbitrary length.

18. The largest example is in "Marpa and combinator parsing 2".

19. Kegler, Jeffrey. Marpa, A Practical General Parser: The Recognizer. Online version accessed 24 April 2018. The link is to the 19 June 2013 revision of the 2012 original.
