Wednesday, December 1, 2021

The one-more-re-nightmare compiler – A fast regex compiler in Common Lisp

Before there was one-more-re-nightmare, there was cl-ppcre. cl-ppcre was considered reasonably fast, and a showcase for compiling code at runtime in Common Lisp. The book Let over Lambda claims:

With CL-PPCRE, the technical reason for the performance boost is simple: Common Lisp, the language used to implement CL-PPCRE, is a more powerful language than C, the language used to implement Perl. When Perl reads in a regular expression, it can perform analysis and optimisation but eventually the regular expression will be stored into some sort of C data structure for the static regular expression engine to use when it attempts the matching. But in Common Lisp—the most powerful language—it is essentially no more difficult to take this regular expression, convert it into a lisp program, and pass that lisp program to the optimising, native-code lisp compiler used to build the rest of your lisp system.

While the claims are all true, they have nothing to do with cl-ppcre! cl-ppcre just produces a chain of closures representing the regular expression. I could believe that it was faster than Perl at the time, but it is not as fast as actually producing native code from a regular expression, as PCRE2 does with a bespoke JIT compiler. Still, if it was so fast for its time, it would be nice to implement a new engine which does what people thought cl-ppcre did.
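
To make the distinction concrete, here is a minimal sketch of both approaches for a toy pattern format. Everything in it is an assumption made for illustration: the pattern is just a list of characters in which :any matches any one character, and the names closure-matcher, matcher-form and compiled-matcher are hypothetical. It is not the code of cl-ppcre or of one-more-re-nightmare; it only shows the difference between chaining closures and handing a generated lambda form to COMPILE.

```lisp
;; Toy pattern format (an assumption for this sketch): a list of characters,
;; where the keyword :any matches any single character.

(defun closure-matcher (pattern)
  "Build a chain of closures. Each closure tests one position, then calls
the next. Returns a function of (STRING POS) giving the end position or NIL."
  (if (null pattern)
      ;; End of the pattern: the match succeeded, report where it ended.
      (lambda (string pos) (declare (ignore string)) pos)
      (let ((item (first pattern))
            (next (closure-matcher (rest pattern))))
        (lambda (string pos)
          (and (< pos (length string))
               (or (eq item :any)
                   (char= item (char string pos)))
               (funcall next string (1+ pos)))))))

(defun matcher-form (pattern)
  "Translate PATTERN into a LAMBDA form with every test written out inline."
  `(lambda (string pos)
     ;; Type declarations give the compiler something to optimise with.
     (declare (type simple-string string) (type fixnum pos))
     (and (<= (+ pos ,(length pattern)) (length string))
          ,@(loop for item in pattern
                  for offset from 0
                  unless (eq item :any)
                    collect `(char= ,item (schar string (+ pos ,offset))))
          (+ pos ,(length pattern)))))

(defun compiled-matcher (pattern)
  "Pass the generated form to the implementation's compiler at runtime."
  (compile nil (matcher-form pattern)))

(funcall (closure-matcher '(#\a :any #\c)) "abc" 0)    ; => 3
(funcall (compiled-matcher '(#\a :any #\c)) "xabcx" 1) ; => 4
(funcall (compiled-matcher '(#\a :any #\c)) "abd" 0)   ; => NIL
```

In the first version every pattern element costs a FUNCALL through a closure at match time. In the second, the pattern has disappeared into straight-line code, which an implementation with a native-code compiler can turn into machine code.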



from Hacker News https://ift.tt/3phBV9p
