Ranguard's blog: Devel::NYTProf - Perl profiling links needed

A conversation on IRC this morning...

07:46 XXXX damn... is it normal for DProf/Profiler to not work correctly with moose stuff?
07:46 aaaa people still use dprof?
07:47 XXXX people who might not be aware of alternatives, sure
07:48 XXXX what would you suggest instead then?
07:49 aaaa Devel::NTYProf is THE profiller these days :)
07:52 XXXX unfortunately googling perl profiling doesn't take you anywhere near it :(

So, I'd like to talk about Perl profiling, or even profiling Perl. Devel::NYTProf doesn't make the claim that it is THE Perl profiler, but it really does seem to be the best profiler for Perl that is currently available.

If you've not come across Devel::NYTProf, the Perl profiler, please go and give it a try.

If, like me, you think it's fantastic, please link to http://search.cpan.org/dist/Devel-NYTProf/ in a blog post or from your website with some appropriate link words, so we can let Google know about it, and therefore the rest of the world.

dagolden: Can you help identify ambiguous CPAN distributions?

Hello, Perl community. As I work on converting legacy CPAN Testers (CT1.0) reports to the new CPAN Testers 2.0 (CT2.0) format, I’ve encountered a curious conundrum and could use some volunteer help.

CT1.0 indexes reports based on the distribution name and version, e.g. “Foo-Bar-1.23”. This is an unfortunate historical accident, since PAUSE does not prevent uploads with the same file name to different author directories:

  • JDOE/Foo-Bar-1.23.tar.gz
  • JQPUBLIC/Foo-Bar-1.23.tar.gz

CT2.0 will index reports based on the full unique distribution file path. I’m currently working on a heuristic to link any given legacy test report (on “Foo-Bar-1.23”) with the correct distribution file path for that distribution name and version for the conversion to CT2.0.

For the most part, it works. Usually, there is only one distribution file path on BackPAN that matches. Sometimes there is more than one possibility, but I’ve worked out ways to resolve the ambiguity by comparing the possibilities to information in the 01mailrc files or the 02packages.details file.

But there are about 50 distribution name-version pairs on BackPAN that my heuristic fails to resolve. Since this is a one-time conversion from CT1.0 to CT2.0, all I need is a mapping file with entries like this for these ambiguous cases:

    YAML-0.39    INGY/YAML-0.39.tar.gz

If you think you can help, either through some automated approach or just by volunteering your human brain to do some basic research to identify the “authoritative” path (e.g. from the historical author list in the distribution documentation files), that would be a great help for me so I can keep plugging away on the conversion code and other todos.

Even confirming that the candidates on BackPAN have the same md5 sum would be helpful since then even if we guess the wrong author, the test results are still “good” for the mistaken distribution file.
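
The md5 check described above could be sketched roughly like this. It assumes the candidate tarballs have already been downloaded locally; the helper names md5_of_file and all_identical are made up for illustration and are not part of any CPAN Testers code.

```perl
use strict;
use warnings;
use Digest::MD5;

# Return the hex MD5 digest of a file's contents.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# True if every candidate file has the same digest (hypothetical helper).
sub all_identical {
    my @paths = @_;
    my %seen = map { md5_of_file($_) => 1 } @paths;
    return keys(%seen) == 1;
}

# Usage: perl compare.pl JDOE-Foo-Bar-1.23.tar.gz JQPUBLIC-Foo-Bar-1.23.tar.gz
if (@ARGV) {
    print all_identical(@ARGV) ? "identical\n" : "different\n";
}
```

If the digests match, it does not matter which author path the legacy reports end up attached to.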

Here is the list. The name-version pair is followed by an indented list of possible paths for that pair.

Attribute-Memoize-0.01
  DANKOGAI/Attribute-Memoize-0.01.tar.gz
  MARCEL/Attribute-Memoize-0.01.tar.gz
B-Generate-1.12_03
  JCROMIE/B-Generate-1.12_03.tar.gz
  JJORE/B-Generate-1.12_03.tar.gz
Bundle-Cobalt-0.01
  HARASTY/Bundle-Cobalt-0.01.tar.gz
  JPEACOCK/Bundle-Cobalt-0.01.tar.gz
CDDB-0.9
  FONKIE/CDDB-0.9.tar.gz
  KRAEHE/CDDB-0.9.tar.gz
Catalyst-Plugin-Session-Store-File-0.07
  ESSKAR/Catalyst-Plugin-Session-Store-File-0.07.tar.gz
  KARMAN/Catalyst-Plugin-Session-Store-File-0.07.tar.gz
Catalyst-Plugin-Static-0.05
  MRAMBERG/Catalyst-Plugin-Static-0.05.tar.gz
  SRI/Catalyst-Plugin-Static-0.05.tar.gz
Catalyst-Plugin-Static-Simple-0.14
  AGRUNDMA/Catalyst-Plugin-Static-Simple-0.14.tar.gz
  MRAMBERG/Catalyst-Plugin-Static-Simple-0.14.tar.gz
Crypt-SSLeay-0.51
  CHAMAS/Crypt-SSLeay-0.51.tar.gz
  TAKESAKO/Crypt-SSLeay-0.51.tar.gz
Curses-UI-0.72
  MARCUS/Curses-UI-0.72.tar.gz
  MMAKAAY/Curses-UI-0.72.tar.gz
Curses-UI-0.73
  MARCUS/Curses-UI-0.73.tar.gz
  MMAKAAY/Curses-UI-0.73.tar.gz
DateManip-5.20
  PHOENIX/DateManip-5.20.tar.gz
  SBECK/DateManip-5.20.tar.gz
Finance-Bank-HSBC-1.04
  BISSCUITT/Finance-Bank-HSBC-1.04.tar.gz
  MWILSON/Finance-Bank-HSBC-1.04.tar.gz
Finance-Bank-HSBC-1.05
  BISSCUITT/Finance-Bank-HSBC-1.05.tar.gz
  MWILSON/Finance-Bank-HSBC-1.05.tar.gz
Locale-Object-0.73
  EMARTIN/Locale-Object-0.73.tar.gz
  FOTANGO/Locale-Object-0.73.tar.gz
MARC-0.81
  BBIRTH/MARC-0.81.tar.gz
  ESUMMERS/MARC-0.81.tar.gz
MARC-1.13
  ESUMMERS/MARC-1.13.tar.gz
  PETDANCE/MARC-1.13.tar.gz
Mail-Thread-2.41
  RCLAMP/Mail-Thread-2.41.tar.gz
  SIMON/Mail-Thread-2.41.tar.gz
Math-MatrixReal-1.1
  ANDK/Math-MatrixReal-1.1.tar.gz
  STBEY/Math-MatrixReal-1.1.tar.gz
Maypole-Authentication-Abstract-0.6
  BOBTFISH/Maypole-Authentication-Abstract-0.6.tar.gz
  SRI/Maypole-Authentication-Abstract-0.6.tar.gz
Maypole-Config-YAML-0.1
  BOBTFISH/Maypole-Config-YAML-0.1.tar.gz
  SRI/Maypole-Config-YAML-0.1.tar.gz
Maypole-Loader-0.1
  BOBTFISH/Maypole-Loader-0.1.tar.gz
  SRI/Maypole-Loader-0.1.tar.gz
Maypole-Plugin-Authentication-Abstract-0.10
  BOBTFISH/Maypole-Plugin-Authentication-Abstract-0.10.tar.gz
  SRI/Maypole-Plugin-Authentication-Abstract-0.10.tar.gz
Maypole-Plugin-Component-0.05
  BOBTFISH/Maypole-Plugin-Component-0.05.tar.gz
  SRI/Maypole-Plugin-Component-0.05.tar.gz
Maypole-Plugin-Config-YAML-0.04
  BOBTFISH/Maypole-Plugin-Config-YAML-0.04.tar.gz
  SRI/Maypole-Plugin-Config-YAML-0.04.tar.gz
Maypole-Plugin-Exception-0.03
  BOBTFISH/Maypole-Plugin-Exception-0.03.tar.gz
  SRI/Maypole-Plugin-Exception-0.03.tar.gz
Maypole-Plugin-I18N-0.02
  BOBTFISH/Maypole-Plugin-I18N-0.02.tar.gz
  SRI/Maypole-Plugin-I18N-0.02.tar.gz
Maypole-Plugin-Loader-0.03
  BOBTFISH/Maypole-Plugin-Loader-0.03.tar.gz
  SRI/Maypole-Plugin-Loader-0.03.tar.gz
Maypole-Plugin-Relationship-0.03
  BOBTFISH/Maypole-Plugin-Relationship-0.03.tar.gz
  SRI/Maypole-Plugin-Relationship-0.03.tar.gz
Maypole-Plugin-Transaction-0.02
  BOBTFISH/Maypole-Plugin-Transaction-0.02.tar.gz
  SRI/Maypole-Plugin-Transaction-0.02.tar.gz
Maypole-Plugin-Untaint-0.04
  BOBTFISH/Maypole-Plugin-Untaint-0.04.tar.gz
  SRI/Maypole-Plugin-Untaint-0.04.tar.gz
Net-DNS-0.02
  ANDK/Net-DNS-0.02.tar.gz
  MFUHR/Net-DNS-0.02.tar.gz
Net-SSH2-0.07
  AWA/AWA/Net-SSH2-0.07.tar.gz
  DBROBINS/Net-SSH2-0.07.tar.gz
NetPacket-0.04
  ATRAK/NetPacket-0.04.tar.gz
  CGANESAN/NetPacket-0.04.tar.gz
PDL-2.3.2
  CSOE/PDL-2.3.2.tar.gz
  KGB/PDL-2.3.2.tar.gz
PNGgraph-1.11
  DMOW/PNGgraph-1.11.tar.gz
  SBONDS/PNGgraph-1.11.tar.gz
POE-Session-Attributes-0.01
  CFEDDE/POE-Session-Attributes-0.01.tar.gz
  JSN/POE-Session-Attributes-0.01.tar.gz
Plucene-1.19
  SIMON/Plucene-1.19.tar.gz
  STRYTOAST/Plucene-1.19.tar.gz
RT-Extension-MergeUsers-0.02
  JESSE/RT-Extension-MergeUsers-0.02.tar.gz
  KEVINR/RT-Extension-MergeUsers-0.02.tar.gz
SNMP-1.6
  GSM/SNMP-1.6.tar.gz
  WMARQ/SNMP-1.6.tar.gz
SXIP-Membersite-1.0.0
  KGRENNAN/SXIP-Membersite-1.0.0.tar.gz
  TOKUHIROM/SXIP-Membersite-1.0.0.tar.gz
Scalar-Defer-0.13
  AUDREYT/Scalar-Defer-0.13.tar.gz
  NUFFIN/Scalar-Defer-0.13.tar.gz
Term-Prompt-0.02
  ALLENS/Term-Prompt-0.02.tar.gz
  DAZJORZ/Term-Prompt-0.02.tar.gz
Term-Prompt-0.05
  ALLENS/Term-Prompt-0.05.tar.gz
  DAZJORZ/Term-Prompt-0.05.tar.gz
Test-Warn-0.07
  BIGJ/Test-Warn-0.07.tar.gz
  MPRESSLY/Test-Warn-0.07.tar.gz
Time-0.01
  JPRIT/Time-0.01.tar.gz
  PGOLLUCCI/Time-0.01.tar.gz
Tk-Wizard-Bases-1.07
  LGODDARD/Tk-Wizard-Bases-1.07.tar.gz
  MTHURN/Tk-Wizard-Bases-1.07.tar.gz
UUID-0.03
  CFABER/UUID-0.03.tar.gz
  LZAP/UUID-0.03.tar.gz
Win32-EventLog-Carp-1.21
  IKEBE/Win32-EventLog-Carp-1.21.tar.gz
  RRWO/Win32-EventLog-Carp-1.21.tar.gz
YAML-0.39
  INGY/YAML-0.39.tar.gz
  KING/YAML-0.39.tar.gz
finance-yahooquote_0.19
  DJPADZ/finance-yahooquote_0.19.tar.gz
  EDD/finance-yahooquote_0.19.tar.gz
libapreq-1.33
  GEOFF/libapreq-1.33.tar.gz
  STAS/libapreq-1.33.tar.gz
pg95perl5-1.2.0
  MERGL/pg95perl5-1.2.0.tar.gz
  YVESP/pg95perl5-1.2.0.tar.gz
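
For anyone attempting the automated route, here is a minimal sketch of reading the list layout above into a hash and printing mapping-file lines for resolved entries. The sub name parse_candidates and the %chosen hash are illustrative assumptions; the one resolved entry is the example mapping line given earlier in the post, and the output column width is a guess at the mapping-file format.

```perl
use strict;
use warnings;

# Parse "name-version line, then indented candidate paths" into
# { 'YAML-0.39' => [ 'INGY/YAML-0.39.tar.gz', ... ], ... }.
sub parse_candidates {
    my ($text) = @_;
    my (%candidates, $current);
    for my $line (split /\n/, $text) {
        next unless $line =~ /\S/;
        if ($line =~ /^\s+(\S+)/) {        # indented: a candidate path
            die "path before any name-version line: $line"
                unless defined $current;
            push @{ $candidates{$current} }, $1;
        }
        elsif ($line =~ /^(\S+)/) {        # flush left: a new name-version pair
            $current = $1;
            $candidates{$current} = [];
        }
    }
    return \%candidates;
}

# A two-entry excerpt of the list, to keep the sketch self-contained.
my $list = <<'END';
YAML-0.39
  INGY/YAML-0.39.tar.gz
  KING/YAML-0.39.tar.gz
Time-0.01
  JPRIT/Time-0.01.tar.gz
  PGOLLUCCI/Time-0.01.tar.gz
END

my $candidates = parse_candidates($list);

# Resolutions found so far (only the post's own example is filled in).
my %chosen = ('YAML-0.39' => 'INGY/YAML-0.39.tar.gz');

for my $dist (sort keys %$candidates) {
    if (my $path = $chosen{$dist}) {
        printf "%-12s %s\n", $dist, $path;
    }
    else {
        my $count = scalar @{ $candidates->{$dist} };
        print "# unresolved: $dist ($count candidates)\n";
    }
}
```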

Alias's Journal: Why Ruby is prettier and Padre changes the Perl community

Why is PHP so much easier for newbies?

Why does Java have the best IDE tools?

Why is Ruby prettier than Perl?

Why does Perl have the best package repository?

As I've played through Mass Effect 2 over the last few weeks, I see some interesting parallels.

In the Mass Effect universe, human technology is bootstrapped by the discovery of an ancient abandoned alien observation outpost on Mars, and the further discovery that the dwarf planet Charon is really an abandoned but active interstellar jump gate covered in ice.

Other similar species have done the same, resulting in a galactic community of around a dozen civilisations all based around the same basic technological underpinnings.

Despite these civilisations believing a recently (50,000 years) extinct civilisation built the gates, it turns out the technology is perhaps millions of years old.

Every 50,000 years, the synthetic AI race that built them returns from hiding in intergalactic space to wipe out all of the existing advanced species based on "their" technology, and reset the galaxy for the next set of civilisations to rise.

In a conversation between the game's protagonist and one of these old AIs, we are lambasted by the AI for taking the shortcut on their technology. The jump gates and other technology are left in place intentionally, so that each new generation of civilisations takes a controlled and predictable development path, making it easier to destroy them.

The AI posits that it is the overcoming of adversity on your own that drives true technological advancement, and that easy routes make you (technologically) weak.

I think you can see something similar in the development of the different programming languages.

Java is long and wordy, taking a long time to type. The need to work around this limitation resulted in the proliferation of powerful IDEs, resulting in the annual 20-million-line Eclipse release train.

PHP as a web language would have been stillborn if it didn't deal competently and quickly with the need to easily deploy code, the result of which is that you can effortlessly just change .html to .php, add a hello world tag, and upload via FTP as normal (something Perl still can't do well).

Python's need to gain mindshare against an entrenched Perl led to a huge focus on being easy to learn, to a simplification of the language, and to hugely popular things such as the PyGame library and game competitions.

Faced with the lack of a truly great package repository, and with a web-heavy community, Ruby became the "prettiest" language. Creating an elegant website is both expected and required if you are going to gain mindshare for an idea.

And Perl's messy syntax and difficulties in the area of maintaining large codebases, combined with a pragmatic sysadmin-heavy community, resulted in an unmatched packaging system that allowed code to be maintained in small pieces, with enormous volumes of support infrastructure around it.

The ease of publishing and the trend to smaller packages that the CPAN allowed conversely mean that the Perl community has never really had the need for pretty and elaborate websites, and the smaller package size means that we lack the giant headline libraries that make the payoff from website work better.

Our bias towards a pragmatic tech-savvy sysadmin userbase means we haven't really provided anything like the focus on learnability that has driven Python's gradual dominance in the mindshare of the young. It takes a certain rigour in your prioritisation to intentionally remove and dumb down existing powerful features so that the language is easier to learn.

Even for Strawberry, which focuses on the userbase with the lowest traditional knowledge, we intentionally have the smallest and most maintainable website possible and we don't even have the kind of introductory screencasts that we really really need (which should be easy but which I never seem to find the time to do).

If you throw a bunch of Perl coders against some PHP coders in a website competition, it is not unexpected that when both sides play to their strengths you will see something like http://geo2gov.com.au/html?location=e.g.+1+Oxford+Street from the Perl coders and something like http://www.hackdays.com/knowwhereyoulive/postcodes/view/2000 from the PHP coders.

The former required a massive amount of data extraction, transformation, aggregation, a gigabyte-sized PostGIS database, and deployment via a Linux virtual appliance to Amazon EC2 to allow for strategic load-shedding.

The latter required the ability to turn data into presentable and understandable information for real humans, and to make it pretty enough that they WANT to look at it.

Driving true technological progress, then, may often be about identifying weaknesses that are hard to solve but aren't completely impossible (and don't have any crippling long-term conceptual flaws at an economic or project-management level).

The three best projects I have driven - PPI, Strawberry, and (in part) Padre - all share this property. All three of these represent hard but not impossible problems, and require an awareness about which issues are intractable and which issues merely exist because there's been no need to solve them any better.

Padre in particular has suffered greatly from issues with Wx quality and threading. But given the low takeup of both threading and Wx it was reasonable to move forward on the basis that these would be fixed once there was something depending on them, and driving a need to fix them.

All of our early problems are gone now, and there is continued pressure to find ways to improve our use of (and the efficiency of) Perl's native ithreads.

Similarly, the creation of Strawberry required a lengthy year-long process of fixing Win32 bugs in all kinds of toolchain and low level modules, because we'd never had a proper working developer feedback loop before.

Similarly, Perl's current push for marketing and blogging and websites results directly from Python's success in mindshare capture.

So my question for you to ponder this week is the following:

What can you see that Perl as a whole struggles to do well, for which a good solution is not impossible, and which is only being held back by smaller problems that would go away if a working candidate solution were put in place that needed those small problems solved?

Perl Foundation News: YAPC::NA 2010 Update

The YAPC::NA 2010 website has been up for a while, but it is now officially integrated with Act. Be sure to visit soon, create an account if you don't already have one, and start getting ready for the conference.

Remember that the conference is June 21st through 23rd 2010 at Ohio State University. Registration will be $100 ($90 early-bird), with day-passes available.

Modern Perl Books, a Modern Perl Blog: Chunking and Programming Languages

Some of my biases are transparent. For example, I believe that many of the complaints of Perl's "unreadability" are from people who've never bothered to learn how to read the language. You often see this from people who say "Sigils? Pfft. They're useless—mere syntactic noise!"

Linguists may disagree.

One of the early inventions in written language was punctuation. In specific, adding spaces between words (and even vowels, in some languages... yes, my history studies have come in useful while programming) makes documents easier to read. The same goes for punctuation. It's easy enough to write sentences with ambiguous meanings, depending on where you put a comma to delineate logically separate clauses. (Languages with greater riches of declensions and tenses and numbers and other forms are more flexible in word order, but they do retain some degree of poetic license. It's not all meter and rhyme scheme however.)

The basic idea behind all of these ancient inventions is that "Communicating is difficult enough without verbal and body language cues. Making different things look different helps."

To read source code, you have to be able to identify nouns and verbs. You have to be able to group related items and ideas while not grouping unrelated ideas. You need to be able to identify separate expressions as well as idioms.

One reason assembly language can be difficult to read is that its regularity (op arg1, arg2 or op arg1, arg2, arg3) precludes skimmability. That may sound odd (if you're reading code, why do you need to skim it?), but it's important. Programming encompasses so many small details that you must understand the code in the small in the context of the local component as a part of the system as a whole.

Uniformity of syntax means that you have to rely on cues external to the source code or patterns of repeated details within the source code to indicate structure.

I have the same problem reading Lisp code, with its homoiconicity; the shape of the code gives me few cues as to what's different between sections of code. As well, Python's use of vertical whitespace to end blocks means that my eyes slip off of the end of logical blocks and I can't tell what happens where.

A lot of that is familiarity and personal preference (or quirks of the way my brain works). Some of that is the effect of deliberate design decisions.

If you embrace the idea, like Perl does, that different things should look different, you reach some interesting conclusions. I don't think you can learn Perl effectively without understanding those conclusions, at least at an intuitive level. I'll write about that next time.

Perl Hacks: Building RPMs from CPAN Distributions

Regular readers will know that in the past I've shown some interest in building RPMs from CPAN distributions. It's been a while since I did much work in this area (although I do still release the occasional module to my RPM repository).

Over the weekend I was at FOSDEM and I attended Gabor's talk on packaging CPAN modules for Linux distributions. This has rekindled my interest in this area and I spent most of the train journey back from Brussels hacking around the area.

There's one thing that has been bothering me in particular recently. The standard RPM building mechanism (or, at least, the way it's configured in Fedora and CentOS) does something incredibly brain-dead when trying to work out what other modules the current module depends on. It does it by parsing the source code and looking for "use" statements. This means that a module that might only be used in really obscure cases is going to be listed as a mandatory requirement for your module.

Gabor and I actually saw an example of this over the weekend when the Fedora packaging team raised a bug against Padre because it requires Win32::API. Padre, of course, only uses Win32::API when being used on Windows. And for that reason Win32::API is not listed as a dependency in its META.yml.

And that's, of course, where the RPM builders should be going to get a list of dependencies. META.yml contains the list of other modules that the author wants the module to depend on. This should be seen as the definitive list. Of course, there might be errors in that list - but that should be addressed by raising a bug against the module.
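
To make the difference concrete, here is a rough sketch comparing a naive scan for "use" and "require" lines with the "requires" section of a META.yml. Everything in it is made up for illustration: the My::Editor source, the META.yml text and the scanning regex are stand-ins, not the actual RPM script or Padre's real files. It relies on CPAN::Meta::YAML, which ships with Perl from 5.14 onwards.

```perl
use strict;
use warnings;
use CPAN::Meta::YAML;    # core module since Perl 5.14

# Hypothetical module source with a platform-specific dependency.
my $source = <<'END';
package My::Editor;
use File::Spec;
if ($^O eq 'MSWin32') {
    require Win32::API;
}
1;
END

# Hypothetical META.yml for the same distribution.
my $meta_yml = <<'END';
---
name: My-Editor
version: '0.01'
requires:
  File::Spec: '0'
END

# Naive scan: every capitalised module named after "use" or "require".
my %scanned;
$scanned{$1} = 1 while $source =~ /^\s*(?:use|require)\s+([A-Z][\w:]*)/mg;

# What the author actually declared.
my $meta     = CPAN::Meta::YAML->read_string($meta_yml)->[0];
my %declared = %{ $meta->{requires} || {} };

for my $module (sort keys %scanned) {
    my $status = exists $declared{$module}
        ? 'declared in META.yml'
        : 'found by source scan only';
    print "$module: $status\n";
}
```

The scan has no way of knowing that the second module is only loaded on one platform, which is exactly the kind of spurious requirement described above.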

I've poked at this problem a few times, trying to work out how the RPM system parses the code and trying to replace that with code that looks at META.yml instead. But the RPM system uses a baroque system of interdependent macros and eventually they all lead to a piece of rather clunky Perl code. So each time I've approached this problem, I've backed off again.

The problem became more urgent when I wanted to package Plack for Fedora. Plack supports all sorts of hosting environments and therefore includes "use" statements loading a number of modules that most people will never use. Fedora includes Apache2, so Apache::Request (which is for Apache1) will never be available. It's not listed in META.yml, but it is used by one of the modules. The RPM build system was therefore insisting that it should be present. An impasse was reached.

Then I decided to turn the problem on its head. RPM building has two steps. You create a spec file for the RPM and then you build the RPM using the spec file and your original tarball. I started wondering if I could ensure that the spec had all of the requirements (from the META.yml). Once I'd done that I would only need to find some way to turn off the RPM build system's default behaviour.

People packaging CPAN modules for Fedora (and CentOS) use a program called 'cpanspec' to generate spec files. I started digging into the code there in order to find out how to insert the list of correct dependencies.

Only to find that it has already been done. cpanspec is already doing the right thing and generating a list of 'Requires' statements from the data in META.yml.

Then all I needed to do was to see if I could turn off the (broken) default RPM build behaviour which was adding spurious extra dependencies. That proved to be easy too. It's just a case of adding %__perl_requires %{nil} to your .rpmmacros file.

So now all of my RPMs will have only the correct dependencies listed. This makes me very happy.

I suppose I should go back and rebuild all of the older ones too.

Oh, and because I've worked out a really easy way to generate this - here's a spreadsheet listing which CPAN modules are available as RPMs for Fedora. I plan to keep this list up to date (and make it much longer). [Link now fixed]

p.s. More about my trip to FOSDEM and the Perl marketing push there over the next couple of days.

Perl Foundation News: Grant Proposal: Fixing Perl5 Core Bugs

David Mitchell has submitted a grant proposal, which if accepted would make use of a portion of the funding generously provided to TPF by Booking.com.

Before the Board votes on this proposal we would like to get feedback and endorsements from the Perl community. Please leave feedback in the comments or send email with your comments to karen at perlfoundation.org.

Grant Title: Fixing perl5 core bugs

Name: David Mitchell

Amount Requested: $25,000

Synopsis

Recently, booking.com donated $50K for the "further development and
maintenance of the Perl programming language". I would like part of that
money to be used to fund me for approximately six months to devote 50%
of my time fixing "hard" core perl5 bugs.

Benefits to the Perl Community

There are currently approximately 1200 open and 300 new bug reports in the
perl5 bug queue. Although some of these are of the "5.003_08 does not
build on platform X" variety, many are current: for example, almost 500 of
them were created after the release of 5.10.0. As the perl core has become
more and more gnarly, and the pool of experienced but active core hackers
has declined, these bugs are just piling up and not getting fixed,
especially the hard ones. With this funding, I would be able to
devote serious time and effort to making a dent in this queue.

Note that unlike many large open source projects, perl has no paid
developers devoted to bug fixing.

Deliverables

Unusually for a TPF grant, there are not clear-cut deliverables for this
project. I intend to devote 500 hours of my time over the next six months
fixing perl core bugs. The net result will be a list of bug numbers that
have been diagnosed, and (hopefully) fixed. Because it's impossible to
predict in advance how difficult a bug is going to be to diagnose and fix
(or indeed whether it is even fixable), I can't commit in advance to a
fixed list of bugs that I will fix over the course of the grant. Nor is it
realistic to have a bounty per fixed bug; I would end up not getting
rewarded for time spent on difficult bugs, and conversely I would have a
strong incentive to cherry-pick easy bugs, defeating the purpose of the
grant.

Therefore, monitoring of my progress will become important (see below).

Project Details

I think this has been fully covered above.

Inch-stones

Note that due to the length and scale of this project, it is suggested
that there be two project managers, who can spread the monitoring load
between them as they see fit.

Since this project is heavily based on hours worked and the monitoring
thereof, I would post a weekly summary on the p5p mailing list which
details, for each bug worked on that week, how many hours were spent on
diagnosis and fixes, plus any bug status changes. This frequent feedback
would allow the grant managers and active core developers (who will be
aware of any recent commits and other activity of mine) to observe whether
my claimed hours bear any relation to actual activity and results, and
thus allow early flagging of any concerns.

Missing two weekly reports in a row without prior notice would be grounds
for terminating the project.

Once per calendar month I would claim an amount equal to $50 x hours worked.
I would issue a report similar to the weekly ones, but summarizing the
whole month. The report would need to be signed off by one of the
project managers before I get paid. Note that this means I am paid
entirely in arrears.

At the time of my final claim, I would also produce a report summarising
the activity across the whole project period.

Also, (the "nuclear option"), I suggest that either of the project managers
be allowed, at any time, to inform the board that in their opinion the
project is failing badly, and that the TPF board may then, after allowing
me to present my side of things, vote on whether to terminate the project
at that point (i.e. to not pay me for any hours worked after I was first
informed that a manager had "raised the alarm").

To ensure that there are at least some visible results for the hours
spent, I would be required to have closed at least one bug per 20 hours
before being able to claim money for those hours. (I would hope to close
more bugs than that, but by setting a low baseline, I'm not tying my
hands, while still allowing TPF to have something visible for publicity
purposes during the interim.)

Project Schedule

I am available to start work on this project immediately.

The project is expected to take six months. I am self-employed, which
allows me a good deal of flexibility. By promising approximately 50% of my
time, this gives me the ability to continue with my existing commitments
to other clients, while deferring seeking new clients. As such, the weekly
hours I devote to perl are likely to be highly variable, but hopefully
averaging out to about 20 hours per week. If for some reason I find that I
have spent less than 500 hours at the end of the six months, then I will
continue the project until the 500 hours have been spent, with the
proviso that the TPF board are free to terminate the project at any
time after the six months. Conversely, if I manage to devote more than 20
hours per week, then my monthly payments will be accordingly larger, and
the project will terminate early (once the 500 hours are spent).
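
As a quick sanity check of the figures above, here is the arithmetic as a short sketch. The hourly rate, total hours, weekly average and bug threshold are taken from this proposal; the minimum bug count is only what follows if all 500 hours are claimed, not a stated commitment.

```perl
use strict;
use warnings;

my $rate_per_hour  = 50;     # "$50 x hours worked"
my $total_hours    = 500;    # hours pledged over the project
my $hours_per_week = 20;     # hoped-for weekly average
my $hours_per_bug  = 20;     # at least one closed bug per 20 claimed hours

my $total_cost   = $rate_per_hour * $total_hours;     # matches the $25,000 requested
my $weeks_needed = $total_hours / $hours_per_week;    # just under six months
my $min_bugs     = $total_hours / $hours_per_bug;     # inferred floor, not a promise

printf "Total cost:   \$%d\n", $total_cost;
printf "Weeks needed: %d\n",   $weeks_needed;
printf "Minimum bugs: %d\n",   $min_bugs;
```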

Note that it is currently my intention that after the six months I will
apply for a further $25K extension, although there is no obligation for me
to do so, nor for TPF to approve it.

Bio

I'm a freelance UNIX sysadmin and programmer living in the UK. I have been
using perl since 1993, and have been fixing core perl 5 bugs since 2001.
I have had commit rights since 2003 and I was pumpking for the 5.10.1 perl
release.

In short, I am one of only a handful of active people who understand large
parts of the perl internals and who can thus fix "hard" bugs.

Ovid: Unhappy with @INC

Ever since I upgraded to Snow Leopard, my Perl has been unstable. I've fixed most of it, but I sometimes have strange behavior. Today I discovered why (and why didn't I notice this sooner?)

@INC:
/System/Library/Perl/5.10.1/darwin-2level
/System/Library/Perl/5.10.1
/Library/Perl/5.10.1/darwin-2level
/Library/Perl/5.10.1
/Library/Perl/5.10.0
/Library/Perl/5.8.9
/Library/Perl
/Network/Library/Perl/5.10.1/darwin-2level
/Network/Library/Perl/5.10.1
/Network/Library/Perl

Great. A bunch of older Perls have somehow made their way into my @INC. Time to recompile.
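
A small sketch of how one might spot this automatically: compare the version number embedded in each @INC path with the running interpreter's version. The sub name mismatched_paths is made up, and the path pattern assumes versioned directories that look like the ones above. Older version directories can also be in @INC deliberately, so this only flags entries worth a second look.

```perl
use strict;
use warnings;

# The running interpreter's version as a dotted string, e.g. "5.10.1".
my $running = sprintf '%vd', $^V;

# Return the paths that name a Perl version other than $version.
sub mismatched_paths {
    my ($version, @paths) = @_;
    return grep {
        my ($found) = m{/(5\.\d+\.\d+)(?:/|$)};
        defined $found && $found ne $version;
    } @paths;
}

print "Running perl $running\n";
print "Suspicious \@INC entry: $_\n" for mismatched_paths($running, @INC);
```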

Perlbuzz: Help keep the world safe from SQL injection

A while back, I put up bobby-tables.com as a repository for showing people the right way to handle external data in their SQL calls. Whenever someone pops up on a mailing list or IRC and they're building SQL statements using external tainted data, you can just refer them to the site.

In the past few days, I've spiffed up the site (with design help from Jeana Clark) and added pages on Perl and PHP. I need more examples, though. It's 2010, and there's no reason anyone shouldn't know about parameterized SQL calls.
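
For anyone who hasn't seen one, here is a minimal sketch of the difference between building SQL from tainted input and using a parameterized call. The table, column and input string are invented for illustration. The sub at the end uses DBI's prepare and execute with a placeholder but is never called here, so the sketch runs without a database; it assumes you pass it an already connected handle.

```perl
use strict;
use warnings;

# Hostile input, invented for this example.
my $name = "Robert'); DROP TABLE Students;--";

# Unsafe: the input becomes part of the SQL text itself.
my $unsafe_sql = "INSERT INTO Students (name) VALUES ('$name')";

# Safer: the SQL text is fixed and the input travels separately as a bind value.
my $safe_sql = 'INSERT INTO Students (name) VALUES (?)';
my @bind     = ($name);

print "Unsafe SQL: $unsafe_sql\n";
print "Safe SQL:   $safe_sql\n";
print "Bind value: $bind[0]\n";

# With a connected DBI handle, the parameterized form would be run like this.
sub insert_student {
    my ($dbh, $student_name) = @_;
    my $sth = $dbh->prepare($safe_sql);
    $sth->execute($student_name);
    return;
}
```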

The site source is hosted on github, so if you have any contributions, please fork it and let me know about your applied changes, or you can email me directly.

Thanks!

P.S. In the next few days, I hope to fire up some redesign on perl101.org, too.

Perl Foundation News: 2010Q1 Grant Proposals

For this quarter, TPF has three grant proposals that were not funded in the 2009Q3 round and that will be discussed and voted on again in this round, and four new grant proposals.

Please take some time to comment on these proposals. The TPF Grants Committee is very interested in community feedback on these projects' relevance. Please be polite.

Perl Foundation News: 2010 Grant Proposal: Enhancing Perl 6 Pattern Matching

Enhancing Perl 6 Pattern Matching with Ideas from Snobol4 and Other Sources

Name:

Morris M. Siegel, Ph.D.

Email:

[hidden email]

Amount Requested:

$3000 (negotiable)

The substance of my proposed alternative pattern-matching specification has already been essentially worked out as a self-funded research project, as it were. I felt this project was of such importance that it was worthwhile giving myself a sort of sabbatical to develop it. It has taken rather longer than I originally anticipated, and my personal funds have dropped to an uncomfortably low level. I can no longer afford to keep concentrating my efforts on this project on an unfunded basis.

My grant request is intended to retroactively fund some of my past development, and as well to enable me to continue focusing my efforts on the project. (The main task remaining is to write it up carefully and precisely, but there are some aspects that still need to be thought out.) Without a grant I would have to relegate the project to my limited spare time, and by the time it would be done it could well be too late for serious consideration.

I selected the figure of $3000 since I understand it is the upper range of typical Perl Foundation grants. I think the merits of the project, plus its time requirement (past and future), would justify a larger sum if the money is available. In addition, a larger sum would enable me to spend more time on fleshing out those ideas that still need to be thought out.

Synopsis

One of the chief reasons for Perl's popularity is its regex pattern-matching facility; no other part of Perl has been made into a stand-alone package (PCRE) or borrowed so extensively by other languages. The very name "Perl" alludes to the fundamental nature of pattern matching: the "E" of the acronym "PERL" stands for "extraction," which mostly means pattern matching. Perl 6 pattern matching is substantially more powerful than that of Perl 5.

Snobol4 is arguably the first widely-available language providing a pattern-matching facility, and despite its age, and despite all the new features of Perl 6, there are still some aspects in which Snobol4 pattern matching is more powerful than that of Perl 6.

Aside from the above, the current specification for Perl 6 pattern matching, Synopsis 05 (available as http://svn.pugscode.org/pugs/docs/Perl6/Spec/S05-regex.pod, http://perlcabal.org/syn/S05.html, or http://perl6.cz/wiki/Synopses/S05), is quite complicated to learn and remember, as is evident simply by reading through all of S05. Moreover, many of its multitudinous capture mechanisms lead to code which is brittle, hard to read and maintain, and often non-mnemonic. These complications and problems do not conform to the practical usability which is supposed to be a hallmark of Perl (the "P" of the acronym), and are not a necessary price that must unfortunately be paid for the power.
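
As a loose illustration of what brittle captures can look like, here is a Perl 5 sketch (not Perl 6, and not one of the proposal's own examples) in which adding one group silently changes what a positional capture refers to, while named captures (available since Perl 5.10) stay put. The sample string and capture names are invented.

```perl
use strict;
use warnings;

my $line = '2010-02-09 perl';

# Positional captures: year, month, day in positions 1-3, the word in 4.
my @v1 = $line =~ /^(\d+)-(\d+)-(\d+) (\w+)$/;

# Wrapping the date in one extra group shifts every later position by one,
# so code that read the word from position 4 now gets the day instead.
my @v2 = $line =~ /^((\d+)-(\d+)-(\d+)) (\w+)$/;

# Named captures do not move when groups are added around them.
my %named;
if ($line =~ /^(?<date>(?<y>\d+)-(?<m>\d+)-(?<d>\d+)) (?<word>\w+)$/) {
    %named = %+;
}

print "v1, fourth capture: $v1[3]\n";
print "v2, fourth capture: $v2[3]\n";
print "named 'word':       $named{word}\n";
```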

The purpose of this project is to formulate an alternative specification for Perl 6 pattern matching that is (1) enhanced by ideas inspired by Snobol4 and other sources but adapted to Perl's idiom, (2) simpler to learn and use, and leading to code which is easier to read and maintain, and (3) at least as powerful, and arguably more powerful, than the current specification.

Benefits to the Perl Community

If pattern matching is enhanced as indicated by the preceding paragraph, then potentially all Perl programmers needing to do non-trivial pattern matching will benefit.

In addition, the very acceptance of Perl 6 in the wider computing community could well be facilitated, since I think it quite probable that many would be put off by the complexity of the current pattern-matching specification. Along these lines, enhanced pattern matching would be more likely to inspire the adaptation of PCRE to Perl 6 and similar imitation by other languages, and thereby benefit not just the Perl community but the entire computing community.

I am well aware that much work has already been done in the implementation of the current specification, and that much development has been done based on the current specification, notably Larry Wall's STD grammar for Perl 6, and also the grammars for the various Parrot-based languages. As such, it is admittedly bold to suggest at this late stage that the specification be significantly revised. However, I believe the advantages afforded by the alternative specification warrant its serious consideration. (In a conversation I had once with Larry Wall, he stated that although he does not agree with all my ideas, he finds it worthwhile to listen to them.)

At YAPC|10 in Pittsburgh, on Jun 24, 2009, I gave a talk entitled "Enhancing Perl 6 Pattern-Matching with Ideas from Snobol4" (http://yapc10.org/yn2009/talk/1988), whereby I intended to present my ideas to the Perl community and get feedback. Unfortunately, I did not time my talk well, and by the time I finished presenting an overview of Snobol4 (to provide background for my ideas) and a sampling of problems with the current specification (to justify revising it), there was not enough time left to actually explain the alternative specification. Conversations with other YAPC participants did reveal interest in hearing my ideas. In particular, in discussions with Patrick Michaud, who is the chief if not sole implementer of the current specification, he (1) acknowledged that the core Perl 6 developers realize that S05 is hard to read (Larry Wall confirmed this), (2) complimented me on my examples illustrating the brittleness and other problems of the current capture mechanism, and (3) stated that if an alternative specification were even better than the current one, he would be happy to implement it.

Technically it would be possible for both the current and the alternative specifications to coexist in the same implementation, so on-going Perl 6 development efforts could proceed unimpeded while the alternative specification was being implemented and refined. If at some point it were decided to actually replace the current specification with the alternative (which is the ultimate intent), I believe the conversion of existing code should not be too laborious, so the initial release of Perl 6 would not be unduly delayed. There is some precedent for this, viz. the two different threading models of Perl 5.

Deliverables

This project should initially result in a document published on the Web presenting (1) an overview of the relevant parts of Snobol4, to help motivate the Snobol4-inspired features of the alternative specification, (2) a discussion of problems with the current specification, and (3) the alternative specification, in fairly complete detail.

After publication, a notice would be emailed to the appropriate mailing lists (perl6-language, yapc, snobol, perhaps others) informing subscribers of the existence of the document and inviting feedback. Based on feedback, the alternative specification might be revised. After a few iterations of this, assuming sufficient interest expressed by the Perl community, the alternative specification should be stable enough to proceed to implementation and further refinement as appropriate.

Project Details

The alternative specification document would assume the reader knows Perl 5 and has read S05 at least cursorily, and would rely on S05 to provide details on those features common to the alternative and current specifications. However, the alternative specification has a sufficiently different flavor from the current one that the document would have to present many ideas from scratch, so it should be reasonably accessible even to someone who understands pattern matching conceptually but is unfamiliar with S05.

It is difficult to go into more detail on the content of the alternative specification document without summarizing it, which I feel is beyond the scope of a grant proposal. The "inch-stones" listed below are far too terse to give the reader any notion of the content. However, to provide some sort of glimpse of the content, I list the following features of Snobol4 pattern matching that are absent from Perl 6 as it stands now but would be present in the alternative specification:

(A) Compile time vs. build time vs. match time

The pattern structures used in Snobol4 pattern matching are not built at compile time. Rather, at run time, a pattern structure is built as a result of evaluating a pattern-valued expression; once built, this structure can be used to do pattern matching, either immediately or later on.

As a result of having two distinct run-time operations, pattern building and pattern matching, the Snobol4 programmer has the ability (1) to choose during which operation to bind the value of pattern components (e.g. LEN(N) vs. LEN(*N)), and (2) to define new pattern-matching functions in a convenient high-level manner, without having to resort to writing macros or low-level code.

Understanding this three-way distinction among compile time, build time, and match time is crucial. On one hand, a careless or novice programmer who conflates compile time and build time can inadvertently write a program that inefficiently reconstructs the same pattern numerous times (although to mitigate this an optimizing compiler can precompute constant patterns or subpatterns). On the other hand, this distinction encourages a mind-set and facilitates a programming style in which the programmer writes pattern-valued functions to effectively extend the language of pattern-matching expressions, since the execution of these functions takes place during build time and does not cost anything at match time. Writing such pattern-valued functions seems at least conceptually easier than writing macros, and the ability to do so enhances the expressiveness of the programming language.

If the equivalent distinction existed in Perl 6, then not only would the expressiveness of pattern notation be increased, but also some of the complexity of the core pattern-matching specification could be offloaded to modules that define pattern-valued functions or methods.
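A rough Perl 5 analogy may make the build-time/match-time distinction concrete. This is only a sketch and is not the notation of the alternative specification: the function names len_now, len_later, and then are invented for illustration, and the package variable $N stands in for the Snobol4 variable N. It relies only on qr// (pattern building at run time) and the (??{ ... }) construct (evaluation deferred until match time).

```perl
use strict;
use warnings;

our $N = 2;

# Build-time binding, in the spirit of LEN(N): $n is interpolated
# when the function runs, so the resulting pattern is fixed.
sub len_now {
    my ($n) = @_;
    return qr/.{$n}/;
}

# Match-time binding, in the spirit of LEN(*N): the code block is
# re-evaluated during each match and reads the current value of $N.
sub len_later {
    return qr/(??{ '.' x $N })/;
}

# A pattern-valued function: builds "p followed by q" from two patterns.
sub then {
    my ($p, $q) = @_;
    return qr/$p$q/;
}

# Build both patterns while $N is 2 ...
my $fixed    = then(qr/\A/, len_now($N));
my $deferred = then(qr/\A/, len_later());

# ... then change $N before matching.
$N = 4;
my ($got_fixed)    = 'abcdef' =~ /($fixed)/;
my ($got_deferred) = 'abcdef' =~ /($deferred)/;

print "$got_fixed $got_deferred\n";   # prints "ab abcd"
```

The functions len_now, len_later, and then all run at build time; only the code block inside len_later costs anything at match time.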

(B) Conditional capture

In the Snobol4 operation of conditional value assignment (binary "."), assignment ("capture," in Perl terminology) takes place only if the value is captured from a subpattern that is part of an ultimately successful match. That is, in the semantics of Snobol4 pattern matching, there is a distinct "conditional" phase following the successful conclusion of a match and prior to the substitution phase, in which conditional value assignments (which may include arbitrary side effects) are carried out. This phase, which currently has no analogue in Perl whatsoever, enables the Snobol4 programmer to write patterns that backtrack without having to undo side effects performed by alternands that initially succeed but are later backtracked out of. Although unrestricted backtracking can result in unacceptably slow performance, limited backtracking can be quite efficient, and reworking a pattern to avoid backtracking entirely can be tedious and result in code that is less natural. If Perl 6 had a similar conditional phase, then the programmer would no longer have to rid his patterns of backtracking in order to avoid performing inappropriate side effects. This would clearly facilitate the task of formulating patterns, especially complex ones.
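The problem with side effects under backtracking can be reproduced in Perl 5, together with a manual approximation of a conditional phase. The sketch below is an illustration under stated assumptions, not part of the proposal: the variable names are invented, and it relies on the behavior documented in perlre that a local() assignment made inside a (?{ ... }) block is undone when the regex engine backtracks past it.

```perl
use strict;
use warnings;

my @eager;           # side effects performed immediately
our $pending = [];   # side effects queued with local(), undone on backtracking
my @committed;       # queue copied once the whole pattern has matched

my $matched = 'ab' =~ m/
    (?:
        a (?{ push @eager, 'first';  local $pending = [ @$pending, 'first'  ]; }) c
      |
        a (?{ push @eager, 'second'; local $pending = [ @$pending, 'second' ]; }) b
    )
    (?{ @committed = @$pending; })
/x;

print "eager:     @eager\n";       # prints "eager:     first second"
print "committed: @committed\n";   # prints "committed: second"
```

The first alternative matches "a" and then fails on "c", yet its eager side effect survives. The queued version records only the alternative that took part in the successful match, which is what a built-in conditional phase would provide without this bookkeeping.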

(C) Miscellaneous primitive patterns

Snobol4 has some useful primitive patterns which cannot easily be emulated in Perl 6: TAB and RTAB, which move the cursor to a given position from the beginning or the end of a string, and (in the Snobol4+ dialect) ATAB, ARTAB, and LEN, which can move the cursor to the left as part of normal pattern matching (not as look-behind). Unlike (A) and (B) above, these do not reflect a fundamental difference between the Snobol4 and Perl 6 pattern-matching models, and thus could be included (if desired) even into the current specification.

A significant part of the challenge of adapting these features for inclusion in Perl 6 lies not merely in altering the notation to conform to the style of Perl 6, but rather in appropriately generalizing the features themselves to harmonize with the rest of Perl 6 pattern matching, and in particular to accord with Perl features absent from Snobol4 such as lexical scope.

Inch-stones and Project Schedule

Experience with my dissertation and other long papers I have written has shown that writing up even already-worked-out ideas is more time-consuming than one anticipates, so I have tried to be conservative in the timing estimates for the inch-stones listed below. The estimates are in units of work days and appear in curly braces following each inch-stone. The sum total comes to 46 work days, or (allowing for slippage) about 10 work weeks. Taking into account some personal obligations during this period, I believe the essential deliverable -- the initial specification document -- could be completed in three elapsed months, which could begin at once. How much time would be needed after that for revision would obviously depend heavily on the promptness, quantity, and content of feedback.

I mentioned above that although the ideas are essentially worked out, there are still some aspects needing further reflection. They are: (a) verifying conformity with the other Synopses (which is non-trivial, given how voluminous and dense the Synopses are); (b) fleshing out details of a possible Pattern role; and (c) providing additional examples of patterns written according to the alternate specification. These are the issues that, given a larger grant, could get extra attention. Even if the project is expanded to include them, the initial document would not be delayed: I think it important that the Perl community be able to begin considering the alternative specification as soon as feasible, and any expansion of the project could be done thereafter while feedback would be (hopefully) received and considered.

The inch-stones of the specification document are:--

I. Summary of salient parts of Snobol4 {2}

II. Problems with the current specification of Perl 6 pattern matching (S05); justification for considering an alternate specification {2}

III. The alternative specification

0. influence of prior work; disclaimer: possible Perl5ish spirit; perhaps could be simplified {.4}

1. terminology: "pattern", "special form", "subpattern", "subrule", "P6c", "P6a" {.3}

2. overview of model: data structure, with arbitrary embedded values, that acts like code during pattern matching {1}

2.1. pattern code vs. normal code (PE vs. NE [PNE, listNE, numNE]) {.2}

2.2. p{PE}, p/PE/, /PE/, p(@args):attrs{PE}; perhaps pat{PE} or pattern{PE}; rule, token {1}

2.3. incorporation (similar to Lisp quasiquotation) {.3}

2.4. compile time vs. build time vs. match time {.2}

2.5. named patterns (declared rather than assigned); build time at UNITCHECK/INIT/BEGIN {.2}

2.6. substantiation, persubstantiation {.2}

3. matching a string, :i etc., :approx (cf. agrep, TRE) {.4}

4. matching a Boolean (True, <null>, False, <fail>), <do> {.4}

5. matching a number, :fuzzy {.3}

6. matching a CharSet (O-O character classes -- <[ ... ]>; cf. Icon charsets) {1.3}

7. matching a [sub]pattern (primitive or composite) {.5}

8. matching a closure {.3}

9. parametrized patterns; <bind> {.5}

10. quantification with separation {.2}

11. scoping [sub]patterns: [ ... ], <LABEL>:[ ... ] {.5}

11.1. unique properties: $/ (pattern-local, not normal-local), emission, conditional emission {.5}

11.2. possible "minor scope": e.g. (:i PE) vs. [:i PE] {.1}

11.3. <my>, <state>, NORMAL:: {.3}

11.4. <abort>, undef {.1}

11.5. <commit>, :: {.2}

11.6. <emit>, @() or @EMIT; uncaptured emissions {.5}

11.7. <yield> {.3}

11.8. @($/.quant) {.2}

11.9. identities and other examples {.5}

12. :P5 {.2}

13. capture (of emission of scoping subpattern) {.5}

13.1. overview: data flow model, "~>" (if not ">>"), target lists, repetition, using coroutine logic to capture to next target not yet processed, :take(n) {2}

13.2. passive targets: scalars, arrays, slices, * {.5}

13.3. active targets: functions, code references, (perhaps) $*TAKE {.8}

13.4. active targets: plain blocks, pointy blocks, <do> {.8}

13.5. active targets: p{PE}, [PE], <named_pattern> {.8}

13.6. secondary targets, splitting and joining of data streams {.5}

13.7. chaining of subpatterns {.5}

13.8. examples {1}

14. conditional phase {.5}

14.1. "~>?": conditional capture {.5}

14.2. <do?>: conditional side-effect {.5}

14.3. <confirm> {.5}

14.4. behavior w.r.t. backtracking {.5}

14.5. examples {1}

15. <tell>, <seek>, <at>, :forward, :bidi {.5}

16. <reverse> {.2}

17. :decl (or :par or :parallel), :proc (or :seq or :sequential), (:proc) to establish sequence point {.8}

18. :canon, :quick {.5}

19. generic meaning of <name> and of <op args> {.3}

20. <before PE>, <after PE>, <!before PE>, <!after PE> {.5}

21. <eval>, :memo {.3}

22. {any @arr}, {cat @arr}, {all @arr}, :eval, :lazyeval, :memo {.6}

23. <to>, <from>, <(PE)> {.2}

24. <cut>; <subst> {.7}

25. Rationalized m, M, s {.3}

25.1. m {.4}

25.2. M {.1}

25.3. s {.3}

25.4. possible generalization of m {.3}

25.5. possible generalization of s {.3}

25.6. relation to "~~" {.1}

25.7. dwimmy laxity in placement of attributes for m, s, and p {.3}

26. OO interface {.2}

26.1 m {.4}

26.2 s {.3}

26.3 resumed matching after <yield> (coroutine-style) {.4}

27. :keepall {1}

28. :g -- top-level result is list/array of Match objects {.3}

29. <try>, <catch> {.6}

30. perhaps: <lazy PE> -- like {{ p{PE} }}, but when p{PE} is first evaluated it replaces (memoizingly) the closure { p{PE} } in the pattern structure. {.3}

31. <literal>, :eval {.4}

32. matching a Range {.3}

33. matching an arbitrary object: Pattern role (patternization method) {1}

34. summary of members of $/ {.5}

35. other notational differences {.1}

35.1. :sigspace should retain the colon (m:s, s:s, p:s). (If not, at least let m:s abbreviate to ms, not mm.) {.1}

35.2. {overlay(p,q)} instead of [p & q] or [p && q] {.2}

35.3. {juxta(a,b,c)} instead of [a ~ c b] {.3}

36. :panic {.1}

37. comparison of P6a with P6c; features of P6c not directly present in P6a -- i.e. handled differently or (like ~~ and <prior>) subsumed by other features {2}

38. more examples {3}

39. co-existence with current specification {.8}

40. concluding remarks {1}

Bio

I have a Ph.D. in Computer Science from Cornell University; my dissertation is entitled Proving Properties of Snobol4 Patterns. I have long been interested in regular expressions, context-free and other formal languages, and pattern matching.

As mentioned above, the ideas constituting my proposal are basically already worked out, and there was interest expressed by some participants at YAPC|10 in seeing them. As far as I know, no one else has proposed or intends to propose an alternate pattern-matching specification for Perl, so it would follow that I am the best person to do this.

Perl Foundation News: 2010 Grant Proposal: CPAN Reviews

CPAN reviews

Name:

Alexandr Ciornii ('chorny' on IRC and PAUSE).

Email:

[hidden email] (backup) [hidden email]

Amount Requested:

$1200

Synopsis

Many CPAN modules have good documentation; many have bad documentation. But there is no such thing as enough documentation. There are many good reviews, examples, and descriptions outside CPAN. I propose to collect and catalogue them.

I want to make a site with links to reviews of CPAN modules. In general this site should be community-moderated and community-edited, and should let users who add links do minimal work first and enhance them later, i.e. use the site as a bookmarking service.

Benefits to the Perl Community

Simplify learning CPAN modules for novices and experienced users alike: no need to scan Google search results, and the opinions of others show whether a review is worth reading.

Ability to store a list of useful links and share it with others.

Possibility of integrating a list of links into an author's own page.

Additional benefits:

Ready-made site code to copy and use for similar purposes.

Support for OpenID/Bitcard in CGI::Application.

Deliverables

Code of the web app (under an open license)

Working site

CPAN module supporting OpenID/Bitcard in CGI::Application.

After release I will continue to maintain and enhance the code and the site.

Project Details

I plan to develop it using CGI::Application. I will need to develop a CGI::Application::Plugin::Authentication plugin for OpenID/Bitcard.

Users will be able to vote a link up or down, report a spam or duplicate link, and comment. Every link will have a title and a description (only one of them will be mandatory), language, original date, tags, and a list of modules described in the review. After a link is added, some of this information will be fetched automatically, so the user will only need to edit it.

No user registration at all; OpenID/Bitcard only.

There would be ready-made JavaScript widgets for other sites:

  1. To display list of links for a module (sorted by popularity).
  2. To display number of links for a module.

They would be customizable by the language of the links; alternatively, the language list can be taken from HTTP headers. JSON output should also be available.
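One way the language selection and JSON output could fit together is sketched below. Everything here is an assumption made for illustration: the function names, the link fields (title, language, votes), and the choice to read the language list from the Accept-Language header. JSON::PP is used only because it ships with recent Perls.

```perl
use strict;
use warnings;
use JSON::PP;

# Parse an Accept-Language header into primary language tags, best first.
sub preferred_languages {
    my ($header) = @_;
    my @ranked;
    for my $item (split /\s*,\s*/, $header // '') {
        my ($tag, @params) = split /\s*;\s*/, $item;
        next unless defined $tag && length $tag;
        my $q = 1;   # default quality when no q= parameter is given
        for my $param (@params) {
            $q = $1 if $param =~ /\Aq=([0-9.]+)\z/;
        }
        push @ranked, [ lc((split /-/, $tag)[0]), $q ];
    }
    my %seen;
    return grep { !$seen{$_}++ }
           map  { $_->[0] }
           sort { $b->[1] <=> $a->[1] } @ranked;
}

# Keep links in an acceptable language, most popular first.
sub links_for_widget {
    my ($links, $header) = @_;
    my %ok = map { $_ => 1 } preferred_languages($header);
    return [ sort { $b->{votes} <=> $a->{votes} }
             grep { $ok{ $_->{language} } } @$links ];
}

my @links = (
    { title => 'A', language => 'en', votes => 3  },
    { title => 'B', language => 'ru', votes => 9  },
    { title => 'C', language => 'fr', votes => 20 },
    { title => 'D', language => 'en', votes => 5  },
);

my $widget = links_for_widget(\@links, 'ru, en-US;q=0.8, de;q=0.5');
print join(', ', map { $_->{title} } @$widget), "\n";   # prints "B, D, A"
print JSON::PP->new->canonical->encode($widget), "\n";
```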

The site would be able to get a list of links from RSS feeds by tag (I propose "cpanreview", but this will be discussed with the Perl community). Tags like "cpanreview-Module::Name" or "cpanreview-Dist-Name" would also add an association with a module. Unassociated links would be displayed separately on a special page for anyone who would like to review them.

Any user with a sufficient number of upvotes (for example, 2) would be able to modify the title, description, or module list of a link. The required number of votes should be configurable for every operation.

Later I want to ask the owners of http://search.cpan.org and http://kobesearch.cpan.org/ to include links to the corresponding pages on their module pages.
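The tag convention could be interpreted along the following lines. This is a sketch only: the function name and the returned structure are invented, and it assumes that a name containing "::" is a module name while anything else (including an ambiguous single word) is treated as a distribution name.

```perl
use strict;
use warnings;

my $PREFIX = 'cpanreview';   # the tag proposed in the text; it may change

# Returns nothing if the tags do not mark a review at all; otherwise a
# hashref listing the associated modules and distributions (possibly none).
sub associations_from_tags {
    my (@tags) = @_;
    my %assoc = (modules => [], dists => []);
    my $is_review = 0;
    for my $tag (@tags) {
        if ($tag eq $PREFIX) {
            $is_review = 1;
        }
        elsif ($tag =~ /\A\Q$PREFIX\E-(.+)\z/) {
            $is_review = 1;
            my $name = $1;
            my $kind = $name =~ /::/ ? 'modules' : 'dists';
            push @{ $assoc{$kind} }, $name;
        }
    }
    return unless $is_review;
    return \%assoc;
}

my $found = associations_from_tags(
    'perl', 'cpanreview-Devel::NYTProf', 'cpanreview-Dist-Zilla');
print "modules: @{ $found->{modules} }\n";   # prints "modules: Devel::NYTProf"
print "dists:   @{ $found->{dists} }\n";     # prints "dists:   Dist-Zilla"
```

A link tagged only with "cpanreview" yields two empty lists, which corresponds to the unassociated links shown on the special page.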

Github will be used for hosting code.

Inch-stones

  1. OpenID/Bitcard plugin
  2. Adding links
  3. Automatic fetching of data about a link added by a user (title, modules mentioned)
  4. Voting/spam/dupe
  5. Comment system
  6. Community editing
  7. JavaScript output, export to JSON
  8. RSS fetching
  9. Refactoring based on the Perl community's opinion of the working version.

Project Schedule

I will begin work immediately, with 10-15 hours a week. A first version with reduced capabilities should be available in 1.5 months, and the full version in 3 months.

Bio

I'm a Perl programmer from Moldova (Europe). I've been working in Perl since 2000 and joined the Strawberry Perl project in 2006. I'm an active member of the Perl community; I maintain 18 modules on CPAN, and several more are planned for release next month. I have contributed a large number of patches to Perl modules, including CGI::Application plugins, ExtUtils::MakeMaker, and Module::Install.

Perl Foundation News: 2010 Grant Proposal: Perl Compiler

Perl Compiler

Name: Reini Urban
Email: [hidden email]
Duration: Until March 2011
Amount Requested: € $1000 (just for motivation)

Synopsis

Fix most of the remaining perl compiler, i.e. B::C, B::CC, B::Bytecode bugs.

Improve documentation a bit.

Maintain the planned compiler.perl.org site.

Benefits to the Perl Community

A working compiler.

Faster startup time.

Optionally faster run-time if the B::CC optimizations work out as expected.

parrot? I've worked with them. I gave up. Better a half-ass perl5 compiler now, than the ongoing ... with parrot/perl6.

Deliverables

Extend the testsuite reasonably - but less is more. The full author tests for all tested perls on all platforms need 2 days.

Fix the existing SKIPs and TODOs

Testsuite passing on my main platforms cygwin, MSWin32, debian5, centos5, freebsd7, solaris10.

compiler.perl.org

More fun, less headaches

Inch-stones

I don't think I need this.

See below at Project Schedule. From now until end of March 2011, the next surf-season.

Project Details

I successfully ported the abandoned compiler to 5.10 and blead and fixed most of the old bugs, so that the tests pass now on most platforms.

But there's more to do. Finding bugs cannot be detailed here. In the core suite are some, in the top100 modules are some, the community will come up with more. Some are known, some not yet. So far all found bugs could be fixed within 1-2 days, sometimes they are just hard to catch.

  1. Adjust the perl core suite and find limitations (runperl issues) vs bugs
  2. Check modules
  3. Check user reports
  4. Check weird platforms, compilers, programming tricks.

CC bugs

Well, some bugs are run-time limitations which will require run-time solutions. The sortcv bug [CPAN #53536] is easily understood but hard to fix. Will need at least 2 days concentration on it.

Planned CC Optimizations

Static initialization of readonly data: SVs, AVs, HVs.

-fcog for strings (copy on grow by using a custom destructor)

Fill in missing Opcodes flags for most optimisable ops. Maybe even automatically.

Check possible type declarations with Devel::TypeCheck, MooseX::Types, attributes and such.

I've finished 50% of Malcolm's Todo during the winter surf-holidays, and fixed 90% of Malcolm's bugs in the last year, so I'm confident.

I've already got a sponsor for my conference travel expenses. A tip: They could be persuaded to sponsor this grant also :) (cPanel)

Project Schedule

During summer-time I prefer surfing over coding to keep emotional stability in the corporate environment. Winter 2009-2010 was very productive, because I got a kick by cPanel who needed it.

For the next coding season I might need further kicks, a mini-grant like this might be enough.

2010: Find and fix all remaining bugs. I suspect there are still 5-6 major ones.

2010: Faster testsuite. Now: 8 min user - 40min author - 2 days all perls + plats.

Until March 2011: CC type and sub optimisations

Later (not part of this proposal)

Until 2012: CC unrolling => jit within perl (perl -j)

Bio

Reini Urban, living in Graz, Austria. Born 1963, pretty old, yes.

Born Lisper, but I've been writing perl programs since 1992 and released my first module to CPAN in 1995, the perl5.hlp file for Windows, created by some pod2rtf.pl. cygwin maintainer (perl, parrot, postgresql, clisp, ...) for a couple of years, and several B::* modules.

I work for a large HW+SW company (>2000 developers), 8-16 o'clock.

Since nobody is able to help me with the compiler it looks like I'm alone. Hopefully this will change! I even had to write my own Debugger. Yes, I'm aware of trucks. Surfing is not risky at all. Bicycling is more dangerous.

Perl Foundation News: 2010Q1 Grant Proposal: perl core memory improvements

perl core memory improvements

Name:

Jim Cromie

Email:

[hidden email]

Amount Requested:

How much is your project worth? $3000

Synopsis

Memory allocation enhancements in core (sv.c).

Perl's variable namespace model is very flexible, users can:

 - create vars, in any package, or in my scope, by naming them;
 - give them complex values: my $foo = [ 1, { a => 2}, 3 ];
 - share/assign/shallow-copy them: $main::bar = $foo;
 - crosslink or self ref them: $a[2] = [$a[2], $a[1]];
 - other hairy stuff

This user data is all built on-demand from an inventory of sv-parts which is kept on the interpreter's freelists (sv_root, PL_body_roots). These are refilled periodically by S_more_bodies, which gets-an-arena, slices it into sv-parts, and threads them onto the freelist.

This can result in user data spread across memory like a spiderweb in a corner; it's hard to clean the corner without destroying the web. IOW, it makes memory reclaim "hard", and probably ineffective. As a result, I think, perl core has never really seen the need/benefit to bother reclaiming arenas.

One important workload however could benefit; Storable::freeze() uses a ptr-table to track SVs that it has %seen, but its PTEs hang off the interpreter until process termination. For a long-running process, this is clearly suboptimal.

Benefits to the Perl Community

1st, there's this in perltodo:

  use less 'memory'
       Investigate trade offs to switch out perl's choices on memory
       usage.  Particularly perl should be able to give memory back.
       This task is incremental - even a little bit of work on it will help.

This is deep core work; benefits accrue to users of 5.14, which is the eventual target. Since the interfaces changed are internal, it may be possible to get it into 5.12.x.

Currently, Storable::freeze() uses ptr-tables to track seen SVs as it freezes them, so that it honors shared linkages. Doing this on large datasets will allocate a huge ptr-table, which when freed, releases all those PTEs back to the interpreter-global freelist, where they hang uselessly until process death (or interpreter shutdown).

The work proposed below appears to provide a workable mechanism to implement the private-arenas that Tim Bunce expressed want/need for, with Nicholas Clark's comments, here:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2009-12/msg00821.html

By my 1st reads, Tim wants to coax a set of SV allocations to be taken out of separate arenas, to protect them from others. Nick outlined a solution that largely fits with my earlier revision of this grant proposal (Aug, 09), but added a discussion of savestacks, and implied (to me, at any rate) a need for a robust underlying mechanism, prompting this revision of the proposal.

General benefits will likely flow from finding out what nytprof needs, and figuring out how to provide it :-D

Deliverables, Project Details

here are the major elements

Private Arenas

  The short version:
  - adapt get-arenas(sig): sv_type arg2 -> (void*) reqid
    and track allocs by the reqid
  - propagate that to S_more_bodies and its macro wrappers
  - add release-arenas() stub 1st

With get-arenas(reqid), we can track arenas by their users; with S_more_bodies we can extend that tracking to the interpreter's svtype consumers individually. With unique tracking of arena users, we can offer release-arenas(reqid), and since we're an internal sub-system interface, expect them to use it properly.

Design Benefits

S_more_bodies() outer-users (disregarding the macro-wrapper) keep their current interface, the arenas provisioned by it for each sv_type are transparently tracked, and can soon be reclaimed.

get-arena/release-arena give a balanced api for clients to manage slabs of memory themselves. The api is minimal, allowing and requiring simply that callers of get-arena(reqid) do:

 - call release-arenas(reqid) when done with mem.
 - know they're not sharing parts of the arenas when done.
 - don't abuse the reqids of others, ref your own object.
 - users can create and abandon arenas (be careful!)

With this, users hacking in core can allocate many slabs, of various sizes, using just one reqid, assemble them with pointers into arbitrary structures, and when done, know that they're all cleaned up together. Users may also use multiple reqids to simplify their memory reclaim operations.
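The bookkeeping behind this interface can be pictured with a toy model. The sketch below is written in Perl purely for illustration and is not the proposed core code, which would live in C inside sv.c: arenas are plain arrays, the reqid is a string rather than a (void*), and arenas_in_use is a helper invented for the example.

```perl
use strict;
use warnings;

my %arenas_by_reqid;   # reqid => list of arenas handed out under that id

# Hand out a new arena with $slots slots and remember who asked for it.
sub get_arena {
    my ($reqid, $slots) = @_;
    my $arena = [ (undef) x $slots ];
    push @{ $arenas_by_reqid{$reqid} }, $arena;
    return $arena;
}

# Forget every arena recorded for $reqid; returns how many were released.
sub release_arenas {
    my ($reqid) = @_;
    my $released = delete $arenas_by_reqid{$reqid} || [];
    return scalar @$released;
}

# Count the arenas still tracked, across all requesters.
sub arenas_in_use {
    my $count = 0;
    $count += @$_ for values %arenas_by_reqid;
    return $count;
}

# One client allocates several slabs of different sizes under one reqid.
get_arena('ptr_table_1', $_) for 64, 128, 256;
get_arena('other_user', 16);

print arenas_in_use(), "\n";                 # prints 4
print release_arenas('ptr_table_1'), "\n";   # prints 3
print arenas_in_use(), "\n";                 # prints 1
```

Because every arena is recorded under the id of its requester, a single call cleans up all slabs belonging to one client while leaving the arenas of other clients untouched.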

It should also be flexible and efficient enough for use by XS libraries, given their tolerance for newness.

Private-Arenas 1st user: ptr-tables

Given the Storable use case, this has potential merit; being parsimonious with PTE mem by default will work for some users.

But for less specific cases, the global PTE freelist probably wins a performance contest; the malloc demand is intrinsically less when PTEs are reused, freeze() is not the only user of ptr-tables, and it's only pathological cases that would even be noticed.

Nonetheless, it provides a test-case for 1st use of the new interface, and an alternate ptr-table implementation, possibly providing support for 'use less memory'

Note that with the stubbed release_arenas, we only pretend to free the private allocations; this may cause problems in make test, but the overall demand for ptr-tables is quite limited (iirc the big user, t/re/regexp_qr_embed_thr.t creates ~2000 ptr-tables), and on 1GB machines, we may not run out of memory.

This has some probative value for OOM handling also, especially in a setrlimit()d sandbox.

Design Benefit

Private arenas in ptr-tables provide a concrete basis to consider other resource reclaim strategies, narrowly 1st, but perhaps also broadly for other potential users.

When ptr_table_free is called, we know that:

  - we start with an empty, private PTE freelist, fill it as needed
  - pt-store consumes PTEs from private PTE freelist
  - all PTEs in the table came from our arenas
  - all PTEs cleared back to it are from our arenas
  - no other users of those PTEs exist
  - all our arenas have our reqid

With this, we should be able to just whack the whole table (by finding and freeing the arenas with the reqid), skipping all the rethreading to the global freelist, and immediately releasing the memory back to the system. This sounds possibly useful later.

release_arenas(reqid)

I've separated this deliverable because private-arenas in ptr-tables can be mostly validated without it (using the stub) and because in some respects it's our 1st new feature, where the previous focus was on refactoring the existing code to accommodate the feature.

The 1st test of this code will be in perl_destruct

  release_arenas(&PL_body_roots[$_]) foreach @sv_types;
  release_arenas(&PL_sv_root);

Then we call it from ptr_table_clear.

support for NYTProf

Given the recent p5p traffic 12/20 (link above), I think this path to private arenas helps; it adds support needed beneath the fancy freelist pushing-and-popping briefly described there. What nytprof needs will take further study.

register_arena_consumer

The design thus far does nothing to protect against (or even advise of) reqid trampling between 2 users; get_arena() implicitly allows callers to start new reservations with the given ID, which allows sharing amongst knowing users. Formal registration will provide at least advisory protection. This could be done with a flag too.

Semi-Deliverables

These have real merit in my estimation, but are rather speculative, and I'm reluctant to call them committable deliverables. I think they help illustrate the potential of the above work.

use less memory, pte

One way to nudge this rock forward is to plug in a 2nd (private) ptr-table-* function set, addressing the Storable::freeze use case.

I suspect however that freelist pushing and popping, along with get_arenas() and release_arenas(), will ultimately be a better tool than this specialized fix for PTEs, but it serves as a point of discussion (strawman); we don't even have decent terminology yet, let alone a few paths forward.

use my_arenas

Storable::thaw() might want to put the perl-data it vivifies into a constrained region of memory, as this may improve processor cache performance, especially with their modern prefetch systems. So would perl routines, such as parsers, data generators, etc.

  # Doing it lexically would be nice
  # (my_arenas is the pragma imagined here; it does not exist yet):
  sub get_tight_hash {
    my ($packet) = @_;
    use my_arenas depth => 1, 'xs';
    return { %{ Storable::thaw($packet) } };
  }

Here, my_arenas seeks to capture only SVs in the contained xs scope (the thaw), and those in {} composition. depth => 1 sounds safest wrt the spiderweb problem, N might be nice if it makes sense (depth=>0 makes me nervous). I also suppose that xs might somehow be different than just depth => 1.

This doesn't attempt to migrate perl data into a container; that would be tantamount to lifting the spiderweb without damaging it, and is out of scope here. But this may shed some light in a dimly lit corner.

Inch-stones

The deliverables above are largely self-explanatory, but will also include responding to and resolving issues; they're then largely defined by porters and particularly pumpkings.

Tim Bunce, given his interest for nytprof, will hopefully offer guidance as to what he needs; I'd treat those needs as immediate goals.

There are no doubt numerous knock-on effects on the rest of core; some of these will be in scope, though I hope not all.

setrlimit()d sandbox, oom tests. work this into fresh_perl, maybe wrap this as sandboxed_perl().

p5p discussion, review, responses, revisions, variations, etc.

Project Schedule

1-2 months

Bio

I've been hacking in perl for a while:

  [jimc@groucho perl-git]$ git log blead | grep Cromie | wc -l
     102

I've also hacked on pertinent parts of core and ext/ code:
  - added arena-sets into the arena allocator
  - reworked the body-allocator around S_more_bodies
  - helped refactor sv_upgrade (Nick did the heavy lifting)
  - added struct body_details (says the blamelog)
  - extended B::Concise feature set
  - implemented OptreeCheck and tests using it

Perl Foundation News: 2010 Grant Proposal: Improve Dist::Zilla

Improve Dist::Zilla's Tests, Documentation, and Structure

Name:

Ricardo Signes

Email:

rjbs@cpan.org

Amount Requested:

$2000

Synopsis

Dist::Zilla is a tool that helps Perl programmers build distributions for the CPAN. It eliminates boilerplate, handles packaging, interfaces with changelogs and version control, improves prerequisite management, and generally makes it easier to be a CPAN author. This grant will fund work to make it easier for new users to adopt Dist::Zilla and for Dist::Zilla itself to be more easily extended, maintained, and understood.

Benefits to the Perl Community

Dist::Zilla makes the CPAN better. More code can be released because the work required to do so is greatly lessened. The code that is released can be of a higher quality because more time can be spent on the code rather than the packaging. It can also improve the lives of CPAN authors in general: if you don't want to spend the time that Dist::Zilla saves you on writing more code, you can spend it on anything else you like: skiing, sleeping, or eating ice cream.

Dist::Zilla has already been adopted by dozens of authors and used to release hundreds of distributions.

Deliverables

Each deliverable below is also an "inch-stone."

proper logging facility

Right now, Dist::Zilla logs with "print." It has always been meant to use Log::Dispatch (via Log::Dispatchouli) but these changes need to be made, presumably before testing begins, so that the testing system can incorporate logged data.

Estimated time: one half day

reusable testing tools

Dist::Zilla and most of its plugins (both core and otherwise) are not well tested, because testing it is tedious. This could be greatly improved by writing a few test classes or mock plugins.

Estimated time: two days

extensive testing of the core

The reusable test tools will be put to use (and thus proven useful) when tests are written for all the core functionality. These tests may not be exhaustive, but they will be extensive and will be written with the goal of making contributors feel that they can trust the test suite to catch most regressions.

Estimated time: four days

simplification of the command line tool's code

Right now, a number of hookable events are defined only in the code implementing the dzil command, which too tightly couples the main class behavior to the command line tool. As much as is possible, the App::Cmd-based code for dzil will be turned into a very thin wrapper around Dist::Zilla's methods.

Estimated time: one half day

event structure for distribution creation

In other words, plugins will be able to attach more behavior to distribution creation, to create new source code repositories, start files, and so on.

Estimated time: one half day

core set of well-known FileFinder plugins

The FileFinder plugin role allows other plugins to operate on dynamically located sets of files like "all Perl modules that will be installed" or "all files marked executable." At present, there are no predefined FileFinder plugins with Dist::Zilla. By providing a few core finders with well-known names, it is easier for new third-party plugins to behave more like core plugins.

This requires writing the finders, testing them, and updating existing plugins to use them. It also must be possible for a user to override the behavior at the well-defined name.

Estimated time: one day

improved prerequisite handling

This will include improved methods for specifying versions required by allowing shorthand identifiers for the latest version of a prerequisite, or the version with which the author has tested.

(If the META.json 2.0 specification is sufficiently finalized by the time this work is approved, the core Dist::Zilla prerequisite system will be improved to match it. I am familiar with the proposed changes to META and have a plan for how to support them.)

Estimated time: one day

improvements for authoring distributions containing XS

I do not write XS code or C, but a number of users of Dist::Zilla do and have asked whether I can improve Dist::Zilla's ability to accommodate them. Florian Ragwitz has given me some ideas on how to do this, and I would like to carry out his plan so that Dist::Zilla does not discriminate against XS authors.

Estimated time: one half day

documentation: improved new user's guide

This will extend and supplement the existing Dist::Zilla::Tutorial, starting from the position, "So you want to release code to the CPAN..." There will be a Pod version shipped with Dist::Zilla, but also an HTML document and slidecast or screencast to more clearly walk new users through the process.

Estimated time: four days

Project Schedule

I can begin work immediately upon receipt of first-third payment. I predict about ten or twelve Saturdays of work. I believe that work can be completed this quarter.

Bio

I'm RJBS on the CPAN. I have released or adopted hundreds of modules, and Dist::Zilla is the result of my own desire for a tool to make maintenance of CPAN distributions simpler. My previous TPF grant-supported work on Pod-munging tools was also in furtherance of making it easier to maintain CPAN distributions. That work was completed without problems and the released code has been successfully adopted by a number of CPAN authors.

dagolden: An English-only Planet Iron Man

I’m very happy to know that Perl has global appeal from seeing all the non-English Perl blogs aggregated on Planet Iron Man, but since I’m a (typical American) monoglot, I’d prefer an Iron Man feed with only English articles. So I made one.

It’s available at http://feeds.dagolden.com/ironman-english.xml. It updates hourly from the master feed.

And for the curious, or for anyone who wants to adapt this for other languages, here’s the Perl program that I whipped up to create the feed:

# feedfilter.pl - downloads and filters the Perl Ironman feed for English
# entries. Results sent to STDOUT.
#
# The heuristic filters out entries unless the content is mostly latin
# characters and English is close to the best guess of a language.  Short
# entries with code seem to confuse Lingua::Identify, so we take entries that
# seem "close-enough".  Tuned via trial-and-error.
#
# Copyright (c) 2010 by David Golden - This may be used or copied under the
# same terms as Perl itself.

use 5.008001;
use strict;
use warnings;
use utf8;
use autodie;

use IO::File;
use Lingua::Identify qw(:language_identification);
use Time::Piece;
use URI;

use XML::Atom::Feed;
$XML::Atom::ForceUnicode = 1;
$XML::Atom::DefaultVersion = "1.0";

# Global heuristic tuning
my $latin_target = 0.95;  # 95% latin chars
my $lang_fuzz = 0.02;     # English within 2% probability of best language

run();

#--------------------------------------------------------------------------#

sub latin_ratio {
  my $string = shift;
  my $alpha =()= $string =~ /(\p{Alphabetic})/g;
  my $latin =()= $string =~ /(\p{Latin})/g;

  return 0 if ! $latin || !$alpha; # !$alpha probably redundant
  return $latin / $alpha;
}

sub run {
  my $in_feed = XML::Atom::Feed->new(URI->new("http://ironman.enlightenedperl.org"));

  my $out_feed = XML::Atom::Feed->new;
  $out_feed->title("Planet Iron Man: English Edition");
  $out_feed->subtitle( $in_feed->subtitle );
  $out_feed->id("tag:feeds.dagolden.com,".gmtime->year().":ironman:english");
  $out_feed->generator("XML::Atom/" . XML::Atom->VERSION);
  $out_feed->updated( gmtime->datetime . "Z" );
  for my $l ( $in_feed->link ) {
    $out_feed->link($l);
  }

  for my $e ( $in_feed->entries ) {
    my $content = $e->content->body;
    my $latin = latin_ratio($content);
    my %lang = langof($content);
    my $best = [sort { $lang{$b} <=> $lang{$a} } keys %lang]->[0];
    $lang{en} ||= 0;
    $out_feed->add_entry($e)
      if $latin > $latin_target && ($lang{$best} - $lang{en} < $lang_fuzz);
  }

  binmode(STDOUT, ":utf8");
  print $out_feed->as_xml;
}
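
The heuristic can be tried without the feed or Lingua::Identify. The following is a minimal sketch using core Perl only: latin_ratio() repeats the counting idea from the program above, while keep_entry() is a hypothetical helper (not part of the original program) that applies the same accept/reject test to a ready-made language => confidence hash. The confidence numbers in the usage lines are made up for illustration, and unlike run(), the sketch rejects an entry outright when there are no language guesses at all.

```perl
use strict;
use warnings;

# Share of alphabetic characters that belong to the Latin script.
sub latin_ratio {
    my $string = shift;
    my $alpha =()= $string =~ /\p{Alphabetic}/g;
    my $latin =()= $string =~ /\p{Latin}/g;
    return 0 if !$latin || !$alpha;
    return $latin / $alpha;
}

# Hypothetical helper: keep an entry when it is mostly Latin text and
# English scores within $lang_fuzz of the best-scoring language.
sub keep_entry {
    my ( $latin, $lang, $latin_target, $lang_fuzz ) = @_;
    my ($best) = sort { $lang->{$b} <=> $lang->{$a} } keys %$lang;
    return 0 unless defined $best;    # no guesses at all: reject
    my $en = $lang->{en} || 0;
    return ( $latin > $latin_target && $lang->{$best} - $en < $lang_fuzz ) ? 1 : 0;
}

# A six-letter Cyrillic word, written with escapes to keep the source ASCII.
my $cyrillic = "\x{41f}\x{440}\x{438}\x{432}\x{435}\x{442}";

printf "%.2f\n", latin_ratio("Hello world");       # prints 1.00
printf "%.2f\n", latin_ratio($cyrillic);           # prints 0.00
printf "%.2f\n", latin_ratio("Perl $cyrillic");    # prints 0.40 (4 of 10 letters)

# Made-up confidence values, for illustration only.
print keep_entry( 0.99, { en => 0.40, pt => 0.41 }, 0.95, 0.02 ), "\n";    # prints 1
print keep_entry( 0.99, { fr => 0.50, en => 0.30 }, 0.95, 0.02 ), "\n";    # prints 0
```

The second keep_entry() call shows why the fuzz factor exists: English only has to be close to the best guess, not the best guess itself.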

Ahmad M. Zawawi (azawawi): Padre, Firefox and Chrome

Lately there has been a discussion in #padre about integrating Padre with the "Edit with Emacs" Chrome extension. We found out that we need to implement a server in Padre that services XmlHttp requests on port 9292 to be able to integrate with that extension. It was a very interesting discussion that led me to do more research on the topic. I learned that there are Firefox add-ons that let you edit text fields or areas: they launch your favorite editor (i.e. Padre :) and then monitor files for changes and reflect those changes in the browser. So here are some popular examples of such add-ons for Firefox:


Firebug Add-on

Firebug adds a simple edit-the-contents-with-your-editor feature. It turned out that you could easily configure Padre with Firebug. In fact you can configure one or more editors:



It's All Text! Add-on
This plugin unfortunately supports only one editor but it can actually reflect saved changes from Padre by monitoring opened files.




Future plans
In the future I plan to create a Padre plugin that services the "Edit with Emacs" Chrome extension. Please let me know if there are any browser add-ons/extensions that I may have missed.

Shlomi Fish: NYTProf-3 is Out!

Tim Bunce writes on his blog about the new features in Devel-NYTProf version 3. Devel-NYTProf is a profiler for the Perl programming language, which has left all previous attempts at profiling in the dust, and now it's even better than before. Enjoy! (Thanks to Fred Moyer's post on the San Francisco Perl Mongers mailing list.)

ajt: Speaking to hardware

Today I'm performing some minor enhancements to a hardware interface I wrote some time ago. The program isn't a perfect example of object orientated bliss with automated regression tests, but it's a fairly cleanly written procedural program divided into subroutines, so it's not too hard to maintain.

Ovid: Tracking Down Bug Reports

I often find that test reports I get from smokers are not terribly useful for failures. Sometimes it's obvious. Other times it's not. One thing which helps a bit is this bit in your t/load.t test:

diag(
    "Testing My::Module $My::Module::VERSION, Perl $], $^X"
);

In the CPAN testers reports, you can click on a test report and instantly see the module version, Perl version and path to the current Perl (arguably not as useful).

I like to go a step further and include information about all modules I claim to require. For my t/00-load.t for Test::Most, I have the following:

use Test::More tests => 9;

BEGIN {
    local $^W;
    use_ok('Test::Most')
      or BAIL_OUT("Cannot load Test::Most");
    use_ok('Test::Most::Exception')
      or BAIL_OUT("Cannot load Test::Most::Exception");

    diag("Testing Test::Most $Test::Most::VERSION, Perl $], $^X");
    my @dependencies = qw(
      Exception::Class
      Test::Deep
      Test::Differences
      Test::Exception
      Test::Harness
      Test::More
      Test::Warn
    );
    foreach my $module (@dependencies) {
        use_ok $module or BAIL_OUT("Cannot load $module");
        my $version = $module->VERSION;
        diag("    $module version is $version");
    }
}

Now on my test reports, I see output like this:

# Testing Test::Most 0.21, Perl 5.008009, /home/chris/pit/rel/perl-5.8.9/bin/perl
#     Exception::Class version is 1.29
#     Test::Deep version is 0.106
#     Test::Differences version is 0.5
#     Test::Exception version is 0.29
#     Test::Harness version is 3.21
#     Test::Simple version is 0.94
#     Test::Warn version is 0.21

If I want to replicate a bug, this is a huge benefit.
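
A minimal sketch of a more defensive variant, assuming you also want a line for dependencies that fail to load or define no $VERSION (the loop above would interpolate an undefined value in that case). version_report is a hypothetical helper name, not part of Test::More, and the module names in the usage line are only there to exercise both branches.

```perl
use strict;
use warnings;

# Hypothetical helper: build one report line per module without dying
# when a module is missing or has no $VERSION.
sub version_report {
    my @modules = @_;
    my @lines;
    for my $module (@modules) {
        # Turn Foo::Bar into Foo/Bar.pm so require can take it as a string.
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        my $version = eval { require $file; $module->VERSION };
        push @lines,
          defined $version
          ? "$module version is $version"
          : "$module is not installed or has no \$VERSION";
    }
    return @lines;
}

# In a test file these lines would go through diag() instead of print.
print "#     $_\n" for version_report(qw(strict File::Spec No::Such::Module::Here));
```

Inside t/00-load.t the same lines could be passed to diag() so that they show up in CPAN Testers reports.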

dagolden: CPAN Testers 2.0 end-January update

The bad news is that we’re still about two weeks behind schedule. The good news is that we’re not falling further behind and in some areas, we’re already ahead.

As I wrote in the last update, a number of my early-January tasks for revising the Metabase libraries didn’t get done and were blocking progress on other fronts. That work is now pretty much complete and I’m ready to turn my attention to the actual migration of legacy reports into a Metabase repository. Once that’s done, we’ll be able to test using it to feed the cpantesters.org databases that Barbie has been preparing for the conversion.

CPAN Testers 2.0 activity in the last couple weeks:

  • I revised Metabase framework libraries to separate user profile information and user authentication into separate facts. This also meant revising the user-profile generation program to match.
  • I implemented new Metabase::Resource classes to standardize the extraction of indexing data from Metabase::Fact resource strings. (E.g. cpan://distfile/DAGOLDEN/Capture-Tiny-0.07.tar.gz can be indexed under author “DAGOLDEN”, distribution name “Capture-Tiny”, and so on). This wasn’t on the plan, but I discovered that Ricardo and I had never actually gotten around to implementing them, so they had to go from stubs to working code.
  • I confirmed that despite all the Metabase framework changes, I could still launch a local Metabase server and send CPAN::Reporter test reports via Test::Reporter::Transport::Metabase. (Thanks to Florian Ragwitz and Matt Trout for patches and guidance respectively on updating the Metabase web server for a more modern Catalyst runtime).
  • Barbie converted cpantesters.org backend databases to index on GUIDs instead of NNTP IDs. Existing legacy reports had their NNTP IDs mapped to GUIDs.
  • Barbie and I agreed to have cpantesters.org get updates directly from the Amazon back-end rather than going through a web server. This postpones the need to implement search capability through the web until after launch.
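
To make the resource-string indexing in the list above concrete, here is a minimal sketch in core Perl. It is not the Metabase::Resource API: index_distfile and the hash keys it returns are hypothetical names chosen for illustration, and the regular expressions only handle simple distfile names like the example given.

```perl
use strict;
use warnings;

# Hypothetical helper: split a cpan://distfile/... resource string into
# fields that could be used as index keys. Returns undef on no match.
sub index_distfile {
    my ($resource) = @_;

    # Author directory, then the distribution file name.
    my ( $author, $file ) =
      $resource =~ m{\Acpan://distfile/([A-Z][A-Z0-9-]*)/(.+)\z}
      or return;

    # Distribution name and version, e.g. Capture-Tiny and 0.07.
    my ( $name, $version ) =
      $file =~ m{\A(.+)-v?([0-9][0-9._]*)\.(?:tar\.gz|tgz|zip)\z}
      or return;

    return {
        cpan_id      => $author,
        dist_file    => $file,
        dist_name    => $name,
        dist_version => $version,
    };
}

my $info = index_distfile("cpan://distfile/DAGOLDEN/Capture-Tiny-0.07.tar.gz");
print "$_: $info->{$_}\n" for sort keys %$info;
```

Because the author directory is part of the string, two uploads with the same file name under different authors yield different index entries.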

The exact semantics of direct search against the back-end have yet to be worked out, but I’ve decided to hold off on that until we have a Metabase of historical records to experiment against.

At this point, the critical path is the conversion of old articles to the Metabase and the deployment of a web server to inject new reports into it. I’m working on the first, and hope to have both of those done by mid Feb. That leaves us a tight two week period for testing, so stay tuned for the next update.

Modern Perl Books, a Modern Perl Blog: A Perl Programming Maintenance Checklist

Polemic: anyone who believes that any specific general purpose programming language is inherently unmaintainable has opinions on software development worth ignoring.

Many people claim that the design of Perl 5 has flaws so significant that they render it far too difficult to write and maintain useful programs. Many of the supporting arguments are syntactic preferences. "I don't like sigils!" "Context make no senses to my!" "Real men don't need your sissy curly braces to accompany our manly indentation!" "Isn't bless a little bit cutesy for our Serious Enterprise Business Application?"

Other arguments... well, you've heard them.

Perl 5 has some design flaws, but I believe that syntax is such a small part of maintainability that only the most facile discussions focus on syntax to the exclusion of more important concerns. The next time you have trouble maintaining a Perl 5 program, ask yourself:

  • Have I learned the language by reading documentation and working through tutorials, or am I fiddling with changing things by trial and error and guesswork and intuition based on experience in other languages?
  • Do I know how to use perldoc to look up builtins and language features?
  • Have I skimmed the Perl FAQ included in every Perl 5 distribution?
  • Have I used Perl::Tidy to unify the formatting into a consistent style?
  • Do I know the difference between void, scalar, and list context? Can I identify them?
  • Do I know how to use B::Deparse to explain the evaluation plan of complex constructs?
  • Does this program have a set of automated tests I can trust?
  • Did the original programmer understand the problem domain? Do I?
  • Did the original programmer "borrow" this code from elsewhere, change a few lines, and add a modified copyright statement?
  • Did this program grow from a throwaway idea into a critical business component without planning, design, or refactoring?
  • Is the original author available to answer questions, whether in person or through some sort of design notes?
  • Is the program well-factored?
  • Does the program include appropriate documentation for its purpose, its major systems, its APIs, and any surprising design decisions?
  • Do I have a clear understanding of what the program does and why?
  • Does the program have a modular design, with well-enforced encapsulation boundaries between components?
  • Can I configure and build the program on my local system?
  • Can I deploy it?
  • Does the code show examples of idiomatic programming from authors fluent in the language, or is it a pastiche of styles cribbed from documentation and witch-doctor experimentation?
  • Did the original author know how to program in any language?
  • Did the original author take advantage of obvious strengths of the host language in appropriate ways (or did he distrust arrays and continually write to and read from a temporary file instead—I have seen this with my own eyes, and the host language was not Perl)?
  • Does the program take advantage of well-known and trustworthy external libraries?
  • Does the build process spew compiler errors and warnings? Does the program spew warnings and errors when deployed?
  • Does the program contain obvious repetition and near repetition?
  • Would you be proud of writing the program in six months?

Note how few of these concerns have anything to do with Perl—and, of those that do, trivial rewording would make them appropriate for other languages.

Phred's Journal: Why you should create CPAN distros

On Tuesday, February 23rd, Jeff Thalhammer will speak on why you should create CPAN distros, even for proprietary code. He has worked on several projects where all the private code was organized into CPAN-style distros, and then injected into a local copy of CPAN. They then used the CPAN tool chain to manage the entire build, test, and release process.

Jeff Thalhammer's CPAN page:
http://search.cpan.org/~thaljef/

Announcement posted via App::PM::Announce

RSVP at Meetup - http://www.meetup.com/San-Francisco-Perl-Mongers/calendar/12509158/

Blog of Gábor Szabó: Perl for Windows statistics

Adam Kennedy pointed me to the download count of Strawberry Perl for Windows. This shows that in the last 3 months there were approximately 54K downloads of 5.10.x and 4.5K downloads of 5.8.x. That would be about 650 downloads a day (58.5K downloads over roughly 90 days).

Back on 7/7/2007 Jan Dubois mentioned that ActiveState has more than 4,500 downloads of ActivePerl 5.8.x per day for Windows and more than 1,000 downloads of ActivePerl 5.6.1 per day.

Unfortunately I don't have newer numbers from ActiveState, but it would be interesting to see how the numbers have changed. It would also be nice if we had numbers from the other Perl distributions for Windows.

transfixed but not dead!: Perl and Mac OS X versions

Apple's website has an opensource page: http://www.opensource.apple.com/, which lists all its products and components that contain and use opensource technology.

If you go here: http://www.apple.com/opensource/, it provides a complete list of opensource projects that Apple uses.

In this list is of course Perl. From this we can see which version of Perl was compiled on Mac OS X. We can also see what options, patches, fixes & extra modules were included.

So I’ve been able to glean these Perl versions that Apple have used with Mac OS X:

  • 5.10.0 & 5.8.9 (perl-63)
  • 5.8.8 (perl-51 patches: perl-51.1.1, perl-51.1.4)
  • 5.8.6 (perl-38 patch: perl-38.1)
  • 5.8.6 (perl-35 patch: perl-35.1.1)
  • 5.8.4 (perl-28.2)
  • 5.8.1-RC3 (perl-25.2)
  • 5.8.1 (perl-24.1) “Maintenance release working toward v5.8.1″
  • 5.6.0 (perl-21)
  • 5.6.0 (perl-17)

Now which one(s) are tied to which Mac OS X is still in the air a bit! This is my best stab at it:

  10.6    5.10.0 & 5.8.9      perl-63
  10.5    5.8.8               perl-51
  10.4    5.8.6               perl-38 or perl-35
  10.3    5.8.4 or 5.8.1 ?
  10.2    5.8.1 ?
  10.1    5.6.0               ? perl-21 and/or perl-17
  10.0    5.6.0 ?

Hopefully this information is useful to someone

/I3az/

Sawyer X: On Nagios, Thunk, Shinken and wrapper included marketing

Nagios is probably the most famous and used monitoring program on the market. It's free, GPL and has nice features such as object representation of data, inheritance, plugin systems, passive testing, built-in Perl interpreter, result caching, pipe interface, alert delegations and so on and so on.

The web interface of Nagios is, however, incredibly ugly. It's written in CGI the way the early CGI scripts were written. When you make a change to a server via the web interface, you get a few screens (avoiding Javascript is a benefit for some cell phones, I guess) and the old and quickly-annoying "You have done the action you wanted, please click this back link we created to go backwards" screen. You won't simply get automatically directed back to where you were before with a new message at the top in bright green saying "Action X done" or something like that. That would be too easy and Web 2.0. It uses frames (yuck!) to show the sidebar, and you don't see the content of comments on hover, only when you click on the comment to get to the comment screen to view the comment. It's literally a pitfall, and at least one company where I worked rewrote the entire interface in ASP and .NET (I know, I know...) by parsing the Nagios log.

For that reason, it has always been a bit difficult (though possible) selling Nagios to the enterprise when your boss isn't tech savvy, and other programs, not much better or worse (OpenNMS for instance) find their way since they have a much better user interface.

A new fork of Nagios with a PHP interface, called Icinga, has begun. The point is to accept a lot of patches that were difficult to get into Nagios (which has only one actual developer, Ethan Galstad), and to provide a beautiful web interface with Javascript. At least one Nagios community member sees the web interface as the entire point of the fork, and assumes there's a good chance it will be merged back into Nagios, keeping the core as it is.

Apparently, there is a Perl Catalyst-based project to revamp the Nagios web interface, called Thunk. It is available on Github and also has a demo page. It's incredibly fast and seems promising. However, there is no website for it. Currently the only face it has is the Github page which seems generic and perhaps non-welcoming for some people who consider using it. You can also view the demo, but it still has the basic dull look of Nagios' oldschool interface.

Another new project relating to Nagios is Shinken, which is a Python rewrite of Nagios. Apparently someone thought that Nagios is great, except it's written in C, which makes it a barrier to include other peoples' work. I personally disagree for a variety of reasons which I won't go into. What does seem interesting is that Shinken is very welcoming, even though (at least to me) it seems like an exercise in futility. I'm assuming it will rapidly develop a stable core userbase, and the website is to thank, IMHO.

One more issue to note: Ethan Galstad is now developing Nagios XI, an enterprise solution. The "solution" boasts a new web interface, which I suspect will be the leading selling point of it.

So:


  • Nagios has a terrible interface.

  • Nagios XI has a pretty interface (and a free iPod Touch with every purchase, according to the site).

  • Icinga say they will put more patches in that were rejected or ignored for Nagios (but will keep core compatibility). However the strong point of Icinga (and where most efforts are now going to) is the web interface.

  • Thunk attacks the interface problem directly, but doesn't seem likely to be adopted much because it doesn't even have a face.

  • Shinken tries to replace Nagios by rewriting the core to be in Python, but the best reason to adopt it (as with OpenNMS) is the web interface.

Now, of course there's a difference in Python, Perl, C and all that. However, in marketing, the selling value goes to the more presentable. In a System Monitoring conference I would have a pretty difficult time selling anything with just a Github page. It all boils down (in marketing terms) to the interface.

Jérôme Quelin: which module to extract perl prereqs?

in dzil plugin autoprereq, i'm extracting prereqs from the dist modules. i want this extract to be fast, based on the actual code (not makefile.pl or meta.yml, since the goal is to generate them), and as accurate as possible. it should also find base classes, moose roles and other "hidden" dependencies. finally, it should extract the minimum version needed for a given module, including minimum perl version.

my first version was regex-based. i can already see your horrified face - but really it wasn't so bad, since it only needed to find some specific statements such as uses and requires. current version is using ppi, which is better suited for corner cases.
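
to give an idea of what such a regex-based scan looks like, here is a minimal sketch in core perl. it is not the actual autoprereq code: scan_prereqs is a hypothetical name, and the patterns are deliberately naive.

```perl
use strict;
use warnings;

# naive line-by-line scan for use/require statements.
# returns a hashref of module name => minimum version (0 if none given).
sub scan_prereqs {
    my ($source) = @_;
    my %prereqs;
    for my $line ( split /\n/, $source ) {
        next if $line =~ /^\s*#/;    # skip comment lines

        # minimum perl version, e.g. "use 5.008001;" or "use v5.10;"
        if ( $line =~ /^\s*(?:use|require)\s+(v?5[\d._]*)\s*;/ ) {
            $prereqs{perl} = $1;
            next;
        }

        # "use Foo::Bar 1.23 ...;" or "require Foo::Bar;"
        if ( $line =~ /^\s*(?:use|require)\s+([A-Za-z_]\w*(?:::\w+)*)(?:\s+(v?\d[\d._]*))?/ ) {
            $prereqs{$1} = defined $2 ? $2 : 0;
        }
    }
    return \%prereqs;
}

my $found = scan_prereqs(<<'END_SOURCE');
use 5.008001;
use strict;
use Moose 0.90;
# use Commented::Out;
require File::Spec;
use base 'My::Parent';
END_SOURCE

print "$_ => $found->{$_}\n" for sort keys %$found;
```

note that `use base 'My::Parent'` is reported as a dependency on base only, and My::Parent is missed entirely. this is the kind of "hidden" dependency that a plain use/require scan cannot see.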

however, long-term makes me think that it would be better to rely on an external module. so, what are the alternatives out there on cpan, and can i use them in autoprereq?
  • b::perlreq - parses the file, but reports file (File/Basename.pm) instead of modules, and is generally more suited for rpm
  • module::extract::use - using ppi to parse a file, but extracts only use & require statements (no inheritance, moose roles, etc). also, it reports no minimum version extraction, only a list of modules.
  • test::dependencies - using either b::perlreq (see above) or a regex scheme underneath
  • module::scandeps - runs the file (which is slow), and finds all modules included - and sometimes a bit more (eg: file::homeDir::darwin is found for a module using file::homedir, even on a unix platform). can also run as a static analyser, but calls cpanplus (?!) which is slow.
  • module::info - regex based
  • module::cpants::generator::prereq - parses makefile.pl, where i want sthg that parses actual code
  • module::cpants::kwalitee::prereq - parse meta.yml, makefile.pl or build.pl
so, no module was doing exactly what i wanted... since i am using ppi and that module::extract::use does the same, i contacted brian d foy to see whether he would be interested in additional extractions (moose roles, base classes, etc.) for this module. he was, so those enhancements are now pushed on my github clone.

i'm now waiting for a new release of this module with my enhancements, meaning that i can get rid of this part of the code in dzil autoprereq. which was, if you recall, the original goal! :-)

Ranguard's blog: More design love please...

This is a follow-up, or rather an extension, to xSawyerx's Marketing the Entire Box (including the wrapper).

I completely agree: design is the first thing people notice about a project. If you do not put effort into the look of your project's web page then, whether you like it or not, it will have an effect on the perception of your project (and indirectly on Perl).

If someone (my list of projects is way too large at the moment unfortunately) has time I think there are two projects which would make a massive difference.

  • Provide a Creative Commons licensed design(s) that anyone can use for a Perl project. Lots of us are not designers, so even if we'd like to make our projects look better we don't know how!
  • See if it is possible to get CPAN redesigned, this is already the home to thousands of Perl projects, so improving this will have a massive impact.


Many of the Apple application websites have a similar feel (simple and clean); this is because they all use iWeb. Good design does not mean lots of design. It just means a clean page, and separating content under headings (rather than having a single page with everything just listed).

Ovid: Miscellaneous Thoughts

Thought 1: Test::Class::Most fails on Perl 5.010001 -- but only on Linux. Not on FreeBSD, OS X or Solaris. I've already emailed the testers and hopefully someone can track down what's going on. I don't want to say "bug in Perl", but this is not platform-specific code, so it looks suspicious.

Thought 2: Can anyone please explain to me why anyone would want to buy a whyPad?

Thought 3: What does the following print and why? Results are the same for 5.10.1 and 5.8.9, in case you're wondering. Try and figure out the output before you run the code. I now know why it does this, but I was confused by the behavior.

#!/usr/bin/env perl -l

use strict;
use warnings;

my $foo = 3;
sub Foo::showit { print $foo }
sub Foo::incit { $foo++ }
my $foo = 10;
sub Bar::showit { print $foo }
sub Bar::incit { $foo++ }

Foo::showit;
Bar::showit;
Foo::incit;
Foo::showit;
Bar::incit;
Bar::showit;

The Cattle Grid: The importance of perl 5.12.0

perl 5.12.0 is on its way: 5.11.4 is the first development release of perl 5.11 following the code freeze that should soon lead to the release of the next stable version of the interpreter. Why am I looking forward to the release of perl 5.12.0? It is not (only) a matter of new features, although there is something interesting there (read the perldelta documents for more information). The most important aspect, however, is the new development cycle.

After the release of Perl 5.10.0 there was some "discussion" about how releases of the interpreter should be managed, since interminable periods had often passed between one release and the next, which could give the impression of stagnant development. The current activity shows that development is proceeding briskly, and this inspires confidence: we will not have to wait another 5 years for a stable and improved version of perl. Indeed, if we look at the latest development releases of the interpreter:

  • 5.11.0 - 2 October 2009
  • 5.11.1 - 20 October 2009
  • 5.11.2 - 20 November 2009
  • 5.11.3 - 21 December 2009
  • 5.11.4 - 20 January 2010

we see that the intention to release on a monthly basis has been honoured so far. I believe all this is quite an important part of what is today called The Perl Renaissance: having the impression, supported by facts, of a language that is actively developed at both the library and the interpreter level is fundamental, also for attracting new programmers....

Sawyer X: Marketing the Entire Box (including the wrapper)

For a long while now I've been wondering about some observations I've made of Perl, Ruby, Python and PHP in marketing terms. I'm going to discuss them here in detail, and I hope it will gain us some insight into better marketing understanding or at least not bore anyone.

I've understood through the years that projects with beautiful websites have a better chance of getting picked up by users (even when the project itself is purely command-line) and definitely give much more credit to the encompassing layer (like the programming language itself).

I've also noticed that a rather large portion of Ruby-related projects have very beautiful websites. You might have noticed this yourself as well. I tried to understand how this situation occurred and had a few discussions at $work with people whose opinion I appreciate, one being a Python and PHP programmer (or, as I like to see him, a programmer who knows PHP) and the other being a Ruby, RoR and overall Javascript whiz.

At first, my thoughts were that most Ruby programmers major in Rails, so they're into web, while the major web frameworks in Perl (Jifty, Catalyst, Mojo, Mojolicious, Mojolicious::Lite, Dancer, CGI::Application, etc.) only flowered in recent years, which could lend the thought that most Perl programmers aren't web-oriented or web-aware. I obviously mean "aware" to a higher extent than most users. As mst eloquently expressed, we're still into IRC (me included, of course).

Quickly that theory was squashed when I entered the Django website. For a web framework, it is horrible. Especially for a web framework that speaks of cleanliness. The website design is horrendous. Same goes for the HAML website. PHP as a very successful web-oriented language shows interesting results. Most PHP programs/scripts come in various index sites (much like oldschool PERL scripts), few PHP frameworks have websites and fewer have beautiful websites. Most of them are awful. Frontpage awful.

So apparently, dealing with web doesn't mean you create beautiful websites. Yeah, I guess anyone could say that, but to put it in better words: knowing web doesn't mean you do web. So what is the reason it is almost given that a Ruby project (as small as it might be) will have a website, and it will be beautiful?

A thought that formed rapidly through the conversations was that Ruby programmers see the marketing as relating to not just the product, but its wrapper. That is, that many Ruby programmers understand at a very core level (more than most programmers - at least me) that the website which shows the project is the actual wrapper of the project and is just as important, if not more so. When I thought of a "nice wrapper", I just thought of a detailed POD and --help output. Obviously, this isn't the case for many Ruby programmers. Those Ruby programmers think "I cannot ship this application without wrapping it nicely, it is just incomplete."

I began to see the major difference between many Perl programmers and many Ruby programmers. I haven't ever written a website for a project I worked on. Evidently, perhaps many of them don't merit a website of their own, but perhaps some of them do! One of the prime examples for this, IMHO, is LessCSS which has a beautiful website for a rather simple filter, which could definitely be accomplished just as well in other languages such as Perl. I thought how simple it would be to write a version in Perl, stepped up to CPAN and here is CSS::LESSp and Apache2::Filter::LESS::CSS. I have to say that CPAN is indeed comfortable for me, but as a design, it isn't very attractive or compelling. CPAN is an index site, not a front page wrapper for your project.

When I see the LessCSS website, I'm seeing beauty which attracts me not only to LessCSS (which I don't really have any use for right now), but also (and this is key) to Ruby itself.

Returning to my quest, I had already sat down with our resident Ruby/Rails expert and a cup of tea and flax-seed crackers. What I still wondered about was how a Rails programmer can know Ruby, Rails, Javascript (usually) and web design all at once, and each at such a high level, to produce such beautiful websites. It's definitely not common, or is it?

We've laid the foundation that many programmers use prepared (sorry, there is no "pre-prepared" in English) templates and layouts, or hire an actual professional web designer. Hiring a web designer is not cheap. Why are these people spending good sums of money on professional web designers to create (usually static) websites for such small programs? I couldn't understand it.

Then he mentioned some conventions he went to. When someone shows her peers and future employers projects she has done, the first impression is on the looks of things, not the sound or ability of them. If it's beautiful, she got to first base. If it works much better, but it doesn't even have a face, it stands much less chance of even getting a shot. This is what I was missing.

Marketing my project is not just marketing code that does a task, it's marketing me!

Dancer doesn't just market a small web framework, it markets Perl and Plack (the encompassing technologies), but also the developer(s) of Dancer. Same goes for Catalyst, same goes for Moose and KiokuDB and Module::Starter and Test::More and on and on and on.

A response I'm expecting is "well, obviously!" and my answer would be the same: I do know it. However, understanding it is much more important than knowing it. To understand it means to write a website for whatever project I have done which I think can elevate me, and which I think can elevate Perl, or some other encompassing technology (like Moose or Dancer).

This morning I imagined having a space for project websites: an actual beautiful static website for each Perl project. I don't have the time to set up an infrastructure for hosting and all that. However, I did purchase PerlProjects.net and I will be giving NS hosting for free to anyone who wants to set up a website at <YourProject>.PerlProjects.net, no matter how small. Also, I want to set up a main site at PerlProjects.net which will be an index of... Perl Projects.

I do believe that having individual beautiful websites for projects (even the small or relatively exclusive ones) will help market Perl better, and any technology (of Perl or not) you're using in your project. This will help boost interest and consequently the number of programmers who know Perl will go up and of course job offerings will follow. If we can present an image of "Perl projects come in beautiful wrappers", we can present an image of "Perl is beautiful", and that is what we should strive for - at least in marketing terms.

If you feel like helping with PerlProjects.net (design, design, design) or want your own subdomain and free NS hosting, let me know!

Blog of Gábor Szabó: Showing Perl on non-Perl conferences, getting money from TPF for swag

This weekend I'll be at FOSDEM along with several other Perl Mongers. In addition to just trying to enjoy the talks we'll be promoting Perl. I am really happy that The Perl Foundation was ready to provide us with $500 to buy conference swag we can give away.

Similarly, a month from now we will be present at CeBIT in Hannover, Germany where we will have a booth to promote Perl and Perl based projects. I've just heard from Karen Pauley that TPF agreed to provide us another $500 for further materials to give away at CeBIT.

This is awesome and this probably means that if you are ready to talk to people on non-perl related events telling them about Perl you could also get some money from TPF in order to have some hard material to give away.

For FOSDEM we are preparing some round tuits and a postcard listing the Perl events in Europe that will take place in the coming months.

For CeBIT we are also preparing some further marketing materials - more business oriented - that we will be able to hand out.

There are further ideas to prepare various fun promotional materials one can give away during a chat with a fellow developer and we have also discussed that we should have a Linux liveCD full of Perl applications ready to be used by anyone. (Also include Windows binaries in the form of Strawberry Perl for those using Windows)

Of course almost none of the above is my work. There are several people involved in the preparation: Rich Sands organized the tuits and the postcards for FOSDEM, Salve J. Nilsen is working on community intro cards, and Renee Backer is working hard on the materials for CeBIT, just to thank a few of them.

We are also expecting some help from the Enlightened Perl Organisation (EPO) as they are preparing beer-mats to give away.

Getting involved

Most of the conversation about this takes place on the events mailing list and some data is collected on the TPF wiki. If you would like to help in preparing the materials or if you are planning to attend a non-Perl event and would like to have some fun or business oriented material to give away, please join us on the events mailing list to discuss the details. You can find information on the TPF wiki.

Perl Hacks: Cultured Perl Blog

A couple of years ago I thought that one thing the Perl community was missing was a network of blog sites about Perl. I'm not talking about the individual blogs that are being shown off to such good effect by the Iron Man project, I'm talking about a set of multi-author blogs that covered particular facets of the Perl world. Something like a Perl-specific version of LifeHacker or BoingBoing. To that end, I registered a number of domains and set about installing Movable Type.

That bit was easy. That bit I can do. The next bit is harder.

The next bit involves getting authors interested in writing for the blogs on a regular basis. That bit I didn't do so well at and none of the blogs flourished.

One of them didn't even get going. That was Cultured Perl. The idea behind Cultured Perl was that it would discuss Perl culture. That's all the non-technical bits of the Perl world. Perl Mongers, Perl conferences, things like that. I had a few authors signed up, but nothing ever really happened.

So why am I telling you this? Well, the Cultured Perl domains are up for renewal. And I'm trying to work out whether it's worth keeping them.

Would you be interested in reading a Cultured Perl blog? And would you be interested in writing for it?

Ovid: Changing Test::Most (and a note about my OSCON proposals)

After uploading Test::Class::Most, I kept thinking and thinking about the fact that you automatically get strict and warnings with it. I started to think about my annoyance with test suites in general and am now thinking about doing this with Test::Most. I proposed this to Perl-QA and received three positive responses (two offlist) and no negative. It won't import "features" of 5.10, but instead of this:

use strict;
use warnings;
use Test::Most tests => 34;

You can just write:

use Test::Most tests => 34;

... and it's the same thing. I can't recall the last time I've seen a test suite using modern tools and not using strict and warnings, so I think this is a win. chromatic's Modern::Perl is a tiny bit of code but the underlying idea is very important and it's a pattern I'd like to see more of.
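
As a rough illustration of how a module can switch on strict and warnings for whoever loads it, here is a minimal sketch of the general pattern. It is not Test::Most's actual implementation; the package name My::Test::Boilerplate is made up for the example, and the BEGIN block stands in for the "use" line you would write if the package lived in its own file.

```perl
{
    package My::Test::Boilerplate;    # hypothetical name, for illustration only
    use strict;
    use warnings;

    sub import {
        # strict and warnings are lexical pragmas. Calling their import
        # methods while the caller is still being compiled enables them
        # in the caller's scope, not in this package.
        strict->import;
        warnings->import;
    }
}

# Stands in for "use My::Test::Boilerplate;" (assumption: a real module
# would live in its own file and be loaded with use).
BEGIN { My::Test::Boilerplate->import }

# With strict in effect, an undeclared variable is a compile-time error.
my $ok    = eval q{ $undeclared = 1; 1 };
my $error = $@;

print $ok ? "strict is NOT in effect\n" : "strict is in effect: $error";
```

The point of the design is that the caller never has to remember the two pragma lines; loading the test module is enough.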

In other news, the OSCON Call for Proposals is now closed. I have two proposals in there:

  • Refactoring Enterprise Class Test Suites -- Too Slow, Too Complex, Too Fragile
  • Scratching the 40 Itch of Inheritance with Smalltalk-style Traits

I feel that both of these are very important topics that aren't covered enough. With luck, I'll be in Portland in July (and will be a happily married man at that time).

Phred's Journal: SF.pm Annual Report and Plans

Reposted from blogs.perl.org

San_Francisco.pm (SF.pm) started off 2009 with a bang! Fred Moyer took on the daunting role of President, and Joe Brenner stepped up to the unforgiving role of Speakers Co-Chair. Both leaders relieved ex-President Quinn Weaver, who shepherded SF.pm for 6 years leading up to 2009.

A Meetup web portal was created at http://www.meetup.com/San-Francisco-Perl-Mongers/ which served to facilitate organizing meetings. Six Apart generously donated a conference space for monthly meetings on the 4th Tuesday of the month (as has been a tradition for the last 10 years). Matt Lanier, the founder of SF.pm, liaised with Six Apart to obtain this arrangement. We started off the year with a 35 person meeting on how not to use memcached - http://www.meetup.com/San-Francisco-Perl-Mongers/calendar/9432356/

With 15 official events in 2009 (http://www.meetup.com/San-Francisco-Perl-Mongers/calendar/past_list/) SF.pm had a banner year. We had a booth at OSCON, several celebrity speakers, lightning talks, and a growing membership as the year progressed. Tools developed in our forge such as App::PM::Announce allowed us to get out the message that SF Bay Area Perl was alive and on the move.

Red Hot Penguin Consulting LLC took care of the standard pizza fare, and Julian Cash Photography helped with soliciting food donations for the group from members (the monthly food/drink bill averaged $150 in 2009). SF.pm is still actively looking for a food and drink sponsor, so if you want to get your message out to SF.pm, the best way to our hearts is through our stomachs.

San Francisco Perl Mongers plans to continue steady growth throughout 2010 by focusing on what it did right in 2009. Regular meetings, growing membership, new and exciting talks, and some extra-perl-icular presences will help to solidify our presence as a stable and active technology group leader, and the pre-eminent Perl Mongers group in terms of regular events and attendance.

Modern Perl Books, a Modern Perl Blog: When Context Gets Complicated (and why it's not a problem)

In Essential Skills for Perl 5 Programmers I mentioned that no one can be an adept Perl programmer without understanding context. This trips up many, many people -- and you often hear (unfair) criticisms of Perl 5 based on misunderstandings and guesses about how context works.

Context is reasonably easy to explain. (The previous sentence is grammatically correct.) Contexts is not difficult to understands. (The previous sentence is grammatically incorrect, even if you speak the Queen's English.)

If you can find the errors in the previous paragraph, you can understand quantity context in Perl 5: like subject-verb agreement in terms of number, expressions in Perl 5 can behave differently in contexts that imply zero, one, or more results.

fetch_something_awesome();              # void   context
my $item  = fetch_something_awesome();  # scalar context
my @items = fetch_something_awesome();  # list   context

Context gets a little bit trickier when you need to coerce what would normally be one context into another:

my ($item) =        fetch_something_awesome(); # list   context
push @items, scalar fetch_something_awesome(); # scalar context

If you know the visual cues (if you don't randomly sprinkle punctuation about your program until it works), those are easy to understand as well.
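
To watch those contexts from the inside, a function can ask which context it was called in with the wantarray builtin. The sketch below is only an illustration: context_of_call is a hypothetical stand-in for fetch_something_awesome(), and @seen exists purely to record what happened.

```perl
use strict;
use warnings;

my @seen;    # records the context of every call, in order

# Hypothetical stand-in for fetch_something_awesome()
sub context_of_call {
    # wantarray is true in list context, false but defined in scalar
    # context, and undef in void context
    my $context = wantarray         ? 'list'
                : defined wantarray ? 'scalar'
                :                     'void';
    push @seen, $context;
    return $context;
}

context_of_call();                        # void context
my $item    = context_of_call();          # scalar context
my @items   = context_of_call();          # list context
my ($first) = context_of_call();          # list context (parentheses on the left)
push @items, scalar context_of_call();    # scalar context (forced by scalar)

print join( ', ', @seen ), "\n";
```

Run as written, this should report void, scalar, list, list, scalar, matching the comments in the examples above.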

The subtlety comes when dealing with complex contexts, usually with nested expressions:

# list context, thanks to say
say reverse $name;

my %values =
(
    # list context, thanks to hash assignment
    name => get_name(),
    rank => get_rank(),
);

# list context (param flattening)
$screen->flip( $fleet->get_spaceships() );

This is often where more fair criticisms of Perl 5 suggest that context may not be worth it, because you have to understand what a line of code means and what it implies to read it correctly.

There's a fair point there, but it's also silly in some ways. Skimming code which calls other functions may give you some idea of what those functions do, but you rely only on the names of those functions and not their documentation to tell you any other details. Do they modify global or thread-local variables? Do they have caching or performance characteristics? Do they block? Do they require special initialization or error handling? Do they return special values?

The valid point is that chaining multiple expressions into complex compound expressions can have interesting effects. (I see this in Haskell code often; invisible partial application means that I personally can't skim Haskell code without tracking down the arity of functions to figure out what happens where.)

That's no argument against language features. It's an argument against making expressions more complex than necessary. Note that the same argument applies against complex prefix-unless expressions. unless can be amazingly useful when used properly. If you abuse it, you make amazing problems. Don't make problems.
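
The say reverse $name case from earlier is the classic example of a nested expression inheriting list context. A small sketch (using print so it runs without enabling the say feature; the variable names are mine):

```perl
use strict;
use warnings;

my $name = 'Perl';

# List context: reverse() reverses the order of a one-element list,
# so the string itself comes back unchanged.
my ($in_list) = reverse $name;

# Scalar context: reverse() concatenates its arguments and reverses
# the characters of the result.
my $in_scalar = reverse $name;

print "list context:   $in_list\n";
print "scalar context: $in_scalar\n";

# The same fix applied directly to an output statement
print scalar reverse($name), "\n";
```

If you want the reversed string from say or print, force the context with scalar rather than hoping the reader knows which context the outer function imposes.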

Perlbuzz: Perlbuzz news roundup for 2010-02-01

These links are collected from the Perlbuzz Twitter feed. If you have suggestions for news bits, please mail me at andy@perlbuzz.com.

nothingmuch's perl blog: $obj->blessed

I've been meaning to write about this gotcha for a long time, but somehow forgot. This was actually an undiscovered bug in Moose for several years:

use strict;
use warnings;

use Test::More;

use Try::Tiny qw(try);

{
    package Foo;

    use Scalar::Util qw(blessed);

    sub new { bless {}, $_[0] }
}

my $foo = Foo->new;

is( try { blessed($foo) }, undef );

is( try { blessed $foo }, undef );

done_testing;

The first test passes. blessed hasn't been imported into main, so the code results in the error Undefined subroutine &main::blessed.

The second test, on the other hand, fails. This is because blessed has been invoked as a method on $foo.

The Moose codebase had several instances of if ( blessed $object ), in packages that did not import blessed at all. This worked for ages, because Moose::Object, the base class for most objects in the Moose ecosystem, didn't clean up that export, and therefore provided an inherited blessed method for pretty much any class written in Moose.

I think this example provides a very strong case for using namespace::clean or namespace::autoclean routinely in your classes.

To cover the other half of the problem, the no indirect pragma allows the removal of this unfortunate feature from specific lexical scopes.
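
Here is a stripped-down sketch of the same gotcha using only core modules, to make the two halves of the problem visible. The variable names are mine, and the fully qualified call at the end is just one unambiguous alternative, not necessarily what Moose adopted.

```perl
use strict;
use warnings;

{
    package Foo;

    # The import is never cleaned up, so blessed() is also reachable
    # as a method on every Foo object.
    use Scalar::Util qw(blessed);

    sub new { bless {}, $_[0] }
}

my $foo = Foo->new;

# main has no blessed() sub, so perl parses this line as indirect
# object syntax, i.e. as $foo->blessed
my $via_indirect = blessed $foo;

# The leftover import really is visible as a method
my $has_method = Foo->can('blessed') ? 'yes' : 'no';

# Unambiguous: call the function by its fully qualified name
my $explicit = Scalar::Util::blessed($foo);

print "indirect call returned: $via_indirect\n";
print "Foo can blessed:        $has_method\n";
print "explicit call returned: $explicit\n";
```

Cleaning the import out of Foo would make the indirect call fail loudly instead of silently working, which is the behaviour you want.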

Perl Hacks: FOSDEM

This weekend is the annual FOSDEM conference in Brussels. I really enjoy FOSDEM but, for reasons I don't really understand, this will be the first time I've been since 2005. It will also be one of the rare occasions where I attend a conference without giving a talk - the organisers turned down my proposed talk on Modern Perl.

I like FOSDEM because it's not just a Perl conference. It's about the wider open source movement. In fact Perl is a really small part of the conference. In many years it has been completely unrepresented. One of the things I mentioned in my "M Word" talk at the London Perl Workshop was that Perl needed to be better represented at non-Perl conferences. With that in mind, the Perl Foundation has booked a stand at the conference and various volunteers (including me) will be there telling people about how wonderful Perl is.

The main driver behind this push to get Perl represented at other conferences has been Gabor Szabo and he'll also be at FOSDEM giving a couple of talks. One is a lightning talk introducing people to Padre. The other is about packaging CPAN modules for Linux distributions. Those of you with long memories might remember me talking about this at YAPC in Copenhagen. I'm hoping that attending Gabor's talk will galvanise me into having another go at my project to automatically build RPMs of many more CPAN modules than are currently available.

So, as you can see, there are plenty of good reasons to be at FOSDEM this weekend. And that's even before considering that it takes place in one of my favourite European cities. I might even treat myself to a Kwak in one of the bars on the Grand Place.

If you're at FOSDEM this weekend, please stop by the Perl stand and say hello.

Ahmad M. Zawawi (azawawi): Stable Padre 0.56 is out!

Peter Lavender (the great Padre release manager) has just released Padre 0.56 to the public. Great job everyone!

To install Padre,
cpan Padre

To upgrade an existing Padre installation,
perl -MCPAN -e "CPAN->upgrade('/^Padre/')"

Perl.org NOC: Rainy Day Outage (update four)

Good news everyone!

The few remaining services that were out should be back shortly.

We moved the failed server from Robert's house to Ask's office today and finally got enough parts replaced that the server is running again. As we hoped, all data is intact.  It's currently copying all the data off to a couple of 2TB disks.  Actually, I just checked and the only missing services right now are some pm.org sites and some of the historical mirrors/archives (very old versions of perl, some cpan-testers mails).

The pm.org sites should be back by the morning; the rest might need a few days for us to do the sneakernet thing to get the data to a fast enough network connection that copying hundreds of GB isn't too slow.

Blog of Gábor Szabó: Test Automation using Perl classes

The conference season is warming up and so I'll start offering my Test Automation using Perl training class so I can have an excuse to go to the workshops and conferences. Here is the schedule:

  • March 8-11, Berlin, Germany, after CeBIT where we have a Perl booth
  • March 15-18, Tel Aviv, Israel
  • Apr 13-16 Vienna, Austria, right after the QA Hackathon
  • May 11-14, Stockholm, Sweden (no workshop here, but I cannot make it to NPW in Reykjavik on May 1-2)
  • June 1-4, Stuttgart, Germany, just before the German Perl Workshop
  • June 15-18, Columbus, Ohio, USA, the week before YAPC::NA

The first one in Berlin is a bit tight in the schedule now but I hope there will be enough people interested to warrant opening the class.

The class is 4 days long. Details can be found here: Test Automation using Perl. For pricing please contact me directly via e-mail.

Related announcement in German

Perl Foundation News: 2010Q1 Call for Grants Proposals

NOTE: perl.org Request Tracker was down for about a week (hardware problems). As this happened in the last days of the call for grants period we are extending it till the end of the week (day 5).

The Perl Foundation is looking at giving some grants ranging from $500 to $3000 in February/March 2010.

In the past, we've supported Adam Kennedy's PPI, Strawberry Perl and Perl on a Stick, Nicholas Clark's work on Perl internals, Jouke Visser's pVoice, Chris Dolan on Perl::Critic and many others (just check http://www.perlfoundation.org/grants for more references).

You don't have to have a large, complex, or lengthy project. You don't even have to be a Perl master or guru. If you have a good idea and the means and ability to accomplish it, we want to hear from you!

Do you have something that could benefit the Perl community but just need that little extra help? Submit a grant proposal by January 31.

As a general rule, a properly formatted grant proposal is more likely to be approved if it meets the following criteria

  • It has widespread benefit to the Perl community or a large segment of it.
  • We have reasons to believe that you can accomplish your goals.
  • We can afford it (please respect the limits, or your proposal will be rejected immediately).

To submit a proposal see the guidelines at http://www.perlfoundation.org/how_to_write_a_proposal and TPF rules of operation at http://www.perlfoundation.org/rules_of_operation. Then send your proposal to tpf-proposals@perl-foundation.org. Note that proposals should be properly formatted according to our POD template.

On February 1st, proposals will be made available publicly (on this blog) for public discussion, as it happened in the previous round. So, please make it clear in your proposal if it should not be public.

Note that accepted but not funded proposals in the previous round do not need to be re-submitted.

Ovid: Test::Class::Most

I'm really tired of boilerplate. In fact, I hate it so much that I can't stand when I write this:

package Some::Test::Class;
use strict;
use warnings;
use base 'My::Test::Class';
use Test::More;
use Test::Exception;

Of course, you already know about Test::Most and Modern::Perl, so you could reduce it to this:

package Some::Test::Class;
use base 'My::Test::Class';
use Modern::Perl;
use Test::Most;

But that's still boilerplate. So here's what I've just uploaded:

package Some::Test::Class;
use Test::Class::Most parent => 'My::Test::Class';

That gets you strict, warnings, all test functions from Test::Most and, if you have 5.10 or better, all the modern features of Modern::Perl. It (reluctantly) supports multiple inheritance (pass an arrayref of class names as the value to 'parent') and if you don't want the modern features, you can do this:

package Some::Test::Class;
use Test::Class::Most parent => 'My::Test::Class', feature => 0;

To create your own Test::Class base class (which inherits directly from Test::Class), just don't specify an import list:

package My::Test::Class;
use Test::Class::Most;

The documentation includes links to my Test::Class tutorials for those not familiar with it:

  1. Organizing your test suites
  2. Reusing test code
  3. Making Test::Class easier to use
  4. Test control methods
  5. Final tips and summary

Test::Class::Most is not quite as pretty as Piers Cawley's lovely Test::Class::Sugar, but it has far less magic. I think the docs are clear, but if they confuse you, see the Test::Class tests I ship with it for a good example. Hopefully it will make your test classes much more pleasant to write.

Ovid: Testing with PostgreSQL

I've been working on a personal project lately and I decided that, amongst other things, I was going to use PostgreSQL. Some of you may recall that I had an interesting testing strategy for MySQL. The basic idea is that I don't want to teardown and rebuild the database for every test. Truncating a table is generally much faster than dropping and recreating it. However, if I leave the database up, how do I guarantee it's always in a pristine state? One way is to use transactions and always roll them back at the end of a test. That means, amongst other things, that I can't easily test "commit". You can make it work with nested transactions (if your database supports them), but "rollback" can cause issues.

There's also the problem that by breaking "commit", you're altering the behavior of your code somewhat. Plus, if you have more than one process, unless you can share the database handle, separate processes can't see what's happening in another's transaction.

My strategy is not one that everyone is comfortable with, but I prefer to track the changes to the database and simply truncate tables which have changed, possibly restoring the "static" data which some tables need to have when the app is launched. Making this work with PostgreSQL really helped me to relearn a lot of things I had forgotten about this excellent database. Here's the full code, with some interesting goodies you may not have expected (plus some hacks I need to fix at some point).

package Testing::Veure;

use Modern::Perl;
use Moose;
use YAML::Tiny;
use aliased 'Test::WWW::Mechanize::Catalyst' => 'Mech';

# Mysql prototype:  http://use.perl.org/~Ovid/journal/37412

use Readonly;
Readonly my $TEST_DB_CONF => 't/conf/db.yml';
Readonly my $PREFIX       => '_test_';
Readonly my $CHANGES      => "${PREFIX}changed_table";

has schema         => ( is => 'rw', isa => 'Veure::Schema' );
has dbh            => ( is => 'rw', isa => 'DBI::db' );
has tables         => ( is => 'rw', isa => 'HashRef' );
has static_tables  => ( is => 'rw', isa => 'ArrayRef' );
has dynamic_tables => ( is => 'rw', isa => 'ArrayRef' );
has debug          => ( is => 'ro', isa => 'Bool' );
has mech           => (
    is      => 'ro',
    isa     => Mech,
    default => sub {
        Mech->new( catalyst_app => 'Veure' );
    }
);
has _should_rebuild => ( is => 'rw', isa => 'Bool' );
has _config         => ( is => 'rw', isa => 'HashRef' );

use Veure;
use mro ();
use feature ();

BEGIN {
    my $config = Veure->config->{database}{test};

    Veure::Model::DB->config(
        schema_class => $config->{schema_class},
        connect_info => {
            dsn      => $config->{dsn},
            user     => $config->{user},
            password => $config->{password},
        }
    );
}

sub import {
    my ($class, @args)  = @_;
    my $caller = caller;
    warnings->import();
    strict->import();
    feature->import(':5.10');
    mro::set_mro( scalar caller(), 'c3' );
    eval "package $caller; use Test::Most \@args";
}

sub BUILD {
    my $self = shift;

    my $model = Veure::Model::DB->new;
    $self->schema( $model->schema );
    $self->dbh( $model->schema->storage->dbh );
    $self->setup;
    return $self;
}

sub setup {
    my ($self) = @_;
    my $dbh = $self->dbh;
    $self->_set_tables;

    # eventually we'll want sanity checks on triggers
    if ( $self->_should_rebuild ) {
        $self->_rebuild_test_database;
    }
    else {
        $self->_refresh_test_database;
    }
    return $self;
}

sub _set_passwords {
    my $self  = shift;
    my $users = $self->schema->resultset('Users');
    while ( my $user = $users->next ) {
        $user->password('test');
        $user->update;
    }
}

sub _refresh_test_database {
    my $self    = shift;
    my $dbh     = $self->dbh;
    my $changes = $dbh->selectall_arrayref(<<"    END") or die $dbh->errstr;
    SELECT table_name, is_static
    FROM   $CHANGES
    WHERE  inserts > 0
       OR  updates > 0
       OR  deletes > 0
    END
    my ( $static, @dynamic );
    foreach my $change (@$changes) {
        my ( $table, $is_static ) = @$change;
        if ($is_static) {
            $static = 1;    # only needs to happen once
        }
        else {
            push @dynamic => $table;
        }
    }
    my @tables = @dynamic;
    if ($static) {
        push @tables => @{ $self->static_tables };
    }
    return unless @tables;
    {
        local $" = ', ';
        my $sql = "TRUNCATE TABLE @tables";
        warn $sql if $self->debug;
        $dbh->do($sql) or die $dbh->errstr;
    }
    if ($static) {
        my $sql = "BEGIN;\n";
        foreach my $table ( @{ $self->static_tables } ) {
            my $backup = "$PREFIX$table";
            $sql .= <<"            END_SQL";
    INSERT INTO $table (SELECT * FROM $backup);
            END_SQL
        }
        $sql .= "COMMIT;\n";
        warn $sql if $self->debug;
        $dbh->do($sql) or die $dbh->errstr;
    }
    my $sql = <<"    END";
    UPDATE $CHANGES
    SET    inserts = 0,
           updates = 0,
           deletes = 0
    END
    warn $sql if $self->debug;
    $dbh->do($sql) or die $dbh->errstr;
}

sub _rebuild_test_database {
    my $self = shift;

    $self->_set_passwords;
    $self->_create_change_table;
    my $dbh = $self->dbh;

    my @static_tables  = @{ $self->static_tables };
    my @dynamic_tables = @{ $self->dynamic_tables };

    # now make the backups
    foreach my $table (@static_tables) {
        my $sql = "CREATE TABLE $PREFIX$table AS SELECT * FROM $table";
        warn $sql if $self->debug;
        $dbh->do($sql) or die $dbh->errstr;
    }
    {

        # doing it this way means we don't need to disable foreign keys
        local $" = ', ';
        my $sql = "TRUNCATE TABLE @dynamic_tables";
        warn $sql if $self->debug;
        $dbh->do($sql) or die $dbh->errstr;
    }

    $self->_add_triggers_and_records;
    return $self;
}

sub _add_triggers_and_records {
    my $self = shift;
    $self->_add_changed_table_data( $self->static_tables,  1 );
    $self->_add_changed_table_data( $self->dynamic_tables, 0 );
}

sub _add_changed_table_data {
    my ( $self, $tables, $is_static ) = @_;
    my $dbh = $self->dbh;

    foreach my $action (qw/insert update delete/) {
        my $function = <<"        END_SQL";
        CREATE OR REPLACE FUNCTION fn_${action}_changes () 
        RETURNS TRIGGER AS \$\$
          BEGIN
            UPDATE $CHANGES SET ${action}s = ${action}s + 1
            WHERE table_name = TG_ARGV[0];
            RETURN NEW;
          END;
        \$\$ LANGUAGE plpgsql;
        END_SQL
        warn $function if $self->debug;
        $dbh->do($function) or die $dbh->errstr;
        warn "---------------- Function for '$action' succeeded"
          if $self->debug;
    }
    foreach my $table (@$tables) {
        $dbh->do( "INSERT INTO $CHANGES (table_name, is_static) VALUES (?, ?)",
            undef, $table, $is_static )
          or die $dbh->errstr;

        foreach my $action (qw/insert update delete/) {
            my $trigger = <<"            END_SQL";
            CREATE TRIGGER tr_${action}_$table AFTER $action ON $table
            FOR EACH ROW EXECUTE PROCEDURE fn_${action}_changes('$table')
            END_SQL
            warn $trigger if $self->debug;
            $dbh->do($trigger) or die $dbh->errstr;
            warn "---------------- Trigger for '$action' succeeded"
              if $self->debug;
        }
    }
}

sub _set_tables {
    my $self = shift;
    my $dbh  = $self->dbh;
    my $sql  = <<'    END';
    SELECT    c.relname 
    FROM      pg_catalog.pg_class c
    LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind IN ('r','') 
      AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
      AND pg_catalog.pg_table_is_visible(c.oid)
      AND c.relname <> 'dbix_migration'
    END
    my %count_for;
    foreach my $table ( @{ $dbh->selectcol_arrayref($sql) } ) {
        my $result = $dbh->selectcol_arrayref("SELECT count(*) FROM $table");

        # a naive solution: if we have data when the database is created, it's
        # static data
        $count_for{$table} = $result->[0];
    }
    if ( !exists $count_for{$CHANGES} ) {

        # we're starting with a fresh DB, so assume that if a table has data,
        # it's a static table
        my $yaml = YAML::Tiny->new;
        $yaml->[0] = \%count_for;
        $yaml->write($TEST_DB_CONF);
        $self->_should_rebuild(1);
    }
    else {
        my $yaml = YAML::Tiny->read($TEST_DB_CONF);
        %count_for = %{ $yaml->[0] };
    }
    $self->static_tables(  [ grep { $count_for{$_} } keys %count_for ] );
    $self->dynamic_tables( [ grep { !$count_for{$_} } keys %count_for ] );
}

sub _create_change_table {
    my $self = shift;
    my $dbh  = $self->dbh;
    $dbh->do(<<"    END");
    CREATE TABLE $CHANGES (
        id         SERIAL PRIMARY KEY,
        table_name VARCHAR(30) NOT NULL,
        is_static  INTEGER NOT NULL DEFAULT 0,
        inserts    INTEGER NOT NULL DEFAULT 0,
        updates    INTEGER NOT NULL DEFAULT 0,
        deletes    INTEGER NOT NULL DEFAULT 0
    )
    END
}

1;

With this, when you write a test program, you start with this:

# Veure is a placeholder name
use Testing::Veure 'no_plan';

With that, you automatically get the benefits of Modern::Perl (I copied the code) and you automatically import the test behavior from Test::Most. If I hadn't done that interesting diddling with Testing::Veure::import(), every test program would have started with this:

use Modern::Perl;
use Test::Most 'no_plan';
use Testing::Veure;

I don't like boilerplate, so I decided it had to go away.
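The import() trick mentioned above relies on the fact that pragmas such as strict and warnings act on whatever code is currently being compiled, so a module's import() can switch them on for its caller. Here is a minimal sketch of just that part. The package name My::Test::Boilerplate and the helper compiles() are made up for illustration; this is not the actual Testing::Veure code, and it leaves out the Test::Most side entirely.

```perl
use strict;
use warnings;

package My::Test::Boilerplate;    # hypothetical name, not the author's module

sub import {
    # These act on the code that is being compiled when import() runs,
    # i.e. on the file that said "use My::Test::Boilerplate".
    strict->import;
    warnings->import;
}

package main;

# Compile a snippet starting from a scope without strictures and
# report whether it compiled and ran.
sub compiles {
    my ($code) = @_;
    no strict;
    no warnings;
    return eval "$code; 1" ? 1 : 0;
}

# Without the import, an undeclared global is accepted.
print compiles('$undeclared = 1'), "\n";

# With the import, strict 'vars' rejects the same statement.
print compiles('BEGIN { My::Test::Boilerplate->import } $undeclared = 1'), "\n";
```

The BEGIN block stands in for a "use" line; in a real test file you would simply write "use My::Test::Boilerplate;".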

To get a pristine test database, just call the constructor:

my $test   = Testing::Veure->new;
my $mech   = $test->mech;   # Test::WWW::Mechanize::Catalyst object
my $schema = $test->schema;  # DBIx::Class
my $dbh    = $test->dbh;

# change as much as you want in the database

$test = Testing::Veure->new;
# congrats. The db is reset to its pristine condition

The code still needs a lot of work, but there were several things I appreciated.

First, I didn't have to disable foreign keys at all because PostgreSQL allows the following:

TRUNCATE TABLE table1, table2, ... tableN

If those tables have interdependent keys, it will happily truncate them for you. Another nice discovery: if you misspell a table name in a PostgreSQL trigger or function, it tells you at compile time, unlike MySQL.

We also assume that any tables with data in them when we're first adding the test tables are "static" data which must be refreshed every time the constructor is called. I use YAML::Tiny to cache them. This is a decision that I will likely have to revisit.

And if you're curious, here are my (stub) tests for this:

#!/usr/bin/env perl

use lib 't/lib';
use Testing::Veure tests => 4;

my $DEBUG   = 0;
my $REBUILD = 0;
if ($REBUILD) {
    system('./util/recreate_db') == 0
      or die "Could not recreate database: $?";
}

my $CHANGES = <<'END';
SELECT table_name
FROM   _test_changed_table
WHERE  inserts > 0
   OR  updates > 0
   OR  deletes > 0
END

subtest 'new database' => sub {
    my $test = Testing::Veure->new( { debug => $DEBUG } );
    my $schema = $test->schema;
    isa_ok $schema, 'Veure::Schema';
    can_ok $test,   'dbh';
    isa_ok my $dbh = $test->dbh, 'DBI::db', '... and the object it returns';

    ok grep( { $_ eq 'star' } @{ $test->static_tables } ),
      'Basic sanity on static tables';
    ok grep( { $_ eq 'email' } @{ $test->dynamic_tables } ),
      'Basic sanity on dynamic tables';
    my $tables = $dbh->selectcol_arrayref($CHANGES);
    ok !@$tables, 'No tables start out changed';
    $dbh->do("INSERT INTO email (from_id, to_id, message) VALUES (1,1,'boo!')");
    $tables = $dbh->selectall_arrayref($CHANGES);
    eq_or_diff $tables, [ ['email'] ],
      '... but if we change a table, we should see the change';
    $dbh->do("INSERT INTO roles (role) VALUES ('booboo')");
    $tables = $dbh->selectall_arrayref($CHANGES);
    eq_or_diff $tables, [ ['email'], ['roles'] ],
      '... even if we change multiple tables';
    done_testing;
};

subtest 'refresh_db' => sub {
    ok my $test = Testing::Veure->new,
      'We should be able to reconnect to the test database';
    my $dbh    = $test->dbh;
    my $tables = $dbh->selectcol_arrayref($CHANGES);
    ok !@$tables, 'No tables start out changed';
    $dbh->do("INSERT INTO email (from_id, to_id, message) VALUES (1,1,'boo!')");
    $tables = $dbh->selectall_arrayref($CHANGES);
    eq_or_diff $tables, [ ['email'] ],
      '... but if we change a table, we should see the change';
    $dbh->do("INSERT INTO roles (role) VALUES ('booboo')");
    $tables = $dbh->selectall_arrayref($CHANGES);
    eq_or_diff $tables, [ ['email'], ['roles'] ],
      '... even if we change multiple tables';
    done_testing;
};

subtest passwords => sub {
    my $test = Testing::Veure->new;
    my $users = $test->schema->resultset('Users');
    while ( my $user = $users->next ) {
        ok $user->check_password('test'),
            'Passwords should all be changed to test';
    }
    done_testing;
};

subtest mechanize => sub {
    my $test = Testing::Veure->new;
    can_ok $test, 'mech';
    isa_ok my $mech = $test->mech, 'Test::WWW::Mechanize::Catalyst',
        '... and the object it returns';
    $mech->get_ok('/', '... and it should be able to fetch pages'); 
    done_testing;
};

I'm working as hard as I can to make writing tests as easy as possible to ensure that I don't have to revisit this later. Dealing with cumbersome test suites is a serious drain on productivity.

Next, I'm going to try to work out a solution with Test::Class which will allow me to do this:

package Testing::Something;

use parent 'My::Test::Class';

sub some_tests : Tests { ... }

The idea being that I should just be able to use the appropriate parent class and get proper Modern::Perl and Test::Most behavior without having to specify them in every test class. That one is going to be a bit trickier.

Shlomi Fish: Project Euler Problem #10 in Haskell, Perl and C

zerothorder described how he found a solution to Project Euler's Problem 10 in Haskell. The problem is "Find the sum of all primes less than 2,000,000". His solution is given below:

primes :: [Integer]
primes = 2 : filter isPrime [3, 5 ..]
    where
        -- only check divisibility of the numbers less than the square root of n
        isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) primes
        divides n p = n `mod` p == 0
 
result = sum $ takeWhile (< 2000000) primes
 
main = do putStrLn( show result )

He says "If this doesn't give you a nerdgasm, I don't know what will." The problem is that this nerdgasm will last a long time. Benchmarking this program shows that 20 iterations of it take 310 seconds, about 15.5 seconds each (on my Pentium 4 2.4GHz machine running Mandriva Linux Cooker). So next I tried a better Haskell implementation that I recalled from a thread I started on the Haskell Café mailing list about implementing a sieve of Eratosthenes in Haskell:

import Data.Int

primes :: Int64 -> [Int64]

primes how_much = sieve [2..how_much] where
         sieve (p:x) = 
             p : (if p <= mybound
                 then sieve (remove (p*p) x)
                 else x) where
             remove what (a:as) | what > how_much = (a:as)
                                | a < what = a:(remove what as)
                                | a == what = (remove (what+step) as)
                                | a > what = a:(remove (what+step) as)
             remove what [] = []
             step = (if (p == 2) then p else (2*p)) 
         sieve [] = []
         mybound = ceiling(sqrt(fromIntegral how_much))

--main = print (length (primes 1000000))
main = print (sum (primes 2000000))

This does not involve costly operations such as modulo or division, and 20 iterations of it take 135 wallclock seconds, more than twice as fast as zeroth's Haskell version.

Now how about Perl? Since Perl has assignment, we have the advantage that we can create a vector of bits that we will mark with the primes by iterating over all numbers up to the root of the limit. Here is the code:

#!/usr/bin/perl

use strict;
use warnings;

use Math::BigInt lib => 'GMP';

my $limit = 2_000_000;

my $primes_bitmask = "";

my $loop_to = int(sqrt($limit));
my $sum = 0;
my $total_sum = Math::BigInt->new('0');

for my $p (2 .. $loop_to)
{
    if (vec($primes_bitmask, $p, 1) == 0)
    {
        $sum += $p;

        my $i = $p * $p;

        while ($i < $limit)
        {
            vec($primes_bitmask, $i, 1) = 1;
        }
        continue
        {
            $i += $p;
        }

    }
}

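# Start after $loop_to (already handled above) and stop below $limit,
# because the problem asks for primes strictly less than the limit and
# the bit for $limit itself is never marked by the loop above.
for my $p ($loop_to + 1 .. $limit - 1)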
{
    if (vec($primes_bitmask, $p, 1) == 0)
    {
        if (($sum += $p) > (1 << 30))
        {
            $total_sum += $sum;
            $sum = 0;
        }
    }
}

$total_sum += $sum;
print "$total_sum\n";

20 runs of it take 95 wallclock seconds, even faster than the Haskell version. But it gets better. Since all the primes greater than 2 are odd, we can keep a map of only the odd numbers, indexed by their pseudo-halves, and save on memory and iterations. This is the Perl version:

#!/usr/bin/perl

use strict;
use warnings;

use Math::BigInt lib => 'GMP';

my $limit = 2_000_000;

my $primes_bitmask = "";

my $loop_to = (int(sqrt($limit)))>>1;
my $half_limit = ($limit-1)>>1;

my $sum = 0+2;
my $total_sum = Math::BigInt->new('0');

for my $half (1 .. $loop_to)
{
    if (vec($primes_bitmask, $half, 1) == 0)
    {
        my $p = (($half<<1)+1);
        $sum += $p;

        my $i = ($p * $p)>>1;

        while ($i < $limit)
        {
            vec($primes_bitmask, $i, 1) = 1;
        }
        continue
        {
            $i += $p;
        }

    }
}


for my $half ($loop_to + 1 .. $half_limit)
{
    if (vec($primes_bitmask, $half, 1) == 0)
    {
        if (($sum += (($half<<1)+1)) > (1 << 30))
        {
            $total_sum += $sum;
            $sum = 0;
        }
    }
}

$total_sum += $sum;
print "$total_sum\n";

Running this 20 times takes 73 wallclock seconds, close to half that of my Haskell version.
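The index arithmetic in the halved version is easy to get wrong, so here is a compact sketch of the same idea that can be checked on small limits. It is a simplified restatement rather than the benchmarked program: the function name sum_primes_below is my own, it uses a single loop instead of two, and it does not use Math::BigInt (it assumes the sum fits in a native number, which is true for the limit used here).

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Sum of all primes strictly below $limit, sieving odd numbers only.
# Bit $half of the mask stands for the odd number 2*$half + 1.
sub sum_primes_below {
    my ($limit) = @_;
    return 0 if $limit <= 2;

    my $max_half = ( $limit - 2 ) >> 1;    # largest index with 2*h+1 < $limit
    my $mask     = '';
    my $sum      = 2;                      # the only even prime

    for my $half ( 1 .. $max_half ) {
        next if vec( $mask, $half, 1 );    # already marked as composite

        my $p = 2 * $half + 1;
        $sum += $p;

        # p*p is odd, so its index is (p*p - 1) / 2 == int(p*p / 2).
        # Stepping the index by p steps the number by 2*p, which skips
        # the even multiples of p.
        for ( my $i = int( $p * $p / 2 ); $i <= $max_half; $i += $p ) {
            vec( $mask, $i, 1 ) = 1;
        }
    }
    return $sum;
}

print sum_primes_below(100), "\n";    # prints 1060
```

Note that the marking loop here stops at the last index that is actually needed, so the bit vector only ever covers the odd numbers below the limit.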

Then I wondered how long C will take. Here is a C implementation without the halving:

#include <string.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define limit 2000000
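/* One bit per number from 0 to limit inclusive; the +1 byte keeps the
 * bit for limit itself inside the array. */
int8_t bitmask[limit/8 + 1];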

int main(int argc, char * argv[])
{
    int p, i;
    int mark_limit;
    long long sum = 0;

    memset(bitmask, '\0', sizeof(bitmask));
    mark_limit = (int)sqrt(limit);
    
    for (p=2 ; p <= mark_limit ; p++)
    {
        if (! ( bitmask[p>>3]&(1 << (p&(8-1))) ) )
        {
            /* It is a prime. */
            sum += p;
            for (i=p*p;i<=limit;i+=p)
            {
                bitmask[i>>3] |= (1 << (i&(8-1)));
            }
        }
    }
    for (; p <= limit; p++)
    {
        if (! ( bitmask[p>>3]&(1 << (p&(8-1))) ) )
        {
            sum += p;
        }
    }

    printf("%lli\n", sum);

    return 0;
}

This was too fast to measure with 20 runs alone, so I ran it 500 times, which took 15 seconds (0.03 seconds per run). That is roughly two orders of magnitude faster than the fastest Haskell or Perl versions. But naturally, we can apply the halving paradigm there too:

#include <string.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define limit 2000000
int8_t bitmask[(limit+1)/8/2];

int main(int argc, char * argv[])
{
    int half, p, i;
    int half_limit;
    int loop_to;
    long long sum = 0 + 2;

    memset(bitmask, '\0', sizeof(bitmask));

    loop_to=(((int)(sqrt(limit)))>>1);
    half_limit = (limit-1)>>1;
    
    for (half=1 ; half <= loop_to ; half++)
    {
        if (! ( bitmask[half>>3]&(1 << (half&(8-1))) ) )
        {
            /* It is a prime. */
            p = (half << 1)+1;
            sum += p;
            for (i = ((p*p)>>1) ; i < half_limit ; i+=p )
            {
                bitmask[i>>3] |= (1 << (i&(8-1)));
            }
        }
    }

    for( ; half < half_limit ; half++)
    {
        if (! ( bitmask[half>>3]&(1 << (half&(8-1))) ) )
        {
            sum += (half<<1)+1;
        }
    }

    printf("%lli\n", sum);

    return 0;
}

500 runs of it take about 10 wallclock seconds (54.35 runs per second), roughly 50% better than the previous C version. And I still haven't applied platform-specific gcc optimisations.

I should also note that the executables generated by ghc are extremely large in comparison to their C ones:

$ ls -l c_* haskell_*
-rwxr-xr-x 1 shlomi shlomi   6082 2009-12-04 07:46 c_mine
-rwxr-xr-x 1 shlomi shlomi   6103 2009-12-04 07:46 c_mine_half
-rwxr-xr-x 1 shlomi shlomi   6092 2009-12-04 07:46 c_mine_micro_opt
-rwxr-xr-x 1 shlomi shlomi 796825 2009-12-04 07:56 haskell_mine
-rwxr-xr-x 1 shlomi shlomi 571717 2009-12-04 07:46 haskell_zeroth

c_mine_half is less than 1% the size of haskell_mine (and runs faster). When I talked about this with other people, they said that Haskell has a very optimised prime sequence generator that I could try (and which should be over 3 times as fast), that it has two kinds of integers, one of which is faster than the other, and that it has a better way to emulate assignment. But the bottom line is that the naïve and intuitive way to write such programs in Haskell performs poorly, even in comparison to Perl, and is 100 to 1,000 times slower than C.

I've written this as a separate post and not as a comment on the original blog post because I'm very limited with the markup in the commenting there (I'm going to post a comment there with a link to this blog post, though). I should note that you can find all the code I mentioned inside a dedicated GitHub repository, and you can experiment with it further.

Dave's Free Press: Bryar security hole

Someone on IRC reported a bug in Bryar: a Naughty Person can exploit the feature that notifies you of blog-spam by email to execute arbitrary code on your machine, as the user you run Bryar under.

A patched release is on the way to the CPAN, and you are strongly urged to upgrade.

Dave's Free Press: Thanks, Yahoo!

[originally posted on Apr 3 2008]

I'd like to express my warm thanks to the lovely people at Yahoo and in particular to their bot-herders. Until quite recently, their web-crawling bots had most irritatingly obeyed robot exclusion rules in the robots.txt file that I have on CPANdeps. But in the last couple of weeks they've got rid of that niggling little exclusion so now they're indexing all of the CPAN's dependencies through my site! And for the benefit of their important customers, they're doing it nice and quickly - a request every few seconds instead of the pedestrian once every few minutes that gentler bots use.

Unfortunately, because generating a dependency tree takes more time than they were allowing between requests, they were filling up my process table, and all my memory, and eating all the CPU, and the only way to get back into the machine was by power-cycling it. So it is with the deepest of regrets that I have had to exclude them.

Cunts.

[update] For fuck's sake, they're doing it again from a different netblock!

Dave's Free Press: Module pre-requisites analyser

As a service to module authors, here is a tool to show a module's pre-requisites and the test results from the CPAN testers. So before you rely on something working as a pre-requisite for your code, have a look to see how reliable it and its dependencies are.

Perlgeek.de : Set Phasers to Stun!

Did you ever wonder how BEGIN, CHECK, END and so on are called in Perl? Well, they didn't have a good name, until recently.

The Perl 6 spec listed them under closure traits, which is unwieldy, and not really exact either. Now they are called phasers, because they tell you which execution phase the block or statement runs in.

There are so many possible puns that I'll refrain from writing any.

Perlgeek.de : Keep it stupid, stupid!

How hard is it to build a good search engine? Very hard. So far I thought that only one company has managed to build a search engine that's not only decent, but good.

Sadly, they seem to have overdone it. Today I searched for tagged dfa. I was looking for a technique used in regex engines. On the front page, three out of ten results actually dealt with the subject; in the others, dfa meant dog friendly area, department of foreign affairs or other unrelated things.

That's neither bad nor unexpected. But I wanted more specific results, so I decided against using the abbreviation, and searched for the full form: tagged deterministic finite automaton. You'd think that would give better results, no?

No. It gave worse results. On the first result page only one of the hits actually dealt with the DFAs I was looking for. In fact, the first hit contained none of my search terms. None. It just contained a phrase which is also sometimes abbreviated dfa.

WTF? Google seemed to have internally converted my query into an ambiguous, abbreviated form, and then used that to find matches, without filtering. So it attempted to be very smart, and came out very stupid.

I doubt that any Google engineer is ever going to read this rant. But if one is: Please, Google, keep it stupid, stupid.

I'm fine with getting automatic suggestions on how to improve my search query; but please don't automatically "improve" it for me. I want to find what I search for. I'm not interested in dog friendly areas.

Perlgeek.de : Musing and the future of feather and the Pugs repository

(This blog post will probably only interest Perl 6 hackers, since it talks only about infrastructure.)

One of the central pieces of Perl 6 infrastructure is the Pugs svn repository. It holds not only the Pugs source code, but also lots of other Perl 6 projects:

  • Specification
  • Test suite
  • STD.pm (the standard grammar)
  • SMOP
  • mildew and mildew-js
  • sprixel
  • vill
  • mp6, kp6, perlito
  • elf
  • various websites (perl6.org, pugscode.org, perlcabal.org/syn/)
  • a host of scripts related to keep things running (rebuild the HTML version of the synopsis; smartlink checking; cronjobs for updating websites etc.)
  • various documentation efforts
  • An unknown number of projects more or less related to Perl 6

It's huge, but at the same time it's very practical: anybody who is interested can get write access very easily, create a new subfolder for a new project, or can fix a typo in someone else's README file without asking for commit access first.

The pugs repo is also viral: anybody with commit access can invite new committers. Whatever you might think of that, it actually works in practice: so far I haven't seen a single case of abuse.

The pugs repository is hosted on feather, a server kindly provided by Juerd and his company. It contains three virtualized servers, feather{1,2,3}. feather2 is used for "sensitive" data (for example an IRC bot that has API keys for various github accounts, and the perl6.org website). Only a handful of "trusted" users have an account there. feather3 is used for low security stuff like evalbots which might go astray.

feather1 holds all the rest. That means the pugs repository, commitbit (the software we use for handing out commit bits and resetting svn passwords) and various websites. A whole bunch of Perl 6 developers and users also have a shell account there, for trying out Perl 6 and hosting their screen + irssi sessions.

I was about to write feather1 is maintained by a bunch of volunteers, but that would be a lie. It is "maintained" on an as-needed basis by whoever has time and feels half-way competent. It is an "interesting" mixture of Debian unstable and experimental. And it's becoming unmaintainable.

It seems clear what to do: set up a fourth virtual machine, set up a replacement for feather1, and en passant migrate some things (like websites and the pugs repo) to feather2.

But wait! There is an issue. There's always an issue. Setting up a new machine and migrating services takes time. Lots of time. And nobody wants to do it right now. Quite understandably.

Take the pugs repository for example: you might think it's easy enough to copy /var/svn/... recursively to the new host and set up Apache... except that you need authentication. And authentication is coupled to commitbit. And commitbit runs on Jifty. And if you want to install Jifty, you install half of CPAN. Does anybody know if commitbit is still maintained? And whether it installs cleanly on Debian stable?

So I thought about some changes to the infrastructure to make it easier to maintain. For example we handle commit bits differently for git projects on github: an IRC bot knows the API keys of the project owners, and can add committers to the projects. That's nice, and the IRC frontend is much leaner than the Jifty-based web frontend. But we lose virality. For security reasons the bot has to keep a whitelist of IRC users who are allowed to invite committers. Since IRC nick names and (git|svn) user names don't always match, such a list has to be maintained manually.

The second question is: should we take the chance and move the pugs repo to git? I prefer git over svn by far, but there are also costs involved. For example Rakudo checks out a copy of the test suite via svn. The test suite is only a subdirectory of the pugs repo, so unless we split the pugs repo, we'd have to find a way to check out only a part of a git repository. I have no idea if that's possible.

Either way, I'd like to hear your thoughts, and learn from your experience:

  • Do you know any "viral" git or svn hosting solution (ie where every committer can invite more committers)? (don't say commitbit - it's a PITA to maintain. Something more lightweight would be appreciated)
  • If you use the pugs repository now: do you prefer svn or git? how strongly?

Perlgeek.de : Perl 6: Failing Softly with Unthrown Exceptions

Most programming languages handle failures with either of two paradigms: failing routines return special values, or they throw exceptions.

Both approaches have severe problems: in languages like C it can be very easy to forget to check such a return value, and very tedious to propagate it to the caller; on the other hand, throwing exceptions often clutters the code with far too many try blocks, and it's generally unfriendly if you try to automatically parallelize expressions.

So Perl 6 offers a middle ground: soft or unthrown exceptions. If a routine calls fail("message"), a new Failure object is created and returned from the current routine. That object behaves as an undefined value, which stores the message, file and line information of the fail() location, a backtrace and so on.

When you ask such an object whether it's true or false, or defined or undefined, you'll get a correct answer, and the exception is marked as handled. However if you try to use it as an ordinary value, it turns into an (ordinary) fatal exception. So both of these work:

# Variant 1: no exception thrown

my $handle = open('nonexistingfile');
if $handle {
    .say for $handle.lines;
} else {
    # do something else
}


# Variant 2

my $handle = open('nonexistingfile');

# throws a fatal exception while calling $handle.lines
.say for $handle.lines;

Now if you do some automatically parallelized operations, a single failure doesn't have to abort the whole operation, and no information is lost either:

# divide @a1 by @a2 element-wise, a division by zero might occur:
@a1 »/« @a2;

The API for accessing the Failure objects isn't very mature yet, but the concept stands. See S04/Exceptions for the gory details, as they stand today.
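For readers more at home in Perl 5, the behaviour described above can be roughly emulated with operator overloading. The following is only a sketch of the concept. The class Soft::Failure and the function soft_open are names I made up; this is not how Perl 6 implements Failure, and it omits the backtrace and file/line information.

```perl
use strict;
use warnings;

package Soft::Failure;    # hypothetical Perl 5 emulation, not Perl 6's Failure

use overload
    # Asking whether the value is true is safe and marks it as handled.
    'bool' => sub { $_[0]{handled} = 1; return 0 },
    # Using it as an ordinary (string) value turns it into a fatal exception.
    '""' => sub { die "Unhandled failure: $_[0]{message}\n" };

sub new {
    my ( $class, $message ) = @_;
    return bless { message => $message, handled => 0 }, $class;
}

package main;

# Returns a file handle on success, or a soft failure instead of dying.
sub soft_open {
    my ($path) = @_;
    open my $fh, '<', $path
        or return Soft::Failure->new("cannot open '$path': $!");
    return $fh;
}

# Variant 1: test the result, no exception is thrown.
my $handle = soft_open('nonexistingfile');
if ($handle) {
    print while <$handle>;
}
else {
    print "open failed softly; no exception was thrown\n";
}

# Variant 2: treat the failure as an ordinary value and it blows up.
my $text = eval { "$handle" };
print "caught: $@" unless defined $text;
```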

Dave's Free Press: cgit syntax highlighting

For the last few months I've been using git for my version control system. It's better than CVS because it can handle offline commits. So if I'm using my laptop on a train, I can still use version control without having to have a notwork connection.

And to give a pretty web front-end to it for other people to read code without having to check it out of the repository, I use cgit, which I mostly chose because it's a dead simple CGI and not a huge fancy application.

One problem with cgit is that by default it doesn't do code highlighting. But it has the ability to run blobs of code through any filter you care to name before displaying them, so to get something nice like this all you need to do is write a highlighter and add a single line to your cgitrc:

source-filter=/web/www.cantrell.org.uk/cgit/highlighter

My highlighter program is this:

#!/usr/local/bin/perl

use warnings;
use strict;

my $file = shift;

if($file =~ /\.(p[ml]|t)$/i) {
    system "/usr/local/bin/perltidy -html -st -ntoc -npod -pre -nss -nnn";
} else {
    system "cat -n";
}

Dave's Free Press: Graphing tool

I made a shiny thing! It can plot arbitrary functions of the form x=f(y) or y=f(x). Under the skin, it just massages its arguments and passes them through to Gnuplot. Here's the source code.

Update: now 48.3% even shinier - see on the right

Perlgeek.de : Publicity for Perl 6

I've blogged about the Perl 6 Advent Calendar 2009, and want to follow up that it's been a huge success so far.

We're about half-way through, and you might be interested in our number of visits per day:

On the 6th and 7th of December we had about 18k visitors in total, courtesy of slashdot and Tim O'Reilly on Twitter.

As a result we got a lot of new people in our #perl6 IRC channel, asking for help on how to get Rakudo working, how to code some specific things in Perl 6, or asking about general design decisions.

Also mj41 reported that he has collected more than 50% more Perl 6/parrot blog posts than in the previous year.

We've also acquired perl6.org for our uses this year, and it has been promoted enough to be the third hit on a Google search for Perl 6.

All in all I'm rather happy with the marketing state of Perl 6 in 2009, and hope to see similar efforts for 2010. With the upcoming releases of Rakudo Star and a Perl 6 book I'm pretty sure we'll do well, and I hope for more slashdot coverage :-).

Dave's Free Press: Ill

I am ill. I've been ill since Thursday, with a cold. You're meant to be able to cure a cold with [insert old wives tale remedy here] in 5 days, or if you don't, it'll clear itself up in just under a week. So hopefully today is the last day.

So what have I done while ill?

On Friday I became old (see previous post), and went to the Byzantium exhibition at the Royal Academy. It was good. You should go.

Saturday was the London Perl Workshop. My talk on closures went down well, and people seemed to understand what I was talking about. Hurrah! I decided that rather than hang around nattering and going to a few talks, I'd rather hide under my duvet for the rest of the day.

I mostly hid on Sunday too, and spent most of the day asleep. In a brief moment of productivity, I got my laptop and my phone to talk to each other using magic interwebnet bluetooth stuff. I'd tried previously without success, but that was with the previous release of OS X. With version X.5 it seems to Just Work, so no Evil Hacks were necessary.

The cold means that I can't taste a damned thing, not even bacon. So now I know what it's like to be Jewish. Being Jewish sucks.

And today, I am still coughing up occasional lumps of lung and making odd bubbling noises in my chest, although my nasal demons seem to be Snotting less than they were, so hopefully I'll be back to normal tomorrow.

Dave's Free Press: Devel::CheckLib can now check libraries' contents

Devel::CheckLib has grown a new feature. As well as checking that libraries and headers exist, it can now also check that particular functions exist in a library and check their return values. This will be particularly useful if you need to check that a particular version of a library is available.

It works if you have a Unixish toolchain. I need to wait for the CPAN-testers reports to see if I managed not to break anything on Windows. Unfortunately, even though the lovely Mr. Alias has worked hard to make Windows machines available to developers, I found it to be just too hard to use. Even downloading my code to the Windows machine was hard, as Windows seemed to think it knew better and shouldn't download the file I told it to download. Then once I had downloaded it, Windows decided to hide it somewhere that I couldn't get to using the command line. So I gave up.

I might try again once there are some decent tools on the machines: wget, tar, and gzip at minimum, as given those I can quickly bootstrap anything else. Software development isn't just about having compilers available.

Dave's Free Press: POD includes

One of my CPAN distributions is CPAN-FindDependencies. It contains a module CPAN::FindDependencies, and a simple script that wraps around it so you can view dependencies easily from the command line. That script, naturally, has a man page. However, that manpage basically says "if you want to know what arguments this program takes, see the CPAN::FindDependencies docs". This is Bad from a usability point of view, good from a not-duplicating-stuff point of view, and good from a laziness point of view. Which means that it's Bad.

So, the solution.

=over

#include shared/parameters

=back

and some Magic that does the cpp-stylee substitution at make dist time. Note the 'dist' section in my call to WriteMakefile.

This is, of course, crying out to be made less horribly hacky, but it works for now, so I'm happy.

My original idea was to write some crazy shit that would do the #include at install-time, when the user was installing my code. But that has the disadvantage that tools like search.cpan wouldn't show it properly, as they simply look at the files in the distribution. So this does the #includes at the last moment just before I package up the code and upload to the PAUSE. You lovely people get the right documentation in all the right places, I only have to maintain it in one place so it stays in sync, and (in the interests of Laziness) I don't have to remember to run any extra scripts before releasing, make dist just Does The Right Thing.
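To make the idea concrete, here is a minimal sketch of what such a substitution step could look like. It is not the Magic actually used in the distribution: the function names expand_includes and slurp, and the choice to resolve paths relative to a base directory, are my own assumptions.

```perl
use strict;
use warnings;
use File::Spec;

# Read a whole file and drop its final newline, so that the included
# text slots in exactly where the "#include" line was.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;
    my $content = <$fh>;
    close $fh;
    $content =~ s/\n\z//;
    return $content;
}

# Replace every line of the form "#include some/path" in $pod with the
# contents of that file, looked up relative to $base_dir.
sub expand_includes {
    my ( $pod, $base_dir ) = @_;
    $pod =~ s{^#include[ \t]+(\S+)[ \t]*$}
             { slurp( File::Spec->catfile( $base_dir, $1 ) ) }gme;
    return $pod;
}
```

A make dist hook could then read each script, pass its text through expand_includes, and write the result into the distribution directory before it is tarred up.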

Dave's Free Press: YAPC::Europe 2007 report: day 1

As is becoming normal, I used the times between talks to bugfix some of my modules - this time Tie::STDOUT and Data::Transactional. The former was failing on perl 5.6, the latter on 5.9.5. The former was a bug in perl (you can't localise tied filehandles and expect the tieing to go away in 5.6, so it now declares a dependency on 5.8), the latter was a bug in my code.

Philippe Bruhat's talk on Net::Proxy was great - you can tell it's great because I came away with ideas for at least four things that I need to write. First up will be a plugin for it to allow the user to specify minimum and maximum permitted data rates for proxied connections. This will permit bandwidth limits for maximum permitted rates, but will also help to defeat IDSes doing traffic analysis if you specify a minimum permitted data rate.

This will protect (eg) ssh sessions from being identified based on their very bursty traffic pattern, by "filling in the blanks" with junk data.
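As a rough sketch of the arithmetic such a plugin would need, here is one way to plan a single time slice. Nothing here is Net::Proxy API: plan_slice and its named arguments are hypothetical, and rates are assumed to be in bytes per second.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# For one time slice, decide how many queued bytes may go out and how much
# junk padding is needed to stay at or above the minimum rate.
# All names are made up for this sketch; $arg{interval} is in seconds.
sub plan_slice {
    my (%arg) = @_;
    my $ceiling = $arg{max_rate} * $arg{interval};   # most we may send
    my $floor   = $arg{min_rate} * $arg{interval};   # least we must send
    my $real    = min($arg{queued}, $ceiling);       # genuine data sent
    my $padding = max(0, $floor - $real);            # junk to fill the gap
    return ($real, $padding);
}

my ($real, $padding) = plan_slice(
    queued   => 200,
    min_rate => 1_000,
    max_rate => 5_000,
    interval => 1,
);
print "send $real real bytes and $padding bytes of padding\n";
```

A quiet slice gets padded up to the floor, and a burst gets clipped to the ceiling with the remainder left queued, so the traffic seen on the wire stays inside a fixed band.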

In the evening, the CPAN-testers BOF was productive.

Dave's Free Press: XML::Tiny released

I have released my XML::Tiny module. The parser at its core is less than twenty lines of code. Pretty easy to follow code too, I think, and that also includes error handling. One of my aims in writing it was to keep memory usage and code to the absolute minimum, so it doesn't handle all of XML. The documentation says that it supports "a useful subset of XML". Personally, I think it supports the useful subset. It's certainly enough to parse the data I get back from Amazon when I use their web services, and to parse an RSS feed.

Dave's Free Press: Palm Treo call db module

To make up for a disappointing gap in Palm's software for the Treo smartphone, I wrote a <a href=http://www.cantrell.org.uk/david/tech/treo/call-dumper/>small perl script</a> to parse the database that stores my call history. I then re-wrote it as <a href=http://search.cpan.org/search?query=Palm%3A%3ATreoPhoneCallDB>a re-useable module</a> which also figgers out whether the call was incoming or outgoing.

Perlgeek.de : Perl 6 Tidings from October 2009

These tidings posts seem to become rather sparse, but I hope you get some news by reading the Planet Six feed aggregator anyway.

Specification

  • Larry removed the dual nature of Ranges. They now serve mostly as intervals; for smart iteration the series operator has been beefed up. You can now write for 1, 3 ... *+2, 9 { .say } to print all the odd numbers between 1 and 9 (r28344, r28348, r28351).
  • Rational and Complex types now have their own literals (r28173).
  • Stubbed classes are now documented (r28196, r28197).
  • The new S08 documents Parcels and Captures.
  • The numeric types have been cleaned up a lot (r28502 and later commits up to r28597).
  • New and improved signature introspection (r28664, r28665).

Compilers

Rakudo

As opposed to two months ago, Rakudo now

  • supports the Rat type
  • supports overloading of many built-in operators
  • has contextual variables
  • has a faster and much better signature binder
  • supports all kinds of trigonometric functions, including on complex numbers
  • implements sophisticated signature introspection

Patrick Michaud is also working on a new tool named nqp-rx, which combines a self-hosting NQP compiler and a new regex engine. It already supports proto regexes, NQP code assertions and closures, and is generally much faster and better than PGE.

Sprixel

Mathew Wilson aka diakopter started sprixel, a Perl 6 to Javascript compiler.

Mildew

Mildew now has an experimental javascript emitter.

Other matters

perl6.org is redesigned again, this time spanning multiple pages, thus allowing much more stuff to be linked there.

Four Perl 6 and Rakudo hackers announced that they are writing a Perl 6 book, the print release of which shall coincide with the release of Rakudo Star.

Perlgeek.de : Why was the Perl 6 Advent Calendar such a Success?

I think it's not too bold to call the Perl 6 Advent Calendar a full success: over 40k page views, more than 150 non-spam comments, and lots of new faces and nicknames in #perl6 - it's been a very pleasant surprise.

I asked myself why it became such a success, and came up with this list of things:

Limited in Scope

24 days are over rather quickly, so everyone who contributes knows when the deed is done.

It appeals to very different contributors

Some people like the shiny new regex and grammar features, others are interested in the object model, operator overloading or complex numbers. Since we had no fixed topics, everybody could choose a topic that played to their strengths.

Each task is small and well limited

Writing a good post about a topic you know takes about half an hour up to an hour, maybe two if you need to do some research on the topic. Either way it's a relatively small task (compared to "write a regex engine" or "redesign the build system of $compiler" or so).

Schedule and polite nagging

We had a schedule for each day, and people could add their name to a free slot. When the day approached, the other authors started to ask politely how it was progressing, making sure people did not forget their posts. Also authors felt the subtle pressure to finish their posts on time.

Lots of positive feedback

We had lots of feedback, most of it quite positive: blog comments, visitors in our IRC channel, being featured on slashdot. That was very encouraging.

Also other authors would preview and proof-read the posts before they were published, pointing out falsities and gems.

We had a driving force

PerlJam and colomon took care. They decided to make the advent calendar happen, set up a blog, discussed things and so on - they made it happen.


All these factors encouraged contributions. I don't think all of them are necessary for a successful, lively project, but they certainly help.

What other factors do you know that encourage contribution in open-source projects? What could we have done even better?

Dave's Free Press: Number::Phone release

There's a new release, <a href=http://www.cantrell.org.uk/david/tech/perl-modules/Number-Phone-1.58.tar.gz>version 1.58</a>, of Number::Phone, my set of perl modules for picking information out of phone numbers. Changes from the previous release are that Mayotte, Reunion and Comoros can't decide which country is which, and there's the usual updates to the database of UK numbers, mostly to support the <a href=http://www.ofcom.org.uk/media/news/2007/02/nr_20070213b>new 03 numbers</a>.

Perlgeek.de : Immutable Sigils and Context

If you have an array @a and want to access the first element, in Perl 5 you write that as $a[0], in Perl 6 you write it as @a[0]. We call the former mutable sigil, and the latter immutable sigil.

You might think that's a small change, but the implications are rather deep, and we've had quite a few discussions about it in #perl6. In particular people often ask if it's possible to backport the Perl 6 behavior to Perl 5. The answer is "not easily".

In Perl 5 context propagates inwards, which means that in a statement like

... = func()

The compiler wants to know at compile time which context func() is in. If it doesn't, it complains:

$ perl -ce '(rand() < 0.5 ? $a : @a) = func()'
Assignment to both a list and a scalar at -e line 1, at EOF
-e had compilation errors.

This also means that, in Perl 5, array slices and scalar array accesses have to be syntactically distinguished:

my @a;
$a[other_func()] = ...; # scalar context
@a[other_func()] = ...; # list context

So you can't just make sigils in Perl 5 immutable without also rewriting the whole context handling rules.
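To see the sigil choosing the context in Perl 5, here is a small sketch using the wantarray builtin. The helper name context is made up for the example, and a two-element slice is used so that perl does not warn about a one-element slice.

```perl
use strict;
use warnings;

# Report which context the caller imposed on this call.
sub context { return wantarray ? 'list' : 'scalar' }

my @a;

$a[0] = context();           # element access: scalar assignment
my $from_element = $a[0];

@a[1, 2] = context();        # array slice: list assignment
my $from_slice = $a[1];

print "element assignment called it in $from_element context\n";
print "slice assignment called it in $from_slice context\n";
```

The only difference between the two assignments is the sigil and the index list, yet the function on the right-hand side is called in a different context each time, and that decision is made when the statement is compiled.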

In Perl 6 that's not a problem at all, because functions don't know the context they're in, in fact can't know because of multi dispatch.

Instead functions return objects that behave appropriately in various contexts, and the context is determined at run time.

After getting used to it the immutable sigils are quite nice, and less complicated when references are involved. Anybody who objects without having tried it for at least three weeks, and is spoiled by Perl 5, will be shot.

Dave's Free Press: CPANdeps

<a href=http://cpandeps.cantrell.org.uk/>CPANdeps</a> now lets you filter test results by perl version number, and also knows what modules were in core in which versions of perl. Hurrah!

Dave's Free Press: YAPC::Europe 2007 report: day 2

A day of not many talks, but lots of cool stuff. Damian was his usual crazy self, and MJD's talk on building parsers was really good. Although I probably won't use those techniques at work as functional programming seems to scare people.

The conference dinner at a Heuriger on the outskirts of Vienna was great. The orga-punks had hired a small fleet of buses to get us there and back, and one of the sponsors laid on a great buffet. The local wine was pretty damned fine too, and then the evening degenerated into Schnapps, with toasts to Her Majesty, to her splendid navy, and to The Village People.

It wasn't all debauchery in the evening though - on the bus, I had a very useful chat with Philippe about Net::Proxy, and re-designing it to make it easier to create new connectors for it.

Dave's Free Press: YAPC::Europe 2006 report: day 3

There were quite a few interesting talks in the morning, especially Ivor's one on packaging perl applications. Oh, and mine about rsnapshot, of course, in which people laughed at the right places and I judged the length of it just right, finishing with a couple of minutes left for questions.

At the traditional end-of-YAPC auction, I avoided spending my usual stupid amounts of money on stupid things, which was nice. Obviously the hundred quid I put in to buying the hair style of next year's organisers wasn't stupid. Oh no. Definitely not.

An orange mohican will suit Domm beautifully.

Perlgeek.de : We write a Perl 6 book for you

We want a Perl 6 book. We want it badly enough to write it ourselves. So that's what we're doing: writing one.

We, that is Patrick Michaud (architect of the Rakudo Perl compiler), Jonathan Worthington (prolific contributor to both Rakudo and Parrot), Carl Mäsak (frenetic Rakudo user, and our number one bug finder) and Moritz Lenz (keeper of the Perl 6 test suite, and Perl 6 user and blogger). We are also open to contribution from others - already Jonathan Scott Duff has written an initial preface for us.

We don't have a name yet for our book. We want to cover the basics of Perl 6, enough to get your feet wet, and enough to make you want to use it. We want it to be based on useful examples. It is not going to be the definitive book, that task we leave to Larry Wall and Damian Conway.

Our vision is to present primarily the subset of Perl 6 that Rakudo understands, and have printed copies available by the time Rakudo Star is released, that is April or May 2010. chromatic and Allison Randal have kindly offered to publish it via Onyx Neon Press.

Until then, monthly releases will be published under a Creative Commons license (noncommercial, attribution, share-alike).

Currently we have four chapters under construction, and the intention of writing the more introductory chapters later, when we know what we need to introduce for the later chapters. So far we have

  • Multi dispatch
  • Classes and Objects
  • Regexes
  • Grammars

You can download the preliminary PDF version of the book here.

Interested? Check out the git repository, and join us in irc://freenode.net#perl6book.

Dave's Free Press: YAPC::Europe 2007 report: day 3

My Lightning Talk on cpandeps went down really well, although as José pointed out, I need to fix it to take account of File::Copy being broken. I also need to talk to Domm after the conference is over to see if I can get dependency information from CPANTS as well as from META.yml files.

There were lots of other good lightning talks. Dmitri Karasik's regexes for doing OCR, Juerd Waalboer's Unicode::Semantics, and Renée Bäcker's Win32::GuiTest were especially noteworthy.

Richard Foley's brief intro to the perl debugger was also useful. Unfortunately Hakim Cassimally's talk was about debugging web applications, which I'd not noticed on the schedule, so I didn't stay for that.

And finally, Mark Fowler's grumble about why perl sucks (and what to do about it) had a few interesting little things in it. I am having vaguely sick ideas about mixing some of that up with an MJD-stylee parser.

At the auction I paid €250 to have the Danish organisers of next year's YAPC::Europe wear the Swedish flag on their foreheads. This, I should point out, was Greg's idea. I would never be so evil on my own.

Dave's Free Press: Perl isn't dieing

Perl isn't dieing, but it tells me that it wishes it was. Last night it went out on the piss with Python and Ruby (PHP was the designated driver) and it did rather too many cocktails. It isn't quite sure what happened, but it woke up in the gutter in a puddle of its own fluids and its head hurts a lot.

It asked me to ask you all to keep the volume down.

Dave's Free Press: YAPC::Europe 2007 travel plans

I'm going to Vienna by train for YAPC::Europe. If you want to join me you'll need to book in advance, and probably quite some way in advance as some of these trains apparently get fully booked.

             arr    dep    date
Waterloo            1740   Fri 24 Aug
Paris Nord   2117
Paris Est           2245
Munich       0859   0928   Sat 25 Aug
Vienna       1335

The first two legs of that are second class, cos first wasn't available on Eurostar (being a Friday evening it's one of the commuter Eurostars and gets booked up months and months in advance) and was way too spendy on the sleeper to Munich. Upgrading to first class from Munich to Vienna is cheap, so I have.

Coming back it's first class all the way cos upgrading was nearly free ...

             arr    dep    date
Vienna              0930   Fri 31 Aug
Zurich       1820
Zurich              1402   Sun 2 Sep
Paris Est    1834
Paris Nord          2013
Waterloo     2159

Don't even think about trying to book online or over the phone, or at the Eurostar ticket office at Waterloo. Your best bet is to go to the Rail Europe shop on Piccadilly, opposite the Royal Academy and next to Fortnums.

Perlgeek.de : The Perl 6 Advent Calendar

In the great tradition of Perl Advent Calendars, colomon started and announced the 2009 Perl 6 Advent Calendar, with a post about Perl 6 each day.

After the first post many #perl6 regulars volunteered to contribute a post, so 20 of the 24 days are already allocated.

I'm looking forward to many nice posts, most of which will probably highlight a small Perl 6 feature.

Perlgeek.de : Defined Behaviour with Undefined Values

In Perl 5 there is the undef value. Uninitialized variables contain undef, as well as non-existing hash values, reading from unopened or exhausted file handles and so on.
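A short Perl 5 sketch of three of those sources. The variable names are arbitrary, and an in-memory file stands in for an exhausted file handle.

```perl
use strict;
use warnings;

my $uninitialised;                       # declared but never assigned

my %colour  = (sky => 'blue');
my $missing = $colour{grass};            # key does not exist

my $empty = '';
open my $fh, '<', \$empty or die "cannot open in-memory file: $!";
my $line = <$fh>;                        # nothing left to read
close $fh;

for my $case ([uninitialised => $uninitialised],
              [missing       => $missing],
              [line          => $line]) {
    my ($name, $value) = @$case;
    printf "%-13s %s\n", $name, defined $value ? 'defined' : 'undef';
}
```

All three variables end up holding the same undef value, whatever produced it.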

In Perl 6 the situation is a bit more complicated: variables can have a type constraint, and are initialized with the corresponding type object:

my Int $x;
say $x.WHAT();      # Int()

These type objects are also undefined, but in Perl 6 that doesn't mean they are a magical value named undef, but that they respond with False to the defined() subroutine and method.

In fact there is no undef anymore. Instead there are various values that can take its place:

Mu is the type object of the root type of the object hierarchy (or put differently, every object in Perl 6 conforms to Mu). It's the most general undefined value you can think of.

Nil is a "magic" value: in item (scalar) context it evaluates to Mu, in list context it evaluates to the empty list. It's the nothing to see here, move along value.

Each type has a type object; if you want to return a string, but can't decide which, just return a Str.

Other interesting undefined values are Exception (which usually contains a message and a back trace) and Failure (an unthrown exception). Whatever is a generic placeholder that can stand for "all", "infinitely many" or "many", or stand in for a real value.

Perlgeek.de : Is Perl 6 really Perl?

A few days ago masak blogged about the social aspect of the is Perl 6 really Perl? question.

He presumes that the answer is yes, but doesn't tell us why. I'll try to give some reasons.

Perl 6 started as the successor to Perl 5

Perl 6 started off as the successor to Perl 5, at a Perl 5 meeting, by the Perl crowd. It was a plan to escape both the backwards compatibility trap (which meant that broken things couldn't be fixed without many people yelling), and the lack of momentum in the community.

Perl 6 embraces the Perl philosophy

What makes Perl Perl? In my opinion it's not the sigils on variables that make Perl Perl, or that writing a regex only needs two characters and so on. It's mostly the philosophy that makes the difference.

There are some underlying principles like TIMTOWTDI, context sensitivity, convenience over consistency, making simple things easy and hard things possible, often used constructs short and less frequent things longer, and so on.

Perl 6 is founded on all those philosophies and ideals, and also shares some technical principles. For example sigils on variables (oh, I mentioned them already ...), easy access to powerful regexes, and the fact that operations coerce their arguments (instead of the type of the arguments determining the operation like in javascript, where a + can either mean addition or string concatenation).

So if you agree with my definition of what makes Perl Perl, Perl 6 is also Perl. If not, please tell me what's the essence of Perl!

Perlgeek.de : Doubt and Confidence

<meta>From my useless musings series.</meta>

As a programmer you have to have confidence in your skills, to some extent, and at the same time you have to constantly doubt them. Weird, eh?

Confidence

You need some level of confidence to do anything efficiently. Planning ahead requires confidence that you can achieve the steps on your way.

As a programmer you also need some confidence with the language, libraries and other tools you're using.

If you program for money, you also have to assess what kind of programs you can write, and where you might have problems.

Doubt

In the process of programming you make a lot of assumptions, some of them explicit, some of them implicit. If you want to write a good program, it's essential that you are aware of as many assumptions as possible.

When you find a bug in your program, you have to challenge previous assumptions, and that's where doubt comes in. You not only suspect, but you know that at least one of the assumptions was false (or maybe just a bit too specific), and you know that you did something wrong.

Sometimes programmers make really stupid mistakes which are rather tricky to track down. That's when you have to question your own sanity.

One example (that luckily doesn't happen all that often to me) is when I edit my program, and nothing seems to change. Nothing at all. Depending on the setup it might be some cache, but sometimes it is even more devious - for example I didn't notice that the console where I edit and the console where I test are on different hosts - and thus the edits actually have no effect at all.

After having done such a thing once or twice I adopted the habit of just adding a die('BOOM'); instruction to my code, to verify that the part I'm looking at is actually run.

These are moments when I question my own sanity, thinking "how could I have possibly done such a stupid thing?". Doubt.

The same phenomenon applies when doing scientific research: since you usually do things that nobody has done before (or at least nobody has published about yet), you can't know the results beforehand -- if you could, your research would be rather boring. So you have no external reference for verification, only your intuition and discussion with peers.

Perlgeek.de : Perl 6: Lost in Wonderland

When you learn a programming language, you not only learn about the syntax, semantics and core libraries, but also the coding style and common idioms.

Idioms are common usage patterns; learning and reusing them means you have to spend less time thinking on common things, and have more time working out the algorithms you deal with.

That's different if you learn Perl 6 - it's a largely unexplored field, and while there are loads of nice features, you might still feel a bit lost. At least I do. That's because I often think "There's got to be a much easier way to achieve $this", but it often takes time to find that easier solution - because nobody has developed an idiom for it yet.

In those cases it helps to ask on the #perl6 IRC channel; many smart people read and write there, and are rather good in coming up with simpler solutions.

For example see masak's ROT13 implementation in Perl 6. In the right column you can see later revisions, and how they gradually improve, steadily shrinking down to a one-liner.

I also made some simplifications to JSON::Tiny, which basically shows that when I wrote these reduction methods first I used Perl 6 baby talk language.

The nice thing about exploring the Perl 6 wonderland of unexplored idioms is that it really boosts your ego if you find a nice simplification, and that you have something to blog about for the Planet Perl Iron man ;-)

Dave's Free Press: Wikipedia handheld proxy

I got irritated at how hard it was to use Wikipedia on my Treo. There's so much rubbish splattered around their pages that it Just Doesn't Work on such a small screen. Given that no alternatives seemed to be available - at least, Google couldn't find any - I decided to write my own Wikipedia handheld proxy.

It strips away all the useless rubbish that normally surrounds Wikipedia pages, as well as things like the editing functions which are also hard to use on portable devices. Internally, it's implemented using perl, LWP, and mod_perl, and is hosted by Keyweb.de.

Dave's Free Press: CPANdeps upgrade

While you won't notice any changes, there have been biiiig upgrades at CPANdeps. Here's the diff.

Until now, it's used a SQLite database of test results that I downloaded every day and then mangled a bit to do things like add some necessary indices, figure out which reports are from dev versions of perl, and so on. That worked really well back in the summer of 2007, when there were only half a million reports in the database. I started worrying a bit at the beginning of 2009 when we hit 3 million, but the update happened overnight so I didn't care. But now that we've got over 6 million reports, the update would take anywhere between 8 and 14 hours. Not only is that not sustainable given the current growth rate, it also hurts the other users on that machine, because almost all of that time is spent waiting for disk I/O - which means that they're also waiting for the disk. On top of that, when you have big databases, a SQLite CGI ain't a great idea because indices have to be fetched from disk every time, so reads pound the disk too. Doubleplusungood!

Fun fact: SQLite is great for prototyping, but it doesn't scale :-)

So now it uses MySQL. Having a database daemon running all the time means that there's now some caching, so reads are quicker. In addition, given that I can't just simply fiddle with the structure of the database that I download to produce what I want, and instead have to import the data into MySQL, it now only imports new records, so the daily update takes only a few seconds.
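The "only import new records" idea boils down to remembering a high-water mark. The sketch below is not the CPANdeps code: it assumes every report carries an increasing numeric id, and it uses a plain array where the real thing talks to a database.

```perl
use strict;
use warnings;

# Copy only records newer than the highest id already imported, and return
# the new high-water mark. The "id" field and the in-memory store are
# stand-ins chosen for illustration.
sub import_new {
    my ($source, $store, $last_id) = @_;
    for my $record (sort { $a->{id} <=> $b->{id} } @$source) {
        next if $record->{id} <= $last_id;    # already imported earlier
        push @$store, $record;
        $last_id = $record->{id};
    }
    return $last_id;
}

my @store;
my $last = import_new([ { id => 1 }, { id => 2 } ], \@store, 0);
$last    = import_new([ { id => 1 }, { id => 2 }, { id => 3 } ], \@store, $last);
print scalar(@store), " records stored, last id $last\n";
```

The second run sees all three records in the download but stores only the one it has not seen before, so the amount stored tracks the number of new reports rather than the size of the whole database.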

I also re-jigged the structure of how it caches test results. Instead of being all in one directory with hundreds of thousands of files, they're split into a hierarchy. This probably won't have any significant effect on normal operations, but it will certainly make it faster for me to navigate around and see what's going on when people submit bug reports!
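One common way to build such a hierarchy is to fan files out by the leading characters of a digest of the cache key. The post doesn't say which scheme CPANdeps uses, so treat the function name, the key format and the two-level MD5 layout below as assumptions.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;

# Map a cache key to root/ab/cd/abcd..., so that no single directory has to
# hold every cached result. The layout is a guess, not CPANdeps' own.
sub cache_path {
    my ($root, $key) = @_;
    my $digest = md5_hex($key);
    return File::Spec->catfile(
        $root,
        substr($digest, 0, 2),    # first level: 256 possible directories
        substr($digest, 2, 2),    # second level: 256 more under each
        $digest,
    );
}

print cache_path('cache', 'Foo-Bar-1.23'), "\n";
```

With two levels of 256 directories each, a few hundred thousand files work out to a handful per leaf directory.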

Perlgeek.de : Perl 6 in 2009

Much has happened in Perl 6 land in 2009. Here is my humble attempt to summarize some of it; if you find something that I missed, feel free to contact me, and I'll try to add it.

Specification

The year started with lots of improvements to S19. In January we also learned that *-1 constructs a closure, which means that Perl 6 has semi-automatic currying features built into most operators.

Lists, Captures and Parcels

We've seen a lot of talk about slices, lists, captures and parcels. The heart of the discussions is always how interpolation and non-interpolation of lists can be made both flexible and intuitive. For example: should 1, 2, 3 Z 'a', 'b', 'c' return a single, flat list? Or instead a list of lists? How can a function which receives the result decide for itself what it wants to receive? How does that mix with multi-dimensional arrays?

I haven't followed these discussions very closely, and so I'm hard pressed to give a good summary; however it seems that in the end an agreement was reached: each parenthesis constructs a Parcel, short for Parenthesis cell. A Parcel can behave context sensitively: a single-item Parcel degrades to its contents; as a signature list it is converted to a Capture object; code objects also return Parcels.

It remains to be seen how multi-dimensional slices (with the @@ sigil) evolve, and whether we can find something suitable to replace them.

Built-in Routines

S29, the list of built-in functions and methods, finally got some long awaited attention in 2009, starting with Carl Mäsak's S29 Laundry List, and later carried on by Timothy Nelson, who split S29 into a set of documents summarized as S32.

In December it was decreed that most built-in methods have a candidate in a new class Cool (Convenient OO Loopbacks), from which all value types and container types in Perl 6 inherit. That way maximal DWIMyness can be retained, while keeping user-defined types clean of the more than a hundred methods defined in Cool.

It is rather perlish to have a distinct name for each operation, and make it coerce its arguments. A few exceptions exist in Perl 5 (like reverse, which is list reverse in list context, and string reverse in scalar context); in Perl 6, most of these exceptions have been removed: reverse now only reverses lists, strings are reversed with flip, hashes with invert.
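For reference, this is how the Perl 5 reverse exception plays out. It is a sketch with made-up variable names, showing the three jobs that Perl 6 splits into reverse, flip and invert.

```perl
use strict;
use warnings;

my @words  = reverse 'hello', 'world';    # list context: reverse the list
my $string = reverse 'hello', 'world';    # scalar context: join the arguments, reverse the characters

my %by_name  = (one => 1, two => 2);
my %by_value = reverse %by_name;          # reversing the key/value list inverts the hash

print "@words\n";                         # world hello
print "$string\n";                        # dlrowolleh
print join(',', map { "$_=$by_value{$_}" } sort keys %by_value), "\n";   # 1=one,2=two
```

Same function name, three rather different operations, and which one you get depends only on the context and on what you feed it.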

At the Nordic Perl Workshop, Larry decided that the prefix:<=> operator had to go, and replaced it with the .get and .lines methods.

Operators

The Cross Meta Operator is now Xop instead of XopX; in analogy the R meta operator reverses the argument list, so $a R- $b is the same as $b - $a.

Ranges served two purposes: denoting ranges in the sense that mathematicians use them, and generating lists according to simple schemes. These two functions have been separated: ranges are still constructed with two dots, but the :by adverb is gone; more intricate, lazy list generation can be achieved with the new series operator:

.say for 1, 1.1, 1.2 ... 5;
.say for 1 ... *+0.1, 5;

Numbers

The above actually works, and doesn't suffer from floating-point rounding errors, because 0.1 isn't stored as a floating-point number, but rather as a fractional number of type Rat.

Other languages decided against that approach, because some very simple loops quickly produce rather large numerators and denominators, degrading performance of the integer operations. Perl 6 instead has a limit in denominator size, and falls back to floating-point operations when that limit is crossed.
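The difference is easy to reproduce in Perl 5, where 0.1 is a floating-point number, by comparing it with exact rational arithmetic from the core Math::BigRat module. This is only an analogy for what Rat does, not how Rakudo implements it.

```perl
use strict;
use warnings;
use Math::BigRat;

# Add one tenth ten times, once with native floats ...
my $float = 0;
$float += 0.1 for 1 .. 10;

# ... and once with exact fractions.
my $exact = Math::BigRat->new(0);
$exact += Math::BigRat->new('1/10') for 1 .. 10;

printf "float sum equals 1: %s\n", $float == 1 ? 'yes' : 'no';
printf "exact sum equals 1: %s\n", $exact == 1 ? 'yes' : 'no';
```

On a perl built with ordinary IEEE doubles the float sum comes out just short of 1, while the rational sum is exactly 1. The price is the one described above: numerators and denominators can grow, and the arithmetic is slower.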

Implementations

Rakudo

A lot of work has been done in Rakudo; in fact it's hard to remember how it used to be in January 2009. Most features were implemented by Patrick Michaud and Jonathan Worthington, but we had a lot of other contributors too.

In January, Rakudo left the Parrot repository and since then lives on github as a git repository. It now relies on an installed parrot.

Rakudo implements many new features and lifts old limitations:

  • Many built-in routines are now written in Perl 6
  • eval() and classes now have access to outer lexical variables
  • Much improved Unicode support, both in IO and regular expressions
  • punning of roles when .new is called
  • Typed arrays and hashes, parametric roles
  • Routine return types are now enforced
  • Error messages now contain backtraces with filenames and line numbers
  • Multi dispatch is now implemented with a custom dispatcher and signature binder, bringing many improvements over the dispatch and binding semantics that parrot supports.
  • User-defined operators are now possible, and automatically generate some of their associated meta-operators.
  • Contextual variables
  • User-defined traits are now possible; some of the built-in traits are now written in pure Perl 6.
  • Rational numbers are now implemented, and support for Complex numbers has been much improved.
  • routine signatures can now be introspected properly.

SMOP and Mildew

SMOP and Mildew have seen a major refactoring, connected to the changed semantics of slices, captures and parcels, and to the way method invocations are stored.

Paweł Murias implemented multi dispatch as a Summer of Code project. Mildew now supports an impressive set of features, but since it is not very user oriented, I know of no projects that actually use mildew as a platform.

Other implementations

Elf development seems to have stalled. Pugs mostly sleeps, too, though Audrey updated it to work with the latest Haskell compilers. (It doesn't live in the Pugs repository anymore though, and is distributed by cabal, the Haskell package manager).

New in the field are Sprixel, a Perl 6 to Javascript compiler, and vill, an experimental LLVM backend to STD.pm+viv.

Test Suite

The test suite continued to grow; most tests have now been moved to t/spec/, the official Perl 6 test suite. Most tests in the other remaining files are either rather dubious, or rely on behaviour that's not officially specified (or are specific to an implementation).

Many new tests have been contributed by two new faces: Solomon Foster contributed a large number of tests for trigonometric functions on the various number types, and rational and complex numbers. Kyle Hasselbacher provided us with many regression tests for Rakudo which are also useful to other implementations.

Documentation

Bemoaning the fact that Perl 6 has nearly no user-level documentation, Carl Mäsak started u4x, User-Level Documentation for X-Mas. Hinrik Örn Sigurðsson chimed in, and started to write grok, a tool for retrieving and showing documentation, sponsored by the Google Summer of Code project.

Patrick Michaud, Jonathan Worthington, Carl Mäsak, Jonathan Scott Duff and Moritz Lenz started to work on a Perl 6 book, with a few chapters already being written.

Websites

In an attempt to provide an up-to-date link list, Moritz registered perl6-projects.org and collected links. Later Susanne "Su-Shee" Schmitt contributed a nice design, and Daniel Wright made the domain perl6.org available to us.

So we now have a community driven, central Perl 6 site at perl6.org.

Leo Lapworth redesigned perl.org, and also the old Perl 6 development page, and updated it a bit.

Blogs

In an attempt to improve the visibility of the Perl community, Matt S. Trout issued the Ironman Perl Blogging Challenge. So far it's a huge success, and quite a few hackers blog about Perl 6 there. The blog roll of the Planetsix Blog Aggregator also continued to grow; some excellent new blogs were added in 2009.

Carl Mäsak blogged at least once per day in November, same procedure as last year :-)

IRC

The #perl6 IRC channel has been very pleasant and active in 2009, with three times the activity of 2008.

The Future

For April 2010 the Rakudo developers have planned a big release called Rakudo *, not feature complete but still useful and usable. Around the same time the new Perl 6 book will be released.

The specification is still evolving, and has some areas that are in need of implementation before they can evolve more; among them are macros, concurrency and IO.

Update: improved floating point example as per comment from Matthias.

Header image by Tambako the Jaguar. Some rights reserved.