An experiment-driven guide to Perl (might.net)
55 points by sea6ear on Feb 11, 2014 | 43 comments



Full disclosure: I'm the guy who made http://perl-tutorial.org a few years ago because the top google result for "perl tutorial" was a perl 4 tutorial. I have looked at many tutorials and have a vested interest in getting quality tutorials in people's hands to avoid them writing shitty perl.

That said, this tutorial is terrible on a number of points, since it teaches outdated things that have long been known to be dangerous and are only kept around for the sake of backwards compatibility. Reading it is wholly a waste of your time, unless you already know perl like the back of your hand and wish to get enraged; or have the masochistic desire to learn perl in a manner that will punish both yourself and others for your mistake of reading this tutorial.

If you truly wish to learn about Perl in a whirlwind tour, read either the very short free book Modern Perl [1] or any other short tutorial linked on the site i mentioned first.

If you're the author of this tutorial, i applaud you for the effort, but wish you'd have spoken to any part of the community before publishing. If you feel like it, #perl-help on irc.perl.org is a great place to start. And if you meant this as a troll, 10/10, would rage anytime.

[1] http://onyxneon.com/books/modern_perl/


Article author here.

To be clear, this is not a tutorial on writing good, idiomatic Perl. (And I've strengthened the article's disclaimers to emphasize that.)

It's a semantic excavation of Perl.

My goal was to understand how the Perl interpreter thinks, and to answer language design questions like: How are parameters passed--by value, by alias, by name, by reference, by need? How are variables scoped--lexically, dynamically, globally? What is the effect of @ in a prototype? For the .. operator, how is the implicit toggle scoped--at the procedure or the nearest enclosing block? How do prototypes influence context, and how do contexts influence evaluation?
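
For readers who want to poke at two of those questions themselves, here is a minimal sketch of the kind of probe involved. It is not taken from the article; the sub and variable names are made up for illustration. It shows that `@_` aliases the caller's arguments, and that `local` gives dynamic rather than lexical scope.

```perl
use strict;
use warnings;

# @_ aliases the caller's arguments: writing to $_[0] writes through.
sub bump_in_place { $_[0]++; return }

# Copying into a lexical first gives ordinary by-value behaviour.
sub bump_copy { my ($n) = @_; $n++; return $n }

my $counter = 1;
bump_in_place($counter);                  # $counter is now 2
my $copy_result = bump_copy($counter);    # returns 3, $counter stays 2

# `local` is dynamic scoping: subs called from here see the temporary value.
our $level = 'outer';
sub show_level { return $level }
sub with_local { local $level = 'inner'; return show_level() }

my $seen_inside = with_local();           # 'inner'
my $seen_after  = show_level();           # 'outer' again once with_local returns
```

Most real code copies `@_` into `my` variables straight away, which is why the aliasing rarely shows up outside experiments like this.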

That is, I wanted to understand what was possible. The possible is entirely separate from the good.

As a formal semanticist, I was continually surprised by how Perl behaved.

As someone that has had to occasionally debug other people's Perl, I believe there is value in understanding the syntactic and semantic quirks in the language.

Thanks for your comments.

I'll be updating the article with your feedback.


I can dig it. By analogy something like "If you insist on installing deck screws with a hammer instead of a screwdriver, this is what happens with a sledgehammer, this is what happens with a large heavy rock, this is what happens if you use your fist..." It is interesting, rather than terrifying, when seen in that light.

However, Google is going to google, that's its thing, so some victim in the future might think this is the one true answer to using objects in perl, which is not cool.


It might also be useful to add to the disclaimers that many of these things that aren't explicitly documented can be in flux. Parameter passing, for example, is currently being discussed and may change (experimentally and in a backwards compatible fashion) between the upcoming 5.20 and next year's 5.22 release. It also may not, but I know it's been actively discussed this month.


In that case it would be nice if you actually restricted yourself to posing questions, answering them with experimental code and output, then discussing the result; as opposed to explaining the basics of scalars or posting patently false things like this:

  > For example, the following statement prints to the console:
  > 
  > print Hello, world! ;    # prints Hello, world!

Pretty much half of that article should be deleted, and if the rest was discussion of interesting behaviors in a scientific manner, that would actually be interesting.


I haven't read the article completely, but I've enjoyed what I've read so far. I'd like to follow along in more detail with a perl interpreter. I'd love to see an entire book that completely tears down the entire interpreter from the perspective of within.

I purchased Ruby Under a Microscope, but I was kind of put off by the similar issues other commentators are having, that it appears to be a basic introduction to how to write Ruby, but that is just how the task has to be approached (based off your existing assumptions and test to see if they hold up). But when the task is complete, you have knowledge on a better way to write Ruby, by accounting for all the exceptions that you can't "see" in the source code, due to leaky abstractions.


I posted this article. I believe that the author, Matt Might, is a professor specializing in programming language theory at the University of Utah.

My understanding of this article was that it was less of a tutorial and more of an analysis of the actual semantics of Perl. Thus it is less about how one should write "Modern Perl" than it is about how Perl actually behaves in various circumstances.


Right, but when there's more than one syntax to use to get that behaviour, he invariably picks the c.2001 preference rather than what we've done since.

It's like if somebody started their ruby tutorial off with how to implement your own object system with method_missing.

Sure, it fits the 'actual semantics', but it's not what we really want people to see first :)


I'd actually enjoy that :) However, I first really understood objects from learning how to create them from closures in Scheme. And I'm not sure how many object systems built from array reference hacks in Tcl I've looked at.

But probably not what most people are probably looking for when they are first trying to understand something like Perl or Ruby.


Actually (and sadly), for some jobs with Perl that involve codebases older than 2007, that material would be useful, because, like the languages you mention, deriving object systems from Perl 5's rough axioms was at one point something of a hobby for people.


Over the course of the article, I tried to show as many different ways of doing something as possible, without assigning judgment as to the "right" way to do something.

If you point out instances where I only documented the old syntax, let me know, and I'll add newer examples as well.


The point is that in every single topic you touch where there is a more modern way to do it, if you mention that modern way, you do it as an aside; instead of first showing the modern way of doing it, and then mentioning as an aside that there is an insane way of doing it.


Wow. Author asks for bug reports ... you explain it's all buggy and don't provide a single example.

For your sins, I'm going to see if I can convince the author to get it into a repository and then I'm going to give you a commit bit ... :D


You're exactly right.

An alternate theme for this article might be: "What happens when a formal semanticist looks at Perl?"


That would be a better title than "Learn Perl by experiment" or "perl-by-example" or "A Guide to Perl: By Experiment" all of which suggest a tutorial, which you're saying is not what you're trying to present.


If it's not a tutorial then i don't understand why he explains EXTREMELY basic things in the worst possible way, fails to actually explain the interesting details of subtle and complex things (like the differences between the ways a sub can be called (no, they do NOT all do the same thing)), and managed to produce a document that looks like it contains literally ALL the things from ALL the bad tutorials i've seen in the past 9 years, with nothing from any of the good ones.


It appears that you've decided it must be a tutorial, and your only basis for that is that it's not a tutorial. That's extremely uncharitable, almost insanely so. Especially since the word "tutorial" appears nowhere in the text. The only reference to learning or teaching is a list of resources for actually learning Perl.


Actually, he does appear to explain &foo() later on, he just doesn't bother mentioning up front that it's completely different.

I think if you consider it as an academic piece, where you're supposed to read all of it and then think, it wouldn't be a bad introduction to perl as it was written in 2003.


Okay, I've read a few of Might's posts, and they're pretty interesting. But it irks me for some reason that nowhere in the posts can I find a date. Am I missing something? Where did you grab this 2003?


From the code. Were I to parenthesise the sentence, "perl as it was written in 2003" would be one clause.


>have a vested interest in getting quality tutorials in people's hands to avoid them writing shitty perl.

Thank you!

I did have the idea of starting a more general site to highlight this problem (wrongtutorial.com) but I couldn't find the right approach.

Good tutorials are hard to find - and the amount of obsolete or just plain wrong information is staggering.


as a fellow perl'er (but not knowing it like the back of my hand) and having to learn through hardship and failure and obscenely old tutorials, what key points here can you point out that are absolute no-no's and should be avoided?


    * very first program simply does not work
    * lack of strict
    * lack of warnings
    * lack of my
    * bareword filehandles
    * mentions &-calling of subs
    * thinks it's the same as normal calling
    * snowflake formatting style
    * mentions EXTREMELY outdated books as further reading
    * confuses capitalization of builtins in code examples
    * fails to explain compile phase semantics properly, instead introduces "use" as magical
    * quotes hash keys
    * explains prototypes as something that could be used in general
    * explains post-fix dereference syntax, but describes cumbersome circumfix syntaxes as default
    * 2-arg open instead of 3-arg open
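
To make a few of those items concrete, here is a rough sketch of what the non-outdated versions tend to look like: `strict`, `warnings`, `my`, a lexical filehandle, 3-arg `open`, and an unquoted hash key. The helper name `count_lines` is hypothetical, not something from the article.

```perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # suspicious constructs get reported

# Hypothetical helper: count the lines in a file.
sub count_lines {
    my ($path) = @_;

    # 3-arg open with a lexical filehandle: the mode is separate from the
    # filename, and the handle is closed when $fh goes out of scope.
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";

    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    close($fh) or die "Cannot close '$path': $!";
    return $count;
}

my %seen = (foo => 1);     # the fat comma quotes the key for you
my $flag = $seen{foo};     # simple identifiers need no quotes in subscripts
```

With the 2-arg form, a filename that starts with `>` or ends with `|` changes what `open` does, which is the main reason the 3-arg form is preferred.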

I'm halfway down, i can't be arsed anymore. I feel like i'm reading the Leeds Perl 4 tutorial all over again.


what's funny, i saw the lack of strict and thought...hrmm, well without that we'll see what happens here.


> very first program simply does not work

Guilty. Facepalm. Embarrassed. Fixed.

> lack of strict

Not in scope (it's not a tutorial on good Perl), but I'll mention it.

> lack of warnings

Added a general disclaimer in the abstract.

> lack of my

I documented `my` in the subsection on scoping disciplines.

I _tried_ not to use features before I'd introduced them.

And, for most of the "probes," `my` wasn't necessary.

> bareword filehandles

Good point. Added scalar filehandles, as well as how to pass bare words with typeglobs.

Changed most examples to scalar filehandles too.

In doing so, I stumbled across the implicit method invocation form that happens when the first argument to a procedure is an object, so I added an example of that too.

This is exactly the kind of "semantic surprise" that led me to start digging.

> mentions &-calling of subs

Of course.

It's possible, and it can change the semantics of procedure call.

> thinks it's the same as normal calling

I had documented the differences.

Look carefully: The procedure call example includes an error case.

In the parameter passing subsection, I had included a mention of how `&proc` (no args), receives current @_.
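
A small probe of that last point, as a sketch with made-up sub names: calling `&inner` with no parentheses hands the callee the current `@_`, while `inner()` passes an empty list.

```perl
use strict;
use warnings;

sub inner { return scalar @_ }          # how many arguments arrived?

sub outer_plain { return inner() }      # explicit empty argument list
sub outer_amp   { return &inner }       # no parens: reuses the caller's @_

my $plain = outer_plain(1, 2, 3);       # 0
my $amp   = outer_amp(1, 2, 3);         # 3
```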

> snowflake formatting style

Yep. Definitely not a style guide.

> mentions EXTREMELY outdated books as further reading

I added a link to Modern Perl (as suggested).

And, the new edition of Mastering Perl came out last week. I flipped through it, and it seemed updated.

> confuses capitalization of builtins in code examples

Bug. Fixed.

> fails to explain compile phase semantics properly, instead introduces "use" as magical

Guilty.

I thought about including this in the first revision, but I was nearing exhaustion. I'll add it later.

> quotes hash keys

It's legal.

> explains prototypes as something that could be used in general

I just explain what they are.

They're a part of the language, and they have important consequences for both parsing and interpretation.

> explains post-fix dereference syntax, but describes cumbersome circumfix syntaxes as default

I don't endorse either syntax as default.

> 2-arg open instead of 3-arg open

I'm not trying to document the library, or teach good use, but I added an example for 3-arg.

Thanks for your feedback!


Trying to do OO stuff without the mighty MOOSE? Whoa, retro, man. MOOSE came out about a decade ago. Awesome OO implementation, no point using any other technique.

He's calling "open" in a way that's been a no-no since like Clinton was prez, or at least a long time ago.

You can debate making filehandles plain ole variables or not. The cool kids do it a different way than he does, which is not necessarily wrong.

Backticks are looked at about the same way... so how exactly do you handle stdout/stderr separately with backticks, oh you don't, um... There's another, better way to safely call system stuff.
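
The comment doesn't name the better way, so treat this as one possible option rather than the intended one: the core module IPC::Open3 gives separate handles for stdout and stderr and, with the list form, never goes through a shell. The wrapper name `run_capture` is hypothetical.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';

# Hypothetical wrapper: run a command (list form, no shell) and return
# its stdout, its stderr, and its exit status as three separate values.
sub run_capture {
    my @cmd = @_;

    my $err = gensym;    # open3 cannot autovivify the stderr handle
    my $pid = open3(my $in, my $out, $err, @cmd);
    close($in);

    # Fine for small outputs; a child that fills the stderr pipe while we
    # are still reading stdout would block, so big jobs need select() or
    # a CPAN module instead.
    my $stdout = join '', <$out>;
    my $stderr = join '', <$err>;

    waitpid($pid, 0);
    return ($stdout, $stderr, $? >> 8);
}
```

Usage would look like `my ($out, $err, $status) = run_capture('ls', '-l', $dir);`, with `$dir` passed as a plain argument instead of being interpolated into a shell string.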

Also he seems to be missing all error detection / correction / recovery code in general, both in every example and as a general topic.

The Perl Cookbook was awesome... in 2003. The reference book you need is "Modern Perl" by chromatic, edited by Shane Warden, etc.

CPAN gets one mention at the end. That's wrong. The first thing you do when writing Perl is see what's out there to glue together. Also this is a fun way to learn stuff: rather than boring basic arithmetic or boring toy examples, you can sling XML all over creation using a parser, or all kinds of crazy stuff. Life's just a lot more fun with CPAN.

From a style perspective if Perl::Critic and/or perltidy disagree with you, and if you're a noob, you're doin' it wrong. I know when it's acceptable to disagree with Perl::Critic but a noob will not, noob should trust Perl::Critic. Perltidy is a little bit more flexible, Perl isn't whitespace controlled but if you get really weird no one is going to understand your code. So pipe it all thru perltidy, in vi it's "(esc):%! perltidy". Perltidy is also an interesting, although very forceful, way to find mismatched quotes and the like.


Ahh perl... the first language I used to make something non-trivial. The coolest script I wrote was a load balancer that used ssh to submit jobs and monitor the activity of nodes in a cluster via commands remotely executed by ssh. No root access needed, no servers to install, just needed to have an account accessible via ssh on the remote machine and ssh + perl on the machine you were working on. It was the simplest solution to the complex problem of "I have all of these computers, now how do I use them to their maximum potential?" Stuff like Linda existed then, but I found it way too complex in comparison to my simple scripts. Alas, I since moved onto the heady world of lisp, but I still use perl for the occasional $ perl -pe 's/this/that/' at the command line and as an alternative to bash scripting.

A merry Tim Toady to you all! Also, an obligatory xkcd: https://xkcd.com/224/


As a long time Perl programmer, I am a little disappointed with the lack of comments on this post.

I generally follow Matt Might's blog, and I am impressed by previous posts. Unfortunately, this tutorial is likely to leave beginners more confused than when they started. I have to encourage people to look at chromatic's free book "Modern Perl" instead.


> In Perl, there are three contexts in which an expression may be evaluated.

> 1. scalar

> 2. array

> 3. void

There's actually no such thing as "array context" in Perl; instead there's "list context". An array is a list that's been stored in a variable (this is a fairly common mistake).

See http://friedo.com/blog/2013/07/arrays-vs-lists-in-perl and http://perlmaven.com/scalar-and-list-context-in-perl for good examples/discussion.
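
A quick way to see the difference for yourself (a sketch with made-up variable names): an array evaluated in scalar context gives its length, while a comma-separated list in scalar context is just the comma operator and gives its last value.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);

my $from_array = @array;           # array in scalar context: its length, 3

my $from_list;
{
    no warnings 'void';            # the discarded values would otherwise warn
    $from_list = (10, 20, 30);     # comma operator in scalar context: 30
}

my ($first) = @array;              # parens make it list assignment: 10
```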

EDIT:

Posted this before I finished the article. Understanding the difference between arrays and lists makes the following potential WTFs a lot clearer:

    sub take_two_arrs (\@\@) {
      print $_[0], $_[1] ;
    }

    take_two_arrs @a1, @b1 ;         # prints ARRAY(0xAddr) ARRAY(0xAddr)

    take_two_arrs ((1,2),(3,4)) ;    # error: arrays must be named
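
The second doesn't work because a \@ prototype requires each argument to be an actual array (an expression starting with @), which Perl then passes as a reference; a plain list won't do. It would work if you dereferenced anonymous arrays in place:

    take_two_arrs (@{[1,2]}, @{[3,4]})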
  
I'll admit that this is baffling.

    sub what_are (++) {
      print $_[0], " ", $_[1] ;
    }

    what_are ((1,2),(3,4)) # prints 2, then 4

(This is part of the reason that Perl programmers don't use prototypes very often.) perlsub warns:

> When using the + prototype, your function must check that the argument is of an acceptable type.

The plus here forces scalar context on the arguments, which are lists (not arrays!), so they return their last elements. This would work how the author probably wants if called like this:

    what_are ([1,2],[3,4]);   # prints ARRAY(0xAddr), ARRAY(0xAddr)


> I'll admit that this is baffling.

The inner parentheses force evaluation of the two inner comma operators in the scalar context provided by the function prototype. `1` and `3` get evaluated in void context and discarded, leaving `2` and `4` as the two arguments to the function. (I had to look up `+` in prototypes, however.)
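
Here is that reasoning as a runnable sketch. It reuses the `what_are` name from the parent comment but returns its arguments instead of printing them, so the result is easy to inspect; the other names are made up.

```perl
use strict;
use warnings;

# Declared before use so the prototype is known when the calls are compiled.
sub what_are (++) { return @_ }

my @from_lists;
{
    no warnings 'void';                        # 1 and 3 are evaluated and discarded
    @from_lists = what_are((1, 2), (3, 4));    # each comma expression yields its last value
}

my @named      = (1, 2);
my @from_array = what_are(@named, [3, 4]);     # + turns a literal array into a reference
```

After this runs, `@from_lists` holds `(2, 4)`, and `@from_array` holds two array references, the first of which is `\@named`.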

Without a working understanding of lists and context, this example is undoubtedly baffling, but that's why the documentation exists.


I guess I really meant "is baffling without a thorough understanding of context." I feel better about looking up the + now that I know you did too, though! I don't think I've ever seen code like that in the wild, and I've never seen a Perl programmer attempt to call a sub with nested parens, like what_are((1,2),(3,4)) before.


You may not have realized that's what you saw, but for example all of the Moose documentation uses effectively a nested param:

    has foo => ( is => 'ro' );

is equivalent to:

    has('foo', ('is', 'ro'));

because Moose's `has` sugar is written as an exported function.
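
You can check the equivalence without installing Moose. The sketch below uses a stand-in sub, `record_args`, which is hypothetical and has nothing to do with Moose's real `has`; it only records what it was passed.

```perl
use strict;
use warnings;

# Stand-in for an exported sugar function; NOT Moose's actual `has`.
sub record_args { return [@_] }

# The fat comma quotes the bareword on its left and otherwise acts as a
# comma; the inner parentheses just group and then flatten away.
my $sugar  = record_args foo => ( is => 'ro' );
my $nested = record_args('foo', ('is', 'ro'));
my $flat   = record_args('foo', 'is', 'ro');
```

All three calls receive the same flat list: `'foo', 'is', 'ro'`.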


That's true, but I've never actually seen anyone call 'has' like that. Plus, Perl's behavior on lists makes sense (always flatten) if there's no mucking around with prototypes, so that's much less confusing than the behavior in the article.


The first one I presented is the way the Moose documentation calls has, the way the Moose test suite typically calls has, and the way I and most of the rest of the Moose Cabal call has.

The second one isn't common at all, but I have seen people both completely leave out the parentheses:

    has foo => is => 'ro', isa => 'Str', ...;

or treat has as a straight function

    has('foo', is => 'ro', isa => 'Str');

both of which cause Perl::Tidy to do weird things.


Sorry, I meant I'd never seen it called the second way (nested parens). I also haven't seen the no-parens version, which looks really bizarre!


The core documentation is not always clear about the difference between lists and arrays--and much credit to the author for identifying the comma operator as an operator--but this is really confused:

> By default, the arguments to a procedure are in the array context, which means that the comma operator expects both of its operands to be arrays. It promotes them to single-element arrays if they are scalars. In Perl, comma (,) can mean cons, append, flatten all at once.

I wrote an explanation of context in Perl which is hopefully clearer:

http://modernperlbooks.com/books/modern_perl/chapter_01.html...


I loved this article.

I started writing Perl nine months ago because of my new job. I learned it with the Modern Perl book, which is really nice and goes directly to the best practices. However, I've found that real life Perl code is full of the old/deprecated/insane ways of doing things as well. And Perl developers really take the TIMTOWTDI principle to the limit.

This article helped me to understand Perl more, and especially to understand real life Perl code better. I also liked the language designer perspective and the semantic analysis. Thanks for writing it!


"alarming brevity" :-)

Looking forward to reading further into this.


  To accept a reference to one of several specifiers, Perl accepts a grouped \[ specifiers ] form:
  
  sub array_or_hash (\[@%]) {
    print $_[0] ;
  }

Dear god...
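
For anyone wondering what that actually does, here is a small sketch. It keeps the article's sub name but returns the reference type instead of printing; the variable names are made up. The grouped `\[@%]` form accepts either a named array or a named hash and passes a reference to whichever it got.

```perl
use strict;
use warnings;

sub array_or_hash (\[@%]) { return ref $_[0] }

my @list = (1, 2, 3);
my %map  = (a => 1);

my $kind_of_list = array_or_hash(@list);    # 'ARRAY'
my $kind_of_map  = array_or_hash(%map);     # 'HASH'
```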


Hi Matt,

I've been exploring your blog. Great stuff! I was particularly appreciative of the parsing articles.

Larry Wall (the Perl designer) has been designing and helping develop a new language for years. (He claims he began thinking about this new language before Perl 5 shipped 20 years ago.) Arguably it addresses the same sort of audience as Scala and Haskell. Have you taken a look at it?


I can't find the link, but I remember a few talks about Perl 6 Grammars which were very interesting both in the eDSL structure and the fact that Perl 6 was built on top of them. I think it was Damian Conway speaking but I'm not sure anymore.


I'll concentrate on the mistakes. If I were to criticise all the other many weird formulations and expressions in this guide which make it hard to unambiguously understand what the author meant, I would still be sitting here typing tomorrow.

----

> A code comment in Perl begins with a hash #

Hashes are already something different in Perl. Avoid ambiguity, use the common name of that character: number sign.

> procedure

This word is used throughout, but the official Perl documentation does not mention it. Use the word subroutine (or just sub for short) instead.

> The $ prefix references a variable as a scalar

> Array variables use the prefix @

This is the wrong explanation. The sigil denotes the mode of access: @ indicates that the expression evaluates to a list value, $ that it evaluates to a single value. This becomes clear when one examines slices of a compound data structure.

    @arr = ("foo","bar","baz");
    $arr[1];    # "bar"
    @arr[1,2];  # ("bar", "baz")

    %hash = ("foo", 1, "bar", 2);
    $hash{"foo"};        # 1
    @hash{"foo", "bar"}; # (1, 2)

The guide mentions the change from @ to $ or from % to $ only in passing without explanation, and does not mention slices at all.

> Hash variables expect an array for initialization.

No, a list.

> three contexts in which an expression may be evaluated:

> 1. scalar

> 2. array

> 3. void

No, the second is list context.

> Is localtime() returning a scalar, or an array?

No, a scalar or a list.

> By default, the arguments to a procedure are in the array context, which means that the comma operator expects both of its operands to be arrays. It promotes them to single-element arrays if they are scalars.

> It seems that the function call still flattened out the arrays (and hashes) when making the call.

This is completely misleading. A sub always takes a list. What is described here has nothing to do with arguments, but is the consequence of the specifics of how values are evaluated into a list. This also happens, for example, on list assignment.
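
A short sketch of that point, with made-up names: the same flattening shows up whether the list ends up in `@_` or on the right-hand side of a list assignment.

```perl
use strict;
use warnings;

my @letters = ('a', 'b');
my %numbers = (one => 1);

sub count_args { return scalar @_ }

# Arrays and hashes are evaluated in list context and flattened either way.
my $arg_count = count_args(@letters, %numbers, 'z');    # 2 + 2 + 1 = 5
my @combined  = (@letters, %numbers, 'z');              # same 5 values, no sub involved
my ($head, @rest) = (@letters, 'z');                    # 'a', then ('b', 'z')
```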

> all of the following are equivalent procedure calls:

> print3 (1,2,3) ;

> &print3 (1,2,3) ;

This is wrong, there is a difference, it just did not show up in the example.
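
One way to make the difference show up, as a sketch with a hypothetical sub: give the sub a prototype. A normal call honours it; a call with `&` ignores it.

```perl
use strict;
use warnings;

sub first_arg ($) { return $_[0] }

my @items = ('x', 'y', 'z');

my $with_proto = first_arg(@items);     # ($) imposes scalar context: the count, 3
my $without    = &first_arg(@items);    # prototype bypassed, list flattened: 'x'
```

The other difference, that `&name` with no parentheses passes along the current `@_`, is separate from this one.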

> In fact, the argument isn’t even hash, despite what the specifier says

Refer to the documentation: when not backslashed, % is defined to behave like @.

> sub use_hash (%) {

> print $_[0]{"foo"} ;

> print $_{"foo"} ;

> print @_{"foo"} ;

> }

> use_hash ("foo" => 1701) ; # prints nothing

No wonder. The code is broken.

@_ contains a plain list value. To access it with a hash subscript, turn it into a hashref first.

    print +{ @_ }->{"foo"};
    print ${ {@_} }{"foo"};
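
The more common idiom, sketched here with the article's sub name but no prototype and a return value instead of `print`, is to copy the flat list into a lexical hash first:

```perl
use strict;
use warnings;

# No prototype needed: copy the flat key/value list into a real hash.
sub use_hash {
    my %args = @_;
    return $args{foo};
}

my $value = use_hash(foo => 1701);    # 1701
```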

> The specifier & expects to receive a function

Not function, coderef is the appropriate word.

> To accept a bareword filehandle as an argument, it becomes necessary to use the rarely used * prototype specifier

Simply passing *F is also possible, no prototype involved.
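
A sketch of what that looks like, using an in-memory file so it runs anywhere; the sub name and the text are made up for illustration.

```perl
use strict;
use warnings;

sub first_line {
    my ($fh) = @_;                 # a glob works wherever a filehandle is expected
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}

my $text = "alpha\nbeta\n";
open(F, '<', \$text) or die "Cannot open in-memory file: $!";
my $got = first_line(*F);          # 'alpha'
close(F);
```

Passing `\*F` (a reference to the glob) works the same way, and a lexical `my $fh` avoids the question entirely.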

> The repetition operator x repeats a string or an array,

No, it repeats single or list values. Scalars are coerced into their string representation, and lists are simply repeated unchanged.
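
A few lines to try, as a sketch with made-up variable names. The parentheses on the left operand are what select list repetition; without them the operand is taken in scalar context.

```perl
use strict;
use warnings;

my $line  = "ab" x 3;          # "ababab"
my @twice = (1, 2) x 2;        # (1, 2, 1, 2)

my @pair   = (1, 2);
my $counts = @pair x 2;        # no parens: @pair gives its length 2, so "22"
my @again  = (@pair) x 2;      # parens restore list repetition: (1, 2, 1, 2)
```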

----

Closing words: This is amateur hour, not worthy of a professor. Advice for next time: consult domain experts and have them proof-read before publishing, and also always give your documents a last-modification date and version history, or at least a version identifier.


"Every programmer needs Perl in their arsenal."

I stopped reading at this point.



