Perl
Last edited September 6, 2006
More by UnrulyGrrl »
Learn Perl: Perl Q&A: How do I sort a hash by the hash "value"?
www.devdaily.com/perl/edu/qanda/plqa00016/plqa0001...
Question: How do I sort a hash by the hash value?

Answer:


First, sorting a hash by the hash key

Sorting the output of a hash by the hash key is a pretty well-known recipe. It's covered in another Q&A article titled "How to sort a hash by the hash key".


Sorting a hash by the hash value

Sorting a hash by the hash value is a bit more difficult than sorting the hash by the key, but it's not too bad. It just requires a small "helper" function.

This is easiest to demonstrate by example. Suppose we have a class of five students. Rather than give them names, we'll call them student1, student2, etc. Suppose these students just took a test, and we stored their grades in a hash (called associative arrays prior to the release of Perl 5) named grades.

The hash definition might look like this:

%grades = (
	student1 => 90,
	student2 => 75,
	student3 => 96,
	student4 => 55,
	student5 => 76,
);

If you're familiar with hashes, you know that the student names are the keys, and the test scores are the hash values.

The key to sorting a hash by value is the comparison function you hand to the sort command. Following the format defined by the creators of Perl, you create a function (I call it a helper function) that tells Perl how to order the list it's about to receive: sort places the two items being compared in the special variables $a and $b, and the function returns a negative, zero, or positive number. In the case of the program you're about to see, I've created two helper functions named hashValueDescendingNum (sort by hash value in descending numeric order) and hashValueAscendingNum (sort by hash value in ascending numeric order).

Here's a program that prints the contents of the grades hash, sorted numerically by the hash value:

#!/usr/bin/perl -w

#----------------------------------------------------------------------#
#  printHashByValue.pl                                                 #
#                                                                      #
#  Copyright 1998 DevDaily Interactive, Inc.  All Rights Reserved.     #
#----------------------------------------------------------------------#

#----------------------------------------------------------------------#
#  FUNCTION:  hashValueAscendingNum                                    #
#                                                                      #
#  PURPOSE:   Help sort a hash by the hash 'value', not the 'key'.     #
#             Values are returned in ascending numeric order (lowest   #
#             to highest).                                             #
#----------------------------------------------------------------------#

sub hashValueAscendingNum {
   $grades{$a} <=> $grades{$b};
}


#----------------------------------------------------------------------#
#  FUNCTION:  hashValueDescendingNum                                   #
#                                                                      #
#  PURPOSE:   Help sort a hash by the hash 'value', not the 'key'.     #
#             Values are returned in descending numeric order          #
#             (highest to lowest).                                     #
#----------------------------------------------------------------------#

sub hashValueDescendingNum {
   $grades{$b} <=> $grades{$a};
}


%grades = (
	student1 => 90,
	student2 => 75,
	student3 => 96,
	student4 => 55,
	student5 => 76,
);

print "\n\tGRADES IN ASCENDING NUMERIC ORDER:\n";
foreach $key (sort hashValueAscendingNum (keys(%grades))) {
   print "\t\t$grades{$key} \t\t $key\n";
}

print "\n\tGRADES IN DESCENDING NUMERIC ORDER:\n";
foreach $key (sort hashValueDescendingNum (keys(%grades))) {
   print "\t\t$grades{$key} \t\t $key\n";
}
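
For short comparisons, the helper function can also be written inline as a block passed to sort. The following is a minimal sketch of that variant, not part of the original program; the array names @ascending and @descending are made up for illustration. Because the block sees the lexical %grades directly, it also runs cleanly under "use strict".

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %grades = (
    student1 => 90,
    student2 => 75,
    student3 => 96,
    student4 => 55,
    student5 => 76,
);

# Inline comparison block: same logic as hashValueAscendingNum.
my @ascending  = sort { $grades{$a} <=> $grades{$b} } keys %grades;

# Swapping $a and $b reverses the order, as in hashValueDescendingNum.
my @descending = sort { $grades{$b} <=> $grades{$a} } keys %grades;

print "\n\tGRADES IN ASCENDING NUMERIC ORDER:\n";
print "\t\t$grades{$_} \t\t $_\n" for @ascending;

print "\n\tGRADES IN DESCENDING NUMERIC ORDER:\n";
print "\t\t$grades{$_} \t\t $_\n" for @descending;
```

The named helper functions are still worth keeping when the same ordering is used in several places.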

! Aware to Perl: Perl Built-in Functions
www.rocketaware.com/perl/perlfunc/
Perl Functions by Category
! Aware to Perl: map BLOCK LIST
www.rocketaware.com/perl/perlfunc/map.htm
map BLOCK LIST
 
map EXPR,LIST
Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value composed of the results of each such evaluation. Evaluates BLOCK or EXPR in a list context, so each element of LIST may produce zero, one, or more elements in the returned value.

    @chars = map(chr, @nums);

translates a list of numbers to the corresponding characters. And

    %hash = map { getkey($_) => $_ } @array;

is just a funny way to write

    %hash = ();
    foreach $_ (@array) {
        $hash{getkey($_)} = $_;
    }
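
As a small runnable sketch of the two map forms above: the getkey function here is a hypothetical stand-in (it returns the first letter of its argument), and the sample data is invented. The extra parentheses inside the block are a cautious way to make sure Perl reads the braces as a BLOCK rather than as an anonymous hash.

```perl
use strict;
use warnings;

# Hypothetical key function: the first character of the string.
sub getkey { return substr($_[0], 0, 1) }

my @nums  = (72, 105);
my @chars = map { chr($_) } @nums;          # one result per element

my @array = qw(apple banana cherry);
my %hash  = map { ( getkey($_) => $_ ) } @array;

# A block may return more than one value per element ...
my @pairs = map { ($_, $_ * 2) } (1, 2);

# ... or none at all, which drops the element from the result.
my @evens = map { $_ % 2 ? () : $_ } (1, 2, 3, 4);

print join('', @chars), "\n";
print "$_ => $hash{$_}\n" for sort keys %hash;
```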
eval EXPR
 
eval BLOCK
EXPR is parsed and executed as if it were a little Perl program. It is executed in the context of the current Perl program, so that any variable settings or subroutine and format definitions remain afterwards. The value returned is the value of the last expression evaluated, or a return statement may be used, just as with subroutines. The last expression is evaluated in scalar or array context, depending on the context of the eval.

If there is a syntax error or runtime error, or a die() statement is executed, an undefined value is returned by eval(), and $@ is set to the error message. If there was no error, $@ is guaranteed to be a null string. If EXPR is omitted, evaluates $_. The final semicolon, if any, may be omitted from the expression. Beware that using eval() neither silences perl from printing warnings to STDERR, nor does it stuff the text of warning messages into $@. To do either of those, you have to use the $SIG{__WARN__} facility. See warn() and the perlvar manpage.

Note that, because eval() traps otherwise-fatal errors, it is useful for determining whether a particular feature (such as socket() or symlink()) is implemented. It is also Perl's exception trapping mechanism, where the die operator is used to raise exceptions.

If the code to be executed doesn't vary, you may use the eval-BLOCK form to trap run-time errors without incurring the penalty of recompiling each time. The error, if any, is still returned in $@. Examples:

    # make divide-by-zero nonfatal
    eval { $answer = $a / $b; }; warn $@ if $@;

    # same thing, but less efficient
    eval '$answer = $a / $b'; warn $@ if $@;

    # a compile-time error
    eval { $answer = };

    # a run-time error
    eval '$answer =';   # sets $@

When using the eval{} form as an exception trap in libraries, you may wish not to trigger any __DIE__ hooks that user code may have installed. You can use the local $SIG{__DIE__} construct for this purpose, as shown in this example:

    # a very private exception trap for divide-by-zero
    eval { local $SIG{'__DIE__'}; $answer = $a / $b; }; warn $@ if $@;

This is especially significant, given that __DIE__ hooks can call die() again, which has the effect of changing their error messages:

    # __DIE__ hooks may modify error messages
    {
       local $SIG{'__DIE__'} = sub { (my $x = $_[0]) =~ s/foo/bar/g; die $x };
       eval { die "foo foofs here" };
       print $@ if $@;                # prints "bar barfs here"
    }

With an eval(), you should be especially careful to remember what's being looked at when:

    eval $x;            # CASE 1
    eval "$x";          # CASE 2

    eval '$x';          # CASE 3
    eval { $x };        # CASE 4

    eval "\$$x++";      # CASE 5
    $$x++;              # CASE 6

Cases 1 and 2 above behave identically: they run the code contained in the variable $x. (Although case 2 has misleading double quotes making the reader wonder what else might be happening (nothing is).) Cases 3 and 4 likewise behave in the same way: they run the code '$x', which does nothing but return the value of $x. (Case 4 is preferred for purely visual reasons, but it also has the advantage of compiling at compile-time instead of at run-time.) Case 5 is a place where normally you WOULD like to use double quotes, except that in this particular situation, you can just use symbolic references instead, as in case 6.
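
To tie the pieces together, here is a minimal sketch of eval BLOCK used as an exception trap around a division. The function name safe_divide and its return convention are assumptions made for this illustration, not anything defined by Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (result, '') on success,
# or (undef, error message) if the division died.
sub safe_divide {
    my ($numerator, $denominator) = @_;
    my $answer = eval { $numerator / $denominator };
    return (undef, $@) if $@;    # $@ holds the message from die
    return ($answer, '');
}

my ($good, $good_error) = safe_divide(10, 4);
my ($bad,  $bad_error)  = safe_divide(1, 0);

print "10 / 4 = $good\n";
print "1 / 0 failed: $bad_error" unless defined $bad;
```

Because the eval-BLOCK form is compiled once along with the rest of the program, calling safe_divide in a loop does not pay the recompilation cost of the string form.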

wantarray
Returns TRUE if the context of the currently executing subroutine is looking for a list value. Returns FALSE if the context is looking for a scalar. Returns the undefined value if the context is looking for no value (void context).

    return unless defined wantarray;    # don't bother doing more
    my @a = complex_calculation();
    return wantarray ? @a : "@a";
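
The fragment above can be dropped into a complete subroutine. In this sketch the subroutine name words and the counter $calls are invented for illustration; the counter only shows that the void-context call returns before doing any work.

```perl
use strict;
use warnings;

my $calls = 0;    # counts how often the real work is done

sub words {
    return unless defined wantarray;    # void context: don't bother
    $calls++;
    my @w = qw(alpha beta gamma);
    return wantarray ? @w : "@w";       # list, or one joined string
}

my @list   = words();    # list context
my $string = words();    # scalar context
words();                 # void context, returns early

print scalar(@list), " items, string '$string', $calls real calls\n";
```
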
Perl Predefined Names
 
The following names have special meaning to perl. I could have used alphabetic symbols for some of these, but I didn't want to take the chance that someone would say reset "a-zA-Z" and wipe them all out. You'll just have to suffer along with these silly symbols. Most of them have reasonable mnemonics, or analogues in one of the shells.

$_
The default input and pattern-searching space. The following pairs are equivalent:
	while (<>) {...	# only equivalent in while!
	while ($_ = <>) {...

	/^Subject:/
	$_ =~ /^Subject:/

	y/a-z/A-Z/
	$_ =~ y/a-z/A-Z/

	chop
	chop($_)
(Mnemonic: underline is understood in certain operations.)

$.
The current input line number of the last filehandle that was read. Readonly. Remember that only an explicit close on the filehandle resets the line number. Since <> never does an explicit close, line numbers increase across ARGV files (but see examples under eof). (Mnemonic: many programs use . to mean the current line number.)

$/
The input record separator, newline by default. Works like awk's RS variable, including treating blank lines as delimiters if set to the null string. You may set it to a multicharacter string to match a multi-character delimiter. Note that setting it to "\n\n" means something slightly different than setting it to "", if the file contains consecutive blank lines. Setting it to "" will treat two or more consecutive blank lines as a single blank line. Setting it to "\n\n" will blindly assume that the next input character belongs to the next paragraph, even if it's a newline. (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
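
The difference between "" and "\n\n" is easiest to see on a small sample. This sketch reads the same text twice from an in-memory filehandle (supported by open since Perl 5.8); the sample text and the helper name read_records are assumptions for the example.

```perl
use strict;
use warnings;

# Two paragraphs separated by a run of three blank lines.
my $text = "line one\nline two\n\n\n\npara two\n";

# Hypothetical helper: read $text with the given record separator.
sub read_records {
    my ($separator) = @_;
    local $/ = $separator;    # old value is restored on return
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

# "" treats the whole run of blank lines as a single separator.
my @paragraph_mode = read_records("");

# "\n\n" splits after every pair of newlines, so the leftover
# newlines in the run come back as a record of their own.
my @literal_mode = read_records("\n\n");

print scalar(@paragraph_mode), " records in paragraph mode\n";
print scalar(@literal_mode),   " records with \"\\n\\n\"\n";
```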

$,
The output field separator for the print operator. Ordinarily the print operator simply prints out the comma separated fields you specify. In order to get behavior more like awk, set this variable as you would set awk's OFS variable to specify what is printed between fields. (Mnemonic: what is printed when there is a , in your print statement.)

$"
This is like $, except that it applies to array values interpolated into a double-quoted string (or similar interpreted string). Default is a space. (Mnemonic: obvious, I think.)

$\
The output record separator for the print operator. Ordinarily the print operator simply prints out the comma separated fields you specify, with no trailing newline or record separator assumed. In order to get behavior more like awk, set this variable as you would set awk's ORS variable to specify what is printed at the end of the print. (Mnemonic: you set $\ instead of adding \n at the end of the print. Also, it's just like /, but it's what you get "back" from perl.)
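
A short sketch of the three output-related separators in action follows. The variable names and sample list are made up, and the output is captured in a string through an in-memory filehandle only so that the result is easy to inspect; print to STDOUT behaves the same way.

```perl
use strict;
use warnings;

my @fields = ('a', 'b', 'c');

my $interpolated;
{
    local $" = '-';               # used when an array is interpolated
    $interpolated = "@fields";
}

my $printed = '';
{
    local $, = ':';               # printed between the items of the list
    local $\ = "!\n";             # printed once, after the last item
    open my $out, '>', \$printed or die "cannot open in-memory file: $!";
    print {$out} @fields;
    close $out;
}

print "interpolated: $interpolated\n";
print "printed:      $printed";
```

Using local inside a bare block keeps the changed separators from leaking into the rest of the program.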

$#
The output format for printed numbers. This variable is a half-hearted attempt to emulate awk's OFMT variable. There are times, however, when awk and perl have differing notions of what is in fact numeric. Also, the initial value is %.20g rather than %.6g, so you need to set $# explicitly to get awk's value. (Mnemonic: # is the number sign.)

$%
The current page number of the currently selected output channel. (Mnemonic: % is page number in nroff.)

$=
The current page length (printable lines) of the currently selected output channel. Default is 60. (Mnemonic: = has horizontal lines.)

$-
The number of lines left on the page of the currently selected output channel. (Mnemonic: lines_on_page - lines_printed.)

$~
The name of the current report format for the currently selected output channel. Default is name of the filehandle. (Mnemonic: brother to $^.)

$^
The name of the current top-of-page format for the currently selected output channel. Default is name of the filehandle with "_TOP" appended. (Mnemonic: points to top of page.)

$|
If set to nonzero, forces a flush after every write or print on the currently selected output channel. Default is 0. Note that STDOUT will typically be line buffered if output is to the terminal and block buffered otherwise. Setting this variable is useful primarily when you are outputting to a pipe, such as when you are running a perl script under rsh and want to see the output as it's happening. (Mnemonic: when you want your pipes to be piping hot.)

$$
The process number of the perl running this script. (Mnemonic: same as shells.)

$?
The status returned by the last pipe close, backtick (``) command or system operator. Note that this is the status word returned by the wait() system call, so the exit value of the subprocess is actually ($? >> 8). $? & 255 gives which signal, if any, the process died from, and whether there was a core dump. (Mnemonic: similar to sh and ksh.)
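
As an illustration of unpacking the status word, this sketch runs a child perl that exits with status 3. It uses $^X (the running perl) so that no other program has to be installed; the split of the low byte into a 7-bit signal number and a core-dump flag follows the usual wait() layout.

```perl
use strict;
use warnings;

# Run a child process that simply exits with status 3.
system($^X, '-e', 'exit 3');

my $exit_value  = $? >> 8;     # high byte: the child's exit value
my $signal      = $? & 127;    # low 7 bits: signal that killed it, if any
my $dumped_core = $? & 128;    # remaining bit of the low byte: core dump flag

print "exit value:  $exit_value\n";
print "signal:      $signal\n";
print "core dumped: ", ($dumped_core ? "yes" : "no"), "\n";
```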

$&
The string matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK). (Mnemonic: like & in some editors.)

$`
The string preceding whatever was matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK). (Mnemonic: ` often precedes a quoted string.)

$'
The string following whatever was matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK). (Mnemonic: ' often follows a quoted string.) Example:
	$_ = 'abcdefghi';
	/def/;
	print "$`:$&:$'\n";  	# prints abc:def:ghi

$+
The last bracket matched by the last search pattern. This is useful if you don't know which of a set of alternative patterns matched. For example:
    /Version: (.*)|Revision: (.*)/ && ($rev = $+);
(Mnemonic: be positive and forward looking.)
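
Here is the same pattern applied to two sample lines, as a small sketch; the input strings and the array name @revs are invented for the example.

```perl
use strict;
use warnings;

my @revs;
for my $line ('Version: 1.2', 'Revision: 7') {
    # Only one of the two alternatives matches each line, so only
    # one of $1 and $2 is set; $+ picks up whichever one matched.
    if ($line =~ /Version: (.*)|Revision: (.*)/) {
        push @revs, $+;
    }
}

print "found: @revs\n";
```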

$*
Set to 1 to do multiline matching within a string, 0 to tell perl that it can assume that strings contain a single line, for the purpose of optimizing pattern matches. Pattern matches on strings containing multiple newlines can produce confusing results when $* is 0. Default is 0. (Mnemonic: * matches multiple things.) Note that this variable only influences the interpretation of ^ and $. A literal newline can be searched for even when $* == 0.

$0
Contains the name of the file containing the perl script being executed. Assigning to $0 modifies the argument area that the ps(1) program sees. (Mnemonic: same as sh and ksh.)

$<digit>
Contains the subpattern from the corresponding set of parentheses in the last pattern matched, not counting patterns matched in nested blocks that have been exited already. (Mnemonic: like \digit.)

$[
The index of the first element in an array, and of the first character in a substring. Default is 0, but you could set it to 1 to make perl behave more like awk (or Fortran) when subscripting and when evaluating the index() and substr() functions. (Mnemonic: [ begins subscripts.)

$]
The string printed out when you say "perl -v". It can be used to determine at the beginning of a script whether the perl interpreter executing the script is in the right range of versions. If used in a numeric context, returns the version + patchlevel / 1000. Example:
	# see if getc is available
        ($version,$patchlevel) =
		 $] =~ /(\d+\.\d+).*\nPatch level: (\d+)/;
        print STDERR "(No filename completion available.)\n"
		 if $version * 1000 + $patchlevel < 2016;
or, used numerically,
	warn "No checksumming!\n" if $] < 3.019;
(Mnemonic: Is this version of perl in the right bracket?)

$;
The subscript separator for multi-dimensional array emulation. If you refer to an associative array element as
	$foo{$a,$b,$c}
it really means
	$foo{join($;, $a, $b, $c)}
But don't put
	@foo{$a,$b,$c}		# a slice--note the @
which means
	($foo{$a},$foo{$b},$foo{$c})
Default is "\034", the same as SUBSEP in awk. Note that if your keys contain binary data there might not be any safe value for $;. (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon. Yeah, I know, it's pretty lame, but $, is already taken for something more important.)
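
The emulation can be checked directly. In this sketch the hash names and values are made up; it stores one element under a three-part subscript, rebuilds the key by hand with join, and then contrasts that with a slice.

```perl
use strict;
use warnings;

my %foo;
my ($x, $y, $z) = (1, 2, 3);

# One element whose key is the three subscripts joined with $;
$foo{$x, $y, $z} = 'cell';

my $joined_key = join($;, $x, $y, $z);
my $same_entry = exists $foo{$joined_key};

# Copy the separator ($; followed by the statement's own semicolon)
# so it can be used in a pattern to take the key apart again.
my $separator = $;;
my ($kx, $ky, $kz) = split /\Q$separator\E/, (keys %foo)[0];

# A slice (note the @) looks up three separate keys instead.
my %h = (a => 1, b => 2, c => 3);
my @values = @h{'a', 'b', 'c'};

print "same entry: ", ($same_entry ? "yes" : "no"), "\n";
print "subscripts: $kx $ky $kz\n";
print "slice:      @values\n";
```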

$!
If used in a numeric context, yields the current value of errno, with all the usual caveats. (This means that you shouldn't depend on the value of $! to be anything in particular unless you've gotten a specific error return indicating a system error.) If used in a string context, yields the corresponding system error string. You can assign to $! in order to set errno if, for instance, you want $! to return the string for error n, or you want to set the exit value for the die operator. (Mnemonic: What just went bang?)
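
A brief sketch of the two faces of $!: the path below is a hypothetical file that is assumed not to exist, so the open fails and sets errno. The exact number and message depend on the operating system.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine.
my $missing = '/nonexistent/path/example.txt';

my ($errno, $message);
unless (open my $fh, '<', $missing) {
    $errno   = $! + 0;    # numeric context: the errno value
    $message = "$!";      # string context: the system error text
}

print "open failed with errno $errno: $message\n";
```

Note that $! is copied immediately after the failed call, since later system calls may overwrite it.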

$@
The perl syntax error message from the last eval command. If null, the last eval parsed and executed correctly (although the operations you invoked may have failed in the normal fashion). (Mnemonic: Where was the syntax error "at"?)

$<
The real uid of this process. (Mnemonic: it's the uid you came FROM, if you're running setuid.)

$>
The effective uid of this process. Example:
	$< = $>;	# set real uid to the effective uid
	($<,$>) = ($>,$<);	# swap real and effective uid
(Mnemonic: it's the uid you went TO, if you're running setuid.) Note: $< and $> can only be swapped on machines supporting setreuid().

$(
The real gid of this process. If you are on a machine that supports membership in multiple groups simultaneously, gives a space separated list of groups you are in. The first number is the one returned by getgid(), and the subsequent ones by getgroups(), one of which may be the same as the first number. (Mnemonic: parentheses are used to GROUP things. The real gid is the group you LEFT, if you're running setgid.)

$)
The effective gid of this process. If you are on a machine that supports membership in multiple groups simultaneously, gives a space separated list of groups you are in. The first number is the one returned by getegid(), and the subsequent ones by getgroups(), one of which may be the same as the first number. (Mnemonic: parentheses are used to GROUP things. The effective gid is the group that's RIGHT for you, if you're running setgid.)

Note: $<, $>, $( and $) can only be set on machines that support the corresponding set[re][ug]id() routine. $( and $) can only be swapped on machines supporting setregid().

$:
The current set of characters after which a string may be broken to fill continuation fields (starting with ^) in a format. Default is "\ \n-", to break on whitespace or hyphens. (Mnemonic: a "colon" in poetry is a part of a line.)

$^D
The current value of the debugging flags. (Mnemonic: value of -D switch.)

$^F
The maximum system file descriptor, ordinarily 2. System file descriptors are passed to subprocesses, while higher file descriptors are not. During an open, system file descriptors are preserved even if the open fails. Ordinary file descriptors are closed before the open is attempted.

$^I
The current value of the inplace-edit extension. Use undef to disable inplace editing. (Mnemonic: value of -i switch.)

$^L
What formats output to perform a formfeed. Default is \f.

$^P
The internal flag that the debugger clears so that it doesn't debug itself. You could conceivably disable debugging yourself by clearing it.

$^T
The time at which the script began running, in seconds since the epoch. The values returned by the -M, -A and -C filetests are based on this value.

$^W
The current value of the warning switch. (Mnemonic: related to the -w switch.)

$^X
The name that Perl itself was executed as, from argv[0].

$ARGV
Contains the name of the current file when reading from <>.

@ARGV
The array ARGV contains the command line arguments intended for the script. Note that $#ARGV is generally the number of arguments minus one, since $ARGV[0] is the first argument, NOT the command name. See $0 for the command name.

@INC
The array INC contains the list of places to look for perl scripts to be evaluated by the "do EXPR" command or the "require" command. It initially consists of the arguments to any -I command line switches, followed by the default perl library, probably "/usr/local/lib/perl", followed by ".", to represent the current directory.

%INC
The associative array INC contains entries for each filename that has been included via "do" or "require". The key is the filename you specified, and the value is the location of the file actually found. The "require" command uses this array to determine whether a given file has already been included.

$ENV{expr}
The associative array ENV contains your current environment. Setting a value in ENV changes the environment for child processes.

$SIG{expr}
The associative array SIG is used to set signal handlers for various signals. Example:
	sub handler {	# 1st argument is signal name
		local($sig) = @_;
		print "Caught a SIG$sig--shutting down\n";
		close(LOG);
		exit(0);
	}

	$SIG{'INT'} = 'handler';
	$SIG{'QUIT'} = 'handler';
	...
	$SIG{'INT'} = 'DEFAULT';	# restore default action
	$SIG{'QUIT'} = 'IGNORE';	# ignore SIGQUIT
The SIG array only contains values for the signals actually set within the perl script.
! Aware to Perl: grep BLOCK LIST
www.rocketaware.com/perl/perlfunc/grep.htm
grep BLOCK LIST
grep EXPR,LIST
This is similar in spirit to, but not the same as, grep(1) and its relatives. In particular, it is not limited to using regular expressions.

Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value consisting of those elements for which the expression evaluated to TRUE. In a scalar context, returns the number of times the expression was TRUE.

    @foo = grep(!/^#/, @bar);    # weed out comments

or equivalently,

    @foo = grep {!/^#/} @bar;    # weed out comments

Note that, because $_ is a reference into the list value, it can be used to modify the elements of the array. While this is useful and supported, it can cause bizarre results if the LIST is not a named array. Similarly, grep returns aliases into the original list, much like the way that a foreach loop's index variable aliases the list elements. That is, modifying an element of a list returned by grep actually modifies the element in the original list.
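
The following sketch exercises the points above on invented sample data: filtering in list context, counting in scalar context, and the side effect of assigning to $_ inside the block.

```perl
use strict;
use warnings;

my @bar = ('# comment', 'code 1', '# another', 'code 2');

my @foo   = grep { !/^#/ } @bar;    # list context: the matching elements
my $count = grep { /^#/ } @bar;     # scalar context: how many matched

# $_ is an alias for each element, so assigning to it inside the
# block changes @nums itself, not just the filtered result.
my @nums = (1, 2, 3);
my @big  = grep { $_ *= 2; $_ > 2 } @nums;

print "kept:     @foo\n";
print "comments: $count\n";
print "nums now: @nums\n";
print "big:      @big\n";
```

Modifying $_ inside grep is easy to overlook when reading code, so a foreach loop is usually the clearer choice when the change is intended.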
