Tag Archives: Perl

Perl debug with strace

Short “FTR”

Not really specific to perl, but handy anyway.

You can use strace utility to inspect the syscalls (filesystem and network operations are usually of most interest) that a process is making.

here’s e.g. how you can see all network activity for a given process:

strace -p $PID -f -e trace=network -s 10000

Also if you have a stuck process you can check if it is waiting on some filehandle read and then check what that filehandle is using

lsof -p $PID

Also filehandles could be found in /proc/$PID/fd/ – so if you run strace on your process and see e.g.

write(3, "foo\n", 4)

you can check aforementioned lsof | grep $PID and see this

perl       9014 bturchik    3w      REG  253,3        4     26758 /tmp/hung

or

ls -l /proc/9014/fd/3

and see this:

l-wx------ 1 bturchik users 64 Aug  6 09:43 /proc/9014/fd/3 -> /tmp/hung

Heap algorithm in Perl

Another day, another study. Again, nothing fancy – merely a FTR, heap (or minheap, with root containing the minimum) implementation in Perl. extract_top name is to keep main interface similar to existing CPAN modules (like Heap::Simple)

package Heap::Hydra;

sub new {
   return bless {_heap => [undef,], _heap_size => 0}, shift;
}

sub getParent {
   my $self = shift;
   my $child_pos = shift;

   my $parent_pos = int($child_pos * 0.5) || 1;
   return $self->{_heap}->[$parent_pos], $parent_pos;
}

sub getMinKid {
   my $self = shift;
   my $parent_pos = shift;

   my $child_pos = $parent_pos * 2;
   return undef if $child_pos >= scalar @{$self->{_heap}};

   my $min_child = $self->{_heap}->[$child_pos];
   if (defined $self->{_heap}->[$child_pos + 1] && $self->{_heap}->[$child_pos + 1] < $min_child) {      $child_pos += 1;      $min_child = $self->{_heap}->[$child_pos];
   }
   return $min_child, $child_pos;
}

sub extract_top {
   my $self = shift;
   my $top_pos = shift || 1;

   my ($new_top, $new_top_pos) = $self->getMinKid($top_pos);
   if (!defined $new_top) {
     return splice @{$self->{_heap}}, $top_pos, 1;
   }

   my $prev_top = $self->{_heap}->[$top_pos];
   $self->{_heap}->[$top_pos] = $self->extract_top($new_top_pos);
   $self->{_heap_size}--;

   return $prev_top;
}

sub insert {
   my $self = shift;
   my $child = shift;

   $self->{_heap_size}++;
   $self->{_heap}->[$self->{_heap_size}] = $child;

   my $child_pos = $self->{_heap_size};
   my ($parent, $parent_pos) = $self->getParent($child_pos);
   while ($parent > $child) {
     $self->{_heap}->[$parent_pos] = $child;
     $self->{_heap}->[$child_pos] = $parent;
     $child_pos = $parent_pos;

     ($parent, $parent_pos) = $self->getParent($child_pos);
   }

   return $child_pos;
}

Perl sort

You only realize the difference when you compare things. I sometimes thought “how fast built-in Perl sort really is?”, but never got to any comparison (finally got to some sort algorithms research). Well, no surprises, of course (and my implementations are quite sub-optimal), but still (sorta FTR – this is a sort of 10K-element array filled with int representations of rand(10000) values):

  • perl: 0.00242
  • quicksort: 0.097454
  • mergesort: 0.115439
  • insert sort: 8.223053

Another FTR – Perl (since 5.8) has both quicksort and mergesort, although as I understand quicksort is still used by default – just shuffling (big?) arrays  to save from worst-case scenario in pre-sorted case. You can change the sort algorithm though with this (from http://perldoc.perl.org/sort.html):

  1. use sort ‘stable’; # guarantee stability
  2. use sort ‘_quicksort’; # use a quicksort algorithm
  3. use sort ‘_mergesort’; # use a mergesort algorithm
  4. use sort ‘defaults’; # revert to default behavior
  5. no sort ‘stable’; # stability not important

Like, if you need a “stable” sort (which mergesort is) – when same elements preserve their initial order.