# Category Archives: Algorithms

Math is beautiful – it’s sheer magic to me, but now and then good people publish approachable articles that let me get a tiny dash of understanding of how things work.

There’s one such article today – https://innovation.vivint.com/introduction-to-reed-solomon-bc264d0794f8 (and the complementary https://medium.com/@jtolds/joseph-louis-lagrange-and-the-polynomials-499cf0742b39) – that I can highly recommend. It’s a fairly simplified overview that gives just the basic idea of one particular error-correction technique, but it’s simple yet comprehensive.

In fact it was so fascinating I couldn’t stop myself from giving it a (very short and simple) try. I won’t repeat the articles in any way, just post some code with comments.

Let’s say we have this code in Python (minor comments inline):

```
xs = [
    1.0,  # float format to make sure calculation precision is not impacted - it fails badly otherwise
    2.0,
    3.0,
    4.0,
]

ys = [
    10,  # the values here are random - like in "I made them up". But this is about "any numbers", right?
    42,
    74,
    111,
]

# it's a direct representation of what's described in the
# https://medium.com/@jtolds/joseph-louis-lagrange-and-the-polynomials-499cf0742b39 article
def l0(x):
    return ((x - xs[1]) * (x - xs[2]) * (x - xs[3])) / ((xs[0] - xs[1]) * (xs[0] - xs[2]) * (xs[0] - xs[3]))

def l1(x):
    return ((x - xs[0]) * (x - xs[2]) * (x - xs[3])) / ((xs[1] - xs[0]) * (xs[1] - xs[2]) * (xs[1] - xs[3]))

def l2(x):
    return ((x - xs[0]) * (x - xs[1]) * (x - xs[3])) / ((xs[2] - xs[0]) * (xs[2] - xs[1]) * (xs[2] - xs[3]))

def l3(x):
    return ((x - xs[0]) * (x - xs[1]) * (x - xs[2])) / ((xs[3] - xs[0]) * (xs[3] - xs[1]) * (xs[3] - xs[2]))

# as well as this
def f(num):
    return ys[0] * l0(num) + ys[1] * l1(num) + ys[2] * l2(num) + ys[3] * l3(num)

for x in range(0, 10):
    print("{}: {}".format(x, f(x)))
```

And we run it and we get this:

```0: -27.0
1: 10.0
2: 42.0
3: 74.0
4: 111.0
5: 158.0
6: 220.0
7: 302.0
8: 409.0
9: 546.0
```

(you can add more points; you could also use matplotlib to plot them)

So as you can see, it reproduced the four pre-defined points (10, 42, 74 and 111), but also calculated other points on the curve. Now let’s say we sent 6 points, but the client received only points 1, 2, 5 and 6 (10, 42, 158, 220).

If we adjust the input values to look like this:

```
xs = [
    1.0,
    2.0,
    5.0,
    6.0,
]

ys = [
    10,
    42,
    158,
    220,
]
```

and run it again, we’d still get all the same values, because 4 points are enough to define a cubic curve, and these were taken from that very curve:

```0: -27.0
1: 10.0
2: 42.0
3: 74.0
4: 111.0
5: 158.0
6: 220.0
7: 302.0
8: 409.0
9: 546.0
```

Magic, right?
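The four hand-written basis functions generalize naturally to any number of points. Here’s a generic sketch (same math as above, just looped – the function name `lagrange` is my own):

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = float(yi)
        for j, xj in enumerate(xs):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# reconstruct the "missing" value at x = 3 from points 1, 2, 5 and 6
print(lagrange([1.0, 2.0, 5.0, 6.0], [10, 42, 158, 220], 3.0))
```

Handy when you want to play with more than four points without writing out a basis function per point.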

Some extra calculations around it are also available at http://mathonline.wikidot.com/linear-lagrange-interpolating-polynomials

# Hash functions

This is “JFTR” – saving a few known simple hash functions for future reference (nothing special, really – it’s a “stash” post).

```
unsigned int HashRs(const char * str)
{
    static const unsigned int b = 378551;
    unsigned int a = 63689;
    unsigned int hash = 0;

    for (; *str; str++)
    {
        hash = hash * a + (unsigned char)(*str);
        a *= b;
    }

    return hash;
}
```
```
unsigned int HashLy(const char * str)
{
    unsigned int hash = 0;

    for (; *str; str++)
        hash = (hash * 1664525) + (unsigned char)(*str) + 1013904223;

    return hash;
}
```
```
unsigned int HashRot13(const char * str)
{
    unsigned int hash = 0;

    for (; *str; str++)
    {
        hash += (unsigned char)(*str);
        hash -= (hash << 13) | (hash >> 19);
    }

    return hash;
}
```
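For quick experiments it can be handy to have one of these in a scripting language too. Here’s a sketch of a Python port of HashRs (my own translation, not from any original source) – the masks emulate C’s 32-bit unsigned overflow:

```python
def hash_rs(s: str) -> int:
    """Python port of HashRs; & 0xFFFFFFFF emulates 32-bit unsigned wraparound."""
    b = 378551
    a = 63689
    h = 0
    for ch in s.encode("utf-8"):  # iterate over bytes, like the C version does
        h = (h * a + ch) & 0xFFFFFFFF
        a = (a * b) & 0xFFFFFFFF
    return h
```

Because `a` changes on every step, the result depends on character positions, not just on which characters occur.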

# Revelations

Number of bits in a number N is floor(log2 N) + 1.

Things like this make me tear off patches of imaginary hair from my bald skull in desperate frustration, asking “what the hell did I waste my time on in university?” For whatever I learned there has nothing to do with CS. Nothing to do with anything, in fact – even my drinking habits are long different from those days.
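A quick check of the formula, using Python’s `int.bit_length()` as ground truth (note that float `log2` gets unreliable for very large N, so this is only a small-number sanity check):

```python
import math

def bits(n: int) -> int:
    """Number of bits needed to represent a positive integer n: floor(log2 n) + 1."""
    return math.floor(math.log2(n)) + 1

for n in (1, 2, 3, 255, 256, 1023, 1024):
    assert bits(n) == n.bit_length()
```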

# Heap algorithm in Perl

Another day, another study. Again, nothing fancy – merely an FTR: a heap (or rather minheap, with the root containing the minimum) implementation in Perl. The `extract_top` name is there to keep the main interface similar to existing CPAN modules (like Heap::Simple).

```
package Heap::Hydra;

sub new {
    return bless {_heap => [undef,], _heap_size => 0}, shift;
}

sub getParent {
    my $self = shift;
    my $child_pos = shift;
    my $parent_pos = int($child_pos * 0.5) || 1;
    return $self->{_heap}->[$parent_pos], $parent_pos;
}

sub getMinKid {
    my $self = shift;
    my $parent_pos = shift;
    my $child_pos = $parent_pos * 2;
    return undef if $child_pos >= scalar @{$self->{_heap}};
    my $min_child = $self->{_heap}->[$child_pos];
    if (defined $self->{_heap}->[$child_pos + 1] && $self->{_heap}->[$child_pos + 1] < $min_child) {
        $child_pos += 1;
        $min_child = $self->{_heap}->[$child_pos];
    }
    return $min_child, $child_pos;
}

sub extract_top {
    my $self = shift;
    my $top_pos = shift || 1;
    my ($new_top, $new_top_pos) = $self->getMinKid($top_pos);
    if (!defined $new_top) {
        return splice @{$self->{_heap}}, $top_pos, 1;
    }
    my $prev_top = $self->{_heap}->[$top_pos];
    $self->{_heap}->[$top_pos] = $self->extract_top($new_top_pos);
    $self->{_heap_size}--;
    return $prev_top;
}

sub insert {
    my $self = shift;
    my $child = shift;
    $self->{_heap_size}++;
    $self->{_heap}->[$self->{_heap_size}] = $child;
    my $child_pos = $self->{_heap_size};
    my ($parent, $parent_pos) = $self->getParent($child_pos);
    while ($parent > $child) {
        $self->{_heap}->[$parent_pos] = $child;
        $self->{_heap}->[$child_pos] = $parent;
        $child_pos = $parent_pos;
        ($parent, $parent_pos) = $self->getParent($child_pos);
    }
    return $child_pos;
}
```

# Greatest prime divisor

Just for the record – a custom algorithm for finding the greatest prime divisor of a number, O(n) worst case (with O(sqrt(n)) taken by primality detection):
```
import math

def prime(num, get_number=False):
    for divisor in range(2, int(math.sqrt(num)) + 1):
        if num % divisor == 0:
            return False
    return num if get_number else True
```


```
def hydra_gpd(num, get_number=None):
    sequence = range(2, int(math.sqrt(num)) + 1)
    gpd = 1
    for divisor in sequence:
        divided = int(num / divisor)
        if num % divisor == 0:
            if prime(divisor) and divisor > gpd:
                gpd = divisor
            if prime(divided) and divided > gpd:
                gpd = divided
            recursive_gpd = hydra_gpd(divided)
            if recursive_gpd > gpd:
                gpd = recursive_gpd
            return gpd
    return gpd
```

Nothing too special, just to keep it around.
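For cross-checking results, the textbook non-recursive approach is to divide small factors out from the bottom up; whatever survives is the largest prime factor. This is just a baseline sketch for comparison, not part of the algorithm above:

```python
def gpf(num: int) -> int:
    """Greatest prime factor by trial division: divide out every small
    factor; the leftover (if > 1) is prime and larger than any factor found."""
    largest = 1
    d = 2
    while d * d <= num:
        while num % d == 0:
            largest = d
            num //= d
        d += 1
    if num > 1:
        largest = num
    return largest
```

One difference worth noting: this baseline returns the number itself for a prime input (e.g. 13 → 13), while `hydra_gpd` as written returns 1 in that case, since its loop never finds a divisor.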

# Accidental hashing

How often do you use raw data structures in your everyday coding routine? No, I don’t mean when you populate a dictionary (hash) or fill an array (list). I mean constructing a hash yourself, or building a binary search tree, or something like that? Well, you might be luckier than me and do that regularly – but me, I seldom do.

There was a case recently, though, and I was puzzled: how do I get a simple hash function that handles an arbitrary key, depends on character positions (so “foo” and “oof” wouldn’t produce the same result), and produces a result of limited range to reduce collisions – because collisions are as likely as the ratio of the key dictionary size to the hash size (i.e. the size of the list of all possible keys to the number of “slots” in the hash).

And I accidentally ran into this:

```
key_hash = 1.0

for character in key:
    key_hash = ord(character) / key_hash  # ord(), since the key is a string
```

and it works like a charm: it produces square-root-like results (thus sticking within a quite limited range), but more varied. And if you need to spread it over a bigger hash – you just multiply it by 10^n. [The original post had a plot of the sequence for n = 1..100 here, with sqrt(n) overlaid as a stroked line.] Results for arbitrary keys:

“foobar” : 1.129169193450576

“this is supposed to be a long line, long enough to represent some arbitrary dictionary entry” : 98.989898474541334

so you multiply it by 10^n, take the remainder of division by the hash size – and you’re done!

Of course it ain’t perfect – for one, “o” and “ooo” yield the same result – but it was perfect for the simple case I had (and can be improved if required).
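Put together as a function (my own wrapping of the trick above – the `10**6` scale factor and the final modulo step are illustrative assumptions, pick values to suit your table):

```python
def accidental_hash(key: str, size: int, scale: int = 10**6) -> int:
    """Map a string key to a bucket in [0, size) using the division trick."""
    key_hash = 1.0
    for character in key:
        key_hash = ord(character) / key_hash
    # spread the small float over a bigger range, then fold into the table size
    return int(key_hash * scale) % size

bucket = accidental_hash("foobar", 1024)
```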

# Perl sort

You only realize the difference when you compare things. I sometimes wondered “how fast is the built-in Perl sort, really?”, but never got around to any comparison (until I finally did some sort-algorithm research). Well, no surprises, of course (and my implementations are quite sub-optimal), but still, sorta FTR – this is sorting a 10K-element array filled with int representations of rand(10000) values:

• perl: 0.00242
• quicksort: 0.097454
• mergesort: 0.115439
• insert sort: 8.223053

Another FTR – Perl has had both quicksort and mergesort implementations since 5.8; as I understand it, the stable mergesort is actually the default since then, and the quicksort variant shuffles large arrays to protect against the worst-case pre-sorted scenario. You can change the sort algorithm with these pragmas (from http://perldoc.perl.org/sort.html):

```
use sort 'stable';      # guarantee stability
use sort '_quicksort';  # use a quicksort algorithm
use sort '_mergesort';  # use a mergesort algorithm
use sort 'defaults';    # revert to default behavior
no sort 'stable';       # stability not important
```

Like, if you need a “stable” sort (which mergesort is): one where equal elements preserve their original order.
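The same experiment is easy to repeat in Python (a sketch; the absolute timings will obviously differ from the Perl numbers above, but the built-in sort should win by a similar margin over a naive hand-rolled quicksort):

```python
import random
import timeit

def quicksort(a):
    """Naive out-of-place quicksort, for comparison against the built-in sort."""
    if len(a) <= 1:
        return a
    pivot = a[len(a) // 2]
    return (quicksort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + quicksort([x for x in a if x > pivot]))

data = [random.randrange(10000) for _ in range(10000)]
print("built-in: ", timeit.timeit(lambda: sorted(data), number=10))
print("quicksort:", timeit.timeit(lambda: quicksort(data), number=10))
```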