Piecharts: dark side without cookies

I was just reading a security article when I realized that it contains fairly interesting and rather valuable data, but its presentation is completely wrong. And it’s mostly because of extensive pie chart usage. There’s an excellent review of what in particular is wrong with pie charts (in many ways), so I won’t repeat that – I’ll just do a short “case study” here.

So let’s have a look at this one example:

[Image: pie chart “Most frequently targeted countries in 2012-2013”]


Note that I copied it with a part of URL, and that’s for a reason that I’m gonna start the list with:

  1. it doesn’t fit on screen (chart subject couldn’t fit at all), and my Air is not the smallest screen example. Yes, legend could be rearranged, but…
  2. chart needs a separate legend anyways. You just can’t properly tie labels (not even numbers) to slices – all attempts are futile. Which makes it a mandatory
  3. to travel back and forth between slice size, its location, number and legend (and scroll if your screen is not huge). And then my favorite,
  4. colors! Every chart that needs more than a few (like, 4) colors is a bad idea. I cannot tell Ukraine from Vietnam here (no one needs to know what’s under 3%, but let’s leave that aside) and India from Kazakhstan color-wise. And I’m not even color-blind in this spectrum! (although I am in red-green shades, so I feel for others)

And now what the above essentially is:

[Image: “Most frequently targeted countries in 2012-2013”]


yes, it’s THAT bloody simple! And it’s a very fast approach. And you can easily reduce sight travel (and improve precision) like this:

[Image: “Most frequently targeted countries in 2012-2013”]


That’s it! One color, tightly packed (yet not crumpled), neat and straight to the point. Ain’t no jolly colors, true that – but it ain’t no lollipop fair ad either.

So… just don’t use those pie charts, there’s pretty much no use case where they are better than other visual data representations. Not unless it’s actually a tasty pie you’re sharing, at least.

And while we’re at it: LibreOffice charts widget sucks. Even GoogleDrive, with all the simplicity, shines (usability-wise, at least) in comparison.

Latencies and sizes

This was hanging around in my Keep for quite a while – right till I realized that I need it somewhere for a quick reference, although it just messes my Keep flow. Thus posting it here in a “JFTR” fashion.

Data access latencies:

L1 cache reference ...................         0.5 ns
Branch mispredict ....................           5 ns
L2 cache reference ...................           7 ns
Mutex lock/unlock ....................          25 ns
Main memory reference ................         100 ns
Compress 1K bytes with Zippy .........       3,000 ns = 3 µs
SSD random read ......................     150,000 ns = 150 µs
Read 1 MB sequentially from memory ...     250,000 ns = 250 µs
Round trip within same datacenter ....     500,000 ns = 0.5 ms
Read 1 MB sequentially from SSD ......   1,000,000 ns = 1 ms
Send 1 MB over 1 Gbps network ........  10,000,000 ns = 10 ms
Disk seek ............................  10,000,000 ns = 10 ms
Read 1 MB sequentially from disk .....  20,000,000 ns = 20 ms
Send packet CA->Netherlands->CA ...... 150,000,000 ns = 150 ms
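
These numbers are most useful as ratios. A minimal sketch for comparing them, with a handful of values copied from the table above (the dictionary and the `ratio` helper are just illustrative names, not part of any library):

```python
# Latency figures copied from the table above, in nanoseconds.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "SSD random read": 150_000,
    "Round trip within same datacenter": 500_000,
    "Disk seek": 10_000_000,
    "Send packet CA->Netherlands->CA": 150_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """How many `fast` operations fit into the time of one `slow` operation."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


if __name__ == "__main__":
    for slow, fast in [
        ("Main memory reference", "L1 cache reference"),
        ("Round trip within same datacenter", "Main memory reference"),
        ("Disk seek", "SSD random read"),
    ]:
        print(f"1 x {slow} ~ {ratio(slow, fast):,.0f} x {fast}")
```

So, going by the table, one round trip inside a datacenter costs about as much as 5,000 main memory references.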

MySQL Storage Requirement per Data Type:

TINYINT – 1 byte (BOOL is an alias for this)
SMALLINT – 2 bytes
MEDIUMINT – 3 bytes
INT, INTEGER – 4 bytes
BIGINT – 8 bytes
FLOAT(p) – 4 bytes if 0 <= p <= 24, 8 bytes if 25 <= p <= 53
FLOAT – 4 bytes
BIT(M) – approximately (M+7)/8 bytes
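
The two rules that need arithmetic, BIT(M) and FLOAT(p), can be sketched as below. The function names are made up for this note; the formulas are the ones from the list, with (M+7)/8 read as integer division, i.e. M bits rounded up to whole bytes.

```python
def bit_bytes(m: int) -> int:
    """Approximate storage for BIT(M): M bits rounded up to whole bytes."""
    if not 1 <= m <= 64:
        raise ValueError("BIT(M) accepts M from 1 to 64")
    return (m + 7) // 8


def float_bytes(p: int) -> int:
    """Storage for FLOAT(p): 4 bytes up to precision 24, 8 bytes for 25 to 53."""
    if 0 <= p <= 24:
        return 4
    if 25 <= p <= 53:
        return 8
    raise ValueError("FLOAT(p) accepts p from 0 to 53")


if __name__ == "__main__":
    for m in (1, 8, 9, 64):
        print(f"BIT({m}) -> {bit_bytes(m)} byte(s)")
```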

Good vs. Better

I’ve already mentioned this collection of interface tips & tricks named “Good UI” in a direct post on FB, but thought I’d save some things “for later use” here, with my thoughts applied. That article is quite a long read – I usually don’t go for articles like “X reasons for doing something” where X > 10, but this one is rather justified – and you don’t have to go through all of them in a single leap: you’re free to stop when your morning coffee is done and come back next morning.

That said, below goes a list of personal thoughts on each point. Wanted to make sure I actually read those, not scrolled through.

A note: all these things are improvement points, one should never aim to fulfill any portion of this list as a necessary prerequisite to launch a product. If you have the working thing – throw it in and go through the lists like this afterwards, applying whatever is the fastest and cheapest thing to do first, or what you see (not oversee) users struggling with.

  1. One-column layout. This is generally useful when you have a stream of information (like that article itself), although extra column with related article or hierarchy or breadcrumbs or tag cloud could still be useful. Just don’t put any content there.
  2. Gifts like ‘first month free’ – sounds fair, I’d go for a free month. Or first N free deliveries. Or first X movies for free. Or whatever. Usually I don’t follow up (like with Amazon premium), but that’s mostly because I can’t appreciate most of the benefits due to regional restrictions. BTW on giving gifts: make sure customer can claim it before announcing it. I despise Netflix for their “free month” they advertise to everyone while it’s there for US folks only.
  3. Merging similar functions. That’s renowned Occam’s Razor, no comments there.
  4. The social proof – I actually find it annoying and ridiculous. Reviews at a 3rd-party site (like appstore, hotel search, gadget review site) – fine, reviews at the company site (however pulled from social networks) – no go. You want to tell about yourself, you describe what you/your product do. A few sentences, strictly to the point. Links to advanced docs/tutorial. Short example, if relevant. That stuff.
  5. Repeating primary action – might be OK if not overused. Don’t really have any opinion here.
  6. Distinct clickable/selected styles – this one’s obvious, but still often neglected. It’s very important to distinguish clickable items, as well as to mark selected ones. What’s more important though is keeping it common, something the audience is used to. Think buttons, blue colors.
  7. Recommendations. Choices. Analysis paralysis. Right.
  8. Undos. This is big, one of the things I love Gmail for. Although worth saying “consider frequency” here – if the button is to be pressed once a week, there’s no need to bother with undos – an annoying confirmation would do just fine.
  9. Telling who the thing is for – I guess it makes sense.
  10. Being direct – sure, why not. I mean, it’s always better to state something than question or get lost in options. Some might disagree with your statement, but most will take strong opinion as “I guess those folks know what they’re saying”. Important thing here: you must know what you’re saying, there should be concrete base for that.
  11. More Contrast (for key elements) – definitely a good thing, just don’t go extreme.
  12. Showing where it’s made – it, well, depends. On where it is made, just as well as on the audience it’s been targeted at. Don’t want to throw any examples in, I guess you can imagine a handful yourself.
  13. Fewer form fields – indeed so. I always wonder why would image sharing site need my home address. I’m making it up, but some cases are puzzling indeed.
  14. Expose options aka radiobuttons vs dropdowns. It’s relevant to some cases, irrelevant to others. Just consider how many steps (or time) would it take customer to the next page (scroll/search included). Meaning, you can’t just unravel all items from country list on a page – user would still need to search for the country, but select-as-you-type wouldn’t work, and getting to other elements would require extra scrolling.
  15. Endless Scroll – personally, I think this is a brilliant example of how technically sexy feature was considered useful “because it’s so cool”. While it’s everything but. You have many items to show? Solve it the other way. Filters, tags, classes. Explicit “show next page”, after all. Just don’t load more when no one asked for that. It’s like caching – it looks attractive and sounds simple, but there’s so many edge cases that it’s actually efficient in, like, 5% of cases. Or maybe even less.
  16. Keeping focus – let’s see. When I see a link within a text I read, I open it in a new tab. And then I contemplate whether to go read that one first because it might influence the current page understanding or read that later because it could just add some extra points to the content after I consume it. Yet I often use this approach myself – call it a habit or a crucial hypertext principle. Links are important – but there’s sidebar, “see also” or other similar approaches to have references around. All in all… as usual, use it but don’t abuse it.
  17. Showing state – essentially, displaying item’s status. Quite useful.
  18. Benefit button, or droolwords. I think that’s cheap and filthy. There will be more of this, and I’ll shorten them as “drooler”.
  19. Direct manipulations – displaying actions for current element only (and within its scope) is indeed convenient.
  20. Exposing fields, or add actions to description – it’s pretty much shortening the number of steps to conversion. Some common sense on the number of inputs should be applied, but generally reasonable.
  21. Transitions. Well… just read the description. OK, but minor.
  22. Gradual engagement – starting with some partial involvement is “sorta” OK. It requires some considerations, like saving the steps taken and current position for limited time (because if user took 5 steps out of 8 and then came back two days later, he/she wouldn’t remember those 5). So… I’d say it sounds like a lovely tech challenge, but not very beneficial (unless proven by an experiment). I had some experience with this approach regarding Moto G, but never bought one.
  23. Fewer borders – this heavily depends on general visual approach and type of the product. If you’re building a bugtracker or some other element-rich tool, you need borders.
  24. Selling benefits – drooler, buzzwords.
  25. Design for zero data – there’s a sticky phrase in there, “A zero data world is a cold place”. Love it. But the point itself is very fair – considering the starting point, a blank page, is very important.
  26. Opt-out instead of opt-in – that’s up to you. It’s dirty IMO, but no one would really opt-in for a newsletter. I mean, nobody. So… I immediately unsubscribe from everything I receive, but all the sites are doing it the opt-out way anyways.
  27. Consistency – well, yes, sure. This could actually be merged with #6. And #3.
  28. Smart defaults. Weeell… no. I mean, what could you actually guess? City and country, maybe. But that’s about it – I’d rather fill in blanks than think about why those are pre-filled and whether the suggestions are correct (and they would mostly be not, no matter how slightly).
  29. Conventions – add this to #27 (and #6, and #3). It’s all about common experience, least astonishment. That’s a link within a text, is it not? =)
  30. Loss aversion – that even sounds like another drooler, which it definitely is.
  31. Visual Hierarchy – OK, but make sure you don’t take your customer away from the page before he/she got through all the important points. Or do it asynchronously, so customer would stay on the page while still doing things along the way.
  32. Grouping related items – see #27 and other recursive points.
  33. Inline validation – frankly, it puzzles me why people are still doing it the “press-submit-to-see-you-did-it-wrong” way. It’s very annoying, especially (should I even mention it?) when there’s a password involved. Which is, of course, discarded when the validation stage is not passed. So yes, I strongly second this, should be at the top of the list.
  34. Forgiving inputs – yes and no. Phone number fields should definitely accept many formats – but it’s quite easy to parse that. However there are many cases when it’s not relevant – it’s easier to error out (the #33 way, please) than auto-assume something.
  35. Urgency – pure drooler. To clarify: I’m not saying this won’t work – in fact, it might work extraordinarily well. I’m just against this sort of trick, I reckon it’s dirty. IMHO indeed.
  36. Scarcity – ditto.
  37. Recognition – that’s often neglected while being a source of a considerable pain. I often struggle with what I should enter to some field, and dynamic suggestions are of no help.
  38. Bigger click areas – this is essential, I don’t really understand why it’s #38. Should be #3, at least – it’s very cheap yet very effective. Aiming is a problem, especially in laptops/mobile world, which is expanding tremendously.
  39. Faster load times – it’s rather about keeping people busy while page loads. Or doing async loads. That’s dubious – people tend to wait till something is over even though they have options to adjust while they wait. So… just make it snappy.
  40. Keyboard shortcuts – as note says, do it if you have a lot of returning customers that spend significant time on your page. I use Gmail keyboard shortcuts a lot, I think they’re brilliant. But I don’t use Facebook shortcuts (did you know there are some?), because my use case there is mostly “scroll it down”.

Aaand… we’re done here! Playing critic is not that bad after all when you’ve got something to say. Have a nice weekend!

Network communication layers

This is an extra brief outline of the network communication layers – just to keep it around to look at, and to understand the difference between Layer 4 and Layer 7 DDoS attacks.

7 – Application Layer. Constructing appropriate data that another application supporting the same protocol would understand. The term “Layer 7” is commonly used to cover layers 5-7 combined.

6 – Presentation Layer. A bit less fuzzy – represents compression, encryption, encoding etc.

5 – Session Layer. Generally to establish a bond between sides for further communication. Quite fuzzy.

4 – Transport Layer. TCP, generally. Establishes and manages connection, controls flow, retransmissions. TCP connection (three-way handshake): client sends SYN, server responds with its own SYN plus an ACK of the client’s SYN (SYN-ACK), client sends back an ACK of the server’s SYN.

3 – Network Layer. Logical addressing, routing. IP, ICMP.

2 – Data Link Layer. Ethernet, WiFi (802.11), all that jazz. Responsible for Logical Link Control, data framing (putting into proper frames), addressing.

1 – Physical Layer. Establishing hardware specification, encoding/decoding signals, transmitting signals. The only layer that actually transfers data.
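
The Layer 4 handshake is easier to remember with sequence numbers written out. A toy model, not real networking code: the `Segment` type and the initial sequence numbers are invented for illustration, and 32-bit wraparound is ignored.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments that open a TCP connection.

    A SYN consumes one sequence number, so each side acknowledges
    the other's initial sequence number plus one.
    """
    return [
        Segment("client", "SYN", seq=client_isn),
        Segment("server", "SYN,ACK", seq=server_isn, ack=client_isn + 1),
        Segment("client", "ACK", seq=client_isn + 1, ack=server_isn + 1),
    ]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```

A SYN flood (a typical Layer 4 attack) sends the first segment over and over and never sends the third, which leaves the server holding half-open connections.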


Neat little thing, or bash tab-completion for your tools

You know that thing, the magic of having all the options listed in front of you when you [double-]press Tab after typing something on the console? Or the unique option completing itself if there’s a match? Of course you do. One thing that bothered me is the frustration of when it’s suddenly not there.

For general tools it’s already alright, they either come bundled with tab-completion or you can easily set it up – for instance, there’s a setup tutorial for Mac, coming with a Git bundle. One important note on that one: in iTerm, you have to go to settings -> Profiles and change Command to /opt/local/bin/bash -I for your/default profile to run proper bash version.

But then there are your own little tools that start as a one-parameter two-liner but eventually grow to 30-params fire-breathing hydra. And that’s when you start missing that tab-completion thing.

But that’s easy (for simple cases – see a note below) – you just create a script named, say, mycomplete.bash, containing something like this:

_completecmd() {
  # word list is built from fdisk output purely as an illustration
  local complist=$(fdisk 2>&1 | grep -Eo '^ +[a-z]+' | tr '\n' ' ')
  local cur=${COMP_WORDS[COMP_CWORD]}
  COMPREPLY=( $(compgen -W "$complist" -- "$cur") )
}
complete -F _completecmd yourcmd

where _completecmd is a unique function name, yourcmd is the command this should be applied to, and complist is constructed from fdisk output just to illustrate the approach – it should be the parsed output of yourcmd instead. Note: try your parser before you set it up, I faced weird differences on different platforms.

Then you need to add this to your ~/.bashrc:

source /path/to/mycomplete.bash

and you’re done. To have it right away, you can also run source /path/to/mycomplete.bash directly in your bash prompt.

Mind that this approach wouldn’t work for intricate cases when you have a deep parameter sequence dependency – have a look at Git approach, it’s a bloody burning hell there.
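
For the record, the `compgen -W` part of such a completion function is nothing more than prefix matching over a word list. A rough Python equivalent, just to show the logic (the function name and sample words are made up):

```python
from typing import List


def complete_words(wordlist: str, current: str) -> List[str]:
    """Return the words from a whitespace-separated list that start with `current`."""
    return [word for word in wordlist.split() if word.startswith(current)]


if __name__ == "__main__":
    options = "list load save status"
    print(complete_words(options, "l"))
    print(complete_words(options, "st"))
```

An empty `current` matches everything, which is what you get when you press Tab before typing anything.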

Git hooks

Git hooks are lovely. Consider automated syntax check before committing changes:

  • git config --global init.templatedir '~/.git-templates'
  • mkdir -p ~/.git-templates/hooks
  • vi ~/git-pre-commit-hook.sh and put, for instance, following:


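  #!/bin/sh
  retcode=0
  for f in $( git diff --cached --name-status|awk '$1 != "R" { print $2 }' ); do
    echo "Verifying $f..."
    filename=$(basename "$f")
    # everything after the last dot is treated as the extension
    extension="${filename##*.}"
    if [ "$extension" = "pl" ] || [ "$extension" = "pm" ]; then
      perl -c "$f"
      lineretcode=$?
      if [ $lineretcode != 0 ]; then
        # remember the failure, but keep checking the remaining files
        retcode=$lineretcode
      fi
    fi
  done

  if [ $retcode != 0 ]; then
    echo "Pre-commit validation failed. Please fix issues or run 'git commit --no-verify' if you're certain about skipping the validation step."
    exit 1
  fi

  exit 0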

  • chmod a+x ~/git-pre-commit-hook.sh
  • ln -sn ~/git-pre-commit-hook.sh ~/.git-templates/hooks/pre-commit (point of this to have it centralised – on next step global template is copied to repository, so if it’s a symlink – it’s easier to adjust for all later)
  • cd to your repository and run git init there – it’d re-instantiate the repository with copying global hook templates there

That’s it – now you have yourself a neat little safeguard. There are better techniques of course – flymake, for one, or some IDE build/check configuration. That works, too – but gets too tricky when you have to patch the code on remote instances. In my case, I have to edit files remotely – and while Emacs has no problem with that (Sublime has a plugin for that, too, and you can always add some fuse SSH mount), my flymake settings suck, and I’m not keen to dive into a bloodthirsty piranha pool of its remote execution configuration.

So… git hooks it is.

Basic DNS records list

This is a “you learn better when you write it down” sort of post. Never actually got into DNS record types – as a lot of things I’ve missed, there was just no need and I wasn’t curious enough. Although curiosity without regular application of that knowledge is rather pointless – “you soon will forget the tune that you play”, if you play it just once or twice.

That said, I’m gonna be needing this knowledge soon (I presume), so I thought I better do me a hint page (a “crib sheet”, as the dictionary suggests).

  •  A record – “Address”, a connection of a name to an IP address like, for instance, “example.com. IN A” – where IN is for the Internet, i.e. “Internet Address…” Wildcards could be used for “all subdomains”
  • AAAA – “four times the size”, A-address for IPV6 addresses (see a note on IPV6 below)
  • CNAME – Canonical Name, specifies an alias for existing A record, like “subdomain.example.com CNAME example.com“. Useful to make sure you only have one IP address in A record, and others rely on A name – so if IP changes, it’s one place you have to change it at. Note: do not use CNAME aliases in MX records.
  • MX – Mail eXchange, specifies which server serves zone’s mail exchange purposes – like, for instance, “mydomain.com IN MX 0 mydomain.com.“; final dot is important, 0 is for priority: there could be multiple MX records for the zone, and they are processed in priority order (the lower the number the higher the priority). Same-priority records are processed in random order. Right-side name should be an A record.
  • PTR – specify pointer for a reverse DNS lookup, required to validate hostname identity in some cases – “ IN PTR name.net” (note that IP of name.net is
  • NS – Name Server, specifies a (list of) authoritative DNS server for the domain, for instance: “example.com. IN NS ns1.live.secure.com“. This should be specified at authoritative server as well.
  • SOA – Start Of Authority, an important record with zone’s name server details – “authoritative information about an Internet domain, the email of the domain administrator, the domain serial number, and several timers relating to refreshing the zone“. Example:
    mydomain.com. 14400 IN SOA ns.mynameserver.com. root.ns.mynameserver.com. (
      2004123001 ; Serial number
      86000 ; Refresh rate in seconds
      7200 ; Update Retry in seconds
      3600000 ; Expiry in seconds
      600 ; Minimum in seconds
    )
  • SRV – an option to specify a server for a Service, like “_http._tcp.example.com. IN SRV 0 5 80 www.example.com.” – here’s the service name (_http), priority (0), weight (5) for services with the same priority, and port (80) for the service.
  • NAPTR – recent and complex regexp-based name resolution I’m not keen to get into.
  • There’s MUCH MORE of this crap, hope I won’t need to ever dig that deep
  • There’s also a number of decentralized DNS initiatives
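
The MX priority rule from the list (lower number first, ties in random order) can be sketched as below. This only models the ordering, it’s not how any particular mail server implements it; the host names are placeholders.

```python
import random
from typing import List, Optional, Tuple


def mx_try_order(
    records: List[Tuple[int, str]], rng: Optional[random.Random] = None
) -> List[str]:
    """Order MX hosts for delivery attempts.

    `records` holds (priority, host) pairs. Shuffling first and then applying
    a stable sort by priority keeps equal-priority hosts in random order.
    """
    rng = rng or random.Random()
    shuffled = list(records)
    rng.shuffle(shuffled)
    shuffled.sort(key=lambda record: record[0])
    return [host for _, host in shuffled]


if __name__ == "__main__":
    zone_mx = [
        (10, "backup1.example.com."),
        (0, "mail.example.com."),
        (10, "backup2.example.com."),
    ]
    print(mx_try_order(zone_mx))
```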

Oh, and on IPV6:

  • it’s 128-bit (IPV4 is 32)
  • it’s written in hex, as eight groups of four digits separated by colons
  • it has the following structure:
    global prefix | subnet | interface ID
  • the loopback (localhost) address is 0000:0000:0000:0000:0000:0000:0000:0001
  • and IPV4 record in that case would look like 0000:0000:0000:0000:0000:0000:
  • zeroes could be omitted: ::1 or ::
  • to make sure address is shortened correctly, use ipv6calc util: ipv6calc --in ipv6addr --out ipv6addr --printuncompressed ::1
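
If ipv6calc isn’t around, Python’s standard `ipaddress` module does the same expanding and shortening. A small sketch (the 2001:db8:: address is from the documentation-only range, picked just as an example):

```python
import ipaddress

loopback = ipaddress.IPv6Address("::1")

# Full form with every zero written out.
loopback_full = loopback.exploded

# Shortest form: leading zeroes dropped, longest run of zero groups replaced by "::".
example_short = ipaddress.IPv6Address(
    "2001:0db8:0000:0000:0000:0000:0000:0001"
).compressed

if __name__ == "__main__":
    print(loopback_full)
    print(example_short)
    print(loopback.is_loopback)
```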

Retrieving multi-line sequences from text files

Had some free time and had a need to parse out a number of similar (but not the same) blocks from a log file. There are tools for that – it could be done with a mixture of grep, sed, bash and some arcane magic – but I’m afraid that finding the right toolset, learning the required keys for each and experimenting with their values and combinations would take me longer than just writing a tool. And it was another opportunity to write a few lines in Python, which I don’t do often enough. And I do love text processing.

So I made me a neat little tool that essentially does one simple thing – starts printing the input stream when some (start) trigger is found there, and stops when another (end) occurs. There are some additions – like, print some lines before/after the block, print counts, unique blocks only etc. – but those are glitter, mostly.

Available in a BitBucket repository.
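
The core idea fits in a few lines. What follows is a from-scratch sketch of the start/end trigger logic only, not the code from the repository; the function name, the regex-based triggers and the sample log are all assumptions made for the example.

```python
import re
from typing import Iterable, Iterator, List


def extract_blocks(lines: Iterable[str], start: str, end: str) -> Iterator[List[str]]:
    """Yield blocks of lines from a `start` match through the next `end` match.

    Both triggers are regular expressions. The end trigger is only looked for
    on lines after the start line. A block still open at end of input is dropped.
    """
    start_re, end_re = re.compile(start), re.compile(end)
    block = None
    for line in lines:
        if block is None:
            if start_re.search(line):
                block = [line]
        else:
            block.append(line)
            if end_re.search(line):
                yield block
                block = None


if __name__ == "__main__":
    log = [
        "noise",
        "BEGIN request 1",
        "  payload a",
        "END",
        "more noise",
        "BEGIN request 2",
        "  payload b",
        "END",
    ]
    for found in extract_blocks(log, r"^BEGIN", r"^END"):
        print("\n".join(found))
        print("--")
```

Since it works on any iterable of lines, an open file object can be passed in directly.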

WordPress plugins etc.

I’ve been (quite subconsciously) using WordPress for quite some time now, mostly for my alcoholic beverages blog (it’s in Russian, sorry). Subconsciously because it was the first option GoDaddy offered me as an “automated install” blogging platform – and also because I’ve heard the name a number of moons back, so it should’ve been well documented and supported at that point. It’s on PHP, but who cares. I’ve spent years writing PHP code.

So I had this problem: my articles all have a rating (I use Author Post Ratings plugin by Philip Newcomer), but it’s not possible to see all the high-rated articles, nor is it possible to order articles by rating within a category – and this feature made a lot of sense, because when you go to a site with a bunch of reviews, you usually look for the best stuff within some category.

So I gave it a thought and just went and added required functionality – now it’s there on bitbucket, https://bitbucket.org/hydralien/author-post-ratings/src

Turned out writing WordPress plugins is a no-brainer if you need something simple (I started with a post-by-rating list) – you just add directory, create a PHP file with a proper header, and you’re done. Well, after you add your functionality, that is. WordPress has some lovely documentation on that.

It gets trickier if you need to change “internal behavior” – such as category sort order – but documentation helps there as well, there are filter hooks for that.

I guess this is worth a slogan – something like “Better drinking with no hassle” or “Drinking better just got easier”. Or whatever.

SECR-2013 notes at the end of 2013

Haven’t updated this place for a while – need to get back on this new habit, really. Well, at least I hope I do so in upcoming year (should be a big one anyways). So a quick update just before I tear the old calendar down from the wall (and I pity that, it’s the Futurama genuine 2013 calendar) – some records that date back as far as late October, some points I squeezed from that SECR conference I attended back then. These are quick and without any verbose explanation – rather “FTR” stuff to keep.


  • monitor user delivery/connection time, detect slow ones (compared to others or to what’s expected), direct them to a closer/… node/instance by default. Meaning, you have logs, so you have the request processing times. So if requests of one kind take longer under some circumstances (be it another location or another browser or something) than requests of the same kind under other circumstances, you might need to adjust delivery rules – redirect traffic, test under a different environment etc.
  • config/(code?) deployment through the Dropbox – as simple as it sounds, “simple parts” (I’m thinking configs, but might be updates as well) distribution via Dropbox.
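
The first note above (spotting slow delivery from logs) could start as simple as this. A sketch under assumptions: the log is already parsed into (group, duration) pairs, the group is whatever “circumstance” you care about (region, browser), and the 2x threshold is an arbitrary pick.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable, List, Tuple


def slow_groups(requests: Iterable[Tuple[str, float]], factor: float = 2.0) -> List[str]:
    """Return groups whose median duration exceeds `factor` times the overall median."""
    by_group = defaultdict(list)
    for group, duration in requests:
        by_group[group].append(duration)
    overall = median(d for durations in by_group.values() for d in durations)
    return sorted(
        group for group, durations in by_group.items()
        if median(durations) > factor * overall
    )


if __name__ == "__main__":
    # Hypothetical request timings in milliseconds, keyed by region.
    timings = [
        ("eu", 100), ("eu", 110), ("eu", 90),
        ("us", 100), ("us", 105),
        ("asia", 400), ("asia", 420),
    ]
    print(slow_groups(timings))
```

Medians are used instead of averages so that a single stuck request doesn’t flag a whole group.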

Initial project stage:

  • “visual brief” – ask the basic associations, like “fast”, “pretty”, … – give some appropriate images to pick from, iterate. Like, ask customer what impression his product is aiming to project. Say the answer is “paramount”. Then suggest few images – mountain, the Sun, the Earth and the Moon, big teacher and small student – and ask customer to pick one (or few). And then iterate with styles and colors and other ideas.
  • moodboards (MoodShare…) – pick lots of stuff on the screen, let customer select, SCAMPER (http://www.mindtools.com/pages/article/newCT_02.htm)


  • try test using DB/ssh connect (separate config) to run local changes against test instance remotely
  • togglers: big-feature switches, allow to disable/enable particular features separately, in a config- (or environment-) controlled way (triggers in cookies, ENV, config, URL etc.)
  • same tests on CI and local, local run for feature-related validations (branches), CI for default
  • think performance tests
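
A toggler in its simplest form is a config lookup with an environment override. A minimal sketch, assuming a `FEATURE_<NAME>` variable naming scheme and a plain dict as config (both invented here, not something from the talk):

```python
import os
from typing import Mapping, Optional

TRUTHY = ("1", "true", "yes", "on")


def feature_enabled(
    name: str,
    config: Mapping[str, bool],
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Check a feature switch: the environment wins over config, default is off."""
    environ = os.environ if environ is None else environ
    override = environ.get("FEATURE_" + name.upper())
    if override is not None:
        return override.strip().lower() in TRUTHY
    return bool(config.get(name, False))


if __name__ == "__main__":
    app_config = {"new_checkout": False, "fast_search": True}
    print(feature_enabled("fast_search", app_config))
    print(feature_enabled("new_checkout", app_config, {"FEATURE_NEW_CHECKOUT": "on"}))
```

Cookie or URL triggers would be one more lookup in front of the environment check.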

KPI (http://management.about.com/cs/generalmanagement/a/keyperfindic.htm):

  • measure what you want to adjust (meaning parameters selected for KPI estimation)

Dev points:

  • make user feedback easy (put feedback interface at a clearly visible place, make it simple)
  • get a cool team (people that fit together (this is important), motivated (not just money, also interested in what they’re doing), productive)
  • make internal communication easy (chat, video, whatever – to let teams or team members connect without any hassle)


  • ask testers to write test cases descriptions right after ticket is created (confirmed, put to backlog) and discuss/adjust along the way. I think this one is really beneficial if used the right way, although no personal experience here yet.

Well, that’s it for now – and probably for this year, too. See me (for I think I’m the only reader – or at least a writer – of this blog) in 2014!