Book Review – Peak: Secrets from the New Science of Expertise


Photo by Kalen Emsley

Peak is co-authored by Anders Ericsson (a heavyweight in the field of psychology) and Robert Pool (an established technical writer). Peak is a summation of nearly 40 years of Dr. Ericsson’s research into elite performance and some of the surprising discoveries made along the way. A lot of his research has surfaced in other titles, such as:

  • Outliers: The Story of Success by Malcolm Gladwell – Introduced the concept of the 10,000-hour rule, which was derived from Dr. Ericsson’s research. Outliers takes a very different position on the meaning of the results, and it is interesting to see the contrast.
  • Deep Work by Cal Newport – A recommended book on applying many of these concepts in the context of programming and the knowledge-worker economy.

But Peak comes straight from the man himself and reveals quite a few insights overlooked by other authors.

Deliberate Practice

This book centers around the concept of deliberate practice: what it is and isn’t, and how it’s different from how we usually learn. So what is deliberate practice? Let’s review the key concepts:

  1. Has Specific Goals – Deliberate practice must have very clear constraints and focus. You must dive deep to become an elite performer, and goals keep us on track. Veering off costs you time and productivity.
  2. Focused Practice – You need to be completely focused for long periods, with no distractions. The authors mention that high-level violinists practice so intensely that many take mid-day naps to recover. Most people have to build up to the required level of focus through practice.
  3. Always Uncomfortable – It’s important to always be working on the areas where you are weakest. This seems obvious, but usually once we get pretty good at something we stick to habit and stop improving.
  4. Feedback Loop – In order to improve the areas where we are weak, we first need to know where we are weak. It’s important that our practice gives us as close to real-time results as possible, so we know whether we are heading in the right direction.
  5. Teachers – If possible you should have a teacher, coach, or mentor to help uncover your weak areas. They can also show you the proper mental representations, which will improve your ability to make complex ideas useful.

Mental Representations

One of the concepts discussed is the idea of mental representations. The key idea is that once concepts become intuitive to us, we can generalize them and build a hierarchy of knowledge.

For instance, the book describes a study participant who worked his way up to memorizing strings of roughly 80 digits. At first he just tried memorizing the numbers but was limited to around 7-9 digits, which is the usual limit of short-term memory.

By creating memorized representations like robot is 42 or cat is 68, simply picturing a robot picking up a cat could generate the number 4268.
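To make the chunking idea concrete, here is a toy sketch in Python. The robot/cat table is just the example from above; a real system would need a table covering all 100 two-digit chunks.

# Hypothetical chunk-to-image table, using the examples from the text.
CHUNK_IMAGES = {"42": "robot", "68": "cat"}

def encode(digits):
    # Split the digit string into two-digit chunks and map each to an image.
    return [CHUNK_IMAGES[digits[i:i + 2]] for i in range(0, len(digits), 2)]

print(encode("4268"))  # ['robot', 'cat'] -> picture a robot picking up a cat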

We humans are better at recalling stories and images (chess masters, for example, sometimes describe picturing the board as lines of force) than raw information. This means a large part of building knowledge involves finding the best way to represent a complex topic as a simple abstraction.

Things need to be practiced and mulled over until they seem obvious. Concepts made intuitive integrate easily with the rest of your knowledge, and this web of knowledge allows you to make the kind of new and unique connections seen in expert performance.

Final Thoughts

Peak: Secrets from the New Science of Expertise is an excellent book. I would say it’s required reading for anyone in a technical field.

Unfortunately, if you are looking for step-by-step advice, you won’t find it here.

But it does a great job of giving you everything needed to recognize deliberate practice and to judge whether you are actually performing it. Figuring out the exact steps ultimately requires advice from experts as well as experimentation.

If you are a developer you may find it difficult to find truly great teachers. The good news for those starting out: you just need someone who is good at explaining things and better than you. But as you progress, you will continue to need better teachers.

Something Cal Newport brings up quite a bit is that few people know about or implement these ideas. That means if you can figure them out, you can enjoy a huge advantage over your peers. It’s fertile territory for those who want to blaze a path, since most knowledge work has yet to reach the structure of sports or music.

If you are interested, you can pick up Peak: Secrets from the New Science of Expertise from Amazon fairly cheap; the audio version is also great.

If you enjoyed the read or have any comments, be sure to follow me @zen_code and let me know.

Do Vim Plugins Improve Productivity?


Photo by Aneta Ivanova

I’m usually open to experimentation when it comes to productivity in my development work. I recently stumbled upon an article advocating Vim bindings for Visual Studio and decided to take the plunge and give it a try for at least a week. I had a little Vim experience, so I figured I wasn’t flying completely blind. The idea of being able to work without ever moving my fingers away from the home row was quite appealing to me. I figured that alone would be a workflow improvement, not to mention the wealth of other features.

Sensitive users may want to skip this next statement. I know some of you are thinking: isn’t Vim only for people who still call themselves Amiga programmers? Why should we be moving backwards? My honest answer: I don’t know. But I have spoken with others and read quite a few posts which have given me a good enough argument to at least find out for myself.

Deciding to go full immersion, I rebound the keys in all of my IDEs and dug in. Boy, did I have a rude awakening as to how polished my Vim skills really were.

First There Was Despair

My first few days were the polar opposite of productive. I found myself grasping for my cheat sheet of Vim modes and commands at what felt like every 30 seconds. Struggling and having to look up a command you literally just looked up is a good test of humility. More than a few times I had to turn off the plugins to get important work done quickly. Cheating, I know.

But as I persisted, I improved. Getting better at the basic things led to exploring the more difficult things. It helped to know the light at the end of the tunnel: regardless of the outcome, I would be gaining a useful skill.

It’s also worth briefly discussing why I decided to use plugins for other IDEs rather than Vim itself. The truth is I have Visual Studio and Sublime tuned for the type of projects I’m currently working on. Does that mean I shouldn’t explore using Vim itself in my workflows? Sure I should, and I will, but I didn’t want to bite off more than I could chew. This does come at the cost of losing some of what some would argue are Vim’s best features.

What Was Gained?

So I know I’m going to leave out someone’s favorite feature here. But these are just some of the features I found to be useful early on, and they are in no way meant to represent the full breadth of features these plugins have.

  • The base key bindings are all very close to the home row. No reaching for the mouse required if you set things up right. Once the keybindings become second nature, text just seems to flow like a river. No need to look down to jump to the end of a document or move the cursor with the arrow keys.
  • Macros are a huge advantage when doing very repetitive tasks. Being able to quickly record a set of keystrokes and then replay them is handy when working in HTML.
  • Similar to macros, but rather than recording keystrokes you compose and execute a pattern of keystrokes, such as jump 4 words or down 10 lines. What’s neat is that even those can be stacked (see the examples after this list).
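For the curious, here are a few standard normal-mode examples of what I mean (exact support varies by plugin):

4w       " jump forward 4 words
10j      " move down 10 lines
d2w      " delete the next 2 words (operator + count + motion, stacked)
qa...q   " record keystrokes into register a (the ... is whatever you type)
@a       " replay the recorded macro
5@a      " replay it 5 times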

Final Thoughts

So, the big question: was it worth the struggle? Yes, I think so. It was far from easy, but I did in fact find that it improved my input speed. While far from being a Matrix-like neural plug, it’s kind of weird how it begins to feel like you can almost think it and it happens. Things require so little movement it’s almost hard to explain. But as it’s jokingly called, the Vim learning cliff is not easy to scale, and it may be worth easing in a bit more slowly than I did. I am still far from smooth and still find myself regularly looking things up. There is so much nuance available that it could take years to master, which is kind of exciting for me. But if you are interested in following suit, here is a list of plugins below. You can find one for pretty much any editor.

Other Useful Resources and Links

  • Sublime Six – Surprisingly broad support for Vim; worth checking out if you use Sublime.
  • VsVim – Adds support to Visual Studio. A bit limited on features compared to others, but has all the important ones.
  • vim-adventures – Learn Vim as a game if that is your learning style. It is paid, but you do get the first three levels for free.
  • Cheat sheet – Don’t be surprised if you clutch it like a life preserver at first. This is the one I used, but there are plenty of other good ones that may be better formatted for you.
  • Learning Goodies – A wonderful list of links to help you on your learning adventure.

If you decide to take the journey, be sure to let me know @zen_code.

K-Means What? A Less Bewildering Introduction

Today my hope is to give a less bewildering introduction to one of the cornerstones of unsupervised learning, K-Means. The only prerequisites are some programming experience and a passing understanding of Python. If you are already a rock star machine learning developer, then you likely know all of this like the back of your hand. As for everyone else, buckle up.

So, a quick refresher on some machine learning 101. There are two primary types of learning algorithms: supervised and unsupervised. Supervised algorithms validate their estimations while learning by correcting themselves against supplied answers. Supplying the answers allows the algorithm to model data in a way specified by its creator.

Unsupervised algorithms simply require data with enough inherent context for them to unravel a pattern. This is handy, since one difficulty in machine learning is labeling enough data for an algorithm to learn to fill gaps and generalize accurately. But it does come at a cost. Unsupervised algorithms will not be able to label and categorize as straightforwardly as supervised learners, due to the inherent context that labels give data. For instance, it would not be possible to give an unsupervised algorithm trained on animal data points a dog and expect it to directly output the category dog. But it will likely throw it in the same category as other dog data points, and perhaps wolves as well. Unsupervised learners’ real strength is finding patterns we are not looking for.

So, let’s get started.

K-Means Basics

We are going to use the variable k to denote the number of categories (clusters) we want the algorithm to split our data into.

Let us also call the center (x, y) point of a category its centroid. More on how this works shortly. Each data point x_n will always be assigned to its closest centroid.

For those who (like myself) slept through math class, let’s quickly talk about finding the distance between two vectors, which is formally noted as:

\left \| \vec{x^n} - \vec{\mu^k} \right \|

Broken down, that looks like:

distance = \sqrt{(x_{2} - x_{1})^2 + (y_{2} - y_{1})^2}
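In plain Python, that formula is just the following (a minimal sketch for 2D points represented as (x, y) tuples):

import math

def euclidean_distance(a, b):
    # straight-line distance between two 2D points
    return math.sqrt((b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2)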

For those a little rusty on their Euclidean geometry, here is a simple explanation of how distance is derived.

Optional Nerd Knowledge

I’m sure, like most of us, you’re wondering why we don’t just use cosine dissimilarity or some other measure of distance. In truth, there are variations of k-means which do calculate distance with other methods.

Here is a brief explanation of why Euclidean distance is used, as well as why we square the distance, as you will see below. The short, smarty-pants answer goes as follows: “.. the sum of squared deviations from (the) centroid is equal to the sum of (the) pairwise squared Euclidean distances divided by the number of points”. But mostly, it saves us from taking the square root when comparing distances: square root is monotonic, so the nearest centroid comes out the same either way.
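A quick sanity check of that last claim, with hypothetical points for illustration; picking the nearest centroid by squared distance gives the same answer as picking it by true distance:

def squared_distance(a, b):
    return (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2

centroids = [(0.0, 0.0), (5.0, 5.0)]
point = (1.0, 2.0)

# Index of the nearest centroid; sqrt or no sqrt, the winner is the same.
nearest = min(range(len(centroids)),
              key=lambda i: squared_distance(point, centroids[i]))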

Step Through

Let’s quickly walk through the algorithm’s steps. But first, let’s randomly initialize the x and y positions of k centroids.
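One common way to do that initialization is to seed each centroid at a randomly chosen data point (a minimal sketch; fancier schemes like k-means++ also exist):

import random

def init_centroids(positions, num_clusters):
    # positions is a list of (x, y) tuples; each centroid starts on a random point
    return random.sample(positions, num_clusters)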

1.) For each point, we find the centroid with the smallest squared distance and assign the point to it:

\min_{k} \left \| \vec{x^n} - \vec{\mu^k} \right \|^2

2.) Now that we have updated all of our points, let’s update our centroids. Each centroid moves to the average of all its newly assigned points, which is computed by adding up all the points in each cluster and dividing by the count:

\vec{\mu^k} = \frac{1}{\left | C_k \right |} \sum_{\vec{x^n} \in C_k} \vec{x^n}

Now all that is needed is to repeat steps one and two, either for a fixed number of iterations (say 100) or until the centroids stop moving by any significant amount.

Now that wasn’t too bad, was it? Let’s see a bit of sample code to demonstrate it in practice.


# points is a list of (PVector, cluster_index) tuples; k is a list of
# PVector centroids. PVector is Processing's built-in 2D vector class.

def assign_clusters():
    for x_indx, x in enumerate(points):
        min_distance = float("inf")  # sys.maxint is Python 2 only
        min_category = 0

        for idx, centroid in enumerate(k):
            # use PVector's built-in dist() to keep things simple
            distance = x[0].dist(centroid)

            # is it closer? if so, let's make it our category
            if distance < min_distance:
                min_distance = distance
                min_category = idx

        # update the point with its new category
        points[x_indx] = (x[0], min_category)

def update_centroid():
    for idx, _ in enumerate(k):
        total = PVector(0, 0)  # avoid shadowing the built-in sum()
        count = 0

        # sum the vectors assigned to this centroid
        for p in (item for item in points if item[1] == idx):
            total.add(p[0])
            count += 1

        # bad things happen when you divide by zero (empty cluster)
        if count == 0:
            continue

        # normalize to the average position of all assigned points
        total.div(count)
        k[idx] = total
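The two functions above never actually loop, so here is a sketch of the missing driver, assuming the same global points and k and Processing's PVector:

def run_kmeans(max_iterations=100, tolerance=0.001):
    for _ in range(max_iterations):
        # snapshot the centroids so we can measure how far they move
        old_k = [PVector(c.x, c.y) for c in k]
        assign_clusters()
        update_centroid()
        # stop early once no centroid moved more than the tolerance
        if all(c.dist(old) < tolerance for c, old in zip(k, old_k)):
            break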

Advantages

  • It’s simple, fast, easy to understand, and usually pretty easy to debug.
  • Reliable when clear patterns exist.

Disadvantages

  • It can get stuck in local optima, which usually requires re-running the algorithm several times and taking the best result.
  • It won’t detect non-linear (non-convex) clusters.

K-Means doesn’t like non-linear data 🙁 [source]

Final Thoughts

Hopefully you have found this to be useful, if not or should you have any questions be sure to let me know on twitter @zen_code. Of course this article is a pretty elementary description of what K-Means can do. For those looking for a bit more, or would like to see some interesting applications. If you interested in using this on a large or production data set, what ever you do don’t right you own. Sci-kit learn has an excellent k-means tool set. I’ve included a link to the full code as well additional resources below and as always happy learning.