Linkedin rolled out their one-click endorsement feature this past month. As I’ve traded clicks with friends and colleagues, I’ve been trying to figure out what their real goal is. On the surface, this feature feels redundant with their existing “recommendation” feature. Given their aggressive efforts to promote this new feature, the data they are gathering is clearly important to the development of the algorithms behind their services – but how?
Linkedin has been fairly quiet about the inner workings of their search engine. This is likely to prevent people from manipulating the results, since there is significant value in being on the first page of a Linkedin search for a lucrative professional skill. They share some basic pointers about how to “be visible” on their help page. Key points from their page:
- There is no single rank for Linkedin Search – results are unique to each user/query
- The profile keywords of both parties (searcher, results) play a significant role
- Rankings are adjusted based on how prior searchers have reacted to your profile
While the above metrics are fine for identifying which candidates are relevant to a search, they don’t rate candidate quality: who actually knows their stuff? What’s missing here is a broader assessment of trust, analogous to “page trust” in graph-based models of the web: a measure of confidence that candidates actually possess the skills they list on their profiles. For example, Google’s search algorithm incorporates an evaluation of the credibility of a site using link patterns, brand signals, and social activity. With these new features, it looks like Linkedin may be trying to adapt Google’s Pagerank algorithm (or something similar) to ranking candidates for specific skillsets.
This feature is important to Linkedin’s primary paying customers: recruiters. In their most recent quarterly results (link here), the Hiring Services business was responsible for 53% of their revenues, up from 48% last year. Recruiter activity plays a strong role in convincing regular business users to sign up and use the site, providing the traffic and value proposition behind their other revenue streams. This is a key part of the difference in their earning power vs. Facebook (we mentioned this in our website revenue model essay). Providing recruiters with a high quality candidate search experience is as important to Linkedin’s business model as ranking websites is to Google’s.
From a technical perspective, there are many similarities between online search and job candidate identification. Start by accumulating a pile of unstructured text documents: at Linkedin they are called profiles (aka. resumes), at Google they are called web pages. Build a query service which generates a ranked list of the most relevant documents for a particular query. Optimize your algorithms so you do a good job of addressing the real intent behind the search. Think of Linkedin as a search engine for human talent and skills.
So What Exactly Do You Do For A Living?
To illustrate where I think Linkedin is going, we’re going to do a little thought experiment. We’re going to help a recruiter identify the best pricing consultant in Atlanta. As of this writing, Linkedin thinks there are about 5000 people that might be of interest to us…
The first challenge with searching unstructured text is that humans often have multiple words and terms for the same thing. For example, our pricing strategy guru might have pricing optimization, revenue management, or just “strategy” on their business card. And we’re not always clear in our language. If our prospect worked in the hotel industry, they might be part of the inventory management department. But don’t ask for that at a logistics company - that team puts boxes in the barn! Early search engines tried to solve this with the “keywords” meta tag, e.g. ok human, you tell me what you want this website to rank for!
Making our fictional recruiter’s job more difficult is the fact that most Linkedin profiles only share a subset of the prospect’s expertise. This is particularly true for passive candidates. Most companies frown on employees posting their full resume to Linkedin. This means you are trying to find and rank people using a few job titles and 100 – 200 words worth of high level descriptions. To be effective at spotting relevant talent in this environment, you need to be able to fill in the blanks. Pricing is a good example of this - a good pricing leader will need expertise in marketing, finance, analytics, process/technology, and leading change. This is a lot to extract from 100 – 200 words! So you need a way to read between the lines like a good human recruiter: extrapolating the balance of the candidate’s skills from what data is available and ranking people based on their expertise in adjacent skills. Maybe the best candidate isn’t in pricing – maybe they’re working in marketing analytics and have a solid base of hands-on process improvement expertise. Someone with this background would be a strong candidate for a pricing role, although it won’t be written on their profile.
Solving this problem was probably the strategic goal behind the skills feature Linkedin deployed last year. While they positioned it as a social feature, it gives their analytics team a way to crowd-source the classification of skills and knowledge. Think of this as a “keywords” meta tag for Linkedin profiles.
The interesting thing about this is you don’t really need total participation for this to work. While your most accurate classifications will occur when people self-identify their skills, there are many ways to fill in the blanks. Once you’ve got a decent sample of accurately classified participants you can use text mining and graph analysis to complete the picture. If 70% of your pricing strategists present a valid-looking claim to know analytics, it stands to reason that most of the remaining 30% probably know analytics too. This could be structured as a distance function applied to a graph of adjacent skills (skills known by the same person). Skills which frequently occurred together would be deemed related (pricing, marketing analytics). Skills which were rarely known by the same person (direct sales, regulatory compliance) would be rated as unrelated. This type of prediction could be further refined with data such as job descriptions, role/company peer-norms, social activity, search history, groups, etc.
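To make the distance function idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the sample profiles are made up, Jaccard distance is just one reasonable choice of distance, and the `max_distance` cutoff is arbitrary. This is not a description of what Linkedin actually runs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each profile is the set of skills a member has claimed.
profiles = [
    {"pricing", "marketing analytics", "finance"},
    {"pricing", "marketing analytics"},
    {"pricing", "marketing analytics", "process improvement"},
    {"pricing", "finance"},
    {"direct sales", "negotiation"},
    {"regulatory compliance", "finance"},
]

def cooccurrence_counts(profiles):
    """Count how often each skill and each unordered pair of skills appears."""
    skill_counts = Counter()
    pair_counts = Counter()
    for skills in profiles:
        skill_counts.update(skills)
        pair_counts.update(frozenset(p) for p in combinations(sorted(skills), 2))
    return skill_counts, pair_counts

def skill_distance(a, b, skill_counts, pair_counts):
    """Jaccard distance: 0.0 = always claimed together, 1.0 = never together."""
    both = pair_counts[frozenset((a, b))]
    either = skill_counts[a] + skill_counts[b] - both
    return 1.0 - both / either if either else 1.0

def infer_skills(claimed, skill_counts, pair_counts, max_distance=0.5):
    """Suggest unclaimed skills that sit close to at least one claimed skill."""
    if not claimed:
        return {}
    suggestions = {}
    for candidate in skill_counts:
        if candidate in claimed:
            continue
        nearest = min(
            skill_distance(candidate, s, skill_counts, pair_counts) for s in claimed
        )
        if nearest <= max_distance:  # arbitrary cutoff, chosen for the example
            suggestions[candidate] = nearest
    return suggestions

skill_counts, pair_counts = cooccurrence_counts(profiles)
print(infer_skills({"pricing"}, skill_counts, pair_counts))
```

In this toy data, someone who only lists pricing gets marketing analytics suggested as a likely adjacent skill, while direct sales and regulatory compliance never appear together and so sit at the maximum distance from each other.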
Just How Good Are You? Be Honest, please….
Now that we’ve got a list of candidates, we need to rank them and evict any spammers. We’re going to do this concurrently. So, who is the best qualified? Remember, our friend here is a recruiter – they don’t really know about pricing strategy and can be confused by buzzwords. After a while, the candidates start to look alike. Furthermore, as you may suspect, being the best pricing strategist in Atlanta is a heavy burden. There are all these crazy recruiters who want to hire you for big money! Naturally this kind of cash attracts some less ethical types - unqualified people who might try to game the system so their resume ranks first! I’m shocked, shocked to think this type of fraud occurs…
For example, consider this potential candidate description:
10 years of pricing strategy experience in Atlanta. Worked at the Atlanta pricing strategy practice for Big5 Consulting. Founder of the Atlanta pricing strategy boutique Price, inc. Regular speaker at pricing strategy conferences and teaches a pricing strategy course at Price U.
Those of you who were online in the late 1990s may remember websites which repeated a single phrase an insane number of times. This was called keyword stuffing and worked fairly well for several years. It’s fairly obvious what this candidate is trying to rank for! Incidentally, as of this writing, Linkedin’s current search algorithms appear susceptible to keyword stuffing. They use multiple factors (public statement here) but this still works. However, as a recruiter, I would prefer to interview candidates who have credentials beyond developing a profile that repeats atlanta pricing strategy about twenty times.
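Keyword stuffing is easy to spot mechanically. As a rough sketch, the Python below finds the most repeated two-word phrase in a profile and measures what share of the text it takes up. The function names, the `min_repeats` value, and the `threshold` value are all assumptions picked for this example, not anything Linkedin or Google has published.

```python
import re
from collections import Counter

def top_phrase_share(text, n=2):
    """Return (phrase, count, share) for the most repeated n-word phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return "", 0, 0.0
    phrase, count = Counter(ngrams).most_common(1)[0]
    return phrase, count, count / len(ngrams)

def looks_stuffed(text, min_repeats=3, threshold=0.10):
    """Flag text whose top phrase repeats often AND dominates the word count.

    Both cutoffs are arbitrary assumptions for illustration.
    """
    _, count, share = top_phrase_share(text)
    return count >= min_repeats and share >= threshold

# The candidate description quoted above.
stuffed = (
    "10 years of pricing strategy experience in Atlanta. Worked at the Atlanta "
    "pricing strategy practice for Big5 Consulting. Founder of the Atlanta pricing "
    "strategy boutique Price, inc. Regular speaker at pricing strategy conferences "
    "and teaches a pricing strategy course at Price U."
)

phrase, count, share = top_phrase_share(stuffed)
print(phrase, count, looks_stuffed(stuffed))
```

A check like this catches the crude cases, but it says nothing about whether the person is any good, which is the problem the rest of this article is about.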
Google solved this problem in the late 1990s when they implemented the Pagerank algorithm. The Pagerank algorithm uses links to gauge the relative authority of a web page, delivering an objective assessment of the website’s credibility. A link is viewed as a vote of trust – vouching that the content on the other end is good stuff. The inbound links are weighted by the authority of the page referring them. Links from a highly respected page carry more weight. The text in the link (aka anchor text) is used to explicitly describe the nature of the vote of trust – what, specifically, is the referring page claiming about the content. This is an abbreviated explanation of Pagerank: entire libraries have been written about this system and Google invests millions in improving the model.
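For readers who want to see the mechanics, here is a small sketch of the textbook version of Pagerank, computed by repeated passes over the link graph. It covers only the basic "links as weighted votes" idea; anchor text and everything Google has added since are left out. The page names and the toy graph are made up, and the damping factor of 0.85 is the commonly cited default.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic Pagerank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Pages with no outbound links spread their rank evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # A page splits its vote equally among the pages it links to.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: "x" and "y" each have exactly one inbound link,
# but x's link comes from a well-linked hub and y's from an obscure page.
links = {
    "a": ["hub"], "b": ["hub"], "c": ["hub"],
    "hub": ["x"],
    "d": ["y"],
    "x": [], "y": [],
}
rank = pagerank(links)
print(sorted(rank, key=rank.get, reverse=True))
```

The point of the toy graph is that x ends up outranking y even though both have a single inbound link: the vote from the hub is worth more because the hub itself has collected votes.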
One alternative solution to this problem was trustrank: this involves selecting “experts” in a discipline as the “seeds” of an authority graph and mapping out what sites (people) they endorse. Links serve as votes of trust. You have the ability to hand-curate the authority sources for a particular discipline. This solved the same basic issue Pagerank addressed: build a network of links and use who links to whom to assess which sources are reliable.
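A seeded version can be sketched the same way. In the Python below, trust starts at a hand-picked set of seeds and flows outward along their endorsements; anyone the seeds cannot reach, directly or indirectly, ends up with zero. The names and the graph are hypothetical, and this is a simplified reading of the trustrank idea (trust that hits a dead end is simply dropped rather than redistributed).

```python
def trustrank(links, seeds, damping=0.85, iterations=50):
    """Propagate trust outward from hand-curated seeds.

    links maps each node to the list of nodes it endorses.
    seeds is the set of trusted starting nodes.
    """
    nodes = set(links) | {t for ts in links.values() for t in ts} | set(seeds)
    seed_weight = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in nodes}
    trust = dict(seed_weight)
    for _ in range(iterations):
        # Unlike plain Pagerank, the "restart" mass goes only to the seeds.
        new_trust = {p: (1.0 - damping) * seed_weight[p] for p in nodes}
        for node, targets in links.items():
            if targets:
                share = damping * trust[node] / len(targets)
                for target in targets:
                    new_trust[target] += share
        trust = new_trust
    return trust

# Hypothetical graph: one curated expert, plus two spammers endorsing each other.
links = {
    "expert": ["alice", "bob"],
    "alice": ["carol"],
    "spammer1": ["spammer2"],
    "spammer2": ["spammer1"],
}
trust = trustrank(links, seeds={"expert"})
print({name: round(score, 4) for name, score in trust.items()})
```

The two spammers can endorse each other all day long; because no trusted node points at them, their score never moves off zero.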
Back to Linkedin. You know what those one-click endorsements smell like? Links. The skills are the keywords. Once you have started accumulating this data, you can rate the authority / credibility of a profile for a given skill. A search for “pricing strategy” under this model would return a list of candidates who claim this skill, ranked by the power of the endorsements they’ve received. Endorsements received from other Linkedin members with high credibility on pricing strategy or other relevant skills would carry more weight than recommendations from people without pricing expertise.
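Here is one way that ranking could look, as a deliberately simple Python sketch. Endorsements are treated as (endorser, endorsee, skill) records, and each endorsement is worth more when the endorser has picked up endorsements for the same skill. The member names, the two scoring rounds, and the `attenuation` factor are assumptions for illustration; I have no inside knowledge of how Linkedin scores any of this.

```python
# Hypothetical endorsement records: (endorser, endorsee, skill).
endorsements = [
    ("ann", "bob", "pricing strategy"),
    ("cat", "bob", "pricing strategy"),
    ("dan", "bob", "pricing strategy"),
    ("bob", "eve", "pricing strategy"),
    ("fay", "gus", "pricing strategy"),
    ("hal", "gus", "pricing strategy"),
    ("ann", "gus", "python"),
]

def rank_for_skill(endorsements, skill, rounds=2, attenuation=0.5):
    """Rank endorsed members for one skill, weighting votes by endorser standing."""
    edges = [(src, dst) for src, dst, s in endorsements if s == skill]
    members = {m for edge in edges for m in edge}
    score = {m: 1.0 for m in members}  # everyone starts with the same baseline
    for _ in range(rounds):
        new_score = {m: 1.0 for m in members}
        for src, dst in edges:
            # An endorsement passes along a fraction of the endorser's own score.
            new_score[dst] += attenuation * score[src]
        score = new_score
    endorsed = {dst for _, dst in edges}
    return sorted(((m, score[m]) for m in endorsed), key=lambda pair: -pair[1])

ranking = rank_for_skill(endorsements, "pricing strategy")
print(ranking)
```

In this toy data, eve has a single endorsement but it comes from bob, who is himself heavily endorsed for pricing strategy. That is enough to put her ahead of gus, who has two endorsements from people with no standing in the skill.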
These endorsements are the missing link in this system. Linkedin has always had good information about “weak ties” between business associates – who knows who. Their data on strong ties is limited: a relatively small number of recommendations, in free text form. Weak ties are enough to validate the legitimacy of a profile, proving a real person exists and interacts with relevant peers. For example, a good Python developer will be connected with other Python folks. They mention it on their profiles, participate in Python groups, and work at employers with other Python people. Claim to know Python but don’t know anyone else who does? Go to the back of the line. However, weak ties alone aren’t enough to rank people. After you eliminate the bottom 50% of the pool who are marginally qualified, then what? The endorsements are focused enough – and easy enough – that you can use them to build a network based on professional competency.
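The weak-tie check is the simplest piece to sketch. The Python below computes what fraction of a member's connections claim the same skill, which is enough to send the "knows Python but knows nobody who does" profile to the back of the line. The member names and data structures are made up for the example.

```python
def peer_support(member, skill, connections, claimed_skills):
    """Fraction of a member's connections who also claim the given skill."""
    peers = connections.get(member, set())
    if not peers:
        return 0.0
    matching = sum(1 for p in peers if skill in claimed_skills.get(p, set()))
    return matching / len(peers)

# Hypothetical network: who is connected to whom, and what each person claims.
connections = {
    "dev": {"p1", "p2", "p3", "q"},
    "faker": {"q", "r"},
}
claimed_skills = {
    "dev": {"python"},
    "faker": {"python"},
    "p1": {"python"},
    "p2": {"python"},
    "p3": {"python", "sql"},
    "q": {"sales"},
    "r": {"sales"},
}

for member in ("dev", "faker"):
    print(member, peer_support(member, "python", connections, claimed_skills))
```

Notice that this only separates plausible profiles from implausible ones. Two real Python developers with similar networks would score about the same, which is exactly why something sharper, like endorsements, is needed to rank them.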
Linkedin’s expansion into social media supports this direction. It provides them with an additional way to validate the skills and interests of candidate profiles. Continuing the example, many Python developers probably share relevant articles, comment on these articles, join Python discussion groups, and get “likes” from other Python developers. Google uses social media activity (likes, shares) to validate website content; this is an obvious place to apply Google’s experience to Linkedin’s search efforts.
Competitive Assessment: Why This Matters
From a competitive strategy perspective, executing this concept would cement Linkedin’s lead in the recruiting space. Their ability to link social proof (contacts, recommendations, social activity) with traditional resume data sets them apart from the traditional job boards. Implementing a pagerank algorithm to assess candidate capability and trust would be of tremendous value to recruiters and something other networks couldn’t match.
Any new entrant would have the monumental task of building a social network and bringing a search engine online. Facebook is out of position: they have too much invested in the casual consumer audience (with their embarrassing party pictures). Twitter is a howling torrent of spam. The largest potential threat to this model would be Google+.
The Google+ scenario is interesting: Google has an edge in search and the massive resources required to build a new network. This would provide a real purpose to creating a Google+ profile for the average business person. They can effectively advertise for free. An acquisition, however pricey, might also make sense if the personalities and anti-trust elements could be aligned appropriately. Google clearly wants to add a social network to its portfolio of properties; their deep expertise in search algorithms and advertising could dramatically accelerate Linkedin’s growth curve.
This article is purely speculative. I’m looking past what Linkedin is saying and evaluating the capabilities that having this data gives them. This analysis is exploring their motive, means, and opportunity.
The strategic value this move would create for them provides a motive. As I said at the beginning of this article, improving and maintaining their search quality is essential to sustaining their business model. The data acquisition campaign they are currently executing provides them with the means. We explored this above.
What they do with this opportunity remains to be seen…