Sunday, December 14, 2014

Expat Travel Insurance (Part 2)

If you’ve settled upon travel insurance to cover illness/accident on your visit back to your country of citizenship (cue USA chant), it’s decision time on the exact policy.

There is a staggering array of choice, divided into three main product types:

  • Single-trip
  • Multi-trip
  • Backpacker

While there are tools to narrow the search, I am not aware of any that will take the unique concerns of expats (e.g. cover in country of citizenship) into consideration. This means reading terms and conditions: you cannot take for granted that the policy will be right for you, since the screens available are designed for vanilla residence==citizenship types.

For the UK-based among us, this means using a comparison engine like Money Super Market or Confused.

Comparison engine evilness

  • For underwriting purposes, your age is fair game, but personally identifying information such as name, email or phone number is not. I used the traditional ‘abc xyz’ placeholders as workarounds.
  • Check the insurance agent directly. You will almost certainly find price discrepancies between going direct with the agent and going through the comparison engine.
  • For visits longer than ~7 days, check both single-trip and annual multi-trip rates. I found the break-even point between the two product types is 7-10 days.

Debenhams evilness

After reading through half a dozen terms and conditions, I settled upon an annual multi-trip policy fronted by Debenhams. It was cheaper than many single-trip policies for my length of stay.

There is much evilness to be had here, but some of their competitors were even worse…
* £30.60 for the Gold policy from Confused versus the (as far as I can tell identical) £61.51 Superior policy ‘direct’ from Debenhams.
* ‘Direct’ is a bit of a misnomer since Debenhams is a front. The policies are actually underwritten by Rock Insurance, which is itself the UK-front for a Swedish firm named SOLID Försäkringar. If it’s any consolation, both firms are registered with the Financial Services Authority [1, 2]. Though, if anything goes amiss, I have more faith in Debenhams attempting to salvage the reputation of its personal finance business than the FSA.
* The 3-star Defaqto rating is sufficient. Self-insure against luggage loss and cancellation, so the extra coverage that the 4/5-star policies offer in these areas is irrelevant. Given that the legal coverage for ‘high street’ policies is a joke (£15k), all they’re worth is the medical (£10M) and personal liability (£2M) cover.
* If you’re an expat considering travel insurance, be aware that you aren’t really protected against a very serious illness/accident, see Part 1.

Debenhams BEYOND evil

Saving the most evil for last…
* Buried in the terms and conditions there is an automatic renewal clause, BEYOND evil IMHO. Opt-out can be performed online:
* Making Melkor look like Wayne Brady, some companies have a written-only or phone-only (invariably an 0845 or other premium rate number) opt-out. Unspeakably evil.

As with all things, YMMV.

Written with StackEdit.

Expat Travel Insurance (Part 1)

At times, US policy seems designed to make life difficult for expats. Look no further than the catch-22 of insurance for trips back to the good ol’ US of A, where the protections of European-style socialized medicine do not apply.

At first glance, there are three alternatives:

  • US health insurance
  • Expat health insurance
  • Travel insurance

Each with their own unique pros/cons…

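| Type   | Eligibility               | US treatment | Price |
|--------|---------------------------|--------------|-------|
| US     | Min 6mo US residence      | +            | $$    |
| Expat  | Min 6mo foreign residence | +            | $$$   |
| Travel | Min 6mo residence         | -            | $     |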

US Health Insurance

Long-time foreign residents are wholly ineligible for US-based health insurance due to residency requirements.

For those returning stateside, good luck finding an affordable short-term PPACA compliant policy to cover the 6 month gap before eligibility in US plans.

On the upside, bless the wonks, there is an Obama-care exemption for those US citizens who pass either the bona fide resident or physical presence test under 26 USC §5000A(f)(4). Not that most non-executives could afford the premiums demanded by a reasonable individual plan.

Expat health insurance

These plans generally cover treatment in either your country of citizenship or country of residence. In a perfect world, this would be the plan-type of choice for expats. Country of citizenship treatment for serious accident/illness; country of residence for more minor issues.

Unfortunately, expat health plans are often even more expensive than equivalent US-based health insurance plans. If you’re a non-exec, it’s doubtful that your company will offer this perk, so good luck affording a policy.

Travel insurance

These come in two flavors, US-based plans and foreign-based plans. Generally, there is a 6 month residence requirement, which will determine whether you are eligible for the US-based (min 6mo US residence) or foreign-based (min 6mo foreign residence) plans.

The biggest down-side of travel insurance is that it only covers emergency treatment, i.e. to minimize their cost, insurers will repatriate you ASAP to your country of residence (usually where the plan is acquired).

For relatively minor injuries (e.g. a broken leg) this level of coverage should be sufficient. However, in the event of a serious accident/illness, this money-saving tactic could kill you. For example, say you were involved in a serious car accident while visiting family in your country of citizenship. Travel insurance would cover your stabilization and medical repatriation to your country of residence (expat home). You’d then be left to the state system (e.g. NHS) with limited social/family support during recovery. Moreover, if you’re unable to work, or otherwise violate your visa conditions due to your illness/accident, you may very well face deportation back to your country of citizenship, where you’d be uninsured, and more or less left to die without adequate treatment.

Seriously, it’s that dire. In a place like the US, the best an uninsured former expat could hope for is medical bankruptcy (certainty) and surviving without treatment long enough to become eligible for Medicaid (highly uncertain).

Additionally, some plans have additional restrictions on travel to the ‘home country.’ For example, if you’re an Indian national, resident in the US, your US-based plan may not cover you in India. As of yet, I’ve found no way to screen for these exceptions, other than to (attempt) to read through the 20-100 page terms and conditions for each policy.


None of the above is professional advice…far from it. If you’re in the same unfortunate boat, thrust into the complex tax and compliance situation that US policy imposes on its expat community, but can’t afford appropriate advice, good luck!

“There is no hint that help will come from elsewhere to save you.”
- R.I.P. Sagan.

PS. Given my income constraints, I’ve decided to opt for travel insurance, i.e. adequate cover for a minor illness/accident; wholly screwed in the event of a major problem.


Tuesday, December 2, 2014

Building blast+ databases with taxonomy ID (taxid_map)

Building NCBI BLAST+ databases with linked taxonomy is far more difficult than it should be.
For example, in taxonomy-based tools such as Kraken, mapping
1) taxonomy id to sequence id (gi or accession) and
2) taxonomy id to a human-readable taxonomy tree,
are built-in and transparent to the user.
Unfortunately, with BLAST+ these steps must be completed manually and are included in two separate programs, makeblastdb for (1) and blastn/blastp/blastx for (2).

(1) Taxonomy id <–> sequence id

In BLAST+, a taxid_map file must be created and passed to makeblastdb
makeblastdb -in <FASTA file> -dbtype nucl -parse_seqids -taxid_map taxid_map.txt 
where taxid_map.txt is a space or tab separated list of sequence ids (either gi or accession) and taxonomy ids.
For example, with gi:
556927176 4570
556926995 4573
501594995 3914
Alternatively with accession:
NC_022714.1 4570
NC_022666.1 4573
NC_021092.1 3914
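Since makeblastdb is not very forthcoming about a malformed map, a quick sanity check beforehand can save a rebuild. The following is a minimal sketch, assuming only the two-column layout described above; `parse_taxid_map` is a hypothetical helper name, not part of BLAST+ or Biopython.

```python
def parse_taxid_map(lines):
    """Parse taxid_map lines into a {sequence_id: taxid} dict.

    Assumes the layout described above: one sequence id (gi or accession)
    and one numeric taxonomy id per line, separated by spaces or tabs.
    """
    mapping = {}
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        fields = line.split()  # splits on any run of spaces or tabs
        if len(fields) != 2:
            raise ValueError(
                "line {}: expected 2 fields, got {}".format(lineno, len(fields)))
        seq_id, taxid = fields
        if not taxid.isdigit():
            raise ValueError(
                "line {}: taxid {!r} is not numeric".format(lineno, taxid))
        mapping[seq_id] = int(taxid)
    return mapping


example = "NC_022714.1 4570\nNC_022666.1\t4573\nNC_021092.1 3914\n"
mapping = parse_taxid_map(example.splitlines())
print(mapping)
```

For a real file, pass the open file handle instead of `example.splitlines()`.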
There is no turn-key way to generate this mapping of taxid to sequence_id for a moderately large set of sequences.
Fortunately, there is always a hack work-around. NCBI allows export of both the FASTA and GenBank files. The former are used as the default input for makeblastdb, and the latter contain both the sequence_id and taxid. They can both be obtained from the NCBI by searching and exporting with the Send to menu.
This simple Python code snippet will do the trick for small and moderately large datasets.
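from Bio import SeqIO

genbankfile = ""  # path to the GenBank file exported from NCBI
with open('taxid_map.txt', 'w') as f:
    for gb in SeqIO.parse(genbankfile, "gb"):
        # Older GenBank exports carry a gi; otherwise fall back to accession.version (gb.id)
        seq_id = gb.annotations.get('gi', gb.id)
        # The first feature is the source feature; its db_xref list holds 'taxon:<taxid>'
        xrefs = gb.features[0].qualifiers['db_xref']
        taxid = next(x.split(':')[1] for x in xrefs if x.startswith('taxon:'))
        f.write("{} {}\n".format(seq_id, taxid))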
For large datasets, the bandwidth cost of downloading the GenBank files from NCBI becomes prohibitive, and the dictionary approach would probably be warranted.
Download both the FASTA and GenBank, alternatively extract the FASTA from GenBank, e.g. with BioPython.
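The dictionary approach isn't spelled out above, so here is one possible reading of it, as a rough sketch: load a local accession-to-taxid table into a dict once, then stream the FASTA headers and look each id up. The assumptions are mine: the table is a plain two-column, whitespace-separated file (accession, taxid), and each FASTA header starts with the bare accession as its first token. All function names are hypothetical.

```python
def load_lookup(table_lines):
    """Build {accession: taxid} from a two-column whitespace-separated table."""
    lookup = {}
    for line in table_lines:
        fields = line.split()
        if len(fields) >= 2:
            lookup[fields[0]] = fields[1]
    return lookup


def fasta_ids(fasta_lines):
    """Yield the first token of every FASTA header line."""
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def build_taxid_map(fasta_lines, lookup):
    """Return (taxid_map rows, ids missing from the lookup)."""
    rows, missing = [], []
    for seq_id in fasta_ids(fasta_lines):
        taxid = lookup.get(seq_id)
        if taxid is None:
            missing.append(seq_id)
        else:
            rows.append("{} {}".format(seq_id, taxid))
    return rows, missing


lookup = load_lookup(["NC_022714.1 4570", "NC_022666.1\t4573"])
fasta = [">NC_022714.1 example record", "ACGT", ">NC_999999.1", "GGCC"]
rows, missing = build_taxid_map(fasta, lookup)
print(rows, missing)
```

Only the FASTA needs to be on disk this way; ids that fail the look-up are reported rather than silently dropped, so they can be fetched individually.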

(2) Taxonomy id <–> Taxonomy tree

Simply include this NCBI database in the same directory as your database for the look-up to work with blastn/blastp/blastx:

Eating your cake

blastn -db <DATABASE> -query <QUERY> -outfmt "10 qseqid sseqid pident staxids sscinames scomnames sblastnames sskingdoms"
1,gi|312233363|ref|NC_014692.1|,86.26,310261,Sus scrofa taiwanensis,Sus scrofa taiwanensis,even-toed ungulates,Eukaryota
1,gi|223976078|ref|NC_012095.1|,86.26,9825,Sus scrofa domesticus,domestic pig,even-toed ungulates,Eukaryota
1,gi|5835862|ref|NC_000845.1|,86.26,9823,Sus scrofa,pig,even-toed ungulates,Eukaryota
Your comma separated file (-outfmt) showing human-readable taxonomy info.
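Once the output is in this shape, it can be read straight back into Python. A minimal sketch with the standard csv module follows; the column names are simply the ones passed to -outfmt above, the sample rows are the three shown, and `read_hits` is a hypothetical helper. It assumes no field contains an embedded comma.

```python
import csv
import io

# Column order mirrors the -outfmt string used above.
FIELDS = ["qseqid", "sseqid", "pident", "staxids",
          "sscinames", "scomnames", "sblastnames", "sskingdoms"]


def read_hits(handle):
    """Yield one dict per BLAST hit, with pident converted to float."""
    for row in csv.DictReader(handle, fieldnames=FIELDS):
        row["pident"] = float(row["pident"])
        yield row


sample = (
    "1,gi|312233363|ref|NC_014692.1|,86.26,310261,Sus scrofa taiwanensis,"
    "Sus scrofa taiwanensis,even-toed ungulates,Eukaryota\n"
    "1,gi|223976078|ref|NC_012095.1|,86.26,9825,Sus scrofa domesticus,"
    "domestic pig,even-toed ungulates,Eukaryota\n"
    "1,gi|5835862|ref|NC_000845.1|,86.26,9823,Sus scrofa,pig,"
    "even-toed ungulates,Eukaryota\n"
)
hits = list(read_hits(io.StringIO(sample)))

# Distinct (taxid, scientific name) pairs hit by each query.
taxa_by_query = {}
for hit in hits:
    taxa_by_query.setdefault(hit["qseqid"], set()).add(
        (hit["staxids"], hit["sscinames"]))
print(taxa_by_query)
```

The taxid is kept as a string on purpose, since staxids can hold more than one id per hit.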

Tuesday, May 20, 2014

Due Diligence: Seven Bridges Genomics (Part 3)

At last, following a genomics industry overview (Part 1) and cloud-based genomics analysis platform macro-view (Part 2), we arrive at the micro-level market landscape surrounding Seven Bridges Genomics.

In Part 2, I mentioned that the cloud-based genomics analysis space is crowded.

To give you a sense of just how crowded (read: very crowded), I’ve enumerated the companies in Seven Bridges Genomics’ competitive sphere. I used my best judgement with respect to direct competitors.

Suffice to say, even when restricted to direct competitors, it really is crowded. There is no clear market leader as of yet, so the next few years are going to be very exciting/scary for many of these folks. So many ways to die.

Speaking of ways to be disintermediated before wide-spread platform acceptance, I want to give a special shout-out to the next-gen sequencing companies directly building cloud applications to plug into their own systems: GenapSys, Ion Torrent Systems, Oxford Nanopore.

Oxford Nanopore, still in quasi-shadow-mode as they are, like to be secretive; however, they mention AWS cloud applications in their Early Access Documentation. That, along with the very end-to-end nature and slow-to-market release, leads me to believe it’s in the pipe. I’ll be exploring these guys a bit more, later.

In Part 4, we’ll take a deep-dive into Seven Bridges Genomics, assessing positioning and key risks. Until then, enjoy (aside: sorry about the side-scroll, until I find a better solution for narrow blogger page widths)…
Company 1,2 Location Founding Tags 3,4 Products Cloud Direct Competitor
Agile Genomics* Mt Pleasant, SC 2007? Consulting AlignShop, MiST Database X
Aridhia Informatics* Edinburgh, UK 2008 Healthcare/Clinical AnalytiXagility X
Appistry* St. Louis, MO 2001 Consulting Ayrris X
Ayasdi* San Francisco Bay Area, CA 2008 General Machine Learning Ayasdi Cure/Topological Data Analysis (TDA) X
BGI EasyGenomics* Greater Boston, MA 1999/2010 Nonprofit, Core Facility, Open Source Various X
Bina Technologies* San Francisco Bay Area, CA 2011 Hardware/IT Bina Applications X X
BioDatomics* Greater Washington, DC 2012 Open Source SaaS, Pro, Community5 X X
Congenica* Cambridge, UK 2013 Healthcare/Clinical, Healthcare/Diagnostic Sapienta ? ? (Not Released)
Cypher Genomics Greater San Diego Area, CA 2011 Mantis X X (Early Access)
DNAnexus* San Francisco Bay Area, CA 2009 DNAnexus Platform X X
Eagle Genomics* Cambridge, UK 2008 Consulting ElasticAP X
Era7 Bioinformatics* Granada, Spain; Greater Boston, MA 2004 Consulting, Open Source, Bacterial N/A
Fios Genomics* Edinburgh, UK 2008 Consulting N/A
GenapSys6 San Francisco Bay Area, CA 2010 Hardware/Sequencing Genius X ?
Genestack* Cambridge, UK; St. Petersburg, Russia 2012 Genestack Platform X X (Beta)
Genome Cloud Seoul, Korea ? g-Insight X X
Genomics Limited Oxford, UK 2014 Shadow-mode N/A
GenoSpace Greater Boston, MA 2011 Shadow-mode ? X
Geospiza PerkinElmer* Seattle, WA 1997 Desktop GeneSifter
Globus Genomics Chicago, IL ? Globus Platform X ?
Ion Torrent Systems by Life Technologies 7* San Francisco Bay Area, CA 2007 Hardware/Sequencing Ion Reporter X ?
Maverixbio* San Francisco Bay Area, CA 2012 Desktop Maverix Analytic Platform
NextBio by Illumina* San Francisco Bay Area, CA 2004 Desktop? NextBio Platform
NZGL Dunedin, New Zealand ? Consulting
Omicia Biocomputing* San Francisco Bay Area, CA 2009 Healthcare/Clinical Opal X
Oxford Gene Technology* Oxford, UK 1995 Desktop, Sequencing Service CytoSure Interpret
Oxford Nanopore 8* Oxford, UK 2005 Hardware/Sequencing X ?
Personalis* San Francisco Bay Area, CA 2011 Consulting, CRO, Sequencing Service N/A
Seven Bridges Genomics* Greater Boston, MA; Belgrade, Serbia (IT) 2009 Igor X X
Spiral Genetics* Seattle, WA 2012? Consulting, Desktop N/A / Anchored Assembly Method
Station X* San Francisco Bay Area, CA 2010 Desktop Gene Pool
Syapse* San Francisco Bay Area, CA 2009 Healthcare/Clinical Synapse Platform X
The Genome Analysis Centre (TGAC)* Norwich, UK 2009 Nonprofit, Core Facility Various X
Tute Genomics* Salt Lake City, UT 2012 Tute Platform/ANNOVAR X ?
Technical Notes:
After a misguided regression/sojourn into typing Part 2 in Google Docs then copying to Blogger, which resulted in ultra-crap formatting, I’m back to using StackEdit, which I explored here. If there is demand/interest, I’m willing to update/convert this listing to a more dynamic format. Just give a shout in the comments.

  1. To the best of my knowledge, these companies form a closed set under the LinkedIn feature ‘People Also Viewed’, omitting spurious hits.
  2. * Direct link to company LinkedIn Page
  3. Companies are for-profit unless otherwise stated, e.g. Nonprofit
  4. Core facility implies Sequencing Service.
  5. BioDT Community is free to use
  6. Special shout-out for Hardware/Sequencing companies with cloud applications.
  7. Special shout-out for Hardware/Sequencing companies with cloud applications.
  8. Special shout-out for Hardware/Sequencing companies with cloud applications.

Monday, May 19, 2014

Due Diligence: Seven Bridges Genomics (Part 2)

Continuing with the top-down analysis from Part 1, let's look at the cloud genomics analysis industry with a focus on macro-scale phenomena. Seven Bridges Genomics, or any other individual firm for that matter, won't be able to do much about these, other than roll with the punches.

In a future post, I'll go micro and drill down to the unique selling points, enduring competitive advantages and economic moat that make Seven Bridges Genomics' value proposition durable and secure (aside: hopefully).

Value Proposition

Provide the tools for scientists to do analysis without having to worry about the details of (1) compute/IT and (2) standardized work-streams.

Those are some hefty assumptions...
(1) Assumes scientists are compute limited.
(2a) Assumes there is a value-add in standardized work-streams
(2b) Which, in turn, assumes that standard work-streams exist.

Making Money

As they're a private company, I can't do a deep-dive into their financials (aside: woe!), so I have to make some assumptions. From the marketing there seem to be two potential revenue streams...
(A) 'Compute Spread,' basically an interest rate spread but for AWS CPUs. They justify their mark-up over AWS compute pricing based on the perception of value-added. Note that this is a subclass of software as a service.
(B) Consulting

(A) must necessarily dwarf (B). Traditional consulting doesn't scale, which dooms a tech company before it can gain its sea legs / line-cross / other nautical rite of passage, i.e. shark VC money. Consulting firms can bootstrap, but that doesn't seem like the growth trajectory they're going for.

So for simplicity, let's reduce to (A). Taking the spread comes with both top-line and bottom-line risk.

Paramount among them is the bottom-line risk of becoming an AWS whipping boy. You can scream for mercy, not that it helps. Honestly, there is little to do other than try to take the compute in-house or trade masters.

In-house: Manage to do it even comparable to AWS...ha! 
Trade Masters: High switching cost...if it comes to this, they were doomed a long time ago.
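To make the bottom-line risk concrete, here is a toy calculation of how the compute spread gets squeezed when the underlying compute cost moves and the reseller can't pass it on. The prices are entirely hypothetical (I have no view into anyone's actual pricing), and the function names are mine.

```python
def spread_margin(resale_price, aws_cost):
    """Gross margin fraction on one compute-hour resold at resale_price."""
    return (resale_price - aws_cost) / resale_price


def margin_after_cost_change(resale_price, aws_cost, cost_change):
    """Margin if the underlying cost moves by cost_change (0.10 = +10%)
    while the resale price stays fixed (no pass-through)."""
    return spread_margin(resale_price, aws_cost * (1 + cost_change))


# Hypothetical numbers: buy a compute-hour at 1.00, resell it at 1.25.
base = spread_margin(1.25, 1.00)
squeezed = margin_after_cost_change(1.25, 1.00, 0.10)
print("margin before: {:.0%}, after a 10% cost rise: {:.0%}".format(base, squeezed))
```

With a thin spread, a modest move in the supplier's price takes a disproportionate bite out of the margin, which is the whipping-boy problem in miniature.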

On the top-line, they need to either work in a highly inefficient market (a la big banks) or continuously justify the spread they take through value-add. As I mentioned in the previous post, there is loads of competition with no clear market leader. Market structure will not save them, so value-add they must maintain, lest open source eat their lunch.

Macro Swallow

The internet meme of near-misses between whales and humans, including such precious lines as “You’re gunna have to do more than clean that wet suit bro” [Youtube], is the impetus behind this section.

It IS a big ocean; however, there are lots of fishies $£€ to be had in a quite restricted space, the wind-up to a feeding frenzy. There are many ways to die.

Last post I based my mental model on drivers and constraints, but this time around a framework based on relative growth rates seems more suitable. A swallow, in this context, means the facet of growth that trumps the others.

Data swallow

Fail: Value Proposition 1

I/O swallow:
Problem: Impractical to upload data to cloud.
Solution: Co-locate with sequencing centers; however, this requires a) consolidation in sequencing industry (mass-market) or b) working with and servicing big co's exclusively.
Prognosis: Not great. a) Is survivable, but may kill the growth curve. b) Basically become just another IT integrator / service provider. Not scalable. Both mean having an additional whipping master (AWS + core/big co).

Storage swallow:
Problem: Impracticable to store data.
Solution: Stream data to be processed in real-time.
Prognosis: Would actually be a boon for Seven Bridges if they could solve the streaming and real-time analysis, as it enhances the value proposition.

Compute swallow

Fail: Tech swings against you

But personal processing power grows even faster:
Problem: New algos or technology lower the compute burden, making the cloud unnecessary. Can go back to on-laptop analysis, where other established firms, e.g. Accelrys, may well eat your lunch.
Solution: Go toe-to-toe away from the cloud. Convince that cloud is worthwhile for other reasons (hassle free, a la Google Docs).
Prognosis: If desktop, grim (infrastructure re-boot). If cloud, fine.

But processing doesn't grow fast enough:
Problem: Can't make money off the AWS spread because tasks are sucking too much compute
Solution: Hope parallelism and clever algo saves you, otherwise...
Prognosis: If AWS can’t do it, neither can you most likely. Hosed.

People swallow

Fail: Value proposition 2

Problem (2a): Scientists don't value your workstreams.
Solution: Hope your compute value proposition holds.
Prognosis: If your API doesn't suck, they build their own in your sandbox IF the compute justification is strong enough. Will become niche for low-end / small-time users, as more sophisticated users disintermediate you and take their algo straight to compute.

Problem (2b): Model fails since there are no standardized workflows. Everything must be custom/application specific.
Solution: Turn into a consulting company.
Prognosis: No scale. Either turn niche, or eaten by a bigger consulting fish with scale in consulting.


If you're placing a positive bet on the cloud genome analysis industry, not just Seven Bridges Genomics in particular, you're making a few implicit assumptions...
  1. I/O swallow will not kill the industry in the cradle.
  2. Compute challenges are Goldilocks.
  3. Bioinformatics is amenable to automation and cross-application standardization.
I'm fairly confident of an all-clear on (2) and (3), but (1) worries me. There are solutions here if the company can pivot fast enough, but I'm not convinced that a start-up, as opposed to a core/big co, has the leverage to pull it off. 

There is also a get out of jail free card...alternate value propositions. 

One that sits quite well for the cloud is integration between different datasets, a task made much easier once all this disparate data is sitting on servers you control. One can imagine mining other people's data and selling insights. This is NOT consulting in the traditional sense, but scalable returns from data integration and automated analysis.

Only time will tell...

Due Diligence: Seven Bridges Genomics (Part 1)

"Demonstrate your learning capabilities," how exactly to do that, I wondered. Develop a mental model! 

I've spent the past few hours reading about the field of genomics & next-gen sequencing, with respect to one firm: Seven Bridges Genomics

First I developed a sense of...
  • Promise -- Hard: Routine genomic diagnosis; Harder: Personalized Medicine
  • Problem -- Next-gen sequencing Data → Actionable Results
  • Solution -- ???

After that, I was a bit stuck. How can one summarize an entire field with one mental model, one graphic?

I thought about...
  • Competitive landscape (SWOT)
  • BCG Matrix (Definite with ? for most firms)
  • Key players (Companies, People, Locations)

But none of those are quite it. What is the root cause of the Problem? Here is a perfectly imperfect mental model (all mental models are wrong, but some are useful) that seems to be working for me...

Mental Model for Genomics & Next-gen sequencing landscape

I think it comes down to drivers and constraints. Drivers being those things that push a technology forward, which of course require some metric to track changes in status (italics). Constraints being the rate-limiting resource which most hamper development, also complete with metrics (not-shown, but examples would be number of distinct technologies in the pipeline vs. maturity/ETA, cost per base-pair).

Each aspect of the ecosystem, from sequencing → assembly → analysis, has its own unique set of drivers and constraints.

Now, is there a rate-limiting step in the ecosystem as a whole? If so, that's as good a place as any to begin with a high-impact solution...
  • Sequencing -- Cost and time is already falling, with a healthy pipeline of new technologies (i.e. not a 'Pfizer').
  • Assembly -- Incremental improvements in Robustness and speed. Throwing more compute (cheap!) at it generally seems to do the trick.
  • Analysis -- More data (sequencing & assembly) don't seem to be resulting in more actionable insight. Ding ding. I think we have a winner.

One thing that may temper going for the rate-limiting step is the relative ease of attacking other problems first. They're all hard problems, so let's stick to our guns and go with rate-limiting.

Which brings us full-circle, back to Seven Bridges Genomics and their solution, Igor, a cloud-based analysis framework.

The software is constraint-oriented, knocking down barriers to compute and people. Let our clever architecture and Amazon Web Services (AWS) take care of the computational scaling. Let our clever bioinformaticians do the heavy-lifting, standardizing workflows for common problems, adapting and scaling existing solutions and maybe even banging out something completely novel.

The result -- time and cost savings due to the experience curve effects, standardization and economies of scale. Awesome, no?

It remains to be seen whether they can compete effectively. It's a crowded space, with no clear market leader; but that's a story for Part 2. Other takes here and here.

PS. I also quite enjoyed the play on the Seven Bridges problem (aside: at least I think it's intentional). Change the graph, e.g. bombing a bridge -- which is more or less what they hope to do with the analytics end of things, -- and you can force a solution.

Friday, May 16, 2014

Blogger posting with Markdown using StackEdit

I recently discovered StackEdit, a tool for writing and previewing Markdown.

Now, I’ve been using Markdown for quite some time; it’s useful for everything from electronic lab notebooks to taking notes on informational interviews.

Finally, through StackEdit’s built-in post to Blogger feature, I’m able to abandon the clunky default interface, and write posts the way they were meant to be written!

Additional benefits include easy code snippets…

def stackedit():
    print('I love your product!')

Easy inspirational quotes…

“Perfection is Achieved Not When There Is Nothing More to Add, But When There Is Nothing Left to Take Away”

And easy equations (same syntax as LaTeX, by the way)…


Needless to say, I’m excited about the switch, and would highly recommend giving StackEdit a try yourself!

Thursday, May 15, 2014

Due Diligence: Tessella

I like to do a bit of public due diligence on companies of interest. Here's a brief example of some highlights from the dossier...

Tessella is a medium-sized technology consulting company that hires a small intake of talented developers each year, many of whom have PhDs. I'm interested in Tessella, and where their past employees have gone. I'd want to use a little LinkedIn based analysis tool for this data-dive [source], but until I get access, I'll have to do it the old fashioned way (read: manually, without a slick API...the humanity!)

Macro trends

At the macro-level, massive turn-over is a negative, but if no one ever leaves, that could be bad too. You'd expect at least a few consultants to fall for a client, and go work for them; it's how ideas spread in a knowledge ecosystem. I'm specifically interested in the Boston office, so let's compare Boston to total...


Gender assignments are based on name/picture. Some of the gender numbers don't add-up due to the unavailability of this information.

Tessella, All Offices
Past: 276
Current: 193
Current + Past: 36
--> Left company: 240
--> Leaver/Current: 1.2
--> Promotion/Unpromoted: 0.23

Tessella, Boston
Past: 17
Current: 13
   Men: 9
   Women: 3
Current + Past: 2
--> Left company: 15
   Men: 13
   Women: 1
--> Leaver/Current: 1.2
--> Promotion/Unpromoted: 0.18
--> Men/Women: 22/4 = 5.5
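A sketch of how the derived ratios fall out of the raw counts. It assumes "Current + Past" counts people who list both a current and a past Tessella entry (i.e. an internal promotion), so that leavers are Past minus that overlap; the function name is mine.

```python
def headcount_ratios(past, current, both):
    """Derive leaver and promotion ratios from LinkedIn-style counts.

    past:    profiles listing a past position at the company
    current: profiles listing a current position
    both:    profiles listing both (assumed to be internal promotions)
    """
    left = past - both          # people who actually left the company
    unpromoted = current - both  # current staff with no separate past entry
    return {
        "left": left,
        "leaver_per_current": left / current,
        "promoted_per_unpromoted": both / unpromoted,
    }


all_offices = headcount_ratios(past=276, current=193, both=36)
boston = headcount_ratios(past=17, current=13, both=2)

for name, ratios in (("All offices", all_offices), ("Boston", boston)):
    print("{}: left={left}, leaver/current={leaver_per_current:.2f}, "
          "promoted/unpromoted={promoted_per_unpromoted:.2f}".format(name, **ratios))
```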

No excess turn-over in Boston office

Self-explanatory, but some caveats...
1) Boston office is 10 years old, compared to 30 for the company as a whole; however, LinkedIn has a bias towards more recent events. A priori expect leaver/current to be higher for All offices.
2) The company has grown massively, so most of the people that have ever worked for Tessella have worked there in the past 10 years. Negates (1)
It's a wash; I'd say they're comparable.

Internal promotions are consistent between Boston and the rest of company

0.23 versus 0.18. I'd say these numbers are roughly the same given:
1) Small sample size for Boston
2) LinkedIn quirk. Not everyone listed a promotion as a separate job. I'd only detect a promotion if the person put in a separate entry. I personally know people that don't do this, thus 20 year tenures in their most senior position (for a 40 year old).
3) I wish I had a base-line metric for internal promotion. It'd be interesting to compare Tessella to peers.

White Male dominated

22:4 Men:Women for the Boston office. This is technology consulting after all. No surprises there. Of the male consultants, past and present, all but two are white. Curious how gender and diversity compare to peers. (Vet me, LinkedIn! They don't seem to expose gender/racial data through the API, so I'd have to have some fun!)

Micro trends

I'm interested in what people did before Tessella, and where they go afterwards. With access to the LinkedIn API (please give me vetted access!) I could run an analysis on everyone in the company. There are 240 leavers, which is do-able by hand, but 1) I'm lazy (in the good way) and 2) to get a sense, I probably don't need to sample everyone. I'm specifically interested in the Boston office, so that's where I'll start...

Data (with homebrew classification)

Note: I made the bolded function/company classifications to help organize the data. There are of course other ways to organize. There are only 14 people since I can't access information for the 15th.

Software Engineering: 5
Informatics/Analysis: 5
   Bioinformatics: 1
   Cheminformatics: 1
   Data Science: 1
   Industry R&D: 1
   Tech Consultant: 1
Project Management: 3
   Project Manager: 2
   Account Executive: 1
Self-employed: 1

Life Sciences Research: 2
   Dana-Farber Cancer Institute
Life Sciences Companies: 4
   Life Technologies
   Novartis (2)
Big Technology: 4
   BBN Technologies (Raytheon R&D)
   IDBS (IT consultancy)
   Microsoft (2)
Small Technology: 3
   Complete (Digital Marketing)
   Extreme Reach (Video ads)
   Tokyo Electron (Semi-conductors)
Finance: 1
   HighVista Strategies (Asset Management)

Some range in functional exits, but they are all technically-aligned.

10/14 exits are day-to-day technical. The 2 project managers work at NIBR (Novartis Institutes for BioMedical Research) and Life Technologies, so I'll assume that they are managing technical projects. The account executive works for IDBS, so I'll assume he's selling and overseeing technical projects. That means substantially ALL of the exits are technical. The biggest jump this group has made was to technical sales and support (account manager). It IS possible to not program day-to-day, but don't expect to stray too far from the technology function.

Some range in industry, but it's really either life sciences or tech.

6/14 in life sciences. 7/14 in technology. It's Boston after-all, I suspect the UK exits would be less life sciences dominated. Within the life sciences and technology silos, there is some diversity between Fortune500, research institutions and SMEs. Some are small, but I wouldn't consider any of them start-ups. I'd be interested in talking to the one outlier in Finance. Don't worry, he's still a techie.

Observation: People Leak on LinkedIn

I wasn't explicitly looking for this, but folks leak loads of information on LinkedIn. If it were possible to scrape and analyse this information, it would be possible to independently audit private company accounts, or infer them if they aren't released.

From reading profiles on LinkedIn, I inferred that Tessella will earn £21M for 2013. This is spot on with Tessella's own reporting [source], which (scouts honour) I did not view before the data-dive.

Information I observed from looking at profiles...
  • 2014: 250 people.
  • 2014: 30% PA growth in life sciences. 
  • 2011: His (two) offices (of 8) contribute ¼ of total revenue.
  • 2011: 210 staff. 160 earning, 50 admin.
  • 2011: Consulting is growing 20% PA, ⅕ of revenue. US is 17% of revenue.
  • 2011: Oversaw 37 staff, 4M GBP in revenue. 
  • 2010: Grew office to >20, >2M GBP
  • 2010: 88% of revenues are repeat business

How to estimate  £21M for 2013

£4M from his office, which is ¼ of company = £16M in 2011
- 30% life sciences...emphasizing that this is higher than average. Consulting is 20%, that it's mentioned implies it's higher than average. Let's assume 15% PA growth as a modest guess.
£16M in 2011 * 2 years at 15% = £21M in 2013
Simple example, but with a fire hose of data being automatically parsed, it may be possible to glean much much more.
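The same back-of-envelope estimate as a few lines of Python, using only the figures quoted above. The 15% growth rate is the modest guess made above, not a reported number, and the function name is mine.

```python
def project_revenue(base, annual_growth, years):
    """Compound a base revenue figure forward by a fixed annual growth rate."""
    return base * (1 + annual_growth) ** years


office_revenue_2011 = 4.0   # GBP millions, from one profile
office_share = 0.25         # that office's share of company revenue

company_2011 = office_revenue_2011 / office_share        # scale up to the whole company
estimate_2013 = project_revenue(company_2011, 0.15, 2)   # assumed 15% PA growth

print("2011: GBP {:.0f}M, 2013 estimate: GBP {:.1f}M".format(company_2011, estimate_2013))

# Sensitivity to the growth guess
for growth in (0.10, 0.15, 0.20):
    print("{:.0%} PA -> GBP {:.1f}M".format(growth, project_revenue(company_2011, growth, 2)))
```

The estimate is fairly sensitive to the growth assumption, so landing on the reported figure owes something to luck as well as to the data.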

Other LinkedIn Estimates

3:1 Tooth:Tail
£100k/consultant in revenues


Overall, the information available in the financials [source] lines up well with LinkedIn. Good to know that folks aren't bending the truth on their profiles.

There is so much gold in financials. I'm like a kid in a candy shop when I read through them. Not entirely sure it's normal, but I get positively giddy reading annual reports (financial tables first, naturally).

Many SMEs live hand-to-mouth; that's ok for start-ups, where you should be being compensated with a risk premium, but is inexcusable for an established SME. Not the case with Tessella, but it's a good idea to double check.

5:1 Tooth:Tail

(191:40   Billable:Non-billable)
Not sure how this compares to peers. My gut told me the LinkedIn estimate was a bit low. Went and checked. My gut was right. More tooth...woo!

Revenues are £110k/consultant

(21M / 191 consultants from 2013 annual report)
I think this places a natural cap on what a consultant could ever expect to earn. I'm guessing this is ball-park for a tech consultancy shop. Operations houses are pulling in 1.5-2x, while strategy houses are likely billing 3-5x.
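For a side-by-side of the LinkedIn-derived estimates and the annual report figures, the two ratios can be computed as below. Note the years differ (2011 profile data versus the 2013 report), and the LinkedIn revenue is the £16M estimate derived earlier rather than a reported number; the function names are mine.

```python
def tooth_to_tail(billable, non_billable):
    """Billable staff per non-billable staff member."""
    return billable / non_billable


def revenue_per_consultant(revenue, billable):
    """Revenue divided across billable staff only."""
    return revenue / billable


sources = {
    # 2011 figures gleaned from profiles; revenue is the estimated GBP 16M
    "LinkedIn (2011)": {"billable": 160, "non_billable": 50, "revenue": 16e6},
    # 2013 annual report figures quoted above
    "Annual report (2013)": {"billable": 191, "non_billable": 40, "revenue": 21e6},
}

for name, s in sources.items():
    print("{}: tooth:tail {:.1f}:1, GBP {:,.0f} per consultant".format(
        name,
        tooth_to_tail(s["billable"], s["non_billable"]),
        revenue_per_consultant(s["revenue"], s["billable"])))
```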

Compensation Structure

Ok, you can't get this from the LinkedIn data...

  • £25k     Low
  • £45k     Median (est)
  • £65k     Mean
  • £160k   High
Annual report is £12M in wages (+£600k in pension cost) over 191 consultants. That's £65k/consultant in earnings versus £110k/consultant in revenue (£21M revenue). Highest paid director at £160k. Entry-level consultant at £25k. A 6x high-low multiple is actually fairly reasonable in this day and age. The directors did gorge themselves a bit in 2012, though I suspect this was related to the management buy-out. If you take the distribution of US household income as a guide, the median is about 2/3 of the mean, giving around £45k median wage.
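The compensation figures can be roughed out the same way. This sketch assumes the mean includes the pension cost and that the median is two-thirds of the mean (the US household income rule of thumb used above); both are assumptions, so treat the output as ballpark, and the function names are mine.

```python
def mean_pay(wages, pension, heads):
    """Average total compensation per head."""
    return (wages + pension) / heads


def median_from_mean(mean, ratio=2 / 3):
    """Rule-of-thumb median for a right-skewed pay distribution."""
    return mean * ratio


mean = mean_pay(wages=12.0e6, pension=0.6e6, heads=191)
median = median_from_mean(mean)
high_low_multiple = 160 / 25   # highest paid director vs entry-level, in GBP thousands

print("mean ~GBP {:.0f}k, median ~GBP {:.0f}k, high/low {:.1f}x".format(
    mean / 1000, median / 1000, high_low_multiple))
```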


If you're a tech-head, who wants to become/remain a male (kidding) tech-related-head with a technology or life sciences company, Tessella offers good exits. If you want to break into other areas of consultancy, say operations or strategy, or into other non-technical functions, you should probably look elsewhere. If you want loads of money, a la finance, also look elsewhere.

To their credit, what I've found is exactly what's written on the tin. Tessella doesn't sell anything more, or anything less, than a solid technical training ground for new hires. Solid pass on the basic DD.