Biotech’s Engine

High-tech tools power biotechnology and medical research.  Advances in cell engineering and large-scale bioinformatics will propel tool development in new directions.

Life Sciences Tools, the Industry

Our understanding of life advances only as quickly as the tools that we use to study it.  After all, it was 17th-century microscope hackers like Robert Hooke and Antony van Leeuwenhoek who first revealed that tiny cells are the foundation of all life.  Today, we read about breakthrough scientific discoveries and novel therapies in the news, sometimes overlooking how “methods” researchers and instrument builders laid their foundation.  But follow the scientific journals, and the sophistication of routine tools is astounding: high-throughput assays, whole-genome sequencing, massively parallel transcript analysis, and real-time 3D in vivo imaging, to name a few.

Beneath the hood of life sciences research, a large Life Sciences Tools (LST) sector develops the instrumentation, reagents, and computation that power these discoveries.  With around $1B in annual U.S. venture capital and frequent nine-figure acquisitions, LST is a thriving startup space.  Grant-funded academic scientists are core customers, but the sector’s strategic value to pharmaceutical and biotechnology users is immense: today’s R&D tool can quickly become a game-changing diagnostic or the platform that discovers tomorrow’s blockbuster drug.  Combined with the adjacent diagnostics and clinical research services markets, LST accounts for around $50B in annual revenue and $275B in market cap.

LST is both a vehicle for biotechnology innovation and a predictor of future opportunities in health care.  Tracking trends in LST is not for casual fans: many leads are buried in the depths of science journals.  But with eyes on both research and business news, here are a few trends that I believe forecast the direction of LST.

Data Analysis Outpaces Its Creation

A single human genome is such a huge amount of data that analyzing it is a career path of its own.  Bioinformatics computing resources have grown exponentially, and the recent deployment of deep learning methods means that large datasets will provide valuable clinical leads for generations to come.  But whereas Gmail and CAPTCHAs generate high-quality data for training language- and image-processing networks as a side effect, in biology comparable data is very hard to get.
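
For a rough sense of scale, here is a back-of-the-envelope sketch in Python; the genome size, coverage, and storage figures are typical assumptions, not fixed constants.

```python
# Back-of-the-envelope scale of the raw data behind one human genome.
# Assumptions: ~3.1 Gbp haploid genome, 30x sequencing coverage, and
# ~2 stored bytes per sequenced base in uncompressed FASTQ
# (one base character plus one quality character), ignoring headers.

GENOME_BP = 3.1e9      # haploid human genome length, base pairs
COVERAGE = 30          # common whole-genome sequencing depth
BYTES_PER_BASE = 2     # base + quality score

raw_bytes = GENOME_BP * COVERAGE * BYTES_PER_BASE
print(f"raw reads: ~{raw_bytes / 1e9:.0f} GB per genome")    # ~186 GB

finished_bytes = GENOME_BP * 2 / 8  # 2 bits per base, sequence alone
print(f"finished sequence: ~{finished_bytes / 1e6:.0f} MB")  # ~775 MB
```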

In many areas of the life sciences, we face a dilemma: our ability to analyze new types of data now outpaces our ability to create them.  Instruments that close this gap are needed, and how they interface with analytics will be a key requirement from the earliest stages of development.  Cloud infrastructure has made analysis at scale nearly trivial, so the bottleneck shifts to data creation: assaying, dissecting, culturing, and imaging will increasingly be performed by robots.  But data analysis can be painfully inelastic: a human analyst will grudgingly read through handwritten notebooks, whereas digital data must be structured in a very consistent way.  And machine learning is distractible, so trivial discrepancies in experimental design and technique can lead to false discoveries.
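
As a toy illustration of that inelasticity, the sketch below shows the kind of schema check that keeps digital records poolable for downstream learning.  The field names, types, and expected units are hypothetical stand-ins, not any particular lab’s standard.

```python
# Minimal sketch of enforcing consistent structure on assay records
# before pooling them for machine learning.  Field names, types, and
# the expected "uM" unit are hypothetical stand-ins for a shared,
# versioned lab schema.

REQUIRED = {"sample_id": str, "instrument": str, "reagent_lot": str,
            "readout": float, "readout_units": str}

def validate(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record pools cleanly."""
    problems = [f"missing field: {k}" for k in REQUIRED if k not in record]
    problems += [f"bad type for {k}" for k, t in REQUIRED.items()
                 if k in record and not isinstance(record[k], t)]
    # Unit mismatches are exactly the "trivial discrepancy" that can
    # masquerade as biological signal downstream.
    if "readout_units" in record and record["readout_units"] != "uM":
        problems.append(f"unexpected units: {record['readout_units']}")
    return problems

print(validate({"sample_id": "S1", "instrument": "plate-reader-2",
                "reagent_lot": "L42", "readout": 0.83, "readout_units": "nM"}))
# -> ['unexpected units: nM']
```

A human analyst would shrug off the nM/uM mismatch; an algorithm trained on pooled records would not.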

As such, standardization is more important than ever.  Calibrated instruments and reagents are critical to sustaining meaningful insight when researchers entrust algorithms with identifying trends.  The biggest challenge will be reconciling this with the fact that research is not manufacturing: no two experiments are the same, projects pivot as early results appear, and biology has a way of surprising us all.

A wave of biotech companies that primarily analyze existing data for therapeutic leads has attracted much attention and investment.  But as machine learning heats up, progress will demand tools that create new biological data along new dimensions, not just new software to analyze it.

Microscopy: Convergence of Anatomical and Molecular

Visual information has always driven the study of biology: consider how Cajal’s century-old hand sketches still feature in neurobiology textbooks.  But digital photography has fundamentally altered the way we research life.  Few cell biology papers are published without some kind of digitally interpreted fluorescence imagery, much of it 3D, multiplexed, or super-resolution.  Certain image analysis software is now considered critical infrastructure for the world’s research community: the scientists who maintain it and outfox digital cheaters are rightly celebrated by funding agencies.

But nowhere is bioimaging more exciting than where it connects to molecular biology, such as the structure and regulation of DNA, RNA, and proteins.  Studying the molecular biology of cells purified into plastic vials ignores the context of anatomy, which can be enormously informative.  But scientists can now bring molecular assays to the microscope: in situ labeling can identify huge swaths of the proteome and transcriptome in primary tissue.  They can also use the microscope to pick out cells for molecular analysis, for example by laser-dissecting single cells and sequencing their genomes to construct family trees within a single cancerous tumor.  The tools to do this are diverse and gaining steam by the day (my day job is part of that steam).  So the distinction between studying anatomy and studying molecules is disappearing.
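
To make the family-tree idea concrete, here is an illustrative sketch under toy assumptions; real lineage reconstruction uses far more careful phylogenetic models than simple hierarchical clustering.

```python
# Illustrative sketch: a crude "family tree" of tumor cells from
# single-cell mutation calls, in the spirit of the laser-dissection
# example above.  The data are toy assumptions; real lineage inference
# uses far more careful phylogenetic models than simple clustering.

import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

# Rows = single cells, columns = somatic mutations (1 = present).
cells = ["cell_A", "cell_B", "cell_C", "cell_D"]
mutations = np.array([
    [1, 1, 0, 0],  # cell_A
    [1, 1, 1, 0],  # cell_B: cell_A's mutations plus one more
    [1, 0, 0, 1],  # cell_C: diverged earlier
    [1, 0, 1, 1],  # cell_D: close kin of cell_C
])

# Cells sharing more mutations are treated as closer kin.
tree = linkage(pdist(mutations, metric="hamming"), method="average")
leaves = dendrogram(tree, labels=cells, no_plot=True)["ivl"]
print(leaves)  # leaf order groups A with B and C with D
```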

The most powerful new research tools will measure some kind of molecular information and superimpose it on true spatial images of cells and tissues.  This will reveal new therapeutic strategies, advance personalized medicine beyond narrow gene tests, and lead to fundamentally new categories of diagnostic tests.

Personalized Medicine Bumps Into Research

Much of our basic knowledge of biology comes from studying model organisms like fruit flies and yeast.  Much of human-specific biology has been studied in cells isolated and cultured in the lab.  Experimenting in live humans is, of course, strongly restricted by ethics and regulation, and risky interventions must serve compelling medical purposes that go beyond advancing basic science.  So there remains a formidable gap between our scientific models and the messy, complicated environment of the human body.

But today, new biology discoveries are increasingly common in live humans.  Analysis of single cells’ genes and transcriptomes is now a driver of therapeutic lead discovery.  This is so valuable that freezing and organizing patients’ tissue is a major revenue-generating function of large medical systems.  Stem cells induced from skin cells, organoids, and patient tissue engrafted into animal models yield powerful study systems that are far more sophisticated than classic cell culture lines.  While this does not alleviate the need for strong ethical guidance and consent, these minimally invasive strategies mean that whole populations of humans now participate in science experiments without even knowing it.

These systems allow us to study cells in environments that are impressively close to the mysteries of the human body.  What this really means is that the distinction between basic research and medicine is increasingly blurry.  Previously, LST teams could limit early products to research use and disregard most of the ethical and regulatory considerations that burden clinical products.  But now, exactly where that line is drawn is less clear: planning for accelerated entry into clinical niches is both a responsible and a smart LST strategy.

Looking Beyond Genomics

In its formative years, the Human Genome Project was a quest for a master script that would illuminate most unanswered questions in biology.  We now know how much more complicated the truth is, thanks in no small part to the huge variety of ways that cells can deviate from the script.  These deviations form astoundingly complicated networks of transcription regulation, differentiation and epigenetics, post-translational modification, protein-protein signaling, microbiomes, and microenvironments, none of which can be predicted directly from the genome.  And in the most pressing – and fascinating – problems like cancer and neurodegeneration, errors in genetic information are more like undiscovered languages than simple typos in individual DNA bases.

So the forefront of discovery goes far beyond the genome, the code that nature practically put there to be read and understood.  Ultimately, understanding biology at the level of individual cells’ protein expression is both more difficult and more powerful.  Proteomics is a different battle entirely: once proteins are created, their sequences cannot be base-paired and amplified the way DNA and RNA can, so detection is difficult.  Furthermore, post-translational modifications direct the trafficking and function of proteins, and tools to rigorously study such proteoforms are in their infancy.  The same difficulty also means huge opportunity as the past few decades’ investment in genomics technology reaches maturity.

Advance Science by Advancing the Tools

The past ten years have witnessed tremendous progress in our ability to understand and manipulate life.  A complete human genome, once a multinational Big Science endeavor, is now something everyday citizens order for their amusement.  Gene therapy is a clinical reality, and CRISPR is a household word.  Nowhere is the distance between laboratory discovery and life-changing utility shorter than in biotechnology.  But the engine behind this rush of innovation consists mainly of instruments, reagents, and software.  To fully understand progress in the life sciences, pay close attention to what’s under the hood.