Firms increasingly recognize that better, faster use of their information is key to success. This matters even more to Life Science firms as they confront an explosion of data. Their “big bang” occurred in the past decade, creating new fields such as proteomics, metabolomics, and disease modeling, each with its own new data types. It cannot be stressed enough that this diversity is more challenging than a mere increase in the volume of data within any one “type”.
A number of new techniques and technologies have been applied unevenly in BioPharma firms (reflecting Gartner’s varied hype assessments), more recently under the umbrella of “translational medicine”. The NIH has only just (2010) announced the creation of a division on “Advancing Translational Sciences”, but some firms are already working productively in this space. I have recently helped several clients wade through the claims and confusion to implement tools that actually solve real BioPharma problems and open new doors of opportunity. Even so, many firms remain unaware of the potential of semantics.
Let’s consider just two types of toolsets: data federation this week, and semantics (of which text analytics is a subset) next time. Big players on the software and services side have been buying up these and other tools to create “Business Intelligence” suites. But for BioPharma R&D, these suites are often overkill, or misaligned with (misapplied to?) the unique challenges of Life Sciences. The key is to ask “which of the problems we face in R&D can now be solved?” rather than the more common “which ‘hot’ tools can we purchase now?”
The information explosion (reflected in another way above*) created two kinds of problems: new data types were hard to relate meaningfully to existing ones, and an excess of “unstructured” data made it hard to distill everything down into meaning and insights. Ten years ago, correlating such data was done ad hoc or, if clinical/file-able, by putting it into a data warehouse, where it takes intensive effort to Extract, Transform, and Load (ETL) the information before it is usable. But a key path to unlocking the potential of R&D investments is to connect information across the entire value chain, from Discovery to Markets, and ETL/warehouse approaches could not cope with much more than clinical (well-behaved, well-categorized) data. About five years ago, a new technique broke this barrier: Data Abstraction and Federation (no, not “DAF”; a more professional acronym is used: “EII”, or Enterprise Information Integration).
EII solves half of the needs created by the R&D information explosion cited above. Instead of having to rework and recode the entire warehouse whenever new data definitions appear (subgroups in proteomics, etc.), EII easily accommodates them. Scattered databases, whether new proteomic, legacy chemometric, or clinical, can now be easily linked and queried. I led a team in the first Pharma application of this technology and had scientists answering questions they thought were beyond reach within weeks of piloting. This dramatically increases the value we can pull out of R&D information and investments by connecting diverse data into actionable insights that solve formerly intractable problems. From another perspective, the time formerly spent gathering, extracting, transforming, and loading is compressed so much that 1) barriers to examining new ideas are removed, and 2) many thousands of person-hours can now be applied to more innovative and valued activities.
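For readers who like to see the idea in miniature, here is a toy sketch of federation, not a picture of how any commercial EII product works. It assumes two separate SQLite databases standing in for a proteomics source and a clinical source; the table names, column names, and sample values are all hypothetical. The sources stay where they are, and a thin view (the "abstraction" layer) joins them at query time with no ETL step into a warehouse.

```python
import sqlite3


def build_federated_connection():
    """Attach two independent (hypothetical) sources and expose one joined view."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates a separate database; here they stand in for
    # physically separate proteomics and clinical systems.
    conn.execute("ATTACH DATABASE ':memory:' AS proteomics")
    conn.execute("ATTACH DATABASE ':memory:' AS clinical")

    conn.execute(
        "CREATE TABLE proteomics.expression "
        "(sample_id TEXT, protein TEXT, abundance REAL)"
    )
    conn.execute(
        "CREATE TABLE clinical.outcomes (sample_id TEXT, responder INTEGER)"
    )
    # Made-up illustrative rows, not real data.
    conn.executemany(
        "INSERT INTO proteomics.expression VALUES (?, ?, ?)",
        [("S1", "P53", 2.0), ("S2", "P53", 4.0), ("S3", "P53", 10.0)],
    )
    conn.executemany(
        "INSERT INTO clinical.outcomes VALUES (?, ?)",
        [("S1", 0), ("S2", 0), ("S3", 1)],
    )

    # The abstraction layer: a TEMP view may span attached databases,
    # so nothing is copied out of either source.
    conn.execute(
        """
        CREATE TEMP VIEW protein_response AS
        SELECT e.protein, e.abundance, o.responder
        FROM proteomics.expression AS e
        JOIN clinical.outcomes AS o ON o.sample_id = e.sample_id
        """
    )
    return conn


def mean_abundance_by_response(conn, protein):
    """Average abundance of one protein, split by clinical response."""
    rows = conn.execute(
        "SELECT responder, AVG(abundance) FROM protein_response "
        "WHERE protein = ? GROUP BY responder ORDER BY responder",
        (protein,),
    ).fetchall()
    return {bool(responder): avg for responder, avg in rows}


if __name__ == "__main__":
    connection = build_federated_connection()
    print(mean_abundance_by_response(connection, "P53"))
```

In this sketch, bringing in a new data type means attaching one more source and defining one more view; the existing sources and queries are left untouched, which is the contrast with reworking a warehouse schema.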