Examining the “other end” of the Semantics telescope

Powerful new ways to get the most value from your data are held back by confusing claims.  Pundits estimate that over 80% of data is unstructured [1], which formerly required teams of experts to address and interpret.  Most industries face this only as a text problem, but BioPharma also “suffers” from unstructured data in the form of chemical structures, disease pathways, biomarkers, etc.  Opinions vary widely on tech-based tools to address this challenge, and claims about their utility are often tinged by the seller’s agenda. 

Such things as Entity ID are already helping BioPharmas

Recently (2009), Gartner cited text analytics as being at or just over its “hype” peak of maturity, again slowing acceptance.  However, other capabilities of semantics are actually well into the stages Gartner describes as having productive maturity (D&E on right). 

This keeps many in BioPharma from addressing huge, formerly intractable problems that can now be solved.  Let’s let others debate the technical aspects of semantics, text analytics, ontologies, etc., and instead look at key areas that these tools can now unlock.  That means looking through “the other end” of the telescope compared with how software sellers and IT consultants view things. 

Semantics offer improvements at all points

There are many problems that semantics can address across the broad time frame and activities of BioPharma R&D.  Firms are exploring how semantics can improve all of the decision points highlighted here, from finding a biomarker to mining clinical responses in electronic health records.  

But if we look at this with an eye for value and competitive advantage, several areas stand out: 
  1) In-licensing opportunities
  2) Positive and cautionary signals on a drug before clinical trials (post a question below as to how)
  3) Identify better discovery candidates faster
  4) Detect weak patents while discovering where competitors are investing 
Solving any one of these can represent hundreds of millions of dollars of value.  

In-depth approaches to each of these would take many pages, so let’s explore just one of them: increasing success and returns on in-licensing efforts. 

Text analytics can sift through thousands of documents*, database records and web-sourced information, pulling out concepts that are novel or appearing more and more often in discussions.  Such concepts, or insights into the nature of a disease pathway, can then guide the licensing team to articles or even web postings (even Twitter or Facebook) of firms or academics with relevant research.  The end result: whoever recognizes the value of intellectual property ahead of the competition can secure the rights first.  Such rights also tend to cost less for several reasons, including acting before another bidder is involved.  
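To make “novel or increasing” concrete, here is a minimal sketch in Python, assuming documents have already been gathered and split into an earlier and a later period.  It only counts word tokens; real semantic tools add entity recognition and ontologies on top of this.  The function names (`term_counts`, `rising_terms`), the thresholds and the sample sentences are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def term_counts(documents):
    """Count lowercase word tokens (two or more characters) across documents."""
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z][a-z0-9\-]+", text.lower()))
    return counts


def rising_terms(earlier_docs, later_docs, min_later=2, min_ratio=2.0):
    """Return (term, growth_ratio, is_novel) for terms gaining share over time.

    Rates use add-one smoothing so that a term absent from the earlier
    period (a "novel" term) still gets a finite ratio.
    Thresholds are illustrative assumptions, not tuned values.
    """
    earlier = term_counts(earlier_docs)
    later = term_counts(later_docs)
    earlier_total = sum(earlier.values())
    later_total = sum(later.values())

    results = []
    for term, n_later in later.items():
        if n_later < min_later:
            continue  # ignore terms too rare to count as a signal
        rate_later = (n_later + 1) / (later_total + 1)
        rate_earlier = (earlier[term] + 1) / (earlier_total + 1)
        ratio = rate_later / rate_earlier
        if ratio >= min_ratio:
            results.append((term, ratio, earlier[term] == 0))
    # Strongest growth first
    return sorted(results, key=lambda item: item[1], reverse=True)


# Made-up sample text, for illustration only
earlier_docs = [
    "the drug acts on the liver pathway",
    "the drug shows adverse effects",
]
later_docs = [
    "the drug acts on the ampk pathway",
    "ampk activation explains the drug response",
    "ampk pathway studies",
]

for term, ratio, is_novel in rising_terms(earlier_docs, later_docs):
    label = "novel" if is_novel else "rising"
    print(f"{term}: {label}, growth x{ratio:.1f}")
```

In this toy example only “ampk” is flagged, because it is absent from the earlier period and repeated in the later one.  A list like this is a pointer for a skilled reader, not a conclusion.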

A pre-semantics problem example:  In the 1990s, a licensing leader at Bristol-Myers was reading through dozens of journals each month, looking for insights on possible investments.  He noticed that a new study on an old drug revealed it acted on a slightly different pathway than previously assumed.  Similar drugs were found to have adverse effects, so this product was never filed for approval in the USA.  The knowledge his Discovery organization had recently provided on the nature of the disease led this leader to recognize that the adverse effect was triggered within the other pathway.  This serendipity of ideas and understanding came by hard work and chance.  With this new insight, the firm secured the rights to apply and market the drug for five years, and at a great price.  It resulted in the launch of Glucophage, a drug that netted BMS well over eight billion dollars in sales in the US in five years.   

Nice results.  But could this success be aided or reproduced with new capabilities?  Some firms are now using a combination of semantic tools to do just that.  They pull out insights and correlations that advise and support very skilled and smart experts.  These folks can now spend their time considering and acting on ideas, not reading thousands of pages before they can “think”.  

If folks are interested, we can expand on exactly how these tools can do this.  Alternatively, if someone posts a question as to how the other problems can be addressed (preclinical signals, novel discovery targets …), and how much value they can bring (again, many millions of dollars), we will explore them in the next post.  Please let us know your thoughts with posts below.   

Architect’s toolbox:  Instead of seeing where to “tack on” a new R&D capacity, look first at areas that have big opportunities or needs like automating the $8B Glucophage serendipity.  This provides better returns and acceptance while building on the existing goals of the organization.   Next, map out the capacities in hand vs. those you need to create the benefits.  Use this “capability framework” to guide tool selection, configuration and implementation.  RandDReturns has created numerous such frameworks for clients looking to boost performance by “changing the game” with new, transformative tools. 

* Similar approaches can be applied to other data types such as chemical structures (or space-filling or receptor models) to winnow through other databases and filings.  These provide a way to correlate patents or target molecules to key ideas from Discovery or from the complementary text analytics process.    

Finally, some sources.  One provides a glimpse of the “sentiment analysis” dimension of text analytics.  Note that RandDReturns has no relationship with any of these firms or tools: 

From http://www.optify.net/social-media/sentiment-analysis-metrics-tools-part-2/ 

Seven Sentiment Analysis Tools to Check Out:

  1. OpenAmplify: In the company’s words it is a “technology company specializing in natural language processing and text analysis.”
  2. Social Mention: “Social Media Alerts: Like Google Alerts but for social media.”
  3. Amplified Analytics: This tool is geared primarily toward product reviews and marketers interested in tracking those reviews across multiple sites. Free trial, no registration required.
  4. Jodange: “Automatically filters and aggregates thoughts, feelings and statements from traditional and social media.” Offers a 30-day free trial.
  5. Lithium: Lithium Social Media Monitoring: “It’s easy to configure, works in real-time, and finds the best stuff from millions of social media sources.” Offers a free, 14-day trial with sign up.
  6. SAS Sentiment Analysis Manager: Part of SAS Text Analytics program, the Sentiment Analysis Manager: “crawls content sources, including mainstream Web sites and social media outlets, as well as internal organizational text sources…[it] creates reports that describe the expressed feelings of consumers, customers and competitors in real time.”
  7. Trackur: “Trackur is an online reputation & social media monitoring tool designed to assist you in tracking what is said about you on the internet.” Claims more than 27,000 users. Offers a free plan with sign up, as well as a 10-day money back guarantee with any paid plan.

Other Links of interest: 

1.  http://en.wikipedia.org/wiki/Unstructured_data



One Response to Examining the “other end” of the Semantics telescope

  1. A recent article in the Wall Street Journal (see link) gives another example of the utility of semantics affecting medical therapies, their markets and regulations. Two firms are deconvoluting the millions (>6.4M) of Adverse Event (AE) reports in FDA databases, trying to extract significant “signals” for consumers. Health is a major use of internet search, and yet the results are often overwhelming or misleading for the average consumer. For instance, a search for Tysabri and “side effects” turns up 3.9M hits on Google. This would take a while to read through. Two firms, AdverseEvents Inc. and Clarimed, LLC, are trying to change that. While their technologies are not mentioned, much of the value they add comes from putting relevant context and structured analysis on the masses of “unstructured data” in the FDA databases. One example report is shown here (http://adverseevents.com/attachments/s_drugreports.pdf), profiling the top ten AEs for the same search, for subscribing customers. A bit clearer than Google (but note the cautions on confidence below).

    This can be of great value to consumers wondering if their “dry mouth” or chills or loss of smell is linked to any of the several medications they are on. However, a key missed opportunity is that this information is based on data acquired over years, and sometimes filed well after the recorded “event”. This is far later than when signals from other sources can pop up. I can think of at least one example where blog and Twitter sources indicated signals eighteen months ahead of the point where the FDA acted upon its own “signals” of adverse event significance. Still, confidence in such sources, even in aggregate, falls far short of clinical and regulatory standards. It is a tradeoff between signal time and confidence.
