8K+ employees
200+ peer-reviewed publications in 2019
1M+ exomes sequenced
How a leading biotechnology company with 8,100+ employees worldwide is accelerating life-saving discoveries through intelligent data search.
“Our next product lies in our data—we just have to find it.”
— Leonard Schleifer, Founder and CEO, Regeneron
When data discovery saves lives
In the world of biotechnology, time isn’t just money—it’s lives. Regeneron, a research-driven biotechnology company focused on developing life-saving pharmaceuticals for patients with serious illnesses like cancer and infectious diseases, understands this reality acutely. With over 8,100 employees worldwide, more than 200 peer-reviewed publications in 2019 alone, and the world’s largest and most diverse genomic database with over 1 million exomes sequenced, Regeneron generates and processes massive amounts of data daily.
But having data isn’t enough. Finding the right information at the right time is what transforms raw data into life-saving treatments. This is where Regeneron’s MetaBio platform, powered by Lucidworks, has become a critical accelerator in the company’s mission.
The data challenge in biotechnology
Regeneron manages the entire lifecycle of creating treatments—from research and development to manufacturing. This comprehensive approach creates unique data challenges:
- Scientists need to quickly sift through millions of research documents to find relevant studies
- Clinical teams must identify similar trials conducted worldwide to assess feasibility
- Regulatory experts must stay current with constantly evolving global regulations
- Manufacturing teams need to track any events that could affect drug production
“We’re not just delivering search results. We’re pushing the boundaries to help users understand the value of the data, what insights can be drawn from it, and how it can advance their work.” — Shahzad Ahmed, Senior Information Technology Engineer, Regeneron
MetaBio: An intelligent data ecosystem
At the heart of Regeneron’s data strategy is MetaBio, a data contextualization engine built by Abdul Shaik, Head of Data Platform Engineering, and his team. The platform uses AI and machine learning to learn from signals, such as user behavior, and infer user intent and business context.
MetaBio is a full-stack, cloud-based solution that ingests, connects, processes, and analyzes data using machine learning, artificial intelligence, and deep learning. The platform features:
- A centralized data lake connected to Lucidworks via a JDBC connector
- Advanced indexing and curation of data, making it accessible across Regeneron’s ecosystem
- Security trimming to ensure only authorized users can view sensitive documents
- User behavior tracking to optimize result relevance and better predict intent
- Personalization based on department, job title, and individual preferences
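Two of the features listed above, security trimming and personalization, can be illustrated with a minimal sketch. This is not MetaBio's or Lucidworks' implementation; the `Document` and `User` types, the group names, and the `boost` factor are all hypothetical, chosen only to show the idea of filtering by access group first and then re-ranking by department.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Document:
    doc_id: str
    score: float                    # base relevance score from the search engine
    allowed_groups: FrozenSet[str]  # groups permitted to view this document
    department: str = ""            # department the document is most relevant to


@dataclass(frozen=True)
class User:
    name: str
    groups: FrozenSet[str]
    department: str


def security_trim(docs: List[Document], user: User) -> List[Document]:
    """Keep only documents that share at least one access group with the user."""
    return [d for d in docs if d.allowed_groups & user.groups]


def personalize(docs: List[Document], user: User, boost: float = 1.5) -> List[str]:
    """Boost documents matching the user's department, then rank by score."""
    rescored = [
        (d.score * boost if d.department == user.department else d.score, d)
        for d in docs
    ]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return [d.doc_id for _, d in rescored]


# Hypothetical example data
docs = [
    Document("a", 1.0, frozenset({"research"}), "oncology"),
    Document("b", 1.2, frozenset({"regulatory"}), "regulatory"),
    Document("c", 1.1, frozenset({"research", "clinical"}), "genetics"),
]
user = User("scientist", frozenset({"research"}), "oncology")
ranked = personalize(security_trim(docs, user), user)
```

Trimming runs before ranking so that a document the user may not see can never influence, or appear in, the personalized results.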
“We have a long list of applications and technology stacks created at Regeneron. Lucidworks fits nicely into the data ecosystem,” notes Ahmed.
Accelerating COVID-19 research
The COVID-19 pandemic presented an extraordinary challenge: an overwhelming influx of information—research papers, new findings, and potential treatments—that human analysts alone couldn’t possibly filter through efficiently.
Regeneron’s antibody cocktail was the first treatment to demonstrate statistically significant antiviral activity against the novel coronavirus. This breakthrough was accelerated by MetaBio’s ability to process over 60,000 COVID-19-related research documents and journals.
The COVID-19 Search App, built on the MetaBio platform, helped Regeneron’s Regulatory Intelligence Group quickly filter through the noise, enabling teams to identify truly relevant data and make informed decisions on how to respond to the rapidly evolving situation.
Lucidworks’ pipeline framework accelerated the data ingestion and indexing process, while automatically applying machine learning at the moment of data ingest to add deeper understanding and context to every dataset.
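The idea of enriching documents at ingest time can be sketched as a chain of stages that each document passes through before it is indexed. The sketch below is an illustration under stated assumptions, not the Lucidworks pipeline API: the stage functions, field names, and keyword table are hypothetical, and the keyword tagger is a simple stand-in for a real machine learning model.

```python
from typing import Any, Callable, Dict, Iterable, List

Doc = Dict[str, Any]
Stage = Callable[[Doc], Doc]


def normalize(doc: Doc) -> Doc:
    """Collapse whitespace and lowercase the text field."""
    return {**doc, "text": " ".join(doc["text"].split()).lower()}


def tag_topics(doc: Doc) -> Doc:
    """Stand-in for an ML classifier: tag topics with hypothetical keyword rules."""
    keywords = {
        "antibody": "therapeutics",
        "vaccine": "prevention",
        "trial": "clinical",
    }
    topics = sorted({topic for word, topic in keywords.items() if word in doc["text"]})
    return {**doc, "topics": topics}


def run_pipeline(docs: Iterable[Doc], stages: List[Stage]) -> List[Doc]:
    """Apply every stage to every document, in order, and collect the results."""
    indexed = []
    for doc in docs:
        for stage in stages:
            doc = stage(doc)
        indexed.append(doc)
    return indexed


# Hypothetical input document
raw = [{"id": "1", "text": "Antibody  cocktail TRIAL results"}]
indexed = run_pipeline(raw, [normalize, tag_topics])
```

Because enrichment happens once at ingest, every later query can filter or rank on the added `topics` field without re-running the model.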
Streamlining clinical trial research
Before initiating any clinical trial, Regeneron conducts extensive feasibility assessments to review similar trials worldwide. MetaBio’s Clinical Trials Research App supports this critical process by providing researchers with immediate access to relevant studies.
For example, when scientists working on multiple myeloma search for related clinical trials in China, the platform can surface over 36,000 relevant documents. Advanced filters allow researchers to refine results further, while custom visualizations highlight trial success rates in specific countries.
Researchers can bookmark important documents and export them directly into Regeneron’s workflow applications, facilitating analysis and collaboration across teams.
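The filtering and per-country success rates described in this section can be sketched in a few lines. The record layout (`condition`, `country`, `succeeded`) and the sample records are made up for illustration; they are not Regeneron data and not the app's actual schema.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Trial = Dict[str, Any]


def filter_trials(trials: List[Trial], condition: str, country: Optional[str] = None) -> List[Trial]:
    """Keep trials for a condition, optionally restricted to one country."""
    return [
        t for t in trials
        if t["condition"] == condition and (country is None or t["country"] == country)
    ]


def success_rate_by_country(trials: List[Trial]) -> Dict[str, float]:
    """Return the fraction of successful trials for each country."""
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for t in trials:
        totals[t["country"]] += 1
        successes[t["country"]] += int(t["succeeded"])
    return {country: successes[country] / totals[country] for country in totals}


# Hypothetical sample records
trials = [
    {"condition": "multiple myeloma", "country": "China", "succeeded": True},
    {"condition": "multiple myeloma", "country": "China", "succeeded": False},
    {"condition": "multiple myeloma", "country": "Japan", "succeeded": True},
    {"condition": "melanoma", "country": "China", "succeeded": True},
]
myeloma = filter_trials(trials, "multiple myeloma")
rates = success_rate_by_country(myeloma)
```

A visualization layer would then chart `rates`, one value per country.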
Personalized discovery for diverse teams
What makes MetaBio particularly powerful is its ability to tailor the data experience to different user groups. Whether the users are researchers, clinicians, clinical development teams, or commercial units, the platform adapts to their needs:
- Search queries and user profile information help surface contextually valuable results
- Users can bookmark documents, save searches, and adjust data source preferences
- When sufficient behavior data isn’t available, the platform uses AI and synthetic data to simulate user behaviors and predict the actions of other user groups
This personalization eliminates data silos and creates knowledge discovery experiences tailored to each scientist’s needs, allowing them to find the data they need when they need it.
Award-winning impact
MetaBio’s significance has not gone unnoticed. In 2022, Regeneron received the prestigious CIO 100 award, recognizing the platform’s innovative use of technology to deliver substantial business value.
User engagement metrics tell the same story. Since its launch, daily traffic to MetaBio has increased significantly, with steady improvements in relevance metrics like click-through rates and successful search results.
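A relevance metric such as click-through rate can be computed from a log of search events. The sketch below assumes a hypothetical event format, with a `type` of either `query` or `click` and a shared `query_id`, and defines click-through rate as the share of queries that received at least one click. The source does not say how MetaBio defines or measures this metric.

```python
from typing import Dict, List


def click_through_rate(events: List[Dict[str, str]]) -> float:
    """Fraction of queries that led to at least one click on a result."""
    queries = {e["query_id"] for e in events if e["type"] == "query"}
    clicked = {e["query_id"] for e in events if e["type"] == "click"} & queries
    return len(clicked) / len(queries) if queries else 0.0


# Hypothetical event log: four queries, two of which received clicks
events = [
    {"type": "query", "query_id": "q1"},
    {"type": "click", "query_id": "q1"},
    {"type": "click", "query_id": "q1"},  # repeat clicks count once per query
    {"type": "query", "query_id": "q2"},
    {"type": "query", "query_id": "q3"},
    {"type": "click", "query_id": "q3"},
    {"type": "query", "query_id": "q4"},
]
ctr = click_through_rate(events)
```

Counting each query once keeps a single user who clicks many results from inflating the metric.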
“One of the key selling points of Lucidworks that I’ve shared with leadership is its flexibility. We collaborate closely with stakeholders to understand their challenges, and we can quickly build custom solutions that solve specific problems relying on Lucidworks capabilities.” — Shahzad Ahmed, Senior Information Technology Engineer, Regeneron
The future of data-driven discovery
As Regeneron continues expanding MetaBio’s capabilities to its commercial operations, the platform’s core mission remains the same: expediting data discovery through a self-service, on-demand platform that accelerates decision-making across diverse teams.
By analyzing external sales data from pharmacies, Regeneron aims to gain deeper insights into market performance and inform business strategy—extending the platform’s value beyond research and development.
The company’s founder and CEO, Leonard Schleifer, captured the essence of MetaBio’s importance when he said: “Our next product lies in our data—we just have to find it.” With MetaBio’s fast and efficient data discovery, the path to critical breakthroughs is clearer than ever, allowing Regeneron’s researchers to focus on what matters most: saving lives.
With over 8,100 employees worldwide, more than 200 peer-reviewed publications in 2019, and more than 1 million exomes sequenced, Regeneron has transformed how its scientists discover life-saving treatments. Its AI-powered search platform turns vast amounts of data into actionable insights, showing that in biotechnology, finding the right information faster can save lives.