How Can We Enable and Advance Clinical Research: The Power of Federated Data

January 12, 2023

Contributors: Dr. Rajni Aneja, Sandy Pentland and Anne Kim

As a vital connector of the clinical research world, patient advocacy groups are responsible for setting disease-ending insights in motion. The two main goals: finding cures and ending disease.

Today’s patient advocacy organizations (PAOs) are doing amazing things, but there’s still so much untapped potential for your organization.

Clinical research is only as effective as the data that researchers can easily access. For a variety of reasons — limited sample sizes, a lack of representative data, and patient privacy concerns — the quality and quantity of clinical data stand to improve.

What can patient advocacy groups do about it? In truth, advancing clinical research — and the data it produces — will be a collective effort from PAOs, researchers, hospitals and companies. However, it’s important for PAOs to understand the factors that are holding back clinical research.

It’s also essential for your organization to seize the opportunity and become a next-generation patient advocacy group: one that embraces new strategies and tools that help unlock more data access for its researchers and partners. With collective action and creative solutions, next-gen PAOs can attract mission-aligned funding, grow their networks of principal investigators (PIs), researchers and hospitals, and drive disease-ending insights.

Limitations to a more profound impact on clinical research

PAOs such as yours face obstacles in achieving their common goal: preventing, treating, and curing conditions for every population and disease subtype. Aligning a vast network of researchers, patients, hospitals and other partners isn’t easy.

Here are three trends that are curbing the potential of clinical research:

Limited sample size

Clinical research typically relies on a limited number of clinical sites: 15-20 in many cases. Research can only go so far when the accessible data pool is limited.

Expanding that data pool is easier said than done, in part due to the needs of the various stakeholder groups in the process. Researchers need datasets in certain formats. Hospital leadership has to oversee the approval of sharing datasets. IT professionals have their own set of responsibilities to grant and monitor that access.

As with many cross-functional efforts, getting everyone on the same page is slow and arduous. We all want more access to data, but progress comes more slowly than any of us would like.

Lack of representative data

Those limited clinical sites have another major problem: representation. From 2015 to 2019, 76% of clinical trial participants were white, according to data from the Food and Drug Administration. Unfortunately, many clinical insights represent only a small subset of the population and overlook certain demographics and disease subtypes.

What about everyone else? How can we source clinical data that are representative of everyone?

There’s an unfortunate trend of distrust of medical research in some minority populations — a feeling that’s been fostered by numerous cases of either malicious or negligent abuse. It’s a vicious cycle; we lack representative data, which means we lack representative cures, which further erodes trust.

If the research community had access to more representative data sets – and more context into where clinical data comes from – we could help address these institutional biases.

Patient privacy concerns

Some of the distrust is tied to concerns over patient privacy. Vital personal information such as demographic data, history of drug usage and familial data has entered the digital space at a rapid pace. As an industry, we haven’t clearly defined all of the ethical rules and regulations to keep this data safe.

Nearly 75% of patients express concern about protecting the privacy of personal health data, according to a survey from the American Medical Association. Only 20% of patients feel that they understand the scope of companies and individuals with access to their data.

Hospitals and other institutions are often reluctant to make their data accessible – and reasonably so. Many of these organizations lack the infrastructure or people power to efficiently share datasets while maintaining privacy standards.

Whatever solutions we create need to be built on a foundation of keeping patient data secure, private, and used only for the right reasons.

The Solution: What can patient advocacy groups do?

As a PAO leader, you’re probably coming back to the same question: what can my patient advocacy group do to alleviate these problems and achieve our goals?

The use of federated data provides one potential avenue for PAOs to maximize their impact. Federated analytics is an approach for performing computational data analysis on multiple data sets without needing to pull the data itself into a central location.

Software that utilizes this methodology can help researchers gather representative data in a collaborative, privacy-preserving manner. It has the potential to play a role in clinical research’s transformation by addressing concerns with the status quo of clinical research.
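To make the idea concrete, here is a minimal sketch of federated analytics in Python. It is an illustration under stated assumptions, not a description of any particular product: the names SiteSummary, summarize_locally and combine are hypothetical, and the patient ages are made-up values. Each site computes a small summary inside its own environment, and only those summaries (never the patient-level rows) are sent to a coordinator that pools them.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics a site releases (hypothetical type, no patient rows)."""
    count: int
    total: float
    total_sq: float


def summarize_locally(values: Iterable[float]) -> SiteSummary:
    """Runs inside one site's environment; only the summary leaves the site."""
    values = list(values)
    return SiteSummary(
        count=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def combine(summaries: List[SiteSummary]) -> dict:
    """Runs at the coordinator; pools summaries into a global mean and
    population variance without ever seeing individual records."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records across sites")
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean ** 2
    return {"n": n, "mean": mean, "variance": variance}


# Made-up example: patient ages held at three separate sites.
site_a = [54.0, 61.0, 47.0]
site_b = [70.0, 66.0]
site_c = [39.0, 58.0, 62.0, 45.0]

result = combine([summarize_locally(s) for s in (site_a, site_b, site_c)])
print(result)
```

The pooled mean and variance match what a researcher would get from the combined dataset, even though the coordinator only ever receives three small summaries. Aggregates alone are not a complete privacy guarantee, so a real deployment would layer further safeguards on top of this pattern.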

A clinical data federation fueled by federated analytics can be implemented on a large scale to:

Unlock data access

Federated learning can shorten the path to a trained model because the data doesn’t need to be copied and transferred between institutions, as it would be with the traditional approach. Instead of asking each hospital to share its data individually, researchers can run machine learning algorithms against the federated data from multiple hospitals at once.
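One common way to train a model this way is federated averaging. The sketch below is a toy version in plain Python, assuming a one-feature linear model and made-up data; the function names (local_update, federated_average, train) and the learning-rate and round settings are illustrative choices, not any vendor's API. Each site improves the shared model on its own data, and the coordinator averages the resulting parameters, weighted by how many records each site holds.

```python
from typing import List, Sequence, Tuple

Weights = Tuple[float, float]            # (slope w, intercept b)
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs held at one site


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.02, epochs: int = 5) -> Weights:
    """Runs at a site: a few gradient-descent steps on mean squared error,
    using only that site's records."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Runs at the coordinator: average the site models, weighted by
    each site's record count. Only parameters are exchanged."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(sites: List[Dataset], rounds: int = 300) -> Weights:
    """Repeat: broadcast the model, train locally, average the results."""
    weights: Weights = (0.0, 0.0)
    sizes = [len(d) for d in sites]
    for _ in range(rounds):
        updates = [local_update(weights, d) for d in sites]
        weights = federated_average(updates, sizes)
    return weights


# Made-up, noise-free data following y = 2x + 1, split across two sites.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]

w, b = train([site_a, site_b])
print(f"w={w:.4f}, b={b:.4f}")
```

On this toy data the averaged model approaches a slope of 2 and an intercept of 1, the same line that fits the pooled records, although neither site ever sends its rows to the other or to the coordinator.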

Preserve patient privacy

This approach differs from traditional methods because local data sources don’t need to send data to a centralized platform for processing. It’s relatively easy for IT teams to federate data by enabling secure access without any actual data exchange. As a result, IT and security teams may be more willing to implement federated learning, because patient records stay where they already are.

Increase representation in data

When patient data is easily accessible for researchers in a privacy-preserving way, researchers won’t have to rely on the same datasets over and over. This opens up more possibilities for using datasets that represent different demographics and disease subtypes.

Array Insights: A clinical data federation for next-gen patient advocacy groups

At Array Insights, we offer software for creating a Clinical Data Federation for connecting patient advocacy groups, hospitals, and clinical researchers.

Through Array Insights’ platform, traditionally siloed data can be accessed and shared securely. Our platform uses the power of federated analytics, which helps avoid potential privacy and security concerns and speeds up data sharing with the use of trained AI models. With Array Insights, IT teams can easily federate data by enabling secure access without any actual data exchange.

Through this process, Array Insights can help your next-gen PAO expand your hospital and research network, thus enabling access to more diverse and representative datasets — and alleviating concerns over slow workflows and patient privacy.

We’re happy to play a role in advancing clinical research, but the true progress will come through patient advocacy groups, such as yours. Embracing new solutions and adopting a collective mindset will go a long way in fulfilling your organization’s potential.

Could your PAO take advantage of a Clinical Data Federation? Securely connect to representative data that enables disease-ending insights at

About the Contributors

Dr. Rajni Aneja is a leader in developing the vision for strategy and innovation across different spectrums of healthcare, including health management and digital health. She is a physician executive at the intersection of business, medicine and tech, a Connection Science Fellow at MIT and an executive strategic advisor and board member for a variety of health and wellness organizations. She previously served as chief medical officer for WebMD.

Professor Alex “Sandy” Pentland has been recognized as one of the most-cited and most-popular data scientists in the world. He is a professor at MIT and directs MIT Connection Science, a worldwide alliance of progressive companies, nations, and multilateral organizations dedicated to increasing privacy, trust, and security in data systems.

Anne Kim is co-founder and CEO of Array Insights, which is based on her graduate work at MIT. She worked with Professor Pentland on federated learning and blockchain solutions for clinical trial optimization using Open Algorithms (OPAL). Outside of her research, Anne has done a number of different projects in computer science and molecular biology and cyberbiosecurity work with the EFF, ACLU, and DEFCON.