With Data Science, we aim to prevent a potential patient from becoming a patient. Interview with the directors of IQVIA
Piotr and Maciej recently discussed big data activities of IQVIA’s Real World & Analytics solutions and what the company does with real-world healthcare data. Here is a transcript of that interview and the duo’s in-depth explanation of using big data to drive healthcare forward.
IQVIA is referred to as “The Human Data Science Company™”. What does this mean to you, and how do you understand it?
Piotr: The term “human data science” is quite new, even to us. It refers to a combination of research on humans (their physiology, behavior and well-being) and computer-based data analysis using statistics or machine learning.
Almost every second of our lives, we generate enormous amounts of data, including information about our well-being (blood pressure/oxygen saturation/breath), medications taken, treatments received, kilometers traveled, our eating patterns, habits and ways of spending free time. Each of these events has a potential impact on our health: our current health but, more importantly, our future health. For hundreds of years, people have been formulating theses about a healthy lifestyle or methods of treating specific ailments. Thanks to modern technology, we have the opportunity not only to support these theses with numbers, but also to discover additional, previously unknown dependencies based on the collected data.
Maciej: This slogan also reflects where our main competencies and strengths are. We help clients and (indirectly) patients by creating solutions based on data science — finding dependencies, drawing conclusions and predicting events based on data. In our case, we collect and analyze data about a person — a patient or a potential patient. Many of our solutions prevent a potential patient (someone at risk of a complication or disease) from becoming an actual patient.
In other words, we have the human aspect and the data aspect, but the term “data science” also has a very technical connotation as an engineering field. We are a company focused on a specific domain, but we are also technological at our core!
What does the Real World Solutions department do?
Piotr: The RWS department collects medical data to support formulated theses or to generate new insights from these data. The sources range from clinical trials, through surveys, sales information, information about visits to medical facilities, data on adverse drug reactions and claims with insurers, to the analysis of messages from social networks or data collected by wearables. All this information is carefully processed in terms of privacy (risk of identifying a single person, data loss), consistency (within the set, between sets, but also between data providers) and quality (elimination of outliers, filling gaps in data). The data prepared this way is stored in a form that ensures efficient reporting or delivery to the client.
Maciej: Next, we take the data that we call “analytically ready data” and use it for analytics. This ranges from simple data extracts, through interactive, business-intelligence-style data exploration, to dedicated analytical applications that allow you to build so-called cohorts (groups of patients whose medical history meets certain, often very complex, criteria) and run advanced statistical and machine learning models on them. We also create applications closer to real time, which, for example, analyze the results of patient examinations and raise an alert if they detect the threat of an adverse event.
These applications are available in SaaS mode, but we also work with clients, building dedicated models for them addressing very specific cases.
We also work with clients to help them process their data into a form in which it can be part of inter-center research networks, such as EHDEN [European Health Data & Evidence Network] or OHDSI [Observational Health Data Sciences and Informatics].
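As a rough illustration of the cohorts mentioned above (groups of patients whose medical history meets certain criteria), the following Python sketch selects patients who received a given prescription within a set number of days after a given diagnosis. The `Event` record, the `build_cohort` function and the codes `DX-A` and `RX-1` are hypothetical names invented for this example; they do not describe IQVIA's applications or any real coding system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """One medical event in a patient's history (hypothetical schema)."""
    patient_id: str
    kind: str   # "diagnosis" or "prescription"
    code: str   # illustrative code, not taken from a real coding system
    on: date


def build_cohort(events, diagnosis_code, drug_code, max_days):
    """Return IDs of patients prescribed `drug_code` within `max_days`
    after their first `diagnosis_code` diagnosis."""
    # Pass 1: earliest qualifying diagnosis date per patient.
    first_dx = {}
    for e in events:
        if e.kind == "diagnosis" and e.code == diagnosis_code:
            prev = first_dx.get(e.patient_id)
            if prev is None or e.on < prev:
                first_dx[e.patient_id] = e.on

    # Pass 2: keep patients with a prescription inside the time window.
    cohort = set()
    for e in events:
        if e.kind == "prescription" and e.code == drug_code:
            dx = first_dx.get(e.patient_id)
            if dx is not None and 0 <= (e.on - dx).days <= max_days:
                cohort.add(e.patient_id)
    return cohort


EVENTS = [
    Event("p1", "diagnosis", "DX-A", date(2022, 1, 10)),
    Event("p1", "prescription", "RX-1", date(2022, 1, 20)),
    Event("p2", "diagnosis", "DX-A", date(2022, 2, 1)),
    Event("p2", "prescription", "RX-1", date(2022, 6, 1)),  # outside a 30-day window
    Event("p3", "prescription", "RX-1", date(2022, 3, 1)),  # no diagnosis on record
]

print(sorted(build_cohort(EVENTS, "DX-A", "RX-1", max_days=30)))  # ['p1']
```

Real cohort definitions are typically far more complex (exclusion criteria, multiple code sets, observation periods), but they follow the same idea of filtering patients by rules over their event history.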
You mainly deal with medical data analytics. What is the main goal of your work?
Piotr: We are part of a global organization, and we implement international projects. We employ people with different levels of experience and specializations — developers, testers, architects, scrum masters, product-owners, people-managers. The Polish part of our team (about 100 people in Poland, in Warsaw or remotely) focuses on creating an application ecosystem in the big data area, the so-called Data Factory, which is used to process “dirty” data from the world (real-world data) and create clean, homogeneous, statistically correct datasets from it.
Maciej: I would like to clarify that we are not dealing strictly with analytics, but with creating tools and platforms that allow us to do such analytics quickly, efficiently and repeatedly. An important element of our solutions is that they must be scalable: they should be fit for use in many countries, for data coming from different sources and for different purposes. It is a real art and a big challenge to design and build systems that meet today’s needs and can be easily extended to cases that we do not know yet.
How can the collected data help to create the best solutions for patients?
Piotr: The data we collect enables analyzing, for example, epidemiological trends simultaneously in many countries around the world. In addition, it allows the identification of patients at risk of a disease or health decline. As a result, medical facilities that provide us with data are able to contact the patient and to react in advance, in order to prevent the disease or detect it at an early stage. Knowing what treatments patients receive in each facility, we are also able to suggest optimal locations for conducting clinical trials of newly developed drugs. Based on the prescriptions received, we can estimate the length and effectiveness of therapy (drug doses, usage duration) and even suggest a change in treatment toward more effective therapy.
What data do you collect? How do you filter it?
Piotr: In our department, we focus on collecting so-called “electronic medical records,” EMRs for short, which cover all data related to the patient’s contact with a medical facility. This includes dictionary information about the patient, facility and doctor, as well as related medical events (diagnoses, reported problems, allergies, prescriptions, laboratory tests performed and their results, vaccinations, referrals or family history).
As mentioned before, these data, in accordance with the GDPR (or other relevant global standards), are pre-anonymized (replacement of identifiers, detection and replacement of text fragments in text fields) and an assessment of the risk of re-identification is performed. As a result of this assessment, it may be necessary to modify the data further in order to reduce the risk of re-identification (e.g., an age over 100 years makes a patient highly identifiable in a given region of the country). Then the data is checked from a legal point of view (whether the practice has a valid contract) and the quality of the transferred data is assessed.
It is worth mentioning that IQVIA is not the legal owner of most of its data, and its use typically involves a fee. As a result, unlike companies in the financial or retail sectors, IQVIA by definition does not have full market data and … is willing to prioritize data accuracy over completeness. Data of doubtful quality is rejected and any gaps in data (due to rejection or lack of a valid contract) are filled within the warehouse.
Maciej: I would emphasize the aspect of ensuring that patients cannot be identified. This is probably the most fundamental feature of our solutions, which forces, for example, multi-stage (in different physical locations and operated by different entities) processing. What distinguishes our solutions is the fact that we not only apply transformations, but as Piotr mentioned, we use methodologies and solutions that quantify the risk of identification. So, we not only remove some sensitive information and hope that it will be enough, but we rigorously check and calculate whether this risk has actually been reduced to a given level.
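To make the idea of quantifying identification risk more concrete, here is a minimal Python sketch of one common textbook measure, k-anonymity: the size of the smallest group of records that share the same quasi-identifiers. This is an assumption chosen for illustration, not a description of the methodology IQVIA actually uses. The field names, the `generalize` and `reidentification_risk` functions, and the age cap of 90 are all hypothetical.

```python
from collections import Counter


def generalize(record, age_cap=90):
    """Coarsen quasi-identifiers: 10-year age bands, top-coded at `age_cap`."""
    age = record["age"]
    if age >= age_cap:
        band = f"{age_cap}+"  # e.g. a 101-year-old no longer stands out
    else:
        low = age // 10 * 10
        band = f"{low}-{low + 9}"
    return (band, record["region"])


def reidentification_risk(records, key=lambda r: (r["age"], r["region"])):
    """Return (k, 1/k), where k is the smallest number of records
    sharing the same quasi-identifier key."""
    groups = Counter(key(r) for r in records)
    k = min(groups.values())
    return k, 1 / k


RECORDS = [
    {"age": 34, "region": "north"},
    {"age": 37, "region": "north"},
    {"age": 92, "region": "south"},
    {"age": 101, "region": "south"},
]

raw_k, raw_risk = reidentification_risk(RECORDS)
gen_k, gen_risk = reidentification_risk(RECORDS, key=generalize)
print(raw_k, raw_risk)  # 1 1.0  (every record is unique)
print(gen_k, gen_risk)  # 2 0.5  (each record shares its key with another)
```

The point of the sketch is the workflow described above: transform the data, then measure whether the risk actually dropped to the required level instead of assuming that it did.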
Who uses your work and the data you collect?
Piotr: Our data is a huge mine of knowledge for medical institutions, public health institutions, pharmaceutical companies, doctors, insurance companies and many others. Our clients value, above all, the ability to analyze data and formulate conclusions based on data from many countries at the same time. They trust the quality and correctness of the information presented. They want to use the experience that IQVIA (and previously IMS Health and Quintiles) has accumulated by being a world leader in the health industry for years. Wherever you see the attribution, “based on IQVIA data from 2022,” this may be data our department collects.
Maciej: Interestingly, it happens that we do not know what research our data will be used for. We ensure its quality and availability, bearing in mind certain applications, and it often turns out that data finds a second and third life, helping in projects that we initially did not even imagine. We work with many teams that deliver projects for a wide variety of user segments, from large pharmaceutical companies to medtech, biotech, insurers, hospitals and national and regional governments.
How did you find your current company?
Piotr: In my case, it is quite an interesting story. Since my studies, I have been interested in news about data warehousing. While working in the UK, I had the opportunity to get to know the Netezza technology, which was quite innovative at the time. Unfortunately, after returning to Poland, it turned out that the only job related to this technology involved trying to sell it in the surrounding markets. However, I primarily wanted to deliver value based on new technologies, not to advertise or build them.
When I was looking for a job, I came across an offer from IMS (a predecessor to IQVIA). The offer was for an Oracle database developer, but a quick check of the profiles of employees in the USA (where the company’s headquarters is located) led me to conclude that somewhere in this company there was a department working with Netezza. I took a risk: I decided to apply for the position, at the same time emphasizing that I was only interested in working with (as it turned out later) the director of the data warehouse department.
I was lucky. I found a manager who saw my passion and connected me with that person. This person in turn agreed to hire an employee in Poland to support the newly established data warehouses in the UK and Germany and ensure the development of technology in the region. IQVIA created a job opening for me that had not previously been planned in this region. And all thanks to the open minds of two people whom I admire to this day.
Maciej: I, too, have dealt with data and analytics all my professional life. For some time, I had broader roles (e.g., as chief technology officer, I was responsible for all aspects of IT, including setting directions for technological development and innovation), but I had the most fun working on data-related topics and the wonders that can be conjured up from them.
For quite a long time, I was looking for a company for which data is the core business and which is, at the same time, technologically advanced and complex enough in terms of business that I could continue to develop in the direction of creating strategies, platforms, products and solutions. IQVIA is such a place. While I admit that I had underestimated the business and technological complexity, this is a company that stimulates me to constantly learn. I have a feeling that it is changing and growing so fast that I will never explore more than a fraction of its possibilities.
What positions did you hold before? How did you start in IT?
Piotr: I started my professional career in the third year of my studies [toward the equivalent of a Bachelor’s degree]. I was hired as a database administrator by a company that, nota bene, was also from the pharmaceutical and related industries. Later, I worked in many other sectors, in sequence: telecommunications, retail, several financial institutions and energy. I even dealt with monitoring network infrastructure and with training on the basics of SQL.
I have worked in Poland, in local companies and international corporations, even in Great Britain in a company that performs data analysis for the American market. Always with databases, with their ever-growing volume. From the first traditional warehouse in Oracle, through the era of the so-called appliance (hardware-assisted database logic) in Teradata, then Netezza. Then as part of IQVIA [with] Cloudera Data Platform, and in recent years, Databricks or Snowflake. I was an analyst for two years and a developer for the next two years. Then I started designing database systems as an application architect — about five years in total.
For the last five years, in turn, I have been working with a greater number of implementation teams at the same time, a bit as an application architect, a bit as a technology expert, a bit as a system architect, and sometimes as operational support.
Maciej: My path was quite typical, at least at the beginning. I started as a developer at Hewlett-Packard in the area of data warehouses and business intelligence, then as a technical leader and solutions architect. Thanks to HP, I got to know the specifics of working in a global corporation, and mainly, for global corporations.
Then I spent a few years in consulting at a small Swiss company, advising clients from the financial industry. Lots of clients, lots of projects, an interesting and rapidly changing environment, but I missed having a real impact on reality. I also missed a feedback loop: do the strategies I define for my clients actually work? The incessant travel around the world also contributed to my return to the more stable world of American corporations.
I performed various roles, such as the aforementioned CTO position, but always related to architecture and technology. However, I have learned that a successful IT solution is much more than good design and perfect components. For several years, I have been trying to make the best use of this knowledge and experience in IQVIA, where for the first time in my professional life I create platforms and solutions that can improve human life and health.
What do you like most about your current job?
Maciej: The aspect of creating solutions that do something more significant than optimizing advertising profits, with no disrespect to the numerous colleagues who do just that. We work at the junction of two very dynamic industries, technology and medicine, and there is no time to stand still; you have to constantly develop both people and our platforms and products. I value the decision-making freedom and flexibility very much, and despite its large size, the company does not impose any artificial restrictions that would hinder the creation of cutting-edge solutions. Fortunately, IQVIA has a huge pool of experts in various fields, so there is always someone to learn from and draw inspiration from.
Piotr: From the very beginning, I appreciated the possibility of working in an international group. Regional and cultural differences give me a great opportunity not only to learn about other people’s habits, but also to look at the world through other people’s eyes and, thus, better understand the reality that surrounds us. I also love working in the area of data warehouses, especially when the data volume is large enough to make effective processing demanding and the results difficult to predict in advance. In addition, I like to give presentations, run training sessions, and share knowledge or insights. Fortunately, my work gives me more and more opportunities to do so every year.
What do you see in the future for analytics and AI in the field of medical services?
Piotr: This is a very interesting question. On one hand, artificial intelligence in medicine has enormous potential, both in the field of knowledge discovery and automation of many expensive processes. As we know well, the amount of data increases exponentially year by year. It is therefore possible to find and process information that is important to us by simple analysis.
On the other hand, it carries a considerable risk, because modern algorithms are less and less able to show clearly and unambiguously how they arrived at their conclusions (at best, the accuracy of the model can be assessed on test data). And while computer-generated conclusions and models are generally correct, the key issue is the quality of the data that feeds the decision-making. Let’s take the patient’s health condition as an example: how do you use a computer to assess whether the patient’s slightly red hand is simple chafing, erythema [a result of injury or irritation] or an allergic reaction?
It is possible that the future of analytics in medicine will depend on how correctly reality is assessed and on the quality of the data provided. After all, it is difficult to come up with correct conclusions based on distorted input data.
Maciej: I fully agree with Piotr. The quality and quick availability of data may not be a sexy topic, but it has a huge impact on the usability of the systems we create. A good example is how little AI has helped clinically in the fight against COVID-19 (despite hundreds of solutions created). If the data is not properly prepared, artificial intelligence learns, for example, to distinguish between standing and lying patients or younger and older patients, instead of between patients more or less at risk of complications. Such mistakes were actually made during the pandemic!
We see more and more solutions based on artificial intelligence approved for clinical applications, for example in radiology or in the development of new drugs. But I would like to highlight two areas that lead to an increase in the usefulness and scope of AI in medicine.
The first is data and privacy protection. We can see very dynamic development here: solutions that allow for training models without access to source data, central data processing, while maintaining full control over who uses [the data], when and for what, creating a distributed network of data owners who conduct research together and improve models without exchanging data, or publishing data so that it is useful but not identifiable. Topics such as federated learning, confidential computing and differential privacy will surely open new doors for the spread of artificial intelligence.
The second is explainability, i.e., the functionality thanks to which the AI model explains to us why it returns a given result (recommendation, prediction, classification, etc.). We see it in AI implementations in a clinical context: solutions that allow medical staff to make decisions on their own (with AI support) are much more useful and valuable. And in order for AI to be a good advisor and helper, it must be able to clearly explain not only what it suggests but also why.
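Of the privacy techniques named above, differential privacy is the easiest to show in a few lines. The Python sketch below applies the standard Laplace mechanism to a counting query: because adding or removing one patient changes a count by at most 1, noise with scale 1/epsilon is added to the true count. The function names, the sample ages and the choice of epsilon are assumptions made for this example only; nothing here reflects how IQVIA publishes data.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two
    independent exponential draws, each with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values matching `predicate`.

    A counting query has sensitivity 1, so the Laplace mechanism
    uses noise with scale 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1 / epsilon, rng)


AGES = [23, 67, 71, 45, 80, 52]  # illustrative values, not real patient data

# Smaller epsilon means more noise and stronger privacy.
noisy = private_count(AGES, lambda age: age >= 65, epsilon=0.5)
print(f"noisy count of patients aged 65+: {noisy:.2f}")
```

Each call returns a different value centered on the true count (3 in this sample), which is what lets aggregate statistics stay useful while limiting what can be inferred about any single patient.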
As the world’s leading human health data and analytics company, IQVIA deals with real-world data — data from sources associated with outcomes in heterogeneous patient populations — and with big data, or extremely large data sets that reveal patterns, trends and associations when analyzed computationally.
To meet our goal of understanding how healthcare works in over 100 countries around the globe, IQVIA draws data from 1 million sources that give us access to 100 billion healthcare records, including 600 million anonymous patient records, in these 100-plus countries. Handling this data requires a massive technology infrastructure.
One part of that massive technology infrastructure is the Data Factory, which processes electronic medical records, or EMRs, including data about pseudonymous patients and doctors, doctor visits, prescriptions, lab tests, diagnoses, immunizations, allergies, problems with healthcare and more. The Data Factory employs more than 100 team members in Poland, France, the U.K. and Germany.
Piotr Kaczor, associate director, IT Architecture, for IQVIA in Warsaw, is responsible for end-to-end healthcare data warehouse design for local IQVIA teams around the world. This includes design and implementation of data processing systems for EMRs and their development (data quality, applications and warehousing, and development of a reference database). His 15 years of experience include working with big data since 2016 and with Oracle, Teradata and IBM PureData for Analytics (Netezza) technology before that.
Maciej Piotrowski, director, IT Architecture, for IQVIA in Warsaw, is responsible for defining the technology strategy for IQVIA’s Real World & Analytics solutions organization, with a focus on data systems. He is responsible for defining and implementing a technological strategy for products in the area of data transformation, analytics and artificial intelligence.