
Open Access


Research Article

Assessing the impact of healthcare research: A systematic review of methodological frameworks

Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

Affiliation Centre for Patient Reported Outcomes Research, Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom


Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Validation, Writing – review & editing

* E-mail: [email protected]

Roles Data curation, Formal analysis, Methodology, Validation, Writing – review & editing

Roles Formal analysis, Methodology, Supervision, Validation, Writing – review & editing



Increasingly, researchers need to demonstrate the impact of their research to their sponsors, funders, and fellow academics. However, the most appropriate way of measuring the impact of healthcare research is subject to debate. We aimed to identify the existing methodological frameworks used to measure healthcare research impact and to summarise the common themes and metrics in an impact matrix.

Methods and findings

Two independent investigators systematically searched the Medical Literature Analysis and Retrieval System Online (MEDLINE), the Excerpta Medica Database (EMBASE), the Cumulative Index to Nursing and Allied Health Literature (CINAHL+), the Health Management Information Consortium, and the Journal of Research Evaluation from inception until May 2017 for publications that presented a methodological framework for research impact. We then summarised the common concepts and themes across methodological frameworks and identified the metrics used to evaluate differing forms of impact. Twenty-four unique methodological frameworks were identified, addressing 5 broad categories of impact: (1) ‘primary research-related impact’, (2) ‘influence on policy making’, (3) ‘health and health systems impact’, (4) ‘health-related and societal impact’, and (5) ‘broader economic impact’. These categories were subdivided into 16 common impact subgroups. Authors of the included publications proposed 80 different metrics aimed at measuring impact in these areas. The main limitation of the study was the potential exclusion of relevant articles, as a consequence of the poor indexing of the databases searched.


The measurement of research impact is an essential exercise to help direct the allocation of limited research resources, to maximise research benefit, and to help minimise research waste. This review provides a collective summary of existing methodological frameworks for research impact, which funders may use to inform the measurement of research impact and researchers may use to inform study design decisions aimed at maximising the short-, medium-, and long-term impact of their research.

Author summary

Why was this study done?

What did the researchers do and find?

What do these findings mean?

Citation: Cruz Rivera S, Kyte DG, Aiyegbusi OL, Keeley TJ, Calvert MJ (2017) Assessing the impact of healthcare research: A systematic review of methodological frameworks. PLoS Med 14(8): e1002370. https://doi.org/10.1371/journal.pmed.1002370

Academic Editor: Mike Clarke, Queens University Belfast, UNITED KINGDOM

Received: February 28, 2017; Accepted: July 7, 2017; Published: August 9, 2017

Copyright: © 2017 Cruz Rivera et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and supporting files.

Funding: Funding was received from Consejo Nacional de Ciencia y Tecnología (CONACYT). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript ( http://www.conacyt.mx/ ).

Competing interests: I have read the journal's policy and the authors of this manuscript have the following competing interests: MJC has received consultancy fees from Astellas and Ferring pharma and travel fees from the European Society of Cardiology outside the submitted work. TJK is in full-time paid employment for PAREXEL International.

Abbreviations: AIHS, Alberta Innovates—Health Solutions; CAHS, Canadian Academy of Health Sciences; CIHR, Canadian Institutes of Health Research; CINAHL+, Cumulative Index to Nursing and Allied Health Literature; EMBASE, Excerpta Medica Database; ERA, Excellence in Research for Australia; HEFCE, Higher Education Funding Council for England; HMIC, Health Management Information Consortium; HTA, Health Technology Assessment; IOM, Impact Oriented Monitoring; MDG, Millennium Development Goal; NHS, National Health Service; MEDLINE, Medical Literature Analysis and Retrieval System Online; PHC RIS, Primary Health Care Research & Information Service; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; PROM, patient-reported outcome measures; QALY, quality-adjusted life year; R&D, research and development; RAE, Research Assessment Exercise; REF, Research Excellence Framework; RIF, Research Impact Framework; RQF, Research Quality Framework; SDG, Sustainable Development Goal; SIAMPI, Social Impact Assessment Methods for research and funding instruments through the study of Productive Interactions between science and society


In 2010, approximately US$240 billion was invested in healthcare research worldwide [ 1 ]. Such research is utilised by policy makers, healthcare providers, and clinicians to make important evidence-based decisions aimed at maximising patient benefit, whilst ensuring that limited healthcare resources are used as efficiently as possible to facilitate effective and sustainable service delivery. It is therefore essential that this research is of high quality and that it is impactful, i.e., that it delivers demonstrable benefits to society and the wider economy whilst minimising research waste [ 1 , 2 ]. Research impact can be defined as ‘any identifiable benefit to, or positive influence on the economy, society, public policy or services, health, the environment, quality of life or academia’ (p. 26) [ 3 ].

There are many purported benefits associated with the measurement of research impact, including the ability to (1) assess the quality of the research and its subsequent benefits to society; (2) inform and influence optimal policy and funding allocation; (3) demonstrate accountability, the value of research in terms of efficiency and effectiveness to the government, stakeholders, and society; and (4) maximise impact through better understanding the concept and pathways to impact [ 4 – 7 ].

Measuring and monitoring the impact of healthcare research has become increasingly common in the United Kingdom [ 5 ], Australia [ 5 ], and Canada [ 8 ], as governments, organisations, and higher education institutions seek a framework to allocate funds to projects that are more likely to bring the most benefit to society and the economy [ 5 ]. For example, in the UK, the 2014 Research Excellence Framework (REF) has recently been used to assess the quality and impact of research in higher education institutions, through the assessment of impact case studies and selected qualitative impact metrics [ 9 ]. This is the first initiative to allocate research funding based on the economic, societal, and cultural impact of research, although it should be noted that research impact only drives a proportion of this allocation (approximately 20%) [ 9 ].

In the UK REF, the measurement of research impact is seen as increasingly important. However, the impact element of the REF has been criticised in some quarters [ 10 , 11 ]. Critics deride the fact that REF impact is determined in a relatively simplistic way, utilising researcher-generated case studies, which commonly attempt to link a particular research outcome to an associated policy or health improvement despite the fact that the wider literature highlights great diversity in the way research impact may be demonstrated [ 12 , 13 ]. This led to the current debate about the optimal method of measuring impact in the future REF [ 10 , 14 ]. The Stern review suggested that research impact should not only focus on socioeconomic impact but should also include impact on government policy, public engagement, academic impacts outside the field, and teaching to showcase interdisciplinary collaborative impact [ 10 , 11 ]. The Higher Education Funding Council for England (HEFCE) has recently set out the proposals for the REF 2021 exercise, confirming that the measurement of such impact will continue to form an important part of the process [ 15 ].

With increasing pressure for healthcare research to lead to demonstrable health, economic, and societal impact, there is a need for researchers to understand existing methodological impact frameworks and the means by which impact may be quantified (i.e., impact metrics; see Box 1 , ‘Definitions’) to better inform research activities and funding decisions. From a researcher’s perspective, understanding the optimal pathways to impact can help inform study design aimed at maximising the impact of the project. At the same time, funders need to understand which aspects of impact they should focus on when allocating awards so they can make the most of their investment and bring the greatest benefit to patients and society [ 2 , 4 , 5 , 16 , 17 ].

Box 1. Definitions

Whilst previous researchers have summarised existing methodological frameworks and impact case studies [ 4 , 22 – 27 ], they have not summarised the metrics for use by researchers, funders, and policy makers. The aim of this review was therefore to (1) identify the methodological frameworks used to measure healthcare research impact using systematic methods, (2) summarise common impact themes and metrics in an impact matrix, and (3) provide a simplified consolidated resource for use by funders, researchers, and policy makers.

Search strategy and selection criteria

Initially, a search strategy was developed to identify the available literature regarding the different methods to measure research impact. The following keywords: ‘Impact’, ‘Framework’, and ‘Research’, and their synonyms, were used during the search of the Medical Literature Analysis and Retrieval System Online (MEDLINE; Ovid) database, the Excerpta Medica Database (EMBASE), the Health Management Information Consortium (HMIC) database, and the Cumulative Index to Nursing and Allied Health Literature (CINAHL+) database (inception to May 2017; see S1 Appendix for the full search strategy). Additionally, the nonindexed Journal of Research Evaluation was hand searched during the same timeframe using the keyword ‘Impact’. Other relevant articles were identified through 3 Internet search engines (Google, Google Scholar, and Google Images) using the keywords ‘Impact’, ‘Framework’, and ‘Research’, with the first 50 results screened. Google Images was searched because different methodological frameworks are summarised in a single image and can easily be identified through this search engine. Finally, additional publications were sought through communication with experts.
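The way the three keyword concepts and their synonyms combine can be illustrated with a minimal sketch. The synonym lists and the `build_query` helper below are hypothetical and serve only as an illustration; the strategy actually used is given in S1 Appendix, and the output is a generic Boolean string rather than the syntax of any particular database.

```python
def build_query(concepts):
    """Join synonyms within a concept with OR, then join the concepts with AND."""
    groups = []
    for terms in concepts:
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# The three concepts come from the text; every synonym is an assumed placeholder.
CONCEPTS = [
    ["impact", "benefit", "payback"],
    ["framework", "model"],
    ["research"],
]

if __name__ == "__main__":
    print(build_query(CONCEPTS))
```

Grouping synonyms with OR before combining the concepts with AND keeps each concept broad whilst still requiring all three to be present in a record.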

Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (see S1 PRISMA Checklist ), 2 independent investigators systematically screened for publications describing, evaluating, or utilising a methodological research impact framework within the context of healthcare research [ 28 ]. Papers were eligible if they included full or partial methodological frameworks or pathways to research impact; both primary research and systematic reviews fitting these criteria were included. We included any methodological framework identified (original or modified versions) at the point of first occurrence. In addition, methodological frameworks were included if they were applicable to the healthcare discipline with no need of modification within their structure. We defined ‘methodological framework’ as ‘a body of methods, rules and postulates employed by a particular procedure or set of procedures (i.e., framework characteristics and development)’ [ 18 ], whereas we defined ‘pathway’ as ‘a way of achieving a specified result; a course of action’ [ 19 ]. Studies were excluded if they presented an existing (unmodified) methodological framework previously available elsewhere, did not explicitly describe a methodological framework but rather focused on a single metric (e.g., bibliometric analysis), focused on the impact or effectiveness of interventions rather than that of the research, or presented case study data only. There were no language restrictions.

Data screening

Records were downloaded into Endnote (version X7.3.1), and duplicates were removed. Two independent investigators (SCR and OLA) conducted all screening following a pilot aimed at refining the process. The records were screened by title and abstract before full-text articles of potentially eligible publications were retrieved for evaluation. A full-text screening identified the publications included for data extraction. Discrepancies were resolved through discussion, with the involvement of a third reviewer (MJC, DGK, or TJK) when necessary.

Data extraction and analysis

Data extraction occurred after the final selection of included articles. SCR and OLA independently extracted details of impact methodological frameworks, the country of origin, and the year of publication, as well as the source, the framework description, and the methodology used to develop the framework. Information regarding the methodology used to develop each methodological framework was also extracted from framework webpages where available. Investigators also extracted details regarding each framework’s impact categories and subgroups, along with their proposed time to impact (‘short-term’, ‘mid-term’, or ‘long-term’) and the details of any metrics that had been proposed to measure impact, which are depicted in an impact matrix. The structure of the matrix was informed by the work of M. Buxton and S. Hanney [ 2 ], P. Buykx et al. [ 5 ], S. Kuruvila et al. [ 29 ], and A. Weiss [ 30 ], with the intention of mapping metrics presented in previous methodological frameworks in a concise way. A consensus meeting with MJC, DGK, and TJK was held to resolve disagreements and finalise the data extraction process.

Included studies

Our original search strategy identified 359 citations from MEDLINE (Ovid), EMBASE, CINAHL+, HMIC, and the Journal of Research Evaluation, and 101 citations were returned using other sources (Google, Google Images, Google Scholar, and expert communication) (see Fig 1 ) [ 28 ]. In total, we retrieved 54 full-text articles for review. At this stage, 39 articles were excluded, as they did not propose new or modified methodological frameworks. An additional 15 articles were included following the backward and forward citation method. A total of 31 relevant articles were included in the final analysis, of which 24 were articles presenting unique frameworks and the remaining 7 were systematic reviews [ 4 , 22 – 27 ]. The search strategy was rerun on 15 May 2017. A further 19 publications were screened, and 2 were taken forward to full-text screening but were ineligible for inclusion.



Methodological framework characteristics

The characteristics of the 24 included methodological frameworks are summarised in Table 1 , ‘Methodological framework characteristics’. Fourteen publications proposed academic-orientated frameworks, which focused on measuring academic, societal, economic, and cultural impact using narrative and quantitative metrics [ 2 , 3 , 5 , 8 , 29 , 31 – 39 ]. Five publications focused on assessing the impact of research by focusing on the interaction process between stakeholders and researchers (‘productive interactions’), which is a requirement to achieve research impact. This approach tries to address the issue of attributing research impact to metrics [ 7 , 40 – 43 ]. Two frameworks focused on the importance of partnerships between researchers and policy makers, as a core element to accomplish research impact [ 44 , 45 ]. An additional 2 frameworks focused on evaluating the pathways to impact, i.e., linking processes between research and impact [ 30 , 46 ]. One framework assessed the ability of health technology to influence efficiency of healthcare systems [ 47 ]. Eight frameworks were developed in the UK [ 2 , 3 , 29 , 37 , 39 , 42 , 43 , 45 ], 6 in Canada [ 8 , 33 , 34 , 44 , 46 , 47 ], 4 in Australia [ 5 , 31 , 35 , 38 ], 3 in the Netherlands [ 7 , 40 , 41 ], and 2 in the United States [ 30 , 36 ], with 1 model developed with input from various countries [ 32 ].



Methodological framework development

The included methodological frameworks varied in their development process, but there were some common approaches employed. Most included a literature review [ 2 , 5 , 7 , 8 , 31 , 33 , 36 , 37 , 40 – 46 ], although none of them used a recognised systematic method. Most also consulted with various stakeholders [ 3 , 8 , 29 , 31 , 33 , 35 – 38 , 43 , 44 , 46 , 47 ] but used differing methods to incorporate their views, including quantitative surveys [ 32 , 35 , 43 , 46 ], face-to-face interviews [ 7 , 29 , 33 , 35 , 37 , 42 , 43 ], telephone interviews [ 31 , 46 ], consultation [ 3 , 7 , 36 ], and focus groups [ 39 , 43 ]. A range of stakeholder groups were approached across the sample, including principal investigators [ 7 , 29 , 43 ], research end users [ 7 , 42 , 43 ], academics [ 3 , 8 , 39 , 40 , 43 , 46 ], award holders [ 43 ], experts [ 33 , 38 , 39 ], sponsors [ 33 , 39 ], project coordinators [ 32 , 42 ], and chief investigators [ 31 , 35 ]. However, some authors failed to identify the stakeholders involved in the development of their frameworks [ 2 , 5 , 34 , 41 , 45 ], making it difficult to assess their appropriateness. In addition, only 4 of the included papers reported using formal analytic methods to interpret stakeholder responses. These included the Canadian Academy of Health Sciences framework, which used conceptual cluster analysis [ 33 ]. The Research Contribution [ 42 ], Research Impact [ 29 ], and Primary Health Care Research & Information Service [ 31 ] frameworks used a thematic analysis approach. Finally, some authors went on to pilot their framework, and the pilot results informed refinements to the methodological frameworks until they were approved. Methods used to pilot the frameworks included a case study approach [ 2 , 3 , 30 , 32 , 33 , 36 , 40 , 42 , 44 , 45 ], contrasting results against available literature [ 29 ], the use of stakeholders’ feedback [ 7 ], and assessment tools [ 35 , 46 ].

Major impact categories

1. Primary research-related impact.

A number of methodological frameworks advocated the evaluation of ‘research-related impact’. This encompassed content related to the generation of new knowledge, knowledge dissemination, capacity building, training, leadership, and the development of research networks. These outcomes were considered the direct or primary impacts of a research project, as these are often the first evidenced returns [ 30 , 62 ].

A number of subgroups were identified within this category, with frameworks supporting the collection of impact data across the following constructs: ‘research and innovation outcomes’; ‘dissemination and knowledge transfer’; ‘capacity building, training, and leadership’; and ‘academic collaborations, research networks, and data sharing’.

1.1. Research and innovation outcomes. Twenty of the 24 frameworks advocated the evaluation of ‘research and innovation outcomes’ [ 2 , 3 , 5 , 7 , 8 , 29 – 39 , 41 , 43 , 44 , 46 ]. This subgroup included the following metrics: number of publications; number of peer-reviewed articles (including journal impact factor); citation rates; requests for reprints, number of reviews, and meta-analyses; and new products or changes to existing products (interventions or technology), patents, and research. Additionally, some frameworks also sought to gather information regarding ‘methods/methodological contributions’. These advocated the collection of systematic reviews and appraisals in order to identify gaps in knowledge and determine whether the knowledge generated had been assessed before being put into practice [ 29 ].

1.2. Dissemination and knowledge transfer. Nineteen of the 24 frameworks advocated the assessment of ‘dissemination and knowledge transfer’ [ 2 , 3 , 5 , 7 , 29 – 32 , 34 – 43 , 46 ]. This comprised collection of the following information: number of conferences, seminars, workshops, and presentations; teaching output (i.e., number of lectures given to disseminate the research findings); number of reads for published articles; article download rate and number of journal webpage visits; and citation rates in nonjournal media such as newspapers and mass and social media (e.g., Twitter and blogs). Furthermore, this impact subgroup considered the measurement of research uptake and translatability and the adoption of research findings in technological and clinical applications and by different fields. These can be measured through patents, clinical trials, and partnerships between industry and business, government and nongovernmental organisations, and university research units and researchers [ 29 ].

1.3. Capacity building, training, and leadership. Fourteen of 24 frameworks suggested the evaluation of ‘capacity building, training, and leadership’ [ 2 , 3 , 5 , 8 , 29 , 31 – 35 , 39 – 41 , 43 ]. This involved collecting information regarding the number of doctoral and postdoctoral studentships (including those generated as a result of the research findings and those appointed to conduct the research), as well as the number of researchers and research-related staff involved in the research projects. In addition, authors advocated the collection of ‘leadership’ metrics, including the number of research projects managed and coordinated and the membership of boards and funding bodies, journal editorial boards, and advisory committees [ 29 ]. Additional metrics in this category included public recognition (number of fellowships and awards for significant research achievements), academic career advancement, and subsequent grants received. Lastly, the impact metric ‘research system management’ comprised the collection of information that can lead to preserving the health of the population, such as modifying research priorities, resource allocation strategies, and linking health research to other disciplines to maximise benefits [ 29 ].

1.4. Academic collaborations, research networks, and data sharing. Lastly, 10 of the 24 frameworks advocated the collection of impact data regarding ‘academic collaborations (internal and external collaborations to complete a research project), research networks, and data sharing’ [ 2 , 3 , 5 , 7 , 29 , 34 , 37 , 39 , 41 , 43 ].

2. Influence on policy making.

Methodological frameworks addressing this major impact category focused on measurable improvements within a given knowledge base and on interactions between academics and policy makers, which may influence policy-making development and implementation. The returns generated in this impact category are generally considered as intermediate or midterm (1 to 3 years). These represent an important interim stage in the process towards the final expected impacts, such as quantifiable health improvements and economic benefits, without which policy change may not occur [ 30 , 62 ]. The following impact subgroups were identified within this category: ‘type and nature of policy impact’, ‘level of policy making’, and ‘policy networks’.

2.1. Type and nature of policy impact. The most common impact subgroup, mentioned in 18 of the 24 frameworks, was ‘type and nature of policy impact’ [ 2 , 7 , 29 – 38 , 41 – 43 , 45 – 47 ]. Methodological frameworks addressing this subgroup stressed the importance of collecting information regarding the influence of research on policy (i.e., changes in practice or terminology). For instance, a project looking at trafficked adolescents and women (2003) influenced the WHO guidelines (2003) on ethics regarding this particular group [ 17 , 21 , 63 ].

2.2. Level of policy impact. Thirteen of 24 frameworks addressed aspects surrounding the need to record the ‘level of policy impact’ (international, national, or local) and the organisations within a level that were influenced (local policy makers, clinical commissioning groups, and health and wellbeing trusts) [ 2 , 5 , 8 , 29 , 31 , 34 , 38 , 41 , 43 – 47 ]. Authors considered it important to measure the ‘level of policy impact’ to provide evidence of collaboration, coordination, and efficiency within health organisations and between researchers and health organisations [ 29 , 31 ].

2.3. Policy networks. Five methodological frameworks highlighted the need to collect information regarding collaborative research with industry and staff movement between academia and industry [ 5 , 7 , 29 , 41 , 43 ]. A policy network emphasises the relationship between policy communities, researchers, and policy makers. This relationship can influence and lead to incremental changes in policy processes [ 62 ].

3. Health and health systems impact.

A number of methodological frameworks advocated the measurement of impacts on health and healthcare systems across the following impact subgroups: ‘quality of care and service delivery’, ‘evidence-based practice’, ‘improved information and health information management’, ‘cost containment and cost-effectiveness’, ‘resource allocation’, and ‘health workforce’.

3.1. Quality of care and service delivery. Twelve of the 24 frameworks highlighted the importance of evaluating ‘quality of care and service delivery’ [ 2 , 5 , 8 , 29 – 31 , 33 – 36 , 41 , 47 ]. There were a number of suggested metrics that could be potentially used for this purpose, including health outcomes such as quality-adjusted life years (QALYs), patient-reported outcome measures (PROMs), patient satisfaction and experience surveys, and qualitative data on waiting times and service accessibility.

3.2. Evidence-based practice. ‘Evidence-based practice’, mentioned in 5 of the 24 frameworks, refers to making changes in clinical diagnosis, clinical practice, treatment decisions, or decision making based on research evidence [ 5 , 8 , 29 , 31 , 33 ]. The suggested metrics to demonstrate evidence-based practice were adoption of health technologies and research outcomes to improve the healthcare systems and inform policies and guidelines [ 29 ].

3.3. Improved information and health information management. This impact subcategory, mentioned in 5 of the 24 frameworks, refers to the influence of research on the provision of health services and management of the health system to prevent additional costs [ 5 , 29 , 33 , 34 , 38 ]. Methodological frameworks advocated the collection of health system financial, nonfinancial (i.e., transport and sociopolitical implications), and insurance information in order to determine constraints within a health system.

3.4. Cost containment and cost-effectiveness. Six of the 24 frameworks advocated the subcategory ‘cost containment and cost-effectiveness’ [ 2 , 5 , 8 , 17 , 33 , 36 ]. ‘Cost containment’ comprised the collection of information regarding how research has influenced the provision and management of health services and its implication in healthcare resource allocation and use [ 29 ]. ‘Cost-effectiveness’ refers to information concerning economic evaluations to assess improvements in effectiveness and health outcomes; for instance, the cost-effectiveness (cost and health outcome benefits) assessment of introducing a new health technology to replace an older one [ 29 , 31 , 64 ].

3.5. Resource allocation. ‘Resource allocation’, mentioned in 6 frameworks, can be measured through 2 impact metrics: new funding attributed to the intervention in question and equity while allocating resources, such as improved allocation of resources at an area level; better targeting, accessibility, and utilisation; and coverage of health services [ 2 , 5 , 29 , 31 , 45 , 47 ]. The allocation of resources and targeting can be measured through health services research reports, with the utilisation of health services measured by the probability of providing an intervention when needed, the probability of requiring it again in the future, and the probability of receiving an intervention based on previous experience [ 29 , 31 ].

3.6. Health workforce. Lastly, ‘health workforce’, present in 3 methodological frameworks, refers to the reduction in the days of work lost because of a particular illness [ 2 , 5 , 31 ].

4. Health-related and societal impact.

Three subgroups were included in this category: ‘health literacy’; ‘health knowledge, attitudes, and behaviours’; and ‘improved social equity, inclusion, or cohesion’.

4.1. Health knowledge, attitudes, and behaviours. Eight of the 24 frameworks suggested the assessment of ‘health knowledge, attitudes, behaviours, and outcomes’, which could be measured through the evaluation of levels of public engagement with science and research (e.g., National Health Service (NHS) Choices end-user visit rate) or by using focus groups to analyse changes in knowledge, attitudes, and behaviour among society [ 2 , 5 , 29 , 33 – 35 , 38 , 43 ].

4.2. Improved equity, inclusion, or cohesion and human rights. Other methodological frameworks, 4 of the 24, suggested capturing improvements in equity, inclusion, or cohesion and human rights. Authors suggested these could be measured using a resource such as the United Nations Millennium Development Goals (MDGs) (superseded by Sustainable Development Goals [SDGs] in 2015) and human rights [ 29 , 33 , 34 , 38 ]. For instance, a cluster-randomised controlled trial in Nepal, which had female participants, demonstrated a reduction in neonatal mortality through the introduction of maternity health care, distribution of delivery kits, and home visits. This illustrates how research can target vulnerable and disadvantaged groups. Additionally, this research has been introduced by the World Health Organization to achieve the MDG ‘improve maternal health’ [ 16 , 29 , 65 ].

4.3. Health literacy. Some methodological frameworks, 3 of the 24, focused on tracking changes in the ability of patients to make informed healthcare decisions, reduce health risks, and improve quality of life, which were demonstrably linked to a particular programme of research [ 5 , 29 , 43 ]. For example, a systematic review showed that when HIV health literacy/knowledge is spread among people living with the condition, antiretroviral adherence and quality of life improve [ 66 ].

5. Broader economic impacts.

Some methodological frameworks, 9 of 24, included aspects related to the broader economic impacts of health research, for example, the economic benefits emerging from the commercialisation of research outputs [ 2 , 5 , 29 , 31 , 33 , 35 , 36 , 38 , 67 ]. Suggested metrics included the amount of funding for research and development (R&D) that was competitively awarded by the NHS, medical charities, and overseas companies. Additional metrics were income from intellectual property, spillover effects (any secondary benefit gained as a repercussion of investing directly in a primary activity, i.e., the social and economic returns of investing in R&D) [ 33 ], patents granted, licences awarded and brought to the market, the development and sales of spinout companies, research contracts, and income from industry.

The benefits contained within the categories ‘health and health systems impact’, ‘health-related and societal impact’, and ‘broader economic impacts’ are considered the expected and final returns of the resources allocated in healthcare research [ 30 , 62 ]. These benefits commonly arise in the long term, beyond 5 years according to some authors, but there was a recognition that this could differ depending on the project and its associated research area [ 4 ].

Data synthesis

Five major impact categories were identified across the 24 included methodological frameworks: (1) ‘primary research-related impact’, (2) ‘influence on policy making’, (3) ‘health and health systems impact’, (4) ‘health-related and societal impact’, and (5) ‘broader economic impact’. These major impact categories were further subdivided into 16 impact subgroups. The included publications proposed 80 different metrics to measure research impact. This impact typology synthesis is depicted in ‘the impact matrix’ ( Fig 2 and Fig 3 ).


CIHR, Canadian Institutes of Health Research; HTA, Health Technology Assessment; PHC RIS, Primary Health Care Research & Information Service; RAE, Research Assessment Exercise; RQF, Research Quality Framework.



AIHS, Alberta Innovates – Health Solutions; CAHS, Canadian Academy of Health Sciences; IOM, Impact Oriented Monitoring; REF, Research Excellence Framework; SIAMPI, Social Impact Assessment Methods for research and funding instruments through the study of Productive Interactions between science and society.


Commonality and differences across frameworks

The ‘Research Impact Framework’ and the ‘Health Services Research Impact Framework’ were the models that encompassed the largest number of the metrics extracted. The most dominant methodological framework was the Payback Framework; 7 other methodological framework models used the Payback Framework as a starting point for development [ 8 , 29 , 31 – 35 ]. Additional methodological frameworks that were commonly incorporated into other tools included the CIHR framework, the CAHS model, the AIHS framework, and the Exchange model [ 8 , 33 , 34 , 44 ]. The capture of ‘research-related impact’ was the most widely advocated concept across methodological frameworks, illustrating the importance with which primary short-term impact outcomes were viewed by the included papers. Thus, measurement of impact via number of publications, citations, and peer-reviewed articles was the most common. ‘Influence on policy making’ was the predominant midterm impact category, specifically the subgroup ‘type and nature of policy impact’, in which frameworks advocated the measurement of (i) changes to legislation, regulations, and government policy; (ii) influence and involvement in decision-making processes; and (iii) changes to clinical or healthcare training, practice, or guidelines. Within more long-term impact measurement, the evaluations of changes in the ‘quality of care and service delivery’ were commonly advocated.

In light of the commonalities and differences among the methodological frameworks, and building on the data extraction and the construction of the impact matrix, the ‘pathways to research impact’ diagram ( Fig 4 ) was developed to provide researchers, funders, and policy makers a more comprehensive and exhaustive way to measure healthcare research impact. The diagram collates all the impact metrics proposed by the 24 included frameworks, groups them into impact subgroups, and groups these into broader impact categories. Prospectively, this global picture will help researchers, funders, and policy makers plan strategies to achieve multiple pathways to impact before carrying the research out.


NHS, National Health Service; PROM, patient-reported outcome measure; QALY, quality-adjusted life year; R&D, research and development.


This review has summarised existing methodological impact frameworks together for the first time using systematic methods ( Fig 4 ). It allows researchers and funders to consider pathways to impact at the design stage of a study and to understand the elements and metrics that need to be considered to facilitate prospective assessment of impact. Users do not necessarily need to cover all the aspects of the methodological framework, as every research project can impact on different categories and subgroups. This review provides information that can assist researchers to better demonstrate impact, potentially increasing the likelihood of conducting impactful research and reducing research waste. Existing reviews have not presented a methodological framework that includes different pathways to impact, health impact categories, subgroups, and metrics in a single methodological framework.

Academic-orientated frameworks included in this review advocated the measurement of impact predominantly using so-called ‘quantitative’ metrics, such as the number of peer-reviewed articles, journal impact factor, and citation rates. This may be because they are well-established measures, relatively easy to capture and objective, and are supported by research funding systems. However, these metrics primarily measure the dissemination of research findings rather than their impact [ 30 , 68 ]. Whilst it is true that wider dissemination, especially when delivered via world-leading international journals, may well lead eventually to changes in healthcare, this is by no means certain. For instance, case studies evaluated by Flinders University of Australia demonstrated that some research projects with non-peer-reviewed publications led to significant changes in health policy, whilst the studies with peer-reviewed publications did not result in any type of impact [ 68 ]. As a result, contemporary literature has tended to advocate the collection of information regarding a variety of different potential forms of impact alongside publication/citations metrics [ 2 , 3 , 5 , 7 , 8 , 29 – 47 ], as outlined in this review.

The 2014 REF exercise adjusted UK university research funding allocation based on evidence of the wider impact of research (through case narrative studies and quantitative metrics), rather than simply according to the quality of research [ 12 ]. The intention was to ensure funds were directed to high-quality research that could demonstrate actual realised benefit. The inclusion of a mixed-method approach to the measurement of impact in the REF (narrative and quantitative metrics) reflects a widespread belief (expressed by the majority of authors of the included methodological frameworks in the review) that individual quantitative impact metrics (e.g., number of citations and publications) do not necessarily capture the complexity of the relationships involved in a research project and may exclude measurement of specific aspects of the research pathway [ 10 , 12 ].

Many of the frameworks included in this review advocated the collection of a range of academic, societal, economic, and cultural impact metrics; this is consistent with recent recommendations from the Stern review [ 10 ]. However, a number of these metrics encounter research ‘lag’: i.e., the time between the point at which the research is conducted and when the actual benefits arise [ 69 ]. For instance, some cardiovascular research has taken up to 25 years to generate impact [ 70 ]. Likewise, the impact may not arise exclusively from a single piece of research. Different processes (such as networking interactions and knowledge and research translation) and multiple individuals and organisations are often involved [ 4 , 71 ]. Therefore, attributing the contribution made by each of the different actors involved in the process can be a challenge [ 4 ]. An additional problem associated with attribution is the lack of evidence to link research and impact. The outcomes of research may emerge slowly and be absorbed gradually. Consequently, it is difficult to determine the influence of research in the development of a new policy, practice, or guidelines [ 4 , 23 ].

A further problem is that impact evaluation is conducted ‘ex post’, after the research has concluded. Collecting information retrospectively can be an issue, as the data required might not be available. ‘Ex ante’ assessment is vital for funding allocation, as it is necessary to determine the potential forthcoming impact before research is carried out [ 69 ]. Additionally, ex ante evaluation of potential benefit can overcome the issues regarding identifying and capturing evidence, which can be used in the future [ 4 ]. In order to conduct ex ante evaluation of potential benefit, some authors suggest the early involvement of policy makers in a research project coupled with a well-designed strategy of dissemination [ 40 , 69 ].

Providing an alternate view, the authors of methodological frameworks such as the SIAMPI, Contribution Mapping, Research Contribution, and the Exchange model suggest that the problems of attribution are a consequence of assigning the impact of research to a particular impact metric [ 7 , 40 , 42 , 44 ]. To address these issues, these authors propose focusing on the contribution of research through assessing the processes and interactions between stakeholders and researchers, which arguably take into consideration all the processes and actors involved in a research project [ 7 , 40 , 42 , 43 ]. Additionally, contributions highlight the importance of the interactions between stakeholders and researchers from an early stage in the research process, leading to a successful ex ante and ex post evaluation by setting expected impacts and determining how the research outcomes have been utilised, respectively [ 7 , 40 , 42 , 43 ]. However, contribution metrics are generally harder to measure in comparison to academic-orientated indicators [ 72 ].

Currently, there is a debate surrounding the optimal methodological impact framework, and no tool has proven superior to another. The most appropriate methodological framework for a given study will likely depend on stakeholder needs, as each employs different methodologies to assess research impact [ 4 , 37 , 41 ]. This review allows researchers to select individual existing methodological framework components to create a bespoke tool with which to facilitate optimal study design and maximise the potential for impact depending on the characteristics of their study ( Fig 2 and Fig 3 ). For instance, if researchers are interested in assessing how influential their research is on policy making, a suite of appropriate metrics drawn from multiple methodological frameworks may provide a more comprehensive method than adopting a single methodological framework. In addition, research teams may wish to use a multidimensional approach to methodological framework development, adopting existing narratives and quantitative metrics, as well as elements from contribution frameworks. This approach would arguably present a more comprehensive method of impact assessment; however, further research is warranted to determine its effectiveness [ 4 , 69 , 72 , 73 ].

Finally, it became clear during this review that the included methodological frameworks had been constructed using varied methodological processes. At present, there are no guidelines or consensus around the optimal pathway that should be followed to develop a robust methodological framework. The authors believe this is an area that should be addressed by the research community, to ensure future frameworks are developed using best-practice methodology.

For instance, the Payback Framework drew upon a literature review and was refined through a case study approach. Arguably, this approach could be considered inferior to other methods that involved extensive stakeholder involvement, such as the CIHR framework [ 8 ]. Nonetheless, 7 methodological frameworks were developed based upon the Payback Framework [ 8 , 29 , 31 – 35 ].


The present review is the first to summarise systematically existing impact methodological frameworks and metrics. The main limitation is that 50% of the included publications were found through methods other than bibliographic database searching, indicating poor indexing. Therefore, some relevant articles may not have been included in this review if they failed to indicate the inclusion of a methodological impact framework in their title/abstract. We did, however, make every effort to try to find these potentially hard-to-reach publications, e.g., through forwards/backwards citation searching, hand searching reference lists, and expert communication. Additionally, this review only extracted information regarding the methodology followed to develop each framework from the main publication source or framework webpage. Therefore, further evaluations may not have been included, as they are beyond the scope of the current paper. A further limitation was that although our search strategy did not include language restrictions, we did not specifically search non-English language databases. Thus, we may have failed to identify potentially relevant methodological frameworks that were developed in a non-English language setting.

In conclusion, the measurement of research impact is an essential exercise to help direct the allocation of limited research resources, to maximise benefit, and to help minimise research waste. This review provides a collective summary of existing methodological impact frameworks and metrics, which funders may use to inform the measurement of research impact and researchers may use to inform study design decisions aimed at maximising the short-, medium-, and long-term impact of their research.

Supporting information

S1 Appendix. Search strategy.


S1 PRISMA Checklist. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist.



We would also like to thank Mrs Susan Bayliss, Information Specialist, University of Birmingham, and Mrs Karen Biddle, Research Secretary, University of Birmingham.

The role of artificial intelligence in healthcare: a structured literature review

BMC Medical Informatics and Decision Making volume 21, Article number: 125 (2021)



Artificial intelligence (AI) in the healthcare sector is receiving attention from researchers and health professionals. Few previous studies have investigated this topic from a multi-disciplinary perspective, including accounting, business and management, decision sciences and health professions.

The structured literature review with its reliable and replicable research protocol allowed the researchers to extract 288 peer-reviewed papers from Scopus. The authors used qualitative and quantitative variables to analyse authors, journals, keywords, and collaboration networks among researchers. Additionally, the paper benefited from the Bibliometrix R software package.

The investigation showed that the literature in this field is emerging. It focuses on health services management, predictive medicine, patient data and diagnostics, and clinical decision-making. The United States, China, and the United Kingdom contributed the highest number of studies. Keyword analysis revealed that AI can support physicians in making a diagnosis, predicting the spread of diseases and customising treatment paths.


The literature reveals several AI applications for health services and a stream of research that has not fully been covered. For instance, AI projects require skills and data quality awareness for data-intensive analysis and knowledge-based management. Insights can help researchers and health professionals understand and address future research on AI in the healthcare field.


Artificial intelligence (AI) generally applies to computational technologies that emulate mechanisms assisted by human intelligence, such as thought, deep learning, adaptation, engagement, and sensory understanding [ 1 , 2 ]. Some devices can execute a role that typically involves human interpretation and decision-making [ 3 , 4 ]. These techniques have an interdisciplinary approach and can be applied to different fields, such as medicine and health. AI has been involved in medicine since as early as the 1950s, when physicians made the first attempts to improve their diagnoses using computer-aided programs [ 5 , 6 ]. Interest and advances in medical AI applications have surged in recent years due to the substantially enhanced computing power of modern computers and the vast amount of digital data available for collection and utilisation [ 7 ]. AI is gradually changing medical practice. There are several AI applications in medicine that can be used in a variety of medical fields, such as clinical, diagnostic, rehabilitative, surgical, and predictive practices. Another critical area of medicine where AI is making an impact is clinical decision-making and disease diagnosis. AI technologies can ingest, analyse, and report large volumes of data across different modalities to detect disease and guide clinical decisions [ 3 , 8 ]. AI applications can deal with the vast amount of data produced in medicine and find new information that would otherwise remain hidden in the mass of medical big data [ 9 , 10 , 11 ]. These technologies can also identify new drugs for health services management and patient care treatments [ 5 , 6 ].

Courage in the application of AI is visible through a search in the primary research databases. However, as Meskò et al. [ 7 ] find, the technology will potentially reduce care costs and repetitive operations by focusing the medical profession on critical thinking and clinical creativity. As Cho et al. and Doyle et al. [ 8 , 9 ] add, the AI perspective is exciting; however, new studies will be needed to establish the efficacy and applications of AI in the medical field [ 10 ].

Our paper will also concentrate on AI strategies for healthcare from the accounting, business, and management perspectives. The authors used the structured literature review (SLR) method for its reliable and replicable research protocol [ 11 ] and selected bibliometric variables as sources of investigation. Bibliometric usage enables the recognition of the main quantitative variables of the study stream [ 12 ]. This method facilitates the detection of the required details of a particular research subject, including field authors, number of publications, keywords for interaction between variables (policies, properties and governance) and country data [ 13 ]. It also allows the application of the science mapping technique [ 14 ]. Our paper adopted the Bibliometrix R package and the biblioshiny web interface as tools of analysis [ 14 ].

The investigation offers the following insights for future researchers and practitioners:

Bibliometric information on 288 peer-reviewed English papers from the Scopus collection.

Identification of leading journals in this field, such as Journal of Medical Systems, Studies in Health Technology and Informatics, IEEE Journal of Biomedical and Health Informatics, and Decision Support Systems.

Qualitative and quantitative information on authors’ Lotka’s law, h-index, g-index, m-index, keyword, and citation data.

Research on specific countries to assess AI in the delivery and effectiveness of healthcare, citations, and networks within each region.

A topic dendrogram study that identifies five research clusters: health services management, predictive medicine, patient data, diagnostics, and finally, clinical decision-making.

An in-depth discussion that develops theoretical and practical implications for future studies.

The paper is organised as follows. Section  2 lists the main bibliometric articles in this field. Section  3 elaborates on the methodology. Section  4 presents the findings of the bibliometric analysis. Section  5 discusses the main elements of AI in healthcare based on the study results. Section  6 concludes the article with future implications for research.

Related works and originality

As suggested by Zupic and Čater [ 15 ], a research stream can be evaluated with bibliometric methods that can introduce objectivity and mitigate researcher bias. For this reason, bibliometric methods are attracting increasing interest among researchers as a reliable and impersonal research analytical approach [ 16 , 17 ]. Recently, bibliometrics has been an essential method for analysing and predicting research trends [ 18 ]. Table  1 lists other research that has used a similar approach in the research stream investigated.

The scientific articles reported show substantial differences in keywords and research topics that have been previously studied. The bibliometric analysis of Huang et al. [ 19 ] describes rehabilitative medicine using virtual reality technology. According to the authors, the primary goal of rehabilitation is to enhance and restore functional ability and quality of life for patients with physical impairments or disabilities. In recent years, many healthcare disciplines have been privileged to access various technologies that provide tools for both research and clinical intervention.

Hao et al. [ 20 ] focus on text mining in medical research. As reported, text mining reveals new, previously unknown information by using a computer to automatically extract information from different text resources. Text mining methods can be regarded as an extension of data mining to text data. Text mining is playing an increasingly significant role in processing medical information. Similarly, the studies by dos Santos et al. [ 21 ] focus on applying data mining and machine learning (ML) techniques to public health problems. As stated in this research, public health may be defined as the art and science of preventing diseases, promoting health, and prolonging life. Using data mining and ML techniques, it is possible to discover new information that otherwise would be hidden. These two studies are related to another topic: medical big data. According to Liao et al. [ 22 ], big data is a typical “buzzword” in the business and research community, referring to a great mass of digital data collected from various sources. In the medical field, we can obtain a vast amount of data (i.e., medical big data). Data mining and ML techniques can help deal with this information and provide helpful insights for physicians and patients. More recently, Choudhury et al. [ 23 ] provide a systematic review on the use of ML to improve the care of elderly patients, demonstrating eligible studies primarily in psychological disorders and eye diseases.

Tran et al. [ 2 ] focus on the global evolution of AI research in medicine. Their bibliometric analysis highlights trends and topics related to AI applications and techniques. As stated in Connelly et al.’s [ 24 ] study, robot-assisted surgeries have rapidly increased in recent years. Their bibliometric analysis demonstrates how robotic-assisted surgery has gained acceptance in different medical fields, such as urological, colorectal, cardiothoracic, orthopaedic, maxillofacial and neurosurgery applications. Additionally, the bibliometric analysis of Guo et al. [ 25 ] provides an in-depth study of AI publications through December 2019. The paper focuses on tangible AI health applications, giving researchers an idea of how algorithms can help doctors and nurses. A new stream of research related to AI is also emerging. In this sense, Choudhury and Asan’s [ 26 ] scientific contribution provides a systematic review of the AI literature to identify health risks for patients. They report on 53 studies involving technology for clinical alerts, clinical reports, and drug safety. Considering the considerable interest within this research stream, this analysis differs from the current literature for several reasons. It aims to provide in-depth discussion, considering mainly the business, management, and accounting fields and not dealing only with medical and health profession publications.

Additionally, our analysis aims to provide a bibliometric analysis of variables such as authors, countries, citations and keywords to guide future research perspectives for researchers and practitioners, as similar analyses have done for several publications in other research streams [ 15 , 16 , 27 ]. In doing so, we use a different database, Scopus, that is typically adopted in social sciences fields. Finally, our analysis will propose and discuss a dominant framework of variables in this field, and our analysis will not be limited to AI application descriptions.


This paper evaluated AI in healthcare research streams using the SLR method [ 11 ]. As suggested by Massaro et al. [ 11 ], an SLR enables the study of the scientific corpus of a research field, including the scientific rigour, reliability and replicability of operations carried out by researchers. As suggested by many scholars, the methodology allows qualitative and quantitative variables to highlight the best authors, journals and keywords and combine a systematic literature review and bibliometric analysis [ 27 , 28 , 29 , 30 ]. Despite its widespread use in business and management [ 16 , 31 ], the SLR is also used in the health sector based on the same philosophy through which it was originally conceived [ 32 , 33 ]. A methodological analysis of previously published articles reveals that the most frequently used steps are as follows [ 28 , 31 , 34 ]:

defining research questions;

writing the research protocol;

defining the research sample to be analysed;

developing codes for analysis; and

critically analysing, discussing, and identifying a future research agenda.

Considering the above premises, the authors believe that an SLR is the best method because it combines scientific validity, replicability of the research protocol and connection between multiple inputs.

As stated by the methodological paper, the first step is research question identification. For this purpose, we benefit from the analysis of Zupic and Čater [ 15 ], who provide several research questions for future researchers to link the study of authors, journals, keywords and citations. Therefore, RQ1 is “What are the most prominent authors, journal keywords and citations in the field of the research study?” Additionally, as suggested by Haleem et al. [ 35 ], new technologies, including AI, are changing the medical field in unexpected timeframes, requiring studies in multiple areas. Therefore, RQ2 is “How does artificial intelligence relate to healthcare, and what is the focus of the literature?” Then, as discussed by Massaro et al. [ 36 ], RQ3 is “What are the research applications of artificial intelligence for healthcare?”.

The first research question aims to define the qualitative and quantitative variables of the knowledge flow under investigation. The second research question seeks to determine the state of the art and applications of AI in healthcare. Finally, the third research question aims to help researchers identify practical and theoretical implications and future research ideas in this field.

The second fundamental step of the SLR is writing the research protocol [ 11 ]. Table  2 indicates the currently known literature elements, uniquely identifying the research focus, motivations and research strategy adopted and the results providing a link with the following points. Additionally, to strengthen the analysis, our investigation benefits from the PRISMA statement methodological article [ 37 ]. Although the SLR is a validated method for systematic reviews and meta-analyses, we believe that the workflow provided may benefit the replicability of the results [ 37 , 38 , 39 , 40 ]. Figure  1 summarises the researchers’ research steps, indicating that there are no results that can be referred to as a meta-analysis.

Fig. 1 PRISMA workflow. Source: Authors’ elaboration on Liberati et al. [ 37 ]

The third step is to specify the search strategy and search database. Our analysis is based on the search string “Artificial Intelligence” OR “AI” AND “Healthcare” with a focus on “Business, Management, and Accounting”, “Decision Sciences”, and “Health professions”. As suggested by [ 11 , 41 ] and motivated by [ 42 ], keywords can be selected through a top-down approach by identifying a large search field and then focusing on particular sub-topics. The paper uses data retrieved from the Scopus database, a multi-disciplinary database, which allowed the researchers to identify critical articles for scientific analysis [ 43 ]. Additionally, Scopus was selected based on Guo et al.’s [ 25 ] limitations, which suggest that “future studies will apply other databases, such as Scopus, to explore more potential papers” . The research focuses on articles and reviews published in peer-reviewed journals for their scientific relevance [ 11 , 16 , 17 , 29 ] and does not include the grey literature, conference proceedings or books/book chapters. Articles written in any language other than English were excluded [ 2 ]. For transparency and replicability, the analysis was conducted on 11 January 2021. Using this research strategy, the authors retrieved 288 articles. To strengthen the study's reliability, we publicly provide the full bibliometric extract on the Zenodo repository [ 44 , 45 ].
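The search strategy described above can be written down as a single query string. The following Python sketch is only an illustration: the paper reports the keywords and subject areas but not the exact query submitted, so the use of Scopus advanced-search field codes (TITLE-ABS-KEY, SUBJAREA, DOCTYPE, LANGUAGE), the subject-area abbreviations BUSI, DECI and HEAL, the explicit grouping of the OR terms, and the helper name build_query are all assumptions.

```python
def build_query(terms, topic, subject_areas, doc_types, language):
    """Assemble a Scopus-style advanced search string (illustrative only)."""
    # Group the synonyms explicitly so that OR is evaluated before AND.
    term_clause = " OR ".join(f'"{t}"' for t in terms)
    area_clause = " OR ".join(f"SUBJAREA({a})" for a in subject_areas)
    type_clause = " OR ".join(f"DOCTYPE({d})" for d in doc_types)
    return (
        f'TITLE-ABS-KEY(({term_clause}) AND "{topic}")'
        f" AND ({area_clause})"
        f" AND ({type_clause})"
        f" AND LANGUAGE({language})"
    )


query = build_query(
    terms=["Artificial Intelligence", "AI"],
    topic="Healthcare",
    # Assumed codes for Business/Management/Accounting, Decision Sciences,
    # and Health Professions.
    subject_areas=["BUSI", "DECI", "HEAL"],
    # Assumed codes for articles and reviews (grey literature excluded).
    doc_types=["ar", "re"],
    language="english",
)
print(query)
```

Writing the query out in this way makes the inclusion criteria (subject areas, document types, language) explicit, which supports the replicability the protocol aims for.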

The fourth research phase is defining the code framework that initiates the analysis of the variables. The study will identify the following:

descriptive information of the research area;

source analysis [ 16 ];

author and citation analysis [ 28 ];

keywords and network analysis [ 14 ]; and

geographic distribution of the papers [ 14 ].

The final research phase is the article’s discussion and conclusion, where implications and future research trends will be identified.

At the research team level, the information is analysed with the statistical software R-Studio and the Bibliometrix package [ 15 ], which allows scientific analysis of the results obtained through the multi-disciplinary database.

The analysis of bibliometric results starts with a description of the main bibliometric statistics with the aim of answering RQ1, What are the most prominent authors, journal keywords and citations in the field of the research study?, and RQ2, How does artificial intelligence relate to healthcare, and what is the focus of the literature? Therefore, the following elements were thoroughly analysed: (1) type of document; (2) annual scientific production; (3) scientific sources; (4) source growth; (5) number of articles per author; (6) author’s dominance ranking; (7) author’s h-index, g-index, and m-index; (8) author’s productivity; (9) author’s keywords; (10) topic dendrogram; (11) a factorial map of the document with the highest contributions; (12) article citations; (13) country production; (14) country citations; (15) country collaboration map; and (16) country collaboration network.
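The author-level indices listed in item (7) follow standard definitions, which can be sketched in a few lines of Python. This is a minimal illustration rather than the Bibliometrix implementation: the function names are hypothetical, the g-index is capped at the number of papers, and the m-index is taken as the h-index divided by the number of years the author has been publishing, which is one common convention.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g


def m_index(citations, years_active):
    """h-index divided by the number of years since the first publication."""
    return h_index(citations) / years_active


# Toy citation counts for one hypothetical author.
example = [10, 8, 5, 4, 3]
print(h_index(example), g_index(example), m_index(example, years_active=8))
```

The g-index gives more credit than the h-index to a few highly cited papers, while the m-index adjusts for career length, which is why the three are usually reported together.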

Main information

Table  3 shows the information on 288 peer-reviewed articles published between 1992 and January 2021 extracted from the Scopus database. The number of keywords is 946 from 136 sources, and the number of keywords plus, referring to the number of keywords that frequently appear in an article’s title, was 2329. The analysis period covered 28 years and 1 month of scientific production and included an annual growth rate of 5.12%. However, the most significant increase in published articles occurred in the past three years (please see Fig.  2 ). On average, each article was written by three authors (3.56). Finally, the collaboration index (CI), which was calculated as the total number of authors of multi-authored articles/total number of multi-authored articles, was 3.97 [ 46 ].
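As a worked illustration of the collaboration index defined above, the following Python sketch computes it on toy data. The author lists are invented, the function names are hypothetical, and counting every author appearance (rather than distinct authors) is an assumption about how the numerator is tallied.

```python
def collaboration_index(articles):
    """Authors of multi-authored articles / number of multi-authored articles."""
    multi = [authors for authors in articles if len(authors) > 1]
    if not multi:
        return 0.0
    # Assumption: each author appearance is counted, not each distinct author.
    return sum(len(authors) for authors in multi) / len(multi)


def authors_per_article(articles):
    """Mean number of authors per article across the whole sample."""
    return sum(len(authors) for authors in articles) / len(articles)


# Toy sample: one single-authored and three multi-authored articles.
articles = [
    ["A"],
    ["A", "B"],
    ["B", "C", "D"],
    ["E", "F", "G", "H"],
]
print(collaboration_index(articles))  # (2 + 3 + 4) / 3
print(authors_per_article(articles))  # (1 + 2 + 3 + 4) / 4
```

Because single-authored articles are excluded from the collaboration index, it is at least as large as the mean number of authors per article, consistent with the reported values of 3.97 and 3.56.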

Fig. 2 Annual scientific production. Source: Authors’ elaboration

Table  4 shows the top 20 sources related to the topic. The Journal of Medical Systems is the most relevant source, with twenty-one of the published articles. This journal's main issues are the foundations, functionality, interfaces, implementation, impacts, and evaluation of medical technologies. Another relevant source is Studies in Health Technology and Informatics, with eleven articles. This journal aims to extend scientific knowledge related to biomedical technologies and medical informatics research. Both journals deal with cloud computing, machine learning, and AI as a disruptive healthcare paradigm based on recent publications. The IEEE Journal of Biomedical and Health Informatics investigates technologies in health care, life sciences, and biomedicine applications from a broad perspective. The next journal, Decision Support Systems, aims to analyse how these technologies support decision-making from a multi-disciplinary view, considering business and management. Therefore, the analysis of the journals revealed that we are dealing with an interdisciplinary research field. This conclusion is confirmed, for example, by the presence of purely medical journals, journals dedicated to the technological growth of healthcare, and journals with a long-term perspective such as futures.

The distribution frequency of the articles (Fig. 3) indicates the journals dealing with the topic and related issues. Between 2008 and 2012, significant growth in the number of publications on the subject is noticeable. Note, however, that the graph shows the results of a Loess regression, which uses the quantity and publication time of the journal under analysis as variables. This method allows the function to assume an unbounded distribution; that is, the fitted curve can take values below zero where the data are close to zero. It contributes to a better visual result and highlights the discontinuity in the publication periods [ 47 ].
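To make the smoothing step concrete, the following is a minimal pure-Python sketch of a Loess-style fit (local linear regression with tricube weights). It is an illustration under stated assumptions, not the routine used to draw the figure: the function name `loess`, the `frac` parameter and the yearly counts are hypothetical.

```python
def loess(xs, ys, frac=0.6):
    """Local linear regression with tricube weights, evaluated at each x in xs."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points")
    k = max(2, min(n, round(frac * n)))  # number of neighbours in each local window
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (fallback avoids division by zero).
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))  # weighted line evaluated at x0
    return fitted


# Hypothetical yearly article counts for one journal.
years = list(range(2008, 2018))
counts = [0, 0, 1, 0, 0, 2, 5, 9, 14, 21]
print([round(v, 2) for v in loess(years, counts)])
```

Each fitted value comes from a straight line fitted to nearby points, so nothing constrains it to stay non-negative; this is why a smoothed curve can dip below zero when the observed counts are close to zero.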

Fig. 3 Source growth. Source: Authors’ elaboration

Finally, Fig.  4 provides an analytical perspective on factor analysis for the most cited papers. As indicated in the literature [ 48 , 49 ], using factor analysis to discover the most cited papers allows for a better understanding of the scientific world’s intellectual structure. For example, our research makes it possible to consider certain publications that effectively analyse subject specialisation. For instance, Santosh’s [ 50 ] article addresses the new paradigm of AI with ML algorithms for data analysis and decision support in the COVID-19 period, setting a benchmark in terms of citations by researchers. Moving on to the application, an article by Shickel et al. [ 51 ] begins with the belief that the healthcare world currently has much health and administrative data. In this context, AI and deep learning will support medical and administrative staff in extracting data, predicting outcomes, and learning medical representations. Finally, in the same line of research, Baig et al. [ 52 ], with a focus on wearable patient monitoring systems (WPMs), conclude that AI and deep learning may be landmarks for continuous patient monitoring and support for healthcare delivery.

Fig. 4 Factorial map of the most cited documents.

This section identifies the most cited authors of articles on AI in healthcare. It also identifies the authors’ keywords, dominance factor (DF) ranking, h-index, productivity, and total number of citations. Table 5 identifies the authors and their publications in the top 20 rankings. As the table shows, Bushko R.G. has the highest number of publications: four papers. He is the editor-in-chief of Future of Health Technology, a scientific journal that aims to develop a clear vision of the future of health technology. Several authors each wrote three papers. For instance, Liu C. is a researcher active in the topic of ML and computer vision, and Sharma A. from Emory University Atlanta in the USA is a researcher with a clear focus on imaging and translational informatics. Some other authors have two publications each. While some authors have published as primary authors, most have published as co-authors. Hence, in the next section, we measure the contributory power of each author through the DF ranking.

Authors’ dominance ranking

The dominance factor (DF) is a ratio measuring the fraction of multi-authored articles in which an author acts as the first author [ 53 ]. Several bibliometric studies use the DF in their analyses [ 46 , 54 ]. The DF ranking captures an author’s dominance in producing articles. The DF is calculated by dividing the number of an author’s multi-authored papers as the first author (Nmf) by the author's total number of multi-authored papers (Nmt). Single-authored articles are omitted because their value is constant at 1. This formulation could lead to some distortions in the results, especially in fields where authors are listed alphabetically by surname [ 55 ].

The mathematical equation for the DF is shown as:

$$ \mathrm{DF} = \frac{N_{mf}}{N_{mt}} $$

Table  6 lists the top 20 DF rankings. The data in the table show a low level of articles per author, either for first-authored or multi-authored articles. The results demonstrate that we are dealing with an emerging topic in the literature. Additionally, as shown in the table, Fox J. and Longoni C. are the most dominant authors in the field.
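As an illustration of the DF calculation (first-authored multi-authored papers divided by all multi-authored papers of an author), the following sketch computes the ratio from a list of bylines. The function name and the sample bylines are hypothetical.

```python
from collections import defaultdict


def dominance_factors(articles):
    """articles: list of author lists in byline order. Returns {author: DF}."""
    nmf = defaultdict(int)  # multi-authored papers where the author is first author
    nmt = defaultdict(int)  # all multi-authored papers of the author
    for authors in articles:
        if len(authors) < 2:
            continue  # single-authored papers are excluded from the DF
        for author in set(authors):
            nmt[author] += 1
        nmf[authors[0]] += 1
    return {author: nmf[author] / nmt[author] for author in nmt}


# Hypothetical bylines, not taken from the study's data.
sample = [["A", "B"], ["B", "A", "C"], ["A"], ["A", "C"]]
for author, df in sorted(dominance_factors(sample).items()):
    print(author, round(df, 2))
```

An author who only ever appears as a co-author receives a DF of 0, while an author who is always first on multi-authored papers receives 1.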

Authors’ impact

Table 7 shows the impact of authors in terms of the h-index [ 56 ] (i.e., the productivity and citation impact of a researcher), g-index [ 57 ] (i.e., the distribution of citations received by a researcher's publications), m-index [ 58 ] (i.e., the h-index value per year), total citations, total papers and years of scientific publication. The h-index was introduced in the literature as a metric for the objective comparison of scientific results and depends on the number of publications and their impact [ 59 ]. The results show that the 20 most relevant authors have an h-index between 1 and 2. For the practical interpretation of the data, the authors considered data published by the London School of Economics [ 60 ]. In the social sciences, that analysis shows values of 7.6 for economic publications by professors and researchers who had been active for several years. By comparison, the low values found here suggest that the youthfulness of the research area has attracted young researchers and professors. At the same time, new indicators have emerged over the years to diversify the logic of the h-index. For example, the g-index indicates an author's citation impact while taking into account that a single article can generate many of the citations. The m-index, on the other hand, relates the h-index to the number of years of scientific activity.

The analysis, also considering the total number of citations, the number of papers published and the year of starting to publish, thus confirms that we are facing an expanding research flow.
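The three indicators can be computed directly from an author's per-paper citation counts. The sketch below uses the commonly cited definitions (h: largest h such that h papers have at least h citations each; g: largest g such that the top g papers together have at least g² citations; m: h divided by years of activity). Counting the first publication year as year one is an assumption, and the citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def g_index(citations):
    """Largest g such that the top g papers together have at least g**2 citations."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g


def m_index(citations, first_year, current_year):
    """h-index divided by years of activity (first year counted as year one: an assumption)."""
    years = current_year - first_year + 1
    return h_index(citations) / years


# Hypothetical citation counts for one author.
cites = [10, 8, 5, 4, 3]
print(h_index(cites), g_index(cites), m_index(cites, 2017, 2020))
```

For a young field, short publication histories keep both h and g small even when individual papers are well cited, which is consistent with the low h-index values reported in Table 7.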

Authors’ productivity

Figure 5 shows Lotka’s law. This mathematical formulation originated in 1926 to describe the publication frequency of authors in a specific research field [ 61 ]. In practice, the law states that the number of authors making a given number of contributions in a given period is a fraction of the number who make a single contribution [ 14 , 61 ].

Fig. 5 Lotka’s law.

The mathematical relationship is an inverse one and is expressed in the following way:

$$ y_x = \frac{C}{x^{n}} $$

where y_x is equal to the number of authors producing x articles in each research field, and C and n are constants that can be estimated in the calculation.

The figure's results are in line with Lotka's results, with an average of two publications per author in a given research field. In addition, the figure shows the percentage of authors. This analysis, too, leads us to state that we are dealing with a young and growing research field. Approximately 70% of the authors had published only their first research article, and only approximately 20% had published two scientific papers.
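One simple way to estimate the two constants of Lotka's law, assuming the usual form y_x = C / x^n, is an ordinary least-squares fit on log-transformed data. This is a sketch of that approach only; it is not necessarily the estimation procedure behind Fig. 5, and the author counts below are hypothetical.

```python
import math


def fit_lotka(author_counts):
    """author_counts: {x: number of authors with exactly x articles}. Returns (C, n)."""
    # Taking logs turns y = C / x**n into the line log y = log C - n * log x.
    pts = [(math.log(x), math.log(y)) for x, y in author_counts.items() if x > 0 and y > 0]
    if len(pts) < 2:
        raise ValueError("need at least two productivity levels")
    mx = sum(px for px, _ in pts) / len(pts)
    my = sum(py for _, py in pts) / len(pts)
    sxx = sum((px - mx) ** 2 for px, _ in pts)
    sxy = sum((px - mx) * (py - my) for px, py in pts)
    slope = sxy / sxx
    return math.exp(my - slope * mx), -slope


# Hypothetical distribution: number of authors with 1, 2, 3, 4 and 5 articles.
observed = {1: 700, 2: 200, 3: 60, 4: 30, 5: 10}
C, n = fit_lotka(observed)
print(round(C, 1), round(n, 2))
```

An exponent n close to 2 corresponds to Lotka's original formulation; a larger value indicates a field even more dominated by one-time authors.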

Authors’ keywords

This section provides information on the relationship between the keywords artificial intelligence and healthcare. This analysis is essential to determine the research trend, identify gaps in the discussion on AI in healthcare, and identify the fields that can be interesting as research areas [ 42 , 62 ].

Table 8 highlights the total number of keywords per author in the top 20 positions. The ranking is based on the following elements: healthcare, artificial intelligence, and clinical decision support system. Keyword analysis confirms the scientific area of reference. In particular, we derive the following definition: “Artificial intelligence is the theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages” [ 2 , 63 ]. Panch et al. [ 4 ] find that these technologies can be used in different business and management areas. After the first keyword, the analysis reveals AI applications and related research such as machine learning and deep learning.

Additionally, data mining and big data are a step forward in implementing exciting AI applications. According to our specific interest, if we applied AI in healthcare, we would achieve technological applications to help and support doctors and medical researchers in decision-making. The link between AI and decision-making is the reason why we find, in the seventh position, the keyword clinical decision support system. AI techniques can unlock clinically relevant information hidden in the massive amount of data that can assist clinical decision-making [ 64 ]. If we analyse the following keywords, we find other elements related to decision-making and support systems.

The TreeMap below (Fig.  6 ) highlights the combination of possible keywords representing AI and healthcare.

Fig. 6 Keywords treemap.

The topic dendrogram in Fig.  7 represents the hierarchical order and the relationship between the keywords generated by hierarchical clustering [ 42 ]. The cut in the figure and the vertical lines facilitate an investigation and interpretation of the different clusters. As stated by Andrews [ 48 ], the figure is not intended to find the perfect level of associations between clusters. However, it aims to estimate the approximate number of clusters to facilitate further discussion.
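For readers unfamiliar with how a dendrogram is built, the following sketch shows one generic form of agglomerative hierarchical clustering applied to keywords. It swaps in a simple technique (Jaccard distance between the sets of articles using each keyword, with average linkage) and is not claimed to be the procedure that produced Fig. 7. The keywords and article identifiers are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the share of articles that two keywords have in common."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


def agglomerate(keyword_docs):
    """keyword_docs: {keyword: set of article ids}. Returns merges as (left, right, distance)."""
    clusters = [frozenset([k]) for k in sorted(keyword_docs)]

    def dist(c1, c2):
        # Average linkage: mean pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(jaccard_distance(keyword_docs[a], keyword_docs[b]) for a, b in pairs)
        return total / len(pairs)

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((sorted(c1), sorted(c2), dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Hypothetical keyword-to-article mapping.
docs = {
    "machine learning": {1, 2, 3},
    "deep learning": {2, 3},
    "decision support": {4, 5},
    "information system": {4, 5, 6, 7},
}
for left, right, d in agglomerate(docs):
    print(left, "+", right, "at distance", round(d, 3))
```

Each merge corresponds to one junction in a dendrogram, and cutting the tree at a chosen distance yields the clusters. As the text notes, the cut is a judgement call intended to give an approximate number of clusters rather than an exact one.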

Fig. 7 Topic dendrogram.

The research stream of AI in healthcare is divided into two main strands. The blue strand focuses on medical information systems and the internet. Some papers are related to healthcare organisations, such as the Internet of Things, meaning that healthcare organisations use AI to support health services management and data analysis. AI applications are also used to improve diagnostic and therapeutic accuracy and the overall clinical treatment process [ 2 ]. If we consider the second block, the red one, three different clusters highlight separate aspects of the topic. The first could be explained as AI and ML predictive algorithms. Through AI applications, it is possible to obtain a predictive approach that can ensure that patients are better monitored. This also allows a better understanding of risk perception for doctors and medical researchers. In the second cluster, the most frequent words are decisions, information system, and support system. This means that AI applications can support doctors and medical researchers in decision-making. Information coming from AI technologies can be used to consider difficult problems and support a more straightforward and rapid decision-making process. In the third cluster, it is vital to highlight that the ML model can deal with vast amounts of data. From those inputs, it can return outcomes that can optimise the work of healthcare organisations and scheduling of medical activities.

Furthermore, the word cloud in Fig.  8 highlights aspects of AI in healthcare, such as decision support systems, decision-making, health services management, learning systems, ML techniques and diseases. The figure depicts how AI is linked to healthcare and how it is used in medicine.

Fig. 8 Word cloud.

Figure 9 represents the search trends based on the keywords analysed. The research started in 2012. First, it identified research topics related to clinical decision support systems. This topic was recurrent during the following years. Interestingly, in 2018, studies investigated AI and natural language processing as possible tools to manage patients and administrative elements. Finally, a new research stream considers AI's role in fighting COVID-19 [ 65 , 66 ].
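A keyword trend of this kind amounts to counting how often each keyword appears per publication year. The sketch below shows that tally; the function name and the records are hypothetical, and lower-casing keywords before counting is an assumption about normalisation.

```python
from collections import Counter, defaultdict


def keyword_trends(records):
    """records: iterable of (year, [keywords]). Returns {year: Counter of keywords}."""
    trends = defaultdict(Counter)
    for year, keywords in records:
        # Normalise case and whitespace so that variants are counted together.
        trends[year].update(k.strip().lower() for k in keywords)
    return dict(trends)


# Hypothetical records, not taken from the study's data.
records = [
    (2012, ["Clinical decision support system", "AI"]),
    (2018, ["Natural language processing", "AI"]),
    (2018, ["AI", "Machine learning"]),
]
trends = keyword_trends(records)
for year in sorted(trends):
    print(year, trends[year].most_common(2))
```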

Fig. 9 Keywords frequency.

Table 9 represents the number of citations from other articles within the top 20 rankings. The analysis allows the benchmark studies in the field to be identified [ 48 ]. For instance, Burke et al. [ 67 ] wrote the most cited paper, which analyses efficient nurse rostering methodologies. The paper critically evaluates tangible interdisciplinary solutions that also include AI. Immediately thereafter, Ahmed M.A.'s article proposes a data-driven optimisation methodology to determine the optimal number of healthcare staff to optimise patients' productivity [ 68 ]. Finally, the third most cited article lays the groundwork for developing deep learning by considering diverse health and administrative information [ 51 ].

This section analyses the diffusion of AI in healthcare around the world. It highlights countries to show the geographies of this research. It includes all published articles, the total number of citations, and the collaboration network. The following sub-sections start with an analysis of the total number of published articles.

Country total articles

Figure  9 and Table  10 display the countries where AI in healthcare has been considered. The USA tops the list of countries with the maximum number of articles on the topic (215). It is followed by China (83), the UK (54), India (51), Australia (54), and Canada (32). It is immediately evident that the theme has developed on different continents, highlighting a growing interest in AI in healthcare. The figure shows that many areas, such as Russia, Eastern Europe and Africa except for Algeria, Egypt, and Morocco, have still not engaged in this scientific debate.

Country publications and collaboration map

This section discusses articles on AI in healthcare in terms of single or multiple publications in each country. It also aims to observe collaboration and networking between countries. Table 11 and Fig. 10 highlight the average citations by country and show that the UK, the USA, and Kuwait have a higher average number of citations than other countries. Italy, Spain and New Zealand have the most significant number of citations.
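The average number of citations per country is a simple grouped mean. The sketch below shows the calculation on hypothetical (country, citations) pairs; attributing each article to a single country is a simplifying assumption.

```python
from collections import defaultdict


def average_citations_by_country(articles):
    """articles: iterable of (country, citations). Returns {country: mean citations}."""
    totals = defaultdict(lambda: [0, 0])  # country -> [citation sum, article count]
    for country, cites in articles:
        totals[country][0] += cites
        totals[country][1] += 1
    return {country: s / n for country, (s, n) in totals.items()}


# Hypothetical data: one (country, citations) pair per article.
sample = [("UK", 30), ("UK", 10), ("USA", 12), ("Kuwait", 25)]
print(average_citations_by_country(sample))
```

An average and a total can rank countries differently: a country with few but highly cited articles can have a high average and a modest total.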

Fig. 10 Articles per country.

Figure 11 depicts global collaborations. The blue colour on the map represents research cooperation among nations. Additionally, the pink border linking countries indicates the extent of collaboration between authors. The primary cooperation between nations is between the USA and China, with two collaborative articles. Other collaborations among nations are limited to a few papers.
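The links drawn on a collaboration map can be derived by counting, for each article, every pair of distinct countries among its authors. The following sketch illustrates that count; the function name and the affiliation lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def country_collaborations(articles):
    """articles: iterable of lists of author countries. Returns Counter of country pairs."""
    links = Counter()
    for countries in articles:
        # Each unordered pair of distinct countries on an article counts once.
        for pair in combinations(sorted(set(countries)), 2):
            links[pair] += 1
    return links


# Hypothetical author-country lists, one per article.
sample = [["USA", "China"], ["China", "USA", "UK"], ["USA"], ["USA", "USA"]]
for pair, count in country_collaborations(sample).most_common():
    print(pair, count)
```

Articles whose authors all come from one country contribute no link, which is why a field with mostly national teams produces a sparse collaboration map.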

Fig. 11 Collaboration map.

Artificial intelligence for healthcare: applications

This section aims to strengthen the research scope by answering RQ3: What are the research applications of artificial intelligence for healthcare?

Benefiting from the topical dendrogram, researchers will provide a development model based on four relevant variables [ 69 , 70 ]. AI has been a disruptive innovation in healthcare [ 4 ]. With its sophisticated algorithms and several applications, AI has assisted doctors and medical professionals in the domains of health information systems, geocoding health data, epidemic and syndromic surveillance, predictive modelling and decision support, and medical imaging [ 2 , 9 , 10 , 64 ]. Furthermore, the researchers considered the bibliometric analysis to identify four macro-variables dominant in the field and used them as authors' keywords. Therefore, the following sub-sections aim to explain the debate on applications in healthcare for AI techniques. These elements are shown in Fig.  12 .

Fig. 12 Dominant variables for AI in healthcare.

Health services management

One of the notable aspects of AI techniques is potential support for comprehensive health services management. These applications can support doctors, nurses and administrators in their work. For instance, an AI system can provide health professionals with constant, possibly real-time medical information updates from various sources, including journals, textbooks, and clinical practices [ 2 , 10 ]. These applications' strength is becoming even more critical in the COVID-19 period, during which information exchange is continually needed to properly manage the pandemic worldwide [ 71 ]. Other applications involve coordinating information tools for patients and enabling appropriate inferences for health risk alerts and health outcome prediction [ 72 ]. AI applications allow, for example, hospitals and all health services to work more efficiently for the following reasons:

Clinicians can access data immediately when they need it.

Nurses can ensure better patient safety while administering medication.

Patients can stay informed and engaged in their care by communicating with their medical teams during hospital stays.

Additionally, AI can contribute to optimising logistics processes, for instance, by delivering drugs and equipment through a just-in-time supply system based entirely on predictive algorithms [ 73 , 74 ]. Interesting applications can also support the training of personnel working in health services. This support could be helpful in bridging the gap between urban and rural health services [ 75 ]. Finally, health services management could benefit from AI to leverage the multiplicity of data in electronic health records by predicting data heterogeneity across hospitals and outpatient clinics, checking for outliers, performing clinical tests on the data, unifying patient representation, improving future models that can predict diagnostic tests and analyses, and creating transparency with benchmark data for analysing services delivered [ 51 , 76 ].

Predictive medicine

Another relevant topic is AI applications for disease prediction and diagnosis treatment, outcome prediction and prognosis evaluation [ 72 , 77 ]. Because AI can identify meaningful relationships in raw data, it can support diagnostic, treatment and prediction outcomes in many medical situations [ 64 ]. It allows medical professionals to embrace the proactive management of disease onset. Additionally, predictions are possible for identifying risk factors and drivers for each patient to help target healthcare interventions for better outcomes [ 3 ]. AI techniques can also help design and develop new drugs, monitor patients and personalise patient treatment plans [ 78 ]. Doctors benefit from having more time and concise data to make better patient decisions. Automatic learning through AI could disrupt medicine, allowing prediction models to be created for drugs and exams that monitor patients over their whole lives [ 79 ].

Clinical decision-making

One of the main topics of the keyword analysis is that AI applications could support doctors and medical researchers in the clinical decision-making process. According to Jiang et al. [ 64 ], AI can help physicians make better clinical decisions or even replace human judgement in healthcare-specific functional areas. According to Bennett and Hauser [ 80 ], algorithms can benefit clinical decisions by accelerating the process and the amount of care provided, positively impacting the cost of health services. Therefore, AI technologies can support medical professionals in their activities and simplify their jobs [ 4 ]. Finally, as Redondo and Sandoval [ 81 ] find, algorithmic platforms can provide virtual assistance to help doctors understand the semantics of language and learning to solve business process queries as a human being would.

Patient data and diagnostics

Another challenging topic related to AI applications is patient data and diagnostics. AI techniques can help medical researchers deal with the vast amount of data from patients (i.e., medical big data). AI systems can manage data generated from clinical activities, such as screening, diagnosis, and treatment assignment. In this way, health personnel can learn similar subjects and associations between subject features and outcomes of interest [ 64 ].

These technologies can analyse raw data and provide helpful insights that can be used in patient treatments. They can help doctors in the diagnostic process; for example, a high-speed body scan makes it simpler to obtain an overall image of a patient’s condition. AI technology can then recreate a 3D mapping of the patient’s body.

In terms of data, interesting research perspectives are emerging. For instance, we observed the emergence of a stream of research on patient data management and protection related to AI applications [ 82 ].

For diagnostics, AI techniques can make a difference in rehabilitation therapy and surgery. Numerous robots have been designed to support and manage such tasks. Rehabilitation robots physically support and guide, for example, a patient’s limb during motor therapy [ 83 ]. For surgery, AI has a vast opportunity to transform surgical robotics through devices that can perform semi-automated surgical tasks with increasing efficiency. The final aim of this technology is to automate procedures to negate human error while maintaining a high level of accuracy and precision [ 84 ]. Finally, the COVID-19 period has led to increased remote patient diagnostics through telemedicine that enables remote observation of patients and provides physicians and nurses with support tools [ 66 , 85 , 86 ].

This study aims to provide a bibliometric analysis of publications on AI in healthcare, focusing on accounting, business and management, decision sciences and health profession studies. Using the SLR method of Massaro et al. [ 11 ], we provide a reliable and replicable research protocol for future studies in this field. Additionally, we investigate the trend of scientific publications on the subject, unexplored information, future directions, and implications using the science mapping workflow. Our analysis provides interesting insights.

In terms of bibliometric variables, the four leading journals, Journal of Medical Systems, Studies in Health Technology and Informatics, IEEE Journal of Biomedical and Health Informatics, and Decision Support Systems, are optimal locations for the publication of scientific articles on this topic. These journals deal mainly with healthcare, medical information systems, and applications such as cloud computing, machine learning, and AI. Additionally, in terms of h-index, Bushko R.G. and Liu C. are the most productive and impactful authors in this research stream. Burke et al.’s [ 67 ] contribution is the most cited with an analysis of nurse rostering using new technologies such as AI. Finally, in terms of keywords, co-occurrence reveals some interesting insights. For instance, researchers have found that AI has a role in diagnostic accuracy and helps in the analysis of health data by comparing thousands of medical records, experiencing automatic learning with clinical alerts, efficient management of health services and places of care, and the possibility of reconstructing patient history using these data.

Second, this paper finds five cluster analyses in healthcare applications: health services management, predictive medicine, patient data, diagnostics, and finally, clinical decision-making. These technologies can also contribute to optimising logistics processes in health services and allowing a better allocation of resources.

Third, the authors analysing the research findings and the issues under discussion strongly support AI's role in decision support. These applications, however, are demonstrated by creating a direct link to data quality management and the technology awareness of health personnel [ 87 ].

The importance of data quality for the decision-making process

Several authors have analysed AI in the healthcare research stream, but in this case, the authors focus on other literature that includes business and decision-making processes. In this regard, the analysis of the search flow reveals a double view of the literature. On the one hand, some contributions belong to the positivist literature and embrace future applications and implications of technology for health service management, data analysis and diagnostics [ 6 , 80 , 88 ]. On the other hand, some investigations also aim to understand the darker sides of technology and its impact. For example, as Carter [ 89 ] states, the impact of AI is multi-sectoral; its development, however, calls for action to protect personal data. Similarly, Davenport and Kalakota [ 77 ] focus on the ethical implications of using AI in healthcare. According to the authors, intelligent machines raise issues of accountability, transparency, and permission, especially in automated communication with patients. Our analysis does not indicate a marked strand of the literature; therefore, we argue that the discussion of elements such as the transparency of technology for patients is essential for the development of AI applications.

A large part of our results shows that, at the application level, AI can be used to improve medical support for patients (Fig.  11 ) [ 64 , 82 ]. However, we believe that, as indicated by Kalis et al. [ 90 ] on the pages of Harvard Business Review, the management of costly back-office problems should also be addressed.

The potential of algorithms includes data analysis. There is an immense quantity of data accessible now, which carries the possibility of providing information about a wide variety of medical and healthcare activities [ 91 ]. With the advent of modern computational methods, computer learning and AI techniques, there are numerous possibilities [ 79 , 83 , 84 ]. For example, AI makes it easier to turn data into concrete and actionable observations to improve decision-making, deliver high-quality patient treatment, adapt to real-time emergencies, and save more lives on the clinical front. In addition, AI makes it easier to leverage capital to develop systems and facilities and reduce expenses at the organisational level [ 78 ]. Studying contributions to the topic, we noticed that data accuracy was included in the debate, indicating that a high standard of data will benefit decision-making practitioners [ 38 , 77 ]. AI techniques are an essential instrument for studying data and the extraction of medical insight, and they may assist medical researchers in their practices. Using computational tools, healthcare stakeholders may leverage the power of data not only to evaluate past data (descriptive analytics) but also to forecast potential outcomes (predictive analytics) and to define the best actions for the present scenario (prescriptive analytics) [ 78 ]. The current abundance of evidence makes it easier to provide a broad view of patient health; doctors should have access to the correct details at the right time and location to provide the proper treatment [ 92 ].

Will medical technology de-skill doctors?

Further reflection concerns the skills of doctors. Studies have shown that healthcare personnel are progressively being exposed to technology for different purposes, such as collecting patient records or diagnosis [ 71 ]. This is demonstrated by the keywords (Fig. 6) that focus on technology and the role of decision-making with new innovative tools. In addition, the discussion expands with Lu [ 93 ], who indicates that the excessive use of technology could hinder doctors’ skills and the expansion of clinical procedures. Among the main issues arising from the literature is the possible de-skilling of healthcare staff due to reduced autonomy in decision-making concerning patients [ 94 ]. Therefore, the challenges and discussion we uncovered in Fig. 11 are expanded by also considering the ethical implications of technology and the role of skills.


Our analysis also has multiple theoretical and practical implications.

In terms of theoretical contribution, this paper extends the previous results of Connelly et al., dos Santos et al., Hao et al., Huang et al., Liao et al. and Tran et al. [ 2 , 19 , 20 , 21 , 22 , 24 ] in considering AI in terms of clinical decision-making and data management quality.

In terms of practical implications, this paper aims to create a fruitful discussion with healthcare professionals and administrative staff on how AI can be at their service to increase work quality. Furthermore, this investigation offers a broad comprehension of bibliometric variables of AI techniques in healthcare. It can contribute to advancing scientific research in this field.


Like any other, our study has some limitations that could be addressed by more in-depth future studies. First, using only one research database, such as Scopus, could be limiting. Further analysis could also investigate the PubMed, IEEE, and Web of Science databases individually and holistically, especially their health-related parts. Second, the use of search terms such as "Artificial Intelligence" OR "AI" AND "Healthcare" could be too general and exclude interesting studies. Moreover, although we analysed 288 peer-reviewed scientific papers, because the research topic is new, the analysis of conference papers could return interesting results for future researchers. Additionally, as this is a young research area, the analysis will be subject to recurrent obsolescence as multiple new research investigations are published. Finally, although bibliometric analysis has limited the subjectivity of the analysis [ 15 ], the verification of recurring themes could lead to different results by indicating areas of significant interest not listed here.

Future research avenues

Concerning future research perspectives, researchers believe that an analysis of the overall amount that a healthcare organisation should pay for AI technologies could be helpful. If these technologies are essential for health services management and patient treatment, governments should invest and contribute to healthcare organisations' modernisation. New investment funds could be made available in the healthcare world, as in the European case with the Next Generation EU programme or national investment programmes [ 95 ]. Additionally, this should happen especially in the poorest countries around the world, where there is a lack of infrastructure and services related to health and medicine [ 96 ]. On the other hand, it might be interesting to evaluate additional profits generated by healthcare organisations with AI technologies compared to those that do not use such technologies.

Further analysis could also identify why some parts of the world have not conducted studies in this area. It would be helpful to carry out a comparative analysis between countries active in this research field and countries that are not currently involved. It would make it possible to identify variables affecting AI technologies' presence or absence in healthcare organisations. The results of collaboration between countries also present future researchers with the challenge of greater exchanges between researchers and professionals. Therefore, further research could investigate the difference in vision between professionals and academics.

In the accounting, business, and management research area, there is currently a lack of quantitative analysis of the costs and profits generated by healthcare organisations that use AI technologies. Therefore, research in this direction could further increase our understanding of the topic and the number of healthcare organisations that can access technologies based on AI. Finally, as suggested in the discussion section, more interdisciplinary studies are needed to strengthen AI links with data quality management and AI and ethics considerations in healthcare.

In pursuing the philosophy of Massaro et al.’s [ 11 ] methodological article, we have climbed on the shoulders of giants, hoping to provide a bird's-eye view of the AI literature in healthcare. We performed this study with a bibliometric analysis aimed at discovering authors, countries of publication and collaboration, and keywords and themes. We found a fast-growing, multi-disciplinary stream of research that is attracting an increasing number of authors.

The research, therefore, adopts a quantitative approach to the analysis of bibliometric variables and a qualitative approach to the study of recurring keywords, which has allowed us to demonstrate strands of literature that are not purely positive. There are currently some limitations that will affect future research potential, especially in ethics, data governance and the competencies of the health workforce.

Availability of data and materials

All the data are retrieved from public scientific platforms.

Tagliaferri SD, Angelova M, Zhao X, Owen PJ, Miller CT, Wilkin T, et al. Artificial intelligence to improve back pain outcomes and lessons learnt from clinical classification approaches: three systematic reviews. NPJ Digit Med. 2020;3(1):1–16.

Tran BX, Vu GT, Ha GH, Vuong Q-H, Ho M-T, Vuong T-T, et al. Global evolution of research in artificial intelligence in health and medicine: a bibliometric study. J Clin Med. 2019;8(3):360.

Hamid S. The opportunities and risks of artificial intelligence in medicine and healthcare [Internet]. 2016 [cited 2020 May 29]. http://www.cuspe.org/wp-content/uploads/2016/09/Hamid_2016.pdf

Panch T, Szolovits P, Atun R. Artificial intelligence, machine learning and health systems. J Glob Health. 2018;8(2):020303.

Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem Rev. 2019;119(18):10520–94.

Burton RJ, Albur M, Eberl M, Cuff SM. Using artificial intelligence to reduce diagnostic workload without compromising detection of urinary tract infections. BMC Med Inform Decis Mak. 2019;19(1):171.

Meskò B, Drobni Z, Bényei E, Gergely B, Gyorffy Z. Digital health is a cultural transformation of traditional healthcare. Mhealth. 2017;3:38.

Cho B-J, Choi YJ, Lee M-J, Kim JH, Son G-H, Park S-H, et al. Classification of cervical neoplasms on colposcopic photography using deep learning. Sci Rep. 2020;10(1):13652.

Doyle OM, Leavitt N, Rigg JA. Finding undiagnosed patients with hepatitis C infection: an application of artificial intelligence to patient claims data. Sci Rep. 2020;10(1):10521.

Shortliffe EH, Sepúlveda MJ. Clinical decision support in the era of artificial intelligence. JAMA. 2018;320(21):2199–200.

Massaro M, Dumay J, Guthrie J. On the shoulders of giants: undertaking a structured literature review in accounting. Account Auditing Account J. 2016;29(5):767–801.

Junquera B, Mitre M. Value of bibliometric analysis for research policy: a case study of Spanish research into innovation and technology management. Scientometrics. 2007;71(3):443–54.

Casadesus-Masanell R, Ricart JE. How to design a winning business model. Harvard Business Review [Internet]. 2011 Jan 1 [cited 2020 Jan 8]. https://hbr.org/2011/01/how-to-design-a-winning-business-model

Aria M, Cuccurullo C. bibliometrix: an R-tool for comprehensive science mapping analysis. J Informetr. 2017;11(4):959–75.

Zupic I, Čater T. Bibliometric methods in management and organization. Organ Res Methods. 2015;1(18):429–72.

Secinaro S, Calandra D. Halal food: structured literature review and research agenda. Br Food J. 2020. https://doi.org/10.1108/BFJ-03-2020-0234 .

Rialp A, Merigó JM, Cancino CA, Urbano D. Twenty-five years (1992–2016) of the international business review: a bibliometric overview. Int Bus Rev. 2019;28(6):101587.

Zhao L, Dai T, Qiao Z, Sun P, Hao J, Yang Y. Application of artificial intelligence to wastewater treatment: a bibliometric analysis and systematic review of technology, economy, management, and wastewater reuse. Process Saf Environ Prot. 2020;1(133):169–82.

Huang Y, Huang Q, Ali S, Zhai X, Bi X, Liu R. Rehabilitation using virtual reality technology: a bibliometric analysis, 1996–2015. Scientometrics. 2016;109(3):1547–59.

Hao T, Chen X, Li G, Yan J. A bibliometric analysis of text mining in medical research. Soft Comput. 2018;22(23):7875–92.

dos Santos BS, Steiner MTA, Fenerich AT, Lima RHP. Data mining and machine learning techniques applied to public health problems: a bibliometric analysis from 2009 to 2018. Comput Ind Eng. 2019;1(138):106120.

Liao H, Tang M, Luo L, Li C, Chiclana F, Zeng X-J. A bibliometric analysis and visualization of medical big data research. Sustainability. 2018;10(1):166.

Choudhury A, Renjilian E, Asan O. Use of machine learning in geriatric clinical care for chronic diseases: a systematic literature review. JAMIA Open. 2020;3(3):459–71.

Connelly TM, Malik Z, Sehgal R, Byrnes G, Coffey JC, Peirce C. The 100 most influential manuscripts in robotic surgery: a bibliometric analysis. J Robot Surg. 2020;14(1):155–65.

Guo Y, Hao Z, Zhao S, Gong J, Yang F. Artificial intelligence in health care: bibliometric analysis. J Med Internet Res. 2020;22(7):e18228.

Choudhury A, Asan O. Role of artificial intelligence in patient safety outcomes: systematic literature review. JMIR Med Inform. 2020;8(7):e18599.

Forliano C, De Bernardi P, Yahiaoui D. Entrepreneurial universities: a bibliometric analysis within the business and management domains. Technol Forecast Soc Change. 2021;1(165):120522.

Secundo G, Del Vecchio P, Mele G. Social media for entrepreneurship: myth or reality? A structured literature review and a future research agenda. Int J Entrep Behav Res. 2020;27(1):149–77.

Dal Mas F, Massaro M, Lombardi R, Garlatti A. From output to outcome measures in the public sector: a structured literature review. Int J Organ Anal. 2019;27(5):1631–56.

Baima G, Forliano C, Santoro G, Vrontis D. Intellectual capital and business model: a systematic literature review to explore their linkages. J Intellect Cap. 2020. https://doi.org/10.1108/JIC-02-2020-0055 .

Dumay J, Guthrie J, Puntillo P. IC and public sector: a structured literature review. J Intellect Cap. 2015;16(2):267–84.

Dal Mas F, Garcia-Perez A, Sousa MJ, Lopes da Costa R, Cobianchi L. Knowledge translation in the healthcare sector. A structured literature review. Electron J Knowl Manag. 2020;18(3):198–211.

Mas FD, Massaro M, Lombardi R, Biancuzzi H. La performance nel settore pubblico tra misure di out-put e di outcome. Una revisione strutturata della letteratura. ejvcbp. 2020;1(3):16–29.

Dumay J, Cai L. A review and critique of content analysis as a methodology for inquiring into IC disclosure. J Intellect Cap. 2014;15(2):264–90.

Haleem A, Javaid M, Khan IH. Current status and applications of Artificial Intelligence (AI) in medical field: an overview. Curr Med Res Pract. 2019;9(6):231–7.

Paul J, Criado AR. The art of writing literature review: what do we know and what do we need to know? Int Bus Rev. 2020;29(4):101717.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6(7):e1000100.

Biancone PP, Secinaro S, Brescia V, Calandra D. Data quality methods and applications in health care system: a systematic literature review. Int J Bus Manag. 2019;14(4):p35.

Secinaro S, Brescia V, Calandra D, Verardi GP, Bert F. The use of micafungin in neonates and children: a systematic review. ejvcbp. 2020;1(1):100–14.

Bert F, Gualano MR, Biancone P, Brescia V, Camussi E, Martorana M, et al. HIV screening in pregnant women: a systematic review of cost-effectiveness studies. Int J Health Plann Manag. 2018;33(1):31–50.

Levy Y, Ellis TJ. A systems approach to conduct an effective literature review in support of information systems research. Inf Sci Int J Emerg Transdiscipl. 2006;9:181–212.

Chen G, Xiao L. Selecting publication keywords for domain analysis in bibliometrics: a comparison of three methods. J Informet. 2016;10(1):212–23.

Falagas ME, Pitsouni EI, Malietzis GA, Pappas G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. FASEB J. 2007;22(2):338–42.

Sicilia M-A, Garcìa-Barriocanal E, Sànchez-Alonso S. Community curation in open dataset repositories: insights from zenodo. Procedia Comput Sci. 2017;1(106):54–60.

Secinaro S, Calandra D, Secinaro A, Muthurangu V, Biancone P. Artificial Intelligence for healthcare with a business, management and accounting, decision sciences, and health professions focus [Internet]. Zenodo; 2021 [cited 2021 Mar 7]. https://zenodo.org/record/4587618#.YEScpl1KiWh .

Elango B, Rajendran D. Authorship trends and collaboration pattern in the marine sciences literature: a scientometric Study. Int J Inf Dissem Technol. 2012;1(2):166–9.

Jacoby WG. Electoral inquiry section Loess: a nonparametric, graphical tool for depicting relationships between variables q. In 2000.

Andrews JE. An author co-citation analysis of medical informatics. J Med Libr Assoc. 2003;91(1):47–56.

White HD, Griffith BC. Author cocitation: a literature measure of intellectual structure. J Am Soc Inf Sci. 1981;32(3):163–71.

Santosh KC. AI-driven tools for coronavirus outbreak: need of active learning and cross-population train/test models on multitudinal/multimodal data. J Med Syst. 2020;44(5):93.

Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomed Health Inform. 2018;22(5):1589–604.

Baig MM, GholamHosseini H, Moqeem AA, Mirza F, Lindén M. A systematic review of wearable patient monitoring systems—current challenges and opportunities for clinical adoption. J Med Syst. 2017;41(7):115.

Kumar S, Kumar S. Collaboration in research productivity in oil seed research institutes of India. In: Proceedings of fourth international conference on webometrics, informetrics and scientometrics. p. 28–1; 2008.

Gatto A, Drago C. A taxonomy of energy resilience. Energy Policy. 2020;136:111007.

Levitt JM, Thelwall M. Alphabetization and the skewing of first authorship towards last names early in the alphabet. J Informet. 2013;7(3):575–82.

Saad G. Exploring the h-index at the author and journal levels using bibliometric data of productive consumer scholars and business-related journals respectively. Scientometrics. 2006;69(1):117–20.

Egghe L. Theory and practise of the g-index. Scientometrics. 2006;69(1):131–52.

Schreiber M. A modification of the h-index: the hm-index accounts for multi-authored manuscripts. J Informet. 2008;2(3):211–6.

Engqvist L, Frommen JG. The h-index and self-citations. Trends Ecol Evol. 2008;23(5):250–2.

London School of Economics. 3: key measures of academic influence [Internet]. Impact of social sciences. 2010 [cited 2021 Jan 13]. https://blogs.lse.ac.uk/impactofsocialsciences/the-handbook/chapter-3-key-measures-of-academic-influence/ .

Lotka A. The frequency distribution of scientific productivity. J Wash Acad Sci. 1926;16(12):317–24.

Khan G, Wood J. Information technology management domain: emerging themes and keyword analysis. Scientometrics. 2015;9:105.

Oxford University Press. Oxford English Dictionary [Internet]. 2020. https://www.oed.com/ .

Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230–43.

Calandra D, Favareto M. Artificial Intelligence to fight COVID-19 outbreak impact: an overview. Eur J Soc Impact Circ Econ. 2020;1(3):84–104.

Bokolo Anthony Jnr. Use of telemedicine and virtual care for remote treatment in response to COVID-19 pandemic. J Med Syst. 2020;44(7):132.

Burke EK, De Causmaecker P, Berghe GV, Van Landeghem H. The state of the art of nurse rostering. J Sched. 2004;7(6):441–99.

Ahmed MA, Alkhamis TM. Simulation optimization for an emergency department healthcare unit in Kuwait. Eur J Oper Res. 2009;198(3):936–42.

Forina M, Armanino C, Raggio V. Clustering with dendrograms on interpretation variables. Anal Chim Acta. 2002;454(1):13–9.

Wartena C, Brussee R. Topic detection by clustering keywords. In: 2008 19th international workshop on database and expert systems applications. 2008. p. 54–8.

Hussain AA, Bouachir O, Al-Turjman F, Aloqaily M. AI Techniques for COVID-19. IEEE Access. 2020;8:128776–95.

Agrawal A, Gans JS, Goldfarb A. Exploring the impact of artificial intelligence: prediction versus judgment. Inf Econ Policy. 2019;1(47):1–6.

Chakradhar S. Predictable response: finding optimal drugs and doses using artificial intelligence. Nat Med. 2017;23(11):1244–7.

Fleming N. How artificial intelligence is changing drug discovery. Nature. 2018;557(7707):S55–7.

Guo J, Li B. The application of medical artificial intelligence technology in rural areas of developing countries. Health Equity. 2018;2(1):174–81.

Aisyah M, Cockcroft S. A snapshot of data quality issues in Indonesian community health. Int J Netw Virtual Organ. 2014;14(3):280–97.

Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Healthc J. 2019;6(2):94–8.

Mehta N, Pandit A, Shukla S. Transforming healthcare with big data analytics and artificial intelligence: a systematic mapping study. J Biomed Inform. 2019;1(100):103311.

Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577–9.

Bennett CC, Hauser K. Artificial intelligence framework for simulating clinical decision-making: a Markov decision process approach. Artif Intell Med. 2013;57(1):9–19.

Redondo T, Sandoval AM. Text Analytics: the convergence of big data and artificial intelligence. Int J Interact Multimed Artif Intell. 2016;3. https://www.ijimai.org/journal/bibcite/reference/2540 .

Winter JS, Davidson E. Big data governance of personal health information and challenges to contextual integrity. Inf Soc. 2019;35(1):36–51.

Novak D, Riener R. Control strategies and artificial intelligence in rehabilitation robotics. AI Mag. 2015;36(4):23–33.

Tarassoli SP. Artificial intelligence, regenerative surgery, robotics? What is realistic for the future of surgery? Ann Med Surg (Lond). 2019;17(41):53–5.

Saha SK, Fernando B, Cuadros J, Xiao D, Kanagasingam Y. Automated quality assessment of colour fundus images for diabetic retinopathy screening in telemedicine. J Digit Imaging. 2018;31(6):869–78.

Gu D, Li T, Wang X, Yang X, Yu Z. Visualizing the intellectual structure and evolution of electronic health and telemedicine research. Int J Med Inform. 2019;130:103947.

Madnick S, Wang R, Lee Y, Zhu H. Overview and framework for data and information quality research. J Data Inf Qual. 2009;1:1.

Chen X, Liu Z, Wei L, Yan J, Hao T, Ding R. A comparative quantitative study of utilizing artificial intelligence on electronic health records in the USA and China during 2008–2017. BMC Med Inform Decis Mak. 2018;18(5):117.

Carter D. How real is the impact of artificial intelligence? Bus Inf Surv. 2018;35(3):99–115.

Kalis B, Collier M, Fu R. 10 Promising AI Applications in Health Care. 2018;5.

Biancone P, Secinaro S, Brescia V, Calandra D. Management of open innovation in healthcare for cost accounting using EHR. J Open Innov Technol Market Complex. 2019;5(4):99.

Kayyali B, Knott D, Van Kuiken S. The ‘big data’ revolution in US healthcare [Internet]. McKinsey & Company. 2013 [cited 2020 Aug 14]. https://healthcare.mckinsey.com/big-data-revolution-us-healthcare/ .

Lu J. Will medical technology deskill doctors? Int Educ Stud. 2016;9(7):130–4.

Hoff T. Deskilling and adaptation among primary care physicians using two work innovations. Health Care Manag Rev. 2011;36(4):338–48.

Picek O. Spillover effects from next generation EU. Intereconomics. 2020;55(5):325–31.

Sousa MJ, Dal Mas F, Pesqueira A, Lemos C, Verde JM, Cobianchi L. The potential of AI in health higher education to increase the students’ learning outcomes. TEM J. 2021. (In press).


The authors are grateful to the Editor-in-Chief for the suggestions and to all the reviewers who spent part of their time providing constructive feedback on our research article.

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Authors and affiliations.

Department of Management, University of Turin, Turin, Italy

Silvana Secinaro, Davide Calandra & Paolo Biancone

Ospedale Pediatrico Bambino Gesù, Rome, Italy

Aurelio Secinaro

Institute of Child Health, University College London, London, UK

Vivek Muthurangu


SS and PB, Supervision; Validation, writing, AS and VM; Formal analysis, DC and AS; Methodology, DC; Writing; DC, SS and AS; conceptualization, VM, PB; validation, VM, PB. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Davide Calandra .

Ethics declarations

Ethical approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Secinaro, S., Calandra, D., Secinaro, A. et al. The role of artificial intelligence in healthcare: a structured literature review. BMC Med Inform Decis Mak 21 , 125 (2021). https://doi.org/10.1186/s12911-021-01488-9

Received : 24 December 2020

Accepted : 01 April 2021

Published : 10 April 2021

DOI : https://doi.org/10.1186/s12911-021-01488-9

BMC Medical Informatics and Decision Making

ISSN: 1472-6947

Participating in Health Research Studies

What is Health Research?

The term "health research," sometimes also called "medical research" or "clinical research," refers to research that is done to learn more about human health. Health research also aims to find better ways to prevent and treat disease. Health research is an important way to help improve the care and treatment of people worldwide.

Have you ever wondered how certain drugs can cure or help treat illness? For instance, you might have wondered how aspirin helps reduce pain. Well, health research begins with questions that have not been answered yet such as:

"Does a certain drug improve health?"

To gain more knowledge about illness and how the human body and mind work, volunteers can help researchers answer questions about health in studies of an illness. Studies might involve testing new drugs, vaccines, surgical procedures, or medical devices in clinical trials . For this reason, health research can involve known and unknown risks. To answer questions correctly, safely, and according to the best methods, researchers have detailed plans for the research and procedures that are part of any study. These procedures are called "protocols."

An example of a research protocol includes the process for determining participation in a study. A person might meet certain conditions, called "inclusion criteria," if they have the required characteristics for a study. A study on menopause may require participants to be female. On the other hand, a person might not be able to enroll in a study if they do not meet these criteria based on "exclusion criteria." A male may not be able to enroll in a study on menopause. These criteria are part of all research protocols. Study requirements are listed in the description of the study.

A Brief History

While a few studies of disease were done using a scientific approach as far back as the 14th century, the era of modern health research started after World War II with early studies of antibiotics. Since then, health research and clinical trials have been essential for the development of more than 1,000 Food and Drug Administration (FDA) approved drugs. These drugs help treat infections, manage long-term or chronic illness, and prolong the life of patients with cancer and HIV.

Sound research demands a clear consent process. Public knowledge of the potential abuses of medical research arose after the severe misconduct of research in Germany during World War II. This resulted in rules to ensure that volunteers freely agree, or give "consent," to any study they are involved in. To give consent, one should have clear knowledge about the study process explained by study staff. Additional safeguards for volunteers were also written in the Nuremberg Code and the Declaration of Helsinki .

New rules and regulations to protect research volunteers and to eliminate ethical violations have also been put into place after the Tuskegee trial . In this unfortunate study, African American patients with syphilis were denied known treatment so that researchers could study the history of the illness. With these added protections, health research has brought new drugs and treatments to patients worldwide. Thus, health research has found cures for many diseases and helped manage many others.

Why is Health Research Important?

The development of new medical treatments and cures would not happen without health research and the active role of research volunteers. Behind every discovery of a new medicine and treatment are thousands of people who were involved in health research. Thanks to the advances in medical care and public health, we now live on average 10 years longer than in the 1960s and 20 years longer than in the 1930s. Without research, many diseases that can now be treated would cripple people or result in early death. New drugs, new ways to treat old and new illnesses, and new ways to prevent diseases in people at risk of developing them can only result from health research.

Before health research was a part of health care, doctors would choose medical treatments based on their best guesses, and they were often wrong. Now, health research takes the guesswork out. In fact, the Food and Drug Administration (FDA) requires that all new medicines are fully tested before doctors can prescribe them. Many things that we now take for granted are the result of medical studies that have been done in the past. For instance, blood pressure pills, vaccines to prevent infectious diseases, transplant surgery, and chemotherapy are all the result of research.

Medical research often seems much like standard medical care, but it has a distinct goal. Medical care is the way that your doctors treat your illness or injury. Its only purpose is to make you feel better, and you receive direct benefits. On the other hand, medical research studies are done to learn about and to improve current treatments. We all benefit from the new knowledge that is gained in the form of new drugs, vaccines, medical devices (such as pacemakers) and surgeries. However, it is crucial to know that volunteers do not always receive direct benefits from being in a study. It is not known if the treatment or drug being studied is better, the same, or even worse than what is now used. If this were known, there would be no need for any medical studies.

The use of Big Data Analytics in healthcare

Journal of Big Data volume 9, Article number: 3 (2022)

The introduction of Big Data Analytics (BDA) in healthcare will allow the use of new technologies both in the treatment of patients and in health management. The paper aims to analyze the possibilities of using Big Data Analytics in healthcare. The research is based on a critical analysis of the literature, as well as the presentation of selected results of direct research on the use of Big Data Analytics in medical facilities. The direct research was based on a research questionnaire and conducted on a sample of 217 medical facilities in Poland. Literature studies have shown that the use of Big Data Analytics can bring many benefits to medical facilities, while direct research has shown that medical facilities in Poland are moving towards data-based healthcare because they use structured and unstructured data and reach for analytics in the administrative, business and clinical areas. The research confirmed that medical facilities are working on both structured and unstructured data. The following kinds and sources of data can be distinguished: data from databases, transaction data, unstructured content of emails and documents, and data from devices and sensors. The use of data from social media, however, is lower. In their activity, medical facilities reach for analytics not only in the administrative and business areas but also in the clinical area. This clearly shows that the decisions made in medical facilities are highly data-driven. The results of the study confirm what has been analyzed in the literature: medical facilities are moving towards data-based healthcare, together with its benefits.


The main contribution of this paper is to present an analytical overview of using structured and unstructured data (Big Data) analytics in medical facilities in Poland. Medical facilities use both structured and unstructured data in their practice. Structured data has a predetermined schema; it is extensive and comes in a variety of forms [ 27 ]. In contrast, unstructured data, referred to as Big Data (BD), does not fit into the typical data processing format. Big Data is a massive amount of data sets that cannot be stored, processed, or analyzed using traditional tools. It remains stored but not analyzed. Due to the lack of a well-defined schema, it is difficult to search and analyze such data and, therefore, it requires a specific technology and method to transform it into value [ 20 , 68 ]. Integrating data stored in both structured and unstructured formats can add significant value to an organization [ 27 ]. Organizations must approach unstructured data in a different way. Therefore, the potential is seen in Big Data Analytics (BDA). Big Data Analytics are techniques and tools used to analyze and extract information from Big Data. The results of Big Data analysis can be used to predict the future. They also help in identifying past trends. In healthcare, Big Data Analytics makes it possible to analyze large datasets from thousands of patients, identify clusters and correlations between datasets, and develop predictive models using data mining techniques [ 60 ].

This paper is the first study to consolidate and characterize the use of Big Data from different perspectives. The first part consists of a brief literature review of studies on Big Data (BD) and Big Data Analytics (BDA), while the second part presents results of direct research aimed at diagnosing the use of big data analyses in medical facilities in Poland.

Healthcare is a complex system with varied stakeholders: patients, doctors, hospitals, pharmaceutical companies and healthcare decision-makers. This sector is also limited by strict rules and regulations. However, worldwide one may observe a departure from the traditional doctor-patient approach. The doctor becomes a partner and the patient is involved in the therapeutic process [ 14 ]. Healthcare is no longer focused solely on the treatment of patients. The priority for decision-makers should be to promote proper health attitudes and prevent diseases that can be avoided [ 81 ]. This became visible and important especially during the Covid-19 pandemic [ 44 ].

The next challenge that healthcare will have to face is the growing number of elderly people combined with a decline in fertility. Fertility rates in the country are below the reproductive minimum necessary to keep the population stable [ 10 ]. Both effects, namely the increase in age and the lower fertility rates, are reflected in the demographic load indicator, which is constantly growing. Forecasts show that providing healthcare in the form it is provided today will become impossible in the next 20 years [ 70 ]. This is especially visible now, during the Covid-19 pandemic, when healthcare has faced the challenge of analyzing huge amounts of data, identifying trends, and predicting the spread of the coronavirus. The pandemic showed even more clearly that patients should have access to information about their health condition, the possibility of digital analysis of this data, and access to reliable medical support online. Health monitoring and cooperation with doctors in order to prevent diseases can actually revolutionize the healthcare system. One of the most important aspects of the change necessary in healthcare is putting the patient at the center of the system.

Technology is not enough to achieve these goals. Therefore, changes should be made not only at the technological level but also in the management and design of complete healthcare processes; what is more, they should affect the business models of service providers. The use of Big Data Analytics is becoming more and more common in enterprises [ 17 , 54 ]. However, medical enterprises still cannot keep up with the information needs of patients, clinicians, administrators and policy makers. The adoption of a Big Data approach would allow the implementation of personalized and precise medicine based on personalized information, delivered in real time and tailored to individual patients.

To achieve this goal, it is necessary to implement systems that will be able to learn quickly about the data generated by people within clinical care and everyday life. This will enable data-driven decision making; better personalized predictions about prognosis and responses to treatments; a deeper understanding of the complex factors, and their interactions, that influence health at the level of the patient, the health system and society; enhanced approaches to detecting safety problems with drugs and devices; and more effective methods of comparing prevention, diagnostic, and treatment options [ 40 ].

In the literature, there is a lot of research showing what opportunities big data analysis can offer to companies and what data can be analyzed. However, there are few studies showing how data analysis is performed in healthcare, what data is used by medical facilities, what analyses they carry out, and in which areas. This paper aims to fill this gap by presenting the results of research carried out in medical facilities in Poland. The goal is to analyze the possibilities of using Big Data Analytics in healthcare, especially in Polish conditions. In particular, the paper aims to determine what data is processed by medical facilities in Poland, what analyses they perform and in what areas, and how they assess their analytical maturity. In order to achieve this goal, a critical analysis of the literature was performed, and the direct research was based on a research questionnaire conducted on a sample of 217 medical facilities in Poland. It was hypothesized that medical facilities in Poland are working on both structured and unstructured data and moving towards data-based healthcare and its benefits. Examining the maturity of healthcare facilities in the use of Big Data and Big Data Analytics is crucial in determining the potential future benefits that the healthcare sector can gain from Big Data Analytics. There is also a pressing need to predict whether, in the coming years, healthcare will be able to cope with the threats and challenges it faces.

This paper is divided into eight parts. The first is the introduction, which provides the background and the general problem statement of this research. In the second part, this paper discusses considerations on the use of Big Data and Big Data Analytics in healthcare, and then, in the third part, it moves on to the challenges and potential benefits of using Big Data Analytics in healthcare. The next part explains the proposed method. The results of the direct research and the discussion are presented in the fifth part, while the following part of the paper is the conclusion. The seventh part of the paper presents practical implications. The final section of the paper provides limitations and directions for future research.

Considerations on the use of Big Data and Big Data Analytics in healthcare

In recent years, demand for solutions offering effective analytical tools has been constantly increasing. This trend is also noticeable in the analysis of large volumes of data (Big Data, BD). Organizations are looking for ways to use the power of Big Data to improve their decision making, competitive advantage or business performance [ 7 , 54 ]. Big Data is considered to offer potential solutions to public and private organizations; however, still not much is known about the outcome of the practical use of Big Data in different types of organizations [ 24 ].

As already mentioned, in recent years healthcare management worldwide has been changing from a disease-centered model to a patient-centered model, including within the value-based healthcare delivery model [ 68 ]. In order to meet the requirements of this model and provide effective patient-centered care, it is necessary to manage and analyze healthcare Big Data.

The issue often raised when it comes to the use of data in healthcare is the appropriate use of Big Data. Healthcare has always generated huge amounts of data and nowadays, the introduction of electronic medical records, as well as the huge amount of data sent by various types of sensors or generated by patients in social media causes data streams to constantly grow. Also, the medical industry generates significant amounts of data, including clinical records, medical images, genomic data and health behaviors. Proper use of the data will allow healthcare organizations to support clinical decision-making, disease surveillance, and public health management. The challenge posed by clinical data processing involves not only the quantity of data but also the difficulty in processing it.

In the literature one can find many different definitions of Big Data. This concept has evolved in recent years; however, it is still not clearly understood. Nevertheless, despite the range of and differences between definitions, Big Data can be treated as a large amount of digital data, large data sets, a tool, a technology or a phenomenon (cultural or technological).

Big Data can be considered as massive and continually generated digital datasets that are produced via interactions with online technologies [ 53 ]. Big Data can be defined as datasets that are of such large sizes that they pose challenges in traditional storage and analysis techniques [ 28 ]. A similar opinion about Big Data was presented by Ohlhorst who sees Big Data as extremely large data sets, possible neither to manage nor to analyze with traditional data processing tools [ 57 ]. In his opinion, the bigger the data set, the more difficult it is to gain any value from it.

In turn, Knapp perceived Big Data as tools, processes and procedures that allow an organization to create, manipulate and manage very large data sets and storage facilities [ 38 ]. From this point of view, Big Data is identified as a tool to gather information from different databases and processes, allowing users to manage large amounts of data.

A similar perception of the term ‘Big Data’ is shown by Carter. According to him, Big Data technologies refer to a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data by enabling high velocity capture, discovery and/or analysis [ 13 ].

Jordan combines these two approaches by identifying Big Data as a complex system, as it needs databases in which to store data, programs and tools to manage it, and expertise and personnel able to retrieve useful information and visualize it so that it can be understood [ 37 ].

Following Laney's definition of Big Data, it can be stated that it is a large amount of data generated at very high speed and containing a lot of content [ 43 ]. Such data comes from unstructured sources, such as streams of clicks on the web, social networks (Twitter, blogs, Facebook), video recordings from shops, recordings of calls in a call center, real-time information from various kinds of sensors, RFID, GPS devices, mobile phones and other devices that identify and monitor something [ 8 ]. Big Data is a powerful silo of raw digital data, collected from all sorts of sources, unstructured and difficult, or even impossible, to analyze using the conventional techniques applied so far to relational databases.

When describing Big Data, it cannot be overlooked that the term refers more to a phenomenon than to a specific technology. Therefore, instead of trying to define this phenomenon, more and more authors describe Big Data through its characteristics, a collection of V’s related to its nature [ 2 , 3 , 23 , 25 , 58 ]:

Volume (refers to the amount of data and is one of the biggest challenges in Big Data Analytics),

Velocity (speed with which new data is generated, the challenge is to be able to manage data effectively and in real time),

Variety (heterogeneity of data, many different types of healthcare data, the challenge is to derive insights by looking at all available heterogeneous data in a holistic manner),

Variability (inconsistency of data, the challenge is to correctly interpret data that can vary significantly depending on the context),

Veracity (how trustworthy the data is, quality of the data),

Visualization (ability to interpret data and resulting insights, challenging for Big Data due to its other features as described above),

Value (the goal of Big Data Analytics is to discover the hidden knowledge from huge amounts of data).

Big Data is defined as an information asset with high volume, velocity, and variety, which requires specific technology and methods for its transformation into value [ 21 , 77 ]. Big Data is also a collection of information of high volume, high volatility or high diversity, requiring new forms of processing in order to support decision-making, the discovery of new phenomena and process optimization [ 5 , 7 ]. Big Data is too large for traditional data-processing systems and software tools to capture, store, manage and analyze; therefore it requires new technologies [ 28 , 50 , 61 ] to manage (capture, aggregate, process) its volume, velocity and variety [ 9 ].

Undoubtedly, Big Data differs from the data sources used so far by organizations. Therefore, organizations must approach this type of unstructured data in a different way. First of all, organizations must start to see data as flows and not stocks; this entails the need to implement so-called streaming analytics [ 48 ]. The features mentioned above make it necessary to use new IT tools that allow the fullest use of new data [ 58 ]. The Big Data idea, inseparable from the huge increase in data available to various organizations or individuals, creates opportunities for access to valuable analyses and conclusions, and enables more accurate decisions [ 6 , 11 , 59 ].

The Big Data concept is constantly evolving and currently it does not focus on huge amounts of data, but rather on the process of creating value from this data [ 52 ]. Big Data is collected from various sources that have different data properties and are processed by different organizational units, resulting in creation of a Big Data chain [ 36 ]. The aim of the organizations is to manage, process and analyze Big Data. In the healthcare sector, Big Data streams consist of various types of data, namely [ 8 , 51 ]:

clinical data, i.e. data obtained from electronic medical records, data from hospital information systems, image centers, laboratories, pharmacies and other organizations providing health services, patient generated health data, physician’s free-text notes, genomic data, physiological monitoring data [ 4 ],

biometric data provided from various types of devices that monitor weight, pressure, glucose level, etc.,

financial data, constituting a full record of economic operations reflecting the conducted activity,

data from scientific research activities, i.e. results of research, including drug research, design of medical devices and new methods of treatment,

data provided by patients, including description of preferences, level of satisfaction, information from systems for self-monitoring of their activity: exercises, sleep, meals consumed, etc.,

data from social media.

These data are provided not only by patients but also by organizations and institutions, as well as by various types of monitoring devices, sensors or instruments [ 16 ]. Data that has been generated so far in the healthcare sector is stored in both paper and digital form. Thus, the essence and the specificity of the process of Big Data analyses means that organizations need to face new technological and organizational challenges [ 67 ]. The healthcare sector has always generated huge amounts of data and this is connected, among others, with the need to store medical records of patients. However, the problem with Big Data in healthcare lies not only in its overwhelming volume but also in its unprecedented diversity in terms of types, data formats and the speed with which it should be analyzed in order to provide the necessary information on an ongoing basis [ 3 ]. It is also difficult to apply traditional tools and methods to the management of unstructured data [ 67 ]. Due to the diversity and quantity of data sources that are growing all the time, advanced analytical tools and technologies, as well as Big Data analysis methods which can meet and exceed the possibilities of managing healthcare data, are needed [ 3 , 68 ].

Therefore, the potential is seen in Big Data analyses, especially in the aspect of improving the quality of medical care, saving lives or reducing costs [ 30 ]. Extracting association rules, patterns and trends from this tangle of data will allow health service providers and other stakeholders in the healthcare sector to offer more accurate and more insightful diagnoses of patients, personalized treatment, monitoring of patients, preventive medicine, support for medical research and population health, as well as better quality of medical services and patient care while, at the same time, reducing costs (Fig.  1 ).

Fig. 1 Healthcare Big Data Analytics applications (Source: Own elaboration)

The main challenge with Big Data is how to handle such a large amount of information and use it to make data-driven decisions in many areas [ 64 ]. In the context of healthcare data, another major challenge is to adapt big data storage, analysis, presentation of analysis results and inference based on them to a clinical setting. Data analytics systems implemented in healthcare are designed to describe, integrate and present complex data in an appropriate way so that it can be understood better (Fig.  2 ). This would improve the efficiency of acquiring, storing, analyzing and visualizing big data from healthcare [ 71 ].

Fig. 2 Process of Big Data Analytics

The result of data processing with the use of Big Data Analytics is appropriate data storytelling, which may contribute to making decisions that carry lower risk and are supported by data. This, in turn, can benefit healthcare stakeholders. To take advantage of the potential of the massive amounts of data in healthcare and to ensure that the right intervention for the right patient is properly timed, personalized, and potentially beneficial to all components of the healthcare system such as the payer, patient, and management, analytics of large datasets must connect the communities involved in data analytics and healthcare informatics [ 49 ]. Big Data Analytics can provide insight into clinical data and thus facilitate informed decision-making about the diagnosis and treatment of patients, the prevention of diseases and other matters. Big Data Analytics can also improve the efficiency of healthcare organizations by realizing the potential of data [ 3 , 62 ].

Big Data Analytics in medicine and healthcare refers to the integration and analysis of a large amount of complex heterogeneous data, such as various omics (genomics, epigenomics, transcriptomics, proteomics, metabolomics, interactomics, pharmacogenetics, diseasomics), biomedical data, telemedicine data (sensors, medical equipment data) and electronic health records data [ 46 , 65 ].

When analyzing the phenomenon of Big Data in the healthcare sector, it should be noted that it can be considered from the point of view of three areas: epidemiological, clinical and business.

From a clinical point of view, Big Data analysis aims to improve the health and condition of patients, and to enable long-term predictions about their health status and the implementation of appropriate therapeutic procedures. Ultimately, the use of data analysis in medicine is to allow the adaptation of therapy to a specific patient, that is, personalized (precision) medicine.

From an epidemiological point of view, it is desirable to obtain an accurate prognosis of morbidity in order to implement preventive programs in advance.

In the business context, Big Data analysis may enable offering personalized packages of commercial services or determining the probability of individual disease and infection occurrence. It is worth noting that Big Data means not only the collection and processing of data but, most of all, the inference and visualization of data necessary to obtain specific business benefits.

In order to introduce new management methods and new solutions in terms of effectiveness and transparency, it becomes necessary to make data more accessible, digital, searchable, as well as analyzed and visualized.

Erickson and Rothberg state that the information and data do not reveal their full value until insights are drawn from them. Data becomes useful when it enhances decision making and decision making is enhanced only when analytical techniques are used and an element of human interaction is applied [ 22 ].

Thus, healthcare has experienced much progress in the usage and analysis of data. Large-scale digitalization and transparency in this sector is a key element of the policies of almost all national governments. For centuries, the treatment of patients was based on the judgment of doctors who made treatment decisions. In recent years, however, Evidence-Based Medicine has become more and more important, as it is based on the systematic analysis of clinical data and on treatment decision-making that uses the best available information [ 42 ]. In the healthcare sector, Big Data Analytics is expected to improve the quality of life and reduce operational costs [ 72 , 82 ]. Big Data Analytics enables organizations to improve and increase their understanding of the information contained in data. It also helps identify data that provides valuable insights for current as well as future decisions [ 28 ].

Big Data Analytics refers to technologies that are grounded mostly in data mining: text mining, web mining, process mining, audio and video analytics, statistical analysis, network analytics, social media analytics and web analytics [ 16 , 25 , 31 ]. Different data mining techniques can be applied on heterogeneous healthcare data sets, such as: anomaly detection, clustering, classification, association rules as well as summarization and visualization of those Big Data sets [ 65 ]. Modern data analytics techniques explore and leverage unique data characteristics even from high-speed data streams and sensor data [ 15 , 16 , 31 , 55 ]. Big Data can be used, for example, for better diagnosis in the context of comprehensive patient data, disease prevention and telemedicine (in particular when using real-time alerts for immediate care), monitoring patients at home, preventing unnecessary hospital visits, integrating medical imaging for a wider diagnosis, creating predictive analytics, reducing fraud and improving data security, better strategic planning and increasing patients’ involvement in their own health.

Big Data Analytics in healthcare can be divided into [ 33 , 73 , 74 ]:

descriptive analytics: used to understand past and current healthcare decisions, converting data into useful information for understanding and analyzing healthcare decisions, outcomes and quality, as well as for making informed decisions [ 33 ]. It can be used to create reports (e.g. about patients’ hospitalizations, physicians’ performance, utilization management), visualizations, customized reports and drill-down tables, or to run queries on the basis of historical data.

predictive analytics: operates on past performance in an effort to predict the future by examining historical or summarized health data, detecting patterns of relationships in these data, and then extrapolating these relationships to produce a forecast. It can be used, for example, to predict the response of different patient groups to different drugs (dosages) or reactions (clinical trials), to anticipate risk, to find relationships in health data and to detect hidden patterns [ 62 ]. In this way, it is possible to predict the spread of an epidemic, anticipate service contracts and plan healthcare resources. Predictive analytics is used to support proper diagnosis and the selection of appropriate treatments for patients suffering from certain diseases [ 39 ].

prescriptive analytics: applies when health problems involve too many choices or alternatives. It uses health and medical knowledge in addition to data or information. Prescriptive analytics is used in many areas of healthcare, including drug prescriptions and treatment alternatives. Personalized medicine and evidence-based medicine are both supported by prescriptive analytics.

discovery analytics: utilizes knowledge about knowledge to discover new “inventions” such as drugs (drug discovery), previously unknown diseases and medical conditions, alternative treatments, etc.
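The difference between the first two categories can be illustrated with a minimal sketch. It assumes a hypothetical series of monthly admission counts (the numbers are invented for illustration and do not come from the study) and uses a plain least-squares trend line as a stand-in for a predictive model; real predictive analytics in healthcare would typically rely on much richer data and models.

```python
from statistics import mean

# Hypothetical monthly admission counts for one ward (illustrative only).
admissions = [120, 126, 131, 139, 144, 150]


def describe(series):
    """Descriptive analytics: summarize what has already happened."""
    return {"mean": mean(series), "min": min(series), "max": max(series)}


def fit_linear_trend(series):
    """Fit an ordinary least-squares line through (0, y0), (1, y1), ..."""
    xs = list(range(len(series)))
    x_bar, y_bar = mean(xs), mean(series)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast_next(series):
    """Predictive analytics: extrapolate the fitted trend one step ahead."""
    slope, intercept = fit_linear_trend(series)
    return intercept + slope * len(series)


summary = describe(admissions)
next_month = forecast_next(admissions)
print(summary)
print(round(next_month, 1))
```

Here `describe` only reports on historical data, whereas `forecast_next` extrapolates a detected relationship into the future, which mirrors the distinction drawn in the list above.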

Although the models and tools used in descriptive, predictive, prescriptive, and discovery analytics are different, many applications involve all four of them [ 62 ]. Big Data Analytics in healthcare can help enable personalized medicine by identifying optimal patient-specific treatments. This can improve standards of living, reduce waste of healthcare resources and lower the costs of healthcare [ 56 , 63 , 71 ]. The introduction of large data analysis gives new analytical possibilities in terms of scope, flexibility and visualization. Techniques such as data mining (the computational process of discovering patterns in large data sets) facilitate inductive reasoning and exploratory data analysis, enabling scientists to identify data patterns that are independent of specific hypotheses. As a result, predictive analysis and real-time analysis become possible, making it easier for medical staff to start early treatments and reduce potential morbidity and mortality. In addition, document analysis, statistical modeling, discovering patterns and topics in document collections and data in the EHR, as well as an inductive approach can help identify and discover relationships between health phenomena.

Advanced analytical techniques can be applied to the large amount of existing (but not yet analyzed) data on patient health and related medical data to achieve a better understanding of the information and results obtained, as well as to design optimal clinical pathways [ 62 ]. Big Data Analytics in healthcare integrates analysis from several scientific areas such as bioinformatics, medical imaging, sensor informatics, medical informatics and health informatics [ 65 ]. Big Data Analytics in healthcare makes it possible to analyze large datasets from thousands of patients, identifying clusters and correlations between datasets, as well as developing predictive models using data mining techniques [ 65 ]. Discussing all the techniques used for Big Data Analytics goes beyond the scope of a single article [ 25 ].

The success of Big Data analysis and its accuracy depend heavily on the tools and techniques used for the analysis and on their ability to provide reliable, up-to-date and meaningful information to various stakeholders [ 12 ]. It is believed that the implementation of big data analytics by healthcare organizations could bring many benefits in the upcoming years, including lowering health care costs, better diagnosis and prediction of diseases and their spread, improving patient care and developing protocols to prevent re-hospitalization, optimizing staff, optimizing equipment, forecasting the need for hospital beds, operating rooms, treatments, and improving the drug supply chain [ 71 ].

Challenges and potential benefits of using Big Data Analytics in healthcare

Modern analytics makes it possible not only to gain insight into historical data, but also to obtain the information necessary to generate insight into what may happen in the future, even when it comes to the prediction of evidence-based actions. The emphasis on reform has prompted payers and suppliers to pursue data analysis to reduce risk, detect fraud, improve efficiency and save lives. Everyone (payers, providers, even patients) is focusing on doing more with fewer resources. Thus, enhanced data and analytics can yield the greatest results in areas involving various healthcare stakeholders (Table 1 ).

Healthcare organizations see the opportunity to grow through investments in Big Data Analytics. In recent years, by collecting medical data of patients, converting them into Big Data and applying appropriate algorithms, reliable information has been generated that helps patients, physicians and stakeholders in the health sector to identify values and opportunities [ 31 ]. It is worth noting that there are many changes and challenges in the structure of the healthcare sector. Digitization and effective use of Big Data in healthcare can bring benefits to every stakeholder in this sector. A single doctor would benefit the same as the entire healthcare system. Potential opportunities to achieve benefits and effects from Big Data in healthcare can be divided into four groups [ 8 ]:

Improving the quality of healthcare services:

assessment of diagnoses made by doctors and the manner of treatment of diseases indicated by them based on the decision support system working on Big Data collections,

detection of more effective, from a medical point of view, and more cost-effective ways to diagnose and treat patients,

analysis of large volumes of data to reach practical information useful for identifying needs, introducing new health services, preventing and overcoming crises,

prediction of the incidence of diseases,

detecting trends that lead to an improvement in health and lifestyle of the society,

analysis of the human genome for the introduction of personalized treatment.

Supporting the work of medical personnel

doctors’ comparison of current medical cases to cases from the past for better diagnosis and treatment adjustment,

detection of diseases at earlier stages when they can be more easily and quickly cured,

detecting epidemiological risks and improving control of pathogenic spots and reaction rates,

identification of patients who are predicted to have the highest risk of specific, life-threatening diseases by collating data on the history of the most common diseases in treated patients with reports submitted to insurance companies,

health management of each patient individually (personalized medicine) and health management of the whole society,

capturing and analyzing, in real time, large amounts of data from life-monitoring devices in hospitals and homes in order to monitor safety and predict adverse events,

analysis of patient profiles to identify people for whom prevention, lifestyle change or a preventive care approach should be applied,

the ability to predict the occurrence of specific diseases or worsening of patients’ results,

predicting disease progression and its determinants, estimating the risk of complications,

detecting drug interactions and their side effects.

Supporting scientific and research activity

supporting work on new drugs and clinical trials thanks to the possibility of analyzing “all data” instead of selecting a test sample,

the ability to identify patients with specific, biological features that will take part in specialized clinical trials,

selecting a group of patients for which the tested drug is likely to have the desired effect and no side effects,

using modeling and predictive analysis to design better drugs and devices.

Business and management

reduction of costs and counteracting abuse and counseling practices,

faster and more effective identification of incorrect or unauthorized financial operations in order to prevent abuse and eliminate errors,

increase in profitability by detecting patients generating high costs or identifying doctors whose work, procedures and treatment methods cost the most and offering them solutions that reduce the amount of money spent,

identification of unnecessary medical activities and procedures, e.g. duplicate tests.

According to research conducted by Wang, Kung and Byrd, Big Data Analytics benefits can be classified into five categories: IT infrastructure benefits (reducing system redundancy, avoiding unnecessary IT costs, transferring data quickly among healthcare IT systems, better use of healthcare systems, processing standardization among various healthcare IT systems, reducing IT maintenance costs regarding data storage), operational benefits (improving the quality and accuracy of clinical decisions, processing a large number of health records in seconds, reducing the time of patient travel, immediate access to clinical data to analyze, shortening the time of diagnostic tests, reductions in surgery-related hospitalizations, exploring inconceivable new research avenues), organizational benefits (detecting interoperability problems much more quickly than traditional manual methods, improving cross-functional communication and collaboration among administrative staff, researchers, clinicians and IT staff, enabling data sharing with other institutions and adding new services, content sources and research partners), managerial benefits (gaining quick insights about changing healthcare trends in the market, providing members of the board and heads of department with sound decision-support information on the daily clinical setting, optimizing business growth-related decisions) and strategic benefits (providing a big picture view of treatment delivery for meeting future needs, creating highly competitive healthcare services) [ 73 ].

The above specification does not constitute a full list of potential areas of use of Big Data Analysis in healthcare because the possibilities of using analysis are practically unlimited. In addition, advanced analytical tools make it possible to analyze data from all possible sources and conduct cross-analyses to provide better data insights [ 26 ]. For example, a cross-analysis can combine patient characteristics with costs and care results to help identify the treatment or treatments that are best in medical terms and most cost-effective, which may allow a better adjustment of the service provider’s offer [ 62 ].

In turn, the analysis of patient profiles (e.g. segmentation and predictive modeling) allows identification of people who should be subject to prophylaxis, prevention or should change their lifestyle [ 8 ]. A shortened list of the benefits of Big Data Analytics in healthcare is presented in [ 3 ] and consists of: better performance, day-to-day guides, detection of diseases in early stages, making predictive analytics, cost effectiveness, Evidence Based Medicine and effectiveness in patient treatment.

Summarizing, healthcare big data represents a huge potential for the transformation of healthcare: improvement of patients’ results, prediction of outbreaks of epidemics, valuable insights, avoidance of preventable diseases, reduction of the cost of healthcare delivery and improvement of the quality of life in general [ 1 ]. Big Data also generates many challenges such as difficulties in data capture, data storage, data analysis and data visualization [ 15 ]. The main challenges are connected with the issues of: data structure (Big Data should be user-friendly, transparent, and menu-driven but it is fragmented, dispersed, rarely standardized and difficult to aggregate and analyze), security (data security, privacy and sensitivity of healthcare data, there are significant concerns related to confidentiality), data standardization (data is stored in formats that are not compatible with all applications and technologies), storage and transfers (especially costs associated with securing, storing, and transferring unstructured data), managerial skills, such as data governance, lack of appropriate analytical skills and problems with Real-Time Analytics (health care is to be able to utilize Big Data in real time) [ 4 , 34 , 41 ].

The research is based on a critical analysis of the literature, as well as the presentation of selected results of direct research on the use of Big Data Analytics in medical facilities in Poland.

Presented research results are part of a larger questionnaire form on Big Data Analytics. The direct research was based on an interview questionnaire which contained 100 questions with 5-point Likert scale (1—strongly disagree, 2—I rather disagree, 3—I do not agree, nor disagree, 4—I rather agree, 5—I definitely agree) and 4 metrics questions. The study was conducted in December 2018 on a sample of 217 medical facilities (110 private, 107 public). The research was conducted by a specialized market research agency: Center for Research and Expertise of the University of Economics in Katowice.

When it comes to direct research, the selected entities included entities financed from public sources, i.e. the National Health Fund (23.5%), and entities operating commercially (11.5%). In the surveyed group of entities, more than half (64.9%) are hybrid financed, from both public and commercial sources. The diversity of the research sample also applies to the size of the entities, defined by the number of employees. Taking into account the proportions of the surveyed entities, it should be noted that medium-sized entities (10–50 employees; 34% of the sample) and large entities (51–250 employees; 27%) dominate the sector structure. The research was nationwide, and the entities included in the research sample come from all of the voivodeships. The largest groups were entities from the Łódzkie (32%), Śląskie (18%) and Mazowieckie (18%) voivodeships, as these voivodeships have the largest number of medical institutions. Other regions of the country were represented by single units. The research sample was selected by stratified random sampling. Within the database of medical facilities, groups of private and public medical facilities were identified, and the facilities to which the questionnaire was addressed were drawn from each of these groups. The analyses were performed using the GNU PSPP 0.10.2 software.
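The sampling procedure described above, in which facilities are first grouped into private and public strata and then drawn at random from each group, can be sketched as follows. This is only an illustration under stated assumptions: the sampling frame, its size, the `ownership` field and the fixed seed are hypothetical, since the text does not describe the underlying database; only the stratum sizes (110 private, 107 public) are taken from the study.

```python
import random


def stratified_sample(frame, strata_key, sizes, seed=0):
    """Draw a simple random sample of the requested size from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = []
    for stratum, n in sizes.items():
        members = [unit for unit in frame if unit[strata_key] == stratum]
        sample.extend(rng.sample(members, n))  # sampling without replacement
    return sample


# Hypothetical sampling frame; the real facility database is not described.
frame = (
    [{"id": f"priv-{i}", "ownership": "private"} for i in range(500)]
    + [{"id": f"pub-{i}", "ownership": "public"} for i in range(400)]
)

# Stratum sizes reported in the study: 110 private and 107 public facilities.
sample = stratified_sample(frame, "ownership", {"private": 110, "public": 107})
print(len(sample))
```

Drawing within each stratum separately guarantees that both ownership types are represented in the intended proportions, which a single simple random draw from the whole frame would not ensure.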

The aim of the study was to determine whether medical facilities in Poland use Big Data Analytics and, if so, in which areas. The characteristics of the research sample are presented in Table 2 .

The research is non-exhaustive due to the incomplete and uneven regional distribution of the samples, overrepresented in three voivodeships (Łódzkie, Mazowieckie and Śląskie). The size of the research sample (217 entities) allows the authors of the paper to formulate specific conclusions on the use of Big Data in the process of its management.

For the purpose of this paper, the following research hypotheses were formulated: (1) medical facilities in Poland are working on both structured and unstructured data; (2) medical facilities in Poland are moving towards data-based healthcare and its benefits.

The paper poses the following research questions and statements that coincide with the selected questions from the research questionnaire:

What types of data are used by the particular organization, whether structured or unstructured, and to what extent?

From what sources do medical facilities obtain data?

In which areas (clinical or business) do organizations use data and analytical systems?

Is data analytics performed based on historical data or are predictive analyses also performed?

Do administrative and medical staff receive complete, accurate and reliable data in a timely manner?

Are real-time analyses performed to support the particular organization’s activities?

Results and discussion

On the basis of the literature analysis and research study, a set of questions and statements related to the researched area was formulated. The survey results show that medical facilities use a variety of data sources in their operations, covering both structured and unstructured data (Table 3).

According to the respondents, almost half of the medical institutions (47.58%) rather agreed with the first statement in the questionnaire, that they collect and use structured data (e.g. databases and data warehouses, reports to external entities), and 10.57% strongly agreed. A further 23.35% of representatives of medical institutions answered “neither agree nor disagree”. Of the remaining facilities, 7.93% rather disagreed and 6.17% strongly disagreed with the first statement. The median of the responses (4) also indicates that medical facilities in Poland collect and use structured data (Table 4).

In turn, 28.19% of the medical institutions rather agreed that they collect and use unstructured data and 9.25% strongly agreed. The share of representatives who answered “neither agree nor disagree” was 27.31%. Of the remaining facilities, 17.18% rather disagreed and 13.66% strongly disagreed with this statement. For unstructured data the median is 3, which means that medical facilities in Poland collect and use this type of data to a lesser extent.
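
The two medians quoted above can be checked directly from the reported response shares. The sketch below is a minimal illustration: the function name is hypothetical, and because the reported shares sum to roughly 95.6% rather than 100%, it normalises by the reported total, which is an assumption about how the remainder should be handled.

```python
def median_from_shares(shares):
    """Return the median category of an ordinal scale given {category: share}.

    Shares are normalised by their own total, so they need not sum to 100.
    The first category whose cumulative share reaches half the total is returned.
    """
    total = sum(shares.values())
    cumulative = 0.0
    for category in sorted(shares):
        cumulative += shares[category]
        if cumulative >= total / 2:
            return category
    raise ValueError("empty distribution")


# Percentages reported in the text; 1 = strongly disagree ... 5 = strongly agree.
structured = {1: 6.17, 2: 7.93, 3: 23.35, 4: 47.58, 5: 10.57}
unstructured = {1: 13.66, 2: 17.18, 3: 27.31, 4: 28.19, 5: 9.25}

print(median_from_shares(structured))    # 4
print(median_from_shares(unstructured))  # 3
```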

Next, it was checked whether the size of a medical facility and its form of ownership affect whether it analyzes unstructured data (Tables 4 and 5). To this end, correlation coefficients were calculated.

The calculations show a weak but statistically significant monotonic correlation between the size of a medical facility and its collection and use of structured data (p < 0.001; τ = 0.16). This means that the use of structured data increases slightly in larger medical facilities. Facility size matters more for the use of unstructured data (p < 0.001; τ = 0.23) (Table 4).
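
For readers unfamiliar with the coefficient, the following sketch shows how a Kendall rank correlation between facility size and a Likert answer could be computed. It assumes the tau-b variant (the text does not say which variant was used), and the data are hypothetical, not taken from the survey.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences, with tie correction."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both denominator terms
        elif dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical data: facility size class (1 = smallest) and a 1-5 Likert answer.
size_class = [1, 1, 2, 2, 3, 3, 4, 4]
likert = [2, 3, 3, 3, 4, 3, 4, 5]
print(round(kendall_tau_b(size_class, likert), 2))
```

A value near 0 indicates no monotonic association and a value near 1 a strong positive one, so coefficients such as 0.16 or 0.23 describe a weak positive relationship.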

To determine whether the form of medical facility ownership affects data collection, the Mann–Whitney U test was used. The calculations show that the form of ownership does not affect what data the organization collects and uses (Table 5 ).
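
The statistic behind the Mann–Whitney U test can be illustrated with a short sketch. The answers below are hypothetical and the function name is an assumption; the sketch computes only the U statistic, while the significance level for heavily tied Likert data would come from a statistics package such as the GNU PSPP software the authors used.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b): pairwise 'wins' of each group, ties counted as 0.5."""
    u_a = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )
    # The two statistics always sum to the number of cross-group pairs.
    return u_a, len(a) * len(b) - u_a


# Hypothetical 1-5 Likert answers from two ownership groups.
public = [4, 5, 3, 4, 2]
private = [3, 2, 4, 1, 3]

u_public, u_private = mann_whitney_u(public, private)
print(u_public, u_private)  # 18.5 6.5
```

When the two groups answer similarly, both U values are close to half the number of pairs (here 12.5); a large imbalance suggests that one group tends to give higher answers.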

Detailed information on the sources from which medical facilities collect and use data is presented in Table 6.

The questionnaire results show that medical facilities mainly use information published in databases, reports to external units and transaction data, but they also use unstructured data from e-mails, medical devices, sensors, phone calls, and audio and video data (Table 6). Data from social media, RFID and geolocation data are used to a small extent. Similar findings are reported in the literature.

According to the respondents’ answers, more than half of the medical facilities have implemented an integrated hospital system (HIS): 43.61% use it and 16.30% use it extensively (Table 7), while 19.38% of the examined medical facilities do not use it at all. Moreover, most of the examined medical facilities (34.80% use it, 32.16% use it extensively) keep medical documentation in electronic form, which creates an opportunity to use data analytics. Only 4.85% of medical facilities do not use it at all.

Further questions to be investigated were whether medical facilities in Poland use data analytics and, if so, in what form and in what areas (Table 8). The respondents’ answers about the potential of data analytics show that a similar share of medical facilities use data analytics in administration and business (31.72% agreed with statement no. 5 and 12.33% strongly agreed) as in the clinical area (33.04% agreed with statement no. 6 and 12.33% strongly agreed). Regarding decision-making, 35.24% agree with the statement “the organization uses data and analytical systems to support business decisions” and 8.37% strongly agree. As many as 40.09% agree with the statement that “the organization uses data and analytical systems to support clinical decisions (in the field of diagnostics and therapy)” and 15.42% strongly agree. The examined medical facilities use both analytics based on historical data (33.48% agree with statement no. 7 and 12.78% strongly agree) and predictive analytics (33.04% agree with statement no. 8 and 15.86% strongly agree). Detailed results are presented in Table 8.

Medical facilities focus on development in the field of data processing: they confirm that they conduct analytical planning processes systematically and analyze new opportunities for the strategic use of analytics in business and clinical activities (38.33% rather agree and 10.57% strongly agree with this statement). The picture is less optimistic for real-time data analysis: only 28.19% rather agree and 14.10% strongly agree with the statement that real-time analyses are performed to support an organization’s activities.

Based on the mean values and the Mann–Whitney U test, a facility’s performance in the clinical area depends on the form of ownership. A higher degree of use of analyses in the clinical area can be observed in public institutions.

Whether a medical facility performs descriptive or predictive analyses does not depend on the form of ownership (p > 0.05). The mean and median are higher in public facilities than in private ones. What is more, the Mann–Whitney U test shows that these variables are dependent on each other (p < 0.05) (Table 9).

Kendall’s tau (τ) shows that a facility’s performance in the clinical area depends on its size (p < 0.001; τ = 0.22); the correlation is weak but statistically significant. This means that the use of data and analytical systems to support clinical decisions (in the field of diagnostics and therapy) increases with the size of the medical facility. A similar but even weaker relationship can be found for the use of descriptive and predictive analyses (Table 10).

As regards the analytical maturity of medical facilities, 8.81% stated that they are at the first level of maturity, i.e. the organization has not developed analytical skills and does not perform analyses. A further 13.66% confirmed that they have poor analytical skills, while 38.33% placed themselves at level 3, meaning that “there is a lot to do in analytics”. On the other hand, 28.19% believe that their analytical capabilities are well developed and 6.61% stated that analytics is at the highest level and their analytical capabilities are very well developed. Detailed data are presented in Table 11. The mean is 3.11 and the median is 3.
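
The reported mean maturity level can be reproduced from the shares given above. The sketch below is a worked check rather than the authors' calculation: the function name is hypothetical, and since the five shares sum to about 95.6%, the weighted mean is divided by the reported total, which is an assumption about the missing remainder.

```python
def mean_from_shares(shares):
    """Weighted mean of ordinal levels given {level: share}, normalised by the total."""
    total = sum(shares.values())
    return sum(level * share for level, share in shares.items()) / total


# Shares (%) of facilities at maturity levels 1 (lowest) to 5 (highest), as reported.
maturity = {1: 8.81, 2: 13.66, 3: 38.33, 4: 28.19, 5: 6.61}

print(round(mean_from_shares(maturity), 2))  # 3.11
```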

The results of the research allow the following conclusions to be formulated. Medical facilities in Poland are working on both structured and unstructured data. These data come from databases, transactions, the unstructured content of emails and documents, devices and sensors. However, data from social media are used less. Facilities use analytics in the administrative and business area as well as in the clinical area, and the decisions made are largely data-driven.

In summary, the analysis of the literature shows that the benefits medical facilities can gain from using Big Data Analytics relate primarily to patients, physicians and the facilities themselves. Patients will be better informed, will receive treatments that work for them, will be prescribed medications that work for them, and will not be given unnecessary medications [78]. Physicians’ roles will likely shift from decision maker towards consultant: they will advise, warn and help individual patients, and have more time to form positive and lasting relationships with them. Medical facilities will see changes as well, for example fewer unnecessary hospitalizations, resulting initially in less revenue but, after the market adjusts, also in accomplishment [78]. The use of Big Data Analytics can revolutionize the way healthcare is practiced for better health and disease reduction.

The analysis of the latest data reveals that data analytics increases the accuracy of diagnoses. Physicians can use predictive algorithms to help them make more accurate diagnoses [45]. Moreover, it could be helpful in preventive medicine and public health because, with early intervention, many diseases can be prevented or ameliorated [29]. Predictive analytics also makes it possible to identify risk factors for a given patient; with this knowledge, patients will be able to change their lives, which in turn may dramatically change population disease patterns and result in savings in medical costs. Moreover, personalized medicine is the best solution for an individual patient seeking treatment, as it can help doctors decide on the exact treatments for those individuals. Better diagnoses and more targeted treatments will naturally lead to more good outcomes and fewer resources used, including doctors’ time.

The quantitative analysis presented in this article made it possible to determine whether medical facilities in Poland use Big Data Analytics and, if so, in which areas. The results support the following conclusions. Medical facilities are working on both structured and unstructured data, which come from databases, transactions, the unstructured content of emails and documents, devices and sensors. They use analytics in the administrative and business area as well as in the clinical area. The results show that the decisions made are largely data-driven and confirm what has been reported in the literature. Medical facilities are moving towards data-based healthcare and its benefits.

In conclusion, Big Data Analytics has the potential for positive impact and global implications in healthcare. Future research on the use of Big Data in medical facilities will concern the strategies adopted by medical facilities to promote and implement such solutions, the benefits they gain from Big Data analysis, and how they see the prospects in this area.

Practical implications

This work sought to narrow the gap in analyzing the possibility of using Big Data Analytics in healthcare. Showing how medical facilities in Poland are doing in this respect contributes to the global research carried out in this area, including [29, 32, 60].

Limitations and future directions

The research described in this article does not exhaust the questions related to the use of Big Data Analytics in Polish healthcare facilities. Only some of the dimensions characterizing the use of data by medical facilities in Poland have been examined. To get the full picture, it would be necessary to examine the results of using structured and unstructured data analytics in healthcare. Future research may examine the benefits that medical institutions achieve from analyzing structured and unstructured data in the clinical and management areas, and the limitations they encounter in these areas. For this purpose, in-depth interviews with selected medical facilities in Poland are planned. These facilities could provide additional data for empirical analyses that draw more on their suggestions. Further research should also include medical institutions outside Poland, enabling international comparative analyses.

Future research in the healthcare field has virtually endless possibilities. These include using Big Data Analytics to diagnose specific conditions [47, 66, 69, 76], proposing an approach that can be used in other healthcare applications, and creating mechanisms to identify “patients like me” [75, 80]. Big Data Analytics could also be used for studies related to the spread of pandemics and the efficacy of COVID-19 treatment [18, 79], or for psychology and psychiatry studies, e.g. emotion recognition [35].

Availability of data and materials

The datasets for this study are available on request to the corresponding author.

Abouelmehdi K, Beni-Hessane A, Khaloufi H. Big healthcare data: preserving security and privacy. J Big Data. 2018. https://doi.org/10.1186/s40537-017-0110-7 .

Agrawal A, Choudhary A. Health services data: big data analytics for deriving predictive healthcare insights. Health Serv Eval. 2019. https://doi.org/10.1007/978-1-4899-7673-4_2-1 .

Al Mayahi S, Al-Badi A, Tarhini A. Exploring the potential benefits of big data analytics in providing smart healthcare. In: Miraz MH, Excell P, Ware A, Ali M, Soomro S, editors. Emerging technologies in computing—first international conference, iCETiC 2018, proceedings (Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST). Cham: Springer; 2018. p. 247–58. https://doi.org/10.1007/978-3-319-95450-9_21 .

Bainbridge M. Big data challenges for clinical and precision medicine. In: Househ M, Kushniruk A, Borycki E, editors. Big data, big challenges: a healthcare perspective: background, issues, solutions and research directions. Cham: Springer; 2019. p. 17–31.

Bartuś K, Batko K, Lorek P. Business intelligence systems: barriers during implementation. In: Jabłoński M, editor. Strategic performance management new concept and contemporary trends. New York: Nova Science Publishers; 2017. p. 299–327. ISBN: 978-1-53612-681-5.

Bartuś K, Batko K, Lorek P. Diagnoza wykorzystania big data w organizacjach-wybrane wyniki badań. Informatyka Ekonomiczna. 2017;3(45):9–20.

Bartuś K, Batko K, Lorek P. Wykorzystanie rozwiązań business intelligence, competitive intelligence i big data w przedsiębiorstwach województwa śląskiego. Przegląd Organizacji. 2018;2:33–9.

Batko K. Możliwości wykorzystania Big Data w ochronie zdrowia. Roczniki Kolegium Analiz Ekonomicznych. 2016;42:267–82.

Bi Z, Cochran D. Big data analytics with applications. J Manag Anal. 2014;1(4):249–65. https://doi.org/10.1080/23270012.2014.992985 .

Boerma T, Requejo J, Victora CG, Amouzou A, Asha G, Agyepong I, Borghi J. Countdown to 2030: tracking progress towards universal coverage for reproductive, maternal, newborn, and child health. Lancet. 2018;391(10129):1538–48.

Bollier D, Firestone CM. The promise and peril of big data. Washington, D.C: Aspen Institute, Communications and Society Program; 2010. p. 1–66.

Bose R. Competitive intelligence process and tools for intelligence analysis. Ind Manag Data Syst. 2008;108(4):510–28.

Carter P. Big data analytics: future architectures, skills and roadmaps for the CIO: in white paper, IDC sponsored by SAS. 2011. p. 1–16.

Castro EM, Van Regenmortel T, Vanhaecht K, Sermeus W, Van Hecke A. Patient empowerment, patient participation and patient-centeredness in hospital care: a concept analysis based on a literature review. Patient Educ Couns. 2016;99(12):1923–39.

Chen H, Chiang RH, Storey VC. Business intelligence and analytics: from big data to big impact. MIS Q. 2012;36(4):1165–88.

Chen CP, Zhang CY. Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf Sci. 2014;275:314–47.

Chomiak-Orsa I, Mrozek B. Główne perspektywy wykorzystania big data w mediach społecznościowych. Informatyka Ekonomiczna. 2017;3(45):44–54.

Corsi A, de Souza FF, Pagani RN, et al. Big data analytics as a tool for fighting pandemics: a systematic review of literature. J Ambient Intell Hum Comput. 2021;12:9163–80. https://doi.org/10.1007/s12652-020-02617-4 .

Davenport TH, Harris JG. Competing on analytics, the new science of winning. Boston: Harvard Business School Publishing Corporation; 2007.

Davenport TH. Big data at work: dispelling the myths, uncovering the opportunities. Boston: Harvard Business School Publishing; 2014.

De Cnudde S, Martens D. Loyal to your city? A data mining analysis of a public service loyalty program. Decis Support Syst. 2015;73:74–84.

Erickson S, Rothberg H. Data, information, and intelligence. In: Rodriguez E, editor. The analytics process. Boca Raton: Auerbach Publications; 2017. p. 111–26.

Fang H, Zhang Z, Wang CJ, Daneshmand M, Wang C, Wang H. A survey of big data research. IEEE Netw. 2015;29(5):6–9.

Fredriksson C. Organizational knowledge creation with big data. A case study of the concept and practical use of big data in a local government context. 2016. https://www.abo.fi/fakultet/media/22103/fredriksson.pdf .

Gandomi A, Haider M. Beyond the hype: big data concepts, methods, and analytics. Int J Inf Manag. 2015;35(2):137–44.

Groves P, Kayyali B, Knott D, Van Kuiken S. The ‘big data’ revolution in healthcare. Accelerating value and innovation. 2015. http://www.pharmatalents.es/assets/files/Big_Data_Revolution.pdf (Reading: 10.04.2019).

Gupta V, Rathmore N. Deriving business intelligence from unstructured data. Int J Inf Comput Technol. 2013;3(9):971–6.

Gupta V, Singh VK, Ghose U, Mukhija P. A quantitative and text-based characterization of big data research. J Intell Fuzzy Syst. 2019;36:4659–75.

Hampel HOBS, O’Bryant SE, Castrillo JI, Ritchie C, Rojkova K, Broich K, Escott-Price V. PRECISION MEDICINE-the golden gate for detection, treatment and prevention of Alzheimer’s disease. J Prev Alzheimer’s Dis. 2016;3(4):243.

Harerimana GB, Jang J, Kim W, Park HK. Health big data analytics: a technology survey. IEEE Access. 2018;6:65661–78. https://doi.org/10.1109/ACCESS.2018.2878254 .

Hu H, Wen Y, Chua TS, Li X. Toward scalable systems for big data analytics: a technology tutorial. IEEE Access. 2014;2:652–87.

Hussain S, Hussain M, Afzal M, Hussain J, Bang J, Seung H, Lee S. Semantic preservation of standardized healthcare documents in big data. Int J Med Inform. 2019;129:133–45. https://doi.org/10.1016/j.ijmedinf.2019.05.024 .

Islam MS, Hasan MM, Wang X, Germack H. A systematic review on healthcare analytics: application and theoretical perspective of data mining. In: Healthcare. Basel: Multidisciplinary Digital Publishing Institute; 2018. p. 54.

Ismail A, Shehab A, El-Henawy IM. Healthcare analysis in smart big data analytics: reviews, challenges and recommendations. In: Security in smart cities: models, applications, and challenges. Cham: Springer; 2019. p. 27–45.

Jain N, Gupta V, Shubham S, et al. Understanding cartoon emotion using integrated deep neural network on large dataset. Neural Comput Appl. 2021. https://doi.org/10.1007/s00521-021-06003-9 .

Janssen M, van der Voort H, Wahyudi A. Factors influencing big data decision-making quality. J Bus Res. 2017;70:338–45.

Jordan SR. Beneficence and the expert bureaucracy. Public Integr. 2014;16(4):375–94. https://doi.org/10.2753/PIN1099-9922160404 .

Knapp MM. Big data. J Electron Resourc Med Libr. 2013;10(4):215–22.

Koti MS, Alamma BH. Predictive analytics techniques using big data for healthcare databases. In: Smart intelligent computing and applications. New York: Springer; 2019. p. 679–86.

Krumholz HM. Big data and new knowledge in medicine: the thinking, training, and tools needed for a learning health system. Health Aff. 2014;33(7):1163–70.

Kruse CS, Goswamy R, Raval YJ, Marawi S. Challenges and opportunities of big data in healthcare: a systematic review. JMIR Med Inform. 2016;4(4):e38.

Kyoungyoung J, Gang HK. Potentiality of big data in the medical sector: focus on how to reshape the healthcare system. Healthc Inform Res. 2013;19(2):79–85.

Laney D. Application delivery strategies 2011. http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf .

Lee IK, Wang CC, Lin MC, Kung CT, Lan KC, Lee CT. Effective strategies to prevent coronavirus disease-2019 (COVID-19) outbreak in hospital. J Hosp Infect. 2020;105(1):102.

Lerner I, Veil R, Nguyen DP, Luu VP, Jantzen R. Revolution in health care: how will data science impact doctor-patient relationships? Front Public Health. 2018;6:99.

Lytras MD, Papadopoulou P, editors. Applying big data analytics in bioinformatics and medicine. IGI Global: Hershey; 2017.

Ma K, et al. Big data in multiple sclerosis: development of a web-based longitudinal study viewer in an imaging informatics-based eFolder system for complex data analysis and management. In: Proceedings volume 9418, medical imaging 2015: PACS and imaging informatics: next generation and innovations. 2015. p. 941809. https://doi.org/10.1117/12.2082650 .

Mach-Król M. Analiza i strategia big data w organizacjach. In: Studia i Materiały Polskiego Stowarzyszenia Zarządzania Wiedzą. 2015;74:43–55.

Madsen LB. Data-driven healthcare: how analytics and BI are transforming the industry. Hoboken: Wiley; 2014.

Manyika J, Chui M, Brown B, Bughin J, Dobbs R, Roxburgh C, Hung BA. Big data: the next frontier for innovation, competition, and productivity. Washington: McKinsey Global Institute; 2011.

Marconi K, Dobra M, Thompson C. The use of big data in healthcare. In: Liebowitz J, editor. Big data and business analytics. Boca Raton: CRC Press; 2012. p. 229–48.

Mehta N, Pandit A. Concurrence of big data analytics and healthcare: a systematic review. Int J Med Inform. 2018;114:57–65.

Michel M, Lupton D. Toward a manifesto for the ‘public understanding of big data.’ Public Underst Sci. 2016;25(1):104–16. https://doi.org/10.1177/0963662515609005 .

Mikalef P, Krogstie J. Big data analytics as an enabler of process innovation capabilities: a configurational approach. In: International conference on business process management. Cham: Springer; 2018. p. 426–41.

Mohammadi M, Al-Fuqaha A, Sorour S, Guizani M. Deep learning for IoT big data and streaming analytics: a survey. IEEE Commun Surv Tutor. 2018;20(4):2923–60.

Nambiar R, Bhardwaj R, Sethi A, Vargheese R. A look at challenges and opportunities of big data analytics in healthcare. In: 2013 IEEE international conference on big data; 2013. p. 17–22.

Ohlhorst F. Big data analytics: turning big data into big money, vol. 65. Hoboken: Wiley; 2012.

Olszak C, Mach-Król M. A conceptual framework for assessing an organization’s readiness to adopt big data. Sustainability. 2018;10(10):3734.

Olszak CM. Toward better understanding and use of business intelligence in organizations. Inf Syst Manag. 2016;33(2):105–23.

Palanisamy V, Thirunavukarasu R. Implications of big data analytics in developing healthcare frameworks—a review. J King Saud Univ Comput Inf Sci. 2017;31(4):415–25.

Provost F, Fawcett T. Data science and its relationship to big data and data-driven decisionmaking. Big Data. 2013;1(1):51–9.

Raghupathi W, Raghupathi V. An overview of health analytics. J Health Med Inform. 2013;4:132. https://doi.org/10.4172/2157-7420.1000132 .

Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst. 2014;2(1):3.

Ratia M, Myllärniemi J. Beyond IC 4.0: the future potential of BI-tool utilization in the private healthcare. In: Proceedings IFKAD 2018; Delft, The Netherlands.

Ristevski B, Chen M. Big data analytics in medicine and healthcare. J Integr Bioinform. 2018. https://doi.org/10.1515/jib-2017-0030 .

Rumsfeld JS, Joynt KE, Maddox TM. Big data analytics to improve cardiovascular care: promise and challenges. Nat Rev Cardiol. 2016;13(6):350–9. https://doi.org/10.1038/nrcardio.2016.42 .

Schmarzo B. Big data: understanding how data powers big business. Indianapolis: Wiley; 2013.

Senthilkumar SA, Rai BK, Meshram AA, Gunasekaran A, Chandrakumarmangalam S. Big data in healthcare management: a review of literature. Am J Theor Appl Bus. 2018;4:57–69.

Shubham S, Jain N, Gupta V, et al. Identify glomeruli in human kidney tissue images using a deep learning approach. Soft Comput. 2021. https://doi.org/10.1007/s00500-021-06143-z .

Thuemmler C. The case for health 4.0. In: Thuemmler C, Bai C, editors. Health 4.0: how virtualization and big data are revolutionizing healthcare. New York: Springer; 2017.

Tsai CW, Lai CF, Chao HC, et al. Big data analytics: a survey. J Big Data. 2015;2:21. https://doi.org/10.1186/s40537-015-0030-3 .

Wamba SF, Gunasekaran A, Akter S, Ji-fan RS, Dubey R, Childe SJ. Big data analytics and firm performance: effects of dynamic capabilities. J Bus Res. 2017;70:356–65.

Wang Y, Byrd TA. Business analytics-enabled decision-making effectiveness through knowledge absorptive capacity in health care. J Knowl Manag. 2017;21(3):517–39.

Wang Y, Kung L, Wang W, Yu C, Cegielski CG. An integrated big data analytics-enabled transformation model: application to healthcare. Inf Manag. 2018;55(1):64–79.

Wicks P, et al. Scaling PatientsLikeMe via a “generalized platform” for members with chronic illness: web-based survey study of benefits arising. J Med Internet Res. 2018;20(5):e175.

Willems SM, et al. The potential use of big data in oncology. Oral Oncol. 2019;98:8–12. https://doi.org/10.1016/j.oraloncology.2019.09.003 .

Williams N, Ferdinand NP, Croft R. Project management maturity in the age of big data. Int J Manag Proj Bus. 2014;7(2):311–7.

Winters-Miner LA. Seven ways predictive analytics can improve healthcare. Medical predictive analytics have the potential to revolutionize healthcare around the world. 2014. https://www.elsevier.com/connect/seven-ways-predictive-analytics-can-improve-healthcare (Reading: 15.04.2019).

Wu J, et al. Application of big data technology for COVID-19 prevention and control in China: lessons and recommendations. J Med Internet Res. 2020;22(10):e21980.

Yan L, Peng J, Tan Y. Network dynamics: how can we find patients like us? Inf Syst Res. 2015;26(3):496–512.

Yang JJ, Li J, Mulder J, Wang Y, Chen S, Wu H, Pan H. Emerging information technologies for enhanced healthcare. Comput Ind. 2015;69:3–11.

Zhang Q, Yang LT, Chen Z, Li P. A survey on deep learning for big data. Inf Fusion. 2018;42:146–57.

We would like to thank those who have touched our science paths.

This research was fully funded as a statutory activity, through a subsidy of the Ministry of Science and Higher Education granted to the Technical University of Czestochowa for maintaining research potential in 2018. Research Number: BS/PB–622/3020/2014/P. The publication fee for the paper was financed by the University of Economics in Katowice.

Author information

Authors and affiliations.

Department of Business Informatics, University of Economics in Katowice, Katowice, Poland

Kornelia Batko

Department of Biomedical Processes and Systems, Institute of Health and Nutrition Sciences, Częstochowa University of Technology, Częstochowa, Poland

Andrzej Ślęzak

KB proposed the research concept and design. KB prepared the manuscript in consultation with AŚ, including the definition of intellectual content, literature search, data acquisition and data analysis. AŚ reviewed and refined the manuscript and obtained research funding. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Kornelia Batko .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The author declares no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Batko, K., Ślęzak, A. The use of Big Data Analytics in healthcare. J Big Data 9 , 3 (2022). https://doi.org/10.1186/s40537-021-00553-4

Received : 28 August 2021

Accepted : 19 December 2021

Published : 06 January 2022

DOI : https://doi.org/10.1186/s40537-021-00553-4

Healthcare Research Paper Topics

The healthcare industry is a crucial part of every economy, and researching it can be valuable for several stakeholders. Topics related to healthcare can be crucial to understanding and improving management practices worldwide. Here is a list of options that can help you get a head start.

Healthcare Management Research Paper Topics

Without prior strategic planning, the healthcare system cannot survive for long, regardless of the size or economic standing of the country. Here, you will be expected to cover human capital and resource management topics. Management comprises the development and maintenance of healthy relationships among the stakeholders in the industry. Choosing a topic for a healthcare research paper can be just as daunting as the writing process. Take a look at some ideal topics that you can use or get inspiration from:

Healthcare Finance Research Paper Topics

Although these topics might require a lot more research than other areas of the industry, they are no less interesting. Depending on your interests, learning about healthcare finance is vital to ensure the sector’s success. If this is your first research paper related to healthcare, you might find it a little intimidating, but there is nothing a little practice cannot solve.

Healthcare finance addresses the allocation of resources for different uses and their justification. Understandably, sustaining the sector will become next to impossible without proper budgeting and planning.

Let’s dive into some valuable finance health research paper topics you can use:

Healthcare Economics Research Paper Topics

Like other aspects of the health care industry, economics is one crucial factor influencing financial success. Healthcare is also a transaction-based business and needs to generate profit to sustain itself.

Health economics can be considered a part of economics concerning value creation, efficiency, and effectiveness in the production and consumption of healthcare in society. It deals with the demand and supply side along with the budgeting decisions.

Having firm knowledge about this can enable economists and healthcare specialists to improve operational efficiencies, health outcomes, and lifestyle patterns. Even so, not many researchers opt for healthcare economics-related topics for their research papers due to their technicality and complexity.

The list below is compiled to make this process less challenging and a lot more exciting. Let’s look at each of them before settling on a research topic.

Good Health Research Paper Topics

When you are writing a research paper on the healthcare sector, the possibilities are endless. It can be about anything from lifestyle to social issues to different syndromes. With constant improvement in information and technology, the industry is seeing many more innovations, which means many more topics to discover and write about.

When choosing a topic, make sure it can be properly tested and supported. A good topic draws on authentic and credible sources of statistics.

Before diving deep into it, make sure there is enough research material you can use for further development of your thesis question. It would be a shame to put in the time, energy, and words to realize that you have to restart due to a lack of proper planning.

Below are some topics that will allow you to write a striking research paper that is sure to impress and leave a lasting impression:

Healthcare Administration Research Paper Topics

If you want to write your paper on administration, begin by brainstorming the aspects of health care administration that might interest you, or put yourself in the shoes of a working professional. This area will cover topics related to involved stakeholders’ changing needs and pain points. It will include the staff, management, patients, and other intermediaries in the value chain. Without proper research on the industry players, it is impossible to ensure that the system runs smoothly and efficiently.

Sociology Research Topics on Mental Health

Mental health refers to one’s state of mental normality or well-being. For a nation to prosper, it is imperative that physical health and mental health are being taken care of. In older times, little attention was paid to mental health issues. They were not acknowledged as a serious disease. However, gradually people have started to gain a sense of its significance. Mental health constitutes a major part of the healthcare sector today.

Mental health awareness has become a center of attention as people’s perceptions have changed, and they are more open to accepting it. This branch of healthcare deals with the history, application, and development of various theoretical perspectives relating to illness, health, and disorders. It is linked to numerous other areas in the field, including deviance, childhood studies, demography, gender discrimination, etc.

This is an interesting area to cover in the research paper and usually includes theories and frameworks encompassing the causes, development, and treatment of mental disorders. Below are a few suggestions that can make great research topics in this field:

Interesting Research Paper Topics About Healthcare

Healthcare as a field is very daunting. Determining healthcare-related research paper topics and writing a detailed research paper on one can be a nightmare for many. Not only are these topics complex in nature, but they can get tedious as they are research-intensive. But they do not have to be! New discoveries are made in the field all the time, which means many more interesting topics to discover and write a paper on.

Sometimes, interesting topics can get controversial, but that is no reason to shy away from them. They can become some of the best topics to work with. The audience also tends to enjoy such topics, and the writer can get to learn considerably when the topic is of interest. Nonetheless, make sure that you choose a topic that is not brand new to research and has substantially related content available online for your assistance. After all, you need to back your paper with authentic journals and scholarly research.

Still not sure where to start? Here is a list of some good healthcare topics for research papers you can use:

With such a wide variety to choose from, the healthcare topics mentioned in this blog are a confirmed way to help you out. You need to find where your interest lies, then a little bit of research and more practice to perfect your research paper. Start today. Good luck!

Exploring New Research on Marketing in the Healthcare Sector

Hayoung Cheon and Nah Lee

Health Care Services Marketing

As the world faces a pandemic of a magnitude not witnessed for over 100 years, we are reminded of healthcare’s fundamental role in our interconnected world. Marketing as a discipline has not lived up to its potential contributions to this important aspect of our lives. The Journal of Marketing Special Issue on “ Marketing in the Healthcare Sector ” is dedicated to promoting research on healthcare marketing. Thirteen scholars from across the marketing discipline shared their views on unanswered questions facing marketing in the healthcare sector during a special session at the 2020 AMA Summer Academic Conference. A summary and video clip of their individual presentations follows.

Leonard Berry | University Distinguished Professor of Marketing, M.B. Zale Chair in Retailing and Marketing Leadership, Mays Business School, Texas A&M University

Underuse of Palliative Care and Hospice Services

One of healthcare’s most important jobs is to help people with advanced illnesses live as comfortably as possible until they die. Yet, many patients do not die how they wish, which is to be as pain-free as possible and at home surrounded by family. Two services are available in the U.S. for patients with advanced illness— palliative care and hospice . Both services provide comfort care (such as pain control) and emotional support for patients and their families.

However, palliative care and hospice services are grossly underutilized in the U.S. About 60% of patients who could benefit from palliative care do not receive it and 25% of hospice patients die within three days of enrollment even though insurance covers it for six months. How can marketing help improve the utilization of these valuable services that can help people live better at the end of life?

Poverty and Health

Another important topic is the impact of social determinants on health. Factors such as quality of housing and education, income levels, physical activity, and social support are far more influential in overall health and length of life than medical care. For example, life expectancy in a low-income Chicago area is 16 years lower than in an affluent neighborhood. Poverty’s links to health may seem an impossibly big and complex topic for marketing academics to tackle, but research teams can break this big puzzle into manageable pieces and make extraordinary contributions. Consider, for example, the opportunity for marketing to reimagine housing for low-income people, as is being done in designing “purpose-built communities” such as Villages of East Lake in Atlanta, GA. Think of “purpose-built communities” as a complex new product to serve the needs of its customers and other stakeholders. We in marketing have the expertise to make these “products” much better.

Punam Keller | Senior Associate Dean of Innovation and Growth, Charles Henry Jones Third Century Professor of Management, Tuck School of Business, Dartmouth College

Ecosystem goal: The Role of Business and Marketing in the BIG Picture

Multiple factors determine health outcomes. As the current pandemic shows, health outcomes are the result of interactions across global and social elements, technology, governments, and organizations. Thus, to tackle health problems, marketing should work more with parties it has seldom worked with in the past. For example, collaborative work with global organizations such as WHO, WTO, and COP can be advantageous.

Individual Behavioral Goal: Message-Behavior Tailoring Using Technology and AI

Switching the focus from the ecosystem level to the individual level, marketing should note that technology can be readily adapted to encourage behavioral changes that promote better health outcomes. For example, smartphones can be powerful if combined with tailored messages alerting patients when to take their medications. We can study the efficacy of the types of text messages across segments of patients to understand which types of message are most successful at promoting positive behavioral changes.

Irina Kozlenkova | Assistant Professor of Marketing, University of Virginia

Mitigating the Effects of Physician Turnover through Relationships

Relationships have an important role in healthcare marketing. Among many players in the healthcare ecosystem (which includes payers, purchasers, suppliers/distributors, and regulators), the physician-patient relationship is central to healthcare and is also related to other entities in the ecosystem.

One problem that has not been understood well is mitigating the effects of physician turnover. In 2017, healthcare jobs experienced 21% turnover, which is second only to the hospitality sector’s turnover rate. It is costly to replace health professionals ($100,000 to replace a registered nurse, $1,000,000 to replace a physician) and doing so negatively affects patients and organizations. It has been shown that typical retention initiatives that work in other industries do not work well in healthcare.

Relational mitigation strategies may be key to mitigating the negative impact of turnover. We conducted qualitative interviews with employees from all levels of a big healthcare organization (from high-level executives, physicians, and nurses to receptionists) and a patient survey, which we later matched with turnover data and patient health data. The data revealed a big variance between various departments in terms of staff structure: some had consistent structures, while others were more ad hoc. We learned that it is important to pay attention not only to physician turnover, but also to other parties (RNs, MAs, PAs). Continuity of care with the other parties improves patient outcomes, such as retention, by 45–75%. While often the most attention is paid to the central relationship between a physician and a patient, we found that to many patients, their relationships with other members of the healthcare team (e.g., nurses, medical assistants) were as important as, or more important than, the relationship with their physician. Proactive communication with recommendations for a replacement of a leaving party has also been shown to improve outcomes (41–91%).

Off-Label Prescribing

Another important problem to address is off-label prescribing. It is legal in many countries to prescribe drugs for conditions for which they have not been approved. This is a very common practice (over 20% of prescriptions are off-label), yet patients are often unaware of it because doctors are not required to tell them. Since drugs are used for conditions for which they have not been tested and approved, it can be risky, and sometimes deadly. Some populations (e.g., children, pregnant women) may disproportionally receive off-label prescriptions. Research shows that over 70% of off-label uses have little to no scientific support.

Two important research questions surrounding this issue are how to regulate off-label prescribing without stifling innovation and how physicians make off-label prescribing decisions. Our preliminary research findings from a field conjoint study, matched with the actual prescription data, show that physicians are more likely to prescribe an off-label drug when they are similar to the patient (in gender or experiencing the same “issue”) and when they have more experience in the specialty. Also, higher prices of the approved drug tend to diminish the use of the cheaper off-label drug.

Cait Lamberton | Alberto I. Duran Distinguished Presidential Professor in Marketing, Wharton School of Business, University of Pennsylvania

Micro: Biases Specific to Care Choices?

While we have done quite a lot of work to show that well-established biases exist in healthcare (as they do in any context), we also have a lot to learn about specific biases that may arise in healthcare choice making. One example is anti-community bias. Health outcomes are superior closer to home, given that closer-to-home facilities offer better accessibility and a closer relationship with doctors. With no other information, patients seem to prefer to stay close to home. However, when given a choice, patients tend to reject community hospitals in favor of more distant university-based hospitals, which do not necessarily lead to better outcomes for many standard procedures. Moreover, in rural areas (where 20% of the U.S. population resides) such biases may have long-lasting negative effects, as we see the increasing closures of community hospitals in rural areas. Given this tension between rural and community hospitals versus urban and university-based hospitals, understanding how patients make choices weighing different factors across these two types of hospitals and contemplating how and when marketing should tip the scales become crucial.

Macro: Satisfaction (with Healthcare)?

At the macro level, marketing can focus on hospital satisfaction measurement. HCAHPS (Hospital Consumer Assessment of Healthcare Providers and Systems) measures patient experience, including communication, pain management, and the quietness of the hospital environment. It is a widely used measure, freely available, and, more importantly, offers a huge opportunity to conduct interesting research. For example, an interesting area of research is the difference in the mode of delivery, where more positive responses are attained through mobile devices than through computers. Researchers can also investigate the role of pain and the way it may be framed to help consumers deal with it in the healthiest manner, what types of advertising work well for healthcare facilities and providers, and how we can more accurately capture patient satisfaction as a fully conceptualized construct that is likely rooted in healthcare-specific experiences, such as empathy and respect for dignity, which differ from what drives satisfaction with other goods and services.

John Lynch | University of Colorado Distinguished Professor, University of Colorado-Boulder

Health Care System Infomediaries

Healthcare expenses have experienced a six-fold increase in inflation-adjusted dollars since 1970. One major factor contributing to this increase is the absence of consumer price sensitivity. Insurers, the payers of this expense, cap the maximum out-of-pocket costs for the consumer. Even when patients are paying, they are often willing to pay all they can for a few more months of expected life. Furthermore, prices are opaque, even to doctors. This means that doctors do not know how much patients will be charged for a given procedure. They view it as impossible to know because it is dependent on insurance and not their job to know. How can marketing help incorporate price sensitivity in healthcare? Can we design pricing infomediary models to help doctors be better price shoppers for their patients?

Health Privacy & Quality of Care

Another interesting topic is health privacy and quality of care. HIPAA regulations govern the uses and disclosures of personal health information. Patients have rights over their health information and can authorize certain health records to be disclosed. How many consumers know who has what records, and how does this affect the transmission of health history information that could benefit care?

Utilizing health data is analogous to the literature on customer identification in advertising, pricing, and personalized recommendations. Sharing information has benefits, but there are also risks of exploitation. Can we develop models for patient ownership and sharing of personal health information that promote better health outcomes?

Detelina Marinova | Sam Walton Distinguished Professor of Marketing, University of Missouri

Physician-Patient Digital Communications for Improved Health Outcomes

Provider-patient interactions are crucial in healthcare and we see a shift of the mode of communication from in-person to digital platforms, especially during the pandemic. However, research has just started to address digital communication in healthcare. Digital communication can be beneficial because it reduces office visits, which can improve efficiency. However, it can also increase physician workload in other ways and digital communication bears a risk of miscommunication. Thus, it is important to understand why and under what conditions digital communication between patients and providers contributes to patient compliance, engagement, and improved health outcomes.

Managing Frontline Interactions for Patient Well-Being and Hospital Revenue

Hospital spending constitutes 30% of national health expenditures, yet it is challenging to deliver high-quality and cost-efficient health outcomes. With this tension, there are trade-offs between hospital revenue and patient well-being. One crucial aspect affecting both hospital spending and health outcomes is frontline interactions, which include proactive actions by physicians and nurses and reactive actions by staff. These often shape patients’ behavioral approach to medical conditions and treatments, thereby influencing the patients’ well-being. Moreover, they can be either a revenue source or a high-cost factor for hospitals. Therefore, one potential research question is how proactive and reactive actions of frontline agents contribute to or alleviate the trade-offs from the dual emphasis on hospital revenues and patient well-being.

Vikas Mittal | J. Hugh Liedtke Professor of Marketing, Jones Graduate School of Business, Rice University

Health Care & Marketing

Conducting successful research in healthcare raises a few issues that are uncommon in other sectors. First, problem-solving and practical relevance are critical in healthcare. Collaborators in health systems may not be interested in laborious “theory.” Hence, it is important to focus on relevant problems with basic rigor rather than thin-slicing or engaging in complicated quantitative analyses.

Second, research modesty is important for successful collaboration. A marketing perspective can contribute to solving healthcare problems, which is a much better approach than trying to solve a marketing problem with healthcare only as a “context.” For example, problem-oriented research questions may be: 1. How can a pharmacy chain manage its segmentation in different locations? and 2. How can nursing homes improve employee retention to improve healthcare outcomes?

Third, it is important to learn the differences in process as well as in incentive structures. In healthcare, grants are more critical than publications, so learning how to contribute to the grant-writing process is vital. Regarding publications, in medical journals, authorship and authorship order follow a pre-defined structure. Lastly, data privacy and data integrity issues are paramount and often university-level permissions are needed, which can be time-consuming.

Despite the unique characteristics of the field, there are many marketing research opportunities to gain a deeper understanding of medical and healthcare problems and teaching opportunities for training health professionals for rewarding careers.

Maura Scott | Madeline Duncan Rolland Professor of Marketing, Florida State University

Stigma and Vulnerability in Healthcare: Solutions through Technology?!

Stigmatized consumers experience a distinct healthcare journey relative to other consumers. Stigmatization can adversely influence the quality of care that patients receive from healthcare providers. Stigmatization in healthcare can limit patients’ willingness to engage in their treatment, thereby potentially further harming their health outcomes. Sources of stigma include certain patients’ characteristics such as race, ethnicity, and body type. Some diseases may be stigmatized based on the perceptions of visibility, controllability, permanence, or contagion associated with the disease. Vulnerable populations (e.g., underrepresented minority groups) may face these two sources of stigmatization at the same time, further affecting their well-being. Identifying interventions that encourage stigmatized patients to overcome their reluctance to engage in their healthcare (e.g., via online healthcare communities) is crucial. More research should identify policies that create an inclusive, equitable, and accessible healthcare system.

Technology in Healthcare: Tensions and Solutions

One potential way to tackle low engagement from stigmatized patients is to leverage relevant technology in healthcare. There are concerns and tensions to consider when developing such solutions. First, technology can reduce stigmatization because it can reduce human interaction; however, technology programmed with inherent bias could increase stigmatization. Second, technology could lower costs and increase accessibility for vulnerable patients. Yet, income level can make a difference in healthcare service quality, for example by separating ‘premium’ in-person service for the wealthy, which might lead back to the current status quo. Third, technology can influence patients’ anxiety levels, which suggests the need for healthcare interventions to help reduce anxiety triggered by technology. More research is needed to identify how to leverage technology in healthcare to increase accessibility and inclusivity of high quality, low-cost healthcare for all patients. 

Steven Shugan | McKethan-Matherly Eminent Scholar Chair and Professor, Warrington College of Business, University of Florida

Changes in Healthcare Markets

Marketing can address several interesting issues in changing healthcare markets. Service mix has been addressed in recent work, highlighting the fact that services offered by non-profit hospitals differ from those offered by for-profit hospitals. More research on service mix is needed. Websites hosted by hospitals and other healthcare providers can serve multiple roles—information provision (education) and selling (referrals). Research on multiple role healthcare websites would be valuable. New product launches are also an interesting problem in healthcare, with many new devices facing complications when being brought to market because of licensing issues and multiple players (including regulators, competitors with patents and courts).

Blockchain is a new encryption technology that may enable the storage of sensitive healthcare data. Marketing research can address the interaction of these databases with multiple parties, as well as the associated privacy concerns. The interaction of these databases with consumers is a typical marketing communications issue. Artificial intelligence has also made its way into healthcare, from reading x-rays to making diagnoses, yet the AI-consumer interface is a marketing issue with many unanswered questions.

Other changes in healthcare markets that merit further research include the effect of changes in government regulation of the healthcare industry, the impact of for-profit entry in the existing market, and the implications of declining patient co-pays. Marketing communication in a heavily regulated environment with both business-to-business and business-to-consumer issues provides many research topics.

Healthcare Data Sources

There are many publicly available data sources in healthcare. Links for these data sources appear in the attached slide. Many of these datasets can be integrated based on geography (e.g., zip codes, FIPS, states, counties, etc.). My slides indicate many sources of free healthcare data. I and coauthors have also purchased data from American Hospital Directory and combined that data with data from free sources.
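As a rough illustration of the geographic integration described above, the following sketch joins two small tables on a county FIPS code. It is only a sketch under stated assumptions: the field names (`fips`, `hospital_beds`, `median_income`) are hypothetical, and every value in the two tables is an invented placeholder rather than real data. The one practical point it tries to show is that county FIPS codes are five-character strings with leading zeros, so codes stored as integers need to be padded before joining.

```python
# Illustrative only: field names are hypothetical and all values are invented.

def normalize_fips(code):
    """Return a county FIPS code as a zero-padded 5-character string."""
    return str(code).strip().zfill(5)


def join_on_fips(left_rows, right_rows):
    """Inner-join two lists of dicts on their 'fips' field."""
    right_by_fips = {normalize_fips(r["fips"]): r for r in right_rows}
    joined = []
    for row in left_rows:
        key = normalize_fips(row["fips"])
        match = right_by_fips.get(key)
        if match is not None:
            # Combine both records and store the normalized key.
            joined.append({**match, **row, "fips": key})
    return joined


# Placeholder table 1: the first code lost its leading zero when stored as an int.
hospitals = [
    {"fips": 1001, "hospital_beds": 85},
    {"fips": "48113", "hospital_beds": 6400},
]

# Placeholder table 2: codes stored correctly as strings.
county_stats = [
    {"fips": "01001", "median_income": 58000},
    {"fips": "48113", "median_income": 61000},
    {"fips": "06037", "median_income": 71000},
]

merged = join_on_fips(hospitals, county_stats)
for row in merged:
    print(row)
```

An inner join is used here so that only counties present in both tables survive; a real project might prefer to keep unmatched rows and inspect them.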

Jagdip Singh | AT&T Professor, Case Western Reserve University

Frontlines in Hyper-Markets

The pandemic has underscored the importance of getting ahead of the healthcare curve in uncertain and fast-changing healthcare markets. Research opportunities lie in the study of “outside-in” and “inside-out” frontline capabilities in healthcare organizations for demand anticipation and response agility that yield effective outcomes. These capabilities require an integration of ground-level experience with data-based analytics at speed. Several research contributions in Marketing can be useful to facilitate understanding of these capabilities, including adaptive foresight, strategic flexibility, velocity, and marketing excellence. Some potential ways to seed research are to leverage public data such as ‘Red Dawn’ emails or data from wearable-sensor technology.

Temporary Organizing for Public Health

The uncertain nature of healthcare markets can sometimes stem from public health and humanitarian crises such as climate change, war, disease, migration, and other conflicts. Many different organizations, such as the Red Cross, NGOs, and Doctors Without Borders, come together to address these crises. The challenges involve collaboration, coalition, and conflict in temporary meta-organizations to yield effective outcomes. Several research contributions in Marketing can be useful to facilitate understanding of these challenges, including cause-driven marketing, mega-marketing, and temporary marketing organizations. Potential for funding projects and data comes from Gates Foundation grants, Business Roundtable priorities, and community data.

Hari Sridhar | Joe B. Foster ’56 Chair in Business Leadership, Professor of Marketing, Mays Business School, Texas A&M University

Marketing in the Healthcare Sector: Improving Cancer Outreach Effectiveness

Marketing research in the healthcare sector can complement and embellish medical research. It is important to recognize that not all patients are created equal. We can leverage more than 60 years of marketing research on customer needs and the latest developments in machine learning. Using predictive models, we can also demonstrate the social and financial impact of healthcare interventions. Doing so can help the field of marketing become a value-added support arm to healthcare.

In our study of cancer outreach effectiveness, we use patient data and predictive models to improve returns on cancer outreach efforts. Only 4–8% of the general population undergoes regular cancer screening, despite massive spending on preventive outreach campaigns. In a National Institutes of Health (NIH)-supported study in partnership with UT-Southwestern, we conduct a large-scale randomized field experiment to study how cancer screening visits are impacted by different types of cancer outreach efforts. Using a smorgasbord of variables concatenated from medical histories, geographical information, and the outreach program CRM data, we apply causal forests to estimate the causal effect of outreach efforts for every individual patient. We find that patient response to cancer screening varies dramatically across the population, enabling the dream of personalized outreach programs. By targeting the right people with the right intervention, we show that cancer outreach programs can save money and improve yield (over 74% in returns) in preventive cancer screening. Can marketing save lives and money? Our answer is a resounding yes.
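The idea that outreach effects differ across patients can be illustrated with a much simpler stand-in than the causal forests used in the study. The sketch below is not the authors' method: it assumes a randomized experiment, groups patients into hypothetical segments, and takes the difference in screening rates between the outreach and control arms within each segment. The segment labels and all records are invented toy data.

```python
from collections import defaultdict


def segment_effects(records):
    """Per-segment difference in screening rates (outreach minus control).

    Each record is (segment, received_outreach, attended_screening).
    With random assignment, this difference estimates the average
    effect of outreach within the segment.
    """
    # counts[segment][arm] = [number screened, number of patients]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, screened in records:
        cell = counts[segment][treated]
        cell[0] += int(screened)
        cell[1] += 1

    effects = {}
    for segment, arms in counts.items():
        t_screened, t_total = arms[True]
        c_screened, c_total = arms[False]
        if t_total == 0 or c_total == 0:
            continue  # cannot estimate an effect without both arms
        effects[segment] = t_screened / t_total - c_screened / c_total
    return effects


# Invented toy records for two hypothetical segments, "A" and "B".
records = (
    [("A", True, True)] * 3 + [("A", True, False)] * 1
    + [("A", False, True)] * 1 + [("A", False, False)] * 3
    + [("B", True, True)] * 1 + [("B", True, False)] * 3
    + [("B", False, True)] * 1 + [("B", False, False)] * 3
)

effects = segment_effects(records)
best = max(effects, key=effects.get)  # segment where outreach helps most
print(effects, best)
```

In this toy data outreach raises screening in segment A but not in segment B, so a budget-constrained program would target A first. Causal forests pursue the same goal at the level of individual patients, using many covariates instead of a single hand-picked segment.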

It is also critical to understand the innards of the healthcare value chain and move beyond just the study of patient-physician and patient-facility interfaces. Other marketing scholars are now addressing issues surrounding multiple players in designing care facilities and improving quality of care, the complexities of hospital purchasing contracts, and the impact of regulatory interventions on payment disclosures. The field is ripe with other relevant questions and we are merely scratching the surface.

Featured in JM Webinar: https://www.ama.org/events/webinar/jm-webinar-series-insights-for-managers/

S Sriram | Professor of Marketing, University of Michigan, Ann Arbor

Technology has the potential to have a significant impact on the healthcare ecosystem. More importantly, the impact is likely to be felt by all stakeholders in the ecosystem. I consider two examples here.

The Internet of Health Things

In recent years, there has been a considerable increase in the use of wearable devices and apps by consumers, who use these devices for monitoring various markers of physiological and psychological well-being. Broadly, these hardware devices and software applications come under the realm of the Internet of Things (IoT). Do these devices, which are supposed to monitor health, actually lead to better health outcomes and well-being? Extant literature has documented mixed results for several reasons. First, purchasing a device or downloading an app does not necessarily translate into repeat usage. Researchers have documented that consumers routinely lose interest after a few months. Second, even in instances where interest does not wane over time, routinely monitoring markers of health can lead to excessive obsession, which can be detrimental to overall well-being. Third, even if we can establish a positive effect of these devices on health outcomes and overall well-being using observational data, one needs to be careful to control for patient self-selection: purchasers of these devices are likely to be different from those who chose not to purchase them.

The effect of these devices and apps can extend beyond patients. In this regard, researchers can study how an individual’s health monitoring efforts benefit other stakeholders in the whole ecosystem. For example, providers might see reduced hospital readmission rates, as shown in some literature, and can potentially ensure adherence to medication taken outside hospitals. Drug manufacturers can speed up drug development with regularly monitored data, as opposed to relying on self-reported measures. Of course, the downside is that such regular monitoring can be intrusive and raise concerns about loss of privacy. A careful quantification of the benefits of monitoring patient health information can help in assessing whether the benefits of sharing consumer data outweigh the risks associated with the violation of privacy.

Telemedicine


Although the idea of telemedicine has been around for a few years, COVID-19 has made it a reality for many consumers of healthcare. The promise of telemedicine lies in its potential to relax wealth, accessibility, time, and skills constraints. This, in turn, can democratize healthcare. However, there are several important questions that need to be answered in order to assess whether and how this promise is realized. First, is the actual and perceived quality of a telemedicine service as good as in-person visits? Are there any particular risks of misdiagnosis from telemedicine? Second, the benefits delivered by telemedicine might not be evenly distributed across different stakeholders. For example, what benefits do patients and other stakeholders such as providers, payers, and telemedicine platforms derive from the new mode of healthcare delivery? How are these benefits distributed among the various stakeholders? How does the relaxation of the aforementioned constraints benefit patients? Does the benefit vary across patients’ socioeconomic status? Lastly, one can study the challenges that telemedicine might face in building a stable platform.


Richard Staelin | Gregory Mario and Jeremy Mario Professor of Business Administration, Fuqua School of Business, Duke University

Patient Experience Questions

Patient experience data have been collected for decades. However, until recently, most of these data came from standard surveys given to patients after they received treatment. Over the last few years, free-form texts, such as reviews, have become increasingly available. This new source of patient input may provide information beyond that of the more traditional “rating-only” surveys. Do patient reviews of doctors differ substantially from customer reviews in other sectors? Do these reviews provide new information over the standard surveys?

There may be distinct segments of patients that vary in terms of their ability to judge the quality of service received. What is the size of the sophisticated market segment and can it influence the behavior of medical professionals? It would also be interesting to understand whether patients’ view of the quality of care differs across venues of service (e.g., emergency room, hospital, clinic). How is the perceived quality different from the objective quality measures currently used by medical practitioners?

Organizational Reaction to Patient Experience Data

Patient experience data are relevant to hospital management and insurance companies. Do they pay more attention to some databases than to others depending on the source? How much weight should they give patient experience data compared to objective or clinical measures of quality? What are the profit implications for the hospital or company? The reaction of the medical staff is also a critical factor in understanding the impact of patient feedback data. Are providers receptive to such feedback from patients and, if so, does their ability to adapt to feedback depend on the type of information? For example, patient feedback may concern bedside manner, faulty advice, or overcharging. Medical professionals may try to improve their bedside manner and avoid billing mistakes, but it may be very difficult (or costly) to alter diagnostic practices.

Learn more about the Journal of Marketing Special Issue on “Marketing in the Healthcare Sector” and note that, for those interested, submissions must be made between July 1, 2021 and November 1, 2021.


Doctoral student at University of Michigan


Doctoral Student at Duke University






