4.1. Publishing Patterns
As of October 4, 2022, using our methods as described in
Section 3, we identified 3,772 African research publications related to machine learning (ML) and healthcare. Throughout this section we refer to these publications as African machine learning for health publications, using the acronym
AML4H to describe this set of publications. We observe that AML4H publications increased exponentially from 2011, as shown in
Figure 1 (Blue).
Figure 1 is plotted on a log scale, highlighting this exponential growth. Globally, research interest in ML techniques began to rise after 2013 as a result of their demonstrable efficacy on popular standard benchmark tasks [
33,
34]. This rise in interest is likely to have contributed to our observed exponential rise in ML and healthcare research in Africa. We suspect that regional factors, including the development of many consortia, funding initiatives, and organisations to enhance African scholarly contributions to Biomedical Informatics, have also contributed substantially to this exponential rise of AML4H publications. Examples of these contributions include the
H3Africa [
35],
BETTEReHEALTH [
36] and
HELINA [
37] initiatives. Using biomedical informatics techniques, Luna
et al. [
38] describe six broad challenges that are considered to impact the general development of physical and digital infrastructure for information technology in Africa. As information technology infrastructure is a fundamental requirement to enable ML research, these challenges can be considered to impact ML for health research as well. Luna
et al. [
38] note that for successful implementation of health informatics, knowledge of the challenges to be faced is an important factor.
The majority of AML4H publications (77.7%) were published after the beginning of the COVID-19 pandemic on the continent in 2020. These publications exhibit a shift toward topics related to the pandemic, such as ML for virology and ML for epidemiology. The increase in African research productivity for healthcare since 2020 is in line with the global efforts to solve urgent matters related to the outbreak, particularly those related to timely data curation and management [
39,
40]. This also reaffirms previous observations that disease outbreaks can lead to the growth of research in underdeveloped fields [
41] for affected nations [
42]. However, despite this increased productivity in the field of ML and healthcare, the impact of the COVID-19 pandemic on global research production more generally is mixed and will likely be an area of active research for some time [
43,
44].
We observe that contributions from North Africa (Tunisia, Algeria, Morocco, Libya, and Egypt) make up 64.5% of the total AML4H publications between 1993 and 2022 (2,434 publications). This is substantially larger than the share of any other African region. However, the share of publications from North Africa has recently declined, from over 70% between 1993 and 2015 to below 70% in 2016 and to 61.5% by 2022. This shift reflects an increase in AML4H publications from Sub-Saharan Africa rather than a decline in AML4H publications from North Africa.
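As a quick check of the regional share reported above, the following minimal Python sketch recomputes the percentage from the two counts given in the text (2,434 North African publications out of 3,772 in total). The helper name `regional_share` is hypothetical and introduced only for illustration; it is not part of the tooling used in this study.

```python
def regional_share(region_count: int, total_count: int) -> float:
    """Return the percentage of publications attributed to one region."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * region_count / total_count


# Counts taken from the text: 2,434 North African publications out of 3,772.
north_africa_share = regional_share(2434, 3772)
print(f"North Africa share: {north_africa_share:.1f}%")  # prints 64.5%
```

The same helper could be applied year by year to trace the decline in the North African share, provided per-year counts are available.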
In Sub-Saharan Africa, the recent establishment of government-led initiatives to introduce Telemedicine and Digital Health, mainly in public hospitals [
45] and ongoing international development funding for research [
46] are possible factors contributing to this increase in AML4H publications. There is still a persistent lack of digital health infrastructure in Sub-Saharan Africa. Efforts to introduce various forms of electronic health record systems have started to alleviate this. However, the adoption and scaling of these systems are likely to remain slow relative to higher-resource contexts [
47]. The limited availability of digital health infrastructure consequently limits the development of machine learning research [
47].
The funding information from Dodoo
et al. [
47] shows a correlation between AML4H publications (particularly those from Sub-Saharan Africa) and foreign funding, as shown in
Figure 2. The National Institutes of Health (United States of America) has funded 145 AML4H publications. This funding stems mainly from the
Harnessing Data Science for Health Discovery and Innovation in Africa (DS-I Africa) Program. This program aims to enhance data science in Africa for healthcare, public health and biomedical research [
48]. The Wellcome Trust (United Kingdom) has supported 48 AML4H publications. This was made possible by the
Developing Excellence in Leadership, Training and Science (DELTAS) Africa Initiative co-organized with the African Academy of Sciences [
49]. The Medical Research Council (MRC) of the United Kingdom (UK) is also a major funder of African research, having funded 36 AML4H publications. In particular, the
UK MRC-funded scholarship program is likely to have been an important contributing factor for these publications. The MRC maintains long-term support for graduate students from Sub-Saharan Africa [
49]. The European Commission is a funder for 34 AML4H publications. This is related to the prioritization of Africa-focused research thanks to the
Horizon Europe Framework Programme providing 350 million euros to fund research projects including Europe-Africa collaborations [
50]. Furthermore, the Bill and Melinda Gates Foundation (United States) provided funding for 55 AML4H publications. As a non-governmental charity organization, this foundation is interested in encouraging research projects that translate health-related knowledge into life-saving interventions, particularly in developing countries where access to clinical information and consistent health infrastructure is very limited [
51]. Moreover, the National Institute of Allergy and Infectious Diseases (United States of America) financially contributed to 43 AML4H publications. This funding falls within its support for studies of infectious and respiratory diseases in Africa [
52].
Beyond these funding bodies that have a broad and global reach, several national research institutions with a generally narrower scope and more localised focus have been identified among the main funders of AML4H publications as shown in
Figure 2.
King Saud University and
Princess Nourah Bint Abdulrahman University from Saudi Arabia have respectively funded 49 and 35 AML4H publications. A likely contributing factor is the large-scale funding provided by the Saudi government to local research institutions to independently organize scholarly projects and publish high-quality papers, with the aim of achieving better standings in world university rankings [
53]. The National Natural Science Foundation of China and the National Research Foundation of Korea (South Korea) have respectively supported 66 and 35 AML4H publications. In contrast to the Saudi institutions, these are centralized, government-led bodies that supervise domestic research funding [
54]. Unlike the National Institutes of Health (United States), these two institutions do not provide programs exclusively for foreign scientists [
54]. Rather, they fund African research papers when Chinese or Korean scientists are significantly involved [
54]. Their presence as primary funders is likely motivated by a desire to grow the presence of BRICS countries in the African research landscape, particularly in Health Informatics [
45].
In a similar context, we find that the National Research Foundation (NRF) of South Africa is the only centralized, government-led institution based in Africa that significantly funds AML4H research, with 45 publications. However, AML4H publications supported by this local funding appear to draw less attention than those supported by international funders when considering patterns of citation: publications supported by the South African NRF garner fewer citations than those supported by international funders [
54]. When we consider South African research in Health Informatics, it seems that international funding and collaboration have a higher impact on citation count than local governmental support [
45,
55]. This observation seems to apply across the continent and not just to South Africa. This disparity in the effectiveness of local and international funding, at least when considering patterns of citation, may suggest that a general reform of government-led research funding across the continent should be investigated [
45,
55].
Patterns of funding shape the international collaboration networks that African institutions develop to conduct research for ML and healthcare. We summarise the contributions to AML4H publications from non-African countries in
Figure 3. We see that the United States of America and Saudi Arabia dominate international collaborations in this context, with 550 and 532 publications respectively. Although the flexibility of these two countries can explain in part their relative dominance in African biomedical ML research, other factors are likely also at play. In contrast to other major funding countries, the research policy of Saudi Arabia emphasizes international collaborations [
56]. Saudi Arabia has also maintained for decades a tradition of research collaboration with North Africa, mediated by Egypt, thanks to geographic proximity and their joint affiliation to the Arab region [
57]. As for the United States of America, it is the most prolific country in the world for research on biomedical informatics [
58] as well as on ML [
33]. It is also behind the establishment of multiple international biomedical research consortia that encourage the move to digital health [
59]. We also notice that many European countries contribute significantly to African research on the matter: United Kingdom (359 publications), France (257 publications), Germany (158 publications), Spain (126 publications), Netherlands (82 publications), and Italy (73 publications). Financial support from the European Commission and from local charitable organizations like the
Wellcome Trust can explain this finding, mainly for the United Kingdom. Nevertheless, it is probable that this finding is also due to these countries being among the most productive in ML [
33] and health informatics [
58] research: United Kingdom (3rd in deep learning), France (6th in health informatics), Germany (5th in deep learning, 4th in health informatics), Spain (8th in health informatics), Netherlands (13th in health informatics), and Italy (10th in deep learning, 3rd in health informatics). Similarly, the presence of China (198 publications), Canada (150 publications), Australia (110 publications), and South Korea (97 publications) among the main countries collaborating with Africa in biomedical ML research can be explained by the status of these countries as highly productive in ML [
33] and health informatics [
58] research and by the existence of nationwide funding institutions in these countries [
54]. However, the identification of India (390), Pakistan (102), and United Arab Emirates (88) among the main collaborators of Africa in this research area is quite surprising, as these countries have not been featured as sponsors of African research papers. For the United Arab Emirates, the situation is quite similar to that of Saudi Arabia, as geographic proximity enables the country to easily establish research collaborations with North Africa [
53]. The United Arab Emirates has considerable flexibility in establishing research collaborations, even more than Saudi Arabia [
60]. However, its more limited contribution to African biomedical ML research outputs is mainly linked to the considerably smaller scholarly productivity of the United Arab Emirates and its tendency to establish collaborations with Asian neighbors rather than with North African countries, in contrast to Saudi Arabia, which maintains a robust research collaboration with Egypt and consequently with North Africa [
60]. As for Pakistan, it is among the most published Islamic countries in computer science and has a long-standing tradition of research collaboration with Saudi Arabia [
60]. Its contribution to African research outputs is probably an effect of the involvement of Saudi Arabia in biomedical ML research in Africa. As for India, it is among the ten most published countries in ML [
33] and health informatics [
58]. The limited history of research collaborations between India and Africa [
33,
58] except for several joint projects between South-Eastern Africa and India [
61] suggests that this tendency is new and is probably a consequence of COVID-19, during which Indian scientists were invited to join large-scale online research projects for their proficiency in this research field. This is consistent with global COVID-19 research, where India is identified as the third-largest research collaborator of the Arab countries, with a strong scholarly bond with Egypt [
62].
From what we have already discussed, it seems that collaborating countries tend to be selective towards North Africa or Sub-Saharan Africa. Few countries develop scholarly collaboration programs for the whole continent. This is confirmed by computing, for each non-African country, the share of its papers coauthored with African institutions that involve North African institutions (Grey in
Figure 3). Saudi Arabia, France, Spain, South Korea, and United Arab Emirates are biased towards collaboration with North Africa (>55% coauthored with North Africa), whereas the United States of America, India, United Kingdom, and Netherlands favor collaboration with Sub-Saharan Africa (<45% coauthored with North Africa). While the bias of Saudi Arabia and United Arab Emirates can be explained by their close relations with other Arab nations, including North African ones [
53,
60], the preference of France for North Africa is rather due to the long-term effect of its colonization of Tunisia, Algeria, and Morocco [
57]. The similarity of the higher education and research systems between France and these three African countries and the use of French as the main language of scholarly research in these nations facilitate the establishment of joint research programs between France and North Africa [
57]. The lack of collaboration between France and the Sub-Saharan African countries that it formerly colonized, such as Senegal, Benin, Cameroon, and Ivory Coast, is explained by the current lack of development of research in health informatics [
58] and ML [
33] in these countries. As for Spain and South Korea, the bias is rather linked to the establishment of government-led bilateral research cooperation programs between these two countries and North Africa, particularly Tunisia and Morocco
[
63]. Spain is also located very close to North Africa and can consequently establish research collaborations with this region easily through Morocco [
57,
61]. As for the United States of America, the United Kingdom and probably the Netherlands, their greater interest in Sub-Saharan Africa is mainly motivated by funding programs run by these three countries that exclude North Africa, on the assumption that North African countries are richer than Sub-Saharan ones, although all of Africa is currently underdeveloped [
64]. However, the bias of India towards Sub-Saharan Africa is not explained by funding, because research collaborations with North African institutions could be easier through Saudi support [
60]. It is rather explained by a historical scholarly association between South-Eastern Africa and India [
61] and by the invitation of Indian individuals by institutions in Sub-Saharan Africa to join projects for their efficiency in computer science research [
33,
58].
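The regional bias described above rests on a simple ratio with two cut-offs (>55% and <45% of papers coauthored with North Africa). A minimal Python sketch of that classification follows. The function name, the "balanced" label for the middle band, and the example counts are all assumptions introduced for illustration; only the two thresholds come from the text.

```python
def collaboration_bias(north_african_papers: int, african_papers: int) -> str:
    """Classify a partner country by the share of its papers coauthored with
    African institutions that involve North African institutions."""
    if african_papers <= 0:
        raise ValueError("african_papers must be positive")
    share = 100.0 * north_african_papers / african_papers
    if share > 55.0:  # threshold stated in the text
        return "North Africa"
    if share < 45.0:  # threshold stated in the text
        return "Sub-Saharan Africa"
    return "balanced"  # assumed label; the text does not name this band


# Hypothetical counts for illustration only; they are not taken from the study.
examples = {"Country A": (80, 100), "Country B": (30, 100), "Country C": (50, 100)}
labels = {name: collaboration_bias(n, total) for name, (n, total) in examples.items()}
print(labels)
```

Keeping an explicit middle band avoids forcing countries with a near-even split into one of the two regional groups.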
Beyond the regional and political motivations behind the bias in collaboration with non-African countries and the choice of funding sources between North Africa and Sub-Saharan Africa, this divide can also be explained by the lack of coordination of research between the two regions. Only twelve Sub-Saharan African nations contributed to North African research about ML for healthcare:
South Africa (12 publications),
Nigeria (11 publications),
Sudan (6 publications),
Gabon (4 publications),
Ethiopia (3 publications),
Ghana (3 publications),
Kenya (3 publications),
Senegal (2 publications),
Cameroon (1 publication),
Congo (1 publication),
Malawi (1 publication), and
Tanzania (1 publication). Such behavior is unfortunately historical and common to all research fields in Africa [
61]. More collaboration between North Africa and Sub-Saharan Africa is required so that all countries can benefit from the continent's scholarly resources.
When examining the research productivity of African countries (54 nations), we found that only eight countries produced more than 100 publications and twelve countries produced more than 40 publications, as shown in
Figure 4. These countries are led by
Egypt (1255 publications),
South Africa (505 publications),
Morocco (446 publications),
Tunisia (407 publications),
Algeria (327 publications), and
Nigeria (270 publications). This indicates a relative dominance of North Africa over Sub-Saharan Africa in African ML research for healthcare. This is mainly due to the leading position of
Egypt and
South Africa in scholarly research [
61], particularly the one related to biomedical informatics [
58] and ML [
33]. The relatively high standing of
Tunisia,
Algeria,
Morocco, and
Nigeria is confirmed by previous findings on research productivity in Africa [
61]. In addition, only four French-speaking countries are represented among the fifteen most published African nations:
Morocco,
Tunisia,
Algeria, and
Rwanda. This can partly be explained by the fact that almost all the research production about ML and healthcare in Africa is written in English (3,788 out of 3,789). This constitutes a language barrier for French-speaking countries, where higher education is mostly delivered in French [
61]. This is also explained by the lack of collaboration between French-speaking North Africa and the rest of French-speaking Africa that has an underdeveloped research infrastructure [
61]. It is also important to note that only four of the fifteen most productive countries have a population below 30 million: Zambia (19.4 million), Rwanda (12.9 million), Tunisia (11.7 million), and Libya (6.9 million) [
65]. This is consistent with an effect of population size on the research productivity of a country [
66]. When adjusting research productivity by population size for the considered nations, we found that only four countries achieve a rate of research publications per 1 million citizens above 8:
Tunisia (34.64),
Egypt (12.06),
Morocco (12.04), and
South Africa (8.30). This means that the advantage of these four countries over the rest of the continent is not due to population size. It is rather due to other factors such as the higher quality of research capacities in these countries, the existence of a robust research infrastructure, and the development of efficient research policies and funding programs [
67]. This can be also related to the higher density of medical workers in these four countries (>20 per 10,000 citizens) [
68]. The relatively limited productivity of several countries with a high rate of medical specialists, like
Libya and
Mauritius is explained by the lack of a computer science research community in these nations [
33,
58].
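The population-adjusted rate used above is a plain ratio of publications to population in millions. The following minimal Python sketch illustrates it with the Tunisian figures quoted in the text (407 publications, 11.7 million inhabitants). The helper name `publications_per_million` is hypothetical. Note that the rounded population yields about 34.79 rather than the reported 34.64, presumably because the reported rate was computed from an unrounded population figure; this is an assumption, not something stated in the source.

```python
def publications_per_million(publications: int, population_millions: float) -> float:
    """Return the number of publications per one million inhabitants."""
    if population_millions <= 0:
        raise ValueError("population_millions must be positive")
    return publications / population_millions


# Figures quoted in the text for Tunisia: 407 publications, 11.7 million people.
tunisia_rate = publications_per_million(407, 11.7)
print(f"Tunisia: {tunisia_rate:.2f} publications per million inhabitants")
```

The same ratio with GDP in billions of USD in the denominator gives the publications-per-GDP rate, given matching GDP figures.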
When examining the effect of the Gross Domestic Product (GDP) in USD on the research productivity of the fifteen African nations, we found that only
Rwanda has both a nominal GDP (12.098 billion USD) and a GDP per capita (912 USD) that are not ranked among the top twenty in the continent [
65]. This country has succeeded in emerging thanks to a government-led research policy that aims to grow its local Artificial Intelligence community from the perspective of research and development [
69]. This involves the hosting of international branch campuses in the country such as Carnegie Mellon University Africa, the creation of a local artificial intelligence ecosystem involving startups and corporation branches, and the development of capacity building programs in ML [
69]. However, we found that eight of the fifteen considered countries have a GDP per capita that is not ranked among the top twenty in Africa:
Kenya (2255 USD),
Nigeria (2326 USD),
Zambia (1348 USD),
Tanzania (1245 USD),
Uganda (1105 USD),
Ethiopia (1097 USD),
Sudan (916 USD), and
Rwanda (912 USD). This suggests that the funding programs provided for African countries, capacity building events like
Deep Learning Indaba, and free online and offline courses and mentorships in ML and biomedical informatics have helped to bridge the gap between African countries caused by financial burdens [
69]. When examining the number of publications per one billion USD of GDP, we found that only four countries achieved a rate of 2 or more:
Tunisia (8.79),
Morocco (3.12),
Egypt (2.67), and
Rwanda (2.64). This suggests that the higher productivity of these countries is not explained by a better financial situation. As explained before for Rwanda, this advantage is mainly due to better research policies and capacities in these countries, particularly related to the field of artificial intelligence and digital health [
70]. Although several African countries contribute more to research about ML and healthcare, their productivity and impact remain below the international average [
33,
58] and they also require foreign funding and capacity building programs from developed countries to evolve.
When identifying the twenty-one most productive institutions in this field, we found that only four African countries are featured as shown in
Figure 5:
Egypt (8 universities),
South Africa (4 universities),
Tunisia (2 universities), and
Morocco (2 universities). This is in line with our findings regarding the leading position of these nations in African ML research for healthcare. These universities are led by
Cairo University (Egypt, 218 publications),
Mansoura University (Egypt, 160 publications),
University of Cape Town (South Africa, 120 publications),
Menoufia University (Egypt, 111 publications),
University of Sfax (Tunisia, 111 publications),
Benha University (Egypt, 104 publications),
Ain Shams University (Egypt, 101 publications), and
Mohammed V University in Rabat (Morocco, 92 publications). This is consistent with previous findings on the status of these universities as the most productive ones in Africa [
61]. This is achieved thanks to a strong government-led collaboration network between Tunisia, Algeria, and Morocco and another one between Egypt and South Africa [
61]. This is also due to robust collaborations between these African universities and non-African ones, particularly research collaborations of Tunisian, Algerian and Moroccan universities with French and German ones, the collaborations between Egyptian and Saudi universities, and the research partnership between South African universities and the ones in the United Kingdom, Sweden, Netherlands, and the United States of America [
61]. The analysis also revealed the significant contribution of five Saudi universities to ML research for healthcare in Africa. This suggests that the effect of Saudi-Egyptian scholarly collaborations on the research productivity of Egyptian universities is greater than that of the alliances between Tunisian, Algerian, Moroccan and South African universities and their collaborators in Europe and North America [
61]. Further efforts should be made to enhance the quality of research collaborations between African universities and world-class ones like the University of Oxford through initiatives like the
Africa Oxford Initiative. In another context, it is important to note that no AI corporation or startup is featured among the most published African institutions in ML and healthcare. This can be explained by the fact that the AI industry in Africa is still in its very early stages [
71]. This situation differs from that of the developed world, mainly the United States of America, where AI corporations such as Google and Microsoft contribute significantly to the development of this research field and where industry contributes up to 5% of the biomedical informatics research production [
72].
The analysis of the 22 most prolific authors of African biomedical ML scholarly outputs revealed that 12 of the scientists were Egyptian, 2 were from Morocco, and 2 were from South Africa, as shown in
Table 1. These scientists are led by
Aboul Ella Hassanien (52 publications - Cairo University, Egypt),
Romany F. Mansour (27 publications - New Valley University, Egypt),
Shaker El Sappagh (26 publications - Benha University, Egypt), and
Fahmi Khalifa (23 publications - Mansoura University, Egypt). This suggests that the strong standing of Egypt in biomedical ML research is mainly due to the existence of a large community of highly productive scientists in the field. These scientists contribute to the development of research customs and collaborations and to the shaping of effective research directions inside their institutions, leading to a sharp increase in their productivity [
73]. In fact, two scientists in our list have even been featured among the most productive scientists in the world by coauthoring over 40 papers in one year several times between 2000 and 2016:
Aboul Ella Hassanien (Cairo University, Egypt) and
Dan J. Stein (University of Cape Town, South Africa) [
74]. In addition, these individuals can be the engine of highly cited publications by providing guidance to their colleagues in the same institution and by identifying trending topics based on their extensive experience [
75]. The lack of such scientists in other countries, particularly Morocco, South Africa, and Tunisia, suggests that the development of research outputs about ML for healthcare in these countries is based on a collaborative effort rather than on individual ones. This situation does not seem to be similar to that of African research on biomedical informatics, where several highly productive scientists are leading the field in South Africa and probably in Tunisia and Morocco through multiple international collaborations and large-scale research projects [
45]. This is mainly explained by the fact that the field of ML for healthcare in these countries is not as mature as in Egypt; several highly productive scientists may appear in the next few years in Tunisia, Morocco and South Africa as the field becomes more developed.
Table 1 also revealed that several scientists working outside Africa are featured among those publishing the most African research outputs on ML for healthcare:
Ayman El-Baz (34 publications - University of Louisville, United States of America),
Mohammed Ghazal (26 publications - Abu Dhabi University, United Arab Emirates),
Islem Rekik (20 publications - Istanbul Technical University, Turkey),
Mohamed Elhoseny (18 publications - University of Sharjah, United Arab Emirates),
Sanjay Misra (18 publications - Østfold University College, Norway), and
Ahmed Soliman (17 publications - University of Louisville, United States of America). Most of these researchers are Egyptians who serve as liaisons between their home country and their host institution, building, maintaining and growing research collaborations between their host country and their country of origin. This confirms a general trend of scientists working abroad to collaborate with their country of origin [
76]. Such a collaboration is enhanced through the establishment of a joint supervision of Ph.D. students between the host country and the home nation of the productive scientist [
77]. This type of collaboration has been enhanced during the COVID-19 pandemic thanks to the organization of online conferences, which made it easier to develop collaborations to fight the disease outbreak, and because several scientists were stranded in their home country due to travel restrictions [
78]. The two non-Egyptian scientists
Islem Rekik (Tunisia) and
Sanjay Misra (India) are involved thanks to their past research experience in Africa respectively with
African Network for Artificial Intelligence in Radiology and Imaging (Morocco) and
Covenant University (Nigeria).
The analysis of the types of the African research outputs revealed that these publications are mostly
articles (2,223 publications),
conference papers (1,158 publications), and
reviews (221 publications) as shown in
Figure 6. Although reviews are among the top three publication types, their number is small compared to that of
articles and
conference papers. Developing more review papers is important to provide an overview about the state-of-the-art of ML for healthcare to the African research audience [
79]. Limited interest is shown to short communications like notes, editorials, short surveys and letters to the editor as these kinds of publications have a less significant weight than
articles,
reviews, and
conference papers in evidence-based research [
79]. However, short communications and letters to the Editor can be very useful to brainstorm about a given topic and to develop the discussion, the accuracy, and dissemination of valuable research findings [
80,
81]. African scientists should use this route of publication to interact with other scientists across the continent and enhance the quality of their research outputs. Besides, African scientists do not significantly publish data papers to describe their datasets for ML for healthcare. Data papers are very important to provide detailed information about Africa-related datasets and ensure their availability for other scientists working on biomedical applications in the African context [
82]. The lack of documented local datasets limits the development of customized solutions for digital health in the continent. More emphasis should be placed on the development of data papers in Africa. The number of
book chapters (130 publications) is higher than that of short communications but lower than that of journal articles and conference papers. Books can include literature reviews and articles, which is why book chapters are more numerous than short communications [
83]. However, computer scientists tend to cite and interact with book chapters less than social scientists [
83]. This explains the tendency of African scientists to publish journal articles and conference papers rather than book chapters. One positive aspect of the African research production about ML for healthcare is that only two retractions occurred over 29 years. This speaks in part to the integrity of African research [
84]. Nevertheless, this should be interpreted with care, as most of the research publications have been issued since 2020 and have not yet been scrutinized by proficient scientists for long enough to identify research flaws and scientific misconduct [
84].
When examining the contribution of North Africa to every publication type, we found that most types of research publications are dominated by North African scientists (Grey in
Figure 6). Particularly,
journal articles and
conference papers are mostly co-authored by North African institutions, at rates of 61% and 75% respectively. The significantly higher rate of conference papers from North African countries is mainly explained by the high cost of registration and travel to attend scholarly conferences, which cannot be afforded by research scientists in Africa, and the lack of top-level conferences organized in the continent [
85]. The situation could have been worse had the COVID-19 pandemic not enabled online participation in scholarly conferences [
86].
Diversity, Equity and Inclusion (DEI) programs are currently being established to address the lack of participation of the Global South, particularly Africa, in highly regarded conferences through fee waivers and mentorships [
87]. Several top-tier ML conferences will even be organized in Africa such as
ICLR 2023⁴ (Rwanda) and
MICCAI 2024⁵ (Morocco). It is interesting to see that
reviews,
editorials,
notes, and
short surveys are mostly published by Sub-Saharan African institutions. The higher rate of editorials by Sub-Saharan Africans is mainly due to the establishment of Africa-related special issues edited by Sub-Saharan African scientists in scholarly journals as a part of the DEI Program [
87]. Other reasons can be the involvement of African scientists in scientific society-driven special issues [
88] or the organization of special issues in Sub-Saharan African journals about ML for healthcare [
89]. The greater interest of Sub-Saharan Africa in publishing
reviews and
short surveys is explained by the fact that Sub-Saharan African scientists are exploring the field of biomedical ML before contributing to it, by identifying and understanding its state of the art, recent advances, and limitations [
90]. The significant publication of
Notes by Sub-Saharan Africa is quite surprising, as this region shows little interest in publishing short communications. This finding can be explained by the fact that notes can be short journal articles or unstructured reviews, as in
BMC Research Notes, and that notes can be easier to write for Sub-Saharan African scientists, particularly aspiring ones, than articles, conference papers, and reviews [
91].
When identifying the publishers of the African outputs on ML for healthcare, we found that 2,783 out of 3,772 publications (76%) were issued by ten world-class publishing houses, as shown in
Figure 7. These main publishers are led by
Springer (Germany, 722 publications),
IEEE (United States of America, 693 publications),
Elsevier (Netherlands, 486 publications), and
MDPI (Switzerland, 273 publications). This confirms the oligopoly of scholarly publishers, in which fewer than ten corporations, particularly
Reed-Elsevier,
Wiley-Blackwell, and
Springer, dominate the market [
92]. An interesting finding was that
Taylor & Francis, one of the top five research publishers, did not feature among the leading publishing corporations issuing African papers on ML and healthcare. This is explained by the lack of scholarly journals related to biomedical informatics published by
Taylor & Francis⁶. Among the ten leading publishers, only
Hindawi can be considered an African publisher (233 publications), as it was founded in Cairo, Egypt, before moving to London, United Kingdom. Africa should increase its number of established scholarly publishers to enhance the inclusion of its research outputs, particularly those related to biomedical ML, in English and in local languages, in large-scale bibliographic databases [
94]. The rise of
MDPI and
BioMed Central as established publishers of African biomedical ML research outputs is mainly due to their development of biomedical mega-journals, whose publishing model reduces publication delays through easier and more flexible peer review [
95].
Looking at the main publication types by publisher, we found three different situations that depend on the policy of each corporation (Orange and blue in
Figure 7). The two main publishers
Springer and
IEEE publish both conference papers and journal articles. This variety of publication types explains the domination of these two publishers in the field of ML for healthcare.
BioMed Central,
MDPI,
Frontiers Media S.A.,
Hindawi Limited,
Wiley and
Blackwell Publishing exclusively publish journal articles, while
Elsevier mostly issues journal articles along with a small number of conference papers. This can be explained by the policy of
BioMed Central,
MDPI,
Frontiers Media S.A., and
Hindawi Limited, which mainly focus on the curation of open-access mega-journals [
95]. As for
Wiley,
Blackwell Publishing, and
Elsevier, they are mostly known for maintaining subscription-based and open-access journals although they have several venues for conference papers such as
Procedia Computer Science.
ACM is the only publisher that exclusively issues African conference papers about ML for healthcare. This is quite surprising, as the
ACM publishes many computer science journals that could include African research about ML for healthcare, such as
Communications of the ACM and
ACM Transactions on Database Systems [
93]. The choice of the publication type by the African scientists for every publisher is mainly related to what every corporation provides as topics and publishing models for research journals and conferences. As shown in
Table 2,
Springer provides a specific venue for the proceedings of health informatics conferences (
Lecture Notes in Bioinformatics) as well as the general venue for computer science conferences (
Lecture Notes in Computer Science). These series include 113 out of the 344 conference papers issued by
Springer. The other part of the conference papers is published in other specific venues about sub-fields of computer science such as
Advances in Intelligent Systems and Computing,
Lecture Notes in Networks and Systems, and
Communications in Computer and Information Science. Similarly,
ACM hosts a venue for the proceedings for computer science conferences (
ACM International Conference Proceeding Series) including 56 out of the 63 conference papers issued by this scholarly publisher. Even for
Elsevier, most of the conference papers (35 out of 46 publications) are issued as a part of its own venue for computer science proceedings (
Procedia Computer Science). By contrast,
IEEE does not provide a series for conference proceedings and publishes the outputs of every conference as an independent book. The importance of proceedings series in indexing conference papers has been confirmed for the computer science field [
96], probably because these venues allow an easier indexing of scholarly conferences by bibliographic databases. The advantage of the main conference paper publishers over
Elsevier is mainly due to the involvement of
ACM and
IEEE in the regular organization of scholarly conferences and the broad scope of
Springer conference series, which accept conferences with a narrow regional representation and research topic [
97]. The African community should study how each publisher decides whether to accept conference proceedings and work accordingly to enhance the indexing of continent-level conferences in biomedical informatics, particularly in Sub-Saharan Africa.
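The proceedings-series shares discussed above can be recomputed directly from the reported counts. The following Python sketch is illustrative only: the dictionary layout and the function name `series_share` are assumptions introduced here and are not part of the study's pipeline; the counts are those quoted in the text.

```python
# Counts quoted in the text: (papers in the named proceedings series,
# all conference papers by that publisher). The data structure and the
# helper name are hypothetical, chosen only for illustration.
SERIES_COUNTS = {
    "Springer": (113, 344),  # Lecture Notes in Bioinformatics / Computer Science
    "ACM": (56, 63),         # ACM International Conference Proceeding Series
    "Elsevier": (35, 46),    # Procedia Computer Science
}


def series_share(in_series: int, total: int) -> float:
    """Return the percentage of conference papers that sit in the series."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * in_series / total


# Percentage per publisher, rounded to one decimal place.
shares = {name: round(series_share(*counts), 1)
          for name, counts in SERIES_COUNTS.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}% of conference papers appear in the named series")
```

Under these counts, the named series account for roughly a third of the Springer conference papers but for the large majority of the ACM and Elsevier ones.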
As for the journals with the most publications, we found that most of them are open-access mega-journals that publish research papers after flexible peer review and a short editorial delay, such as
IEEE Access (91 publications, IEEE),
Applied Sciences (40 publications, MDPI),
BioMed Research International (28 publications, BioMed Central), and
Scientific Reports (28 publications, Nature Publishing Group). Several open-access journals with a narrower scope, but with the same editorial policy of quick manuscript processing and sometimes a higher acceptance rate and efficient editorial services such as proofreading and typesetting, are also among the main target journals for African research on biomedical ML, particularly
Computational Intelligence and Neuroscience (68 publications, Hindawi),
International Journal of Advanced Computer Science and Applications (46 publications, Science and Information Organization),
Journal of Healthcare Engineering (40 publications, Hindawi),
Sensors (34 publications, MDPI),
Electronics (27 publications, MDPI), and
Informatics in Medicine Unlocked (26 publications, Elsevier). The leading position of
Elsevier and
Springer in publishing journal articles about African biomedical ML is mainly related to the creation of scholarly journals specific to a particular topic of biomedical informatics like
Computers in Biology and Medicine (31 publications, Elsevier),
Biomedical Signal Processing and Control (30 publications, Elsevier), and
Neural Computing and Applications (30 publications, Springer). Looking at the representation of Sub-Saharan African institutions in the top journals, it was clear that only five journals were not dominated by North Africa:
Computational Intelligence and Neuroscience (Hindawi, 32%),
BioMed Research International (BioMed Central⁷, 14%),
Scientific Reports (Nature Publishing Group, 35%),
Journal of Healthcare Engineering (Hindawi, 32%), and
Informatics in Medicine Unlocked (Elsevier, 46%). This advantage for
Hindawi can be explained by its lack of manuscript formatting requirements as well as its free-of-charge language proofreading report at the point of submission⁸. In addition,
Hindawi provides a fee waiver for publication in its scholarly journals on a rotating basis [
98]. Such editorial services are friendly to early-career Sub-Saharan African scientists who lack research support [
99]. Concerning
Scientific Reports and
Informatics in Medicine Unlocked, the only reason Sub-Saharan African authors choose them as target journals is their short editorial delay, which is why they are considered less often than the journals maintained by
Hindawi. The list of the main journals publishing African outputs about biomedical ML also reveals that 6 out of the 13 top journals (46%) are not related to computer science at all and that 2 journals (15%) are multidisciplinary mega-journals publishing health-related research among other outputs on topics ranging from exact sciences to social sciences. This lack of consideration of health-related research venues is confirmed by the number of considered publications indexed by PubMed, a bibliographic database for biomedical scholarly publications. In fact, we found that only 948 out of 3,772 publications (25.1%) are indexed by PubMed. This indicates a lack of involvement of health specialists, particularly physicians, pharmacists, and dentists, in biomedical informatics and digital health research in Africa, and that the field is largely dominated by computer scientists. This is linked to the limited availability of biomedical informatics courses in medical schools [
100]. Further efforts should be made to include medical specialists in biomedical ML research. The situation was better during the first wave of the COVID-19 pandemic in 2020, when 147 out of 679 publications (21.6%) were indexed by Scopus. This is mainly due to the timely awareness of the clinical community that, given the urgency of the situation, health information systems can be important for monitoring the evolution of the disease [
101].
The analysis of the number of open-access publications by publisher (Grey in
Figure 7) finds that the publishers of open-access journals providing rapid and flexible peer review and low-cost editorial services (
MDPI and
Hindawi) lead the open-access publishing industry in issuing African research about biomedical ML. Several publishers such as
Tech Science Press,
Frontiers Media S.A.,
and BioMed Central also publish all their outputs as open-access publications. However, the African community does not target them as much as
MDPI and
Hindawi due to their higher publication fees. Surprisingly,
Springer,
IEEE and
Elsevier, which are not specialized in open-access publishing, publish a significant share of the African research as open-access outputs. This is explained for
Springer and
Elsevier by the availability of full open-access fee waivers for low-income countries, including Sub-Saharan African ones, through funding programs like
Research4Life and
cOAlition S [
102]. The open-access publications of
IEEE are mainly related to its open-access mega-journal,
IEEE Access, providing flexible peer review and editorial policies [
103]. Looking at the rate and types of open-access licenses assigned to publications, we found that 1,954 out of 3,772 publications (51.8%) are open access. This is a direct result of the tendency of Sub-Saharan African institutions to publish their research outputs as open-access scholarly publications [
102] and of the free sharing of COVID-19-related publications by publishers during the first wave of the pandemic in 2020 to increase the efficiency of the scholarly response to the disease outbreak [
104]. These outputs are mostly gold (1,336 out of 1,954 publications) or green open access (1,297 out of 1,954 publications) as shown in
Figure 8. Most of these publications are under gold and green open access licenses at once. This tendency to favor gold and green open access is higher for Sub-Saharan Africa (Grey in
Figure 8) and is mainly explained by the freedom given to authors and institutions to upload their research papers to repositories before and after final publication and to share them freely with the scientific community, in contrast to bronze open access, which prohibits the reuse of the publications by the authors [
105].
Figure 8 also shows a limited tendency of the authors of papers in pay-walled journals to pay article processing fees to make their research output open access, as only 159 open-access publications are issued in hybrid journals. This is due to the high fees that authors have to pay to be granted open access (≈3,000 USD) [
106] and the possibility of self-archiving for free [
107]. Further efforts should be made to enhance open-access publishing in Africa by providing sustainable funding resources and spreading open science practices.
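The observation that many open-access publications are gold and green at once can be bounded with a short inclusion-exclusion calculation on the reported counts. The sketch below is a minimal illustration: the variable names are ours, and it assumes that every gold or green publication is counted among the 1,954 open-access publications.

```python
# Counts quoted in the text; variable names are illustrative assumptions.
TOTAL_PUBLICATIONS = 3772
OPEN_ACCESS = 1954
GOLD = 1336
GREEN = 1297

# Share of all publications that are open access.
oa_share = 100.0 * OPEN_ACCESS / TOTAL_PUBLICATIONS

# Inclusion-exclusion: the union of gold and green equals
# gold + green - (gold and green). The union cannot exceed the
# open-access total, so the overlap is at least
# gold + green - open access.
min_overlap = max(0, GOLD + GREEN - OPEN_ACCESS)

# The overlap can never exceed the smaller of the two sets.
max_overlap = min(GOLD, GREEN)

print(f"Open-access share: {oa_share:.1f}%")
print(f"Gold-and-green overlap: between {min_overlap} and {max_overlap} publications")
```

With these figures, at least 679 publications must carry both labels; the exact overlap cannot be recovered from the totals alone.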
The screening of the subject areas of the African research outputs about ML for healthcare, as inferred from their source titles, has revealed that most of the work is published in computer science-related, engineering-related, medicine-related, and mathematics-related research venues, as clearly shown in
Figure 9. Another evident finding is the lack of publication of biomedical ML outputs in venues dealing with unrelated fields such as
Business, Management and Accounting,
Earth and Planetary Sciences,
Economics, Econometrics and Finance, and
Arts and Humanities. What deserves attention, however, is the limited publishing about biomedical ML applications in
Pharmacology, Toxicology and Pharmaceutics,
Nursing,
Psychology,
Veterinary Medicine, and
Dentistry. Although artificial intelligence has been used for many years to support clinical medicine practices [
108], the application of computer science to other health professions is quite a new field, particularly in Veterinary Medicine [
109], Nursing [
110], Pharmacology [
111], and Dentistry [
112]. Psychology is developing computational methods for remote diagnosis and treatment based on the principles of human-computer interaction [
113] and social network analysis [
114]. An interesting fact is that these emerging fields are mostly dominated by Sub-Saharan Africa. This seems to be an excellent opportunity for Sub-Saharan Africa to grow its biomedical ML research in fields where competition is limited and the need is growing worldwide. Moreover, Sub-Saharan Africa has proven its efficiency in conducting biomedical ML research related to
Agricultural and Biological Sciences and
Immunology and Microbiology. These outputs respectively reflect research about
food chemistry and safety as well as
infectious diseases, immunology, and microbiology. The advantage of research on ML applications for infectious disease monitoring is confirmed by previous publications on the matter [
115] and is mainly related to the development of research consortia and programs in Sub-Saharan Africa for the control of epidemic infectious diseases, such as the
West Africa International Centers of Excellence in Malaria Research [
46]. ML applications in agriculture are a limited research area in Africa, mainly driven by the individual efforts of several scientists in South Africa [
116], explaining the relative domination of Sub-Saharan Africa on health safety research linked to agriculture. Furthermore, there are several research topics of ML for healthcare where North Africa and Sub-Saharan Africa contribute in a comparable way, mainly
Public, Environmental and Occupational Health (Social Sciences, Health Professions, Environmental Science),
Bioinformatics (Biochemistry, Genetics and Molecular Biology), and
Neurology and Neurosurgery (Neuroscience). The similar distribution of ML research related to public, environmental and occupational health between North Africa and Sub-Saharan Africa does not seem to correlate with the contributions of African countries to public health research, where Sub-Saharan Africa currently dominates the research field [
117]. The equal distribution in public health-related ML research can be explained by the existence of research topics in this scholarly area that are particularly relevant to only one of the two regions, such as
traffic accidents and
pollution-related diseases in North Africa [
34]. Bioinformatics research is developed and equally distributed across Africa thanks to the establishment and ongoing efforts of
H3ABioNet as a continent-level consortium for computational biology [
118]. Neuroscience-related ML research is fairly evenly split between North Africa and Sub-Saharan Africa thanks to the existence of highly productive research communities on both sides of the African continent, particularly in Tunisia, Morocco, Egypt, Nigeria and South Africa [
119]. By contrast, there are several fields that are dominated by North Africa. These fields explain the advantage of North Africa in terms of scholarly productivity related to biomedical ML research:
Decision Sciences stands for ML research about clinical decision support and recommendation system engineering.
Chemical Engineering, Chemistry, and Materials Science mainly reveal research outputs linked to biochemistry, nanomedicine, device engineering, and drug discovery.
Physics and Astronomy and Energy identify research related to Biophysics, Nuclear Medicine, Oncology, and Radiology.
Several fields covered by these three clusters, such as nanomedicine [
120], clinical decision support [
121,
122], and drug discovery [
123], are still incubating and do not explain the large gap between North Africa and Sub-Saharan Africa. The publication bias towards North Africa is mainly explained by the higher research productivity of this region in fields like Oncology [
124], Radiology [
125], and Biochemistry [
126].
The assessment of the ten most cited scholarly publications about ML for healthcare has revealed that eight of these outputs have been issued by
Elsevier as shown in
Table 3: Three papers are published in
Expert Systems with Applications and two in
Computer Methods and Programs in Biomedicine. This suggests that
Elsevier publishes more impactful research outputs than the other corporations due to its highly selective editorial policy characterized by a limited acceptance rate [
127,
128]. Other publishers should revise their peer review policies to achieve better research quality and consequently a better citation impact. A surprising finding is that only five out of the ten most cited publications are open access. This shows the limits of open access as a factor for a single publication to achieve highly-cited status. In other words, there is a level of citation impact at which the open-access advantage no longer applies and only publication quality matters. Looking at the venues of the ten highly-cited publications, we found that only one of them was a conference paper. This finding also applies to the list of highly-cited papers about the COVID-19 pandemic [
129] among other lists of highly-cited papers [
130]. This is mainly explained by the relative inability of conferences to generate groundbreaking publications due to page limits and the short time available for reviewing manuscripts. One solution is the
ACL Rolling Review, launched to enable a longer peer review period for conferences [
131]. The list of the authors of the best-cited research publications revealed that
Mohamed Loey (Benha University, Egypt) is the first author of two highly-cited research publications. This author is not included in the list of most productive scientists in
Table 1, suggesting that early-career scientists can write highly-cited publications if they meet high standards of research. Looking at the titles of the ten most cited papers, we found that the two top-ranked scholarly publications are reviews, confirming the citation advantage of review papers over other publication types [
132]. We also found one data paper published in
Symmetry in our list, showing that biomedical dataset creation is important work that can be worth citing. Finally, we found that six works deal with biomedical image classification, showing the emphasis placed on this research topic by the African community.