1. Introduction and Literature Overview
Data literacy is the ability to read, understand, create, and communicate data as information. It is a subset of visual, information, and other literacies, and is an important skill for knowledge workers and consumers in both modern and traditional cultures (Wang and Strong 1996, Levitan and Verhulst 2016, Mandinach and Gummer 2016, O’Connor 2021). Data Literacy comprises several interconnected components and dimensions, integrating a blend of skills, knowledge, and attitudes essential for comprehensive proficiency. It involves a range of technical skills, including data analysis, statistical reasoning, data visualisation, and proficiency in using relevant software or programming languages for data manipulation and interpretation. It also requires an understanding of data concepts, such as different data types (quantitative, qualitative, mixed methods), data sources, and data collection methods, together with an awareness of ethical considerations related to data handling and usage (Verhulst 2016, Nwagwu 2024).
Data Literacy encompasses attitudes conducive to a data-driven mindset, including curiosity, critical thinking, skepticism, and an appreciation for the role of data in decision-making. It involves the willingness to explore and question data, acknowledging both its potential and limitations. These components collectively form the foundation of Data Literacy, equipping individuals with the ability not only to comprehend and interpret data but also to critically evaluate its relevance, make informed judgments, and effectively communicate insights derived from data to diverse audiences. As the digital landscape continues to evolve, Data Literacy remains an evolving concept, demanding continual adaptation and acquisition of new skills to navigate and harness the potential of data in an ever-changing world (D’Ignazio 2017, Stanton et al. 2017).
The importance of Data Literacy in modern society cannot be overstated, as it serves as a cornerstone for informed decision-making, transformative problem-solving, innovation, and ethical considerations across diverse domains. Its significance lies in its capacity to empower individuals and organisations to harness the potential of data in a rapidly evolving digital landscape. Data Literacy is indispensable in an era where data has proliferated across every aspect of society. The ability to navigate, interpret, and derive insights from this abundance of information is critical for individuals and organisations alike. In a world inundated with data, being data-literate is not just advantageous; it is an imperative skill necessary for success in various spheres. Data Literacy forms the bedrock of informed decision-making. It enables individuals to base their choices on evidence and insights derived from data rather than intuition or guesswork. Decisions driven by data are often more accurate, effective, and strategic (Osasona et al. 2024).
Gould (2017) touches upon the importance of Data Literacy in understanding probabilities and making decisions based on statistical reasoning. He discusses how a grasp of Data Literacy aids in distinguishing chance occurrences from significant trends. Furthermore, he argues that Data Literacy is foundational in preparing individuals for the demands of a data-driven job market and for informed decision-making therein. Saltu-Rivas et al. (2022) highlight the crucial role of Data Literacy in fostering informed decision-making processes. They emphasise how enhancing Data Literacy skills empowers individuals to utilise data effectively for better decision outcomes.
Morrow (2021) discusses the importance of Data Literacy in organisational contexts. He illustrates how improved Data Literacy aids in better communication and decision-making within businesses and institutions. Also, Heiser et al. (2023) explore the significance of Data Literacy in managerial roles. They emphasise how Data Literacy equips managers with the competence to make informed decisions based on data-driven insights.
In the age of big data, ethical considerations surrounding data usage are paramount. Data Literacy includes an understanding of ethical principles related to data privacy, security, bias, and responsible data handling. It empowers individuals to navigate these ethical complexities, ensuring that data is used ethically and responsibly. Data Literacy empowers individuals and organisations across diverse fields. In healthcare, it aids in diagnosing illnesses, predicting outbreaks, and personalising treatments. In business, it drives marketing strategies, operational efficiencies, and customer insights. In education, it improves student performance analysis and personalised learning (Ghodoosi 2023). Policymakers leverage Data Literacy to craft evidence-based policies, while scientists use it to advance research and innovation. Ultimately, Data Literacy empowers individuals and organisations to extract meaningful insights from data, enabling them to make better decisions, solve complex problems, foster innovation, and navigate ethical considerations effectively. As data continues to permeate every facet of society, the importance of Data Literacy will only continue to grow, shaping a more informed, innovative, and ethically conscious world. These opinions are supported by several authors (Aldboush and Ferdous 2023).
Debruyne et al. (2022) explore the ethical implications of algorithms and data-driven decision-making, emphasising the need for ethical considerations within Data Literacy in various societal contexts.
Loukides, Mason and Patil (2018) emphasise the ethical responsibilities surrounding data usage. They advocate for a broader understanding of Data Literacy that includes ethical considerations in handling data.
Nissenbaum (2009) delves into the ethical dimensions of Data Literacy, stressing the importance of understanding the ethical implications of data use and its societal impact, especially concerning privacy.
The impact of Data Literacy spans across various domains, revolutionizing approaches, and driving transformative changes. Let us delve into some compelling instances and success stories that highlight the profound effects of enhanced Data Literacy in diverse fields like healthcare, business, education, and policymaking. Data Literacy has been a game-changer in healthcare, significantly influencing patient care and public health initiatives. Imagine the power of predictive analytics, where healthcare professionals can foresee potential health risks for individuals based on data patterns, enabling early interventions and personalised treatments. During outbreaks or pandemics, data-driven insights aid in predicting disease trends, allocating resources effectively, and formulating targeted public health strategies. For instance, analysing epidemiological data assists in identifying areas susceptible to outbreaks, guiding authorities in implementing preventive measures.
In the corporate world, Data Literacy empowers organisations to make informed decisions, optimise operations, and drive innovation. Companies leveraging data analytics gain deep insights into consumer behaviour, enabling them to tailor marketing strategies, optimise supply chains, and develop products aligned with market demands. Consider the impact of data-driven decision-making in retail, where inventory management based on analytics minimises excess stock, maximises sales opportunities, and enhances overall efficiency, thereby reducing costs and boosting profitability. Data Literacy in education is reshaping learning paradigms. Educators armed with Data Literacy skills analyse student performance data to personalise learning experiences. This enables tailored teaching methodologies, identifies learning gaps, and provides targeted interventions to ensure every student’s needs are met. Institutions utilising data-driven insights at a systemic level enhance curriculum design, resource allocation, and policy formulation to foster a conducive learning environment.
In governance, Data Literacy fuels evidence-based policymaking. Governments leverage data analytics to devise more effective policies in various sectors. Whether it is designing healthcare policies for improved service delivery, urban planning based on demographic trends, or optimising resource allocation in public sectors, data-driven insights guide policymakers in making informed decisions. Monitoring policy effectiveness through data analysis allows for adaptive policymaking, enhancing governance and citizen welfare. Success stories abound across these domains, showcasing tangible impacts of improved Data Literacy. From reducing hospital readmission rates through predictive analytics in healthcare to optimising inventory management for cost-efficiency in business, and from personalised learning experiences in education to evidence-based policymaking in governance, Data Literacy’s transformative effects are evident. In each success story, Data Literacy acts as the catalyst, empowering individuals and organisations to harness the potential of data, make informed choices, drive innovation, and ultimately bring about positive, measurable changes. As Data Literacy continues to evolve and permeate various sectors, its transformative impact will remain a driving force in shaping a more data-driven, efficient, and innovative world (Ongena 2023, Jiang 2023, Olszewski and Abukhdier 2023, Bartholo, Koslinski and de Castro 2022).
The concept of Data Literacy has emerged as a fundamental skill that transcends boundaries and holds immense significance across diverse domains. It represents more than just an understanding of numbers and statistics; rather, it embodies a multifaceted competency involving skills, knowledge, and attitudes necessary to navigate the ever-expanding sea of data. The evolution of Data Literacy has been remarkable. Initially rooted in statistical analysis and data interpretation, it has evolved to encompass a broader spectrum of proficiencies. It now demands technical skills in data manipulation and visualisation, and an understanding of ethical considerations surrounding data usage. This evolution reflects the dynamic nature of data itself and the need for individuals to adapt to emerging technologies and methodologies (Vance, Glimp, Pieplow, Garrity and Melbourne 2022).
Across healthcare, business, education, policymaking, and numerous other fields, Data Literacy plays a pivotal role. In healthcare, it enables predictive analysis for personalised treatments and proactive healthcare interventions, thereby improving patient outcomes. In businesses, Data Literacy empowers organisations to make informed decisions, optimise operations, and foster innovation by leveraging consumer insights and market trends. Within education, it revolutionises teaching methodologies, allowing educators to personalise learning experiences and enhance student outcomes. In policymaking, it drives evidence-based decisions, leading to more effective policies and governance (Pins et al. 2022).
Assessing Data Literacy remains a challenge due to its multifaceted nature. Evaluating skills, knowledge, and attitudes related to data involves subjective elements, making standardised assessments difficult (Santos, Pedro and Mattar 2021). Furthermore, the rapid evolution of technology requires continuous refinement of assessment tools to ensure relevance and accuracy. Success stories in these domains underscore the transformative impact of enhanced Data Literacy. From predicting disease outbreaks to optimising supply chains and tailoring educational approaches, Data Literacy enables innovation, efficiency, and informed decision-making. Looking ahead, there is a need for concerted efforts to promote Data Literacy. This entails integrating it into educational curricula, refining assessment methodologies, fostering ethical data practices, and encouraging cross-disciplinary collaboration. The ongoing cultivation of a data-literate society is crucial to navigating the complexities of our data-centric world, driving innovation, and ensuring responsible and informed decision-making across all facets of society. As the data landscape continues to evolve, the significance of Data Literacy remains steadfast, shaping a future where individuals and organisations harness data’s potential for societal advancement and positive change (Wilkerson, Lanouette and Shareff 2021).
Research in data literacy is widely acknowledged as crucial for the advancement of educational programs and pedagogies. However, there exists uncertainty regarding the scope of data literacy and its optimal integration into educational curricula. While consensus exists on the necessity of data literacy among the general population, educators and policymakers often lack clarity on its specific components, resulting in sporadic efforts to incorporate it into educational standards. To address this, a clearer conceptualisation of data literacy is needed, encompassing the knowledge and cognitive skills required for interpreting and evaluating data across diverse contexts, including personal decision-making, civic engagement, and scientific inquiry (Gehrke, Kistler, Lübke, Markgraf, Krol and Sauer 2021).
This clarification will facilitate the development of assessments, teaching materials, and educational standards tailored to fostering data literacy skills among students of all ages. Furthermore, research in data literacy aims to provide instructional support across disciplines, enabling students to effectively understand and utilise data in their academic pursuits and daily lives. As access to high-quality data becomes increasingly prevalent in various fields, proficiency in data manipulation and interpretation is essential for informed decision-making in education, business, and public policy (Phadkule 2022).
Nwagwu (2024) reviewed the literature on data literacy, observing, among other things, that there is growing interest in research in the area. How can we examine the connections between different pieces of work, such as articles or books, already published on data literacy? Imagine looking at a large web of knowledge on literacy in which each citation is a link between two points. When someone writes a research paper or a book, they often reference other works they used for their research. Citation analysis offers a way to look at these references to see which works are referenced the most, which ones influence others the most, and how ideas flow between different authors and topics. Researchers use citation analysis to understand trends in research, identify key authors and publications in a field, and track the impact of their own work or that of others.
Citation analysis involves the evaluation of scholarly works based on their citations, examining patterns and trends to understand influence, impact, and scholarly communication dynamics (Garfield 1972). It is commonly employed in bibliometrics and scientometrics to assess research productivity and impact (Moed 2005). As a methodical exploration of citations within scholarly literature, it offers a quantitative lens on research communication and impact. By scrutinising citation patterns, including frequency and context, researchers glean valuable insights into the influence and significance of individual works, authors, journals, and research fields (Abramo, D’Angelo and Murgia 2019). Researchers use citation analysis for several purposes. First, they assess impact by examining the citations garnered by specific papers, authors, or journals, thereby gauging their resonance within the scholarly community (Bordons, Fernández and Gómez 2002). Second, citation analysis uncovers trends in research topics, interdisciplinary collaborations, and burgeoning fields, aiding in the identification of evolving scholarly landscapes (Bornmann and Leydesdorff 2014).
Furthermore, citation analysis serves as a pivotal tool for evaluating authors and institutions, enabling funding agencies and academic entities to gauge productivity and impact (Larivière et al. 2016). By mapping knowledge networks through citation networks, researchers delineate the propagation of ideas and identify pivotal contributors and influential nodes, enriching our understanding of research dynamics (Wang et al. 2019). Additionally, citation analysis facilitates the ranking of academic journals based on impact and prestige, thereby informing strategic decisions regarding research dissemination and publication avenues (Waltman and van Eck 2012). Overall, citation analysis offers indispensable insights into the scholarly landscape, equipping stakeholders with the intelligence required for informed decisions pertaining to research priorities, collaborations, and resource allocation (Alonso, Cabrerizo, Herrera-Viedma and Herrera 2009).
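The basic counting step behind citation analysis can be illustrated with a minimal sketch. The reference lists below are hypothetical toy data (the names `paper_A`, `work_1`, and so on are placeholders, not records from this study); the sketch only shows how citation frequencies are tallied from reference lists.

```python
from collections import Counter

# Hypothetical toy corpus: each citing paper maps to the works in its reference list.
reference_lists = {
    "paper_A": ["work_1", "work_2"],
    "paper_B": ["work_1", "work_3"],
    "paper_C": ["work_1", "work_2", "work_3"],
}


def citation_counts(reference_lists):
    """Count how many citing papers reference each cited work."""
    counts = Counter()
    for references in reference_lists.values():
        # Using a set means a work listed twice in one paper is counted once.
        counts.update(set(references))
    return counts


counts = citation_counts(reference_lists)
# Works ordered from most cited to least cited.
most_cited = counts.most_common()
```

Ranking works, authors, or journals then amounts to sorting these tallies after grouping by the unit of interest.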
Related to citation analysis is co-citation analysis. Co-citation refers to the frequency with which two or more works are cited together by other authors in their own publications. It is a measure of the relatedness or similarity of the content of two documents based on the citations they receive. For example, if Author A and Author B are frequently cited together in other works, it suggests a strong connection between their research and ideas (Small 1973, Hjorland and Nicolaisen 2005). This could indicate that they are working on similar topics or that their works complement each other in some way. Studying co-citation patterns can reveal intellectual connections between authors, identify influential works, and uncover emerging trends or research areas within a field. It is often used in bibliometric analysis to map the intellectual structure of a discipline or to identify key players in a particular research area. Co-citation of cited authors refers to the phenomenon where two or more authors are cited together in the same paper by another author (Gipp and Beel 2009, Small and Klavans 2013).
Co-citation is essentially a measure of how often two documents are cited together by other documents. When at least one other document references the same two documents, they are considered co-cited. The frequency of these co-citations indicates the strength of their relationship and suggests they are likely related in meaning. Similar to bibliographic coupling, co-citation is a way to gauge semantic similarity between documents through citation analysis. Imagine there is a diagram illustrating this concept.
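A minimal sketch of the two measures mentioned above, using hypothetical reference lists (all paper and work names are placeholders). Co-citation counts how many citing papers list a given pair of works together; bibliographic coupling, shown for contrast, counts how many references two citing papers share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: citing paper -> works in its reference list.
reference_lists = {
    "paper_A": ["work_1", "work_2"],
    "paper_B": ["work_1", "work_2", "work_3"],
    "paper_C": ["work_2", "work_3"],
}


def cocitation_counts(reference_lists):
    """For each pair of cited works, count the papers that cite both."""
    pairs = Counter()
    for references in reference_lists.values():
        # Sorting gives each pair one canonical ordering, e.g. (work_1, work_2).
        for pair in combinations(sorted(set(references)), 2):
            pairs[pair] += 1
    return pairs


def coupling_strength(reference_lists, paper_x, paper_y):
    """Bibliographic coupling: number of references two citing papers share."""
    return len(set(reference_lists[paper_x]) & set(reference_lists[paper_y]))


cocited = cocitation_counts(reference_lists)
shared_ab = coupling_strength(reference_lists, "paper_A", "paper_B")
```

In this toy data, `work_1` and `work_2` are co-cited by two papers, so their co-citation strength is 2, while `paper_A` and `paper_B` are coupled through two shared references.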
Research in data literacy is widely acknowledged as crucial for the advancement of educational programs and pedagogies. However, there exists uncertainty regarding the scope of data literacy and its optimal integration into educational curricula. While consensus exists on the necessity of data literacy among the general population, educators and policymakers often lack clarity on its specific components, resulting in sporadic efforts to incorporate it into educational standards. To address this, a clearer conceptualization of data literacy is needed, encompassing the knowledge and cognitive skills required for interpreting and evaluating data across diverse contexts, including personal decision-making, civic engagement, and scientific inquiry. Specifically, understanding the connections between different pieces of work, such as articles or books, through citation analysis and co-citation analysis, is crucial for advancing our understanding of data literacy.
Citation analysis offers a quantitative lens into the multifaceted realm of research communication and impact, enabling researchers to assess the influence and significance of individual works, authors, journals, and research fields. Co-citation analysis, on the other hand, provides insights into the relatedness or similarity of the content of two documents based on the citations they receive, helping to identify intellectual connections between authors, influential works, and emerging trends within a field. Despite the importance of citation analysis and co-citation analysis in advancing our understanding of data literacy, there remain challenges in evaluating data literacy skills and integrating findings from citation analysis and co-citation analysis into educational practices effectively.
3. Results
3.1. Citation by Documents (2005-2023)
A total of 997 documents were written on data literacy, a mean of 52 documents per year. We set a minimum of 10 citations per document, which resulted in 205 documents, of which only 81 were connected to other documents; these formed the basis for this analysis. The 81 documents produced 2768 citations, a mean of 34 citations per document. Prado’s 2013 “Incorporating Data Literacy into Information Literacy Programmes: Core competences and content” in Libri has been cited more often (161 times) than any other paper on the subject matter. Prado’s paper was based on the observation that the rising importance of data in society necessitates libraries’ integration of data literacy into their information programs. The paper proposed a framework of core competencies to address this need, facilitating the development of resources and guiding further research.

Table 1. Citation by Top Thirty Documents (2005-2023).

| No. | Label | Cluster | Links | Citations | Norm. citations | Pub. year |
|---|---|---|---|---|---|---|
| 1 | Prado (2013) | 3 | 8 | 161 | 3.0377 | 2013 |
| 2 | Gray (2018) | 1 | 2 | 131 | 10.1307 | 2018 |
| 3 | Pangrazio (2019) | 8 | 10 | 123 | 9.8743 | 2019 |
| 4 | Schildkamp (2015) | 2 | 2 | 114 | 3.1947 | 2015 |
| 5 | Schildkamp (2019) | 2 | 2 | 89 | 7.1449 | 2019 |
| 6 | Hoogland (2016) | 2 | 9 | 83 | 3.5485 | 2016 |
| 7 | Koltay (2017b) | 5 | 9 | 79 | 3.5333 | 2017 |
| 8 | Koltay (2015b) | 1 | 17 | 76 | 2.1298 | 2015 |
| 9 | D’Ignazio (2017) | 4 | 4 | 76 | 3.3992 | 2017 |
| 10 | Gould (2017) | 7 | 5 | 74 | 3.3097 | 2017 |
| 11 | Carmi (2020) | 8 | 1 | 65 | 7.18 | 2020 |
| 12 | Kippers (2018) | 2 | 5 | 64 | 4.9493 | 2018 |
| 13 | Koltay (2016b) | 1 | 6 | 60 | 2.5652 | 2016 |
| 14 | Mandinach (2021a) | 2 | 3 | 56 | 8.1654 | 2021 |
| 15 | Pangrazio (2020) | 6 | 3 | 55 | 6.0753 | 2020 |
| 16 | Reeves (2015) | 9 | 1 | 52 | 1.4572 | 2015 |
| 17 | Macmillan (2014) | 1 | 2 | 49 | 4.6838 | 2014 |
| 18 | Cowie (2017) | 6 | 2 | 48 | 2.1468 | 2017 |
| 19 | Raffaghelli (2020a) | 7 | 5 | 44 | 4.8603 | 2020 |
| 20 | Stephenson (2007) | 5 | 8 | 41 | 1.9524 | 2007 |
| 21 | Ebbeler (2017) | 2 | 3 | 40 | 1.789 | 2017 |
| 22 | Stornaiuolo (2020) | 10 | 3 | 37 | 4.0871 | 2020 |
| 23 | Wolff (2019) | 6 | 5 | 36 | 2.8901 | 2019 |
| 24 | Koltay (2019) | 1 | 5 | 36 | 2.8901 | 2019 |
| 25 | Athanases (2012) | 9 | 1 | 34 | 0.2163 | 2012 |
| 26 | Bowler (2017) | 6 | 6 | 33 | 1.4759 | 2017 |
| 27 | Maybe (2015) | 3 | 7 | 32 | 0.8968 | 2015 |
| 28 | Federer (2016) | 1 | 1 | 32 | 1.3681 | 2016 |
| 29 | Pothier (2020) | 5 | 2 | 31 | 3.4243 | 2020 |
| 30 | Lee (2021) | 10 | 1 | 31 | 4.5201 | 2021 |
Figure 1. Citation by Top Thirty Documents (2005-2023).
Following is Gray, Gerlitz and Bounegru’s (2018) “Data infrastructure literacy” in Big Data and Society. In this paper, the authors reflect on a UN report that makes the case for “global data literacy” in order to realise the opportunities afforded by the “data revolution”. The document has been cited 131 times. Prado’s paper has both more links (8) and a higher cluster number (3) than Gray’s (2 and 1 respectively).
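The "Norm. citations" column in Table 1 can be read with the help of a short sketch. The normalisation method is not stated in this section; the sketch assumes a common convention in which a document's citations are divided by the mean citations of all documents in the data set published in the same year. Under that assumption, dividing citations by the normalised value should recover roughly the same yearly mean for every document from a given year, which is what the three 2019 rows from Table 1 show.

```python
def normalised_citations(citations, year_mean):
    """Assumed convention: citations divided by the mean citations of same-year documents."""
    return citations / year_mean


# (label, citations, normalised citations) for three 2019 documents in Table 1.
rows_2019 = [
    ("Pangrazio (2019)", 123, 9.8743),
    ("Schildkamp (2019)", 89, 7.1449),
    ("Wolff (2019)", 36, 2.8901),
]

# If the assumption holds, each ratio estimates the same 2019 mean.
implied_means = [citations / norm for _, citations, norm in rows_2019]
spread = max(implied_means) - min(implied_means)
```

The three implied means agree to within rounding (about 12.46 citations), so the table is at least consistent with a per-year divisor. A value above 1 then indicates a document cited more than the average document of the same year, which is why a recent paper such as Gray (2018) can have fewer raw citations than Prado (2013) but a much higher normalised score.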
3.2. Citation by Sources (2005-2023)
The total number of sources indexed is 546. Setting the minimum number of documents per source at 2 resulted in 156 documents, of which only 105 were linked and were used in the analysis. We sorted the data first by documents and then by citations. Citations are usually indexed with reference to the documents being cited; hence sorting by documents yields useful insights about citations by sources. On this basis, it can be seen that ACM International Conference Proceeding Series published the highest number of documents on the subject matter (47), and these documents were cited 260 times during the period. Communications in Computer and Information Science published 23 papers that were cited 79 times.
Figure 2. Citation by Sources.
The Teachers College Record (TCR), “… a journal of research, analysis, and commentary in the field of education that has been published continuously since 1900 by Teachers College, Columbia University”, ranks first by citations. The journal published only six articles, one in 2012 and five in 2015, and, with only one cluster, has been cited 429 times altogether. A single cluster with numerous citations typically indicates focused publication that draws considerable attention from researchers in a specific field, with cluster here denoting a specialised area or narrow subject domain within the journal’s broader coverage. Teaching and Teacher Education is the next journal after TCR. It has featured eight documents on the subject, and has a single cluster and 389 citations. However, it has more links (16) and a higher Total Link Strength (30) than TCR (13 and 23 respectively). Educational Researcher, which has a wider focus, has the third highest number of citations (270). By number of documents published, ACM International Conference Proceeding Series and Communications in Computer and Information Science lead with 47 and 23 respectively. ACM International Conference Proceeding Series also accounted for the fourth highest number of citations.
Table 2. Citation by Top Thirty Sources.

| No. | Label | Cluster | Links | Total link strength | Documents | Citations | Norm. citations | Avg. pub. year | Avg. citations | Avg. norm. citations |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ACM International Conference Proceeding Series | 13 | 8 | 11 | 47 | 260 | 27.1754 | 2020.064 | 5.5319 | 0.5782 |
| 2 | Communications in Computer and Information Science | 16 | 6 | 10 | 23 | 79 | 5.0262 | 2018.696 | 3.4348 | 0.2185 |
| 3 | Journal of Physics: Conference Series | 2 | 5 | 6 | 18 | 45 | 4.951 | 2020 | 2.5 | 0.2751 |
| 4 | Proceedings of International Conference of the Learning Sciences, ICLS | 10 | 8 | 12 | 15 | 16 | 4.8611 | 2021.067 | 1.0667 | 0.3241 |
| 5 | Conference on Human Factors in Computing Systems – Proceedings | 8 | 4 | 4 | 13 | 239 | 25.534 | 2019.385 | 18.3846 | 1.9642 |
| 6 | CEUR Workshop Proceedings | 15 | 5 | 6 | 12 | 41 | 3.5447 | 2019.5 | 3.4167 | 0.2954 |
| 7 | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 9 | 2 | 2 | 11 | 21 | 1.8881 | 2019.455 | 1.9091 | 0.1716 |
| 8 | Proceedings of the Association for Information Science and Technology | 10 | 25 | 29 | 10 | 88 | 5.9689 | 2018.8 | 8.8 | 0.5969 |
| 9 | Computer-Supported Collaborative Learning Conference, CSCL | 17 | 7 | 8 | 9 | 20 | 2.0777 | 2019.222 | 2.2222 | 0.2309 |
| 10 | Teaching and Teacher Education | 1 | 16 | 30 | 8 | 389 | 20.7427 | 2018 | 48.625 | 2.5928 |
| 11 | British Journal of Educational Technology | 8 | 14 | 20 | 8 | 74 | 22.5432 | 2021.5 | 9.25 | 2.8179 |
| 12 | Journal of Business and Finance Librarianship | 4 | 10 | 17 | 8 | 54 | 17.1122 | 2020.625 | 6.75 | 2.139 |
| 13 | Education and Information Technologies | 8 | 8 | 9 | 8 | 43 | 40.4072 | 2022.75 | 5.375 | 5.0509 |
| 14 | Journal of Media Literacy Education | 15 | 15 | 21 | 8 | 40 | 4.4184 | 2020 | 5 | 0.5523 |
| 15 | Higher Education Dynamics | 11 | 8 | 10 | 8 | 1 | 1.8713 | 2023 | 0.125 | 0.2339 |
| 16 | Studies in Educational Evaluation | 1 | 8 | 20 | 7 | 179 | 20.0918 | 2019.857 | 25.5714 | 2.8703 |
| 17 | Information and Learning Science | 6 | 18 | 20 | 7 | 21 | 4.5491 | 2022 | 3 | 0.6499 |
| 18 | Journal of Map and Geography Libraries | 10 | 1 | 1 | 7 | 18 | 1.8239 | 2019.714 | 2.5714 | 0.2606 |
| 19 | Lecture Notes in Networks and Systems | 14 | 2 | 2 | 7 | 0 | 0 | 2022.571 | 0 | 0 |
| 20 | Teachers College Record | 1 | 13 | 23 | 6 | 429 | 11.2856 | 2014.5 | 71.5 | 1.8809 |
| 21 | Journal of Library and Information Science in Agriculture | 9 | 1 | 1 | 6 | 2 | 0.2916 | 2021 | 0.3333 | 0.0486 |
| 22 | Big Data and Society | 2 | 3 | 3 | 5 | 186 | 17.0758 | 2019.2 | 37.2 | 3.4152 |
| 23 | Journal of Documentation | 2 | 30 | 42 | 5 | 128 | 7.1709 | 2018.6 | 25.6 | 1.4342 |
| 24 | Journal of Academic Librarianship | 4 | 8 | 15 | 5 | 107 | 13.753 | 2019.6 | 21.4 | 2.7506 |
| 25 | International Journal of Educational Technology in Higher Education | 7 | 3 | 3 | 5 | 88 | 16.7639 | 2021.2 | 17.6 | 3.3528 |
| 26 | Information Communication and Society | 3 | 8 | 8 | 5 | 62 | 8.1744 | 2020.2 | 12.4 | 1.6349 |
| 27 | Action in Teacher Education | 1 | 8 | 10 | 5 | 36 | 2.0608 | 2018 | 7.2 | 0.4122 |
| 28 | Teaching Statistics | 5 | 1 | 1 | 5 | 29 | 9.4049 | 2021.8 | 5.8 | 1.881 |
| 29 | Education Sciences | 5 | 9 | 10 | 5 | 25 | 2.8322 | 2021 | 5 | 0.5664 |
| 30 | Library Philosophy and Practice | 2 | 2 | 2 | 5 | 4 | 0.3815 | 2020 | 0.8 | 0.0763 |
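The averaged columns of Table 2 follow from simple arithmetic, sketched here for three sources using the document and citation counts reported in the table. The variable names are illustrative only, and the Teachers College Record publication years are taken from the text (one article in 2012, five in 2015).

```python
# (documents, citations) for three sources, as reported in Table 2.
sources = {
    "Teachers College Record": (6, 429),
    "Teaching and Teacher Education": (8, 389),
    "ACM International Conference Proceeding Series": (47, 260),
}

# Average citations per document for each source.
avg_citations = {
    name: citations / documents
    for name, (documents, citations) in sources.items()
}

# Average publication year for Teachers College Record:
# one article in 2012 and five in 2015.
tcr_years = [2012] + [2015] * 5
tcr_avg_pub_year = sum(tcr_years) / len(tcr_years)

# Ranking by average citations differs from ranking by document count.
by_avg_citations = sorted(avg_citations, key=avg_citations.get, reverse=True)
by_documents = sorted(sources, key=lambda name: sources[name][0], reverse=True)
```

The results reproduce the table entries (71.5, 48.625 and about 5.53 average citations; average publication year 2014.5) and make the contrast explicit: the source with the most documents has the lowest average citations of the three.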
We manually searched the websites of the sources/journals to identify their disciplinary affiliations. The sources were distributed as follows: Education (24), Library and Information Science (7), Psychology (1), Computer Science/Information Technology (14), Social Sciences (7), Mathematics/Statistics (4), Multidisciplinary (3), General Science (1), Humanities (2), and Various Disciplines/Interdisciplinary (10).
3.3. Citation by Authors (2005-2023)
With the minimum number of documents per author set at 3 and the minimum number of citations per author set at 1, we obtained 2351 authors suitable for the analysis (see Table 3 and Figure 3). The table provides an overview of citations on data literacy by author. It displays the key metrics: cluster affiliation, links, total link strength, document count, citations received, and normalised citations. Together these metrics offer insight into the network dynamics and scholarly impact within data literacy research. Each author’s cluster affiliation signifies their thematic association, while links and total link strength quantify their interconnectedness and prominence within their respective clusters. The document count reflects the scholarly output of each author, whereas the citations received and normalised citations shed light on their influence and recognition within the scholarly community. This analysis aids in understanding the landscape of data literacy research, identifying key contributors, and discerning patterns of scholarly dissemination and impact.
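The difference between links and total link strength can be shown with a small sketch. It assumes the usual network definitions, which this section does not spell out: an author's links are the number of distinct other authors they are connected to, and total link strength is the sum of the strengths of those connections. The author names and strengths below are hypothetical.

```python
from collections import defaultdict

# Hypothetical citation links between authors: (author, author, link strength).
# Each pair of authors appears at most once.
edges = [
    ("author_A", "author_B", 3),
    ("author_A", "author_C", 1),
    ("author_B", "author_C", 2),
    ("author_C", "author_D", 1),
]


def link_metrics(edges):
    """Return (links, total link strength) per author under the assumed definitions."""
    links = defaultdict(int)
    strength = defaultdict(int)
    for first, second, weight in edges:
        for author in (first, second):
            links[author] += 1          # one more connected author
            strength[author] += weight  # add the strength of this connection
    return dict(links), dict(strength)


links, strength = link_metrics(edges)
```

Here `author_C` has the most links (3) but `author_B` has the highest total link strength (5), which mirrors how an author in Table 3 can be connected to many others without those connections being strong, and vice versa.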
Figure 3. Citation by Authors (2005-2023).
In data literacy research, several authors stand out for their contributions, as evidenced by their citations, link strength, and publication output. Ellen B. Mandinach has the highest citation count, with 776 citations across 10 documents, although her total link strength is only 2. Kim Schildkamp is also prominent, with a total link strength of 27 and 510 citations over 9 documents. Tabor Koltay has 20 links and 343 citations across 15 documents, demonstrating a consistent presence in data literacy scholarship. Luci Pangrazio, despite a lower number of documents (7), has a normalised citation score of 29.355, indicating high influence relative to her publication output. Neil Selwyn and Jo Beth Jimerson also deserve recognition. Selwyn has 245 citations across 5 documents, while Jimerson’s work, though smaller in volume, has a focused impact, with 111 citations across only 3 documents. Catherine d’Ignazio, Todd D. Reeves, and Annika Wolff are also among the notable performers, each with distinctive strengths in links and citations, reflecting their roles in advancing the discourse on data literacy.
Table 3.
Citation by Top Thirty Authors (2005-2023).
| No. | Label | Cluster | Links | Total link strength | Documents | Citations | Norm. citations |
|---|---|---|---|---|---|---|---|
| 1 | Mandinach, Ellen B. | 4 | 1 | 2 | 10 | 776 | 31.397 |
| 2 | Schildkamp, Kim | 4 | 17 | 27 | 9 | 510 | 36.300 |
| 3 | Koltay, Tabor | 1 | 20 | 25 | 15 | 343 | 16.873 |
| 4 | Pangrazio, Luci | 3 | 33 | 58 | 7 | 304 | 29.355 |
| 5 | Selwyn, Neil | 3 | 20 | 29 | 5 | 245 | 21.001 |
| 6 | Jimerson, Jo Beth | 7 | 1 | 1 | 3 | 111 | 4.274 |
| 7 | d’Ignazio, Catherine | 11 | 21 | 31 | 4 | 97 | 9.573 |
| 8 | Reeves, Todd D. | 4 | 2 | 4 | 5 | 91 | 4.393 |
| 9 | Wolff, Annika | 2 | 11 | 13 | 10 | 88 | 5.626 |
| 10 | Carmi, Elinor | 3 | 2 | 4 | 4 | 84 | 9.710 |
| 11 | Yates, Simeon J. | 3 | 2 | 4 | 3 | 83 | 9.134 |
| 12 | Markham, Annette N. | 11 | 1 | 1 | 3 | 74 | 6.923 |
| 13 | Stewart, Bonnie | 9 | 4 | 5 | 4 | 70 | 7.704 |
| 14 | Cowie, Bronwen | 7 | 10 | 14 | 4 | 54 | 2.914 |
| 15 | Schultheis, Elizabeth H. | 2 | 3 | 3 | 4 | 54 | 7.424 |
| 16 | Acker, Amelia | 5 | 11 | 14 | 5 | 53 | 2.860 |
| 17 | Bowler, Leanne | 5 | 11 | 14 | 5 | 53 | 2.860 |
| 18 | Lee, Victor R. | 5 | 9 | 12 | 4 | 52 | 12.763 |
| 19 | Schneider, René | 1 | 1 | 1 | 4 | 52 | 1.233 |
| 20 | Wilkerson, Michelle Hoda | 5 | 7 | 8 | 4 | 52 | 7.3697 |
| 21 | Raffaghelli, Juliana Elisa | 9 | 4 | 4 | 4 | 51 | 10.9321 |
| 22 | Dasgupta, Sayamindu | 2 | 11 | 11 | 4 | 46 | 2.5879 |
| 23 | Raffaghelli, Juliana E. | 9 | 7 | 12 | 6 | 45 | 6.7135 |
| 24 | Chi, Yu | 5 | 11 | 15 | 3 | 42 | 2.1719 |
| 25 | Jeng, Wei | 5 | 11 | 14 | 3 | 42 | 2.1719 |
| 26 | Condon, Patricia B. | 1 | 6 | 10 | 3 | 37 | 8.1592 |
| 27 | Knight, Simon | 5 | 6 | 9 | 4 | 34 | 4.7121 |
| 28 | McCosker, Anthony | 3 | 2 | 5 | 3 | 32 | 5.145 |
| 29 | Manca, Stefania | 9 | 5 | 6 | 3 | 29 | 8.4752 |
| 30 | Nguyen, Dennis | 4 | 4 | 5 | 4 | 27 | 8.2391 |
3.3. Co-Citation of Cited Authors
Co-citation of cited authors occurs when two or more authors are cited together in the same paper by another author. There were 47361 cited authors on the subject matter, of whom 118 met a threshold of 10 citations per author.
Table 4 and Figure 4 present the co-citation analysis of cited authors. Several authors stand out for their co-citation relationships, which indicate the extent to which their work is cited together with that of other authors. Ellen B. Mandinach leads with 156 links and a total link strength of 316.4462. Her work has been cited in conjunction with others 358 times, showing her pivotal role in shaping the discourse on data literacy. Kim Schildkamp follows closely with 137 links, a total link strength of 281.3285, and 334 co-citations. Her research is influential and frequently referenced alongside other prominent figures in the field.
E.S. Gummer also commands attention with 156 links and a total link strength of 233.3139, with co-citation relationships resulting in 253 citations. Catherine d’Ignazio and R. Bhargava emerge as notable contributors, each with strong co-citation linkages and significant citation counts, demonstrating their impact and integration within the broader scholarly community. Tabor Koltay and J. Carlson are also key players, with their work frequently co-cited alongside that of others. Neil Selwyn and Luci Pangrazio likewise deserve recognition for their substantial co-citation relationships, indicative of their influence and interconnectedness within the research landscape. These authors represent the core of the data literacy research community, characterised by their extensive co-citation relationships and the collective impact of their contributions on advancing the field’s knowledge and understanding.
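The co-citation measures discussed in this subsection (citation threshold, links, and total link strength) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the analysis software: the reference lists and author labels are hypothetical, the function names are invented for the example, and the sketch uses full counting, whereas the non-integer link strengths reported here suggest that a fractional counting scheme was applied.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper mapped to the authors it cites.
reference_lists = {
    "paper_1": ["A", "B", "C"],
    "paper_2": ["A", "C"],
    "paper_3": ["D", "E", "A"],
    "paper_4": ["D", "E"],
}


def cocitation_counts(ref_lists, min_citations=1):
    """Count citations per cited author and co-citations per author pair.

    Two authors are co-cited once for every citing paper whose reference
    list contains both of them (full counting).
    """
    citations = Counter()
    for authors in ref_lists.values():
        citations.update(set(authors))  # one citation per citing paper

    # Keep only authors that meet the citation threshold.
    kept = {author for author, n in citations.items() if n >= min_citations}

    pairs = Counter()
    for authors in ref_lists.values():
        present = sorted(set(authors) & kept)
        pairs.update(combinations(present, 2))
    return citations, pairs


def link_summary(pairs):
    """Links = number of distinct co-cited partners;
    total link strength = sum of co-citation counts over those partners."""
    links, strength = Counter(), Counter()
    for (a, b), n in pairs.items():
        for author in (a, b):
            links[author] += 1
            strength[author] += n
    return links, strength


citations, pairs = cocitation_counts(reference_lists, min_citations=2)
links, strength = link_summary(pairs)
for author in sorted(links):
    print(author, citations[author], links[author], strength[author])
```

In the toy data, author B is cited only once and is removed by the threshold, so it contributes no links. Author A is co-cited with three different authors and therefore has three links, while its total link strength also reflects how often each of those pairings recurs.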
3.4. Co-Citation by Cited Reference
There were 35410 cited references; a threshold of 10 citations per cited reference yielded 64 items.
Table 5 and Figure 5 present the results. Mandinach and Gummer’s research, published in 2013, examines the implementation of data literacy in educator preparation. This work, in Cluster 2, stands out with 38 co-citations, indicating how widely other scholars reference it. The total link strength of 30 suggests strong connections with other works in the cluster. With 37 citations, it is a seminal piece in discussions of educator training in data literacy. D’Ignazio and Klein’s “Data Feminism,” a 2020 publication in Cluster 1, has attracted 28 co-citations and a total link strength of 24. This work highlights the intersection of gender studies and data literacy, a topic gaining prominence. Its 29 citations affirm its relevance and influence in shaping conversations about social perspectives in data literacy.
Koltay’s 2015 article explores the concept of data literacy identity and belongs to Cluster 3. With 24 co-citations and a total link strength of 20, it is well connected within the cluster. Having garnered 27 citations, this work is influential in discussions about defining and understanding data literacy. Gould’s exploration of data literacy as statistical literacy, published in 2017 and affiliated with Cluster 1, draws significant attention with 31 co-citations and a total link strength of approximately 23.33. With 26 citations, it underscores the relationship between data literacy and statistical understanding and its importance in educational discourse.
O’Neil’s 2016 book “Weapons of Math Destruction,” situated in Cluster 1, has 30 co-citations and a total link strength of 22. With 26 citations, it sheds light on the societal implications of big data, emphasizing the ethical considerations inherent in data usage, and contributes significantly to discussions about the broader societal impacts of data literacy. Carlson et al.’s study on determining data information literacy needs, published in 2011 and associated with Cluster 3, has 23 co-citations and a total link strength of 19. Its 22 citations underline its relevance to understanding the data literacy requirements of students and research faculty, particularly in academic settings. Calzada Prado and Marzal’s work on incorporating data literacy into information literacy programs, published in 2013 and appearing in Cluster 3, is co-cited with 18 other works, indicating its integration within the cluster. With a total link strength of 17 and 21 citations, this study contributes significantly to discussions about integrating data literacy into broader educational frameworks.
Mandinach’s research on using data-driven decision making to inform practice, published in 2012 and associated with Cluster 2, has 22 co-citations and a total link strength of 19. With 21 citations, it underscores the importance of using data in educational decision-making and its relevance for educators and policymakers. Prado and Marzal’s work on incorporating data literacy into information literacy programs, published in 2013 and associated with Cluster 3, stands out with 27 co-citations, a total link strength of 18, and 21 citations. This indicates its pivotal role in shaping discussions about integrating data literacy into educational curricula and programs.
3.5. Co-Citation by Cited Sources
A total of 17706 cited sources were identified, and 173 sources met the threshold of 20 citations per source.
Table 6 and Figure 6 present the details. Teaching and Teacher Education belongs to cluster 5, with 130 links, a total link strength of 245.704, and 286 citations, indicating its significance in educational research and pedagogy. Teachers College Record is also part of cluster 5, with 145 links, a total link strength of 239.297, and 279 citations, highlighting its influence in educational policy and practice. Big Data & Society falls under cluster 2, with 144 links, a total link strength of 157.076, and 197 citations, suggesting its importance at the intersection of big data and societal impacts. Educational Researcher belongs to cluster 5, with 161 links, a total link strength of 181.1752, and 188 citations, indicating its role as a leading publication in educational research. Journal of the Learning Sciences is part of cluster 3, with 122 links, a total link strength of 144.878, and 177 citations, emphasizing its significance in the study of learning processes and educational technologies.
Studies in Educational Evaluation is also associated with cluster 5, with 137 links, a total link strength of 144.168, and 174 citations, indicating its role in evaluating educational practices and policies. Computers & Education falls under cluster 4, with 147 links, a total link strength of 141.4396, and 164 citations, highlighting its importance at the intersection of technology and education. The Journal of Community Informatics belongs to cluster 2, with 154 links, a total link strength of 143.834, and 162 citations, suggesting its role in community-based research and informatics. New Media & Society is also part of cluster 2, with 153 links, a total link strength of 131.164, and 145 citations, indicating its significance in the study of new media and digital cultures.
PLOS ONE is in cluster 1, with 131 links, a total link strength of 93.272, and 123 citations, highlighting its role as an open-access multidisciplinary journal. British Journal of Educational Technology belongs to cluster 4, with 148 links, a total link strength of 109.655, and 121 citations, emphasizing its importance in educational technology research and practice. Computers in Human Behavior is in cluster 4, with 147 links, a total link strength of 109.522, and 120 citations, indicating its significance in the study of human-computer interaction in educational contexts. Journal of Documentation is associated with cluster 1, with 144 links, a total link strength of 111.925, and 120 citations, highlighting its role in the study of information science and documentation. Journal of eScience Librarianship is also part of cluster 1, with 81 links, a total link strength of 90.430, and 117 citations, indicating its significance in library and information science, particularly in eScience. Government Information Quarterly falls under cluster 2, with 101 links, a total link strength of 64.279, and 106 citations, highlighting its role in the study of government information policies and practices.
School Effectiveness and School Improvement belongs to cluster 5, with 80 links, a total link strength of 85.520, and 101 citations, indicating its significance in research on school effectiveness and improvement strategies. Statistics Education Research Journal is part of cluster 3, with 135 links, a total link strength of 81.0099, and 95 citations, emphasizing its role in the advancement of statistics education research. American Educational Research Journal is also associated with cluster 5, with 118 links, a total link strength of 88.299, and 92 citations, highlighting its significance in educational research across various domains. Libri falls under cluster 1, with 147 links, a total link strength of 83.357, and 90 citations, indicating its importance in the study of libraries and information science. IEEE Transactions on Visualization and Computer Graphics belongs to cluster 3, with 104 links, a total link strength of 60.803, and 89 citations, suggesting its role in advancing research in visualization and computer graphics in education.
Information, Communication & Society is part of cluster 2, with 141 links, a total link strength of 81.508, and 89 citations, highlighting its importance in the study of information and communication technologies in society. Cognition and Instruction falls under cluster 3, with 111 links, a total link strength of 77.238, and 83 citations, emphasizing its role in the advancement of cognitive science and instructional design. International Journal of Digital Curation belongs to cluster 1, with 75 links, a total link strength of 69.755, and 79 citations, indicating its significance in the study of digital curation practices and standards. American Journal of Education is also associated with cluster 5, with 90 links, a total link strength of 68.1408, and 75 citations, highlighting its importance in educational policy and practice. Journal of Research in Science Teaching falls under cluster 3, with 107 links, a total link strength of 63.496, and 74 citations, emphasizing its role in the advancement of science education research. Science is part of cluster 1, with 134 links, a total link strength of 65.860, and 74 citations, indicating its significance as a leading scientific source.
4. Discussion of Findings
This study undertook a citation analysis of research on data literacy by documents, sources and authors, and a co-citation analysis of cited authors, cited sources and cited references, based on data collected from Scopus and analysed with VOSviewer. The analysis of citation by documents gives a picture of the scholarly landscape surrounding data literacy. The study first identified a substantial volume of literature on the subject, totaling 997 documents over the observed period, with an average of 52 documents per year. To focus on significant contributions, a criterion of a minimum of 10 citations per document was applied. This filtered the dataset down to 205 documents, a subset of scholarly works that have garnered substantial attention within the academic community. Of these 205 documents, only 81 are interlinked with other documents, and these interconnected documents form the basis for the subsequent analysis. This observation suggests that while numerous documents exist on data literacy, only a fraction of them are deeply engaged with and cited within the scholarly discourse, indicating their centrality in shaping the conversation on this topic (Small 1973, Small and Kalavan 2013).
Within this subset of interconnected documents, an average of 34 citations per document is observed, totaling 2768 citations across the 81 papers. This high citation rate underscores the significance and impact of these select works within the field of data literacy. Notably, Calzada Prado and Marzal’s (2013) paper, “Incorporating Data Literacy into Information Literacy Programmes: Core Competences and Content,” emerges as a standout, amassing 161 citations and surpassing all other papers in citation count. The work is noteworthy for its emphasis on the growing importance of data in contemporary society and its advocacy for the integration of data literacy into information literacy programmes. By proposing a comprehensive framework of core competencies, it provides not only a roadmap for educators and policymakers but also a foundation for further research and development in the field.
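The averages and shares quoted in this part of the discussion can be checked with simple arithmetic. The short sketch below uses only the figures stated in the text; the one assumption is that the observation window is 2005 to 2023 inclusive (19 years), as the table and figure captions indicate.

```python
# Figures quoted in the discussion.
total_documents = 997
documents_meeting_threshold = 205   # at least 10 citations per document
interlinked_documents = 81
citations_of_interlinked = 2768

# Assumption: the observation window is 2005-2023 inclusive.
years_observed = 2023 - 2005 + 1

documents_per_year = total_documents / years_observed
citations_per_document = citations_of_interlinked / interlinked_documents
share_meeting_threshold = documents_meeting_threshold / total_documents
share_interlinked = interlinked_documents / documents_meeting_threshold

print(f"Documents per year: {documents_per_year:.1f}")
print(f"Citations per interlinked document: {citations_per_document:.1f}")
print(f"Share of documents meeting the threshold: {share_meeting_threshold:.1%}")
print(f"Share of those that are interlinked: {share_interlinked:.1%}")
```

Rounded to whole numbers, the first two results agree with the reported averages of 52 documents per year and 34 citations per document.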
The analysis of citation by sources provides insight into the key journals, conference proceedings and other sources shaping the discourse on data literacy. Among the top sources, the ACM International Conference Proceeding Series emerges as the most prolific, publishing 47 documents on the subject and accumulating 260 citations. This indicates the significant contribution of academic conferences in disseminating research and fostering scholarly dialogue within the field. Additionally, the study highlights the Teachers College Record as a standout source, despite its relatively low publication output of only six articles. TCR’s remarkable citation count of 429 underscores its substantial impact within the field of education and its role as a leading platform for scholarly discourse on data literacy. The disciplinary affiliations of these sources vary, encompassing diverse fields such as Education, Library and Information Science, Psychology, Computer Science/Information Technology, Social Sciences, Mathematics/Statistics, Humanities, and Multidisciplinary subjects. This interdisciplinary nature of data literacy sources underscores the broad relevance and significance of the area across various domains of knowledge.
The study also examined authors’ citations during 2005 to 2023. For the analysis, a minimum threshold of three documents per author and one citation per author was set, resulting in 2351 authors suitable for examination. Each author’s cluster affiliation indicates their thematic association within the field, while linkages and total link strength quantify their interconnectedness and prominence within their respective clusters. The document count reflects the scholarly output of each author, whereas citations received and normalized citations shed light on their influence and recognition within the scholarly community. The analysis identifies standout authors based on their citation counts, link strength, and publication output. Notable figures include Ellen B. Mandinach, Kim Schildkamp, Tabor Koltay, and Luci Pangrazio, among others. These authors have demonstrated significant contributions to the field, as evidenced by their citations and linkages. Additionally, the analysis highlights the network dynamics within the domain of data literacy research, providing insights into the patterns of scholarly dissemination and impact. It aids in understanding the landscape of data literacy research, identifying key contributors, and discerning thematic trends and patterns of scholarly influence.
The analysis of Co-citation of Cited Authors reveals several prominent authors whose works are extensively cited together, indicating the interconnectedness of their ideas and the influence of their contributions. Ellen B. Mandinach, with 156 links and a total link strength of 316.4462, emerges as a central figure in shaping discussions on data literacy. Her research, particularly on implementing data literacy in educator preparation, has garnered significant attention, as reflected in the high number of co-citations. Similarly, Kim Schildkamp’s and E.S. Gummer’s works are frequently cited alongside Mandinach’s, highlighting their collective impact on advancing the field. Furthermore, authors like Catherine d’Ignazio and R. Bhargava, Tabor Koltay, and J. Carlson also feature prominently in co-citation networks, underscoring their significant contributions to the discourse on data literacy. These authors represent a diverse range of perspectives, from exploring the intersection of gender studies with data literacy to investigating the practical implications of data-driven decision-making in educational settings.
The analysis of co-cited references sheds light on seminal works that have shaped the trajectory of data literacy research. Mandinach and Gummer’s research on implementing data literacy in educator preparation stands out, with 38 co-citations and a total link strength of 30, indicating its foundational role in the field. Similarly, d’Ignazio and Klein’s exploration of “Data Feminism” and Koltay’s examination of data literacy identity contribute fresh insights, reflecting the evolving nature of data literacy discourse. The analysis highlights the interdisciplinary nature of data literacy research, with works spanning education, sociology, and information science. Gould’s exploration of data literacy as statistical literacy and O’Neil’s examination of the societal implications of big data underscore the multifaceted dimensions of the field. These co-cited references not only contribute to theoretical frameworks but also offer practical insights for educators, policymakers, and researchers grappling with the challenges of data literacy.
The co-citation analysis of cited sources reveals key publications that serve as pillars in data literacy research. Teaching and Teacher Education and Teachers College Record emerge as central hubs, showcasing their pivotal role in disseminating scholarly discourse on data literacy. These journals provide platforms for researchers to exchange ideas, disseminate findings, and engage in critical debates shaping the field. Additionally, interdisciplinary journals such as Big Data & Society and New Media & Society highlight the intersection of data literacy with broader societal trends, including the impact of technology on information dissemination and digital cultures. By analyzing the co-citation patterns of these sources, researchers can gain insights into the evolving landscape of data literacy research and identify emerging trends and themes that warrant further investigation (Trujillo and Long 2018).