Learning that emphasizes data literacy and encourages analysis within multimedia visualization platforms is a growing trend in higher education pedagogy. Because data as a form of evidence holds a privileged position in our cultural discourse, interdisciplinary undergraduate degree programs in the social sciences, humanities, and related disciplines increasingly incorporate data visualization, thus elevating data literacy alongside other established curricular outcomes. When well-conceived, critical data literacy instruction engenders a productive blend of theory and practice and positions students to examine how race-based bigotry, gender bias, colonial dominance, and related forms of oppression are implicated in the rhetoric of data analysis and visualization. Students can then create visualizations of their own that establish counternarratives or otherwise confront the locus of power in society to present alternative perspectives.
As scholarship in media, communications, and cultural studies pedagogy has established, data visualizations “reflect and articulate their own particular modes of rationality, epistemology, politics, culture, and experience,” so as to embody and perpetuate “ways of knowing and ways of organizing collective life in our digital age” (Gray et al. 2016, 229). Catherine D’Ignazio and Lauren F. Klein (2020, 10) explain this dialectic more pointedly in Data Feminism, arguing, “we must acknowledge that a key way power and privilege operate in the world today has to do with the word data itself,” especially the assumptions and uses of it in daily life. Critical instruction positions undergraduates to question how data, in its composition, analysis, and visualization, can often perpetuate an unjust socio-cultural status quo. Undergraduates who are introduced to frames for interpreting culture also need to be exposed to tools—literal and conceptual—that help them critique data visualizations. The goal is to enable a holistic critical literacy, through which students can find data, structure it with a research question in mind, and produce accurate, inclusive visualizations.
However, data instruction is challenging, and planning data learning within the context of an existing course requires an array of skills. Effective data visualization pedagogy demands that instructors locate example datasets, clean data to minimize roadblocks, and create sample visualizations to initiate student engagement with first-order cultural-critical concepts. These steps, a substantial time investment, are necessary for teaching that enables data novices to contend with the mechanics of data manipulation while remaining focused on social and political questions that surround data. When charged with developing data visualization assignments and instructional assistance, faculty often seek the support and expertise of librarians and educational technologists, who are located at the nexus of data learning within the university (Oliver et al. 2019, 243).
Even in cases where librarians and instructional support staff are well-positioned to assist, the demand for teaching data visualization can be overwhelming. It can become burdensome to deliver in-person instruction to cohort courses with a large student enrollment, across many sections and in successive semesters. In order to initiate and maintain an effective, multidisciplinary data literacy program, teaching faculty, librarians, and educational technologists must establish strong teaching partnerships that can be replicated and reimagined in multiple contexts.
This essay addresses some challenges of teaching critical data literacy and describes a shared instruction model that encourages undergraduates at a large research university to develop critical data literacy and visualization skills. Although anyone engaged in teaching critical data literacy can draw from this essay, we intend our discussion for librarians who are teaching in an academic setting, and particularly in contexts involving large-scale or programmatic approaches to teaching. In addition, we believe our essay is particularly pertinent to those designing program curricula within discipline-specific settings, as our ideas engage questions of determining scale, scope, and learning outcomes for effective undergraduate instruction.
The teaching model we propose originated as a collaboration between the New York University Libraries and NYU’s Media, Culture, and Communication (MCC) department, and our specific intervention is the development of an assignment involving data visualization for a Methods in Media Studies (MIMS) course. The distributed teaching model we describe has the dual purpose of supporting the major and serving as an organizational template: a structure for building resources and approaches to instruction that supports librarians as they develop replicable pedagogical strategies, including those informed by a cultural critical lens. In this regard, we believe that collaborative instruction empowers librarians and faculty from many disciplines to develop their own data literacy competency while growing as teachers. It also enables the library to affect undergraduate learning throughout the university.
There is already an extensive body of research about the role of critical data literacy instruction, including critical approaches to the technical elements of data visualization (Drucker 2014; Sosulski 2019; Engebretsen and Kennedy 2020). While we draw from that scholarly discussion, we focus instead on the upshot of programmatic, extensible teaching partnerships between libraries and discipline-specific undergraduate programs. Along the way, we engage two crucial questions: What is the value of creating replicable lesson plans and materials, to be taught by an array of library staff repeatedly? How can the librarians who design these materials strike a balance between creating a step-by-step lesson plan that library instructors follow and structuring a guided lesson that is flexible and capacious enough for instructors to experience meaningful teaching encounters of their own?
Data Literacy in Undergraduate Education
Several curricular initiatives and assessment rubrics in higher education pedagogy recognize the need for students to develop fluency with digital media and quantitative reasoning, a precursor to effective data visualization. In 2005, the Association of American Colleges and Universities (AAC&U) began a decade-long initiative called Liberal Education and America’s Promise (LEAP), which resulted in an inventory of 21st century learning outcomes for undergraduate education. Quantitative literacy is on the list of outcomes (Association of American Colleges and Universities 2020). A corresponding AAC&U rubric statement asserts that “[v]irtually all of today’s students … will need basic quantitative literacy skills such as the ability to draw information from charts, graphs, and geometric figures, and the ability to accurately complete straightforward estimations and calculations.” The rubric urges faculty to develop assignments that give students “contextualized experience” analyzing, evaluating, representing, and communicating quantitative information (Association of American Colleges and Universities 2020). The substance of the LEAP initiative informed the development of our collaborative teaching model, for it allowed us to ground our curricular interventions within larger university curricular trends that had already emerged.
Although quantitative literacy is important, there are other structures for teaching that see data fluency and visualization as being tied to larger information seeking practices. For this reason, we also turned to the Framework for Information Literacy for Higher Education, developed by the Association of College and Research Libraries (ACRL). The Framework embraces the concept of metaliteracy, which promotes metacognition and a critical examination of information in all its forms and iterations, including data visualization. One of the six frames posed by the document, “Information Creation as a Process,” closely aligns with data competency, including data visualization. This frame emphasizes that the information creation process can “result in a range of information formats and modes of delivery” and that the “unique capabilities and constraints of each creation process as well as the specific information need determine how the product is used.” Within the Framework, learning is measured according to a series of “dispositions,” or knowledge practices that are descriptive behaviors of those who have learned a concept. Here, the Framework is apropos, as students who see information creation as a process “value the process of matching an information need with an appropriate product” and “accept ambiguity surrounding the potential value of information creation expressed in emerging formats or modes” (ACRL 2016). The Framework recognizes that evolved undergraduate curricula must incorporate active, multimodal forms of analysis and production that synthesize information seeking, evaluation, and knowledge creation.
Other organizations and disciplines also advocate for quantitative literacy in the undergraduate curriculum. For instance, Locke (2017) discusses the relevance of data in the humanities classroom and points to ways undergraduate digital humanities projects can incorporate data analysis and visualization to extend inquiry and interpretation. And Berret and Phillips (2016, 13) recommend that every journalism degree program provide a foundational data journalism course, because interdisciplinary data instruction cultivates professionals “who understand and use data as a matter of course—and as a result, produce journalism that may have more authority [or] yield stories that may not have been told before.” In sum, LEAP, the ACRL Framework, and movements for data literacy in the disciplines influenced the Libraries’ collaboration with the Media, Culture, and Communication department and informed the effort to create and support a meaningful learning experience for students in this major.
Learning-by-Teaching: Structured, Programmatic Instruction and Libraries
Our collaborative model evolved with the conviction that structured, programmatic teaching can foster professional growth for librarians and library technologists. In addition to creating impactful learning for students, programmatic teaching provides a structure that allows for educators to expand the contexts in which they can teach. In many cases, librarians who specialize in information literacy are less adroit regarding the concepts and mechanics of working with data. Teaching data as a form of information, then, necessarily requires a baseline technical expertise.
Several studies published within the past decade indicate that learning with the intent to teach can lead to better understanding, regardless of the content in question. One such study finds that learners who were expecting to teach the material to which they were being introduced show better acquisition than learners who were expecting only to take a test, theorizing that learning-by-teaching pushes the learner beyond essential processing to generative processing, which involves organizing content into a personally meaningful representation and integrating it with prior knowledge (Fiorella and Mayer 2013, 287). Another study finds that learners who were expecting to teach show better organizational output and recall of main points than those who were not expecting to teach, which suggests that learners who anticipate teaching tend to put themselves “into the mindset of a teacher,” leading them to use preparation techniques (such as concept organizing, prioritizing, and structuring) that double as enhancements to a learner’s own encoding processes (Nestojko et al. 2014, 1046). This evidence strengthens our belief that learning-by-teaching is a good strategy for librarians to build foundational data literacy skills, and it informed the development of our program.
Development and Implementation of the Collaborative Teaching Model
Situated in NYU’s Steinhardt School of Culture, Education, and Human Development, the MCC program covers global and transcultural communication, media institutions and politics, and technology and society, among other related fields. MCC program administrators, who were looking to incorporate practical skills into what had previously been a theory-heavy degree, approached the library to co-develop instructional content that would expose students to applied data literacy and multimedia visualization platforms. The impetus for the program administrators to reach out to the library was their participation in a course enhancement grant program, which testifies to the lasting effects that school- or university-based curriculum initiatives can have on undergraduate learning. In this case, what emerged was a sustained teaching partnership. Though the support was refined over time, its core remained constant: individual sections of a media studies methods class would attend a librarian-led class session that prepares students to evaluate data and construct a visualization exploring some element of media and political economy, grounded in an assigned reading that asserts that ownership of or access to media and communications infrastructure is intrinsically related to the well-being and development of countries around the world.
The class is a first-year requirement in Media, Culture, and Communication, one of NYU’s largest majors. The course tends to be taught by beginning doctoral students, and is by design a highly fluid teaching environment. In early iterations of library support, we designed a module that attempted to have students perform a range of analysis and visualization tasks. Students were introduced to basic socio-demographic datasets and were invited to create a visualization that investigates a research question of their choosing, provided that the question adhered generally to the themes of media and political economy. The assignment as initially constituted expected the student to frame a question, find a dataset and clean it, choose a visualization platform, and generate one or more visualizations that imply a causal relationship between variables that they had identified.
The learning outcomes and assignment developed in this initial sequence turned out to be too ambitious. The assignment had fairly loose parameters, which proved problematic, and the 75-minute class session could not provide sufficient preparation. Students struggled with developing viable research questions, finding datasets, and cleaning the data (the multivalent process of normalizing, reshaping, redacting, or otherwise configuring data to be ingested and visualized in online platforms without errors). Also, we had pointed them to an overwhelming array of data analysis software tools, including ESRI’s ArcMap, Carto, Plot.ly, Raw, and Tableau. We found they had great difficulty with both selecting a tool and learning how to use it, in addition to the connected process of finding a dataset to visualize within it. The Libraries tried to accommodate these difficulties but ultimately realized that the module needed significant adjustment going forward, especially since the MCC department decided to expand the project to include up to 10 sections of the course each semester.
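To make the cleaning steps named above concrete, the following is a minimal Python sketch, not the workflow used in the course. The sample table, its column names, and the "n/a" placeholder are hypothetical, chosen only to show three of the operations: normalizing labels, dropping cells that cannot be used, and reshaping data that has one column per year into one row per observation.

```python
import csv
import io

# Hypothetical raw export (not course data): inconsistent country labels,
# thousands separators, a placeholder for a missing value, and one column
# per year (a "wide" layout that many platforms cannot chart directly).
RAW = """country,2000,2010
 united states ,"1,200","1,500"
BRAZIL,300,n/a
"""


def clean(raw_text):
    """Return long-format rows: one dict per (country, year) observation."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        # Normalize: trim whitespace and standardize capitalization.
        country = record["country"].strip().title()
        for year in ("2000", "2010"):
            value = record[year].strip().replace(",", "")
            # Drop cells that hold a placeholder instead of a number.
            if not value.isdigit():
                continue
            # Reshape: emit one row per year rather than one column per year.
            rows.append({"country": country, "year": int(year), "value": int(value)})
    return rows


cleaned = clean(RAW)
for row in cleaned:
    print(row)
```

Even in this toy case, each step embeds a judgment (which label spelling is canonical, whether a missing value is dropped or kept), which is part of why the task proved difficult for novices working under time pressure.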
Besides struggling with research questions, datasets, and tools, students also had evident trouble connecting this work to the broader ideas of media and political economy intrinsic to the assignment. Informed by these first-round outcomes, we came together again to revise the instructional content and assignment. Taking our advice into account, the MCC teaching faculty and program administrators refined the learning outcomes as follows:
- Become familiar with the principles, concepts, and language related to data visualization
- Investigate the context and creation of a given dataset, and think critically about the process of creating data
- Emphasize how online visualization platforms allow users to make aesthetic choices, which are part and parcel of the rhetoric of visualization
The librarians also created a student-facing online guide as a home base for the module and decided to distribute the teaching load by inviting specialists from the Libraries’ Data Services department to help teach the library sessions (MCC-UE 2019). To provide a better lead-in to the library session, a preparatory lesson plan was also developed for the MCC instructors to present in the class prior to the library visit.
After further consideration and feedback from program administrators, we inserted a scaffolding component into the library session lesson plan to better prepare students for their assignment. The component involved comparing four sample visualizations created from the very same data, and it included questions for eliciting a discussion about the origins and constructions of data. Scenario-based exercises for creating visualizations in Google Sheets and Carto were also incorporated into the lesson, giving students practice before tackling the actual assignment. The assignment was also redesigned with built-in support. Students would no longer be expected to find their own dataset and attempt to clean extracted data, tasks that had caused them frustration and anxiety. Instead, they would choose from a handful of prescribed and pre-cleaned datasets. Data Services staff worked to remediate a set of interesting datasets to anticipate the kind of visualization students would attempt. Also, rather than having to choose from a confusing array of data visualization tools, they would be directed to use Google Sheets or Carto only. Assuming the task of identifying, cleaning, and preparing datasets meant extra front-loaded work on the Libraries’ part, but it also freed students to focus on the higher order activity of investigating the relationship between visualizing information and examining social or political culture.
Instructional Support from a Wide Community of Teachers: Growing a Base
Another issue at hand was the strain the project was placing on the members of the Data Services team and the Communications Librarian, who taught all ten library sessions that were offered each semester. To make the program sustainable, a broader group of librarians would be needed to help teach the library sessions, so the Data and Communications librarians decided to recruit other NYU librarians to participate as instructors. Most of the recruits were data novices, but they viewed the invitation as an opportunity to learn data basics, expand their instruction repertoire, and strengthen their teaching practice. Calling on colleagues to teach outside their comfort zone is a big ask, one that requires strong support and administrative buy-in. So recruits were provided with a thorough lesson plan, a comprehensive hands-on training session, and the opportunity to shadow more experienced instructors before teaching the module solo (MCC-UE 2019).
By including a more robust roster of instructors, the structure also gave us the ability to further tie our lesson to what was planned in the MIMS curriculum. A new reading was chosen by the media studies faculty, “Erasing Blackness: the media construction of ‘race’ in Mi Familia, the first Puerto Rican situation comedy with a black family,” by Yeidy Rivero. The article grounds the students’ exploration of the relationship between media and political economy within the MIMS class, and it also provides a good entry point to explore critical data literacy concepts. According to Rivero, the show Mi Familia deliberately represents a “flattened,” racially homogeneous “imagined community” of lower-middle class black family life that erases Puerto Rico’s hybrid racial identity. This flattening, Rivero argues, is part and parcel of multidimensional efforts to “Americanize” Puerto Rico and align its culture with the interests of the U.S. Furthermore, since the Puerto Rican media is regulated by the U.S. Federal Communications Commission (FCC) and owned by U.S. corporations, Puerto Ricans themselves had little recourse to question the portrayal of constructed racial identities in the mainstream culture (Rivero 2002).
Students were instructed to complete the reading prior to the library session. During the session, the library instructor referred to the reading and introduced a dataset with particular relevance to it. The instructor engaged students in a discussion about the importance of reviewing the dataset description and variables in order to form a question that can be reasonably asked of the data. With students following along, the instructor then modeled how to use Google Sheets to manipulate the data and create a visualization that speaks to the question.
The selected dataset resulted from a study of the experiences and expressions of racial identity by young adults who lived in first- and second-generation immigrant households in the New York City area during the late 1990s (Mollenkopf, Kasinitz, and Waters 2011). The timeframes of the article and the dataset line up well. The sitcom discussed in the article first aired in 1994 but had been picked up by Telemundo’s NYC area affiliates by the late 1990s, so it is quite possible that this sitcom would have been on the air in the homes of study participants. The dataset, which is aggregated at the person level, includes variables about participants’ family and home context, patterns of socialization, exposure to media, and sense of self. In order to foreground the analytic process of looking at data, ascertaining its possibilities, and gesturing at potential visualizations, we created a simplified version of the raw data, which omits some columns and imputes other variables for easier use. To accompany this dataset, we also created some simple data visualizations in Google Sheets, ArcGIS Online, and Tableau, which are intentionally “impoverished” and thus designed to elicit discussion from students about the claims made by the visualizations.
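For readers who want a sense of what simplifying a person-level file can involve, here is a minimal Python sketch under stated assumptions. It is not the script used to prepare the module's dataset: the column names and values are invented for illustration and are not the study's variables, and filling gaps with the median is only one possible imputation rule. The added flag column is likewise our own illustrative choice, included so that filled-in values stay visible to anyone questioning how the data were made.

```python
import csv
import io
import statistics

# Hypothetical person-level extract; column names and values are invented
# for illustration and do not come from the actual study.
RAW = """person_id,interviewer_code,borough,tv_hours_per_day
1,A7,Queens,3
2,B2,Bronx,
3,A7,Queens,5
4,C1,Brooklyn,2
"""

# Columns retained in the simplified file; administrative fields are omitted.
KEEP = ["person_id", "borough", "tv_hours_per_day"]


def simplify(raw_text):
    """Omit unneeded columns and fill missing hours with the observed median."""
    records = list(csv.DictReader(io.StringIO(raw_text)))
    observed = [
        int(r["tv_hours_per_day"]) for r in records if r["tv_hours_per_day"].strip()
    ]
    fill_value = statistics.median(observed)  # assumed imputation rule
    simplified = []
    for r in records:
        row = {name: r[name].strip() for name in KEEP}
        missing = row["tv_hours_per_day"] == ""
        row["tv_hours_per_day"] = fill_value if missing else int(row["tv_hours_per_day"])
        # Flag imputed cells so the intervention remains open to critique.
        row["tv_hours_imputed"] = missing
        simplified.append(row)
    return simplified


simplified = simplify(RAW)
for row in simplified:
    print(row)
```

A flagged column of this kind gives an instructor a ready prompt for discussion: which values in the chart were observed, and which were supplied by whoever prepared the file?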
Undoubtedly, these adjustments to the module led to students performing better on the assignment. Improvements to the lead-in session provided by the MCC instructors ensured that the students were prepared with context for the library workshop and an understanding of why the library was supporting the assignment. Basing the assignment on a specific article made it possible for librarians to model a way of bridging the theoretical concepts of the class to a question that could be asked of data. There was also more time for two pair-and-share discussions and group work in Google Sheets and Carto, which addressed a fundamental and recurring frustration in the students’ understanding of the assignment: the difficulty of asking an original question of a dataset, and of asking a question that would address a larger theme of media and political economy.
From the standpoint of instructors in NYU Libraries, we also found that the model produced a stronger group of teachers. Several people who worked with sections of MIMS contributed ideas to the instructor manual and created ancillary slides and examples tailored to their own interest in the claims about racial and national identity that the Rivero article makes. For us, this flexibility is an important element of the collaborative teaching model: it offers enough structure for those who are new to data analysis and visualization to teach effectively, yet it also contains enough pathways for discussion to be meaningful and personal, should individual instructors want to branch out in their own teaching.
Conclusion
Despite being familiar with technology, many students arrive at college without a holistic ability to interpret, analyze, and visualize data. Educators now recognize the need to provide foundational data literacy to undergraduates, and many teaching faculty look to the library for support in instructional design and implementation. In this article, we recognize that creating integrated, meaningful data learning lessons is a complex task, yet we believe that the collaborative teaching model can be applied in various disciplinary contexts. Sustainability of this model depends on equipping a wide range of librarians with necessary data literacy skills, which can be achieved with a learning-by-teaching approach. After developing a teaching model that calls upon the expertise of teachers across the library, we gained some important insights on maintaining the communication and support to make it sustainable, building the workshop itself, and balancing the labor that all of this requires.
Good communication and organization between the MCC department and librarians were also key in maintaining the scalability of this instruction program. Given the heavy rotation of new teachers on both the library and MCC side, we needed to provide module content that was streamlined and assignment requirements that were clear-cut in order to quickly onboard teachers to the goals, process, and output of the module. When recruiting library instructors, we emphasized that volunteers would not only build their data literacy skill set but also expand their pedagogical knowledge and teaching range. Finally, to ensure that volunteer instructors have a successful experience, we also provide support mechanisms such as a step-by-step lesson plan, thorough train-the-trainer sessions, opportunities to observe and team-teach before going solo, and a point person to contact with questions and concerns.
Related to the learning objects themselves, we had the most success when we matched the scope of the assignment closely with the time and support the students would have to complete it, and preparing a small selection of datasets for the students in advance was very helpful in this regard. We also built in a full class session of preparation before the library visit, in which MCC teachers introduced the assignment, some principles of data visualization (via a slide deck prepared by the library’s Data Services department), and how this method can connect to broader concepts of media analysis. This led to more effective learning for students. These changes to the student assignment, learning outcomes, and library lesson plan were developed through regular and structured assessments of the workshop: a survey to the instructors teaching the course, classroom visits to see the students’ final projects, and in-depth conversations with instructors on which aspects of the lesson plan were successful and which fell flat. Following each assessment, the MCC administrators and the librarians would get together to discuss and iterate on the learning objects. This process of gathering feedback on the workshop, reflecting on that information, and then revising the assignment enabled us to improve the teaching and learning experience over the years.
There is much hidden labor in all of this work. Robust student support for the course was also crucial, and really took off when the MCC department created a dedicated student support team from graduate assistants in the program. On the library side, communicating regularly with the MCC department, assessing and revising the learning objects, organizing and hosting train-the-trainer sessions, and scheduling all of the library visits takes many hours of planning. This work should not be overlooked when considering a program of this scale.
A collaboration at this level can provide rich data literacy at scale to undergraduates, while also offering the chance for instructors in the library and in disciplinary programs to develop their own skills in numeracy and data visualization as they learn by teaching. Through time, effort, and dedicated maintenance, a program like this becomes a successful partnership that has a broad and demonstrated impact on student learning, strengthens ties between the library and the departments we serve, and allows librarians and data services specialists the opportunity to learn and grow from each other.