Background
Music programming is the practice of writing code in a textual or visual environment to analyze audio input and/or produce sonic output. The possibilities for creative music coding projects are virtually endless, including generative music-makers, audio-visual instruments, sonifications, interactive soundtracks, music information retrieval algorithms, and live-coding performances. Music programming has become an increasingly popular pursuit as calls to broaden and diversify the computer science field have led to attempts at integrating programming and computational thinking into other topics and curricula.[1] Today, music programming languages are widespread and predominantly free. Yet, as evidenced by their history and purpose, most cater to expert computer musicians rather than novice programmers or musicians. For example, Max/MSP, one of the most popular music programming environments for professionals and academics, was first developed at IRCAM in Paris to support the needs of avant-garde composers (Puckette 2002). Usually, prior knowledge of music theory, programming, and signal processing is needed just to get started with a music programming language. Writing about the computer music programming environment SuperCollider, Greher and Heines (2014) note that “the large learning curve involved in understanding the SuperCollider syntax makes it inappropriate for an entry-level interdisciplinary course in computing+code” (104–105). However, recent platforms such as SonicPi (Aaron et al. 2016), EarSketch (Freeman et al. 2014), and the approaches piloted by Gena Greher, Jesse Heines, and Alex Ruthmann using Scratch in the Sound Thinking course at the University of Massachusetts Lowell (Heines et al. 2011) attempt to find ways to teach both music and coding to novices at the same time.
Music and Coding Education
The idea of using computational automata to generate music can be traced back to Ada Lovelace, possibly the world’s first programmer, who in the mid-19th century imagined that Babbage’s Analytical Engine could complete all sorts of tasks, including composing complex musical works (Wang 2007). Roughly a century later, the first programming languages that synthesized sound were developed, beginning with Music I, written by Max Mathews at Bell Labs in 1957. Educational environments that treated music and sound as an engaging context for learning coding appeared relatively soon after, when Jeanne Bamberger collaborated with Seymour Papert at the MIT AI Lab to adapt the Turtle Graphics programming language Logo to support music (Schembri 2018). Their creation, MusicLOGO, opened up exciting opportunities for children making music, since it enabled them to bypass the tedium and complexity of learning traditional notation and theory and dive right into composing and playing back their work to reflect and iterate (Bamberger 1979). Following MusicLOGO, Bamberger developed Impromptu, a companion software to her text Developing Musical Intuitions (Bamberger and Hernandez 2000) that enables young learners to explore, arrange, and create music using “tune blocks” in creative music composition, listening, and theory concept projects.
Within the last ten years, the landscape of free, educational programming languages designed explicitly around music has begun to bloom. While languages vary in style of programming (functional, object-oriented) and music (classical, hip-hop), all of those that follow enable novice programmers to express themselves sonically. SonicPi (Aaron, Blackwell, and Burnard 2016) is a live-coding language based on Ruby that enables improvisation and performance. It is bundled with the Raspbian OS, resulting in widespread deployment on Raspberry Pi computers around the world, and it has been studied in UK classrooms. EarSketch (Freeman et al. 2014) is another platform that combines a digital audio workstation (DAW) with a Python or JavaScript IDE (integrated development environment); it was originally designed to teach programming to urban high school students in Atlanta, Georgia, using hip-hop grooves and samples. Among its innovations is the inclusion of a library of professional-quality samples, enabling learners to make music by combining and remixing existing musical structures rather than adding individual notes to a blank canvas. JythonMusic (Manaris and Brown 2014) is another open-source Python-based environment that has been used in college courses to teach music and programming. Beyond music-centric languages, numerous media computing environments such as AgentSheets and Alice support sound synthesis and/or audio playback in the context of programming games, animations, and/or storytelling projects (Repenning 1993; Cooper, Dann, and Pausch 2000; Guzdial 2003).
Music Coding in the Browser
At the time of their initial release, most of the programming environments described above were not web-based. The additional steps of downloading and installing the software and media packages on a compatible operating system presented a barrier to easy installation and widespread use by children and teachers. In 2011 Google introduced the Web Audio API, an ambitious specification and browser implementation for creating complex audio functions with multi-channel output, all in pure JavaScript. Soon after, Chrome began to support its many features, and Firefox and Safari quickly followed suit. As the developers of Gibberish.js, an early web audio digital signal processing (DSP) library, point out, the Web Audio API is optimized for certain tasks, like convolution and FFT analysis, at the cost of others, like the sample-accurate timing that is necessary for any kind of reliable rhythm (Roberts, Wakefield, and Wright 2013). Over the past few years, the Web Audio API has added functionality, a more accessible syntax, and new audio effects. The aforementioned Gibberish.js and its companion Interface.js were among the first libraries to add higher-level musical and audio structures on top of the base Web Audio API, enabling the rapid prototyping and implementation of complex web-based music and audio interfaces. These libraries are increasingly being used in educational settings, such as middle schools in which girls are taught coding (Roberts, Allison, Holmes, Taylor, Wright, and Kuchera-Morin 2016).
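For readers unfamiliar with the API, the following is a minimal sketch (our own illustration, not drawn from any of the libraries above) of what “pure JavaScript” audio looks like: it plays a one-second sine tone through the browser’s output. In current browsers, code like this must typically run in response to a user gesture such as a click.

```javascript
// Minimal Web Audio API sketch: play a one-second 440 Hz sine tone.
const ctx = new AudioContext();       // the audio graph and its hardware clock
const osc = ctx.createOscillator();   // a basic sine-wave source
const gain = ctx.createGain();        // a volume control node

osc.frequency.value = 440;            // A4
gain.gain.value = 0.2;                // keep the output quiet

osc.connect(gain);                    // source -> gain -> speakers
gain.connect(ctx.destination);

osc.start(ctx.currentTime);           // schedule start on the audio clock
osc.stop(ctx.currentTime + 1.0);      // stop one second later
```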
Newer JavaScript audio and music libraries include Flocking (Clark and Tindale 2014), p5.js (McCarthy 2015), and tone.js (Mann 2015). Flocking uses a declarative syntax similar to SuperCollider meant to promote algorithmic composition, interface design, and collaboration. The p5 sound library adds an audio component to the popular web animation library p5.js, and most recently tone.js provides a framework with a syntax inspired by DAWs and a sample-accurate timeline for scheduling musical events (Mann 2015).
Scratch
Scratch is a programming environment designed and developed in the Lifelong Kindergarten Group at the MIT Media Lab. First released in 2007, Scratch 1.0 was distributed as a local runtime application for Mac and Windows, and Scratch 2.0 was coded in Adobe’s Flash environment to run in the web browser beginning in 2013. With the impending deprecation of Flash as a supported web environment, a new version of Scratch (3.0) has been completely reprogrammed from the ground up using JavaScript and the Web Audio API. This version went live to all users in January 2019.
The design of Scratch is based on constructionist design metaphors in a visual environment where users drag, drop, and snap together programmable LEGO-style blocks on the screen (Resnick et al. 2009). An important part of the Scratch website is its large community of users, who comment on and support each other’s work, and its project gallery, which includes a “remix” button for each project that enables users to look inside projects, build upon and edit them to make custom versions, and copy blocks of code known as scripts into their “backpacks” for later use. While Scratch is used to create all kinds of media ranging from simulations to animations to games, it comes with a set of sound blocks and advertises its capability for creating music. In fact, the name “Scratch” derives from a DJ metaphor: a user may tinker with snippets of code created by herself and others much as a DJ scratches together musical samples (Resnick 2012). Scratch is also widely used to make music in classrooms and homes around the world, often through hardware such as the Makey Makey. Middle school music educator Josh Emanuel, for example, has posted comprehensive documentation on building and performing Scratch instruments with his students in Nanuet, New York.
Scratch is accessible even to novice coders, in settings ranging from elementary school to college-level introductory coding classes, and to curious adults. Recently, in an undergraduate course entitled Creative Learning Design, the authors assigned a project in which a diverse set of undergraduate students designed and built music-making environments in Scratch, followed by user testing sessions with local first and seventh graders.[2] The music component of Scratch has been used to teach computational thinking in an interdisciplinary general education Sound Thinking class, co-designed and taught by Alex Ruthmann at the University of Massachusetts Lowell (Greher and Heines 2014). The instructors of the course covered topics such as iteration, boolean operators, concurrency, and event handling through building music controllers, generating algorithmic music, and defining structures for short compositions (2014, 104–131). They chose Scratch as the music programming environment for their course as it “makes the threshold of entry into the world of programming very low” and “includes some great music functionality” such as the ability to “play a number of built-in sounds, or sounds stored in mp3 files” and, more importantly, “its ability to generate music using MIDI” (2014, 104–105).
Motivation
Scratch has a massive user base, with over 33 million registered users and over 35 million projects shared across its platform to date. In supporting this wide audience, the Scratch team intends to put forward an environment with “low floors, high ceilings, and wide walls,” that is to say, a low barrier of entry in which one has ample room to grow and pursue unique interests (Resnick et al. 2009). However, when it comes to music projects, we believe Scratch has limitations. In terms of easily coding and playing pre-recorded and user-created audio files, one finds low floors. However, users wishing to create music by sequencing musical notes, or with engaging sounds and instruments, often face high floors, low ceilings, and narrow walls due to complex numeric music mappings and mathematical representations, as well as data structures that limit musical expression.
In this work, we present and critique three major challenges with the implementation of music and sound blocks in Scratch 2.0. First, the functionality of music blocks is immediately accessible, but designed to play sounds at the level of “musical smalls” (i.e., the “atoms” or “phonemes” of music, such as an individual note, pitch, or duration) rather than “musical simples” (i.e., the “molecules” or “morphemes” of music, such as motives and phrases) (Bamberger and Hernandez 2000). Second, arising from Scratch’s bottom-up programming style (i.e., building music code block by block, and note by note, from a blank screen), the act of realizing musical sequences is tedious, requiring a deep mathematical understanding of music theory to (en)code music. Third, and perhaps most challenging to the end user for music, is a timing mechanism designed to privilege animation frame rates over audio sample-level accuracy. As illustrated by numerous Scratch projects and our own task breakdown, a user must take extra, often unintuitive steps to achieve adequate musical timing for basic musical tasks such as drums grooving together in time, or melodies in synchronization with harmony. For the balance of this article, we analyze the design constraints of the music and sound blocks, the user experience and implications of using them in the Scratch environment, and finally the quality of the audio they produce. We conclude with a preview of new music block design ideas that aim to better address music making and coding in the Scratch environment for musical and coding novices.
Music Functionality in Scratch 2.0
The music capabilities of Scratch 2.0 center around the Sound blocks (Figure 1)—audio primitives that enable users to trigger individual sounds, notes and rests, and to manipulate musical parameters such as tempo, volume and instrument timbre.

Scratch makes the first steps in crafting sounds and music at the computer easy and accessible. Unlike many audio programming languages, it provides a learner with immediate audio feedback from a single block of code that can be clicked and activated at any point. Other music coding environments often require more complex structures to play a sound, including instantiating a new instrument, locating and loading a sample, and turning audio output on or running the program before producing a result (Figure 2).

Scratch supports three types of audio playback—sounds (audio files), MIDI drums, and MIDI pitched instruments. A library of free audio files is included in the Scratch environment, and sounds are linked intuitively to relevant sprites (e.g. when the Scratch cat is chosen, its default sound option is “meow”). Users can easily upload and record new sounds by clicking on the “sounds” tab. General parameters such as instrument, tempo, and volume are set locally for each sprite.
Unfortunately, the three types of audio playback in Scratch lack consistency. In the design of novice programming languages, consistency means that “the language should be self-consistent, and its rules should be uniform” (Pane and Meyers 1996, 55). Setting a tempo affects pitched instruments and drums, but not audio files (i.e., sounds). This makes sense for games and animations where strict rhythmic timing is unnecessary, e.g. characters communicating with each other. But if users wanted to load in their own musical sounds to create a sampler, they would need to use a combination of “play sound” blocks and “rest” blocks to emulate tempo. The syntax “play sound until done” is also confusing to new learners and unique to sound playback. In effect, this block means “do not move on until the sound has finished,” and it results in timing derived from the current sound’s length.[3]
While pitched instruments and drums use a consistent timing mechanism, they differ in that “play drum” requires a percussion instrument as an argument, while “play note” requires not an instrument but a pitch. Following a real-world metaphor of musicians playing notated music, this distinction is appropriate (Pane and Meyers 1996), as separate percussion instruments are often notated on different lines of the same staff, just like pitches. A single percussionist is required to play multiple instruments, just as an instrumentalist is required to play multiple notes. However, the “set instrument” function constrains sprites to a single instrument. It is often a challenge to teach that a single sprite can play any number of percussion sounds in parallel with a pitched instrument but is limited to one pitched instrument.[4] A more consistent design might treat a drum like any other instrument, requiring it to be set for the entire sprite while providing smart selection/labelling to indicate the resulting drum sound.[5]
The design of the “play note” and “play drum” blocks is challenging in other ways. For one thing, the vertical, numeric representation of pitches and instruments conceals melodic content that is more easily visualized with other representations, like traditional score notation or pitch contour curves. A learner has to memorize, for example, that pitch 64 refers to “E,” that drum 1 is the snare drum, and that instrument 9 is the trombone. Rhythm is also expressed numerically, as fractions or multiples of the beat, rather than in traditional formal music terms like “sixteenth notes” and “quarter notes.” While this notation affords a more fundamental conception of timing and can lead to unusual behaviors (1.65 beats against 1.1 beats), it constrains the types of rhythms possible, e.g. triplet eighth notes, which must be rounded to 0.33 of a beat. Depending on what one is teaching about music, a notation-based rhythm representation may be more accessible than decimal fractions of beats.
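To make the arithmetic concrete, the following sketch (our own illustration, not Scratch code) spells out the two mappings a learner must internalize: MIDI note numbers to note names, and beats to clock time at a given tempo.

```javascript
// The mental arithmetic behind Scratch's numeric music notation,
// written out as JavaScript for illustration.
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// MIDI note number -> note name with octave, e.g. 64 -> "E4"
function midiToName(midi) {
  const octave = Math.floor(midi / 12) - 1;
  return NOTE_NAMES[midi % 12] + octave;
}

// Beats -> seconds at a given tempo, e.g. 0.5 beats at 120 BPM -> 0.25 s
function beatsToSeconds(beats, bpm) {
  return beats * (60 / bpm);
}

midiToName(64);              // "E4"
beatsToSeconds(1 / 3, 120);  // 0.1667 s: a true triplet eighth, which the
                             // block's decimal field forces a learner to
                             // round to 0.33 beats
```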
The quality of the audio output may limit users’ expressiveness and can be a major ceiling on creativity for young users. Pitches and drums are played via a Flash soundbank. While Scratch provides a large variety of drums and pitched instruments, they vary in quality and are output as mono sounds. High-quality audio, such as instrument sample libraries, takes up large amounts of storage space and internet bandwidth to download. This trade-off between sound quality and load speed limits many web-based music and sound environments. Because Scratch 2.0 does not support synthesizing new sounds or uploading new MIDI instrument libraries, users are less able to pursue unique musical instruments of their own. In contrast, EarSketch uses a library of loops created by professional DJs and producers, which its developers write contributes to an experience that is personally meaningful and deeply authentic (Freeman et al. 2014). In Scratch, the lack of high-quality audio, in combination with the other challenges discussed, results in many users bypassing the instrument blocks entirely in favor of using audio samples and uploading long audio files that play on repeat. For example, in one Scratch forum post entitled “Add more instruments,” one response to the poster’s request states, “Why can’t you download music? It’s better.”
Learning Music Coding/Coding Music in Scratch
Coding in Scratch is a bottom-up process. While the developers argue this style of constructionist learning supports exploration, tinkering, and iterative design (Resnick et al. 2009), and Scratch’s remix feature lessens the load by allowing many users to begin from an existing project rather than a blank canvas, critics have pointed out that it may lead to bad programming habits, such as incorrect use of control structures and extremely fine-grained programming where programs consist of many small scripts that lack coherency (Meerbaum-Salant, Armoni, and Ben-Ari 2011). For our purposes, we must interrogate whether Scratch’s bottom-up programming style is a useful method for composing music. For one thing, bottom-up programming takes time. Each time a new block is added to the screen, a user must click and drag, select a new pitch/instrument and duration, and often rearrange existing blocks. Music scripts are long, and given the finite amount of screen space, they quickly become unwieldy, as even simple tunes like “Twinkle Twinkle Little Star” have tens if not hundreds of notes. One might argue that hard coding all of the pitches, as one might in a DAW, defeats the purpose of using a programming environment to represent music. It is true that a powerful exercise in Scratch is to represent a tune with as little code as possible, using loops as repeats, messages to trigger sections, and data structures to store groups of notes. In formal settings, the tedium required to code a tune block by block opens up the opportunity to learn more efficient methods of coding (Greher and Heines 2014). However, abstraction is a difficult concept for novice programmers to internalize, and problem-solving usually occurs first at the concrete level. Collecting the results of multiple studies, Pea, Soloway, and Spohrer (2007) note that “there is a well-known tendency of child Logo programmers to write simple programs using Logo primitives rather than hierarchically organized superprocedures that call other procedures even after examples of superprocedures and discussions of their merits for saving work have been offered” (24). Similarly, in Scratch, beginning music coders must slog through dragging blocks and blocks of code before realizing even a simple melody, let alone a multi-part composition (Figure 3).

A more fundamental argument against bottom-up music coding can be drawn from Jeanne Bamberger’s Developing Musical Intuitions (2000). In it, she points out that conventional music instruction, like Scratch, begins with the smallest levels of detail—notes, durations, intervals, etc. She compellingly makes the case that this approach ignores the intuitions we already possess as the starting point for musical instruction, e.g. our love of “sticky” melodies and our pleasure in finding and moving to danceable grooves. Conventional instruction asks us to begin our educational work with the smallest musical components, often taken out of context, rather than with the larger units that our prior experience and developed musical intuitions already provide.
In Bamberger’s curriculum, conversely, she begins with the middle-level concept of melodic chunks. The accompanying learning tool, Impromptu, presents musical building blocks called “tuneblocks” that are hierarchically grouped into entire tunes, phrases, and motives. The user begins by listening to the various “chunks” of melodies, then arranges and rearranges them to create or re-create a composition. The user does not initially work at the level of individual pitches and rhythms, but instead at a higher, more intuitive level of abstraction. Students begin by learning that the base units of a composition are melodies and motives, rather than pitches or frequencies. It is not until Part 3, entitled Pitch Relations, that Bamberger zooms in, introduces intervals, and assigns students to uncover and reverse-engineer scales from melodies. She effectively reverses the order of traditional music theory, teaching children how to theorize with music rather than accept it as a set of bottom-up, well-structured rules. One notices that this approach is the reverse of the way many people approach teaching coding in Scratch (and in many other languages), but it is congruent with the original design intention behind Scratch, where users would first browse and play games created by other users in the Scratch community, then click the “see inside” button to remix others’ code as the starting point for learning.
Examined from a music technology angle, Freeman and Magerko (2016) classify educational music programming languages by the layer of musical interaction they afford and thus the form of musical thinking they prescribe: subsymbolic programming languages prioritize synthesis and control of time at an extremely low level; symbolic languages offer editing at the individual note level, usually via MIDI; and hierarchical languages deal with musical form, allowing loops or motives to be layered and remixed.[6] The most powerful of the bunch are clearly the subsymbolic languages, which offer near-limitless flexibility and control for expert and professional coders. Yet the manipulation of low-level processes requires patience, determination, and a lot of time and experience to build up the knowledge to apply it in a practical setting.[7] We have observed that novice Scratch users sometimes get hung up and frustrated after the first few short programs they write, without ever experiencing a moment of true self-expression and failing to develop the knowledge and motivation to progress further.
Hierarchical languages do not always support lower-level sound editing, meaning that advanced users will inevitably reach a ceiling on what they can express with the tool. (EarSketch, for example, allows the programming of beats, but disallows modifying single pitches within a melody.) However, this constraint is not necessarily a bad thing. Educational programming languages should not serve as the end-all-be-all, but rather as a meaningful stage in a learner’s development. Maloney et al. (2010) write that their intent with Scratch is to “keep [their] primary focus on lowering the floor and widening the walls, not raising the ceiling,” noting it is important for some users to eventually move on to other languages (66). Of course, this growth step should not be taken out of desperation, but rather in search of new media that afford more creative potential after one has had multiple positive experiences.
Ideally, users of a music programming environment should have the opportunity to compose entire pieces and/or perform for others using custom instruments/interfaces, all while encountering code challenges that push their critical thinking skills along the way. Through this process, they can begin to theorize about and through code using musical materials and ideas they care about, testing and developing their musical intuitions and hunches as part of their creative process. Environments like EarSketch and Hyperscore (Farbood, Pasztor, and Jennings 2004) offer innovative, hierarchical methods of interaction that demonstrably achieve these goals, but we believe there is still unrealized potential for environments and educators to support parallel artistic and computational growth and then scaffold the transition into more advanced/powerful environments. (Hyperscore is not a programming language, but a unique computer-assisted composition system that enables users to control high-level structures (e.g. composition-length harmonic progressions) and lower-level structures (motivic gestures), all through drawing and editing curves.)
Inconsistent Timing in Scratch Music Playback
The Scratch music implementation contains a tempo object that is used to specify the duration of notes, drums, and rests. This is a welcome feature: even though tempo is key to almost all music-making, many other audio programming languages require a user to time playback in second- or millisecond-long durations. One obvious application in which tempo is useful is creating rhythms or drum beats. Consider a popular Scratch drum machine developed by user thisisntme. Both thisisntme’s code and many thoughtful comments indicate exciting possibilities for making music in Scratch.[8] Other comments indicate that the app is imperfect, especially in regards to rhythm. For example, user Engin3 writes, “only problem to me (that i doubt can be fine tuned [sic], is the timing of the repeat. It feels a bit off to me.” The developer thisisntme writes in the instructions, “TO MAKE THE BEATMAKER MORE ACCURATE, ENABLE TURBO MODE AND REDUCE TEMPO WITH DOWN ARROW.” These comments are indicative of a much more serious problem in Scratch: the lack of accurate timing. In the rest of this section, we describe the buggy path to building a simple drum loop consisting of hi-hat, kick drum, and snare drum. Consider a challenge to implement the drumbeat in Michael Jackson’s “Billie Jean” (Figure 4).

An initial solution to this problem requires listening to determine the relative rates of each drum and envisioning three musical lines running in parallel. Learners would need to understand loops, as well as implement a solution to trigger the three instruments to start at the same time. A Scratch representation of all the notes above looks as follows:

A follow-up challenge adapted from the Sound Thinking curriculum (Greher and Heines 2014, 109–110) might involve asking students to implement a simplified version of the groove using as few blocks of code as possible. This approach teaches students to look for patterns and opportunities to code the music more efficiently, exploring the computational concepts of repeats, loops, and variables.

When played back, the above solutions drift away from each other, increasingly out of time. Individually, each of the drum lines sounds fine, but when executed together in parallel they sound cacophonous, as if three musicians were asked to play together wearing blindfolds and earplugs. In an informal experiment with undergraduates in the NYU Music Experience Design Lab, each student, after encountering the problem, unsuccessfully attempted a different fix. One tried explicitly setting the tempo in each group, as if each loop required a constant reminder or reset of how fast it is supposed to play. Another tried adjusting the beat values to approximations of the beat like 0.99 and 1.01, attempting to account for perceived differences in playback speed across the drums but actually exaggerating the differences in tempo.

In reality, the problem is beyond the user’s control, stemming from the fact that Scratch is optimized for animation frame-rate timing rather than the higher-precision timing needed for audio and music. Greher and Heines (2014) write that Scratch 1.4 intentionally slows down processing so that animations do not run too fast on modern computers, but “what’s too fast for animations is sometimes too slow for music, where precise timing is necessary” (112). In Scratch 2.0, the problem may be compounded by its Flash implementation, in which background processes within and outside the web browser can interrupt audio events, pushing music and sound out of sync. We should note that timing on the web is a longstanding challenge: before the Web Audio API, programmers relied on imprecise functions like setTimeout() and Date.now(). Even though the Web Audio API exposes the computer’s hardware clock, its implementation is inflexible and often misunderstood (Wilson 2013). Ideally, in a novice programming environment like Scratch, these timing challenges should be addressed so that users can think about and implement code with intuitive musical results.
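The standard workaround in JavaScript, and the pattern Wilson (2013) describes, is a “lookahead” scheduler: an imprecise timer only wakes the program up, while the actual note times are computed on the Web Audio hardware clock. A rough sketch follows, with a stand-in click in place of a real drum sound.

```javascript
// Lookahead scheduling sketch after Wilson (2013): setInterval() is only a
// wake-up call; note times are computed on the audio clock, so they do not drift.
const ctx = new AudioContext();
const bpm = 117;
const secondsPerBeat = 60 / bpm;
const lookahead = 0.1;               // schedule events up to 100 ms ahead
let nextNoteTime = ctx.currentTime;  // when the next beat should sound

function playDrumAt(time) {
  // stand-in "drum": a short oscillator click scheduled at a precise time
  const osc = ctx.createOscillator();
  osc.connect(ctx.destination);
  osc.start(time);
  osc.stop(time + 0.05);
}

function scheduler() {
  while (nextNoteTime < ctx.currentTime + lookahead) {
    playDrumAt(nextNoteTime);        // sample-accurate, clock-based time
    nextNoteTime += secondsPerBeat;  // advance by exactly one beat
  }
}

setInterval(scheduler, 25);          // coarse timer; jitter here does not matter
```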
There are a few “hacks” to work around these timing issues in Scratch. Toggling the “Turbo Mode” button helps, but it merely executes code blocks as fast as possible, which does not prevent timing issues. Discussions from users in the Scratch online forums reveal that they are also perplexed about what the mode actually accomplishes. An algorithmic solution to improve musical timing involves conceptualizing one of the musical lines as the leader or conductor. Rather than each of the lines looping and gradually getting out of time relative to each other, one line loops and signals the others to follow suit:

Of course, this solution is less readable and not immediately intuitive to a novice coder. It is also more viscous (Pane and Meyers 1996), since the user must always ensure the “conductor” script is equal in length to or longer than all other scripts. Changes to the “conductor” script may affect playback of the other scripts, so the user must check multiple scripts after making a change. More importantly, it is an incomplete solution. As each line moves forward in its own loop, it will drift out of time relative to the other loops before snapping back in time once the “start” message is broadcast, causing a feeling of jerkiness. As loops lengthen, the effect becomes more dramatic. A slightly better solution is to create a single forever loop that triggers all of the current drums at the correct moments in time. Unfortunately, this solution ignores the durations included in each “play drum” call, since rests are necessary to separate the asynchronous “broadcast” calls. While it prevents these few music loops from getting out of sync with one another, it does not solve the fundamental problem of inaccurate timing: the entire rhythm still has the potential for musical drift.

One may argue that because at least one solution exists, the programming environment is not at fault, and further, that all programming languages have affordances and constraints unique to their implementations. Still, we hope that a solution may be found in a future version of Scratch, or via a new extension mechanism, to address these timing issues so that novice musicians and coders may use the Scratch paradigm to create musically synchronized code for their creative projects. This is in part because new programmers are often less attuned to precisely what to look for when they encounter unexpected results. Pea, Soloway, and Spohrer (2007) write that “since the boundaries of required explicitness vary across programming languages, the learner must realize the necessity of identifying in exactly what ways the language he or she is learning ‘invisibly’ specifies the meaning of code written” (25). To solve this problem, users must learn to mistrust Scratch’s implementation of tempo and work around the timing issue by using their ears. Our solutions can be thought of more as “hacks” than as useful exercises in building musical computational thinking skills.
Consider the same challenge executed in another visual programming environment, Max/MSP. Max/MSP, as noted above, is not nearly as immediately accessible and does not abstract away much music functionality. As a result, more code objects are necessary to accomplish the same tasks. First, there is no library of built-in drum sounds, so users must devise their own method to synthesize or play back the drums. In the solution below (Figure 10), local audio files for each drum type are loaded into Max/MSP memory objects called buffers. Unlike Scratch’s sound playback, which requires one block, Max/MSP’s sound playback requires a minimum of four objects: a buffer to store the audio file, a play~ object to play it, a start button to trigger playback, and an audio output to ensure sound reaches the user’s speakers. Second, rather than specifying a global tempo in beats per minute (BPM), Max/MSP includes a metro object, which produces output at regular intervals specified in milliseconds. It is up to the user to perform the calculations to convert BPM to the correct note duration. All that said, the Max/MSP program uses an algorithm identical to the original Scratch solution, but it works as expected: the drum beats sound in time with each other.
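The conversion itself is simple arithmetic: milliseconds per beat equals 60000 divided by the BPM. A short illustration (our own, at an assumed tempo of around 117 BPM for “Billie Jean”):

```javascript
// BPM -> metro intervals in milliseconds, the calculation a Max/MSP user
// performs by hand (the tempo value here is an assumption for illustration).
const bpm = 117;                       // roughly the tempo of "Billie Jean"
const msPerQuarter = 60000 / bpm;      // ≈ 513 ms per quarter note
const msPerEighth = msPerQuarter / 2;  // ≈ 256 ms per eighth note
```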

New Opportunities for Music Blocks in Scratch
As we have shown, music making in Scratch may be limited by the level of representation coded into the blocks, differing motivations among users in using and reading the blocks, and inaccurate timing that affects music playback. However, there are ways forward that may enhance music making through redesigning the music and sound blocks in Scratch. Scratch’s accessibility as a browser-based environment, its wide and diverse user community, and its insistence on making all code and assets shared on its platform copyable and remixable create a democratized space for children to explore music making and learning. Imagine if young people could not only use the web as a tool to listen to music and look up song lyrics, but actually step inside and remix the creations of their peers. With a new extension mechanism due to be implemented in Scratch 3.0, educators and more advanced coders will be able to design their own blocks to extend or modify the functionality of Scratch, leading towards a more open and customizable coding environment.
As described in the wiki entry for Scratch 3.0, there are many notable changes. Scratch 3.0 only supports sound blocks in the default blocks interface, while music blocks, such as “play note” and “set tempo,” must now be loaded via an extension. This separation of blocks addresses our concerns about consistency by formally separating how sound and MIDI music blocks are used.[9] However, the issues of steady musical timing and the representation of music as “musical smalls” remain. As other developers step in to add music functionality, it is important that they explore ways to avoid the pitfalls of the current music implementation.
One exciting development is the ScratchX + Spotify extension for the ScratchX experimental platform.[10] This extension interfaces with the Spotify API through only a few powerful Scratch blocks, while addressing many of the issues we discussed above.[11] Rather than using low-level notes and beats as musical building blocks, it loads thirty-second-long clips of songs streamed instantly from Spotify. It abstracts away many of the complexities of using the API while including exciting features, such as chunking a song based on where Spotify’s algorithm believes the beat to be, and triggering animations or motions based on cues timed to structures within the music. Crucially, the ScratchX + Spotify website provides starting points for using the extension, such as one app that makes sprites “dance” and another that lets a user remix a song. From a user’s point of view, they can access a song or piece of music that is personally meaningful to them and begin to code, play, and experiment directly. They can work directly with musically simple chunks of music they care about, in turn developing the curiosity that lays the foundation for meaningful exploration of lower-level musical details and code at a later time. This is a powerful idea and a positive step towards making music coding projects more accessible and motivating to young coders.
New Design Ideas for Music Blocks in Scratch
In the NYU Music Experience Design Lab (MusEDLab), we have been experimenting with new block designs for music using Scratch 2.0’s “Make a Block” functions, as well as ScratchX’s extension platform. In beginning work on new Scratch 3.0 music extensions, we are intimately aware of the challenges in designing for novice programmers and in representing music in code. To move beyond music blocks based on the musical smalls of “play note” and “play drum,” we propose creating new blocks modeled at a higher level of musical representation, closer to the “musical simples” that Bamberger and Hernandez (2000) advocate. This work builds on the creation of new individual blocks that perform chords or entire sequences of music, experimenting with possible syntaxes for creative interaction, exploration, and play.[12]
As a first step, we are prototyping new functionality to launch various MusEDLab web-based apps via blocks within Scratch, such as the circular drum sequencer Groove Pizza, and then return the audio or MIDI output for use in Scratch projects. This functionality lets users play with and create rhythmic groove “simples” in an interface designed for musical play. Then, when they are finished creating their groove, it is already embedded into a new “play groove” Scratch block for use and further transformation through code.
In this prototype block (Figure 12), the user clicks on the yellow Groove Pizza icon on the “play groove” block, which pops up a miniature 8-slice Groove Pizza in the user’s browser. The user clicks to create a visual groove in the pop-up pizza, and the “play groove” block is virtually filled with that 8-beat groove. The user can then use this block in their Scratch project to perform the rhythmic groove. If the user wants to speed up or slow down the groove, they can modify it with an existing “set tempo” or “change tempo” block. This approach brings the musical creation and playful coding experiences closer together within the Scratch 3.0 interface.
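As an illustration of the idea (not the actual MusEDLab implementation), a groove returned from the pop-up pizza might be represented as one on/off list per drum, one entry per slice, which the “play groove” block then steps through at the sprite’s current tempo:

```javascript
// Hypothetical sketch of an 8-slice groove and how a "play groove" block
// might step through it; the data shape and playDrumAt() are illustrations only.
const ctx = new AudioContext();

function playDrumAt(drum, time) {
  // stand-in trigger: a short click regardless of which drum is named
  const osc = ctx.createOscillator();
  osc.connect(ctx.destination);
  osc.start(time);
  osc.stop(time + 0.05);
}

const groove = {
  kick:  [1, 0, 0, 0, 1, 0, 0, 0],  // one entry per pizza slice
  snare: [0, 0, 1, 0, 0, 0, 1, 0],
  hihat: [1, 1, 1, 1, 1, 1, 1, 1],
};

function playGroove(groove, bpm, startTime) {
  const secondsPerSlice = 60 / bpm;  // one slice per beat, as in the 8-beat groove above
  for (let step = 0; step < 8; step++) {
    const time = startTime + step * secondsPerSlice;
    for (const drum of Object.keys(groove)) {
      if (groove[drum][step]) playDrumAt(drum, time);
    }
  }
}

playGroove(groove, 100, ctx.currentTime);  // perform the groove at 100 BPM
```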

Curious users of the MusEDLab apps who are interested in learning more about the computational thinking and coding structures behind them can already launch, step inside, and remix simplified versions created in Scratch, such as the aQWERTYon and Variation Playground apps. While novice coders may not initially be able to understand the JavaScript code behind our apps when clicking “View Source” in their browser, they can get a middle-level introduction to our approaches by exploring and remixing the simplified versions of these apps in Scratch.
Another approach we have taken is to use Scratch’s “Make a Block” feature to create custom music blocks such as “play chord” and “play sequence.” In this approach, the user can copy our blocks and use them without having to know how the inner coding works. Users who are curious can always take a deeper look at how the blocks, built using musical and computational simples, were coded, because in this implementation the structure behind the custom blocks is left intact.

The leftmost block of code in Figure 13 presents a common method from Scratch 2.0 for performing a musical triad (a C major chord) when the space key is pressed on the computer keyboard. This code requires three instances of the “when space key pressed” block, as well as knowledge of MIDI pitch numbers to build a C major chord: C equals MIDI note 60, E equals 64, and G equals 67. This implementation takes six Scratch blocks to create and requires a basic understanding of music theory and MIDI.[13] To lower the floor and widen the walls for chord playback, we implemented a new “play chord” block in which users can specify the chord root and chord quality directly. These blocks can then be sequenced to form a chord progression. This approach encapsulates music at a higher level than individual notes and rhythms: the level of the “musical simple” of a chord. While there is a time and place for teaching that chords are made up of individual notes, implementing a “play chord” block enables novices to work more easily with chords, progressions, and harmony.
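A rough sketch of the logic such a block encapsulates follows; the chord vocabulary and the playNoteAt() stand-in are our own illustrative assumptions, not the block’s actual implementation.

```javascript
// Sketch of a "play chord" abstraction: the user names a root and a quality,
// and the block expands them into MIDI notes internally.
const ctx = new AudioContext();

// Stand-in note playback: a sine tone at the MIDI pitch's frequency.
function playNoteAt(midi, time, duration) {
  const osc = ctx.createOscillator();
  osc.frequency.value = 440 * Math.pow(2, (midi - 69) / 12);  // MIDI -> Hz
  osc.connect(ctx.destination);
  osc.start(ctx.currentTime + time);
  osc.stop(ctx.currentTime + time + duration);
}

const CHORD_QUALITIES = {
  major: [0, 4, 7],        // root, major third, perfect fifth
  minor: [0, 3, 7],
  diminished: [0, 3, 6],
};

function playChord(rootMidi, quality, time, duration) {
  for (const interval of CHORD_QUALITIES[quality]) {
    playNoteAt(rootMidi + interval, time, duration);
  }
}

// The six-block Scratch script above collapses to a single call:
playChord(60, 'major', 0, 1);  // C major: MIDI notes 60, 64, 67
```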
Another approach to a simpler music block is illustrated in Figure 14. To perform a melody or sequence of musical notes or rests, a Scratch 2.0 user needs a “play note” block for each note in the melody. The “play seq” block enables the user to input a sequence of numbers (1–9, plus 0 for a 10th scale degree) or hyphens for musical rests to represent and perform a melodic sequence. The variable “major” indicates that the sequence is to be mapped onto a major scale starting on a root note of 60, or middle C; the first number, 1, maps onto the root pitch. With this block, the melodic sequence can be transposed, its rhythmic durations set as fractions of whole notes, or its pitches mapped onto different musical scales, all within the same block. When used in a live-coding approach, the sequence within a block running in a repeat or forever loop can be adjusted to update the melody on the fly.
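The logic behind such a block might resemble the following sketch, which reuses the playNoteAt() stand-in from the chord example above; the scale tables and parameter names are illustrative assumptions rather than the block’s actual code.

```javascript
// Sketch of a "play seq" abstraction: each character is a scale degree
// (1-9, 0 for the 10th), "-" is a rest; pitches are mapped onto a named scale.
const SCALES = {
  major: [0, 2, 4, 5, 7, 9, 11, 12, 14, 16],  // semitone offsets for ten degrees
  minor: [0, 2, 3, 5, 7, 8, 10, 12, 14, 15],
};

function playSeq(sequence, scaleName, rootMidi, beatsPerNote, bpm) {
  const secondsPerNote = beatsPerNote * (60 / bpm);
  let time = 0;
  for (const ch of sequence) {
    if (ch !== '-') {                                  // "-" is a rest: play nothing
      const degree = ch === '0' ? 9 : Number(ch) - 1;  // "1" is the root, "0" the 10th
      playNoteAt(rootMidi + SCALES[scaleName][degree], time, secondsPerNote);
    }
    time += secondsPerNote;                            // rests still advance time
  }
}

// The opening of "Twinkle Twinkle Little Star" on a C major scale rooted at middle C:
playSeq('1155665-', 'major', 60, 0.5, 100);
```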

The prototype blocks described above provide just a few ideas for how to make it easier for novices to code and explore musical concepts in Scratch. Our approach focuses on designing Scratch music blocks modeled after Bamberger’s notion of musical simples rather than musical smalls, potentially enabling a new pathway for children to intuitively and expressively explore creating and learning music through coding. Given the huge user base among school-age children, there is a great opportunity for the Scratch environment to be not only a platform for teaching coding and computational thinking, but also one for musical creativity and expression. The affordances of a well-designed, wide-walled coding environment provide opportunities for children to “tinker” with the musical intuitions and curiosities they already have, and to easily share those with their peers, teachers, and families. While there are many technical and design challenges ahead, and no environment can be perfect for everyone, we look forward to seeing how developers will extend the Scratch environment to promote lower floors, higher ceilings, and wider walls for creative music and sound coding projects for all.