
Side by side comparison of traditional and experimental Scratch blocks. Detailed description in article text.

Music Making in Scratch: High Floors, Low Ceilings, and Narrow Walls?


Music programming is an increasingly popular activity for learning and creating at the intersection of computer science and music. Perhaps the most widely used educational tool that enables music programming is Scratch, the constructionist visual programming environment developed by the Lifelong Kindergarten Group at the MIT Media Lab. While a plethora of work has studied Scratch in the context of children creating games and coding interactive environments in general, very little has homed in on its creative sound- or music-specific functionality. Music and sound are such an important part of children's lives, yet children's ability to easily engage in creating music in coding environments is limited by the deep knowledge of music theory and computing needed to realize musical ideas. In this paper, we discuss the affordances and constraints of Scratch 2.0 as a tool for making, creating, and coding music. Through an analysis of limitations in music and sound code block design, a discussion of bottom-up music programming, and a task breakdown of building a simple drum loop, we argue and illustrate that the music and sound blocks as currently implemented in Scratch may limit and frustrate meaningful music making for children, the core user base for Scratch. We briefly touch on the history of educational music coding languages, reference existing Scratch projects and forums, compare Scratch with other music programming tools, and introduce new block design ideas to promote lower floors, higher ceilings, and wider walls for music creation in Scratch.


Music programming is the practice of writing code in a textual or visual environment to analyze audio input and/or produce sonic output. The possibilities for creative music coding projects are virtually endless, including generative music-makers, audio-visual instruments, sonifications, interactive soundtracks, music information retrieval algorithms, and live-coding performances. Music programming has become an increasingly popular pursuit as calls to broaden and diversify the computer science field have led to attempts at integrating programming and computational thinking into other topics and curricula.[1] Today, music programming languages are widespread and predominantly free. Yet, as evidenced by their history and purpose, most cater to expert computer musicians rather than novice programmers or musicians. For example, Max/MSP, one of the most popular music programming environments for professionals and academics, was first developed at IRCAM in Paris to support the needs of avant-garde composers (Puckette 2002). Usually, prior knowledge of music theory, programming, and signal processing is needed just to get started with a music programming language. Writing about the computer music programming environment SuperCollider, Greher and Heines (2014) note that "the large learning curve involved in understanding the SuperCollider syntax makes it inappropriate for an entry-level interdisciplinary course in computing+code" (Greher and Heines 2014, 104–105). However, recent platforms such as Sonic Pi (Aaron et al. 2016), EarSketch (Freeman et al. 2014), and the approaches piloted by Gena Greher, Jesse Heines, and Alex Ruthmann using Scratch in the Sound Thinking course at the University of Massachusetts Lowell (Heines et al. 2011) attempt to find ways to teach both music and coding to novices at the same time.

Music and Coding Education

The idea of using computational automata to generate music can be traced back to Ada Lovelace, possibly the world's first programmer, who in the mid-19th century imagined an Analytical Engine capable of completing all sorts of tasks, including composing complex musical works (Wang 2007). A century later, the first programming languages that synthesized sound were developed, beginning with Music I, created in 1957 by Max Mathews at Bell Labs. Educational environments that treated music and sound as an engaging context for learning coding appeared relatively soon after, when Jeanne Bamberger collaborated with Seymour Papert at the MIT AI Lab to adapt the Turtle Graphics programming language Logo to support music (Schembri 2018). The resulting MusicLOGO opened up exciting opportunities for children making music, since it enabled them to bypass the tedium and complexity of learning traditional notation and theory and dive right into composing and playing back their work to reflect and iterate (Bamberger 1979). Following MusicLOGO, Bamberger developed Impromptu, a companion software to her text Developing Musical Intuitions (Bamberger and Hernandez 2000) that enables young learners to explore, arrange, and create music using "tune blocks" in creative music composition, listening, and theory concept projects.

Within the last ten years, the landscape of free, educational programming languages designed explicitly around music has begun to bloom. While these languages vary in style of programming (functional, object-oriented) and of music (classical, hip-hop), all of those that follow enable novice programmers to express themselves sonically. Sonic Pi (Aaron, Blackwell, and Burnard 2016) is a live-coding language based on Ruby that enables improvisation and performance. It is bundled with the Raspbian OS, resulting in widespread deployment on Raspberry Pi computers around the world, and it has been studied in UK classrooms. EarSketch (Freeman et al. 2014) is another platform that combines a digital audio workstation (DAW) with a Python or JavaScript IDE (integrated development environment), and was originally designed to teach programming to urban high school students in Atlanta, Georgia using hip-hop grooves and samples. Among its innovations is the inclusion of a library of professional-quality samples, enabling learners to make music by combining and remixing existing musical structures rather than adding individual notes to a blank canvas. JythonMusic (Manaris and Brown 2014) is another open source Python-based environment that has been used in college courses to teach music and programming. Beyond music-centric languages, numerous media computing environments such as AgentSheets and Alice support sound synthesis and/or audio playback in the context of programming games, animations, and/or storytelling projects (Repenning 1993; Cooper, Dann, and Pausch 2000; Guzdial 2003).

Music Coding in the Browser

At the time of their initial release, most of the programming environments described above were not web-based. The additional steps of downloading and installing the software and media packages to a compatible operating system presented a barrier to easy installation and widespread use by children and teachers. In 2011, Google introduced the Web Audio API, an ambitious specification and browser implementation for creating complex audio functions with multi-channel output, all in pure JavaScript. Soon after, Chrome began to support its many features, and Firefox and Safari quickly followed suit. As the developers of Gibberish.js, an early web audio digital signal processing (DSP) library, point out, the Web Audio API is optimized for certain tasks like convolution and FFT analysis at the cost of others like sample-accurate timing, which is necessary for any kind of reliable rhythm (Roberts, Wakefield, and Wright 2013). Over the past few years, the Web Audio API has added functionality, a more accessible syntax, and new audio effects. The aforementioned Gibberish.js and its companion Interface.js were among the first libraries to build higher-level musical and audio structures on top of the base Web Audio API, enabling the rapid prototyping and implementation of complex web-based music and audio interfaces. These libraries are increasingly being used in educational settings, such as middle schools in which girls are taught coding (Roberts, Allison, Holmes, Taylor, Wright, and Kuchera-Morin 2016).

Newer JavaScript audio and music libraries include Flocking (Clark and Tindale 2014), p5.js (McCarthy 2015), and Tone.js (Mann 2015). Flocking uses a declarative syntax similar to SuperCollider's, meant to promote algorithmic composition, interface design, and collaboration. The p5.sound library adds an audio component to the popular web animation library p5.js, and most recently Tone.js provides a framework with a syntax inspired by DAWs and a sample-accurate timeline for scheduling musical events (Mann 2015).


Scratch is a programming environment designed and developed in the Lifelong Kindergarten Group at the MIT Media Lab. First released in 2007, Scratch 1.0 was distributed as a local runtime application for Mac and Windows; Scratch 2.0 was coded in Adobe's Flash environment to run in the web browser beginning in 2013. With the impending deprecation of Flash as a supported web environment, a new version of Scratch (3.0) was reprogrammed from the ground up using JavaScript and the Web Audio API. This version went live to all users in January 2019.

The design of Scratch is based on constructionist design metaphors in a visual environment where users drag, drop, and snap together programmable LEGO-style blocks on the screen (Resnick et al. 2009). An important part of the Scratch website is its large community of users who comment on and support each other's work, along with a project gallery that includes a "remix" button for each project, enabling users to look inside projects, build upon and edit them to make custom versions, and copy blocks of code known as scripts into their "backpacks" for later use. While Scratch is used to create all kinds of media ranging from simulations to animations to games, it comes with a set of sound blocks and advertises its capability for creating music. In fact, the name "Scratch" derives from a DJ metaphor: a user may tinker with snippets of code created by herself and others, similar to how a DJ scratches together musical samples (Resnick 2012). Scratch is also widely used to make music in classrooms and homes around the world, often through hardware such as the Makey Makey. Middle school music educator Josh Emanuel, for example, has posted comprehensive documentation on building and performing Scratch instruments with his students in Nanuet, New York.

Scratch is accessible to novice coders in settings ranging from elementary school through college-level introductory coding classes, as well as to curious adults. Recently, in an undergraduate course entitled Creative Learning Design, the authors assigned a project in which a diverse set of undergraduate students designed and built music-making environments in Scratch, followed by user testing sessions with local first and seventh graders.[2] The music component of Scratch has been used to teach computational thinking in an interdisciplinary general education Sound Thinking class, co-designed and taught by Alex Ruthmann, at the University of Massachusetts Lowell (Greher and Heines 2014). The instructors of the course covered topics such as iteration, boolean operators, concurrency, and event handling through building music controllers, generating algorithmic music, and defining structures for short compositions (2014, 104–131). They chose Scratch as the music programming environment for their course as it "makes the threshold of entry into the world of programming very low" and "includes some great music functionality," such as the ability to "play a number of built-in sounds, or sounds stored in mp3 files" and, more importantly, "its ability to generate music using MIDI" (2014, 104–105).


Scratch has a massive user base, with over 33 million registered users and over 35 million projects shared across its platform to date. In supporting this wide audience, the Scratch team intends to put forward an environment with "low floors, high ceilings, and wide walls," that is to say, a low barrier of entry with ample room to grow and pursue unique interests (Resnick et al. 2009). However, when it comes to music projects, we believe Scratch has limitations. In terms of easily coding and playing pre-recorded and user-created audio files, one finds low floors. Yet users wishing to create music by sequencing musical notes, or with engaging sounds and instruments, often face high floors, low ceilings, and narrow walls due to complex numeric music mappings and mathematical representations, as well as data structures that limit musical expression.

In this work, we present and critique three major challenges with the implementation of music and sound blocks in Scratch 2.0. First, the functionality of music blocks is immediately accessible, but designed to play sounds at the level of "musical smalls" (i.e., the "atoms" or "phonemes" of music, such as an individual note, pitch, or duration) vs. "musical simples" (i.e., the "molecules" or "morphemes" of music, such as motives and phrases) (Bamberger and Hernandez 2000). Second, arising from Scratch's bottom-up programming style (i.e., building music code block by block, and note by note, from a blank screen), the act of realizing musical sequences is tedious, requiring a deep mathematical understanding of music theory to (en)code music. Third, and perhaps most challenging to the end user for music, is a timing mechanism designed to privilege animation frame rates over audio sample-level accuracy. As illustrated by numerous Scratch projects and our own task breakdown, a user must take extra, often unintuitive steps to achieve adequate musical timing for basic musical tasks such as drums grooving together in time, or melodies in synchronization with harmony. For the balance of this article, we analyze the design constraints of the music and sound blocks, the user experience and implications of using them in the Scratch environment, and finally the quality of the audio they produce. We conclude with a preview of new music block design ideas that aim to better address music making and coding in the Scratch environment for musical and coding novices.

Music Functionality in Scratch 2.0

The music capabilities of Scratch 2.0 center around the Sound blocks (Figure 1)—audio primitives that enable users to trigger individual sounds, notes and rests, and to manipulate musical parameters such as tempo, volume and instrument timbre.

13 purple puzzle piece-shaped Scratch blocks arranged vertically. Blocks 1-3 correspond to audio files. Blocks 4-7 correspond to drums and pitches. Blocks 8-13 correspond to volume and tempo.
Figure 1. Blocks included in the Sound category in Scratch.

Scratch makes the first steps in crafting sounds and music at the computer easy and accessible. Unlike many audio programming languages, it provides a learner immediate audio feedback with a single block of code that can be clicked and activated at any point. Other music coding environments often require more complex structures to play a sound, including instantiating a new instrument, locating and loading a sample, and turning audio output on or running the program before producing a result (Figure 2).

Three screenshots of code side by side. Left: a single Scratch block with the text 'play sound meow.' Center: four grey Max MSP blocks. Right: two lines of Javascript code.
Figure 2. Playing a sound “meow” in Scratch (left) uses only one block. It is more complex in Max/MSP (center) and Tone.js (right).

Scratch supports three types of audio playback—sounds (audio files), MIDI drums, and MIDI pitched instruments. A library of free audio files is included in the Scratch environment, and sounds are linked intuitively to relevant sprites (e.g. when the Scratch cat is chosen, its default sound option is “meow”). Users can easily upload and record new sounds by clicking on the “sounds” tab. General parameters such as instrument, tempo, and volume, are set locally for each sprite.

Unfortunately, the three types of audio playback in Scratch lack consistency. In the design of novice programming languages, consistency means that "the language should be self-consistent, and its rules should be uniform" (Pane and Myers 1996, 55). Setting a tempo affects pitched instruments and drums, but not audio files (i.e., sounds). This makes sense for games and animations where strict rhythmic timing is unnecessary, e.g. characters communicating with each other. But if users wanted to load in their own musical sounds to create a sampler, they would need to use a combination of "play sound" blocks and "rest" blocks to emulate tempo. The syntax "play sound until done" is also confusing to new learners and unique to sound playback. Really this line means "do not move on until the sound has finished," and it results in timing derived from the current sound's length.[3]
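The arithmetic a user must internalize to emulate tempo for audio files is simple but easy to get wrong. A minimal sketch in JavaScript (the function names are ours, purely illustrative, and not part of Scratch):

```javascript
// Convert a duration in beats to seconds at a given tempo (BPM).
// At 120 BPM, one beat lasts 60 / 120 = 0.5 seconds.
function beatsToSeconds(beats, bpm) {
  return (beats * 60) / bpm;
}

// To make a raw audio file "obey" a tempo, a user must follow each
// "play sound" block with a rest sized to fill out the rest of the beat.
function restAfterSample(sampleLengthSeconds, beats, bpm) {
  return Math.max(0, beatsToSeconds(beats, bpm) - sampleLengthSeconds);
}
```

A 0.3-second sample meant to occupy one beat at 120 BPM therefore needs a 0.2-second rest after it; change the tempo, and every such rest must be recalculated by hand, which is exactly the bookkeeping the tempo mechanism spares pitched instruments and drums.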

While pitched instruments and drums use a consistent timing mechanism, they differ in that "play drum" requires a percussion instrument as an argument, while "play note" requires not an instrument but a pitch. Following a real-world metaphor of musicians playing notated music, this distinction is appropriate (Pane and Myers 1996), as separate percussion instruments are often notated on different lines of the same staff, just like pitches, and a single percussionist is expected to play multiple instruments just as an instrumentalist is expected to play multiple notes. However, the "set instrument" function constrains sprites to a single instrument. It is often a challenge to teach that a single sprite can play any number of percussion sounds in parallel with a pitched instrument but is limited to one pitched instrument.[4] A more consistent design might treat a drum like any instrument, requiring it to be set for the entire sprite while providing smart selection/labelling to indicate the resulting drum sound.[5]

The design of the "play note" and "play drum" blocks is challenging in other ways. For one thing, the vertical, numeric representation of pitches and instruments conceals melodic content that is more easily visualized with other representations such as traditional score notation or pitch contour curves. A learner has to memorize, for example, that pitch 64 refers to "E," that drum 1 is the snare drum, and that instrument 9 is the trombone. Rhythm is also expressed numerically, as fractions or multiples of the beat rather than in traditional formal music terms like "sixteenth notes" and "quarter notes." While this notation affords a more fundamental conception of timing and permits unusual rhythmic combinations (1.65 beats against 1.1 beats), it constrains the types of rhythms that can be expressed exactly; triplet eighth notes, for example, must be rounded down to 0.33 of a beat. Depending on what one is teaching about music, a notation-based rhythm representation may be more accessible than decimal fractions of beats.
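Both complaints (opaque pitch numbers and inexact decimal rhythms) can be made concrete with a short sketch. The mapping below follows the standard MIDI convention in which note 60 is middle C; the helper name is ours:

```javascript
// Decode a MIDI note number into a familiar note name.
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

function midiToName(midi) {
  const octave = Math.floor(midi / 12) - 1; // MIDI convention: 60 -> C4
  return NOTE_NAMES[midi % 12] + octave;
}

// midiToName(64) returns 'E4': the learner must memorize what the block hides.

// Decimal beats cannot express triplets exactly: a triplet eighth note
// is 1/3 of a beat, but a user can only type a rounded value like 0.33.
const errorPerNote = 1 / 3 - 0.33;         // 1/300 of a beat per note
const driftPerMeasure = errorPerNote * 12; // twelve triplet eighths in 4/4
```

After only a few measures of triplets, that rounding error accumulates into audible misalignment against any part written in exact beat values.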

The quality of the audio output may limit users' expressiveness and may be a major ceiling on creativity for young users. Pitches and drums are played via a Flash soundbank. While Scratch provides a large variety of drums and pitched instruments, they vary in quality and are output as mono sounds. High-quality audio, such as an instrument sample library, takes up large amounts of storage space and internet bandwidth to download; this trade-off between sound quality and load speed limits many web-based music and sound environments. Because Scratch 2.0 does not support synthesizing new sounds or uploading new MIDI instrument libraries, users are less able to pursue their own unique musical instruments. In contrast, EarSketch uses a library of loops created by professional DJs and producers, which its developers write contributes to an experience that is personally meaningful and deeply authentic (Freeman et al. 2014). In Scratch, the lack of high-quality audio, in combination with the other challenges discussed, results in many users bypassing the instrument blocks entirely in favor of using audio samples and uploading long audio files that play on repeat. For example, in one Scratch forum post entitled "Add more instruments," one response to the poster's request states, "Why can't you download music? It's better."

Learning Music Coding/Coding Music in Scratch

Coding in Scratch is a bottom-up process. While the developers argue this style of constructionist learning supports exploration, tinkering, and iterative design (Resnick et al. 2009), and Scratch's remix feature lessens the load by allowing many users to begin from an existing project rather than a blank canvas, critics have pointed out that it may lead to bad programming habits such as incorrect use of control structures and extremely fine-grained programming in which programs consist of many small scripts that lack coherency (Meerbaum-Salant, Armoni, and Ben-Ari 2011). For our purposes, we must interrogate whether Scratch's bottom-up programming style is a useful method for composing music. For one thing, bottom-up programming takes time. Each time a new block is added to the screen, a user must click and drag, select a new pitch/instrument and duration, and often rearrange existing blocks. Music scripts are long, and given a finite amount of screen space they quickly become unwieldy, as even simple tunes like "Twinkle Twinkle Little Star" have tens if not hundreds of notes. One might argue that hard-coding all of the pitches, as one might in a DAW, defeats the purpose of using a programming environment to represent music. It is true that a powerful exercise in Scratch is to represent a tune with as little code as possible using loops for repeats, messages to trigger sections, and data structures to store groups of notes. In formal settings, the tedium required to code a tune block by block opens up the opportunity to learn more efficient methods of coding (Greher and Heines 2014). However, abstraction is a difficult concept for novice programmers to internalize, and problem-solving usually occurs first at the concrete level.
Collecting the results of multiple studies, Pea, Soloway, and Spohrer (2007) note that "there is a well-known tendency of child Logo programmers to write simple programs using Logo primitives rather than hierarchically organized superprocedures that call other procedures even after examples of superprocedures and discussions of their merits for saving work have been offered" (24). Similarly, in Scratch, beginning music coders must slog through dragging blocks and blocks of code before realizing even a simple melody, let alone a multi-part composition (Figure 3).

Large, complicated structure of more than 40 Scratch sound blocks connected vertically. The top block starts the music. Each following block indicates the pitch and duration of a single note.
Figure 3. “Twinkle, Twinkle Little Star” requires a large number of “play note” blocks in Scratch.

A more fundamental argument against bottom-up music coding can be drawn from Jeanne Bamberger’s work Developing Musical Intuitions (2000). In it, she points out that conventional music instruction, like Scratch, begins with the smallest levels of detail—notes, durations, intervals, etc. She compellingly makes the case that this approach ignores the intuitions we already possess as the starting point for musical instruction, e.g. our love of “sticky” melodies and pleasure in finding and moving to danceable grooves. Conventional instruction asks us to begin our educational work with the smallest musical components, often taken out of context, rather than those that our prior experience and developed intuitions with music may have provided.

Bamberger's curriculum, conversely, begins with the middle-level concept of melodic chunks. The accompanying learning tool Impromptu presents musical building blocks called "tuneblocks" that are hierarchically grouped into entire tunes, phrases, and motives. The user begins by listening to the various "chunks" of melodies, and arranges and rearranges them to create or re-create a composition. The user is not working initially at the level of individual pitches and rhythms, but instead at a higher, more intuitive level of abstraction. Students begin by learning that the base units of a composition are melodies and motives, rather than pitches or frequencies. It is not until Part 3, entitled "Pitch Relations," that Bamberger zooms in, introduces intervals, and assigns students to uncover and reverse engineer scales from melodies. She effectively reverses the order of traditional music theory, teaching children how to theorize with music rather than accept it as a set of bottom-up, well-structured rules. This approach is the reverse of the way many people approach teaching coding in Scratch (and many other languages), but it is congruent with the original design intention behind Scratch, in which users would first browse and play games created by other users in the Scratch community, then click the "see inside" button to remix others' code as the starting point for learning.

Examined from a music technology angle, Freeman and Magerko (2016) classify educational music programming languages by the layer of musical interaction they afford, and thus the form of musical thinking they prescribe: subsymbolic programming languages prioritize synthesis and control of time at an extremely low level; symbolic languages offer editing at the individual note level, usually via MIDI; hierarchical languages deal with musical form and allow loops or motives to be layered and remixed.[6] The most powerful of the bunch are clearly the subsymbolic languages, offering near-limitless flexibility and control for expert and professional coders. Yet the manipulation of low-level processes requires patience, determination, and a great deal of time and experience to build up the knowledge to implement in a practical setting.[7] We have observed that novice Scratch users sometimes get hung up and frustrated after the first few short programs they write, never experiencing a moment of true self-expression and failing to develop the knowledge and motivation to progress further.

Hierarchical languages do not always support lower-level sound editing, meaning that advanced users will inevitably reach a ceiling on what they can express with the tool. (EarSketch, for example, allows the programming of beats but disallows modifying single pitches within a melody.) However, this constraint is not necessarily a bad thing. Educational programming languages should not serve as the end-all-be-all, but rather as a meaningful stage in a learner's development. Maloney et al. (2010) write that their intent with Scratch is to "keep [their] primary focus on lowering the floor and widening the walls, not raising the ceiling," noting that it is important for some users to eventually move on to other languages (66). Of course, this growth step should not be taken out of desperation, but rather in search of new media that afford more creative potential after one has undergone multiple positive experiences.

Ideally, users of a music programming environment should have the opportunity to compose entire pieces and/or perform for others using custom instruments/interfaces, all while encountering code challenges that push their critical thinking skills along the way. Through this process, they can begin to theorize about and through code using musical materials and ideas they care about, testing and developing their musical intuitions and hunches as part of their creative process. Environments like EarSketch and Hyperscore (Farbood, Pasztor, and Jennings 2004) offer innovative, hierarchical methods of interaction that demonstrably achieve these goals, but we believe there is still unrealized potential for environments and educators to support parallel artistic and computational growth and then scaffold the transition into more advanced and powerful environments. (Hyperscore is not a programming language but a unique computer-assisted composition system that enables users to control high-level structures, such as composition-length harmonic progressions, and lower-level structures, such as motivic gestures, all through drawing and editing curves.)

Inconsistent Timing in Scratch Music Playback

The Scratch music implementation contains a tempo setting that is used to specify the duration of notes, drums, and rests. Even though tempo is key to almost all music-making, many other audio programming languages require a user to time playback in second- or millisecond-long durations. One obvious application in which tempo is useful is creating rhythms or drum beats. Consider a popular Scratch drum machine developed by user thisisntme. Both thisisntme's code and many thoughtful comments indicate exciting possibilities for making music in Scratch.[8] Other comments indicate that the app is imperfect, especially in regard to rhythm. For example, user Engin3 writes, "only problem to me (that i doubt can be fine tuned [sic], is the timing of the repeat. It feels a bit off to me." The developer thisisntme writes in the instructions, "TO MAKE THE BEATMAKER MORE ACCURATE, ENABLE TURBO MODE AND REDUCE TEMPO WITH DOWN ARROW." These comments are indicative of a much more serious problem in Scratch: the lack of accurate timing. In the rest of this section, we describe the buggy path to building a simple drum loop consisting of hi-hat, kick drum, and snare drum. Consider a challenge to implement the drumbeat of Michael Jackson's "Billie Jean" (Figure 4).

One measure of music represented with two different approaches to notation. Top: Traditional notation showing eight hi-hat eighth notes and alternating kick and snare quarter notes. Bottom: Piano roll notation where notes are represented as filled in squares on a grid.
Figure 4. “Billie Jean” drum pattern in traditional and piano roll notation.

An initial solution to this problem requires listening to determine the relative rates of each drum and envisioning three musical lines running in parallel. Learners would need to understand loops, as well as implement a solution to trigger the three instruments to start at the same time. A Scratch representation of all the notes above looks as follows:

Four groups of Scratch blocks that play the rhythm in Figure 4. The first group sets the tempo to 120 bpm. Groups 2-4 trigger hi-hat with eight blocks, kick drum with two blocks, and snare drum with four blocks respectively. Sounds are set to loop forever.
Figure 5. “Billie Jean” Solution 1 representing all notes in measure.
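Outside of Scratch, the structure of Figure 5 can be modeled as three independent loops over a shared beat grid. The sketch below (our own idealized model in JavaScript, not Scratch code) computes each part's onset times under the assumption of perfect timing; the three lists line up exactly on the grid, which is what the Scratch version fails to guarantee:

```javascript
// Each line is one measure: a list of [isHit, beats] pairs,
// where rests (isHit = 0) advance time silently.
function onsets(line) {
  const times = [];
  let t = 0;
  for (const [isHit, beats] of line) {
    if (isHit) times.push(t);
    t += beats;
  }
  return times;
}

const hiHat = onsets(Array(8).fill([1, 0.5]));          // eighth notes: 0, 0.5, ..., 3.5
const kick  = onsets([[1, 2], [1, 2]]);                 // beats 1 and 3 -> [0, 2]
const snare = onsets([[0, 1], [1, 1], [0, 1], [1, 1]]); // beats 2 and 4 -> [1, 3]
```

In this model every kick and snare onset coincides with a hi-hat onset; any deviation from that shared grid is what listeners perceive as the loop "falling apart."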

A follow-up challenge adapted from the Sound Thinking curriculum (Greher and Heines 2014, 109–110) may involve asking students to implement a simplified version of the groove using as few blocks of code as possible. This approach teaches students to look for patterns and opportunities to more efficiently code the music, exploring computational concepts of repeats, loops, and variables.

Identical layout of Scratch code as in Figure 5 with reduced code. Only one hi-hat block, one kick-drum block, and two snare drum blocks are used.
Figure 6. “Billie Jean” Solution 2, with reduced code correctly recreating the music in Figure 5.

When played back, the above solutions drift away from each other, falling increasingly out of time. Individually, each of the drum parts sounds okay, but executed together in parallel they sound cacophonous, as if three musicians were asked to play together wearing blindfolds and earplugs. In an informal experiment with undergraduates in the NYU Music Experience Design Lab, each student who encountered the problem unsuccessfully attempted a different fix. One tried explicitly setting the tempo in each group, as if each loop required a constant reminder or reset of how fast it is supposed to play. Another tried adjusting the beat values to approximations of the beat like 0.99 and 1.01, attempting to account for perceived differences in playback speed across the drums but actually augmenting the differences in tempo.

Similar layout of Scratch code as Figure 6, but with a 'set tempo' block inserted into each instrument.
Figure 7. “Billie Jean” attempted solution by setting tempo at the start of every loop.

In reality, the problem is beyond the user’s control, stemming from the fact that Scratch is optimized for animation frame-rate timing rather than the higher-precision timing needed for audio and music. Greher and Heines (2014) write that Scratch 1.4 intentionally slows down processing so that animations do not run too fast on modern computers, but “what’s too fast for animations is sometimes too slow for music, where precise timing is necessary” (112). In Scratch 2.0, the problem may be compounded by its Flash implementation, in which background processes within and outside the web browser can interrupt audio events, pushing music and sound out of sync. We should note that timing on the web is a longstanding challenge: before the Web Audio API, programmers relied on imprecise functions like setTimeout() and Date.now(). Even though the Web Audio API exposes the computer’s hardware clock, its implementation is inflexible and often misunderstood (Wilson 2013). Ideally, a novice programming environment like Scratch would address these timing challenges so that users can write code that produces intuitive musical results.
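The accumulation of timing error can be illustrated outside of Scratch. The Python sketch below is a simulation, not Scratch’s actual implementation: it assumes an illustrative 10 ms of overhead per scheduled event and contrasts scheduling each beat relative to the previous one (where the error compounds, as in Scratch’s loops) with scheduling against an absolute clock (the approach recommended for the Web Audio API, where the error stays bounded).

```python
BEAT = 0.5      # seconds per beat at 120 BPM
JITTER = 0.01   # assumed 10 ms of overhead per scheduled event (illustrative)

def relative_schedule(n_beats):
    """Each event is timed relative to the previous one, so error accumulates."""
    t, times = 0.0, []
    for _ in range(n_beats):
        times.append(t)
        t += BEAT + JITTER      # the jitter compounds on every iteration
    return times

def absolute_schedule(n_beats):
    """Each event is computed from the start time, so error never compounds."""
    return [i * BEAT + JITTER for i in range(n_beats)]

print(round(relative_schedule(16)[-1] - 15 * BEAT, 3))  # 0.15: 150 ms late after 16 beats
print(round(absolute_schedule(16)[-1] - 15 * BEAT, 3))  # 0.01: within a single event's jitter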

There are a few “hacks” to work around these timing issues in Scratch. Toggling the “Turbo Mode” button helps, but it merely executes code blocks as fast as possible, which does not prevent timing issues. Discussions from users in the Scratch online forums reveal that they are also perplexed about what the mode actually accomplishes. An algorithmic solution to improve musical timing involves conceptualizing one of the musical lines as the leader or conductor. Rather than each of the lines looping and gradually getting out of time relative to each other, one line loops and signals the others to follow suit:

Similar layout of Scratch code as Figure 6. The hi-hat grouping now includes a block that states 'broadcast message start.' The snare and kick groupings include corresponding blocks within the text 'When I receive start.'
Figure 8. “Billie Jean” solution 3 with more accurate timing.

Of course, this solution is less readable and not immediately intuitive to a novice coder. It is also more viscous (Pane and Myers 1996), since the user must always ensure the “conductor” script is equal in length to or longer than all other scripts. Changes to the “conductor” script may affect playback of the other scripts, so the user must check multiple scripts after making a change. More importantly, it is an incomplete solution. As each line moves forward in its own loop, it will get out of time relative to the other loops before snapping back in time once the “start” message is broadcast, causing a feeling of jerkiness. As loops lengthen, the effect becomes more dramatic. A slightly better solution requires creating a single forever loop that triggers all of the drums at the correct moments in time. Unfortunately, this solution ignores the durations included in each “play drum” call, since rests are necessary to separate the asynchronous “broadcast” calls. While it prevents these few music loops from getting out of sync with one another, it does not solve the fundamental problem of inaccurate timing: the entire rhythm can still drift musically.

Drastically changed version of Scratch code. A longer grouping of Scratch blocks on the left represents the full rhythm with a combination of broadcast and rest blocks. Three groupings on the right receive broadcast messages and trigger corresponding drums.
Figure 9. “Billie Jean” solution 4 with slightly more accurate timing.
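The logic of Solution 4, a single conductor loop that owns all of the timing, can be sketched in Python. The groove below is our reconstruction of the simplified “Billie Jean” pattern (hi-hat on every eighth note, kick on beats 1 and 3, snare on beats 2 and 4), and the event list stands in for Scratch’s “broadcast” calls:

```python
GROOVE = {                      # one bar, notated in eighth notes (1 = hit, 0 = rest)
    "hihat": [1, 1, 1, 1, 1, 1, 1, 1],
    "kick":  [1, 0, 0, 0, 1, 0, 0, 0],
    "snare": [0, 0, 1, 0, 0, 0, 1, 0],
}

def run_bar(groove, bpm=120):
    """One conductor loop computes every trigger time, so parts cannot drift apart."""
    events = []
    step = 60.0 / bpm / 2       # an eighth note is half a beat
    for i in range(8):
        t = i * step            # absolute time within the bar
        for drum, pattern in groove.items():
            if pattern[i]:
                events.append((round(t, 3), drum))   # stand-in for "broadcast <drum>"
    return sorted(events)

for t, drum in run_bar(GROOVE):
    print(t, drum)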

One may argue that because at least one solution exists, the programming environment is not at fault, and further, that all programming languages have affordances and constraints unique to their implementations. Still, we hope that a future version of Scratch, or a new extension mechanism, will address these timing issues so that novice musicians and coders can use the Scratch paradigm to create musically synchronized code for their creative projects. This is in part because new programmers are often less attuned to what, precisely, to look for when they encounter unexpected results. Pea, Soloway, and Spohrer (2007) write that “since the boundaries of required explicitness vary across programming languages, the learner must realize the necessity of identifying in exactly what ways the language he or she is learning ‘invisibly’ specifies the meaning of code written” (25). To solve this problem, users must learn to mistrust Scratch’s implementation of tempo and work around the timing issues by using their ears. Our solutions can thus be thought of more as “hacks” than as useful exercises in building musical computational thinking skills.

Consider the same challenge executed in another visual programming environment, Max/MSP. Max/MSP, as noted above, is not nearly as immediately accessible and does not abstract away as much music functionality. As a result, more code blocks are necessary to accomplish the same tasks. First, there is no library of built-in drum sounds, so the user must devise his/her own method to synthesize or play back the drums. In the solution below (Figure 10), local audio files for each drum type are loaded into regions of Max/MSP memory called buffers. Unlike Scratch’s sound playback, which requires one object, Max/MSP’s sound playback requires a minimum of four objects: a buffer to store the audio file, a play~ object to play it, a start button to trigger playback, and an audio output to ensure sound reaches the user’s speakers. Second, rather than specifying a global tempo in beats per minute (BPM), Max/MSP includes a metro object which produces output at regular intervals specified in milliseconds; it is up to the user to convert BPM into the correct note durations. All that said, the Max/MSP program uses the same algorithm as the original Scratch solution but works as expected: the drum beats sound in time with each other.
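The conversion Max/MSP leaves to the user is small but unforgiving: milliseconds per note value at a given tempo. A quarter note at a tempo of B beats per minute lasts 60000 / B milliseconds. A minimal helper (the function name is ours):

```python
def note_ms(bpm, beats=1.0):
    """Duration in milliseconds of `beats` quarter notes at `bpm`."""
    return 60000.0 / bpm * beats

print(note_ms(120))        # quarter note at 120 BPM: 500.0 ms
print(note_ms(120, 0.5))   # eighth note at 120 BPM: 250.0 ms
print(note_ms(120, 2.0))   # half note at 120 BPM: 1000.0 ms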

Recreation of Scratch code in Figure 6 with the visual programming language Max MSP. Each grouping is much more complicated with additional required calculations.
Figure 10. Max/MSP Implementation of “Billie Jean” drumbeat.
[archiveorg ScratchBeats width=640 height=480 frameborder=0 webkitallowfullscreen=true mozallowfullscreen=true]
Figure 11. Four audio examples of the Billie Jean beat as discussed: First, Solution 1 in Scratch where the instruments drift. Second, Solution 1 implemented in Max MSP with accurate timing. Third, Solution 3 in Scratch where one instrument broadcasts “start” to other instruments. Fourth, Solution 4 in Scratch where the beat is notated as a single sequence using “broadcast” and “rest” functions.

New Opportunities for Music Blocks in Scratch

As we have shown, music making in Scratch may be limited by the level of representation coded into the blocks, users’ differing motivations in using and reading the blocks, and inaccurate timing that affects music playback. However, there are ways forward that may enhance music making through redesigning the music and sound blocks in Scratch. Scratch’s accessibility as a browser-based environment, its wide and diverse user community, and its insistence on making all code and assets shared on its platform copyable and remixable create a democratized space for children to explore music making and learning. Imagine if young people could not only use the web as a tool to listen to music and look up song lyrics, but actually step inside and remix the creations of their peers. With a new extension mechanism due to be implemented in Scratch 3.0, educators and more advanced coders will be able to design their own blocks to extend or modify the functionality of Scratch, leading towards a more open and customized coding environment.

As described in the wiki entry for Scratch 3.0, there are many notable changes. Scratch 3.0 only supports sound blocks in the default blocks interface, while music blocks, such as “play note” and “set tempo,” must now be loaded via an extension. This separation addresses our concerns about consistency by formally separating how sound and MIDI music blocks are used.[9] However, the issues of steady musical timing and of representing music as “musical smalls” remain. As other developers step in to add music functionality, it is important that they explore ways to avoid the pitfalls of the current music implementation.

One exciting development is the ScratchX + Spotify extension for the ScratchX experimental platform.[10] This extension interfaces with the Spotify API through only a few powerful Scratch blocks, while addressing many of the issues we discussed above.[11] Rather than using low-level notes and beats as musical building blocks, it loads thirty-second-long clips of songs streamed instantly from Spotify. It abstracts away many of the complexities of using the API while including exciting features, such as chunking a song based on where Spotify’s algorithm believes the beat to be, and triggering animations or motions based on cues timed to structures within the music. Crucially, the ScratchX + Spotify website provides starting points for using the extension, such as one app making sprites “dance” and another letting a user remix a song. Users can access a song or piece of music that is personally meaningful to them and begin to code, play, and experiment directly. They can work directly with musically simple chunks of music they care about, in turn developing the curiosity that lays the foundation for meaningful exploration of lower-level musical details and code at a later time. This is a powerful idea and a positive step towards making music coding projects more accessible and motivating to young coders.

New Design Ideas for Music Blocks in Scratch

In the NYU Music Experience Design Lab (MusEDLab), we have been experimenting with new block designs for music using Scratch 2.0’s “Make a Block” functions, as well as ScratchX’s extension platform. In beginning work on new Scratch 3.0 music extensions, we are intimately aware of the challenges in designing for novice programmers and in representing music in code. To move beyond music blocks based on the musical smalls of “play note” and “play drum,” we propose creating new blocks modeled at a higher level of musical representation, closer to the “musical simples” that Bamberger and Hernandez (2000) advocate. This work builds on the creation of new individual blocks that perform chords or entire sequences of music, experimenting with possible syntaxes for creative interaction, exploration, and play.[12]

As a first step, we are prototyping new functionality to launch various MusEDLab web-based apps via blocks within Scratch, such as the circular drum sequencer Groove Pizza, and then return the audio or MIDI output for use in Scratch projects. This lets users play with and create rhythmic groove “simples” in an interface designed for musical play. When they are finished creating their groove, it is already embedded in a new “play groove” Scratch block for use and further transformation through code.

In this prototype block (Figure 12), the user clicks on the yellow Groove Pizza icon on the “play groove” block, which pops up a miniature 8-slice Groove Pizza in the user’s browser. The user clicks to create a visual groove in the pop-up pizza, and the “play groove” block is virtually filled with that 8-beat groove. The user can then use this block in their Scratch project to perform the rhythmic groove. If the user wants to speed up or slow down the groove, they can modify it with an existing “set tempo” or “change tempo” block. This approach brings the musical creation and playful coding experiences closer together within the Scratch 3.0 interface.

Single Scratch block with the text 'play groove' and an image that can be clicked on to open up the Groove Pizza app.
Figure 12. MusEDLab prototype block creating and playing a Groove Pizza groove in Scratch 3.0.

Curious users of the MusEDLab apps who are interested in learning more about the computational thinking and coding structures behind them may already launch, step inside, and remix simplified versions created in Scratch, such as the aQWERTYon and Variation Playground apps. While novice coders may not initially understand the JavaScript code behind our apps when clicking “View Source” in their browser, they can get a middle-level introduction to our approaches by exploring and remixing the simplified versions of these apps in Scratch.

Another approach we’ve taken is to utilize Scratch’s “Make a Block” feature to create custom music blocks such as “play chord” and “play sequence.” In this approach, the user can copy our blocks and use them without having to know how the inner coding works. Users who are curious can always take a deeper look at how the blocks, built from musical and computational simples, were coded, because in this implementation the structure behind the custom blocks is left intact.

Side by side comparison of traditional and experimental Scratch blocks. Detailed description in article text below.
Figure 13. Left: Scratch 2.0 blocks necessary to play a single chord; Right: MusEDLab experimental music blocks for playing a chord progression in ScratchX.

The leftmost block of code in Figure 13 presents a common method in Scratch 2.0 for performing a musical triad (a C major chord) when the space key is pressed on the computer keyboard. This code requires three instances of the “when space key pressed” block, as well as the user knowing the MIDI pitch numbers that build a C major chord: C equals MIDI note 60, E equals MIDI note 64, and G equals MIDI note 67. This implementation takes six Scratch blocks and presupposes basic music theory and MIDI knowledge.[13] To lower the floor and widen the walls for chord playback, we implemented a new “play chord” block where users can specify the chord root and chord quality directly. These blocks can then be sequenced to form a chord progression. This approach encapsulates music at a higher level than individual notes and rhythms: the level of the “musical simple” of a chord. While there is a time and place for teaching that chords are made up of individual notes, a “play chord” block enables novices to more easily work with chords, progressions, and harmony.
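What a “play chord” block computes internally can be sketched in a few lines of Python. The interval table and function name below are our own illustration, not Scratch code; they simply expand a root and quality into the same MIDI notes a user would otherwise have to know by heart:

```python
# Semitone intervals above the root for a few common triad qualities
# (an illustrative subset, not an exhaustive chord vocabulary).
QUALITIES = {"major": (0, 4, 7), "minor": (0, 3, 7), "diminished": (0, 3, 6)}

def chord_notes(root_midi, quality):
    """Expand a root MIDI note and a quality name into the notes of a triad."""
    return [root_midi + interval for interval in QUALITIES[quality]]

print(chord_notes(60, "major"))   # C major: [60, 64, 67]
print(chord_notes(60, "minor"))   # C minor: [60, 63, 67]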

Another approach to a simpler music block is illustrated in Figure 14. To perform a melody or sequence of musical notes or rests, a Scratch 2.0 user needs a “play note” block for each note in the melody. The “play seq” block instead enables the user to input a sequence of numbers (1–9, plus 0 for a 10th note) or hyphens for musical rests to represent and perform a melodic sequence. The variable “major” indicates that the sequence is to be mapped onto a major scale starting on a root note of 60, or middle C; the first number, 1, maps onto the root pitch. With this block, the melodic sequence can be transposed, its rhythmic durations set as fractions of whole notes, or the whole sequence mapped to different musical scales, all within the same block. When used in a live-coding approach, the sequence within a block running in a repeat or forever loop can be adjusted to update the melody on the fly.

Comparison of traditional and experimental Scratch blocks. Detailed description in article text above.
Figure 14. Top: sequence of notes coded in Scratch 2.0; Bottom: experimental music block for playing an identical sequence.
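The decoding that a “play seq” block must perform can also be sketched in Python. The parsing logic below is our reconstruction from the description above (digits index scale degrees, “0” stands for the 10th degree, “-” is a rest), not the block’s actual source:

```python
MAJOR = [0, 2, 4, 5, 7, 9, 11]   # semitone offsets of the major scale degrees

def seq_to_midi(seq, root=60, scale=MAJOR):
    """Decode a digit/hyphen string into MIDI notes (None marks a rest)."""
    notes = []
    for ch in seq:
        if ch == "-":
            notes.append(None)                       # hyphen: a musical rest
        else:
            degree = 10 if ch == "0" else int(ch)    # "0" stands for the 10th note
            octave, step = divmod(degree - 1, len(scale))
            notes.append(root + 12 * octave + scale[step])
    return notes

print(seq_to_midi("1353-1"))   # [60, 64, 67, 64, None, 60]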

The prototype blocks described above provide just a few ideas for how to make it easier for novices to code and explore musical concepts in Scratch. Our approach focuses on designing Scratch blocks for music that are modeled after Bamberger’s notion of musical simples rather than musical smalls, potentially enabling a new pathway for children to intuitively and expressively explore creating and learning music through coding. Given its huge user base among school-age children, there is a great opportunity for the Scratch environment to be not only a platform for teaching coding and computational thinking, but also one for musical creativity and expression. The affordances of a well-designed, wide-walled coding environment provide opportunities for children to “tinker” with the musical intuitions and curiosities they already have, and to easily share those with their peers, teachers, and families. While there are many technical and design challenges ahead, and no environment can be perfect for everyone, we look forward to seeing how developers will extend the Scratch environment to promote lower floors, higher ceilings, and wider walls for creative music and sound coding projects for all.


[1] The recent book, New Directions for Computing Education: Embedding Computing Across Disciplines (2017) provides a good starting point for reading about how computer science may be taught in broader contexts. One article by Daniel A. Walzer is particularly relevant as it looks at CS in the context of a Music Technology program (143).

[2] The website for the course may be found at http://creativelearningchina.org/. Some of the Scratch projects created in the course may be found in the Creative Learning Design Scratch Studio. Broadly in their reflections, our students found that the experiences they came up with were fun for first graders, but overly simplistic for seventh graders who were quite critical and were prepared to implement their own remixes!

[3] The new version of Scratch (3.0) changes the syntax for audio sounds: the old “play sound” block is now “start sound,” more accurately indicating that this block simply starts the sound without waiting for it to finish before moving on to other blocks in a particular stack.

[4] We often use a pedagogical metaphor of a sprite as a musician. A clarinetist can only play the clarinet (but can play any number of pitches). A percussionist or drummer has two hands and two feet and can thus play up to four different drum sounds at once in layers. That is, if we are trying to enforce human metaphors; one of the appealing things about computer environments is that they can create music that humans cannot. Many of Alex Ruthmann’s early Scratch experiments are “hacks” that take advantage of overdriving the Flash sound engine.

[5] Tone.js uses a more consistent design (Mann 2015). Even though sample playback objects called “Players” and synthesis objects require different arguments upon instantiation, both essentially produce sound. As a result, they share many methods like “start,” “stop,” “connect,” “sync,” etc.

[6] While programming environments often blur the lines, a good litmus test is to ask what a typical “Hello World” program looks like. Is it generating a sine wave, perhaps mapping mouse position to frequency? Subsymbolic. Is it scheduling a few notes and rests? Symbolic. Is it triggering a loop? Hierarchical.

[7] The first author recalls a conversation with Miller S. Puckette about the degree of functionality that Pure Data, the open source music programming language he actively develops, should have built in, and what should be up to users to implement with its basic constructs. For example, Puckette teaches students that the method to play an audio file stored within a buffer is to determine its size and then iterate through each sample of the buffer at the rate it was recorded at. Only once this is understood does he mention the existence of a simple playback command (sfread~). This is a worthwhile task for the budding music technologists in his course, but not likely for novice musicians who merely want to make some sounds before they have learned about loops and sampling rates.

[8] One of the authors of this work, Alex Ruthmann, has posted a Scratch Live coding session inspired by the composer Arvo Pärt. Another neat example of live coding and pushing Scratch to its timing limits is demonstrated by Eric Rosenbaum.

[9] For example, one user writes, “Mine Is Kick: 1000 0000 1010 1001 Clap: 1010 0110 0010 1000 Hat: 1010 0110 0010 1010 Snare: 1000 0100 0000 0001,” showing an exploration of musical ideas and a willingness to share his/her creative idea with others. The comment also indicates an understanding of boolean logic in which ones indicate onsets and zeros indicate rests. Many others share grooves using identical boolean notation, and most remixes are identical but preprogrammed with a custom groove. Another user posts, “Put all of the lines on for all the instruments at the same time and put it at full volume. You’re welcome,” demonstrating expressivity in pushing an instrument to its limit to hear the result.

[10] Another simple but welcome fix in Scratch 3.0 is that instrument names are included with their numbers and pitch letters are included next to MIDI values making all music code more readable.

[11] The extension is currently available for ScratchX, an experimental version of Scratch created to test and share extensions, while lacking many of Scratch’s community features.

[12] Some early prototypes of these musical blocks may be explored on the Performamatics@NYU Scratch studio page.

[13] An excellent resource for learning the basics of music theory and MIDI through making music in the browser can be found at https://learningmusic.ableton.com.


Aaron, Samuel, Alan F. Blackwell, and Pamela Burnard. 2016. “The development of Sonic Pi and its use in educational partnerships: Co-creating pedagogies for learning computer programming.” In Journal of Music, Technology & Education 9, no. 1 (May): 75-94.

Bamberger, Jeanne. 1979. “Logo Music projects: Experiments in musical perception and design.” Retrieved March 21, 2019 from https://www.researchgate.net/publication/37596591_Logo_Music_Projects_Experiments_in_Musical_Perception_and_Design.

Bamberger, Jeanne Shapiro, and Armando Hernandez. 2000. Developing musical intuitions: A project-based introduction to making and understanding music. Oxford University Press, USA.

Clark, Colin BD, and Adam Tindale. 2014. “Flocking: a framework for declarative music-making on the Web.” In SMC Conference and Summer School, pp. 1550-1557.

Cooper, Stephen, Wanda Dann, and Randy Pausch. 2000. “Alice: a 3-D tool for introductory programming concepts.” In Journal of Computing Sciences in Colleges, vol. 15, no. 5, pp. 107-116. Consortium for Computing Sciences in Colleges.

Farbood, Morwaread M., Egon Pasztor, and Kevin Jennings. 2004. “Hyperscore: a graphical sketchpad for novice composers.” In IEEE Computer Graphics and Applications 24, no. 1: 50-54.

Fee, Samuel B., Amanda M. Holland-Minkley, and Thomas E. Lombardi, eds. 2017. New Directions for Computing Education: Embedding Computing Across Disciplines. Springer.

Freeman, Jason, and Brian Magerko. 2016. “Iterative composition, coding and pedagogy: A case study in live coding with EarSketch.” In Journal of Music, Technology & Education 9, no. 1: 57-74.

Freeman, Jason, Brian Magerko, Tom McKlin, Mike Reilly, Justin Permar, Cameron Summers, and Eric Fruchter. 2014. “Engaging underrepresented groups in high school introductory computing through computational remixing with EarSketch.” In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pp. 85-90. ACM.

Green, Thomas R. G., and Marian Petre. 1996. “Usability Analysis of Visual Programming Environments: A ‘Cognitive Dimensions’ Framework.” In Journal of Visual Languages and Computing 7, no. 2 (Summer): 131-174.

Greher, Gena R., and Jesse Heines. 2014. Computational Thinking in Sound. New York: Oxford University Press.

Guzdial, Mark. 2003. “A media computation course for non-majors.” In ACM SIGCSE Bulletin, vol. 35, no. 3, pp. 104-108. ACM.

Heines, Jesse, Gena Greher, S. Alex Ruthmann, and Brendan Reilly. 2011. “Two Approaches to Interdisciplinary Computing + Music Courses.” Computer 44, no. 12: 25-32. IEEE.

Maloney, John, Mitchel Resnick, Natalie Rusk, Brian Silverman, and Evelyn Eastmond. 2010. “The Scratch Programming Language and Environment.” In ACM Transactions on Computing Education (TOCE) 10, no. 4 (Winter): 16.

Manaris, Bill, and Andrew R. Brown. 2014. Making Music with Computers: Creative Programming in Python. Boca Raton, FL: Chapman and Hall/CRC.

Mann, Yotam. 2015. “Interactive music with tone.js.” In Proceedings of the 1st annual Web Audio Conference. Retrieved March 20, 2019 from https://medias.ircam.fr/x9d4352.

McCarthy, Lauren. “p5.js.” https://p5js.org.

Meerbaum-Salant, Orni, Michal Armoni, and Mordechai Ben-Ari. 2011. “Habits of Programming in Scratch.” In Proceedings of the 16th Annual Joint Conference on Innovation and Technology in Computer Science Education, pp. 168-172. ACM.

Pane, John F., and Brad A. Myers. 1996. “Usability Issues in the Design of Novice Programming Systems.” In School of Computer Science Technical Reports, Carnegie Mellon University, no. CMU-HCII-96-101.

Puckette, Miller. 2002. “Max at seventeen.” In Computer Music Journal 26, no. 4 (Winter): 31-43.

Repenning, Alex. 1993. “Agentsheets: a Tool for Building Domain-Oriented Visual Programming Environments.” In Proceedings of the INTERACT’93 and CHI’93 Conference on Human Factors in Computing Systems, pp. 142-143. ACM.

Resnick, Mitchel. 2012. “Reviving Papert’s dream.” Educational Technology 52, no. 4 (Winter): 42-46.

Resnick, Mitchel, John Maloney, Andrés Monroy-Hernández, Natalie Rusk, Evelyn Eastmond, Karen Brennan, Amon Millner et al. 2009. “Scratch: Programming for All.” In Communications of the ACM 52, no. 11 (November): 60-67.

Roberts, Charlie, Jesse Allison, Daniel Holmes, Benjamin Taylor, Matthew Wright, and JoAnn Kuchera-Morin. 2016. “Educational Design of Live Coding Environments for the Browser.” In Journal of Music, Technology & Education 9, no. 1 (May): 95-116.

Roberts, Charles, Graham Wakefield, and Matthew Wright. 2013. “The Web Browser as Synthesizer and Interface.” In Proceedings of the International Conference on New Interfaces for Musical Expression, Daejeon, Republic of Korea, pp. 313-318. NIME.

Schembri, Frankie. 2018. “Ones and zeroes, notes and tunes.” In MIT News, February 21, 2018. https://www.technologyreview.com/s/610128/ones-and-zeroes-notes-and-tunes/.

Wang, Ge. 2007. “A history of programming and music.” In The Cambridge Companion to Electronic Music, 55-71.

Wilson, C. 2013. “A Tale of Two Clocks—Scheduling Web Audio with Precision.” January 9, 2013. https://www.html5rocks.com/en/tutorials/audio/scheduling/.

About the Authors

Willie Payne is a PhD candidate in Music Technology at NYU Steinhardt where he develops tools and pedagogies to enable people to express themselves regardless of skill level or ability. He is especially interested in using participatory design methods to craft creative experiences with and for underrepresented groups, e.g. those with disabilities. When not writing code, Willie can often be found wandering New York City in search of (his definition of) the perfect coffee shop.

S. Alex Ruthmann is Associate Professor of Music Education and Music Technology, and Affiliate Faculty in Educational Communication & Technology at NYU Steinhardt. He serves as Director of the NYU Music Experience Design Lab (MusEDLab.org), which researches and designs new technologies and experiences for music making, learning, and engagement together with young people, educators, industry and cultural partners around the world.
