Strategies for spatial music performance: the practicalities and aesthetics of responsive systems design


This article will explore practical and aesthetic questions concerning spatial music performance by interrogating new developments within an emerging hyperinstrumental practice. The performance system is based on an electric guitar with individuated audio outputs per string and a multichannel loudspeaker array. A series of spatial music mapping strategies will explore in-kind relationships between a formal melodic syntax model and an ecological flocking simulator, exploiting broader notions of embodiment underpinning the metaphorical basis for the experience and understanding of musical structure. The extension and refinement of this system has been based on a combination of practice-led and theoretical developments. The resulting mapping strategies will forge new gestural narratives between physical and figurative gestural planes, culminating in a responsive, bodily based, and immersive spatial music performance practice. The operation of the performance system is discussed in relation to supporting audiovisual materials.

Introduction: framing spatial mappings in performance and design

A system for spatial ideas (theory and outline of design)

Spatial music, by definition, treats space as a central creative parameter of musical experience, as opposed to an ancillary context. However, in practice, the question of how space may be musically significant provokes two somewhat divergent tactics: one more formalistic in nature, the other more perceptually based. An intuitively attractive approach treats space as largely analogous to the most common formal parameters of historical composition, pitch and rhythm. For example, Stockhausen’s Gesang der Jünglinge (Stockhausen, 1956) sought to apply permutational pitch structures directly to spatial structure (Smalley, 2000, pp. 6–7) via specific speaker assignments, in tandem with degrees of reverberation (Bates, 2010, pp. 131–132; Moritz, 2002). However, in spite of such rigorous process-based formalism, perceptual grouping factors may tend to dominate, especially given the basic spatialisation technique of alternating outputs via a limited array of five speakers (Bates, 2010, pp. 131–132; Smalley, 2000, p. 6). By contrast, Stockhausen’s later spatial work, Gruppen (Stockhausen, 1957), for multiple orchestras, focused explicitly on space’s contribution to perceptual grouping (cf Bregman, 1990, pp. 293–302) rather than space as a vehicle for simple formalism (Stockhausen, quoted in Moritz, 2002, cited in Bates, 2010, p. 133). Stockhausen (1975) later reflected on the problems inherent in a rigid adherence to serial processes without taking perceptual and environmental factors into account, advocating for more relationally based and empirically grounded approaches; see Bates (2010, pp. 136–137) for further commentary.

Approaches such as the latter treat space more explicitly as a framing device. As discussed in Bates (ibid., pp. 115–126), this type of approach became important in the twentieth century for enhancing the perception of complex musical materials in the work of composers such as Charles Ives and Henry Brant. More recently, this perspective has been theorised by Emmerson (1994; 2007, pp. 97–102) as a relational space frame typology for musical activity within different spheres of a sound environment (either real or virtual, created or evoked by electronic or digital processing). Smalley’s (2007) account of space in acousmatic music (and environmental experience) also approaches the issue from the perspective of framing. In a similar fashion, Sterne (2012, p. 9) describes a set of common defining tropes within sound studies, such as “hearing is spherical, vision is directional ... hearing immerses its subject ... hearing places you inside an event [and] hearing brings us into the living world.” Here, spatial concepts contribute to the definition of hearing via concepts of framing, immersion, and relationships within a wider sonic environment. Our own approach follows Emmerson’s (1994, 2007); the delineation of different spheres of performative activity and the sonic responses: local (singular or foregrounded events/streams, connected with performer activity) and field (the wider environment) frames; see figure 1.

Figure 1 Local and field space frames, after Emmerson (2007, p. 98)

We believe that Emmerson’s ideas provide an intuitively accessible means of organising spatial control in systems design and creative practice. Furthermore, due to their environmental logic, we believe that they can be easily extended via a conceptual framework of spatial relations derived from current theories of embodied cognition (Brower, 2000, 2008; Johnson, 1987; Johnson, 2007; Lakoff, 1987; Solomon, 2007). These theories of embodiment provide a foundation from which to explore the integration of performance gesture, control structures, and the performance space within a compatible unifying framework (a shared gestural typology for space). As such, each stage of our performance system, from performance gesture, via control mapping, to output, is treated in terms of an environmental spatial logic, enhancing its potential for iterative development and extensibility as new processes and controls are added.

Design rationale, control structures, and spatial framing

Our spatial music performance system (or hyperinstrument) is based on an electric guitar with an individuated audio pickup for multichannel audio output (one channel per string) (Graham, 2012, pp. 102–109). Aside from the enhanced pickup systems, physical affordances were essentially unmodified: there were no “bolt-ons” of additional control surfaces or sensors (Graham, 2012; Graham and Bridges, 2013, 2014) on the guitar itself; see also Levitin (2002), which informed this perspective. The central motivation behind our approach is the intention to create a system that makes accessible connections between familiar performance gestures from the guitar (specifically, detected notes, note groupings, and note articulations) and the system’s spatialised output; see figure 2.

Figure 2 Outline of spatial music system, with parsing and spatialisation of multichannel audio feed

As a result, the multichannel audio feed becomes our primary source of control data, specifically focusing on the extraction and organisation of pitched (monophonic, note-based) materials. Note event information is parsed from the audio feed, including pitch class, spectral content, and note inter-onset (time between each note attack) data for each voice. This data is then integrated to provide macrostructures, most importantly pitch contours, which provide an intuitive initial vehicle for spatial mappings.
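The parsing step for a single voice can be illustrated with a minimal sketch. It assumes note events have already been detected and are available as onset times and MIDI note numbers; the `NoteEvent` type and function names are hypothetical and do not reproduce the system's Pd patch.

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    onset_ms: float   # attack time in milliseconds (assumed representation)
    midi_pitch: int   # detected pitch as a MIDI note number (assumed representation)

def pitch_class(midi_pitch: int, tonal_centre: int) -> int:
    """Pitch class (0-11) relative to a user-defined tonal centre (pc0)."""
    return (midi_pitch - tonal_centre) % 12

def inter_onset_intervals(events):
    """Time between successive note attacks, in milliseconds."""
    onsets = [e.onset_ms for e in events]
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

# One voice (string) playing three notes, with E (MIDI 64) as the tonal centre.
events = [NoteEvent(0.0, 64), NoteEvent(250.0, 67), NoteEvent(750.0, 71)]
pcs = [pitch_class(e.midi_pitch, tonal_centre=64) for e in events]
iois = inter_onset_intervals(events)
```

In a six-string setup, one such stream would be maintained per voice, so that each string contributes its own pitch class and inter-onset data.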

Video example 1 Basic parametric control, 6’20”

The note data is extracted in Pure Data (Pd) and applied to the control of first-order ambisonic spatialisation azimuth angle (direction) and distance parameters. The system’s pitch tracking obtains twelve pitch class divisions (pc0–pc11) in relation to a user-defined tonal centre (pc0) using the time-domain, “Specially Normalized AutoCorrelation” or SNAC-based [helmholtz~] object for Pd (Vetter, 2011). The first version of the system controlled azimuth direction based on a simple cyclical mapping, with pc0–pc11 mapped to 0–360 degrees. The distance parameters were mapped via a cognitive tonal model from Lerdahl (2001; Lerdahl and Krumhansl, 2007), discussed further below, mapping tonally central materials to a central (local) position and tonally peripheral materials to peripheral (field) positions. In tandem with a frontal perspective from stage-based amplification, the system facilitates a relational dialogue between local and field via the input tonal materials. More conventional guitar voicings will reinforce events that are localised to the local/stage frames. Less conventional voicings will activate a more pronounced off-centre spatial response (and more obvious/extended system responses), activating the field frame via a spatial array surrounding the audience (see figure 3). This outlines a basic relational dynamic to our use of space. Emmerson’s local/field distinctions and relationships are thus articulated through: (1) the local of the more conventional guitar materials via on-stage monitoring and (2) the field response, when the system’s diffused/spatialised responses predominate.
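The cyclical azimuth mapping and the centre–periphery distance mapping might be sketched as follows. This is an illustration under stated assumptions rather than the system's implementation: the even 30-degree step per pitch class and the normalised distance values are hypothetical, while the level memberships follow the major (Ionian) form of Lerdahl's basic space.

```python
# Functional levels of the basic space for a major (Ionian) profile,
# expressed as pitch classes relative to the tonal centre (pc0).
ROOT = {0}
TRIADIC = {0, 4, 7}
DIATONIC = {0, 2, 4, 5, 7, 9, 11}

# Hypothetical normalised distances: 0.0 = centre (local), 1.0 = periphery (field).
LEVEL_DISTANCE = {"root": 0.0, "triadic": 0.33, "diatonic": 0.66, "chromatic": 1.0}

def azimuth_degrees(pc: int) -> float:
    """Cyclical mapping: twelve pitch classes in 30-degree steps (assumed)."""
    return (pc % 12) * 30.0

def basic_space_level(pc: int) -> str:
    """Most stable basic-space level that contains the pitch class."""
    pc %= 12
    if pc in ROOT:
        return "root"
    if pc in TRIADIC:
        return "triadic"
    if pc in DIATONIC:
        return "diatonic"
    return "chromatic"

def distance(pc: int) -> float:
    """Normalised ambisonic distance for a pitch class."""
    return LEVEL_DISTANCE[basic_space_level(pc)]
```

Under these assumptions, the fifth (pc7) sits close to the centre at 210 degrees, whereas a chromatic tone such as pc1 is pushed to the periphery at 30 degrees.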

Video example 2 Basic parametric control 0’06”

Figure 3 Local and field frames as defined by the system’s responses

Embodied spaces and animated responses


Embodied pitch space and spatial mappings

Our approach seeks to match input performance gestures to spatial gestures in a manner that facilitates accessibility for the performer and clarity of results for the audience. Thus, we adapted Emmerson’s (2007) space frame typology as a means to provide a clear and intuitive way to present localisable and non-localisable musical materials within a physical performance space. We initially sought to collate note event data (pitch classes) into a continuous macro-level melodic contour stream (0 to 1), which was then used to control the ambisonic spatialisation of each audio channel, reifying tonal structures through the relative spatial placement of the guitar’s individual voices. This approach established a somewhat intuitive narrative between real-time instrumental voicings and the spatial locations and trajectories of each audio output per string. Our model for the tonal-spatial mapping was then developed to accommodate the tonal pitch space theories of Lerdahl (2001; Lerdahl and Krumhansl, 2007), themselves based upon the cognitive studies of Krumhansl (1990). These theories provided a set of spatial dynamics that embodied cyclical positions (mapped to ambisonic azimuth angle) along with a tonal distance factor relative to tonal centre (mapped to ambisonic distance). As such, they provided a compatible, isomorphic base for mapping tonal structures to spatial forms.

Figure 4 depicts Lerdahl’s basic space, which arranges tonal materials into functional groupings: octave/root, triadic, diatonic, and chromatic spaces. These functional groupings within the basic space model formed the central part of our centre–periphery (local/field) spatial music mapping strategy.

Video example 3 Early spatial music performance tests 2’12” 

Root/octave and triadic materials provide a grounding central dynamic, with diatonic and chromatic materials activating the periphery. Such a spatial-relational mapping might be considered as a nascent embodied cognition perspective on Lerdahl’s model, creating an embodied tonal space (depicted in the second part of figure 4). This applies the embodied image schema theories of Lakoff and Johnson (Johnson, 1987; Lakoff, 1987; Lakoff and Johnson, 1980) to various components of the basic space model.

Figure 4 From a cognitively based model of tonal pitch space, after Lerdahl (2001), to an embodied tonal pitch space (highlighting verticality, cyclical, and centre–periphery or point-to-diffusion/dispersion schemas).

Image schemas are theorised as common patterns of sensorimotor activity, “imported” into higher-level cognitive functioning as some of the basic components of thought. More complex abstract models can be conceptualised as combinations of these embodied image schemas. Theoretical work on describing musical structures using these models has been carried out by Brower (2000; 2008) and Johnson (2007, pp. 235–262). Solomon (2007, pp. 291–301) discusses these models specifically in the context of spatialisation and spatial gestures in music; see also (Erickson, 1975, pp. 141–145) for some early theorising about sound via spatial frameworks. Wilkie et al. (2010) provide evidence to support the claim that musicians conceptualise musical structure in such a fashion. Some of the most musically significant image schemas are depicted in figure 5.

Figure 5 Some common image schemas which have particular relevance for music, after Brower (2008, p. 10)

For our purposes, the advantage of an embodied perspective on Lerdahl’s tonal model is that it highlights the relational dynamics of this formal model’s spatial structure (eg stable/unstable, centre/periphery, point-to-diffused). This perspective on Lerdahl’s model therefore already embodies ready-made mapping potential, as image schemas are inherently spatial-relational in origin, as they are based on common movements in an environment. Furthermore, such an embodied model has potential for extensibility whilst maintaining coherence. Other control modalities and intermediary control mappings for spatial music may be easily incorporated into this system, with coherence contributed to as long as they are structured around a compatible embodied/ecological base. This approach therefore provides a framework for the integration of a variety of control approaches beyond our initial design.

In terms of the embodied components we identify within Lerdahl’s model, Brower’s theories are particularly significant for the present work, drawing attention to verticality (grounded/stable to air/unstable), cycle and centre–periphery schemas present within tonality (Brower, 2000, pp. 335–336; 2008, p. 15). The tonal hierarchy “cone” may be modelled as a combination of a verticality schema with multiple cycle schemas comprising the different functional levels. The embodied coherence of these components is further reinforced in our work by referencing the centre–periphery schema (Brower, 2000, p. 318; Johnson, 1987; Lakoff, 1987) through spatialisation. One of the benefits of Emmerson’s (2007) space frame typology is that it can be viewed as incorporating models of spatial containers (Brower, 2000, pp. 328, 336; Johnson, 1987; Lakoff, 1987; Solomon, 2007, pp. 291–296) – image schema terminology which relates to framing and grouping. As applied in the present model (see figure 6), it provides us with tonal space frames that conform to the previously presented local/field dynamic.

Figure 6 Tonal space frames: movement of spatialised voices from local/localised to diffuse/field positions due to functional positions in Lerdahl’s tonal hierarchy (eg triadic, diatonic, chromatic)

Animated pitch space mappings

In addition to our embodied perspective on the (fixed) basic space model from Lerdahl, the next stage of the system’s development saw the investigation of mappings that incorporated Lerdahl’s dynamic models of tonal syntax behaviours (Lerdahl, 2001; Lerdahl and Krumhansl, 2007). We applied these dynamic models of tonal attraction and repulsion to the animation of our tone space mappings, incorporating the boids flocking algorithm (Reynolds, 1987; Singer, 1997) for the control of our spatialised voices. Although this algorithm has previously been applied to spatialisation control (Bates, 2010), our innovation (Graham, 2012; Graham and Bridges, 2013; 2014) lies in its integration with Lerdahl’s dynamic tonal models. Lerdahl provides a series of models influenced by “traditional” formal cognitive models of tonal perception (Krumhansl, 1990). While these dynamic models are formalistic, they arguably incorporate some clear embodied concepts. Lerdahl treats musical syntax dynamics as analogous to forces, including a model for tonal attraction, which is based on gravitational attraction via an inverse-square law (Lerdahl, 2001; Lerdahl and Krumhansl, 2007). We adapted this embodied perspective on spatial structures by mapping dynamic melodic syntax data (such as attraction and inertia values) from Lerdahl’s tonal models to in-kind parameters (namely, attraction and inertia) of the boids flocking algorithm (Graham, 2012; Graham and Bridges, 2013; 2014). This controlled dynamic spatialisation effects relative to a specified central point within the ambisonic spatialisation field. In this setup, each boid controls the movement of a single voice from the guitar’s multichannel audio output, with strength of tonal attraction via the Lerdahl dynamic model reflected in the overall flock’s degree of centricity and attraction; see figure 7 and video example 4.
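The inverse-square character of the attraction model can be made concrete with a short sketch. It assumes the commonly cited form of Lerdahl's melodic attraction rule, in which the attraction from one pitch to another is the ratio of their anchoring strengths divided by the square of their semitone distance. The anchoring strengths used here (4 for the root, 3 for other triad members, 2 for remaining diatonic tones, 1 for chromatic tones) and the use of the shortest pitch-class distance are assumptions of this sketch and should be checked against Lerdahl (2001).

```python
# Assumed anchoring strengths for a major (Ionian) basic space:
# root = 4, other triad members = 3, remaining diatonic tones = 2, chromatic = 1.
ANCHORING_STRENGTH = {0: 4, 4: 3, 7: 3, 2: 2, 5: 2, 9: 2, 11: 2}

def anchoring_strength(pc: int) -> int:
    """Stability weight of a pitch class; chromatic tones default to 1."""
    return ANCHORING_STRENGTH.get(pc % 12, 1)

def semitone_distance(pc_from: int, pc_to: int) -> int:
    """Shortest distance in semitones between two pitch classes."""
    d = abs(pc_from - pc_to) % 12
    return min(d, 12 - d)

def melodic_attraction(pc_from: int, pc_to: int) -> float:
    """Attraction of pc_from towards pc_to: (s_to / s_from) * (1 / n^2)."""
    n = semitone_distance(pc_from, pc_to)
    if n == 0:
        raise ValueError("attraction is undefined for a repeated pitch class")
    return (anchoring_strength(pc_to) / anchoring_strength(pc_from)) / n ** 2
```

With these values the leading tone is strongly attracted to the tonic, while the reverse attraction is weak: `melodic_attraction(11, 0)` gives 2.0 and `melodic_attraction(0, 11)` gives 0.5. An asymmetry of this kind is what a mapping to the flock's attract parameter could exploit.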

Video example 4 Initial iteration of melodic model

Figure 7 Boids mappings resolve tonal centre materials to spatial centre (providing “spatial closure” to accompany tonal closure) in the presence of a tonic resolution

These embodied force structures can be seen as conforming to inertia-to-attraction force dynamics, diffusion/dispersion-to-point image schemas, and spatialisation/diffusion based on local/field dynamics. On a basic level, the flocking behaviours essentially treat melodic forces as physical or behavioural forces. The mapping of melodic syntax to the flocking behaviour of the boids is outlined in figure 8.

Figure 8 Detail of melodic syntax model and boids mappings

Melodic syntax and misc. computations (note-to-note bases): mapped to flocking flight pattern “steering behaviours”

Melodic attraction/tension: Data was mapped to the centre parameter of each boid, reflecting the relationships between stability and tension tone positions in a pitch class profile (default: Ionian (major), conforming to Lerdahl’s basic space structure; other modes may still be used with this profile, differences in pitch are articulated via centre–periphery mappings).

Implicative denial: Data was mapped to control the inertia of each boid, reflecting the denied attractional potential as differences in behavioural flight patterns.

Ratios of asymmetrical attraction: Data was mapped directly to the attract parameter, reflecting asymmetrical tonal attractions relative to a sequential tone position in a pitch class profile. High values will cause the boids to flock in attraction to a (tonic) point within a physical performance space.

Tendency: Data was mapped to control the speed matching behaviours of neighbouring boids. High values produce matched speeds in flight patterns in an attempt to reflect expectancy schemas through constancy (eg direction of melodic motion and tonal attraction relationships).

Average note onset time (ms): Data was mapped to control the speed of the flight behaviours, establishing a narrative between instrumental phrasing and the speed of the dynamic trajectory adopted by the boids.

Pitch class distance: Data was mapped to control the acceleration parameter of each boid, reifying the notion that a listener may perceive smaller pitch class distances as occurring over a shorter time period and larger pitch class distances over a longer time period; see (Snyder, 2000, p. 12) for discussions of music and memory/timescales.
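The routing summarised in figure 8 can be read as a lookup from melodic syntax values to boid steering parameters. The sketch below shows only that routing; the key names are hypothetical labels, and passing values through unscaled is an assumption, since the scaling applied in the system is not specified here.

```python
# Routing of melodic syntax values to boid steering parameters, after figure 8.
# All key names are hypothetical labels chosen for this illustration.
SYNTAX_TO_BOID = {
    "melodic_attraction_tension": "centre",
    "implicative_denial": "inertia",
    "asymmetrical_attraction_ratio": "attract",
    "tendency": "speed_matching",
    "average_note_onset_ms": "speed",
    "pitch_class_distance": "acceleration",
}

def to_boid_parameters(syntax_values: dict) -> dict:
    """Route each known syntax value to its boid parameter (unscaled, assumed)."""
    return {
        SYNTAX_TO_BOID[name]: value
        for name, value in syntax_values.items()
        if name in SYNTAX_TO_BOID
    }

params = to_boid_parameters({"tendency": 0.8, "pitch_class_distance": 2})
```

Keeping the routing in one table of this kind would make it straightforward to add or re-route a control without touching the rest of the mapping logic.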

Key cases are illustrated in figure 9; see also video example 5. Further details of our application of the originating Lerdahl model can be found in Graham (2012, p. 130).

Video example 5 Initial iteration of melodic model, 13’16”

Figure 9 Dynamic boids ecological tonal-space for basic space (Ionian-Major) mappings; see also figure 4, for Lerdahl’s basic space model

Our investigation of Lerdahl’s tonal models has centred on the mapping of the resulting melodic syntax values to in-kind embodied image schemas and force-based metaphors (boids). As a result, abstract tonal structures are reified through dynamic spatialisation processes. In summary, the adapted models have informed the design and integration of performance controls for an accessible and dynamic spatial music performance system, maintaining coherence through a unifying embodied-relational framework. Furthermore, our application of Lerdahl’s force-based models to ecological behaviours provides an extension of the centre–periphery schemas via dynamic structures, which are compatible with Johnson’s (2007, pp. 235–262) theory of embodied force metaphors in music cognition, specifically his moving music and music as moving force metaphors.

Gestural affordances and extended embodied mappings for reinforcing spatial structures

Ancillary/accompaniment gestures and extended mappings

The extension and refinement of this system has been based on a combination of practice-led and theoretical developments. Our initial work provided an embodied spatial model and shared gestural typology for developing connections between tonal and physical/acoustic spaces. The result provided a model that could facilitate the integration of more direct gesture tracking into this system’s control structures. As such, connections between tonal structures and embodied spatial domains could be further explored through the extraction of larger-scale non-sounded performer bodily movements: ancillary/accompaniment gestures (Cadoz and Wanderley, 2000). These movements, such as a change in central performer position (torso) or a movement of the guitar’s body or neck, may be conscious or unconscious parts of the performer’s creative practice. Such cases may be thought of as providing embodied accompaniments to resulting musical structures.

For this iteration, physical gestural data was obtained using a combination of the Xbox Kinect sensor and the Synapse application (Challinor, 2011), which parses the sensor data and provides values for velocity and acceleration in addition to coordinate sets for skeletal points in a three-dimensional Cartesian space. Although these are additional input modalities, they do not entail a laborious process of learning new instrumental affordances and techniques (as might be the case with the provision of additional “bolt-on” controls). Rather, these types of gestures are already broadly accessible and familiar as by-products of established performance practices. Furthermore, the control structures act either as moderators or reinforcements of the established tonal-spatial control mappings.

Video example 6 Developing mapping strategies for spatial music performance: mapping skeletal data using Xbox Kinect and Pure Data, 0’32”

Figure 10 shows this iteration of the system. In this version, the physical movement of the performer provides position and acceleration data for various points of the body and guitar, which can be integrated with the rest of the system’s treatment of musical (specifically tonal) motion via force-based and spatial analogues. For example, in addition to boids acceleration parameters being controlled by computations of pitch class distance, direct bodily motion (as acceleration) can be applied to moderate the tonal acceleration value.

Figure 10 Integration of motion tracking for body movement with the system’s other physical motion and force/musical motion and force mappings

Exploring embodiment: dynamic spatial mappings of skeletal data

A number of fortuitous by-products of the joint-tracking process aided this wider exploration of embodied controls. Firstly, the guitar’s neck was reliably treated as an extended limb (see figure 11). Secondly, the nature of the tracking process implied that the guitarist’s picking hand would only be tracked by the system when making larger accompaniment gestures rather than more typical note-articulation gestures (picking/plucking, etc). As a result of these affordances of the technology, we are able to access two distinct gestural-spatial ranges: (1) small-scale physical gestures for note articulation and (2) more expansive bodily accompaniment/ancillary gestures. This combination facilitates the treatment of these gestures both in terms of clear delineation of function (ie separate functional mappings) whilst maintaining the overall holistic coherence of the integrated force–motion metaphorical mappings.

Figure 11 Skeletal tracking data from Synapse showing the treatment of the guitar neck as an extended limb and superimposed with key control mappings

In video examples 7 and 8, the performer’s torso position sets the central attraction point for flocking behaviours, allowing the performer to explore the notion of embodying an abstract musical concept (tonal attraction/centricity).

Video example 7 Developing mapping strategies for spatial music performance: mapping skeletal data using Xbox Kinect and Pure Data

Video example 8 Mapping strategies for embodied metaphors – improvised music examples

The velocity of the left “hand” (headstock of guitar) controls the flocking speed, directly linking the motion of the guitar headstock to the rate of the flocking behaviours within the speaker array (see figure 11). Intuitively, the performer can easily calibrate axial positioning of the left hand (fretting hand) with melodic tonal syntax when improvising, thus providing structurally informed spatial accompaniments via left “hand” (neck and headstock) motion. The velocity of the right hand (picking hand) was mapped to the following granular parameters: feedback, buffer position, grain reversal, and time variation. The velocity of this hand also controlled the avoidance and acceleration flocking parameters, allowing for the direct correlation between detached bodily movements outside of common instrumental practice to be linked to more aggressive sonic transformations (see figure 11). As such, more expansive physical gestures can be seen as spatial-performative correlates of more obvious and dynamic signal processing.
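The skeletal mappings described above can be sketched as a single routing function. The sketch derives joint speed from successive positions by finite differences, which is an assumption made to keep the example self-contained (the tracking application is described as supplying velocity values itself). All function names, dictionary keys, and the absence of scaling are hypothetical.

```python
import math

def joint_speed(prev_xyz, curr_xyz, dt_s: float) -> float:
    """Speed of a tracked joint, estimated from two successive positions."""
    return math.dist(prev_xyz, curr_xyz) / dt_s

def skeletal_controls(torso, left_prev, left_curr, right_prev, right_curr, dt_s):
    """Route torso position and hand speeds to flocking and granular controls."""
    left_speed = joint_speed(left_prev, left_curr, dt_s)     # headstock "hand"
    right_speed = joint_speed(right_prev, right_curr, dt_s)  # picking hand
    return {
        "attraction_point": torso,      # torso sets the flock's central point
        "flock_speed": left_speed,      # headstock motion sets flocking rate
        "avoidance": right_speed,       # picking-hand motion drives avoidance
        "acceleration": right_speed,    # ... and flock acceleration
        "granular_depth": right_speed,  # ... and the granular parameters
    }

controls = skeletal_controls(
    torso=(0.0, 1.0, 2.0),
    left_prev=(0.0, 0.0, 0.0), left_curr=(3.0, 4.0, 0.0),
    right_prev=(1.0, 1.0, 1.0), right_curr=(1.0, 1.0, 1.0),
    dt_s=0.5,
)
```

In this example the headstock moves while the picking hand is at rest, so only the flocking speed responds; a large picking-hand gesture would instead raise the avoidance, acceleration, and granular values together.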

Reinforcing space through embodied signal process mappings

In a similar fashion, some additional granular mappings are derived from our embodied version of the Lerdahl basic tonal pitch space model (see figure 4). These mappings are designed to accentuate the spatialisation effects previously noted. Conforming to the centre–periphery/point-source-to-diffuse articulations, the effects of these mappings can be summarised as follows:

  1. Stable positions within the tonal hierarchy/basic space produce centred granular images across the horizontal plane, an application of high-pass filtering, and a decrease in grain gap size.
  2. Less stable positions within the tonal hierarchy/basic space produce a wider granular image across the horizontal plane, an application of low-pass filtering, and an increase in grain gap size.

The presence of more high frequency content in case (1) facilitates clearer spatial perception (centre/point source/integration), while case (2) results in diffused/dispersed perspectives. See video example 9.
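The two cases above can be expressed as a single function of tonal stability. This is a minimal sketch: the normalised stability input, the 0.5 switching point between filter types, and the grain gap range are all hypothetical values chosen for illustration.

```python
MIN_GAP_MS = 5.0    # hypothetical grain gap for the most stable tones
MAX_GAP_MS = 50.0   # hypothetical grain gap for the least stable tones

def granular_settings(stability: float) -> dict:
    """Granular settings for a tonal stability value in [0, 1] (1 = most stable)."""
    s = min(max(stability, 0.0), 1.0)
    return {
        # Stable tones give a centred image; unstable tones a wider one.
        "image_width": 1.0 - s,
        # Stable tones are high-pass filtered; unstable tones low-pass filtered.
        "filter": "high-pass" if s >= 0.5 else "low-pass",
        # Grain gap shrinks as stability increases.
        "grain_gap_ms": MIN_GAP_MS + (1.0 - s) * (MAX_GAP_MS - MIN_GAP_MS),
    }
```

A continuous stability value is used here so that intermediate basic-space levels (triadic, diatonic) fall between the two extremes rather than switching abruptly; whether the system interpolates in this way is not stated in the text.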

Video example 9 Early spatial music performance tests, 2’11” 

An additional example of an embodied granular mapping strategy with spatial implications can be found in an additional note inter-onset time (rate) mapping from the note input. This mapping draws on a metaphorical equivalence between this note rate and granular density and shorter grain durations. Broadly speaking, an increase in note rate causes more obvious granular effects (via greater granular density and shorter grain durations). In terms of audible spatial implications, the associated shorter grain duration parametric mapping also implies an increased “presence” of the system’s higher rate/tension response, as the shorter grains lead to increased noise and, hence, additional higher frequency content (making it easier to localise materials). In metaphorical terms, increased inter-note rate-effort events are mapped to greater activity density of the granulation process, leading to a greater sense of presence and spatiotemporal coverage; see video example 10.
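The rate-effort mapping can be sketched as a linear interpolation driven by the mean inter-onset time. The input range, the density range (grains per second), and the grain duration range are hypothetical; only the direction of the mapping (faster playing gives denser, shorter grains) comes from the description above.

```python
def rate_to_grains(mean_ioi_ms: float,
                   min_ioi_ms: float = 50.0,
                   max_ioi_ms: float = 1000.0) -> tuple:
    """Return (grain density, grain duration in ms) for a mean inter-onset time."""
    ioi = min(max(mean_ioi_ms, min_ioi_ms), max_ioi_ms)
    # Activity runs from 0.0 (slowest playing) to 1.0 (fastest playing).
    activity = (max_ioi_ms - ioi) / (max_ioi_ms - min_ioi_ms)
    density = 5.0 + activity * (50.0 - 5.0)          # more grains when faster
    duration_ms = 100.0 - activity * (100.0 - 10.0)  # shorter grains when faster
    return density, duration_ms

slow = rate_to_grains(1000.0)
fast = rate_to_grains(50.0)
```

Here slow playing yields sparse, long grains and fast playing yields dense, short grains, which is the behaviour associated with increased presence in the text.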

Video example 10 Early spatial music performance tests, 5’30”

This particular mapping strategy reflects the theorised perspective provided by Johnson (2007, pp. 21–24) on the qualitative dimensions of movement; specifically, tension (embodying effort/amount of activity, and connecting with rate for repeated short actions); see also (Graham and Bridges, 2013; 2014).

Emergent aesthetics and future development

Moving music/moving time mappings

One fork of the performance system experimented with extended structural mappings influenced by Johnson’s (2007, p. 248) moving music metaphor, whereby the listener’s metaphorical progression through a piece of music is based on physical movements. Johnson’s connection of musical structures with related spatial structures (eg paths of motion, cessation of motion, location of observer) is seen as mapped to musical temporal structures and implications (respectively: gestural contours, points of rest/stability/cadence and a sense of musical immediacy, ie “presence”/“the present”). This proposal of cognitive connections between temporal structure and spatial structure suggests a provocative question for designers of music performance systems: can spatialisation of materials be used to foreground embodied/ecological metaphors underpinning resulting musical materials? Johnson’s equation of “present” musical materials with the vantage point of a stationary observer suggests that spatial movements relative to this observer may be used to highlight temporal structural relationships embodied by musical materials. To examine this question, we must examine a number of motion-based metaphors (figure 12):

Figure 12 Musical motion metaphors, adapted from Johnson (2007, p. 248)

This model sees the tracking of new note articulations in order to spatialise each object from the front of the array (future) via centre (present) to the rear of the speaker array (past), modelling a temporal progression as the materials move past the listener. As notes are held, they are dynamically spatialised towards the rear of the array and are diffused through the application of filtering and delay effects. When an object reaches the rear of the array, it remains (contained) in a static position until amplitude levels fall below a predetermined threshold, allowing the event to decay. A centre–periphery tonal schema is maintained through the use of a modification of the Lerdahl basic space model to provide distance factors for the relative lateral (x-plane) spatialisation. The direction of spatialisation on the horizontal axis is based on alternation of basic spaces around the centre point, thus being based on a relational rather than absolute model. It was considered that the previous cyclical schema did not enjoy a particular degree of cognitive salience and that a simplification on this basis would facilitate a more dynamic front-to-rear relational mapping strategy. This version of the system is depicted in figure 13; see also video examples 11 and 12.
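One reading of this front-to-rear mapping is sketched below. The coordinate convention (y = +1 at the front, 0 at the centre, -1 at the rear), the travel time, the amplitude threshold, and the interpretation of lateral alternation as a change of side on each successive event are all assumptions made for illustration.

```python
def front_to_rear_y(held_s: float, travel_s: float = 4.0) -> float:
    """Front/back position of a held note: +1 (future) through 0 to -1 (past)."""
    progress = min(max(held_s / travel_s, 0.0), 1.0)
    return 1.0 - 2.0 * progress

def lateral_x(distance: float, event_index: int) -> float:
    """Lateral offset from the centre line, alternating side on each event."""
    side = 1.0 if event_index % 2 == 0 else -1.0
    return side * distance

def is_released(y: float, amplitude: float, threshold: float = 0.01) -> bool:
    """A note held at the rear is released once its level falls below threshold."""
    return y <= -1.0 and amplitude < threshold
```

Because the position is clamped once the rear is reached, a sustained note stays contained at y = -1 until `is_released` returns true, mirroring the static rear position described above. The `distance` argument would be supplied by the basic-space level of the note.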

Video example 11 Sounds and schemas: mapping metaphors

Video example 12 Sounds and schemas: moving music metaphor – improvised music examples

One particular example of the interplay between horizontal and front-back perspectives in this mapping can be found at 0’45” of video example 12. This illustrates how different positions within the tonal hierarchy contribute to dynamic connections within and between individual instrumental legato articulations. In this context, the re-articulation of a previously held note may lend a sense of cyclical structural return. Ostinato structures become points of relative spatial stability.

Figure 13 Spatial application of the moving music metaphor (Johnson, 2007)

As per the previous mapping strategies, more stable tonal materials, such as those in an octave/root space, are placed at the centre of the field frame. Less stable tonal materials, such as the chromatic space, are placed closer to the perimeter. Interpolation between axial points produced a favourable aesthetic-structural response, particularly during glissando string events. Rate-effort mappings (note inter-onset time and right hand velocity) were implemented to control the prominence (or feedback) of granular signal processes, providing an extra degree of performative gestural control for effects with spatial implications. These sets of mapping strategies provide the performer with greater control over the framing of the system’s tonal-spatial materials.

Centre–periphery schemas and potentials

The current developments of the system imply certain emergent aesthetic preferences and priorities that are common to each of the iterations and variants of the performance system. Space is treated as a framing parameter, an aspect of parametric organisation that supports and facilitates the perception of other musical structuring principles. All versions of the performance system incorporate centre–periphery relational structures (with accompanying diffuse/enveloped-to-point-source articulations of sound materials). The use of a hierarchical tonal model as a control structure for distance parameters provides an embodied modelling of competing musical currents. Furthermore, as dissonant materials are diffused towards the periphery of the array, apparent sensory dissonance effects (ie beating, roughness effects) may be reduced due to enhanced spatially based auditory stream segregation (Bregman, 1990, pp. 293–302); see also (Graham, 2010), which discusses the application of this principle in electric guitar performance via the use of multichannel audio feeds. Hence, in the present system’s treatment, dissonant tonal materials may be viewed as spatially contained via the framing of our centre–periphery processes.

In addition, the performer’s ability to change the centre location based on larger-scale bodily movements (relative torso position) maintains this relational clarity whilst providing an opportunity for the “central” spatial responses of the system to be dynamically controlled. However, this aspect of the system’s control may require judicious engagement on the part of the performer: sudden larger-scale bodily movements (rather than more progressive changes in the form of gestural accompaniments) may provide more of a jarring of perspective than a useful expansion of spatial development. With that said, a performer who is particularly aware of their relative positions within a performance space (and the relevant application tracking ranges) might derive some benefit from the exploration of this modality.
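
One generic way to soften the effect of sudden torso movements on the spatial centre is to limit how far the centre may travel per control update (a slew limiter). The sketch below is offered only as an illustration of that idea. It is not described as part of the system, and the function name, coordinate convention, and step size are assumptions.

```python
import math


def slew_limit(current, target, max_step):
    """Move an (x, y) centre toward a target by at most max_step per update.

    Hypothetical helper: 'current' and 'target' are normalised positions,
    'max_step' is the largest distance allowed in a single control update.
    """
    dx = target[0] - current[0]
    dy = target[1] - current[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        # Close enough: jump straight to the tracked position.
        return target
    scale = max_step / dist
    return (current[0] + dx * scale, current[1] + dy * scale)


# Example: a sudden jump of the tracked torso position from (0, 0) to (1, 0)
# is followed gradually, a quarter of the distance per update.
centre = (0.0, 0.0)
for _ in range(4):
    centre = slew_limit(centre, (1.0, 0.0), 0.25)
```

A smaller step favours progressive changes of perspective. A larger step lets deliberate, large-scale movements register more immediately.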

Conclusion

We have outlined an approach to the creation of spatial music via systems design and hyperinstrumental performance practice. We consider this design approach to be broadly consistent with perceptually informed spatial music practices and with the theories of Emmerson (2007), whereby the final system iteration is used to contribute to the perceptual delineation of emergent musical materials. Furthermore, our performance system’s application of a parsed audio feed to the control of spatialisation via a tonal model from Lerdahl (2001) can be seen as grounding more abstract and formal tonal concepts within an embodied performance environment. We believe that this approach provides an intuitive means of connecting tonal structures to centre–periphery relational dynamics in a spatial music performance practice. Further investigations led to an exploration into embodied/ecological potentials within Lerdahl’s dynamic model of tonal syntaxes (Lerdahl, 2001; Lerdahl and Krumhansl, 2007), resulting in the treatment of Lerdahl’s musical forces and movement dynamics as control structures for animated spatial mappings using the boids flocking algorithm (Bates, 2010; Reynolds, 1987). These animated mappings are used to apply melodic behaviours and related motion dynamics to in-kind parameters in the boids algorithm, which provides a more sophisticated environmentally based metaphorical mapping for the control of individual spatialised voices in real-time music performance.
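
For readers unfamiliar with the boids algorithm (Reynolds, 1987), the following minimal sketch shows its three steering rules (cohesion, alignment, and separation) with their weights exposed as control parameters. It is a simplified illustration, not the system's implementation: every boid considers all others rather than a local neighbourhood, all names and default values are assumptions, and the choice of which melodic quantity drives which weight is left open here.

```python
import math
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def step(boids, cohesion=0.01, alignment=0.05, separation=0.1,
         min_dist=0.1, dt=1.0):
    """Advance every boid by one update and return a new list.

    The three weights are the kind of parameters that an external control
    source (for example, an analysis of melodic motion) could modulate.
    """
    updated = []
    for b in boids:
        others = [o for o in boids if o is not b]
        vx, vy = b.vx, b.vy
        if others:
            n = len(others)
            # Cohesion: steer toward the average position of the others.
            cx = sum(o.x for o in others) / n
            cy = sum(o.y for o in others) / n
            # Alignment: steer toward the average velocity of the others.
            avx = sum(o.vx for o in others) / n
            avy = sum(o.vy for o in others) / n
            # Separation: steer away from neighbours closer than min_dist.
            near = [o for o in others
                    if math.hypot(b.x - o.x, b.y - o.y) < min_dist]
            sx = sum(b.x - o.x for o in near)
            sy = sum(b.y - o.y for o in near)
            vx += (cohesion * (cx - b.x) + alignment * (avx - b.vx)
                   + separation * sx)
            vy += (cohesion * (cy - b.y) + alignment * (avy - b.vy)
                   + separation * sy)
        updated.append(Boid(b.x + vx * dt, b.y + vy * dt, vx, vy))
    return updated
```

Raising the cohesion weight draws the spatialised voices together, while raising the separation weight pushes close voices apart. This is the sense in which such parameters can serve as in-kind targets for motion-related control data.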

The advantage of using embodied models as a means of contributing to rich yet coherent performance system responses is highlighted further through the integration of a motion tracking component into the system. The specification of additional controls and mappings via a unifying embodied framework allows for additional parameters to be added whilst maintaining maximal consistency and coherence, and hence accessibility for the performer. We consider these ecological/embodied models to be ripe for wider application in spatial music systems design and creative practices due to these integrating potentials. A variety of extensions to the signal processing side also have mappings that can be seen as consistent with embodied/ecological relations. For example, various types of rate mappings are widely applied to the density and dynamism of granular processing effects. Furthermore, some of the system’s audio effects chains are designed to accentuate spatial centre–periphery dynamics, impacting the localisation abilities of the listener. Finally, a more speculative mapping based on Johnson’s moving music and moving time metaphors was discussed, approaching spatialisation from the perspective of highlighting temporal progression as a set of tonal materials that flow past the listener on a front–back axis, relative to the envelope profile of each instrumental note event. One of the greatest strengths of the resulting system lies in its ability to integrate a wide variety of controls and creative outputs within a coherent but extensible framework. Overall, these types of spatialisation processes place the performer at the centre of an embodied creative space, establishing unique narratives between the image schemas underpinning instrumental theory and physical technique. As such, we hope that the theoretical principles underpinning some of our designs will suggest creative mapping possibilities to other practitioners.

References

Bates, E. (2010) The Composition and Performance of Spatial Music. PhD. University of Dublin, Trinity College, Dublin.

Bregman, A.S. (1990) Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, Mass.: MIT Press.

Brower, C. (2000) A Cognitive Theory of Musical Meaning. Journal of Music Theory, 44(2), pp. 323–379. DOI: 10.1215/00222909-44-2-323

Brower, C. (2008) Paradoxes of Pitch Space. Music Analysis, 27, pp. 51–106. DOI: 10.1111/j.1468-2249.2008.00268.x

Cadoz, C. and Wanderley, M. (2000) Gesture-Music. In Wanderley, M. and Battier, M. (eds). Trends in Gestural Control of Music. Paris: IRCAM. pp. 72–94. Available at: usic/ [Accessed 30/8/2014]

Challinor, R. (2011) Synapse. Software application. Available at: [Accessed 6/7/2014]

Emmerson, S. (1994) Local/Field: towards a typology of live electroacoustic music. Proceedings of the International Computer Music Conference 1994. San Francisco: International Computer Music Association. pp. 31–34.

Emmerson, S. (2007) Living Electronic Music. Aldershot: Ashgate.

Erickson, R. (1975) Sound Structure in Music. Berkeley: University of California Press.

Graham, R. (2010) The Effects of Polyphonic Technology on Electric Guitar Performance. Postgraduate Conference of the Society for Musicology in Ireland, Dublin Institute of Technology, January 2010. Available online at: [Accessed 30/8/2014]

Graham, R. (2012) Expansion of Electronic Guitar Performance Practice through the Application and Development of Interactive Digital Music Systems. PhD. University of Ulster, Northern Ireland.

Graham, R. and Bridges, B. (2013) Mapping and Meaning: embodied metaphors and non-localised structures in performance system design. In: Re-New 2013 Conference Proceedings. Available online at: [Accessed 30/8/2014]

Graham, R. and Bridges, B. (2014) Gesture and Embodied Metaphor in Spatial Music Performance Systems Design. In: Caramiaux, B., Tahiroğlu, K., Fiebrink, R. and Tanaka, A. (eds) Proceedings of the International Conference on New Interfaces for Musical Expression 2014, Goldsmiths, University of London, July 2014, pp. 581–584. Available online at: [Accessed 14/7/2014]

Johnson, M. (1987) The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reason. Chicago: University of Chicago Press.

Johnson, M. (2007) The Meaning of the Body: Aesthetics of Human Understanding. Chicago: University of Chicago Press.

Krumhansl, C. (1990) Cognitive Foundations of Musical Pitch. Oxford: Oxford University Press.

Lakoff, G. (1987) Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. Chicago: University of Chicago Press.

Lakoff, G. and Johnson, M. (1980) Metaphors we Live By. Chicago: University of Chicago Press.

Lerdahl, F. (2001) Tonal Pitch-Space. Oxford: Oxford University Press.

Lerdahl, F. and Krumhansl, C. (2007) Modeling Tonal Tension. Music Perception, 24(4), pp. 329–366. DOI: 10.1525/MP.2007.24.4.329

Levitin, D. J. (2002) Control parameters for musical instruments: a foundation for new mappings of gesture to sound. Organised Sound, 7(2), pp. 171–189. DOI: 10.1017/S135577180200208X

Moritz, A. (2002) Stockhausen: Essays on the Works. [Online] Available at: [Accessed 14/7/2014]

Reynolds, C. (1987) Flocks, Herds and Schools: A distributed behavioral model. SIGGRAPH Comput. Graph. 21(4), pp. 25–34.

Singer, E. (1997) Boids for Max. Software library. Available at: [Accessed 6/7/2014]

Smalley, J. (2000) Gesang der Jünglinge: History and Analysis. In: Masterpieces of 20th-Century Electronic Music: A Multimedia Perspective [Online]. New York: Columbia University Computer Music Center. Available online at: [Accessed 14/7/2014]

Smalley, D. (2007) Space Form and the Acoustic Image. Organised Sound, 12(1), pp. 35–58. DOI: 10.1017/S1355771807001665

Solomon, J. (2007) Spatialization in Music: The analysis and interpretation of Spatial Gestures. PhD. University of Georgia, Athens, Georgia.

Sterne, J. (2012) Sonic Imaginations. In: Sterne, J. (ed.) The Sound Studies Reader. Abingdon, Oxford: Routledge, pp. 1–18.

Stockhausen, K. (1956) Gesang der Jünglinge. Electroacoustic composition. Available on: Stockhausen Edition 3. [Audio CD]. Kürten, Germany: Stockhausen Foundation for Music.

Stockhausen, K. (1957) Gruppen. [Composition]. Available on Stockhausen Edition 5. [Audio CD]. Kürten, Germany: Stockhausen Foundation for Music.

Stockhausen, K. (1975) Music in Space. (Trans. R. Koenig). die Reihe, 5, pp. 67–82.

Wilkie, K., Holland, S. and Mulholland, P. (2010) What Can the Language of Musicians Tell Us about Music Interaction Design? Computer Music Journal, 34(4), pp. 34–48. DOI: 10.1162/COMJ_a_00024

About the authors: 

Richard Graham is a guitarist and computer musician based in the United States. Graham has performed across the US, Asia, UK, and Europe, including festivals and conferences such as Celtronic and the International Symposium on Electronic Art. He has composed music for British and US television, recorded live sessions for BBC radio, and has authored music for the popular video game Rock Band. He was an artist-in-residence at STEIM in 2010, where he developed the first iteration of his performance system for multichannel guitar. He received his Ph.D. in Music Technology from the University of Ulster in 2012 and he is now an Assistant Professor of Music and Technology at Stevens Institute of Technology in New Jersey. His most recent paper on performance systems was presented at NIME 2014 and his most recent musical work, “Nascent”, was released on Fluttery Records in 2012.

Brian Bridges is a composer and music technology researcher from Dublin, Ireland. He is currently based at the University of Ulster, Northern Ireland, where he has been Lecturer in Creative Arts/Creative Technology since 2008. His research interests lie in the connection between theories of auditory perception and cognition and creative practices and systems designs. His creative work spans the fields of sound installation and audiovisual practices and electroacoustic and instrumental music. He is a founder-member of the Dublin-based Spatial Music Collective and his compositions have been programmed at festivals in Europe, the Americas and Asia. Brian is a graduate of Trinity College Dublin (MPhil. in Music and Media Technologies) and the National University of Ireland, Maynooth (PhD on microtonal music) and he has also undertaken private studies in the US with Glenn Branca and Tony Conrad.