Monday, 8 June 2015

BA Hons Creative Sound Production - Dissertation


CAN GAME AUDIO BE SPECIFICALLY DESIGNED AND IMPLEMENTED FOR THE PURPOSE OF LEADING THE PLAYER INTO AN IMMERSIVE EXPERIENCE?
GARY MAIN























University of Abertay Dundee

School of Arts, Media and Computer Games

May 2015





University of Abertay – Dundee

Author:                          Gary Main


Title:                               Can game audio be specifically designed and implemented for the purpose of leading the player into an immersive experience
                   
Degree:                          BA Hons – Creative Sound Production


Year:                              2015



Date:                              1st of May 2015


Abstract
The purpose of this dissertation is to investigate the impact audio has on the immersive nature of computer games, whilst also determining whether its effectiveness can be pre-designed and measured. The author uses knowledge from the surrounding literature to define the characteristics required to implement audio in a working level designed to produce an immersive experience. This experience is measured using the established GEQ (Game Experience Questionnaire) to determine whether the key design features implemented during production prove effective in immersing the participants. The information that can be extracted from the results is, however, limited because there was only one level to analyse. Original plans for a second level for A/B testing were scrapped due to unexpected technical difficulties during the implementation stage.





Table of Contents



1. Introduction








LIST OF FIGURES

[g:1] Graph - GEQ Test result

[g:1.1] Graph – GEQ Questions relating to the Sensory and Imaginative Immersion dimension

[g1.2] Graph Showing average results from player immersion ratings on specific audio design elements

[1.0] IZEA Framework outline.                                   

[2.0] Game Experience Questionnaire – Core Module Results

[2.1] Game Experience Questionnaire – In-Game Module

[2.3] Supplementary Questions – Immersion Ratings on Design

[f:1] Illustration of IZEA Model

[f:2] Illustration of IZEA Model with general design properties









1. Introduction

The term immersion or immersive is often used by gamers when describing games and is often associated with the quality of a game, but what exactly does it mean to be immersed in a game and what defines an immersive experience? Most importantly, what role does audio play in immersing a player?
Through a review of the literature surrounding the term ‘immersion’, this dissertation investigates its meaning within the context of playing video games. Current definitions of the term will be discussed in detail through an analysis of research carried out by practitioners in the field of game studies. This overview will then bring into focus the aural aspect of immersion and the research directed at this specific side of game design. Case studies of existing games will also be presented with the aim of highlighting key design elements which support the importance of audio for immersion. The literature review and the case studies together form the rationale and basis on which the level design was conceptualised and implemented, as will be discussed in detail. Testing methods, results and analysis will follow before conclusions are drawn as to whether audio can indeed be designed for the purpose of creating an immersive experience.

 




2. Literature review


2.1 What is Immersion?

A strong foundation for the design and production of an aurally immersive experience first requires a more definitive explanation of the term ‘immersion’. Any gamer or developer will know and often use the term, but there is much debate on both its definition and its measurability. Even in the academic arena there are many theories and opinions, as highlighted by Grimshaw, Charlton and Jagger (2011), who state: “Academics, equally, have wide interpretations of the term and there is, as yet, no widely established model providing a definition and describing the process” (2011, p.29). What exactly is immersion and what does it mean to be ‘immersed’ within the context of computer games? What is the process of becoming immersed in a computer game? Also, and critically important to this project, what role does audio play in the immersive nature of games and how can it be measured? This literature review aims to answer these fundamental questions as well as to provide the rationale and framework for the actions taken to complete the project.

Janet Murray describes immersion in a broad sense:
“Immersion is a metaphorical term derived from the physical experience of being submerged in water. We seek the same feeling from a psychologically immersive experience that we do from a plunge in the ocean or swimming pool: the sensation of being surrounded by a completely other reality, as different as water is from air, that takes over all of our attention, our whole perceptual apparatus” (Murray, 1997, p. 98)

Dovey and Kennedy describe immersion in a similar fashion, adding that it is “the experience of losing a sense of embodiment in the present whilst connecting on a mediated environment” (Dovey and Kennedy, 2006, p. 146).
Brown and Cairns (2004) highlight that the term ‘immersion’, or ‘immersive’, is used frequently to describe games throughout the gaming community, but add that it is often used without a coherent explanation (Brown and Cairns, 2004, p.1). Other terms have also been used to describe the immersive state, such as ‘Incorporation’ (Calleja, 2007) and ‘Presence’ (McMahan, 2003). Jennett et al (2008) also mention ‘Presence’ as well as other concepts such as ‘Flow’ and ‘Cognitive absorption’. However, they argue that “immersion is clearly distinct from these established concepts and a better understanding of immersion would be crucial in understanding the relationship between people and videogames” (Jennett et al, 2008, p. 642).

2.2 Stages of Immersion

A closer look at the theories behind game immersion provides a more specific analysis, whilst narrowing down the broad range of terms and descriptions presented thus far. Laurie Taylor (2002) expands the definition in a gaming context by splitting it into two separate stages of engagement: “diegetic immersion, where the player is immersed in the act of playing the video game, and as intra-diegetic or situated immersion, where the player is immersed in playing the game and in the experience of the game space as a spatial and narrated space” (Taylor, 2002, p. 12). The idea of different stages of engagement is shared by Brown and Cairns (2004). Their study tries to define immersion within the context of games using a ‘grounded theory’ approach (a method of qualitative research developed by Strauss and Corbin (1998)) by analysing interviews with seven gamers about their experiences when playing games. Much like Taylor (2002), their results define immersion as a process of stages of engagement with the game which “moves along the path of time and is controlled by barriers” (Brown and Cairns, 2004, p. 2), only here three stages are suggested.

The three stages of immersion described by Brown and Cairns are ‘Engagement’, ‘Engrossment’ and ‘Total Immersion’, with the latter also being described as ‘Presence’. The barriers act as gateways to each of the stages, and by opening them the player can move forward through the stages of immersion. For example, the first stage is ‘Engagement’, where the player must first engage with the game. A potential barrier here might be that the player does not like the genre or style of the game, which prevents further immersion from taking place. As stated by Brown and Cairns, “To lower the barriers to enter this level, the gamer needs to invest time, effort, and attention” (2004, p.2).
The next stage of immersion, ‘Engrossment’ is described by one interviewee as
“A Zen-like state where your hands just seem to know what to do, and your mind just carries on with the story.” (Brown and Cairns. 2004, p.3).
Brown and Cairns link this stage to game construction and the player’s emotional connection, as well as respect for the design work put into it. Again, the barrier to this stage is the investment of time and effort needed to become emotionally involved with the game, but design flaws such as poor visual/aural feedback or even unrealistic gameplay are other factors of the game construction which could present a barrier to further immersion. These factors could lead to player disengagement and raise barriers against progressing to the final stage of immersion described by Brown and Cairns.
Brown and Cairns describe the last stage of immersion, ‘Total Immersion’, as ‘Presence’. This is where the gamer feels completely detached from the real world around them and is fully focused on the game. The suggested barriers to ‘Presence’ lie with empathy and atmosphere. In this sense empathy is defined not only as the player’s empathetic feelings towards game characters or situations but as the “growth of attachment” to the game itself (Brown and Cairns, 2004, p.3). Interestingly, the majority of games described by the gamers as ‘Totally Immersive’ were of an FPS (first person shooter) model. This is perhaps because the direct visual and aural illusion of seeing and hearing from the character’s perspective gives players a greater feeling of ‘Presence’ within the game world. Grimshaw, Charlton and Jagger (2011) also make a link to player perspective when discussing FPS game elements which contribute to immersion: “Aiding the illusion of being present within the gameworld displayed on the screen, the FPS game typically posits the player with a first-person perspective[…]” (2011, paragraph 8).
Brown and Cairns also describe three elements of player attention which are linked to the immersion process: visual, auditory and mental. They go on to suggest that the more of these attention aspects pull on the player’s senses, the more involved with the game the player will become. “If gamers need to attend to sound, as well as sight, more effort is needed to be placed into the game. The more attention and effort invested, the more immersed a gamer can feel.” (Brown and Cairns, 2004, p. 3).

Ermi and Mäyrä (2005) also pick up on the concept of stages of immersion by presenting the SCI-model, which links aspects of game design to immersion and highlights their influence on the process. This model identifies three aspects of game design drawn from an analysis of interviews with gamers regarding their playing experiences, and then draws conclusions on how much these aspects impact the immersive quality of games. The design aspects are labelled ‘audio-visual quality and style’, ‘level of challenge’ and ‘imaginary world and fantasy’, which are linked to the core values of the SCI-model: Sensory immersion, Challenge based immersion and Imaginative immersion respectively (Ermi and Mäyrä 2005, pp. 7-9). Both Ermi and Mäyrä (2005) and Brown and Cairns (2004) acknowledge that audio plays a role in the immersion processes they describe, but neither study makes direct links with audio.

2.3 Immersive Audio

Collins (2008) picks up on the theory of Ermi and Mäyrä (2005) with a more direct stance on game audio by suggesting that the ‘imaginary world and fantasy’ element of the SCI-model “is strongly enhanced by audio” (Collins 2008, p.134). Collins also talks about the link between immersion and audio in more general terms by adding “The illusion of being immersed in a three-dimensional atmosphere is greatly enhanced by the audio” (Collins 2008, p. 132).

Sander Huiberts (2010) makes a more explicit connection between immersion and audio in a thesis which examines their relationship from a conceptual design standpoint. In his thesis, which was largely based around an analysis of user surveys (Creative Heroes 2007), he comments on the SCI-model of Ermi and Mäyrä (2005) by stating “audio is capable of enhancing the three dimensions of immersion by enhancing the sensory connection, the feeling of flow and the feeling of empathy of the player” (Huiberts, 2010, p. 101).
The classification and typology of audio in games is expressed in a number of ways by practitioners in the industry, each structuring game audio content into meaningful partitions with regard to functionality. Huiberts describes some of the different frameworks that have been developed to help structure game audio content, but also points out inconsistencies in each framework and makes it clear that the subject of game audio needs a more coherent model (2010, pp.15-20). The IZEA (Interface, Zone, Effect and Affect) framework (Huiberts, 2010, pp. 20-35) is presented as a tool to categorise and analyse the functioning of the different aspects of game audio (further details of the IZEA model can be found in appendix [1.0]). Huiberts uses the IZEA framework in conjunction with existing theories on game immersion, such as those from Ermi and Mäyrä (2005) and Brown and Cairns (2004), to highlight the role of audio (from a design perspective) in the process of player immersion. He also highlights two main functions of game audio used to enhance user experience: the use of audio to ‘optimise’ and to ‘dynamise’ gameplay. Audio which optimises gameplay is defined as sound which helps the player interact with the game and so enhances usability. Audio which dynamises gameplay helps to make the game more exciting and enhances the experience, according to Huiberts (2010, p.29). It is also noted that audio can be designed to both optimise and dynamise gameplay, as in the example given: “[…]an event which presents information, such as the sound of a weapon is designed for the communication of important information about the weapon itself and the Activity of the game as well as making the experience more exciting” (Huiberts, 2010, p.30).
Fundamentally, the IZEA framework is based on the three elements of immersion in the SCI-model (Ermi and Mäyrä 2005) and is claimed to be a conceptual design tool which will enhance immersion in those areas through the use of audio.

Grimshaw, Charlton and Jagger (2011) also recognise the importance of audio and its role in becoming immersed when describing FPS games, highlighting that audio is not limited by the visual restrictions of the game monitor or television screen. Sounds can be spatialised within the game so that the player can essentially hear what is going on all around them. They state, “[…]the visual space is restricted to what can be seen with stereoscopic vision, the aural space is unrestricted with reference to the position of heard sound sources” (Grimshaw, Charlton and Jagger, 2011, paragraph 10). They also add that the use of headphones can ‘monopolize’ the player’s aural senses by restricting sound external to the gameworld. They too acknowledge the ongoing debate regarding the term immersion, but draw out two similarities frequently found in explanations of the term: the state of immersion is linked to a disconnection from external (outside the game) surroundings, and to losing track of time (Grimshaw, Charlton and Jagger, 2011). Both are a result of the game holding the player’s full attention. With this in mind, the argument could be made that games which are widely accepted as immersive should have a core function within the design to continually draw the player’s full attention.

Grimshaw, Charlton and Jagger (2011) make a link between player attention and immersion with reference to the work of Brown and Cairns (2004, p. 3), as mentioned previously, but also suggest links with elements of game design. They indicate that one aspect of player attention is linked to learning and automating game functions such as controlling the character, navigating level layouts, attacking, defending and applying tactics. They argue that these functions of the game will be stored in the player’s mind and over time will become automated, requiring less player attention. If the player is subjected to a change in these learned game functions (such as encountering a new enemy or finding a new weapon), then the player is forced to invest more attention in order to adapt. Grimshaw, Charlton and Jagger also point out that “If certain actions were unable to become learned or automated in some way, game playing would be a laborious task and little progress would ever be made” (2011, paragraph 20).
They go on to highlight specific design elements which contribute to immersion in FPS games, such as ease of player control. Again, this is tied to the idea of the player automating game functions. In this case it is argued that if the player can automate the control system quickly and easily, their attention can be directed to becoming progressively immersed within the gameworld. In short, the less the player needs to fixate on an external controller or keyboard commands, the more attention they can invest in the gameworld. Another design consideration suggested by Grimshaw, Charlton and Jagger (2011) for optimal immersion in FPS games is the need for balance in the level of challenge, which usually starts off low and increases in difficulty as the game progresses and the player begins to automate common game functions. Also highlighted as vitally important for immersion is the player’s empathy with the character they are controlling, as well as with the gameworld atmosphere and environment. Brown and Cairns likewise identify this as a requirement for achieving ‘total immersion’ (2004, p.3).

As the surrounding literature shows, the term immersion has still to be explicitly defined. However, a clearer understanding of the term has been presented with regard to the context of this project.



3. Case Studies – Immersive games

As part of the research and development of the game level created for this project, it was important to dissect some existing game titles of a similar nature. The main objective of this line of research was to identify specific sound design and implementation techniques used by practitioners within the industry which contribute to an immersive experience. A secondary objective was to make a general observation of the role of game design as a whole in immersive gameplay.

3.1 Amnesia - The Dark Descent

Outline
Amnesia (Frictional Games 2010) is set in the darkness of Brennenburg Castle where the player takes control of Daniel, who has just awoken with a severe case of amnesia. The player must navigate through the labyrinth of hallways and puzzles to slowly reveal the hidden truth about who he is. With no weapons the player must use shadows and hiding places to avoid detection from the monsters lurking in the castle. The drawback to hiding in the dark is that Daniel begins to go insane, with an active ‘insanity meter’ giving the player feedback.

Immersion and Audio
Even before the game begins the developers have tried to set the scene and put the player in a specific mind set. On starting a new game the player is instructed to adjust brightness levels, play in a dark room and to wear headphones to optimise playing experience. Finally it is added that "Amnesia should not be played to win. Instead, focus on immersing your self in the game world and story" (Frictional Games 2010).
The use of headphones, as pointed out previously by Grimshaw, Charlton and Jagger (2011), can be instrumental in ‘monopolizing’ the player’s senses. In the case of Amnesia this is arguably even more so, due to the use of binaural sound within the game.

Typical gameplay takes place within a dark environment where the player is forced to hide or run rather than fight. This type of darkened setting plays into the hands of audio, as the player must rely more on auditory information to navigate and detect threats. The player must sneak and hide whilst actively listening for aural feedback, some of which is binaural as already mentioned. It could be argued, on the basis of findings from Grimshaw, Charlton and Jagger (2011), that this type of game setting and atmosphere heightens the player’s attention to their aural senses whilst keeping the player on aural alert. This results in ‘barriers’ (as suggested by Brown and Cairns (2004)) being removed, allowing further immersion to take place through the use of audio. The music also plays a crucial role in setting the atmosphere and tone of Amnesia by presenting the player with a familiar schema which conjures up preconceptions of the horror film genre. By making these connections the player has an understanding of, and familiarity with, the kind of setting they should expect from the very first opening sequence of the game. Tension, fear and the unknown are all phrases that sit well with the music of Amnesia.

As highlighted by both Brown and Cairns (2004) and Grimshaw, Charlton and Jagger (2011), empathising and making a connection with the game character and atmosphere is vital for immersion. A prominent feature of Amnesia which helps deliver a strong player attachment to the game character is the breathing mechanism. The character can be heard breathing, and the breathing works dynamically by being linked to player actions and game events such as being under threat. Although at times repetitive, this method is very effective at conveying the mood of the character whilst forcing the player to tune into the character’s breathing pattern and away from their own.

Unfortunately, the most disturbing aspect of the audio design in Amnesia was not the creepy or horrific sound design, but the inconsistent spatialisation of certain sounds, leading to extreme panning issues. Some, though not all, sound sources pan from one ear to the other (on headphones) very abruptly depending on player perspective and rotation (as extreme as 100% of the signal moving from one ear to the other). This unnatural spatialisation of sound is arguably the game’s biggest flaw and presents a barrier to immersion. Notably, the story and visual aspects of the game also contribute to immersion, which raised the question: can audio lead this kind of experience, or will it always be just one part of the immersion process?

3.2 The Last of Us

Outline
The Last Of Us (Naughty Dog) is set in a post-apocalyptic world 20 years after a global outbreak of a deadly fungal parasite, with the infected turning into zombie-like creatures. The Hollywood style survival story revolves around two main characters: Joel, a middle-aged man who witnessed the murder of his young daughter during the outbreak, and Ellie, a young girl who is immune to infection. The main plot follows Joel as he takes Ellie on a dangerous journey across the country to try to find a cure for the infection. Not only do they need to be concerned about the threat of infection and the creatures it breeds, but other survivors are just as much of a threat to their survival.

Immersion and Audio
Much like Amnesia, the player must sneak around undetected for much of the game if they wish to live. In order to move around undetected effectively, the player can use a focus function (linked to a button press) in which they are essentially listening for threats. When entered, the focus mode sends the bulk of the in-game audio through a low pass filter, with threat sounds such as enemy noises left largely unaffected. This works in combination with a visual effect in which everything but moving objects is slightly blurred and out of focus. This feature allows the player to dynamically tune into threat sounds as and when they feel it necessary, much like a real life scenario where one can selectively listen for a specific sound. This is especially effective given the distinctive clicking sound made by the zombie-like creatures. The 'clickers' (as referred to in the game) have no vision and use this sound as an echolocation mechanism. This also means that the player can be detected if any sudden movements are made near a ‘clicker’. This distinctive aural gameplay function is an essential driver of the survival mechanism throughout the game, given that the main threat is usually heard before it is seen. The focus feature at its core is a device which draws on more of the player’s aural attention, as well as making the player empathise with the character’s situation. Although the game is not in a first person perspective, there is still a strong player to character attachment, which is bolstered by the focus feature as well as a convincing breathing system. Much like Amnesia, the use of audio and the fact that the player must actively listen (or pay the price) throughout much of the game is arguably a strong contributor to the game’s immersive nature.
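The focus behaviour described above can be illustrated with a minimal sketch. It is not taken from the game's actual implementation, which is not documented here. It assumes two mono audio buses held as lists of float samples, a simple one-pole low pass filter, and a hypothetical cutoff of 800 Hz; the function names `one_pole_lowpass` and `focus_mix` are invented for illustration.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate=44100):
    """Apply a simple one-pole low pass filter to a list of float samples."""
    # Smoothing coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # move the output a fraction towards the input
        out.append(y)
    return out


def focus_mix(ambient, threat, focus_on, cutoff_hz=800.0, sample_rate=44100):
    """Mix two buses; in focus mode only the ambient bus is filtered.

    The 800 Hz cutoff is an assumed value chosen for illustration.
    """
    if focus_on:
        ambient = one_pole_lowpass(ambient, cutoff_hz, sample_rate)
    # Threat sounds pass through unfiltered so they stand out in the mix.
    return [a + t for a, t in zip(ambient, threat)]
```

In this sketch, bright ambient content is dulled while focus mode is active, so an unfiltered threat sound such as a click becomes easier to pick out.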

The use of space is also a strong point in the game. Room spaces and tonal changes from one room to the next are highly audible. This is especially evident in focus mode, where the player can hear enemy sounds from behind walls. The spatialised presence of enemies along with realistic room acoustics solidifies the player’s 'presence' within the game. This is arguably one of the key immersive gameplay factors in the game, as the player is essentially forced to tune into the character’s hearing system.

Two key design features of The Last Of Us are its utilisation of the player’s senses and an effective balance of gameplay, including intense periods of survival action alongside periods of exploration in relative safety. In this case, as in most AAA games, the storyline is also a major hook to get the player’s attention. Perhaps the biggest attachment from player to game lies with this element, which in the case of The Last Of Us is exceptional and received a host of awards including five BAFTAs, notably for audio use and best story, to name but two. That being said, as impressive as the audiovisual design and story of 'The Last Of Us' are, they would count for nothing if the gameplay experience did not match the expectations set by these elements.





4. Methodology


4.1 Production

A combination of sound recorded specifically for the project and pre-existing recordings from the author’s sound library was edited and exported for the test level using the AVID Pro Tools DAW (Digital Audio Workstation).

The UDK (Unreal Development Kit) game engine was used to create the test level. This choice of software was largely based on the approachability of UDK with regard to its visual scripting system within the editor (Kismet). Given the author’s lack of expertise in computer coding, UDK provided a workable environment in which to create the testing level. Initially there were to be two versions of the level, created for the purpose of comparing the different sound design elements with regard to their immersive potential. However, due to an underestimation of the time required and the technical challenges of designing and implementing game audio, only one version was created.

4.2 Testing Methods

The core questionnaire module from the GEQ (Game Experience Questionnaire) (Ijsselsteijn et al, 2008) was used to assess the test level as it was specifically developed to measure player experience. Use of the GEQ can also be found in similar studies in which player experience is measured such as those by Nacke and Lindley (2008) and Nacke, Grimshaw and Lindley (2010).    
The GEQ core questionnaire module is aimed at analysing the player’s feelings during gameplay and is divided into seven dimensions: ‘immersion’, ‘tension’, ‘competence’, ‘flow’, ‘negative affect’, ‘positive affect’ and ‘challenge’, which address different aspects of player involvement with a game. The questionnaire is made up of a series of statements regarding the player’s feelings, and the players indicated how much they agreed with each statement on a scale of 0 to 4 (0 = Not at all, 4 = Extremely). Included with the core GEQ module is an In-Game version, which consists of the same questions in a more concise format for assessing players during play. Stopping and starting play to answer questions was not in the interests of inducing immersion, so the In-Game version was instead included at the end along with the main questionnaire. The point of this was to try to gather a more accurate average by asking similar questions twice and compiling the results into one single data set.
Supplementary to the GEQ, another section was added which was directed more towards the audio elements of the test level. The players were required to rate each listed aspect of the game sound design based on how immersive it was (from not at all immersive = 0 up to completely immersive = 4). As part of this section, a short explanation of the author’s definition of immersion was provided. This supplementary section was aimed at providing additional evidence which could be correlated with the responses from the GEQ.
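The way questionnaire ratings of this kind can be reduced to one score per dimension is sketched below. This is a minimal illustration rather than the analysis actually carried out for this project. It assumes each dimension score is the mean of its item ratings, and the item-to-dimension mapping and participant data shown are hypothetical placeholders, not the published GEQ scoring key or real results.

```python
from statistics import mean

# Hypothetical item-to-dimension mapping, for illustration only.
# The GEQ scoring guidelines assign specific item numbers to each dimension.
DIMENSIONS = {
    "immersion": [1, 2],
    "tension": [3, 4],
}


def score_dimensions(responses, dimensions=DIMENSIONS):
    """Return the mean 0-4 rating per dimension for one participant."""
    scores = {}
    for name, items in dimensions.items():
        ratings = [responses[i] for i in items]
        if any(not 0 <= r <= 4 for r in ratings):
            raise ValueError("ratings must lie between 0 and 4")
        scores[name] = mean(ratings)
    return scores


def average_across_players(all_scores):
    """Average each dimension over every participant's scores."""
    names = all_scores[0].keys()
    return {n: mean(s[n] for s in all_scores) for n in names}


# Invented example data: item number -> rating, one dict per participant.
players = [
    {1: 3, 2: 4, 3: 1, 4: 2},
    {1: 2, 2: 3, 3: 0, 4: 1},
]
per_player = [score_dimensions(p) for p in players]
overall = average_across_players(per_player)
```

Ratings from the supplementary audio section could be averaged in the same way by treating each listed sound design aspect as its own one-item dimension.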


4.3 Testing Conditions

Each player was instructed to wear headphones whilst playing the game to maximise audio impact. As noted by Grimshaw, Charlton and Jagger (2011), the use of headphones can ‘monopolize’ the player’s aural senses. This was essential in order to minimise external distractions (sound from the testing area) which could introduce a ‘barrier’ (as discussed previously in the work of Brown and Cairns (2004)) to the player becoming immersed. The technical challenges of audio programming and game scripting (which will be discussed later) forced the project to run behind schedule. As a result, little time remained to carry out actual testing of the game, which in turn led to less than optimal conditions with regard to the PC used and the location: testing was carried out on a laptop within the audio lab (Whitespace, Abertay).




 

5. Pre-Production


5.1 Level Overview

The initial intention of the project was to design and build the game level specifically, using the tools and assets that come with UDK. However, level building was not the primary objective and would have taken valuable time away from audio development, so a custom level was outsourced for the implementation and testing of audio assets. The level selected (credit to level designer Chris Holden) is set in a dark dungeon which is mainly lit by burning torches mounted on walls and occasional openings in the roof. The level also had basic gameplay systems already in place, including objectives and tasks. The player (in first person perspective) must navigate and escape the dungeon by finding three hidden relics, which then have to be placed into a stone altar in order to open the final door. Along the way the player can also pick up hidden keys which open treasure chests throughout the level. The basic level matched the required style and provided a good starting point for implementing an immersive experience. This style requirement is linked to the findings presented within the literature review, which highlight the immersive potential of FPS (first person shooter) type games. In addition, the dungeon is poorly lit, which forces the player to rely on their other senses.
The level included a degree of challenge by way of exploration, finding hidden objects and working out how to open doors. There were, however, no enemies or threats to the player of any kind, and although visually impressive the level was not very immersive as it stood. The main objective was to improve the immersive quality of the game through the use of audio.

5.2 Level Design

Genre
The overall appearance of the level sets it firmly within the horror/suspense/adventure genre. The visual design was capitalised on to present a sound design which is both familiar and convincing to the player. The sound is primarily aimed at reinforcing the horror and suspense theme, but because the gameplay also involves exploration, treasure and finding hidden items, it incorporates an adventure element too.

6. Designing for Immersion


6.1 Game Character

To enable a strong link between the player and character, a breathing system was developed which responds dynamically as the player moves around, for example with heavy breathing when running. To enhance this system further, the breathing sounds were recorded binaurally using specialist in-ear microphones. The primary aim of this was to capture the breathing sound as it is heard by the individual who is breathing. Played back through headphones, this results in sonic characteristics within the breathing sound which give the player a sense that the breath is coming from them, thus fabricating an empathetic bond between the player and character. The importance of this bond for immersion was highlighted through the work of Grimshaw, Charlton and Jagger (2011) and Brown and Cairns (2004) in the literature review.
Also important to this bond are character footsteps, which are notoriously repetitive in many games. Addressing this meant balancing variation against system memory usage, as including too many individual sounds within each footstep cue resulted in playback issues. The level floor is made up of a variety of materials, so variation between those helps to break up any long spells of the same sound playing back. One of the key factors was keeping the footstep sounds consistent when the player transitions between materials. Combined with variation, this was aimed at maintaining a balanced sound which did not break the player’s attention (and, in turn, immersion).
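The trade-off described above, a small pool of samples per floor material with no audible repetition, can be illustrated with a minimal sketch. The Python below is not the UDK implementation used in the project; the class name `FootstepCue`, the material names and the sample file names are all hypothetical. It simply picks a random sample for the current material while refusing to play the same sample twice in a row.

```python
import random


class FootstepCue:
    """Pick footstep samples per floor material without immediate repeats."""

    def __init__(self, samples_by_material, seed=None):
        # samples_by_material maps a material name to a short list of samples.
        self.samples_by_material = samples_by_material
        self.rng = random.Random(seed)
        self.last_played = {}  # material -> sample played most recently

    def next_sample(self, material):
        pool = self.samples_by_material[material]
        last = self.last_played.get(material)
        # Exclude the previous sample so the same step never plays twice in a row.
        # If the pool holds a single sample, fall back to the whole pool.
        candidates = [s for s in pool if s != last] or pool
        choice = self.rng.choice(candidates)
        self.last_played[material] = choice
        return choice


# Hypothetical material and file names, for illustration only.
cue = FootstepCue(
    {
        "stone": ["stone_01.wav", "stone_02.wav", "stone_03.wav"],
        "wood": ["wood_01.wav", "wood_02.wav"],
    },
    seed=1,
)
steps = [cue.next_sample("stone") for _ in range(6)]
```

With only three samples per material the memory cost stays low, while the no-repeat rule removes the most noticeable form of repetition.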


6.2 Creating a Threat 

The presence of a threat was essential for setting tension and atmosphere as well as for adding dynamism to the gameplay. Physically placing a visible foe in the level was considered, but without the assets for a suitable enemy which fits the aesthetics of the level (such as some kind of monster character) this was not an option. Instead, and more suitably for the aim of the test level, a sense of threat was introduced through the use of audio. Various elements contribute to the illusion of threat within the level, and these will now be discussed in more detail.
The underlying implication that there is a threat to the player is established from the very beginning of the level. The first instance sits within the atmospheric bed of sound that creates the setting; this plays on the user’s already established schema of the horror film and game genre to inform them about the kind of situation that may follow (setting the scene). The atmospheric layers of sound contain a randomly generated leitmotif sound device which is aimed at priming the player to associate this sound with the threat. The use of a leitmotif is summed up by Kalinak: “[…]leitmotifs heightened spectator response through sheer accumulation, each repetition of the leitmotif bringing with it the associations established in earlier occurrences” (Kalinak, K. 1992. p.104). The raw sound used in this situation was made by impacting and scraping a disused grain bin in a farm building. The sounds from this recording session are the base of all threat sounds within the level, from the looping atmospheric background to the intense music (which will be discussed later).
The physical presence of a threat was created by placing sounds imitating a roaming beast within the level (this time supported by vocal elements). The beast sound functions dynamically, by moving around the level, and spatially, by having an identifiable source as it moves. As the player passes through hidden trigger points, the beast sound is perceived spatially and with directionality, indicating that something is closing in on their location. This reinforces the idea that the player is not alone, pushing them into a state of awareness and, importantly for immersion, drawing and keeping their attention on the gameworld.

Camera actors within UDK were used to create a cut scene in which the player’s visual perspective is switched, as if they are suddenly viewing through the eyes of the threat. This camera sequence is triggered by the player picking up one of the relics and shows the threat moving quickly towards their location. During the cut scene the sequence is accompanied by a sound effect driven track of distorted noise and the signature base metallic leitmotif sound. These are synchronised to the movements of the threat and end in a peak of the dynamic range before the view returns to the player. This is where the main threat music cues in, signifying to the player that the threat is no longer just a suggestion.

6.3 Music

The main threat music was composed almost entirely with sounds recorded during the grain bin session mentioned earlier. The raw recordings were manipulated using pitch shift, delay and reverb to create a core rhythmic section which was supported by additional drums. The music was created in sections which reflect different intensity levels, and these levels correspond to the current state of the player in the game. When the cut scene ends and switches back to the player perspective, the music is at its highest intensity in a continuous loop. The loop remains in this state until the player escapes through a gate, which they must open by operating a lever and then reach before it slams closed again. After this trigger point the music changes state to a less intense version, which gives the player a sense that they have progressed and are in less danger. As the player leaves this area, another trigger point switches the music to a low intensity version, which eventually fades out, returning to the original ambience.
A slightly different version of the high intensity music was implemented in another area of the level to create a second scare point. Here the player walks past an iron gate, which opens to reveal a hidden key located within a small room. A sequence was created so that as soon as the player picks up the key the gate slams shut behind them whilst the high intensity music sharply cuts in. The roaming beast sound mentioned before is triggered at the same time and can be perceived as moving within the halls on the other side of the gate. This action is also linked to a post processing volume, an area defined within the UDK level editor which acts as a type of camera effect. In this case the player’s vision becomes blurred, helping to convey the character’s emotion to the player. Again there is no actual threat to the player, only the suggestion. The gate reopens after a predefined time has passed, but further development of this area, such as a puzzle or mechanism for the player to trigger, may work better from an interactive point of view.
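The trigger-driven intensity changes described above behave like a small state machine. The Python sketch below is an illustration of that idea only and does not reproduce the UDK setup used in the project; the state names, the trigger names (`cutscene_end`, `gate_escaped`, `area_left`, `fade_complete`) and the labelling of the "less intense" version as a medium level are assumptions.

```python
from enum import Enum


class MusicState(Enum):
    AMBIENCE = "ambience"
    HIGH = "high_intensity_loop"
    MEDIUM = "less_intense_loop"
    LOW = "low_intensity_loop"


# (current state, trigger) -> next state. Trigger names are hypothetical.
TRANSITIONS = {
    (MusicState.AMBIENCE, "cutscene_end"): MusicState.HIGH,
    (MusicState.HIGH, "gate_escaped"): MusicState.MEDIUM,
    (MusicState.MEDIUM, "area_left"): MusicState.LOW,
    (MusicState.LOW, "fade_complete"): MusicState.AMBIENCE,
}


def next_state(state, trigger):
    """Return the new music state, or the current one if the trigger does not apply."""
    return TRANSITIONS.get((state, trigger), state)


# Walk through the sequence of trigger points the player passes.
state = MusicState.AMBIENCE
history = [state]
for trigger in ["cutscene_end", "gate_escaped", "area_left", "fade_complete"]:
    state = next_state(state, trigger)
    history.append(state)
```

Keeping the transitions in a single table means a trigger that fires out of order, such as leaving the area before the gate has been passed, leaves the music unchanged.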

6.4 Atmosphere & Environment

Given the visual style and player expectations of a dungeon setting, an atmosphere was created for the level which consisted of both diegetic and non-diegetic elements. A thunderstorm was created by looping a sample of rain and combining it with thunder rolls which trigger at random time intervals. This complements the water which continually drips into the dungeon through various openings, giving the player a sense that something exists outside the walls of the dungeon. Further persuading the player of their presence within the gameworld are the sounds of objects in the game, such as burning torches or running water. Each is placed in the level and attenuated to mimic real world acoustics with regard to volume and low pass filtering over distance. The aim is to transport the player into this world by making the environment sound as realistic as possible. A non-diegetic layer of ambient sound is also present throughout the level, which adds a sense of mood and tone between the action sequences. This layer also works in conjunction with the threat music by shifting to a more ‘on edge’ variation after the threat music has faded, giving the player the message that all has not returned to normality.
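As a rough illustration of attenuating a placed sound over distance, the sketch below maps listener distance to a volume and a low pass cutoff. It is a generic linear falloff written in Python, not UDK's own attenuation model; the function name `attenuate`, the radii and the cutoff frequencies are assumed values chosen only for the example.

```python
def attenuate(distance, min_radius=200.0, max_radius=2000.0,
              open_cutoff_hz=20000.0, closed_cutoff_hz=800.0):
    """Return (volume, lowpass_cutoff_hz) for a sound heard at `distance`.

    Inside min_radius the sound is at full volume and effectively unfiltered;
    beyond max_radius it is silent. In between, both values change linearly.
    All default values are illustrative assumptions.
    """
    if distance <= min_radius:
        t = 0.0
    elif distance >= max_radius:
        t = 1.0
    else:
        t = (distance - min_radius) / (max_radius - min_radius)
    volume = 1.0 - t
    cutoff = open_cutoff_hz + t * (closed_cutoff_hz - open_cutoff_hz)
    return volume, cutoff


# A torch heard up close, halfway through the falloff, and out of range.
near = attenuate(100.0)
mid = attenuate(1100.0)
far = attenuate(2500.0)
```

Lowering the cutoff together with the volume is what makes a distant torch sound dull as well as quiet, rather than simply quieter.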

6.5 Player Feedback Elements

The main mission for the player, as pointed out, is to find all the hidden relics and place them into a stone altar, which opens the final door and allows them to escape. As a side mission the player can also find the hidden keys which open treasure chests around the level. The two item pick up sounds (key and relic) are intended to be associated with gratification and reward whilst giving the player an incentive to keep searching for the items. The key pick up, which is required to unlock the treasure chests, is a bright sound which stands out from the dark atmospheric sound of the level, giving the player a sense of hope and achievement. The relic pick up sound was designed to be a little more mysterious, to leave the player wondering what exactly this item is.
The player is also presented with intensifying versions of a sound effect (synchronised to the action of relic placement) with each relic securely placed. This again is aimed at player satisfaction whilst implying that there will be a climax when the final relic is placed.

Another important section of the gameplay which required player feedback involved two sets of timed gates. In this situation the player must activate two levers within a set timeframe in order to open specific gates. The basic system for this was already set up within the level, but there was little feedback on what the player actually had to do apart from matching numbers on the wall. A ticking timer sound was created to indicate the time limit when the player activates one of these levers; this sound was attached to the lever itself and can be heard spatially within the level. To increase the sense of urgency, a looping music cue was also created to reinforce the fact that there is a time limit on getting to the other lever. This cue can be heard for the entire duration of the set time, as opposed to the ticking sound, which fades out as the player moves away from the lever. Because the ticking sound is attached to both levers, the player can use it as a guide to locating the next lever as they get closer to it. As a last touch, a camera cut scene was used to switch the player’s view to the location of the next lever that needs to be used.
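The timed gate mechanic can be summarised as a small piece of logic: pulling one lever starts a countdown, and the gate opens only if the other lever is pulled before the countdown expires. The Python sketch below is an illustration rather than the level's actual scripting; the class name `TimedGate`, the lever labels and the 20 second limit are assumptions.

```python
class TimedGate:
    """Two-lever gate: the second lever must be pulled within the time limit."""

    def __init__(self, time_limit=20.0):
        self.time_limit = time_limit   # seconds; illustrative value
        self.first_lever = None        # lever that started the countdown
        self.started_at = None         # time at which the countdown began
        self.is_open = False

    def timer_running(self, now):
        """True while the ticking sound and looping music cue should play."""
        return (self.started_at is not None
                and not self.is_open
                and now - self.started_at <= self.time_limit)

    def pull(self, lever, now):
        """Register a lever pull at time `now` (seconds); return whether the gate is open."""
        if self.is_open:
            return True
        if self.timer_running(now) and lever != self.first_lever:
            self.is_open = True        # the other lever was reached in time
        else:
            # First pull, repeat pull of the same lever, or expired timer:
            # (re)start the countdown from this lever.
            self.first_lever = lever
            self.started_at = now
        return self.is_open


# Player reaches the second lever in time.
gate = TimedGate(time_limit=20.0)
gate.pull("A", now=0.0)
ticking_mid = gate.timer_running(now=10.0)
opened = gate.pull("B", now=15.0)

# Player is too slow, so the countdown restarts from the second lever.
late = TimedGate(time_limit=20.0)
late.pull("A", now=0.0)
late_opened = late.pull("B", now=25.0)
```

The `timer_running` check is the point at which the audio feedback described above would be driven: the ticking and the looping cue play while it is true and stop once the gate opens or the time runs out.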

7. Results and Analysis

Fourteen male participants between the ages of 20 and 28 were asked to play the level and then answer the GEQ and supplementary questions relating to it. Sessions varied in length depending on player ability, with each play session lasting approximately 15-25 minutes.
The results from both the GEQ core and in-game questions were combined to account for participants answering differently to similar questions in each set (as mentioned, the in-game questions are a streamlined version of the core questions). Only marginal differences were evident when the two questionnaires’ results were compared, as can be seen in graph [g:1].


 [g:1] Showing the average scores from both the Core questions and the In-Game version (the average of these two is presented as 'Merged result')

The merged results imply a greater sense of immersion from the ‘sensory and imaginative’ dimension (M = 2.94), which could be linked to the aural part of the design. However, this is not definitive, as visual and story aspects also fall under the same dimension. Indeed, when we look at the specific questions relating to the ‘sensory and imaginative’ dimension, as can be seen in graph [g:1.1], the most visually based question, “It was aesthetically pleasing”, scored highest (M = 3.36). Positive affect (M = 2.5), Flow (M = 2.47) and Challenge (M = 2.38) were not far behind, with ‘Flow’ being linked directly (in the case of the GEQ) to the players’ disengagement from the world outside of the game and their inability to keep track of time. Competence followed (M = 2.24), with Negative affect (M = 0.73) and Tension/Annoyance (M = 0.72) returning relatively small scores, as expected. It could be safely argued that the players did not have a bad experience, but pinpointing exactly how much the audio design contributed to these results is very difficult without a measure to test against.


[g:1.1] Questions relating to the Sensory and Imaginative Immersion dimension (average scores shown)

The supplementary questions were aimed at specific audio design elements and rated on a 0 – 4 scale, much like the GEQ (seen in graph [g:1.2]). Participants were given the following statement before answering this section:
“If we define game immersion as being fully engaged with the activity of playing the game whilst feeling a presence in the game world. Eg, time passes unnoticed; you become unaware of events or people around you; your heart rate quickens in scary or exciting sections; you empathise with or become the character ect. Now thinking about the game audio specifically, please rate the following sections based on their impact.”
This was important so that all participants were thinking about the same concept before giving their ratings. The different elements scored similar results, with ‘Ambience’ and ‘Perception of threat’ highest (both M = 3.69), followed closely by ‘Spatialisation of environment’ (M = 3.62), then ‘Game character’ and ‘Diegetic sound’ (both M = 3.38). The music was found least immersive (M = 3.23), although only in comparison to the other elements (as seen in graph [g:1.2]). All elements scored higher than the dimensions in the GEQ, which perhaps suggests that there is still work to be done to enhance the GEQ so that it recognises more of the specifics of games which lead the player into an immersive state. The overall average (M = 3.47) of the audio design elements appears to be a promising result when determining whether audio can be specifically designed for the purposes of immersion.

 [g1.2] Showing average results from player immersion ratings on specific audio design elements

8. Conclusions and Further Work

As a solo developer, the author’s lack of coding knowledge presented a technical challenge and a steep learning curve, which resulted in the project running over time. This forced the level into testing before it was really finished and as such prevented many ideas from being realised. Refinement of the GEQ and/or the supplementary questions would be the next logical step for future developments. A more robust testing system altogether, with better controlled testing conditions, could perhaps yield more definitive results. The main element missing from this study, however, is the original idea of creating a second test level for an A/B style comparison. A smaller test level may have allowed more time to implement this, but could have resulted in a shorter playing time, thus restricting the players’ ability to become fully immersed. The information gathered throughout this document is nevertheless a starting point for further work on unpicking the connection between audio and immersion in games. It will also inform and educate any game developer with an interest in the design properties of immersion. Although the author has not pinpointed a definitive answer, the foundation has been set for future studies. Can audio be designed with the goal of creating an immersive experience? There is no doubt in the author’s mind that this can be achieved with suitable resources.

APPENDIX

9. Appendix


[1.0] IEZA Framework
The IEZA model first of all divides audio into two categories, ‘Diegetic’ and ‘Non-diegetic’, then further divides these into four domains: ‘Interface’, ‘Effect’, ‘Zone’ and ‘Affect’. ‘Diegetic’ sounds are those which originate from within the game world, such as character footsteps, gunfire and weather. ‘Non-diegetic’ sounds are those which originate from outside the game world, such as music or menu navigation sounds. Huiberts also points out that these domains are interdependent, which adds another dimension to the model: ‘Activity’, which connects ‘Interface’ and ‘Effect’, and ‘Setting’, which connects ‘Zone’ and ‘Affect’. He describes these as follows: “The Activity communicates events occurring in the game environment, while the Setting provides a background or context for the Activity.” (Huiberts, S. 2010, p. 24)









[f:1]

[f:1] Illustration of IEZA Model (Huiberts, S. 2010, p. 25)

[f:2]

[f:2] Illustration of IEZA Model with general design properties (Huiberts, S. 2010, p. 32)


IEZA Summary:
Effect (Part of the diegetic division)
  • Sounds that are perceived as originating from within the game world, for example sounds that would be heard by the character.
  • Sounds responsive to the player’s activity within the diegetic world, triggered either directly or indirectly.
Zone (Part of the diegetic division)
  • Sounds which are part of the diegetic environment, for example background ambience such as bird calls and wind in a forest setting.
  • Sounds which do not interact with the player but add a sense of realism to the game world by providing a backdrop of enveloping sound.
  • “Communicating an ambient, background layer, which forms an auditory setting for the game world” (Huiberts, S. 2010, p. 27).
Interface (Part of the non-diegetic division)
  • Informing sounds which are outside of the game world, such as health bar or score feedback and menu interactions.
Affect (Part of the non-diegetic division)
  • Sounds which influence mood, such as music at key game points to build tension.
  • Sounds which are not part of the diegesis but are used to affect the player’s behaviour during gameplay.

[2.0] Game Experience Questionnaire – Core Module Results



[2.1] Game Experience Questionnaire – In-Game Module


[2.3] Supplementary Questions – Immersion Ratings on Design



10. References

Brown, E. and Cairns, P. (2004). A Grounded Investigation of Game Immersion. Extended Abstracts of the 2004 Conference on Human Factors and Computing Systems, Vienna April 24-29 2004. New York: ACM. pp. 1297-1300. [online]. Available from: http://www-users.cs.york.ac.uk/~pcairns/papers/Immersion.pdf [Accessed 1st November 2014]

Calleja, G. (2007). Revising Immersion: A Conceptual Model for the Analysis of Digital Game Involvement. Digital Games Research Association (DiGRA). Situated Play, Proceedings of DiGRA 2007 Conference. [online]. Available from: http://www.digra.org/wp-content/uploads/digital-library/07312.10496.pdf [Accessed 7th November 2014]

Collins, K. (2008). Game Sound An Introduction to the History, Theory, and Practice of Video Game Music and Sound Design. Cambridge, Massachusetts: Massachusetts Institute of Technology.

Dovey, J. and Kennedy, H.W. (2006). Game Cultures: Computer Games as New Media. Berkshire: Open University Press.

Ermi, L. and Mäyrä, F. (2005). Fundamental Components of the Gameplay Experience: Analysing Immersion. In Proceedings of DiGRA 2005 Conference: Changing Views Worlds in Play. [online]. Available from: http://www.digra.org/wp-content/uploads/digital-library/06276.41516.pdf [Accessed 26th October 2014]

Ijsselsteijn et al. (2008). Measuring the Experience of Digital Game Enjoyment. In Proceedings of Measuring Behavior, Maastricht, Netherlands, August 26-29, 2008. pp. 88-89. [online]. Available from: http://www.noldus.com/mb2008/program/Proceedings_Measuring_Behavior_2008_web.pdf [Accessed 21st January 2015]

Kalinak, K. (1992). Settling the Score: Music and the Classical Hollywood Film. Wisconsin: University of Wisconsin Press.

Nacke, L. and Lindley, C. (2008). Boredom, Immersion, Flow - A Pilot Study Investigating Player Experience. In IADIS Gaming 2008: Design for Engaging Experience and Social Interaction, Amsterdam, The Netherlands, July 25-27, 2008. [online]. Available from: http://btu.se/fou/Forskinfo.nsf/all/2490c60a20fbcc6ec12574db005900fc/$file/IADIS-PilotStudy-Paper-Final-Short.pdf [Accessed 5th March 2015]

Nacke, Grimshaw and Lindley. (2010). More than a feeling: Measurement of sonic user experience and psychophysiology in a first-person shooter game. Interacting with Computers. 22(5): pp. 336-343.

Grimshaw, Charlton and Jagger. (2011). First-Person Shooters: Immersion and Attention. Eludamos. Journal for Computer Game Culture. 5(1): pp. 29-44 [online]. Available from: http://www.eludamos.org/index.php/eludamos/article/viewArticle/vol5no1-3/html3# [Accessed 1st February 2015]

Huiberts, S. (2010). Captivating Sound, The Role of Audio for Immersion in Computer Games. [online]. Available from: http://download.captivatingsound.com/Sander_Huiberts_CaptivatingSound.pdf [Accessed 20th October 2014]

Jennett et al. (2008). Measuring and defining the experience of immersion in games. Int. J. Human-Computer Studies. 66(May): pp. 641-661.

McMahan, A. (2003). The Video Game Theory Reader. Chapter 3 - Immersion, Engagement, and Presence. New York: Taylor & Francis Books, Inc.

Murray, J. (1997). Hamlet on the Holodeck: The Future of Narrative in Cyberspace. New York: Simon & Schuster.

Strauss, A. and Corbin, J. (1998). Basics of Qualitative Research. 2nd ed. Sage Publications, Inc. [online]. Available from: http://stiba-malang.ac.id/uploadbank/pustaka/RM/BASIC%20OF%20QUALITATIVE%20RESEARCH.pdf [Accessed 1st November 2014]

Taylor, L. (2002). Video Games: Perspective, Point-of-View, and Immersion. [online]. Available from: http://etd.fcla.edu/UF/UFE1000166/taylor_l.pdf [Accessed 25th October 2014]

BIBLIOGRAPHY

11. Bibliography

Collins, K. (2008). Game Sound An Introduction to the History, Theory, and Practice of Video Game Music and Sound Design. Cambridge, Massachusetts: Massachusetts Institute of Technology.

Droumeva, M. (2005). Understanding immersive audio: A historical and socio-cultural exploration of auditory displays. Proceedings of ICAD 05-Eleventh Meeting of the International Conference on Auditory Display. Limerick, Ireland, July 6-9 2005. Georgia Institute of Technology. [online]. Available from: https://smartech.gatech.edu/bitstream/handle/1853/50196/Droumeva2005.pdf?sequence=1 [Accessed 11th November 2014]

Ekman, I. (2013). On the Desire to Not Kill Your Players: Rethinking Sound in Pervasive and Mixed Reality Games. Conference: Foundations of Digital Games 2013. [online]. Available from: http://www.fdg2013.org/program/papers/paper19_ekman.pdf [Accessed 20th November 2014]

Grimshaw. (2011). Game Sound Technology and Player Interaction: Concepts and Developments. Hershey, PA: Information Science Reference (an imprint of IGI Global).

Horowitz and Looney. (2014). The Essential Guide to Game Audio: The Theory and Practice of Sound for Games. Burlington, MA: Focal Press.

Shilling, Zyda and Wardynski. (2002). Introducing Emotion into Military Simulation and Videogame Design: America’s Army: Operations and VIRTE. [online]. Available from: http://calhoun.nps.edu/bitstream/handle/10945/41580/ShillingGameon2002.pdf?sequence=1 [Accessed 10th November 2014]

Stevens and Raybould. (2011). The Game Audio Tutorial: A Practical Guide to Sound and Music for Interactive Games. Burlington, MA: Focal Press.