CAN GAME AUDIO BE
SPECIFICALLY DESIGNED AND IMPLEMENTED FOR THE PURPOSE OF LEADING THE PLAYER
INTO AN IMMERSIVE EXPERIENCE?
GARY MAIN
University of Abertay Dundee
School of Arts, Media and Computer Games
May 2015
University of Abertay Dundee
Author: Gary Main
Title: Can game audio be specifically designed and implemented for the purpose of leading the player into an immersive experience?
Degree: BA Hons – Creative Sound Production
Year: 2015
Date: 1st of May 2015
Abstract
The purpose of this dissertation is to investigate the impact audio has on the immersive nature of computer games, whilst also determining whether its effectiveness can be pre-designed and measured. The author uses knowledge from the surrounding literature to define the characteristics required to implement audio into a working level which is designed to produce an immersive experience. This experience is measured using the established GEQ (Game Experience Questionnaire) to determine whether the key design features implemented during production prove effective in immersing the participants. Information extracted from the results is, however, limited due to there being only one level to analyse. Original plans for a second level for A/B testing were scrapped due to unexpected technical difficulties during the implementation stage.
Table of Contents
1. Introduction
LIST OF FIGURES
[g:1] Graph – GEQ Test result
[g:1.1] Graph – GEQ Questions relating to the Sensory and Imaginative Immersion dimension
[g:1.2] Graph showing average results from player immersion ratings on specific audio design elements
[1.0] IZEA Framework outline
[2.0] Game Experience Questionnaire – Core Module Results
[2.1] Game Experience Questionnaire – In-Game Module
[2.3] Supplementary Questions – Immersion Ratings on Design
[f:1] Illustration of IZEA Model
[f:2] Illustration of IZEA Model with general design properties
1. Introduction
The term ‘immersion’ or ‘immersive’ is often used by gamers when describing games and
is often associated with the quality of a game, but what exactly does it mean
to be immersed in a game and what defines an immersive experience? Most
importantly, what role does audio play in immersing a player?
Through a review of literature surrounding the term ‘immersion’, this dissertation investigates its meaning within the context of playing video games. Current
definitions of the term ‘immersion’ will be discussed in detail through an
analysis of research carried out by practitioners in the field of game study.
This overview will then bring into focus the aural aspect of immersion and the
research which is directed at this specific side of game design. Case studies
of existing games will also be presented with the aim of highlighting key
design elements which support the importance of audio for immersion. The literature review and the case studies form the rationale and basis on which the level was designed and implemented, as will be discussed in detail. Testing methods, results and analysis will follow before conclusions are drawn as to whether audio can indeed be designed for the purpose of creating an immersive experience.
2. Literature Review
2.1 What is Immersion?
A strong foundation for the design and production of an aurally immersive experience first requires a more definitive explanation of the term ‘immersion’. Any gamer or developer will know and often use the term, but there is much debate on both its definition and measurability. Even in the academic arena there are many theories and opinions, as highlighted by Grimshaw, Charlton and Jagger (2011), who state “Academics, equally, have wide interpretations of the term and there is, as yet, no widely established model providing a definition and describing the process” (2011, p.29). What exactly is immersion and what does it mean to be ‘immersed’ within the context of computer games? What is the process of becoming immersed in a computer game? Also, and critically important to this project, what role does audio play in the immersive nature of games and how can it be measured? This literature review aims to answer these fundamental questions as well as to provide the rationale and framework for actions taken to complete the project.
Janet Murray describes immersion in a broad sense:
“Immersion is a metaphorical term derived from the physical experience of being submerged
in water. We seek the same feeling from a psychologically immersive experience
that we do from a plunge in the ocean or swimming pool: the sensation of being
surrounded by a completely other reality, as different as water is from air,
that takes over all of our attention, our whole perceptual apparatus” (Murray,
1997, p. 98)
Dovey and Kennedy describe immersion in a similar fashion, adding that it is “the experience of losing a sense of embodiment in the present whilst connecting on a mediated environment” (Dovey and Kennedy, 2006, p. 146).
Brown and Cairns (2004) highlight that the term ‘immersion’ or ‘immersive’ is used frequently to describe games throughout the gaming community, but add that the term is also used without a coherent explanation (Brown and Cairns, 2004, p. 1). Other terms have also been used to describe the immersive state, such as ‘Incorporation’ (Calleja, 2007) and ‘Presence’ (McMahan, 2003). Jennett et al. (2008) also mention ‘Presence’ as well as other concepts such as ‘Flow’ and ‘Cognitive absorption’. However, they argue that “immersion is clearly distinct from these established concepts and a better understanding of immersion would be crucial in understanding the relationship between people and videogames” (Jennett et al., 2008, p. 642).
2.2 Stages of Immersion
A closer look at the theories behind game immersion provides a more specific analysis, whilst narrowing down the broad range of terms and descriptions that have been presented thus far. Laurie Taylor (2002) expands the definition in a gaming context by splitting it into two separate stages of engagement: “diegetic immersion, where the player is immersed in the act of playing the video game, and as intra-diegetic or situated immersion, where the player is immersed in playing the game and in the experience of the game space as a spatial and narrated space” (Taylor, 2002, p. 12). The idea of different stages of engagement is shared by Brown and Cairns (2004). Their study tries to define immersion within the context of games using a ‘grounded theory’ approach (a method of qualitative research, as described by Strauss and Corbin (1998)) by analysing interviews with seven gamers about their experiences when playing games. Much like Taylor (2002), their results define immersion as a process of stages of engagement with the game which “moves along the path of time and is controlled by barriers” (Brown and Cairns, 2004, p. 2), only here three stages are suggested.
The three stages of immersion described by Brown and Cairns are ‘Engagement’, ‘Engrossment’ and ‘Total Immersion’, with the latter also being described as ‘Presence’. The barriers act as gateways to each of the stages, and by opening these the player can move forward through the stages of immersion. For example, in the first stage, ‘Engagement’, the player must first engage with the game, so a potential barrier might be that the player does not like the genre or style of the game, which prevents further immersion from taking place. As stated by Brown and Cairns, “To lower the barriers to enter this level, the gamer needs to invest time, effort, and attention” (2004, p.2).
The next stage of immersion, ‘Engrossment’, is described by one interviewee as
“A Zen-like state where your hands just seem to know what to do, and your mind just carries on with the story.” (Brown and Cairns, 2004, p. 3).
Brown and Cairns link this stage to game construction and the player’s emotional connection, as well as respect for the design work put into it. Again there is a barrier to this stage: the player must invest time and effort to become emotionally involved with the game. Design flaws such as poor visual/aural feedback or even unrealistic gameplay are other factors of the game construction which could present a barrier to further immersion. These factors could lead to player disengagement and raise barriers against progressing to the final stage of immersion described by Brown and Cairns.
Brown and Cairns describe the last stage of immersion, ‘Total Immersion’, as ‘Presence’. This is where the gamer feels completely detached from the real world around them and is fully focused on the game. The suggested barriers to ‘Presence’ lie with empathy and atmosphere. In this sense empathy is defined not only as the player’s empathetic feelings towards game characters or situations but as the “growth of attachment” to the game itself (Brown and Cairns, 2004, p. 3). Interestingly, the majority of games described by the gamers as ‘Totally Immersive’ were of an FPS (first-person shooter) model. Perhaps the direct visual and aural illusion of seeing and hearing from the character’s perspective gives players a greater feeling of ‘Presence’ within the game world. Grimshaw, Charlton and Jagger (2011) also make a link to player perspective when discussing FPS game elements which contribute to immersion: “Aiding the illusion of being present within the gameworld displayed on the screen, the FPS game typically posits the player with a first-person perspective[…]” (2011, paragraph 8).
Brown and Cairns also describe three elements of player attention which are linked to the immersion process: visual, auditory and mental. They go on to suggest that the more of these attention aspects are drawing on the player’s senses, the more involved with the game the player will become. “If gamers need to attend to sound, as well as sight, more effort is needed to be placed into the game. The more attention and effort invested, the more immersed a gamer can feel.” (Brown and Cairns, 2004, p. 3).
Ermi and Mäyrä (2005) also pick up on the concept of stages of immersion by presenting the SCI-model, which links aspects of game design to immersion and highlights their influence on the process. This model identifies three aspects of game design, drawn from an analysis of interviews with gamers regarding their playing experiences, and draws conclusions on how much these aspects impact the immersive quality of games. The design aspects are labelled ‘audio-visual quality and style’, ‘level of challenge’ and ‘imaginary world and fantasy’, which are linked to the core values of the SCI-model: Sensory immersion, Challenge-based immersion and Imaginative immersion respectively (Ermi and Mäyrä, 2005, pp. 7-9). The results from Ermi and Mäyrä (2005) and Brown and Cairns (2004) both acknowledge that audio plays a role in the immersion processes they describe, but neither study makes direct links with audio.
2.3 Immersive Audio
Collins (2008) picks up on the theory of Ermi and
Mäyrä (2005) with a more direct stance on game audio by suggesting that the
‘imaginary world and fantasy’ element of the SCI-model “is strongly enhanced by
audio” (Collins 2008, p.134). Collins also talks about the link between
immersion and audio in more general terms by adding “The illusion of being
immersed in a three-dimensional atmosphere is greatly enhanced by the audio”
(Collins 2008, p. 132).
Sander Huiberts (2010) makes a more explicit connection between immersion and audio in a thesis which examines their relationship from a conceptual design standpoint. In his thesis, which was largely based around an analysis of user surveys (Creative Heroes 2007), he comments on the SCI-model of Ermi and Mäyrä (2005) by stating “audio is capable of enhancing the three dimensions of immersion by enhancing the sensory connection, the feeling of flow and the feeling of empathy of the player” (Huiberts, 2010, p. 101).
The classification and typology of audio in games is expressed in a number of ways by practitioners in the industry, each structuring game audio content into meaningful partitions with regards to functionality. Huiberts describes some of the different frameworks that have been developed to help structure game audio content, but also points out some inconsistencies relating to each framework and makes it clear that the subject of game audio needs a more coherent model (2010, pp. 15-20). The IZEA (Interface, Effect, Zone and Affect) framework (Huiberts, 2010, pp. 20-35) is presented as a tool to categorise and analyse the functioning of the different aspects of game audio (further details of the IZEA model can be found in appendix [1.0]). Huiberts uses the IZEA framework in conjunction with existing theories on game immersion, such as those from Ermi and Mäyrä (2005) and Brown and Cairns (2004), to highlight the role of audio (from a design perspective) in the process of player immersion. He also highlights two main functions of game audio used to enhance user experience: the use of audio to ‘optimise’ and to ‘dynamise’ gameplay. Audio which optimises gameplay helps the player interact with the game and so enhances usability. Audio which dynamises gameplay helps to make the game more exciting and enhances the experience, according to Huiberts (2010, p. 29). It is also noted that audio can be designed to both optimise and dynamise gameplay, as in the example given: “[…]an event which presents information, such as the sound of a weapon is designed for the communication of important information about the weapon itself and the Activity of the game as well as making the experience more exciting” (Huiberts, 2010, p. 30). Fundamentally, the IZEA framework is based on the three elements of immersion in the SCI-model (Ermi and Mäyrä 2005) and is claimed to be a conceptual design tool which will enhance immersion in those areas through the use of audio.
Grimshaw, Charlton and Jagger (2011) also recognise the importance of audio and its role in becoming immersed when describing FPS games, highlighting that audio is not limited by the visual restrictions of the game monitor or television screen. Sounds can be spatialised within the game so that the player can essentially hear what is going on all around them. They state, “[…]the visual space is restricted to what can be seen with stereoscopic vision, the aural space is unrestricted with reference to the position of heard sound sources” (Grimshaw, Charlton and Jagger, 2011, paragraph 10). They also add that the use of headphones can ‘monopolize’ the player’s aural senses by restricting sound external to the gameworld. They too acknowledge the ongoing debate regarding the term immersion but identify two features that recur in explanations of the term: a physical disconnection from external surroundings (outside the game) and losing track of time (Grimshaw, Charlton and Jagger, 2011). Both are a result of the game holding the player’s full attention. With this in mind, the argument could be made that games which are widely accepted as immersive should have a core function within the design to continually draw the player’s full attention.
Grimshaw, Charlton and Jagger (2011) make a link between player attention and immersion with reference to the work of Brown and Cairns (2004, p. 3), as mentioned previously, but also suggest links with elements of game design. They indicate that one aspect of player attention is linked to learning and automating game functions such as controlling the character, learning level layouts, and knowing how to attack, defend or apply tactics. They argue that these functions of the game will be stored in the player’s mind and over time will become automated, requiring less player attention. If the player is subjected to a change in these learned game functions (such as encountering a new enemy or finding a new weapon) then the player is forced to invest more attention in order to adapt. Grimshaw, Charlton and Jagger also point out that “If certain actions were unable to become learned or automated in some way, game playing would be a laborious task and little progress would ever be made” (2011, paragraph 20).
They go on to highlight specific design elements which contribute to immersion in FPS games, such as ease of player control. Again this is tied to the idea of the player automating game functions. In this case it is argued that if the player can automate the control system quickly and easily, their attention can be directed to becoming progressively immersed within the gameworld. In short, the less the player needs to fixate on an external controller or keyboard commands, the more attention they can invest in the gameworld. Another design consideration suggested by Grimshaw, Charlton and Jagger (2011) for optimal immersion in FPS games is a balanced level of challenge, usually starting at a low level and increasing in difficulty as the game progresses and the player begins to automate common game functions. Also highlighted as vitally important for immersion is that the player empathises with the character they are controlling, as well as with the gameworld atmosphere and environment, an aspect also identified by Brown and Cairns (2004, p. 3) as necessary to achieve ‘total immersion’.
The term immersion has still to be explicitly defined, as the literature surrounding it shows. However, a clearer understanding of the term has been presented with regards to the context of this project.
3. Case Studies – Immersive games
As part of the research and development of the game level created for this project, it was important to dissect some existing game titles of a similar nature. The main objective of this line of research was to identify specific sound design and implementation techniques used by practitioners within the industry which contribute to an immersive experience. A secondary objective was a general observation of the role of game design as a whole in immersive gameplay.
3.1 Amnesia - The Dark Descent
Outline
Amnesia (Frictional Games 2010) is set in the darkness
of Brennenburg Castle where the player takes control of
Daniel, who has just awoken with a severe case of amnesia. The player must
navigate through the labyrinth of hallways and puzzles to slowly reveal the
hidden truth about who he is. With no weapons the player must use shadows and
hiding places to avoid detection from the monsters lurking in the castle. The
drawback to hiding in the dark is that Daniel begins to go insane, with an
active ‘insanity meter’ giving the player feedback.
Immersion and Audio
Even before the game begins, the developers have tried to set the scene and put the player in a specific mindset. On starting a new game the player is instructed to adjust brightness levels, play in a dark room and wear headphones to optimise the playing experience. Finally it is added that "Amnesia should not be played to win. Instead, focus on immersing yourself in the game world and story" (Frictional Games 2010).
The use of headphones, as pointed out previously by Grimshaw, Charlton and Jagger (2011), can be instrumental in ‘monopolizing’ the player’s senses. In the case of Amnesia this is arguably even more so due to the use of binaural sound within the game.
Typical gameplay takes place within a dark environment where the player is forced to hide or run rather than fight. This type of darkened setting plays into the hands of audio, as the player must rely more on auditory information to navigate and detect threats. The player must sneak and hide whilst actively listening for aural feedback, some of which is binaural as already mentioned. It could be argued on the basis of findings from Grimshaw, Charlton and Jagger (2011) that this type of game setting and atmosphere heightens the player’s attention to aural senses whilst keeping the player on aural alert. This results in ‘barriers’ (as suggested by Brown and Cairns (2004)) being removed, allowing further immersion to take place through the use of audio. The music also plays a crucial role in setting the atmosphere and tone of Amnesia by presenting the player with a familiar schema which conjures up preconceptions of the horror film genre. By making these connections the player has an understanding of and familiarity with the kind of setting they should expect from the very first opening sequence of the game. Tension, fear and the unknown are all phrases that sit well with the music of Amnesia.
As highlighted by both Brown and Cairns (2004) and Grimshaw, Charlton and Jagger (2011), empathising and making a connection with the game character and atmosphere is vital for immersion. A prominent feature of Amnesia which helps deliver a strong player attachment to the game character is the breathing mechanism. The character can be heard breathing, and the breathing works dynamically by being linked to player actions and game events such as being under threat. Although at times repetitive, this method is very effective at conveying the mood of the character whilst forcing the player to tune into the character’s breathing pattern and away from their own.
Unfortunately, the most disturbing aspect of the audio design in Amnesia was not the creepy or horrific sound design, but the inconsistent spatialisation of certain sounds, leading to extreme panning issues. Some, though not all, sound sources pan from one ear to the other (on headphones) very abruptly depending on player perspective and rotation (as extreme as 100% of the signal moving from one ear to the other). This unnatural spatialisation of sound is arguably the game’s biggest flaw and presents a barrier to immersion. Notably, the story and visual aspects of the game also contribute to immersion, which raises the question: can audio lead this kind of experience, or will it always be just one part of the immersion process?
3.2 The Last of Us
Outline
The Last Of Us (Naughty Dog) is set in a post-apocalyptic world 20 years after a global outbreak of a deadly fungal parasite, with the infected turning into zombie-like creatures. The Hollywood-style survival story revolves around two main characters: Joel, a middle-aged man who witnessed the murder of his young daughter during the outbreak, and Ellie, a young girl who is immune to infection. The main plot follows Joel as he takes Ellie on a dangerous journey across the country to try to find a cure for the infection. Not only do they need to be concerned about the threat of infection and the creatures it breeds, but other survivors are just as much of a threat to their survival.
Immersion and Audio
Much like Amnesia, the player must sneak around undetected for much of the game if they wish to live. In order to move around undetected effectively, the player can use a focus function (linked to a button press) in which they are essentially listening for threats. When entered, the focus mode sends the bulk of in-game audio through a low-pass filter, with threat sounds such as enemy noises left largely unaffected. This works in combination with a visual effect in which everything but moving objects is slightly blurred and out of focus. This feature allows the player to dynamically tune into threat sounds as and when they feel it necessary, much like in a real-life scenario where one can selectively listen for a specific sound. This is especially effective given the distinctive clicking sound made by the zombie-like creatures. The 'clickers' (as they are referred to in the game) have no vision and use this sound as an echolocation mechanism. This also means that the player can be detected if any sudden movements are made near a ‘clicker’. This distinctive aural gameplay function is an essential driver of the survival mechanism throughout the game, given that the main threat is usually heard before it is seen. The focus feature at its core is a device which draws on more of the player’s aural attention, as well as making the player empathise with the character’s situation. Although the game is not in a first-person perspective, there is still a strong player-to-character attachment which is bolstered by the focus feature as well as a convincing breathing system. Much like Amnesia, the use of audio and the fact that the player must actively listen (or pay the price) throughout much of the game is arguably a strong contributor to the game’s immersive nature.
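The routing idea behind the focus mode can be illustrated with a minimal Python sketch. This is not how The Last Of Us implements the feature; the two-bus split, the one-pole filter, the 800 Hz cutoff and the function names are all assumptions chosen for illustration. The sketch muffles the general ambience with a low-pass filter while passing the threat sounds through untouched.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate=44100):
    """Apply a simple one-pole low-pass filter to a list of float samples."""
    # Smoothing coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)  # move part of the way towards the input
        out.append(state)
    return out

def mix_frame(ambience, threats, focus_active, cutoff_hz=800.0):
    """Mix two equal-length buses; only the ambience is muffled in focus mode."""
    if focus_active:
        ambience = one_pole_lowpass(ambience, cutoff_hz)
    return [a + t for a, t in zip(ambience, threats)]
```

Keeping the threat sounds on their own bus is the design choice that matters here: the filter dulls everything else, so the enemy noises stand out without being made any louder.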
The use of space is also a strong point in the game. Room spaces and tonal changes from one room to the next are highly audible. This is especially evident in focus mode, where the player can hear enemy sounds from behind walls. The spatialised presence of enemies along with realistic room acoustics solidifies the player’s 'presence' within the game. This is arguably one of the key immersive gameplay factors in the game, as the player is essentially forced to tune into the character’s hearing system.
Two key design features of The Last Of Us are utilisation of the player’s senses and an effective balance of gameplay, including intense periods of survival action alongside periods of exploration in relative safety. In this case, as in most AAA-rated games, the storyline is also a major hook to get the player’s attention. Perhaps the biggest attachment from player to game lies with this element, which in the case of The Last Of Us is exceptional and received a host of awards including five BAFTAs, notably for audio use and best story, to name but two. That being said, as impressive as the audiovisual design and story of 'The Last Of Us' are, they would stand for nothing if the gameplay experience did not match the expectations set by these elements.
4. Methodology
4.1 Production
A combination of sounds recorded specifically for the project and pre-existing recordings from the author’s sound library was edited and exported for the test level using the AVID Pro Tools DAW (Digital Audio Workstation).
The UDK (Unreal Development Kit) game engine was utilised for the creation of the test level. This choice of software was largely based on the approachability of UDK with regards to its visual scripting system within the editor (Kismet). Given the author’s lack of expertise in computer coding, UDK provided a workable environment in which to create the testing level. Initially there were to be two versions of the level, created for the purpose of comparing the different sound design elements with regard to their immersive potential. However, due to an underestimation of the time required and the technical challenges of designing and implementing game audio, only one version was created.
4.2 Testing Methods
The core questionnaire module from the GEQ (Game Experience Questionnaire) (IJsselsteijn et al., 2008) was used to assess the test level, as it was specifically developed to measure player experience. Use of the GEQ can also be found in similar studies in which player experience is measured, such as those by Nacke and Lindley (2008) and Nacke, Grimshaw and Lindley (2010).
The GEQ core questionnaire module is aimed at analysing the player’s feelings during gameplay and is divided into the following seven dimensions: ‘immersion’, ‘tension’, ‘competence’, ‘flow’, ‘negative affect’, ‘positive affect’ and ‘challenge’, which address different aspects of player involvement with a game. The questionnaire is made up of a series of statements regarding the player’s feelings; for each statement the players indicated how much they agreed on a scale of 0 – 4 (0 = Not at all, 4 = Extremely). Included with the core GEQ module is an In-Game version which consists of the same questions but in a more concise format to assess players during play. Stopping and starting play to answer questions was not in the interests of inducing immersion, so the In-Game version was instead included at the end along with the main questionnaire. The point of this was to try to gather a more accurate average by asking similar questions twice and compiling the results into one single data set.
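As a rough illustration of how responses of this kind could be compiled, the following Python sketch averages the 0 to 4 ratings per dimension for each participant and then across participants. The item identifiers (`q1` to `q4`) and the two-dimension mapping are hypothetical placeholders and not the actual GEQ scoring key; the function names are likewise assumptions made for this example rather than the analysis procedure used in the project.

```python
from statistics import mean

# Hypothetical item-to-dimension mapping, for illustration only.
# The real assignment of items to dimensions comes from the GEQ scoring guidelines.
DIMENSIONS = {
    "immersion": ["q1", "q2"],
    "tension": ["q3", "q4"],
}

def score_participant(responses):
    """Return the mean 0-4 rating per dimension for one participant."""
    for item, value in responses.items():
        if not 0 <= value <= 4:
            raise ValueError(f"{item}: rating {value} is outside the 0-4 scale")
    return {
        dim: mean(responses[item] for item in items)
        for dim, items in DIMENSIONS.items()
    }

def average_scores(all_responses):
    """Average each dimension score across all participants."""
    per_player = [score_participant(r) for r in all_responses]
    return {dim: mean(p[dim] for p in per_player) for dim in DIMENSIONS}
```

Scoring each participant first and then averaging across participants keeps every player's contribution equal, which matters if some questionnaires are returned with items missing and have to be handled separately.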
Supplementary to the GEQ, another section was added which was directed more towards the audio elements of the test level. The players were required to rate each listed aspect of the game sound design based on how immersive it was (from not at all immersive = 0 up to completely immersive = 4). As part of this section a short explanation of the author’s definition of immersion was provided. This supplementary section was aimed at providing some additional evidence which could be correlated with the responses from the GEQ.
4.3 Testing Conditions
Each player was instructed to wear headphones whilst playing the game to maximise audio impact. As noted by Grimshaw, Charlton and Jagger (2011), the use of headphones can ‘monopolize’ the player’s aural senses. This was essential in order to minimise external distractions (sound from the testing area) which could introduce a ‘barrier’ (as discussed previously in the work of Brown and Cairns (2004)) to the player becoming immersed. The technical challenges of audio programming and game scripting (which will be discussed later) forced the project to run behind schedule. As a result, little time remained to carry out actual testing of the game, which in turn led to less than optimal conditions with regard to the PC used and the location of testing: tests were carried out on a laptop within the audio lab (Whitespace, Abertay).
5. Pre-Production
5.1 Level Overview
The initial intention of the project was to design and build the game level using the tools and assets that come with UDK. However, this was not the primary objective, and it would have taken valuable time away from audio development. A custom level was therefore outsourced for the purposes of implementing and testing audio assets. The level selected (credit to level designer Chris Holden) is set in a dark dungeon which is mainly lit by burning torches mounted on walls and occasional openings in the roof. The level also had basic gameplay systems already in place, including objectives and tasks. The player (in first-person perspective) must navigate and escape the dungeon by finding three hidden relics which then have to be placed into a stone altar in order to open the final door. Along the way the player can also pick up hidden keys which open treasure chests throughout the level. The basic level matched the required style and provided a good starting point for implementing an immersive experience. This style requirement is linked to the findings presented within the literature review which highlight the immersive potential of FPS (first-person shooter) type games. In addition, the dungeon is poorly lit, which forces the player to rely on their other senses.
The level included a degree of challenge by way of exploration, finding hidden objects and working out how to open doors. There were, however, no enemies or threats to the player of any kind, and although visually impressive the level was not very immersive as it stood. The main objective was to improve the immersive quality of the game through the use of audio.
5.2 Level Design
Genre
The overall design of the level, with regards to appearance, sets it firmly in the horror/suspense/adventure genre. The visual design was capitalised on to present a sound design which is both familiar and convincing to the player. The sound is primarily aimed at reinforcing the horror and suspense theme, but with the gameplay also involving exploration, treasure and finding hidden items, it incorporates an adventure element too.
6. Designing for Immersion
6.1 Game Character
To enable a strong link between the player and character, a breathing system was developed which works dynamically as the player moves around, for example producing heavy breathing when running. To enhance this system further, the breathing sounds were recorded binaurally with the use of specialist in-ear microphones. The primary aim of this was to capture the breathing sound as it is heard by an individual. Played back through headphones, this results in sonic characteristics within the breathing sound which give the player a sense that the breath is coming from them, thus fabricating an empathetic bond between the player and character. The importance of this bond for immersion was highlighted through the work of Grimshaw, Charlton and Jagger (2011) and Brown and Cairns (2004) in the literature review.
Also important to this bond are character footsteps, which are notoriously
repetitive in many games. The footsteps presented a challenge in balancing
variation with system memory usage, as the inclusion of too many individual
sounds within each footstep cue resulted in playback issues. The level floor is
made up of a variety of materials, so variation between those helps to break up
any long spells of the same sound playing back. One of the key factors was
keeping the footstep sounds consistent when the player transitions between
materials. This, combined with variation, was aimed at maintaining a balanced
sound which did not break the player’s attention (essentially breaking
immersion).
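The trade-off described above (a small pool of samples per floor material, played back with enough variation to avoid obvious repetition) can be illustrated with a minimal sketch. This is not the UDK SoundCue setup used in the project; it is a plain Python illustration, and the class name, surface names and sample names are all hypothetical.

```python
import random


class FootstepCue:
    """Picks a footstep sample for a surface without repeating the last one."""

    def __init__(self, samples_by_surface, seed=None):
        # samples_by_surface: hypothetical mapping of surface name -> sample names
        self.samples_by_surface = samples_by_surface
        self.last_played = {}  # surface -> sample played most recently
        self.rng = random.Random(seed)

    def next_sample(self, surface):
        samples = self.samples_by_surface[surface]
        previous = self.last_played.get(surface)
        # Exclude the previous sample so the same sound never plays twice in a
        # row; fall back to the full pool if the surface has only one sample.
        candidates = [s for s in samples if s != previous] or samples
        choice = self.rng.choice(candidates)
        self.last_played[surface] = choice
        return choice


cue = FootstepCue(
    {
        "stone": ["stone_01", "stone_02", "stone_03", "stone_04"],
        "wood": ["wood_01", "wood_02", "wood_03"],
    },
    seed=1,
)
steps = [cue.next_sample("stone") for _ in range(8)]
```

Even with only three or four samples per material, avoiding immediate repeats removes the most noticeable form of repetition while keeping the memory cost of each cue low.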
6.2 Creating a Threat
The presence of a threat was essential for setting tension and atmosphere as
well as dynamising gameplay. Physically placing a visible foe of some
description, such as some kind of monster character, in the level was a
consideration; however, without the assets for a suitable enemy (one which fits
the aesthetics of the level) this was not an option. More suitably for the aim
of the test level, a sense of threat was introduced through the use of audio.
Various elements contribute to the illusion of threat within the level, and
these will now be discussed in more detail.
The underlying implication that there is a threat to the player is instigated
from the very beginning of the level. The first instance sits within the
atmospheric bed of sound that creates the setting, which plays on the user’s
already established schema of the horror film and game genre to inform them
about the kind of situation that may follow (setting the scene). The atmospheric
layers of sound contain a randomly generated leitmotif sound device which is
aimed at priming the player to associate this sound with the threat. The use of
a leitmotif is summed up by Kalinak: “[…]leitmotifs heightened spectator
response through sheer accumulation, each repetition of the leitmotif bringing
with it the associations established in earlier occurrences” (Kalinak, K. 1992.
p.104). The raw sound used in this situation was made by striking and scraping a
disused grain bin in a farm building. The sounds from this particular recording
session are the base of all threat sounds within the level, from the looping
atmospheric background to the intense music (which will be discussed later).
The physical presence of a threat was created by placing sounds imitating a
roaming beast within the level (this time supported by vocal elements). The
sound functions dynamically, by moving around the level, and spatially, by
having an identifiable source as it moves. As the player passes through hidden
trigger points, the beast sound is perceived spatially and with directionality,
indicating that something is closing in on their location. This reinforces the
idea that the player is not alone, thus pushing them into a state of awareness
and, importantly for immersion, drawing and keeping their attention on the
gameworld.
Camera actors within UDK were used to create a cut scene in which the player’s
visual perspective is switched as if they are suddenly viewing through the eyes
of the threat. This camera sequence is triggered by the player picking up one of
the relics and proceeds to show the threat moving quickly towards their
location. During the cut scene, the sequence is accompanied by a sound effect
driven track of distorted noise and the signature base metallic leitmotif sound.
These are synchronised to the movements of the threat, ending in a peak of the
dynamic range before the view returns to the player. This is where the main
threat music cues in, signifying to the player that the threat is no longer just
a suggestion.
6.3 Music
The main threat music was composed almost entirely with sounds recorded from the
grain bin session mentioned earlier. The raw recordings were manipulated using
pitch shift, delay and reverb to create a core rhythmic section which was
supported by additional drums. The music was created in sections which reflect
different intensity levels, and these levels correspond to the current state of
the player in game. When the cut scene ends and switches back to the player
perspective, the music is at its highest intensity in a continuous loop. The
loop remains in this state until the player escapes through a gate, which they
must open by operating a lever and then reach before it slams closed again.
After this trigger point the music changes state to a less intense version which
gives the player a sense that they have progressed and are in less danger. As
the player leaves this area another trigger point switches the music to a low
intensity version, which eventually fades out, returning to the original
ambience.
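The sequence of music states described above behaves like a small state machine driven by trigger points. The sketch below is an illustration only, written in Python rather than in UDK's own tools; the state names and event names are hypothetical labels for the triggers mentioned in the text.

```python
from enum import Enum


class MusicState(Enum):
    AMBIENCE = "original ambience"
    HIGH = "high intensity loop"
    MEDIUM = "less intense version"
    LOW = "low intensity version"


# (current state, trigger event) -> next state; event names are assumptions
TRANSITIONS = {
    (MusicState.AMBIENCE, "cutscene_ended"): MusicState.HIGH,
    (MusicState.HIGH, "gate_escaped"): MusicState.MEDIUM,
    (MusicState.MEDIUM, "area_left"): MusicState.LOW,
    (MusicState.LOW, "fade_finished"): MusicState.AMBIENCE,
}


def next_state(state, event):
    """Return the new music state, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


state = MusicState.AMBIENCE
history = [state]
for event in ["cutscene_ended", "gate_escaped", "area_left", "fade_finished"]:
    state = next_state(state, event)
    history.append(state)
```

Keeping the transitions in a single table makes it easy to see that each trigger point only ever lowers the intensity by one step once the chase has begun, and that unrelated events leave the current loop untouched.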
A slightly different version of the high intensity music was implemented in
another area within the level to create another scare point. In this situation
the player walks past an iron gate which proceeds to open, revealing a hidden
key located within a small room. A sequence was created so that as soon as the
player picks up the key the gate slams behind them whilst the high intensity
music sharply cuts in. At the same time the roaming beast sound mentioned before
is triggered and can be perceived as moving within the halls on the other side
of the gate. This action is also linked to a post processing volume, which is an
area defined within the level editor of UDK to act as a type of camera effect.
In this case the player’s vision becomes blurred, helping to convey the
character’s emotion to the player. Again there is no actual threat to the
player, only the suggestion. The gate reopens after a predefined time has
passed, but further development of this area, by creating some kind of puzzle or
mechanism for the player to trigger, may work better from an interactive point
of view.
6.4 Atmosphere & Environment
Given the visual style and player expectations of a dungeon setting, an
atmosphere was created for the level which consisted of both diegetic and
non-diegetic elements. A thunderstorm was created by looping a sample of rain
and combining it with thunder rolls which trigger at random time intervals. This
complements the water which continually drips into the dungeon through various
openings, giving the player a sense that something exists outside the walls of
the dungeon. Adding further to the player’s sense of presence within the
gameworld are the sounds of objects in the game such as burning torches or
running water. Each was placed in the level and attenuated to mimic real world
acoustics with regard to volume and low pass filtering over distance. The aim is
to transport the player into this world by making the environment sound as
realistic as possible. A non-diegetic layer of ambient sound is also present
throughout the level, which adds a sense of mood and tone between the action
sequences. This also works in conjunction with the threat music by shifting to a
more ‘on edge’ variation after the threat music has faded, giving the player a
message that all has not returned to normality.
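To make the idea of attenuating placed sounds by volume and low pass filtering over distance more concrete, the following sketch maps a listener distance to a volume scale and a filter cutoff. It is a simplified linear model written in Python for illustration; it is not how UDK computes attenuation, and the radii and cutoff frequencies are assumed values chosen only for the example.

```python
def attenuate(distance, min_radius=200.0, max_radius=2000.0,
              open_cutoff_hz=20000.0, far_cutoff_hz=1000.0):
    """Return (volume, low_pass_cutoff_hz) for a listener at `distance`.

    Inside min_radius the sound is at full volume and unfiltered; at or
    beyond max_radius it is silent and fully filtered. All default values
    are hypothetical.
    """
    # t runs from 0.0 (at or inside min_radius) to 1.0 (at or beyond max_radius)
    t = (distance - min_radius) / (max_radius - min_radius)
    t = max(0.0, min(1.0, t))
    volume = 1.0 - t
    cutoff_hz = open_cutoff_hz + t * (far_cutoff_hz - open_cutoff_hz)
    return volume, cutoff_hz


near = attenuate(0.0)
midway = attenuate(1100.0)
far = attenuate(2000.0)
```

Lowering the cutoff as well as the volume is what makes a distant torch or stream sound duller rather than simply quieter, which is closer to how distance is heard in the real world.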
6.5 Player Feedback Elements
The main mission for the player, as pointed out, is to find and place all the
hidden relics into a stone altar, which opens the final door and allows them to
escape. As a side mission the player can also find the hidden keys which open
treasure chests around the level. The two item pick up sounds (key and relic)
are intended to be associated with gratification and reward whilst giving the
player an incentive to keep searching for the items. The key pick up, which is
required to unlock the treasure chests, is a bright sound which stands out from
the dark atmospheric sound of the level, giving the player a sense of hope and
achievement. The relic pick up sound was designed to be a little more
mysterious, to try to leave the player wondering what exactly this item is.
The player is also presented with intensifying versions of a sound effect
(synchronised to the action of relic placement) with each relic securely placed.
This again is aimed at player satisfaction whilst implying that there will be a
climax with the final relic being placed.
Another important section of the gameplay which required player feedback
involved two sets of timed gates. In this situation the player must activate two
levers within a set timeframe in order to open specific gates. The basic system
for this was already set up within the level, but there was little player
feedback on what was actually to be done apart from matching numbers on the
wall. A ticking timer sound was created for when the player activates one of
these levers to indicate the time limit. This sound was attached to the lever
itself and could be heard spatially within the level. To increase the sense of
urgency, a looping music cue was also created to support the fact that there was
a time limit on getting to the other lever. This cue can be heard by the player
for the entire duration of the set time, as opposed to the ticking sound, which
fades out as the player moves away from the lever. By having the ticking sound
attached to both levers, the player can then use it as a guide to locating the
next lever as they get closer to it. As a last touch, a camera cut scene was
also used to switch the player’s perspective to the location of the next lever
that needed to be used.
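The timed gate logic that the audio feedback supports can be sketched as follows. This is an illustrative Python model under stated assumptions, not the Kismet sequence from the level: the class name, lever names and the ten second limit are hypothetical, and the `ticking` method simply reports when the ticking sound and urgency loop would be audible.

```python
class TimedLeverPair:
    """Gate opens only if both levers are pulled within the time limit."""

    def __init__(self, time_limit):
        self.time_limit = time_limit  # seconds; hypothetical value
        self.first_lever = None
        self.started_at = None
        self.gate_open = False

    def pull(self, lever, now):
        # Expire a countdown that has already run out before handling the pull
        if self.started_at is not None and now - self.started_at > self.time_limit:
            self.first_lever = None
            self.started_at = None
        if self.first_lever is None:
            # First lever: the ticking sound and urgency loop would start here
            self.first_lever = lever
            self.started_at = now
        elif lever != self.first_lever:
            # Second lever reached in time: open the gate
            self.gate_open = True
        return self.gate_open

    def ticking(self, now):
        """True while the countdown is running and the gate is still closed."""
        return (not self.gate_open
                and self.started_at is not None
                and now - self.started_at <= self.time_limit)


pair = TimedLeverPair(time_limit=10.0)
opened_after_first = pair.pull("lever_1", now=0.0)
audible_midway = pair.ticking(now=5.0)
opened_after_second = pair.pull("lever_2", now=8.0)
```

In this model the countdown state is the single source of truth for both the gameplay outcome and the audio cues, so the ticking and the music can never suggest a time limit that the gate logic is not actually enforcing.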
7. Results and Analysis
Fourteen male participants between the ages of 20 and 28 were asked to play the
level and answer the GEQ and supplementary questions relating to the level.
Sessions varied in length depending on player ability, with each play session
lasting approximately 15-25 minutes.
The results from both the GEQ core and in-game questions were combined to
account for participants answering differently to similar questions in each set
(as mentioned, the in-game questions are a streamlined version of the core
questions). Marginal differences are evident when the results of the two
questionnaires are compared, as can be seen in graph [g:1].
[g:1] Showing the average scores from both
Core questions and In-Game version (the average of these two is presented as
'Merged result')
The merged results imply a greater sense of immersion from the ‘sensory and
imaginative’ dimension (M = 2.94), which could be linked to the aural part of
the design. However, this is not definitive, as visual and story aspects also
fall under the same dimension. Indeed, when we look at the specific questions
relating to the ‘sensory and imaginative’ dimension, as can be seen in graph
[g:1.1], the most visually based question, “It was aesthetically pleasing”,
scored highest (M = 3.36). Positive affect (M = 2.5), Flow (M = 2.47) and
Challenge (M = 2.38) were not far behind, with ‘Flow’ being linked directly (in
the case of the GEQ) to the player’s disengagement from the world outside of the
game and their inability to keep track of time. Competence (M = 2.24) came next,
followed by Negative affect (M = 0.73) and Tension/Annoyance (M = 0.72) with
relatively small scores, as expected. It could be safely argued that the players
did not have a bad experience, but pinpointing exactly how much the audio design
contributed to the results is very difficult without a measure to test against.
[g:1.1] Questions relating to the Sensory
and Imaginative Immersion dimension (average scores shown)
The supplementary questions were aimed at specific audio design elements and
rated on a 0 – 4 scale much like the GEQ (seen in graph [g:1.2]). Participants
were given the following statement before answering this section:
“If
we define game immersion as being fully engaged with the activity of playing
the game whilst feeling a presence in the game world. Eg, time passes
unnoticed; you become unaware of events or people around you; your heart rate
quickens in scary or exciting sections; you empathise with or become the
character ect. Now thinking about the game audio specifically, please rate the
following sections based on their impact.”
This was important so that all participants were thinking about the same concept
before giving their ratings. The different elements scored similar results, with
‘Ambience’ and ‘Perception of threat’ highest (both M = 3.69), followed closely
by ‘Spatialisation of environment’ (M = 3.62), then ‘Game character’ and
‘Diegetic sound’ (both M = 3.38). The music was found least immersive
(M = 3.23), although only in comparison to the other elements (as seen in graph
[g:1.2]). All elements scored higher than those in the GEQ, which perhaps
suggests there is still work to be done to further enhance the GEQ to recognise
the many more specifics of games which lead the player into an immersive state.
The overall average (M = 3.47) of audio design elements appears to be a
promising indication that audio can be specifically designed for the purposes of
immersion.
[g:1.2] Showing average results from player immersion ratings on specific audio
design elements
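As a rough cross-check of the supplementary ratings, the sketch below averages the six element means reported above and identifies the lowest rated element. It works from the rounded means quoted in the text rather than the raw responses in appendix [2.3], so it is only an approximation and need not match the reported overall average exactly; the variable names are illustrative.

```python
from statistics import mean

# Element means as reported in the text (0 - 4 scale)
reported_means = {
    "Ambience": 3.69,
    "Perception of threat": 3.69,
    "Spatialisation of environment": 3.62,
    "Game character": 3.38,
    "Diegetic sound": 3.38,
    "Music": 3.23,
}

# Unweighted mean of the six element means
overall = mean(list(reported_means.values()))
# Element with the lowest mean rating
lowest = min(reported_means, key=reported_means.get)

print(f"Unweighted mean of element means: {overall:.2f}")
print(f"Lowest rated element: {lowest}")
```

With these inputs the unweighted mean comes to roughly 3.50, in the same region as the reported overall average (M = 3.47), and the music is confirmed as the lowest rated element.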
8. Conclusions and Further Work
As a solo developer, the author’s lack of coding knowledge presented a technical
challenge and steep learning curve which resulted in the project running over
time. This forced the level into testing before it was really finished and as
such prevented many ideas from being realised. Refinement of the GEQ and/or the
supplementary questions would be the next logical step for future developments.
A more robust testing system altogether, and better controlled testing
conditions, could perhaps yield more definitive results. The original idea of
creating a second test level for an A/B style comparison, however, is the main
element missing from this study. A smaller test level may have allowed more time
to implement this, but could have resulted in a shorter playing time, thus
restricting the player’s ability to become fully immersed. The information
gathered throughout this document is, however, a starting point for further
development into unpicking the connection between audio and immersion in games.
It will also inform and educate any game developer with an interest in the
design properties of immersion. Although the author has not pinpointed a
definitive answer, the foundation has been set for future studies. Can audio be
designed with the goal of creating an immersive experience? There is no doubt in
the author’s mind that this can be achieved with suitable resources.
9. Appendix
[1.0] IZEA Framework
The IZEA model first of all divides audio into two categories, ‘Diegetic’ and
‘Non-diegetic’, then further divides these into four domains, ‘Zone’, ‘Effect’,
‘Affect’ and ‘Interface’. ‘Diegetic’ sounds are those which originate from
within the game world, such as character footsteps, gunfire and weather.
‘Non-diegetic’ sounds are those which originate from outside the game world,
such as music or menu navigation sounds. Huiberts also points out that these
domains are interdependent, which adds another dimension to the model:
‘Activity’, which connects ‘Interface’ and ‘Effect’, and ‘Setting’, which
connects ‘Zone’ and ‘Affect’. He describes these as follows: “The Activity
communicates events occurring in the game environment, while the Setting
provides a background or context for the Activity.” (Huiberts. S, 2010, p. 24)
[f:1] Illustration of IZEA Model (Huiberts. S, 2010, p.25)
[f:2] Illustration of IZEA Model with general design properties
(Huiberts. S, 2010, p.32)
IZEA Summary:
Effect (Part of the diegetic division)
- Sounds that are perceived as originating from within the game world, for
  example sounds that would be heard by the character.
- Sounds responsive to the player’s activity within the diegetic world,
  triggered either directly or indirectly.
Zone (Part of the diegetic division)
- Sound which is part of the diegetic environment, for example background
  ambience such as bird calls and wind in a forest setting.
- Sounds which do not interact with the player but add a sense of realism to
  the game world by providing a backdrop of enveloping sound.
- “Communicating an ambient, background layer, which forms an auditory setting
  for the game world” (Huiberts. S, 2010, p. 27).
Interface (Part of the non-diegetic division)
- Informing sounds which are outside of the game world, such as health bar or
  score feedback and menu interactions.
Affect (Part of the non-diegetic division)
- Sounds which influence mood, such as music at key game points to build
  tension.
- Sounds which are not part of the diegesis but are used to affect the player’s
  behaviour during gameplay.
[2.0] Game Experience Questionnaire – Core Module Results
[2.1] Game Experience Questionnaire – In-Game Module
[2.3] Supplementary Questions – Immersion Ratings on Design
10. References
Brown, E. and Cairns, P. (2004). A Grounded
Investigation of Game Immersion. Extended Abstracts of the 2004 Conference on
Human Factors and Computing Systems, Vienna April 24-29 2004. New York:
ACM. pp. 1297-1300. [online]. Available from: http://www-users.cs.york.ac.uk/~pcairns/papers/Immersion.pdf [Accessed 1st November 2014]
Calleja, G. (2007). Revising Immersion: A Conceptual Model for the Analysis of Digital Game
Involvement. Digital Games Research Association (DiGRA). Situated Play,
Proceedings of DiGRA 2007 Conference. [online]. Available from: http://www.digra.org/wp-content/uploads/digital-library/07312.10496.pdf [Accessed 7th November 2014]
Collins, K. (2008). Game Sound: An Introduction to the History, Theory, and Practice of
Video Game Music and Sound Design. Cambridge, Massachusetts: Massachusetts
Institute of Technology.
Dovey, J. and Kennedy, H.W. (2006). Game Cultures: Computer games as new Media.
Berkshire: Open University Press.
Ermi, L. and Mäyrä, F. (2005). Fundamental Components of the Gameplay Experience: Analysing Immersion
[online]. In Proceedings of DiGRA 2005 Conference: Changing Views ‐ Worlds in Play. Available from: http://www.digra.org/wp-content/uploads/digital-library/06276.41516.pdf [Accessed October 26 2014]
Ijsselsteijn et al. (2008). Measuring the Experience of Digital Game Enjoyment.
In Proceedings of Measuring Behavior, Maastricht, Netherlands, August 26-29,
2008. pp. 88-89. [online]. Available
from: http://www.noldus.com/mb2008/program/Proceedings_Measuring_Behavior_2008_web.pdf
[Accessed 21st January 2015]
Kalinak, K. (1992). Settling the Score: Music and the Classical Hollywood Film.
Wisconsin: Univ of Wisconsin Press.
Nacke, L. and Lindley, C. (2008). Boredom, Immersion, Flow - A Pilot Study Investigating
Player Experience. In IADIS Gaming 2008: Design for Engaging Experience and Social
Interaction, Amsterdam, The Netherlands, July
25-27, 2008. [online]. Available from: http://btu.se/fou/Forskinfo.nsf/all/2490c60a20fbcc6ec12574db005900fc/$file/IADIS-PilotStudy-Paper-Final-Short.pdf
[Accessed March 5th 2015]
Nacke, Grimshaw and Lindley. (2010). More than a feeling: Measurement of sonic user
experience and psychophysiology in a first-person shooter game. Interacting with Computers. 22(5): pp. 336-343.
Grimshaw, Charlton and Jagger. (2011). First-Person Shooters: Immersion and Attention. Eludamos. Journal for Computer Game
Culture. 5(1): pp. 29-44 [online]. Available from: http://www.eludamos.org/index.php/eludamos/article/viewArticle/vol5no1-3/html3# [Accessed 1st February 2015]
Huiberts, S. (2010). Captivating Sound, The Role of
Audio for Immersion in Computer Games. [online]. Available from: http://download.captivatingsound.com/Sander_Huiberts_CaptivatingSound.pdf [Accessed 20th October 2014].
Jennett et al. (2008). Measuring and defining the
experience of immersion in games. Int. J. Human-Computer Studies. 66(May):
pp. 641-661.
McMahan, A. (2003). The Video Game Theory Reader.
Chapter 3 - Immersion, Engagement, and Presence. New York: Taylor & Francis
Books, Inc.
Murray, J. (1997). Hamlet on the Holodeck: The
Future of Narrative in Cyberspace. New York: Simon & Schuster.
Strauss, A. and Corbin, J. (1998). Basics of
Qualitative Research. 2nd ed. Sage Publications, Inc. [online].
Available from: http://stiba-malang.ac.id/uploadbank/pustaka/RM/BASIC%20OF%20QUALITATIVE%20RESEARCH.pdf [Accessed 1st November 2014]
Taylor, L. (2002). Video Games: Perspective,
Point-of-View, and Immersion. [online]. Available from: http://etd.fcla.edu/UF/UFE1000166/taylor_l.pdf [Accessed 25 October 2014]
11. Bibliography
Collins, K. (2008). Game Sound: An Introduction to the History, Theory, and Practice of
Video Game Music and Sound Design. Cambridge, Massachusetts: Massachusetts
Institute of Technology.
Droumeva, M. (2005). Understanding immersive audio: A
historical and socio-cultural exploration of auditory displays. Proceedings of ICAD 05-Eleventh Meeting of
the International Conference on Auditory Display. Limerick, Ireland, July
6-9 2005. Georgia Institute of Technology. [online]. Available from: https://smartech.gatech.edu/bitstream/handle/1853/50196/Droumeva2005.pdf?sequence=1 [Accessed 11 November 2014]
Ekman, I. (2013). On the Desire to Not Kill Your
Players: Rethinking Sound in Pervasive and Mixed Reality Games. Conference: Foundations of Digital Games
2013. [online]. Available from: http://www.fdg2013.org/program/papers/paper19_ekman.pdf [Accessed 20 November 2014]
Grimshaw. (2011). Game
Sound Technology and Player Interaction: Concepts and Developments. Hershey
PA. Information Science Reference (an imprint of IGI Global).
Horowitz and Looney. (2014). The essential guide to game audio: The theory and practice of
sound for games. Burlington, MA: Focal Press.
Shilling, Zyda and Wardynski. (2002). Introducing
Emotion into Military Simulation and Videogame Design: America’s Army:
Operations and VIRTE. [online]. Available from: http://calhoun.nps.edu/bitstream/handle/10945/41580/ShillingGameon2002.pdf?sequence=1 [Accessed 10 November 2014]
Stevens and Raybould. (2011). The Game Audio
Tutorial: A Practical Guide to Sound and Music for Interactive Games.
Burlington, MA: Focal Press.