What exactly is a syllable? Etymologically, the term descends from the Anglo-French "sillabe", from the Latin "syllaba", from the Greek "syllabē", meaning “that which is held together”, a reference to individual sounds that are grouped in sequence to form a single unit. The syllable is a basic unit of speech and is studied at both the phonetic and phonological levels of analysis. Yet, since it is an abstract phonological constituent without clear phonetic correlates, it is difficult to reach a unified, consensual definition that accurately identifies what a syllable is and what it is not.
Definitions such as ‘a part of a word pronounced with a single uninterrupted sounding’, for example, are vague: they neither pin down the nature of the syllable nor specify what exactly is ‘uninterrupted’. For our purposes, a syllable is a prosodic unit larger than the segment and smaller than the word, having one vowel sound or, sometimes, a syllabic consonant (the nucleus of the syllable), with or without surrounding consonantal margins (the onset and coda of the syllable). This article is organized as follows:
1) The syllable as a domain of sequencing.
2) The syllable as a substantive universal: basic cross-linguistic syllable shapes.
3) General overview of optimality theory.
4) Optimality-theoretic analysis of syllable shapes and inventories.
5) Representations of the syllable.
6) The role of sonority in the organization of the syllable.
The analysis of languages from different families reveals that only certain segment sequences are allowed, whereas others are banned. In English, for example, /sn/, /gl/, /str/, and others may occur word-initially, while their mirror images /ns/, /lg/, /rts/ are barred from that position and can occur, if at all, only word-finally. The same holds for other languages, each with its own phonotactics. Linguists trying to uncover the unconsciously internalized set of rules behind human language noticed that the occurring sequences represent only a fraction of the much larger set that would result from free concatenation of the members of a language's segment inventory.
To pin down the restrictions on segment distribution, an obvious first move is to posit a constituent that serves as the domain of phonotactics (Zec, 2007). While more than one candidate has been proposed for this role, strong evidence points to the syllable (Kahn, 1976), particularly since many phonological patterns are better understood once it is explicitly identified as such. Earlier, in Chomsky and Halle’s The Sound Pattern of English (SPE), the domain of sequencing generalizations was assumed to be the morpheme, with phonological representations reduced to feature matrices and morphological boundaries.
Generalizations about segment sequencing could refer to morphological constituents alone (Chomsky & Halle, 1968), with no other types of entities larger than the segment admitted into the grammar (Zec, 2007). The dominant view of syllables in the literature draws upon Kahn’s (1976) analysis of syllable-based generalizations in English, which established the important role of the syllable as a core unit in generative theory. Many studies on the role of the syllable in phonological theory have taken native intuitions about the existence of the syllable (native speakers of a language can easily identify how many syllables a word has) as a plausible piece of evidence for the status of the syllable as a phonological unit, especially in a theory that seeks to account for the phonological knowledge of native speakers (Al Motairi, 2015).
Turning to structure, the principal subparts of the syllable are the nucleus (ν) and the two margins, the onset (ω) and the coda (κ). The margins will henceforth be referred to with the symbol C, denoting ‘consonant’ (later on, when clusters are taken into consideration, CC will be used). Based on this, and on the fact that a syllable can sometimes consist of a single nucleus without surrounding segments, four basic shapes can be distinguished:
a) CVC: contains all three principal subparts.
b) CV: contains only the onset and the nucleus.
c) VC: contains only the nucleus and the coda.
d) V: contains only the nucleus.
Note that segments typically occurring in the nucleus are represented as V, indicating a vowel, be it a long vowel, a monophthong, a diphthong, or a triphthong. However, in some languages, the nucleus can also be occupied by a syllabic (also called vocalic) consonant, like the rhotic [r], the lateral [l], and the nasals [m] and [n], as in the English words ‘rhythm’, ‘button’, and ‘bottle’. In this case, the consonant is transcribed with an understroke diacritic. On this basis, typologists group language systems into two major types according to the syllable shapes that are language-specifically permissible: systems without codas (onsets are either required: CV, or optional: CV, V) and systems with codas (onsets are either required: CV, CVC, or optional: CV, V, CVC, VC).
One thing to notice here, which will be taken up later through optimality theory constraints, is the asymmetry between the onset and the coda. The desirability of onsets is shown by the fact that every language allows syllables with onsets; no language known thus far has only onsetless syllables. In contrast, codas are avoided in many languages, and they are never required in all environments. Put simply, onsets are highly desirable, and codas are dispreferred (Zec, 2007). The number of system types is further multiplied by the number of segments allowed in either margin. For onsets, for example, there are systems that allow complex onsets (CC stands for a cluster), in which codas are either banned (CV, CCV) or optional (CV, CCV, CVC, CCVC), and systems that do not. With this in mind, Zec (2007) presents a typology that classifies all languages into twelve basic inventories based on their permitted syllable shapes, depending on whether onsets are optional or required, and whether codas, onset clusters, and coda clusters are optional or banned. These are summarized as follows: (C)CV(CC), (C)CV(C), CV(CC), CV(C), (C)CV, CV, (CC)V(CC) (like English), (CC)V(C), (C)V(CC), (C)V(C), (CC)V, and (C)V (parentheses indicate ‘optional’).
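The count of twelve can be checked by treating the typology as three independent choices: onsets required or optional, onset clusters allowed or banned, and codas banned, simple, or complex. The following is only a minimal sketch of that reading; the function name and the parameter encoding are assumptions made for illustration and do not come from Zec (2007).

```python
from itertools import product

def inventory_label(onset_required, complex_onset, coda):
    """Build a label such as '(C)CV(C)' from three typological choices.

    coda is one of 'none', 'simple', 'complex' (hypothetical encoding).
    """
    if onset_required:
        onset = "(C)C" if complex_onset else "C"
    else:
        onset = "(CC)" if complex_onset else "(C)"
    coda_part = {"none": "", "simple": "(C)", "complex": "(CC)"}[coda]
    return f"{onset}V{coda_part}"

# 2 onset settings x 2 cluster settings x 3 coda settings = 12 inventories
labels = [
    inventory_label(required, cluster, coda)
    for required, cluster, coda in product(
        [True, False], [False, True], ["none", "simple", "complex"]
    )
]

for label in labels:
    print(label)
```

Under these assumptions, the twelve generated labels coincide with the twelve inventories listed above.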
Before analyzing this range of possible syllable shapes and inventories in light of optimality theory and showing how it is characterized by a set of output constraints, let’s first talk about the theory itself. Chomsky argues that our ability to learn languages (internalize the grammar and build the lexicon) at roughly the same age and to generate an infinite number of grammatically well-formed utterances (outputs), despite the poor quantity and quality of the input (poverty and imperfection), exists because we have a ‘mental organ or structure’ that he called the ‘language faculty’. Just as the optic apparatus helps us see things, the language faculty enables us to acquire language.
Of course, there are other factors that also contribute to our knowledge of language, but these are outside the scope of this article. Based on this argument and the premise that we must all be equipped with the same ‘genetically determined’ biological language acquisition device, he concludes that all natural languages must be only superficially different from each other and that they somehow share the same characteristics. After all, if we had a key that opens all locks, it would be plausible to assume that all locks are inherently similar despite their different shapes and colors. Following this assumption, several linguistic theories have delved into human languages aiming to find the ‘thing’ in which they are all similar. Optimality theory is one of these theories.
The idea behind optimality theory (henceforth OT) is very simple but quite interesting. It states that the observed forms of language are the result of constraint interaction. According to the theory, there are universal constraints (limitations and restrictions) that are shared by all natural languages. The only difference is that each language has its own language-specific ranking of these constraints, which determines the ones that are “OK” to violate and the ones that are not. However, it is very important to bear in mind that we are focusing solely on the grammar of languages to understand how they are similar. Languages can still differ in their lexicon, since most linguistic signs (apart from those that are morphologically derived from others) are arguably just arbitrary combinations of sounds with no intrinsic meaning (the signifier) that are conventionally assigned to certain concepts (the signified). For this reason, two languages can have the same rules (or ranking of constraints) but still refer to the same object with different linguistic signs.
The previously mentioned syllable inventories are considered among the substantive linguistic universals of human language, and analyzing them within the framework of OT reveals the essential constraints that capture all possible syllable shapes. These are the markedness constraints on syllable form (the structure of the output): *NUC (syllables must have a nucleus), *ONS (syllables must have an onset), *-COD (syllables must not have a coda), *COMPLEXONS (syllables must not have more than one onset segment), and *COMPLEXCOD (syllables must not have more than one coda segment) (* indicates a constraint). These are combined with the faithfulness constraints, which require the output (the syllable structure in this context) to match the underlying or lexical form (the input): MAX, which prohibits deletion of segments from the input, and DEP, which prohibits epenthesis (addition of segments). Their relative ranking in each language depends on the syllable inventory of that language. Going back to the four basic syllable shapes, OT gives a constraint ranking for each inventory:
a) CV: *ONS » *-COD » (MAX DEP)
b) CV, V: *-COD » (MAX DEP) » *ONS
c) CV, CVC: *ONS » (MAX DEP) » *-COD
d) CV, CVC, V, VC: (MAX DEP) » *ONS » *-COD (*NUC is omitted from these rankings because a syllable cannot exist without a nucleus; instead of repeating it every time, we can take it to be the top-ranked constraint that must never be violated. The notation (MAX DEP) indicates that at least one of the two must occupy the ranking position indicated.)
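The link between a ranking and its inventory can be sketched in a few lines. The sketch below assumes that a markedness constraint is enforced whenever it outranks at least one of the two faithfulness constraints, which is one reading of the note on (MAX DEP) above; the function name and the list encoding of a ranking are hypothetical.

```python
def permitted_shapes(ranking):
    """Return the basic syllable shapes allowed under a total ranking.

    ranking lists constraint names from highest to lowest ranked.
    Assumption: *ONS or *-COD is enforced if it outranks at least one
    of MAX and DEP, i.e. if it sits above the lower-ranked of the two.
    """
    position = {name: i for i, name in enumerate(ranking)}
    lowest_faith = max(position["MAX"], position["DEP"])
    onset_required = position["*ONS"] < lowest_faith
    coda_banned = position["*-COD"] < lowest_faith

    shapes = ["CV"]  # CV violates neither *ONS nor *-COD
    if not coda_banned:
        shapes.append("CVC")
    if not onset_required:
        shapes.append("V")
    if not onset_required and not coda_banned:
        shapes.append("VC")
    return shapes

print(permitted_shapes(["*ONS", "*-COD", "MAX", "DEP"]))
print(permitted_shapes(["*-COD", "MAX", "DEP", "*ONS"]))
print(permitted_shapes(["*ONS", "MAX", "DEP", "*-COD"]))
print(permitted_shapes(["MAX", "DEP", "*ONS", "*-COD"]))
```

The four calls correspond to rankings (a) through (d), with MAX placed above DEP purely for concreteness.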
Of course, a person might ask, ‘Why bother with all this?’ The answer is that this ranking helps us understand why speakers of languages like Japanese, for instance, pronounce ‘table’ as ‘teeburu’, or why Hindi speakers utter words starting with the cluster [sp] with an epenthetic schwa (prosthesis in this case, since the vowel is added at the beginning of the word). In other words, it helps us understand how and why the output in a language sometimes differs from the input. For example, a language system in which onsets are optional while codas are banned (one that permits CV, V) can transform a word like ‘pan’ into ‘panu’, ‘pani’, ‘pane’, etc. (CVC becomes CV.CV). How optimality theory explains this phenomenon is better illustrated using this table:
The table indicates that the input /CVC/ has three possible outputs (there is actually an infinite number of possible outputs, but these are the ones that violate the fewest constraints), and the ranking of the constraints shows that the prohibition of codas overrides faithfulness, but the demand for onsets does not. Therefore, unlike the first and second output candidates, which violate constraints ‘of the first order’ (serious crimes), the third candidate violates only the least important constraints (a misdemeanor, not a felony). Thus, it is the ‘optimal candidate’, and the language system will favor it over the others. The same kind of table can be applied to all other systems to predict phonological processes such as the deletion and insertion of segments.
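The selection of the optimal candidate can be sketched as a comparison of violation profiles read in ranking order. The candidates and violation marks below are illustrative assumptions for the input ‘pan’, not a transcription of the original table, and MAX is placed above DEP here only so that epenthesis beats deletion; the text itself does not fix their relative order.

```python
def evaluate(ranking, candidates):
    """Pick the optimal candidate under strict constraint domination.

    candidates maps each output form to a dict of violation counts.
    Profiles are compared as tuples in ranking order, so a violation of
    a higher-ranked constraint is always worse than any number of
    violations of lower-ranked ones.
    """
    def profile(violations):
        return tuple(violations.get(name, 0) for name in ranking)

    return min(candidates, key=lambda form: profile(candidates[form]))

# Hypothetical candidates for the input /pan/ (CVC)
candidates = {
    "pan":   {"*-COD": 1},  # faithful, but keeps the coda
    "pa":    {"MAX": 1},    # deletes the final consonant
    "pa.nu": {"DEP": 1},    # inserts a vowel after the consonant
}

# Coda-less system with optional onsets: *-COD above faithfulness
no_coda_ranking = ["*-COD", "MAX", "DEP", "*ONS"]
print(evaluate(no_coda_ranking, candidates))

# System that tolerates codas: faithfulness above *-COD
coda_ranking = ["MAX", "DEP", "*ONS", "*-COD"]
print(evaluate(coda_ranking, candidates))
```

Tuple comparison is used because it is lexicographic, which is exactly what strict domination requires; reordering the same constraints changes the winner without changing the candidates.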
Moving to how syllables are represented, Zec (2007) insists that in the structural characterization of the syllable, both its principal subparts (the nucleus, the onset, and the coda) and the syllable weight need to be properly delimited. Syllable weight, measured in moras (symbolized by the Greek letter μ), is important because it is relevant to many other areas of phonology, most notably stress. In this regard, syllables are divided into two types: a light syllable has one mora, and a heavy syllable has two moras (for some reason, the author does not mention ‘super heavy syllables’, probably because she deals only with basic syllable shapes without counting complex codas). The segments that gravitate to each syllabic position will be discussed in the last part of this article (sonority of segments). Words like /kæt/, /neko/, and /bili:/ will be represented as follows:
In these moraic representations, syllables with a single, non-branching nucleus and no coda are light, whereas those with a coda or a branching nucleus (a diphthong or a long vowel) are heavy.
Another way of representing the syllable includes structural positions for each relevant subpart of the syllable: the syllable node branches into the onset and the rime, and the latter further branches into the nucleus and the coda. This constituency captures syllable weight in structural terms by designating the rime as the weight domain. A heavy syllable has a branching rime, as distinct from a light syllable whose rime does not branch. However, because the rime is the domain of weight, it needs to be stipulated that onsets are weightless. That is, branching under the syllable node is not relevant for the computation of weight (Zec, 2007).
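The weight computation described above can be sketched as a check on whether the rime branches, with the onset ignored entirely. The encoding is an assumption made for illustration: a long vowel or diphthong is written as two nucleus slots, and every coda is taken to contribute to weight, as in the account given here.

```python
def syllable_weight(onset, nucleus, coda):
    """Classify a syllable as 'light' or 'heavy' from its rime alone.

    onset, nucleus, coda are lists of segments. The onset is accepted
    but never consulted, reflecting the stipulation that onsets are
    weightless. Assumption: a long vowel or diphthong fills two
    nucleus slots.
    """
    rime_branches = len(nucleus) > 1 or len(coda) > 0
    return "heavy" if rime_branches else "light"

# /kæt/: one syllable with a coda
print(syllable_weight(["k"], ["æ"], ["t"]))
# /ne.ko/: two open syllables with short vowels
print(syllable_weight(["n"], ["e"], []), syllable_weight(["k"], ["o"], []))
# /bi.li:/: the second syllable has a long vowel
print(syllable_weight(["b"], ["i"], []), syllable_weight(["l"], ["i", "i"], []))
```

Adding onset segments never changes the result, which is the structural sense in which the rime, and not the whole syllable, is the weight domain.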
Coming back to our very first question: why are certain segments and sequences of segments permissible in certain positions within the syllable while others are not? Good evidence suggests that the sonority of segments (the loudness, amplitude, or resonance of speech sounds) is thus far the best explanation. The sonority of segments is commonly represented using a scale, ‘the sonority hierarchy’, which orders segments from those highest in sonority to those lowest in sonority, though the values of segments can differ to some extent from language to language. Typically, the values are V(owels) > G(lides) > L(iquids) > N(asals) > O(bstruents). A more detailed hierarchy is: low vowels > mid vowels > high vowels > glides > rhotics > laterals > nasals > voiced fricatives > voiced affricates > voiced stops > voiceless fricatives > voiceless affricates > voiceless stops (glides and affricates were not mentioned in the chapter). With these in mind, the sonority sequencing principle (SSP) states that the syllable nucleus constitutes a sonority peak that is preceded and/or followed by a segment or sequence of segments with progressively decreasing sonority values (i.e., the sonority has to fall toward both edges of the syllable). In other words, the sonority profile of the syllable must rise until it peaks and then fall (Roca and Johnson, 1999).
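The SSP lends itself to a simple check: map each segment to its class on the V > G > L > N > O scale and verify that sonority rises strictly to the peak and falls strictly afterward. The numeric values below are arbitrary ranks chosen for illustration, and the function name is hypothetical.

```python
# Ranks on the V > G > L > N > O scale (only the ordering matters)
SONORITY = {"O": 0, "N": 1, "L": 2, "G": 3, "V": 4}

def obeys_ssp(classes):
    """Check a syllable, given as a list of sonority classes, against
    a strict version of the sonority sequencing principle."""
    values = [SONORITY[c] for c in classes]
    peak = values.index(max(values))
    # Sonority must strictly rise up to the peak ...
    rising = all(a < b for a, b in zip(values[:peak], values[1:peak + 1]))
    # ... and strictly fall after it.
    falling = all(a > b for a, b in zip(values[peak:], values[peak + 1:]))
    return rising and falling

print(obeys_ssp(["O", "L", "V", "N", "O"]))  # e.g. the profile of 'plant'
print(obeys_ssp(["L", "O", "V"]))            # liquid before obstruent in the onset
print(obeys_ssp(["O", "O", "L", "V"]))       # e.g. the onset of 'string'
```

The last call shows a known limitation of this strict formulation: it rejects onsets made of /s/ plus a stop, which English nevertheless allows.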
Linguists have noticed that certain languages allow only V to occupy the nuclear position of the syllable (e.g., Bulgarian), others permit both V and L (e.g., Slovak), and others, such as English, allow V, L, and N, whereas in Imdlawn Tashlhiyt, for example, all segments are allowed: /tskrkst/, meaning ‘you lied’, has no vowel. Each language has its own syllabicity threshold (i.e., the lowest point on the sonority scale that a nucleus may occupy). In English, for instance, the threshold is N, meaning that all segments equal to or higher in sonority than nasals are admissible in the nucleus as long as the SSP is obeyed. The sonority hierarchy of syllable peaks is incorporated into the grammar as a set of markedness constraints with a universally fixed ranking.
The constraints on syllabicity are “*μh/O » *μh/N » *μh/L » *μh/V” (μh stands for the head mora, the mora of the nucleus). This set of constraints, while banning all segments from the nuclear position, places the strongest ban on the least sonorous segments, that is, obstruents (Zec, 2007). With this in mind, the syllabicity thresholds of the languages above can be seen as different placements of a threshold within this fixed ranking. In Slovak, for instance, the threshold C is represented as “*μh/O » *μh/N » C » *μh/L » *μh/V”, whereas in Imdlawn Tashlhiyt, it is represented as “C » *μh/O » *μh/N » *μh/L » *μh/V”. These constraints, when correctly combined with the faithfulness constraints (MAX and DEP), help determine how certain words will be pronounced in these languages, based on the output candidate each language treats as ‘optimal’ according to its syllabicity constraints (more examples are to be found in the chapter).
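Reading the threshold off a ranking can be sketched as follows: every peak constraint ranked below C is violable, so the corresponding segment class may serve as a nucleus. The Slovak and Imdlawn Tashlhiyt rankings are those given above; the English and Bulgarian rankings are inferred from the stated thresholds and are assumptions, as is the function name.

```python
def permitted_nuclei(ranking):
    """Return the segment classes allowed as syllable nuclei.

    ranking lists the fixed peak constraints from highest to lowest
    ranked, with the threshold 'C' inserted at some point. Classes
    whose constraint falls below 'C' are admissible nuclei.
    """
    cutoff = ranking.index("C")
    return [name.split("/")[1] for name in ranking[cutoff + 1:]]

slovak = ["*μh/O", "*μh/N", "C", "*μh/L", "*μh/V"]
tashlhiyt = ["C", "*μh/O", "*μh/N", "*μh/L", "*μh/V"]
# Inferred from the thresholds stated in the text (assumption)
english = ["*μh/O", "C", "*μh/N", "*μh/L", "*μh/V"]
bulgarian = ["*μh/O", "*μh/N", "*μh/L", "C", "*μh/V"]

for name, ranking in [("Slovak", slovak), ("Tashlhiyt", tashlhiyt),
                      ("English", english), ("Bulgarian", bulgarian)]:
    print(name, permitted_nuclei(ranking))
```

Because the relative order of the four peak constraints never changes, the only language-specific information in each list is the position of C.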
Still, this principle alone cannot explain why [pn] and [ml] are not possible onset sequences in Spanish while clusters like [pl] and [pr] are, even though in all four the second member of the cluster is more sonorous than the first. Cases like this are explained in terms of a Minimal Sonority Distance (MSD) imposed on a pair of onset segments (Zec, 2007). Counting the steps along the scale O, N, L, V as intervals (two intervals between O and L, three between O and V, one between O and N, etc.), [p] is separated from [l] by two intervals, while only one interval separates [p] from [n], and [m] from [l]. Because the minimal sonority distance in Spanish is two intervals, [pl] and [pr] are possible onset clusters, while [pn] and [ml] are not (Zec, 2007).
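The interval arithmetic can be made explicit with a short sketch. The scale positions follow the O, N, L, V ordering used above; the small segment-to-class table and the function name are assumptions covering only the segments discussed here.

```python
# Positions on the O < N < L < V scale; adjacent classes are one interval apart
SCALE = {"O": 0, "N": 1, "L": 2, "V": 3}

# Hypothetical lookup covering only the segments discussed in the text
SEGMENT_CLASS = {"p": "O", "m": "N", "n": "N", "l": "L", "r": "L"}

def sonority_distance(first, second):
    """Number of intervals from the first onset segment up to the second."""
    return SCALE[SEGMENT_CLASS[second]] - SCALE[SEGMENT_CLASS[first]]

def onset_cluster_ok(first, second, min_distance):
    """True if the cluster rises in sonority by at least min_distance."""
    return sonority_distance(first, second) >= min_distance

SPANISH_MSD = 2  # minimal sonority distance stated in the text
for cluster in ["pl", "pr", "pn", "ml"]:
    print(cluster, onset_cluster_ok(cluster[0], cluster[1], SPANISH_MSD))
```

Setting the minimum to one interval would admit all four clusters, which illustrates how a single numeric parameter separates otherwise similar onset systems.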
The two principles, however, still do not explain why the clusters [pw], [bw], [fw], [tl], [dl], [Dl] are unacceptable in English even though their sonority rises. These are explained by the Obligatory Contour Principle (OCP), which states that two adjacent segments must not be too similar; here the two members of each cluster share a place of articulation (labial in [pw], coronal in [tl]). The pairs of syllables that emerge as optimal under constraints on syllable shapes are further subject to the requirements of Syllable Contact (SC). Certain phonological alternations that words undergo when borrowed from one language into another may be driven by syllable contact, which favors a sonority fall across syllable boundaries.
Al Motairi, S. S. (2015). An optimality-theoretic analysis of syllable structure in Qassimi Arabic. Master's Theses and Doctoral Dissertations. Retrieved from https://commons.emich.edu/theses/612
Chomsky, N., & Halle, M. (1968). The Sound Pattern of English. New York: Harper and Row.
Kahn, D. (1976). Syllable-based generalizations in English phonology (Doctoral dissertation). Massachusetts Institute of Technology, Cambridge, MA.
Zec, D. (2007). The syllable. In P. de Lacy (Ed.), The Cambridge Handbook of Phonology. Cambridge University Press.