The visual-verbal text interrelation: Lessons from the ideational meanings of a phonics material in a primary level EFL textbook

The interrelation between the verbal text and the images presented in a textbook is a prominent issue that should be taken into account. If the interrelation between these modes is presented properly and aptly in a textbook, students’ learning outcomes for a particular material, such as phonics, may be entrenched and fostered. Accordingly, this study aimed to scrutinize the interrelation between the verbal text and the images in a phonics material, given that, to the best of the writer’s knowledge, studies scrutinizing phonics materials as multimodal texts are still limited. A qualitative research method using content analysis was employed to investigate the analysis unit, namely the phonics material taken from one primary-level EFL textbook. The analysis drew on Royce’s (1998, 2002, 2007) Intersemiotic Complementarity and Kress and van Leeuwen’s (2006) Grammar of Visual Design, focusing on one of the metafunctions deriving from Halliday’s Systemic Functional Linguistics, namely the ideational metafunction. The findings revealed that the multimodal text containing the phonics material had rich meanings manifested in various modes, and that there was a synergy between the visual and verbal meanings, realized through ideational intersemiotic complementarity.


Introduction
The way images complement texts, and vice versa, constitutes one of the prominent areas of study at present. Kress (2010) argues that the various modes, which are culturally and socially bound and associated with semiotic resources for making meaning, have proliferated across disciplines. This is because such modes are inevitably used in communication today (Kress, 2000; Kress, 2003; Nathaniel & Sannie, 2018), and the meaning of a given message can be communicated through images. This notion also applies, without exception, to the classroom setting, which according to Stein (2000) constitutes 'semiotic spaces' in which multimodal texts encompassing various modes, such as 'visual, written, spoken, performative, sound, and gestural modes', can be generated. Such modes provide signs whose meanings can influence the way individuals see, think about, and perceive issues or things in their daily lives (Kress, 1993).
In addition, some evidence has shown that the use of various modes may benefit students' English learning and teachers' instructional practices. For instance, a study conducted by Cahyaningati and Lestari (2018) reveals the efficacy of multimodal texts for reading proficiency. Moreover, a multimodal text was found to have the potential to foster students' critical thinking (Al-Qahtani, 2019). Furthermore, Teo and Zhu (2018) argue that images are not merely supplementary or accessory but play a significant role in the meaning-making of the printed words. Similarly, Misianto (2017) found that pictures helped students enhance their English proficiency, particularly their speaking skill.
Meanwhile, an EFL textbook, which may contain rich multimodal texts, is considered to have a crucial place in the EFL learning and teaching process. In particular, it serves as the source of and means for introducing the knowledge and cultures being learned (Rinekso & Indonesia, 2021; Sugianto & Wirza, 2021). Marefat and Marzban (2014) also argue that a textbook constitutes an important tool used by a lot of people. Moreover, with regard to the role of multimodal texts in an EFL textbook, the text and images presented in a textbook are deemed to have a significant effect on the students' understanding of a lesson. This is because the author of a textbook does not work alone when structuring it but works together with visual artists or graphic designers (Bezemer & Kress, 2010).
Furthermore, despite its benefits, a textbook should not be taken for granted, because it may have deficiencies that fail to fulfill and facilitate the students' learning needs; evaluating a textbook is therefore a crucial issue in the EFL context. As Richards and Richards (2015) argue, there is no flawless textbook that can suit a language program thoroughly. This notion is corroborated by, for instance, a study conducted by Sobkowiak (2016): although there was credence that a textbook could enhance students' critical thinking ability, that potential was found to be limited in his study. Similarly, Mizbani and Chalak (2017) revealed that the EFL textbook investigated in their study did not accommodate higher-order thinking skills. Thus, textbook selection should not be a taken-for-granted process; it should involve considerations that hinge on the evaluation conducted.
Moreover, evaluating a textbook through multimodal text analysis requires an understanding of the relevant frameworks. Two of the most prominent frameworks in this area, both used in this study, are the grammar of visual design advocated by Kress and van Leeuwen (2006) and intersemiotic complementarity proposed by Royce (1998, 2002, 2007). These two frameworks are developed under Halliday's systemic functional linguistics, which covers three types of metafunctions: the ideational, interpersonal, and textual metafunctions (Halliday, 1994; Halliday & Matthiessen, 2004, 2014). The ideational metafunction concerns the realization of 'reality, events, and experiences'; the interpersonal metafunction concerns the establishing and keeping of social rapport; and the textual metafunction concerns the organization of discourse, through which the 'flow of the information' is managed and kept (Eggins, 2004; Emilia, 2014; Gunawan, 2020). The visual meanings and their relations to the verbal meanings are elaborated under the notion of intersemiotic complementarity; owing to the scope of the present study, the elaboration focuses on the ideational metafunction only. Table 1 gives a brief explanation of ideational intersemiotic complementarity.

Table 1. Ideational intersemiotic complementarity (Royce, 1998, 2002, 2007)

Visual meanings
Identification: the represented participants and their interaction
Activity: the actions, events, or types of behaviour
Circumstances: 'setting, means, and accompaniment'
Attributes: the qualities or characteristics attached to the participants

Intersemiotic complementarity
Repetition: 'identical meanings'
Synonymy: 'same or similar meanings'
Antonymy: 'opposite meanings'
Meronymy: 'part-whole relation'
Hyponymy: 'general-sub class relations'
Collocation: 'the probability that an entity or subject can co-occur in a certain subject area'

Verbal meanings
Identification: the participants
Activity: the actions or processes
Circumstances: 'setting, means, and accompaniment'
Attributes: the qualities or characteristics attached to the participants

Furthermore, despite the importance of phonics in students' English learning, to the writer's knowledge, studies concerning phonics materials as multimodal texts are still limited. Thus, based on the rationales above, the present study aimed to answer the following questions: 1) how are the ideational visual meanings realized in the multimodal text of a phonics material? 2) how are the ideational verbal meanings realized in the multimodal text of a phonics material? and 3) how are the interrelations between the visual and verbal meanings realized in the multimodal text of a phonics material?

Method
A qualitative research design was employed in this study. This type of research has a number of characteristics. Qualitative research, as Wertz et al. (2011) suggest, is intended to answer open-ended questions leading to "qualitative knowledge" involving the contextuality, output, and benefits of the issues being investigated. Moreover, Nunan (1992) asserts that it is "process-oriented" and has to do with three aspects: validity, which is gained through garnering deep, rich, real, and objective data; ungeneralisability, on account of the use of "single case studies"; and the assumption of a dynamic reality. Based on these notions, this type of research is considered appropriate for the present study, which sets out to answer open-ended questions about the visual images, the verbal text, and the interrelations between them. The features of qualitative research above are also in line with the way the data were collected, namely by gaining in-depth data concerning the multimodal text, and the study is deemed 'ungeneralisable' because it investigated one multimodal text. Moreover, a content analysis was employed in the form of a multimodal analysis drawing on Kress and van Leeuwen's (2006) visual grammar and Royce's (1998, 2002, 2007) intersemiotic complementarity, frameworks that address the interrelation between image and text. The analysis unit was derived from a primary-level EFL textbook entitled Super Minds (Puchta, Gerngross, & Lewis-Jones, 2017). The material was one of the phonics materials from the textbook, namely -sure and -ture. It was selected based on the credence that it is a challenging phonics pattern for the students to learn.
Moreover, the material was considered a multimodal text because it was presented not only through a verbal text but also with some colorful pictures. However, due to copyright issues, the pictures were converted to black and white.
In addition, some techniques were employed to address trustworthiness. In terms of credibility, 'establishing referential adequacy', that is, using various supporting documents and resources to interpret the data, was utilized (Guba, 1981). Moreover, 'developing thick description' was used to deal with transferability, that is, whether or not the data fit other relevant or related contexts (p. 86). Furthermore, in terms of dependability and confirmability, 'audit trail' and 'practicing reflexivity' were employed by making use of notes or journals on the collected data (p. 87).

Findings and discussion
In this part, the results of the content analysis of the analysis unit, that is, the multimodal text of the phonics material on -sure and -ture, are discussed and elaborated. The discussion covers the visual meanings, the verbal meanings, and the interrelation between the two modes represented in the analysis unit (see Figure 1).

Figure 1 (Puchta et al., 2017)

To arrive at the ideational intersemiotic complementarity, the analysis of ideational meanings was divided into three parts: the visual message elements depicting the visual meanings of the analysis unit, the description of the verbal aspect, and the ideational intersemiotic complementarity. These three parts are discussed and elaborated below.

The visual meanings of the multimodal text
To begin with, four aspects of the visual meanings are discussed: identification, activity, circumstances, and attributes. First, concerning identification, which has to do with the explication of the represented participants, the analysis unit reveals two actors, a man and a parrot. In addition to the two actors, there are some features of the represented participants, comprising a treasure chest, gold coins, a gold necklace, red headgear, a white-and-blue striped T-shirt, a black vest, a peg-leg, and a cutlass or sword.
In addition, in terms of activity, which has to do with the actions that occur, the images shown in Figure 1 reveal various processes. Based on Kress and van Leeuwen's (2006) framework, these encompass the following:
1) A symbolic attributive process, in which the aforementioned features, such as the treasure chest, clothes, and weapons, are associated with the attributes of pirates (pirate life); in this case, the carrier is the man.
2) A transactional process, in which the man is opening the chest and the parrot seems to talk to the man. The major actor appears to be the man, with the treasure chest as the goal. The parrot can also be deemed the goal to whom the man speaks. Besides, the parrot can be the major actor, for it seems to talk to the man as the goal; in other words, the process is considered to have sequential bidirectionality.
3) A reactional process, in which, based on Figure 1, there are two reacters, namely the man and the parrot. If the parrot is taken as the actor in the transactional action process whose goal is the treasure chest, the man can be regarded as the reacter, with the opening of the treasure chest as the phenomenon. Conversely, the parrot is viewed as the reacter if it is taken to react to the man's action of opening the treasure chest, which then becomes the phenomenon.
4) A speech process. Even though there is no speech, thought, or dialogue balloon, the dialogue that forms the verbal text appears in the center of the two pictures and represents the participants' speech. Speaker or sayer is therefore a more appropriate term than senser, since the content is close to dialogue although no dialogue balloon is provided; hence, the speakers or sayers may be the man or the parrot, and the utterance constitutes the content they are talking about.
Furthermore, with regard to circumstances, which are associated with the "settings (i.e. place or background), means (i.e. the supporting 'tool' to conduct the activities), and accompaniment (i.e. the other participant with whom the major participant conducts the activities)" (Royce, 1998), some information can be obtained from Figure 1 and Table 1. Concerning the setting, the scene takes place on a ship, with other background elements comprising a light blue sky with white clouds, which are likely to occur on a sunny day (either in the morning or afternoon). Moreover, in terms of the means, it can be seen from Figure 1 that the main participant, i.e., the man, is holding and opening an object, namely the treasure chest, through which an actional vector is created and which also becomes an attribute of the man. Furthermore, in terms of the accompaniment, the two participants, the man and the parrot, as well as the things around them, may be considered the accompaniment through which a vector can be created.

EnJourMe (English Journal of Merdeka): Culture, Language, and Teaching of English, Vol. 6, No. 1, July 2021, pp. 1-10
In addition, the attributes, or the 'qualities and characteristics' of the participants, are indicated by the aforementioned symbolic attributive process. They are realized by the man as Carrier, who is holding and opening the treasure chest, together with other attributive features such as the headgear, peg-leg, and cutlass or sword, whose connotative meanings are summarized in Table 2 below.

Table 2. Connotative meanings of the attributive features (Cirlot, 2001; Ferber, 2007; Morton, 2021)

White-and-blue striped T-shirt: a symbol of 'joy, solace, and gladness and heavenly things'; it can also be associated with the sky and sea.
Black vest: a symbol of bad or evil; it can be associated with the characteristics of a pirate.
Peg-leg: an attributive symbol of a pirate.
Cutlass/sword: a symbol of dignitaries; a pirate's weapon.
Parrot (green, purple, red, yellow): a symbol of a messenger or soul; green symbolises luck and nature, purple signifies mystery, yellow symbolises happiness, and red symbolises adventure.
Ship: a symbol of 'joy and happiness' or adventure.

The verbal meanings of the multimodal text
To construe the verbal meanings of the dialogue, the transitivity system of Halliday's systemic functional linguistics was utilized. Table 3 below provides the transitivity analysis of the dialogue, which gives information concerning the participants. Several processes were found to be associated with the represented participants. For instance, the represented participants were involved in mental processes of affection, such as love, whose phenomena are finding treasure and being in nature. Moreover, gold also appears as a represented participant: it is the actor of a material process directed at the recipient, namely the represented participants (the man and the parrot), with pleasure as the goal. A relational process was also found in the dialogue, indicating that the represented characters see life as an adventure.

The visual and verbal interrelation in the multimodal text
The ideational intersemiotic complementarity was represented by various relations between the visual and verbal modes, comprising repetition, synonymy, antonymy, meronymy, hyponymy, and collocation. However, before turning to the intersemiotic complementarity, the verbal aspect of the text was investigated first by breaking down the dialogue into sentences, as presented in Table 4 below. Once the sentence-level information on the verbal aspect was available, the 'intersemiotic sense relations' could be obtained. The intersemiotic complementarity is presented in Table 5 below. From Table 5, it can be seen that there was an interrelation between the visual messages and the verbal text. Gold was found to support superiority collocationally. Similarly, valuable thing was complemented in the verbal text by the meronyms treasure, gold, and life. Besides, finding treasure was found to be a meronym of the superordinate adventure, which is also supported by the synonym found in the verbal text with the same word, adventure. Furthermore, the relation of hyponymy was found to support the visual element connoting the sky and sea. In addition, pirate, one of the visual message elements, was complemented collocationally by words and phrases found in the clauses, such as finding treasure, gold, and adventure. Also, the visual message element nature stood in a relation of repetition with nature in the dialogue. Furthermore, happiness, another visual message element, was complemented collocationally by the word love and was found to be synonymous with the word pleasure.