A Semantic Broadcasting Method for the Exchange of Encoded Texts between Different XML Schemas

Marios Poulos, Sozon Papavlasopoulos, and George Bokos
Department of Archives and Library Science, Ionian University,
Palaia Anaktora, 49100, Corfu, Greece
Email: {mpoulos, sozon, gbokos}@ionio.gr


In this paper, we introduce a novel method, based on specific frequency segmentation, for the exchange of encoded texts between different XML schemas. In this process, we encode the text part of each element of the Dublin Core using a signal processing technique so that every text element broadcasts an audio signal in a unique frequency. In this way, every schema based on the XML language can exchange textual data. This method may be used to exchange data between XML schemas built on different design philosophies and may also serve as a solution to the general problem of interoperability. More specifically, this method may be applied to TV metadata and the broadcasting services of automated libraries.

Keywords: Multimedia, XML, Encoding, Signal Processing, Metadata Management, Harmonization Problem.




Background: A recent increase in the sophistication of digital TV and interactive TV technologies has meant that a richer choice of content and related data is available to the user, thus requiring ever more effective methods for the selection and management of either distributed or locally stored broadcast content. Various papers have considered the application of Dublin Core (DC) and the Resource Description Framework (RDF) for the related data (Bhat, 1998; Gonno, Nishio, Haraoka, & Yamagishi, 1998; Hunter & Iannella, 1998; Nishio, Gonno, Haraoka, & Yamagishi, 1998). The central feature of the Dublin Core is the building of an interdisciplinary, international consensus around a core element set (currently 15 elements). Although it is possible to describe non-bibliographic attributes of non-textual resources through qualification of the Dublin Core elements, MPEG-7, the Multimedia Content Description Interface standard (Manjunath, Salembier, & Sikora, 2002) developed by the ISO/IEC Moving Picture Experts Group (MPEG), has been designed to provide detailed formatting information and fine-grained descriptions of the structural and low-level audio, visual, and audiovisual features of multimedia content. The aim of MPEG-7 is to provide a rich set of standardized tools to enable both humans and machines to generate and understand audiovisual descriptions that can be used to enable fast, efficient retrieval from digital archives (pull applications) as well as the filtering of streamed audiovisual broadcasts on the Internet (push applications). Although it is possible to determine mappings between the Dublin Core elements and MPEG-7 descriptors, the mapping is complex. There is no simple one-to-one mapping that corresponds to each DC element, and many of the MPEG-7 descriptors are embedded at a low level within the MPEG-7 Description Schemes (DSs).
Moreover, in light of the problem of interoperability, MPEG-7 descriptors and TV-Anytime have adopted a common representation format for the exchange of metadata. Interoperability means that a metadata provider using this representation format is assured that the information will be appropriately interpreted, processed, and rendered on different platform implementations. Interoperability also means that different original representation formats are capable of bidirectional transformation into the common format. MPEG-7 descriptors and TV-Anytime share the XML Schema as their common description definition language, a common framework based on W3C's XML family (Composite Capability/Preference Profiles, 2001).


The lack of standardization in the area of television programme information or “metadata” poses increasing problems for all concerned, from programme-makers, broadcasters, advertisers, and network operators to viewers. Three industry-based initiatives have so far addressed this problem, the earliest being DVB-SI, which plays an integral role in the digitalization of TV in Europe and other parts of the world. TV-Anytime and MPEG-7 are also promising television metadata standards that are in the process of being finalized. The former deals with high-volume, low-cost storage (e.g., personal video recorders and video-on-demand services) and the latter with the wider scope of providing tools to describe all forms of multimedia content from the broadest possible range of networks and terminals. TV-Anytime and MPEG-7 use XML syntax to define what metadata will be used (the schema) and then to encode that data (instances). The metadata structure is defined using the XML Schema-based MPEG-7 Description Definition Language (ISO/IEC, 2001). The XML Schema is a recent web standard for defining sophisticated data structures (Worthington, 2001b) and uses the XML syntax. While XML has its advantages, being widely favoured and easy to generate, the number of XML applications is constantly increasing, which leads to many redundant specifications. Furthermore, not all XML applications have common functionalities; for instance, both the Synchronized Multimedia Integration Language (SMIL) and eXtensible 3D (X3D) have sub-schemas to describe the meta information, yet they also have their individual functionalities to represent multimedia files and create X3D objects. Consequently, there is a demand to create powerful XML applications by reusing and integrating functionalities from the existing XML applications. This is called the XML harmonization problem.
Most previous work on XML harmonization has been done in the database field, where the focus has been on allying the participating XML applications. These methods are specific and cannot easily be generalized to other XML applications. The eXtensible MPEG-4 Textual format (XMT), another harmonization effort in the multimedia domain, simply adopts partial schemas from SMIL and X3D by renaming their element tags (Chen, Kuo, Sun, & Kuo, 2003).

The proposed method

In this paper, we propose a new broadcasting metadata schema based on an example using the Dublin Core and MPEG-7 standards (Hunter, 2002), in which we attempt to define a novel format for the non-textual interchange of metadata between XML schemas of different philosophies in order to avoid the problem of harmonization. The application of the above example is shown in Table 1 and illustrates the mappings between the Dublin Core Metadata Element Set and the MPEG-7 Multimedia Description Schemes (ISO/IEC 15938-5) (ISO/IEC, 2001) using XPath (XML, 1999) expressions to represent the equivalent MPEG-7 descriptors. More specifically, we adopted a new, simpler metadata format based on the Specific Segmentation of a Pre-Determined Frequency (SSPDF). Unlike an XML Schema, the proposed schema does not need any grammatical syntax, and the metadata exchange between operating systems with different philosophies and the one proposed in this work is implemented via a suitably designed lossless filter. Furthermore, the proposed metadata format schema SSPDF may be broadcast via analogue TV stations from a satellite in digital form together with the original TV programme or be compressed with the TV programme using suitable multiplexing methods (Yang, 1999). Generally, the proposed method is based on a novel encoding of the textual part of the content of each element of the XML Schema in a unique dominant broadcasting frequency band. Thus the interchange of data takes place by detecting those specific frequencies.

In brief, our proposed method is divided into the following stages:

·         Pre-processing stage: We create mappings between the Dublin Core Metadata Element Set, Version 1.1 (Chen, Kuo, Sun, & Kuo, 2004), the MPEG-7 Multimedia Description Schemes, and the proposed SSPDF format.

·         Processing stage: We describe in detail the encoding and decoding of the SSPDF format.

·         Experimental part: The SSPDF format is tested on real data in both encoding and decoding.

·         Related work: The proposed method is compared to the Harmonized eXtensible Markup Language (HXML), which is presented in the literature as a valuable solution to the harmonization problem (Chen, Kuo, Sun, & Kuo, 2004).

·         Conclusion and future work: We describe the advantages of the proposed SSPDF format compared with the traditional XML format and its possible application in telecommunication areas.


Preprocessing stage

In this stage, we applied a specific conversion from each DC element and its proposed MPEG-7 schema path (Dublin Core, 1999; Hunter, 2002) to the SSPDF format. More specifically, any textual data taken from each of the 15 elements was coded in a specific frequency sub-band signal (Table 1). This conversion was implemented via an algorithm described in detail in the processing stage. The main advantage of this conversion is that a specific frequency band, such as that of the subject element, may be divided into further sub-bands using a standard classification system such as, for example, the Dewey Decimal Classification (DDC). This proposed conversion is illustrated herein using the classical example of a TV programme (Table 2).

DC Element | Description | MPEG-7 Path | SSPDF frequency
Title | The abstract identifier/name of the resource | | F1 = 100*k Hz
Creator | The person or entity that originally created this resource | | F2 = 200*k Hz
Subject | A classification of the topic of the resource | | F3 = 300*k Hz
Description | An abstract description of the content of the information resource attached | | F4 = 400*k Hz
Publisher | A person or entity tasked with making the resource available, thus accessible to the interested stakeholders | [Role/Name="Publisher"]/Agent/Name UsageInformation/Availability/ | F5 = 500*k Hz
Contributor | A person or entity tasked with making contributions to this resource | | F6 = 600*k Hz
Date | The date associated with an event regarding the content (for instance, creation date) | (date at which MPEG-7 metadata description was created); AvailabilityPeriod (date, time, and duration of broadcast, or date of publication if availability is persistent) | F7 = 700*k Hz
Type | The nature or genre of the content of the resource | | F8 = 800*k Hz
Format | The physical or digital representation of the resource | | F9 = 900*k Hz
Identifier | An unambiguous reference to the resource within a given context | | F10 = 1000*k Hz
Source | A reference to the original source from which the content derives | | F11 = 1100*k Hz
Language | The language of the content | | F12 = 1200*k Hz
Relation | A reference to a related resource or object with connection to the content | | F13 = 1300*k Hz
Coverage | The extent or scope of the content of the resource | | F14 = 1400*k Hz
Rights | Information about various rights held on the resource (copying, distributing) | | F15 = 1500*k Hz

Table 1. Mappings between the Dublin Core Metadata Element Set, Version 1.1 (Hunter & Iannella, 1998), the MPEG-7 Multimedia Description Schemes (ISO/IEC 15938-5), using XPath expressions to represent the equivalent MPEG-7 descriptors, and the SSPDF format, where k is a constant defined by the broadcasting multimedia application.
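
The frequency column of Table 1 follows a simple rule: the i-th element is assigned F_i = 100*i*k Hz. A minimal Python sketch of that rule is given below. It assumes the standard Dublin Core 1.1 element names in their usual order, and it assumes that each element owns the band from its base frequency up to the next one; the names `DC_ELEMENTS`, `element_frequency`, and `element_from_frequency` are illustrative and not part of the original method.

```python
# Dublin Core 1.1 element names in their standard order (assumed to match
# the row order of Table 1).
DC_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]


def element_frequency(element: str, k: float = 1.0) -> float:
    """Return the SSPDF base frequency in Hz, F_i = 100 * i * k (i is 1-based)."""
    i = DC_ELEMENTS.index(element) + 1
    return 100.0 * i * k


def element_from_frequency(freq_hz: float, k: float = 1.0) -> str:
    """Map a detected frequency to the element whose band contains it.

    Assumption: band i spans [100*i*k, 100*(i+1)*k) Hz.
    """
    i = int(freq_hz // (100.0 * k))
    if not 1 <= i <= len(DC_ELEMENTS):
        raise ValueError(f"{freq_hz} Hz is outside the SSPDF bands")
    return DC_ELEMENTS[i - 1]
```

Under these assumptions, a detected frequency of 361*k Hz falls in the Subject band, which agrees with the sub-band layout of Table 2.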


General TV programmes (F3 = 300*k Hz) | TV Films (F36 = 360*k Hz)
1. Sports news, 310*k Hz | 1. Drama, 361*k Hz
2. Weather news, 320*k Hz | 2. Comedy, 362*k Hz
3. Political news, 330*k Hz | 3. Police, 363*k Hz
4. Music programmes, 340*k Hz | 4. Adventure, 364*k Hz
5. Serials, 350*k Hz | 5. Cartoon, 365*k Hz
6. Films, 360*k Hz | 6. Historical, 366*k Hz
7. Advertisements, 370*k Hz | 7. War, 367*k Hz
8. Financial news, 380*k Hz | 8. Horror, 368*k Hz
9. Documentaries, 390*k Hz | 9. Sex, 369*k Hz

Table 2. The illustration of the DC implementation of the third DC element (subject) using SSPDF format
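
The sub-band layout of Table 2 can be read as a decimal code: the hundreds digit selects the DC element (3 for subject), the tens digit the programme type, and the units digit the film genre. The sketch below expresses that reading in Python. The labels are copied from Table 2; the function name `subject_frequency` and the restriction of genres to the Films row are assumptions made for illustration.

```python
# Sub-band labels copied from Table 2; list position + 1 gives the digit.
PROGRAMME_TYPES = [
    "Sports news", "Weather news", "Political news", "Music programmes",
    "Serials", "Films", "Advertisements", "Financial news", "Documentaries",
]
FILM_GENRES = [
    "Drama", "Comedy", "Police", "Adventure", "Cartoon",
    "Historical", "War", "Horror", "Sex",
]


def subject_frequency(programme_type, film_genre=None, k=1.0):
    """Return the subject sub-band frequency in Hz: (300 + 10*p + g) * k."""
    p = PROGRAMME_TYPES.index(programme_type) + 1
    g = 0
    if film_genre is not None:
        if programme_type != "Films":
            raise ValueError("Table 2 defines genres only for Films")
        g = FILM_GENRES.index(film_genre) + 1
    return (300 + 10 * p + g) * k
```

For example, `subject_frequency("Films", "Drama")` yields the 361*k Hz value used in the experimental part.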

Processing stage

Meta-encode stage

In this stage, we supposed that a selected text is an input vector whose elements represent the characters of the selected text. Then, we used a conversion procedure in which this symbolic expression (in our case an array of characters of a text) was converted, using suitable software, from ASCII characters to their arithmetic values, and a numerical vector whose values ranged from 1-128 was obtained.

Furthermore, the numerical vector was transformed into a wave signal at a particular frequency, which was generated as follows:

1.      We determined the suitable frequency for the element being encoded (Table 1).

2.      The sampling frequency was calculated from this frequency using the Nyquist criterion and a constant factor.

3.      The time vector was calculated from the sampling frequency.

4.      We calculated the final strong signal y with the pre-determined frequency, where n is the size of the original vector. The pre-determined frequency was preserved using a trick: the text vector and the number n, both of which were significantly smaller in magnitude than the signal, were added to it; thus the predominant frequency of vector y remained stable in practice (Hunter, 2002). For the retrieval procedure, the length n of the original vector is stored as the first element of vector y.

5.      For accuracy and security reasons, a verification test was adopted to check that the detected frequency was correct: the value of the coding frequency was stored in the last elements of vector y. This control procedure was adopted in order to avoid any error in the coding frequency that could result from random noise in the broadcasting procedure, and for security reasons, for example when a signal sent by an attacker at the same frequency could be mistaken by the proposed system for the requested information. In the latter case, the position of this element is used as a secret key in the proposed security system.

6.      Vector y was saved in a binary-format file (the metadata file) using suitable software.
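
A minimal Python sketch of these steps follows, under stated assumptions. The additive scheme (character codes added to a much stronger sine carrier), the `amplitude` parameter, the one-sample-per-character time vector, and the function name `meta_encode` are all assumptions made for illustration rather than the authors' exact equations. Python's `ord` plays the role of the character-to-number conversion, and writing the result to a binary file is left out.

```python
import math


def meta_encode(text, freq_hz, fs_hz, amplitude=1000.0):
    """Encode text as a strong sine carrier plus small character codes."""
    # Nyquist criterion: the sampling frequency must exceed twice the carrier.
    if fs_hz <= 2 * freq_hz:
        raise ValueError("sampling frequency must exceed twice the carrier")
    codes = [ord(ch) for ch in text]          # characters -> numeric codes
    n = len(codes)
    t = [i / fs_hz for i in range(n)]         # time vector, one sample per character
    # Assumed additive scheme: the carrier dominates the much smaller codes.
    body = [amplitude * math.sin(2 * math.pi * freq_hz * ti) + c
            for ti, c in zip(t, codes)]
    # Length stored first and carrier frequency stored last, as in steps 4 and 5.
    return [float(n)] + body + [float(freq_hz)]
```

The carrier amplitude is chosen to be large relative to the code values so that the carrier frequency remains the dominant spectral component; any value with that property would serve the same purpose in this sketch.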

Meta-decode stage

The Meta-decode stage consists of the following steps:

1.      The binary-format metadata file was read back into vector y with the use of suitable software.

2.      Then the sampling frequency was determined.

3.      Vector y was submitted to spectral analysis, and its dominant (highest-amplitude) frequency was detected with the use of suitable software based on Fast Fourier Transform (FFT) spectral analysis.

4.      The time vector was retrieved as defined in the Meta-encode stage.

5.      Then the original numerical vector was retrieved from vector y with the use of the length information (n) given by the first element of vector y.

6.      Thereafter, the accuracy of the detected frequency was investigated by comparing it with the frequency value stored in the last element of vector y. If the accuracy was verified, the procedure continued. Otherwise, it was cancelled and the Meta-encode stage repeated.

7.      With suitable software, the numerical vector was transformed into a character vector.

8.      Then the suitable database was updated with the character vector.
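
The decoding steps can be sketched in the same hedged way. The code below assumes a signal laid out as length first, carrier-plus-codes body, and carrier frequency last, with the codes added to a sine carrier of known amplitude; that layout, the `amplitude` parameter, the tolerance of one frequency bin, and the names `dominant_frequency` and `meta_decode` are illustrative assumptions. A naive discrete Fourier transform stands in for the FFT software mentioned in step 3.

```python
import cmath
import math


def dominant_frequency(samples, fs_hz):
    """Return the frequency (Hz) of the largest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_mag = 1, -1.0
    for b in range(1, n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * b * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = b, abs(acc)
    return best_bin * fs_hz / n


def meta_decode(y, fs_hz, amplitude=1000.0):
    """Recover the text from a vector laid out as [n, body..., frequency]."""
    n = int(y[0])                 # length stored as the first element
    stored_freq = y[-1]           # coding frequency stored as the last element
    body = y[1:1 + n]
    detected = dominant_frequency(body, fs_hz)
    # A DFT over n samples resolves frequency only to within fs/n,
    # so the verification allows a tolerance of one bin (an assumption).
    if abs(detected - stored_freq) > fs_hz / n:
        raise ValueError("detected frequency does not match the stored value")
    # Subtract the assumed carrier and round back to integer character codes.
    codes = [round(v - amplitude * math.sin(2 * math.pi * stored_freq * i / fs_hz))
             for i, v in enumerate(body)]
    return "".join(chr(c) for c in codes)
```

Because a short title gives only a few dozen samples, the spectral peak locates the carrier only coarsely; this sketch therefore uses the detected peak for verification and the stored frequency value for the exact reconstruction.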

Experimental part

In this stage, we implemented the processing stage using an example of the third DC element (title of film) and case 1 of the drama film entitled “One Flew over the Cuckoo's Nest.” This example was encoded in the following steps:

Meta-encode stage

·        Using the “double” function of Matlab, we converted the title of the film into a numerical representation, including the broadcasting frequency (see figure 1).

Fig. 1. The numerical transformation of the textual title of the film, including the 361 Hz value of the meta-code frequency

·        The numerical representation of the title of the film was transformed into a wave of 361*100 Hz (where k=1) with a sampling rate of 170000, according to the equations provided in the Meta-encode stage (Processing Stage) (see figure 2).

Fig. 2. The spectral analysis of the numerical array where the frequency of 361 Hz dominates

Meta-decode stage

In this stage, the numerical matrix is reverted to the original title with the use of inverse text transformation according to the equations described in Meta-decode stage.

In more detail, using vector y, we retrieved two valuable pieces of information: first, the length of the encoded text (title of the film) and, second, the dominant frequency (see figure 3).

Fig. 3. Presentation of the numerical array in the decoding phase. The first element (31) indicates the length of the film title; the array also carries the secret information of the meta-code frequency.

Finally, the original text was retrieved with the use of inverse text transformation implemented in this example by the “char” function of Matlab.

Fig. 4.  The final textual recovery of the film’s title presented with the verification meta-code value of frequency 361 Hz
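
The text-to-number round trip in this example can be reproduced in a few lines. The sketch below uses Python's `ord` and `chr` as stand-ins for the Matlab “double” and “char” functions named above, and checks the quoted carrier and sampling values against the Nyquist criterion; the variable names are illustrative.

```python
title = "One Flew over the Cuckoo's Nest"

# Characters -> numeric codes, analogous to Matlab's double(title).
codes = [ord(ch) for ch in title]

# Length of the encoded text; Fig. 3 reports 31 as the first element.
length = len(codes)

# Numeric codes -> characters, analogous to Matlab's char(codes).
recovered = "".join(chr(c) for c in codes)

# Carrier and sampling values quoted in the text.
carrier_hz = 361 * 100
sampling_hz = 170000

# Nyquist criterion: sampling rate must exceed twice the carrier frequency.
nyquist_ok = sampling_hz > 2 * carrier_hz
```

Here `length` evaluates to 31, which matches the first element reported in Fig. 3, and `nyquist_ok` is true because 170000 exceeds 2 * 36100 = 72200.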

Related work

The harmonization problem of metadata, apart from being a topic of discussion at the expression level (Chen, Kuo, Sun, & Kuo, 2004; Heery & Patel, 2000), has also given rise to formal encoding approaches that represent the harmonization method. An example of such an approach can be found in Chen, Sun, and Kuo (2003), where a formal encoding of the harmonization problem is presented along with a proposed solution.

This related work, however, relies heavily on the integration and restructuring of different XML representations in a new XML Schema. More particularly, this kind of approach combines descriptors and elements in a new object, which is organized by an object-oriented harmonization data model. The key concepts and terminologies of this approach are based on a group of objects having a similar structure and behaviour. The disadvantages of this technique lie in the complexity of, and the associations among, the different semantic objects (schemas). For instance, in the relevant Harmonized eXtensible Markup Language (HXML), the complexity of producing a new object from s objects grows as O(N^s), because the new object results from a calculation over the other objects. In contrast, for the proposed SSPDF, the complexity is O(2*k*N*s), because the calculation for each object is independent; here N is the number of characters coded or decoded (per element) for k entities. Furthermore, the object-oriented harmonization technique is based on the association among semantic elements, entities, and schemas, in contrast to the proposed SSPDF method, which is based on the mapping of coding in semantic frequency entities. One major advantage of the proposed method is that it supports queries across different schemas that have at least one common entity. This means that the query consists of a common element (key) that drives the reference point of the encoding and decoding without harnessing the semantics of the information domain (a separate process for mapping to different objects is not necessary).

Finally, our method is independent of the schema domain since each entity has a unique representation in the frequency transformation (encoding and decoding) stage. This is because the transformation of the XML data into a signal sequence does not need to be pre-processed in order to identify the common elements that need to be mapped to the new instance. The encoding process is, therefore, independent of the representation level of the elements. Through the encoding exception/handling process, there is also an assertion mechanism that will provide a high level of accuracy in the transformation (encoding and decoding) of XML data from one information source to another.

Conclusion and Future Work

In this paper, we have discussed a new method for the exchange of encoded texts between different XML schemas. In this method, we encoded the text part of each DC element, using a signal processing technique, so that every text element would broadcast an audio signal in a unique frequency. In this way, every schema based on the XML language could exchange textual data. This was achieved using unique frequencies for each schema element. As an example, the simple DC schema was used to encode a classic TV programme. This technique, however, may also be used in other applications such as the broadcasting of multimedia data, the exchange of medical library data, or the broadcasting services of automated libraries. In addition, this process may be used generally by (non)broadcasting interchange services. Our system leans towards a more formal approach to the harmonization of metadata schemas through techniques and methodologies that have already proven effective, such as those of information encoding and transmission. Community standards also play an important role, since the architecture and deployment of such an approach must have the active support of industry leaders. Our research group is open to discussing such approaches and welcomes any suggestions on deploying such an architecture on an industrial level.


Bennett, R. M., & Chapman, R. H. (1984). U.S. Patent No. 4,484,354. Washington, DC: U.S. Patent and Trademark Office.

Bhat, D. (1998, October). On representing video structure using RDF. MPEG document: ISO/IEC JTC1/SC29/WG11 MPEG98/M4132. MPEG Meeting, Atlantic City.

Chen, Y. F., Kuo, M. C., Sun, X., & Kuo, C. C. J. (2003). An object-oriented approach for harmonization of multimedia markup languages. In M. M. Yeung, R. W. Lienhart, & C. S. Li (Eds.), Storage and retrieval methods and applications for multimedia 2004. Proceedings of the SPIE, 5307, 448-459.

Chen, Y. F., Kuo, M. C., Sun, X., & Kuo, C. C. J. (2004). Object-oriented harmonization of multimedia XML applications. ICME, 2099-2102.

Chen, Y. F., Sun, X., & Kuo, C. C. J. (2003, September). XML Schema harmonization: Design methodology and examples. Proceedings of SPIE Information Technologies and Communications, 5241.

Composite capability/preference profiles. W3C working draft. (2001, March). Access: http://www.w3.org/TR/2001/WD-CCPP-struct-vocab-20010315/

Dublin core metadata element set, version 1.1: Reference description. Access: http://dublincore.org/documents/1999/07/02/dces/

Gonno, Y., Nishio, F., Haraoka, K., & Yamagishi, Y. (1998, August). Metadata structuring of audiovisual data streams on MPEG-2 system. Metastructures '98. Montreal, Canada.

Heery, R., & Patel, M. (2000, September). Application profiles: Mixing and matching metadata schemas. Ariadne, Issue 25. [Online] available at http://www.ariadne.ac.uk/issue25/app-profiles/

Hunter, J. (2002, February). An application profile which combines Dublin core and MPEG-7 metadata terms for simple video description. Access: http://www.metadata.net/harmony/video_appln_profile.html

Hunter, J., & Iannella, R. (1998, September). The application of metadata standards to video indexing. Paper presented at the Second European Conference on Research and Advanced Technology for Digital Libraries, Crete, Greece.

ISO/IEC 15938-5 FDIS information technology–multimedia content description interface–part 5: Multimedia description schemes. MPEG document: ISO/IEC JTC1/SC29/WG11 Document W4242, July 2001, Sydney.

Manjunath, B. S., Salembier, P., & Sikora, T. (Eds). (2002, April). Introduction to MPEG-7: Multimedia content description interface. New York: Wiley Publications.

Nishio, F., Gonno, Y., Haraoka, K., & Yamagishi, Y. (1998, August). Transporting RDF Metadata associated with structured contents. Metastructures '98. Montreal, Canada.

Worthington (2001b). Internet-TV convergence with the multimedia home platform. Communications Research Forum. Access: http://www.tomw.net.au/2001/itv.html

XML path language (XPath) version 1.0. W3C recommendation. (1999, November 16). Access: http://www.w3.org/TR/xpath

Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Journal of Information Retrieval, 1(1/2), 67-88.