Knowledge Retrieval and Semantic Systems

Information Storage and Retrieval

Analysis of Resource Storage Formats in Digital Collections (Case Study: National Libraries)

Yaghoub Norouzi; Nayere Jafari Far; Reihaneh DavoodAbadi

Volume 12, Issue 42, September 2024, Pages 151-182

https://doi.org/10.22054/jks.2022.66680.1491

Abstract

Introduction

The management of digital resources in national libraries involves significant challenges, particularly in selecting, maintaining, and ensuring long-term accessibility. A crucial element in this process is the storage format of digital resources, which directly affects both the preservation of materials and their usability over time. Below are key reasons why the choice of resource storage formats is important for national libraries:

Long-Term Preservation: National libraries manage vast collections of digital materials, from texts to multimedia resources. Choosing a storage format that ensures long-term preservation is essential to prevent data loss from format obsolescence, a challenge national libraries often face. The right choice of format can minimize the need for complex emulation or migration strategies, which are resource-intensive.

Interoperability and Accessibility: Digital collections in national libraries need to be accessible across various platforms, devices, and operating systems. International standards and guidelines, such as those from the International Federation of Library Associations and Institutions (IFLA), guide the choice of formats that facilitate interoperability. Adopting these standards ensures that digital resources are accessible not only to national users but also internationally. This is especially important for cultural heritage materials that may be used globally.

Efficiency in Storage and Retrieval: When dealing with large-scale collections, efficient storage formats help minimize infrastructure costs.
The speed at which data can be retrieved from a storage system is influenced by the format of the resources. Formats that are optimized for efficient access improve search and retrieval times, facilitating quick access to resources for users.

Metadata Preservation and Integration: National libraries depend heavily on metadata to ensure that digital objects are easily discoverable and appropriately categorized. Standardized formats are crucial for integrating digital collections with national library cataloging systems and facilitating interoperability between different systems. National libraries must ensure that the digital resources they store are secure and protected from corruption.

User Experience and Engagement: National libraries serve diverse audiences, ranging from academic researchers to the general public. Formats that are easily navigable (e.g., HTML or EPUB for e-books) improve user engagement and accessibility. Interactive or multimedia content (e.g., video, audio) in open formats can provide richer, more engaging experiences. Unicode-based text encodings (e.g., UTF-8) allow for the preservation of multiple languages and scripts, making national library collections accessible to diverse communities, including those with specific linguistic or cultural needs.

Legal and Copyright Considerations: Formats may also play a role in how digital rights and licensing information are embedded and protected. Open formats allow for the inclusion of rights management information without reliance on proprietary systems, ensuring proper management of copyrighted materials. The use of appropriate formats for the digital storage of copyrighted content can aid in enforcing digital rights management (DRM) policies. Formats like encrypted PDF or DRM-protected EPUB help libraries protect content according to legal guidelines.

Cost-Effectiveness: Choosing the right storage formats can have a significant impact on cost management for national libraries.
Open-source and standardized formats tend to be cost-effective compared to proprietary systems that may involve licensing fees, additional maintenance costs, or vendor lock-in. National libraries must invest in ensuring the longevity and accessibility of their digital collections. By choosing widely accepted formats with strong community support, national libraries can reduce ongoing maintenance and conversion costs associated with less popular or proprietary formats.

In short, the importance of selecting the right resource storage formats in national libraries cannot be overstated. A well-chosen format ensures the long-term preservation, accessibility, and efficient management of digital collections. It also contributes to the overall mission of national libraries to serve as custodians of cultural heritage, making materials accessible to a global audience while protecting them for future generations. Standardized, open, and widely supported formats are crucial in meeting these objectives, enabling national libraries to optimize their digital storage solutions while maintaining flexibility, cost-effectiveness, and long-term sustainability.

Literature Review

The studies by Sullivan (2006), Thomas & Martin (2006), Hodge & Anderson (2007), Rog & Van Wijk (2008), Van der Knijff (2011), Barabucci et al. (2011), Morrissey (2012), Hajtnik (2012), Jackson (2012), Koo & Chou (2013), Rimkus et al. (2014), Uherek et al. (2015), Termens et al. (2015), Delaney & De Jong (2015), Baratè et al. (2015), Rimkus & Witmer (2016), Anyim (2021), and Trianggoro & Prasetyadi (2022) showed that selecting the right format and storage standard for digital collections is a multi-faceted decision that balances accessibility, preservation, interoperability, and user experience. Ensuring that digital collections are presented in a way that aligns with these standards maximizes their utility and longevity.
The literature highlights that while digital libraries have evolved from physical spaces, the principles and practices of resource storage have undergone significant changes. Previous research has addressed these issues, but often as sub-components rather than focusing on them comprehensively. The review indicates that long-term preservation and storage have not been fully explored in the literature.

This research aims to identify successful national solutions for preserving and storing resources effectively in digital libraries. In essence, the study seeks to provide a more integrated approach to addressing storage and preservation issues in digital libraries, taking into account both the technological and organizational practices in use today. This effort is important for ensuring that digital resources remain accessible and usable over time, despite the challenges posed by rapid technological change and shifting formats.

Methodology

This article identifies the standard formats used for storing video, text, multimedia, and audio resources in the digital collections of 20 selected national libraries around the world. The study is applied research based on the descriptive-analytical method. For data collection, we used a researcher-made questionnaire. After data collection, descriptive statistical techniques such as frequency distribution and frequency percentage, together with the chi-square test, were used to analyze the data.

Results

Image resources were held in all the collections studied, whereas cultural objects were the least common, with a share of 41%. The most common formats were TIFF for image resources (94%), HTML and XML for text resources (75%), WAV for audio resources (65%), and AVI for multimedia resources (65%). Switzerland used the widest range of standards.
Also, among the libraries studied, Iran, Britain, the United States, Scotland, Qatar, the Netherlands, France, and Spain had the greatest diversity in the types of digital resources stored. The libraries studied were consistent in using standard formats for visual, textual, and multimedia resources and followed a similar pattern in selecting storage formats.

Conclusion

The libraries studied showed the least diversity in the storage of multimedia resources. The findings recommend EPUB for ease of reading books on e-readers, 3GP to increase the usability of the mobile version of the digital library, and WARC, a dedicated web-archiving format, for long-term protection of the digital content of national libraries, because these three formats had the lowest usage among the libraries studied.

Acknowledgments

The authors are grateful to Michael Day (Digital Preservation Research Lead) at the National Library of Great Britain for guidance and for sharing scholarly expertise.
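The descriptive statistics named in the Methodology above (frequency, frequency percentage, and a chi-square statistic) can be sketched as follows. This is only an illustration: the `responses` list is invented and is not the study's data, the function names are hypothetical, and the abstract does not say which hypothesis the chi-square test examined, so a goodness-of-fit test against equal expected counts is assumed here.

```python
from collections import Counter

# Hypothetical questionnaire answers: the audio format reported by each library.
# These values are invented for illustration and are NOT the study's data.
responses = ["WAV", "WAV", "MP3", "WAV", "FLAC", "MP3", "WAV", "WAV", "MP3", "WAV"]


def frequency_table(values):
    """Return {category: (count, percentage)} for a list of categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}


def chi_square_uniform(values):
    """Chi-square goodness-of-fit statistic against equal expected counts.

    Assumption: the null hypothesis is that every observed format is used
    equally often. Only the statistic is computed, not a p-value.
    """
    counts = Counter(values)
    expected = sum(counts.values()) / len(counts)
    return sum((n - expected) ** 2 / expected for n in counts.values())


table = frequency_table(responses)
statistic = chi_square_uniform(responses)
degrees_of_freedom = len(table) - 1

for category, (count, percent) in table.items():
    print(f"{category}: {count} ({percent:.0f}%)")
print(f"chi-square = {statistic:.2f}, df = {degrees_of_freedom}")
```

Turning the statistic into a p-value would require the chi-square distribution with the stated degrees of freedom, which is left out of this sketch.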

Information Storage and Retrieval

Syntax and Semantics: Research Trends and Directions

Hoda Homavandi; Yaghoub Norouzi; Bent-ol hoda Khabbazan

Volume 11, Issue 39, July 2023, Pages 205-250

https://doi.org/10.22054/jks.2023.75172.1599

Abstract

Introduction

Recently, the development of artificial intelligence and human-computer interaction has highlighted the increasing importance of language challenges in information retrieval. The crucial role of language in disseminating, accessing, and retrieving information cannot be studied independently of syntax and semantics. Explaining and describing research in this field from both quantitative and qualitative perspectives, and understanding researchers' trends, is an important step in comprehending the significance of syntax and semantics in communication structures within modern information search and retrieval environments. Consequently, in this descriptive and analytical study, we conducted qualitative and quantitative analyses of studies in the field of syntax and semantics in information retrieval.

Literature Review

In recent years, there has been a lot of interdisciplinary research focusing on investigating the impact of language on the interaction between users and the web environment. These studies have discussed language from various perspectives and have explored information retrieval across different types of information media, including web databases, search engines, commercial websites, and libraries. Tapsai (2019), Norouzi and Hamavandi (2018), Hammo (2009), Lazarinis (2008), and Ofoghi, Yearwood & Ghosh (2006) have focused on different languages such as Persian, English, Arabic, and Greek. The findings show that the syntax and morphology, as well as the semantics of searched terms and phrases, have a significant impact on the retrieval of results.
In addition, search tools tend to rely on the general form of words rather than on the real needs of users when trying to improve the search process.

Due to the huge amount of information on the World Wide Web and the challenges related to information retrieval, researchers and software developers have turned to the Semantic Web to keep up with the changes. The Semantic Web has provided a large amount of structured and machine-understandable information on a wide range of topics (Guha, McCool & Miller, 2003). Semantic models perform well in identifying and recognizing synonyms, similar words, and semantic frameworks. Therefore, one of the most important challenges in the field of information storage and retrieval is to bridge the gap between the language used by information seekers and information providers (Rezaee Sharifabadi et al., 2010).

The current study aims to systematically review previous research findings on syntax and semantics in information storage and retrieval across different contexts. Each context represents different dimensions of knowledge representation systems, from traditional to semantic. Upon reviewing the research, it was found that no systematic review has been conducted with a focus on syntax and semantics in the field of information retrieval.

Methodology

In this qualitative research using Aveyard’s systematic review method, we aim to address the following questions:

What is the statistical status of studies in the field of syntax and semantics in storing and retrieving information?
What are the main subject areas that researchers have focused on in studies related to syntax and semantics in storing and retrieving information?
What research methods and approaches have researchers employed in this field?
What are the research gaps and areas that require further study in this field?

To gather relevant sources from information databases, we selected search keywords based on the research questions.
Then, we used search strategies and various operators to combine the keywords and phrases, ensuring a comprehensive and effective search in Persian databases such as Magiran, Irandoc, SID, NoorMagz, ISC, and Civilica, as well as in databases including Scopus, Emerald, ProQuest, and Google Scholar. There was no time limitation for the search. We recorded accepted sources such as articles and theses that were relevant and valid. By removing irrelevant and duplicate sources, we selected 12 Persian sources and 42 English sources. After categorization, the studies were analyzed according to the type of source, research method, and tool. The results of the analysis of the studies were presented in the form of tables and graphs.

Results

Based on an analysis of the keywords and subjects raised in the sources, the selected studies were categorized into three groups, whose characteristics are described in detail: information retrieval, information organization, and information search. The results revealed that among the 54 reviewed studies, Iranian researchers had conducted the most research in the field of syntax and semantics in information retrieval, with 12 studies. The United States followed with 5 studies, and China and Vietnam tied for third place with 4 studies each. The majority of the studies focused on syntax and semantics in information retrieval.

Discussion

Analysis of the 54 selected studies showed that they were conducted over a period of 26 years. The oldest study included in the review dates back to 1997, while the most recent one is from 2022. This shows the dynamic nature of the field under investigation and demonstrates how it is constantly changing and being influenced by the advancement of web technologies.
Furthermore, a thematic analysis of the research, based on the studies' keywords, reveals that "Ontology," as a tool of the semantic web, is closely linked to the semantic and syntactic aspects of language in information retrieval.

Moreover, of the 54 studies, the largest group was experimental (19), followed by applied (15) and analytical (9). Additionally, there were 6 studies that combined applied and analytical methods. Content analysis and comparative analysis each had 2 instances, while case studies were the least frequent with only 1 case. These studies have utilized tools such as ontology, search engines, and techniques including natural language processing, annotation, tagging, and indexing.

The discussion about exploring syntax and semantics in relation to information retrieval across different languages is believed to make a significant contribution to the development of future research literature in this field. This is because users’ native language plays a central role in forming search terms for information retrieval, based on subjective meanings, context, and content. Considering this point can effectively enhance information retrieval systems.

Conclusion

Although many studies have addressed various aspects of syntax and semantics in information retrieval, more research is needed to investigate syntax and semantics in information organization. It is also important to delve into and analyze their theoretical aspects in information retrieval, especially through interdisciplinary studies.

Moreover, the interconnectedness of various areas of study demonstrates the close relationship between syntax and semantics and linguistic issues in nearly every field that involves organizing, storing, and retrieving information.
These areas include the study of syntax and semantics in relation to environmental sensors, indexing, identification and summarization of texts, plagiarism detection, natural language processing, repositories, improvement of user queries and data retrieval in repositories and search engines, metadata enrichment, and image retrieval. The results of this research, such as its main themes, identified methods and approaches, and research gaps, can offer valuable insights for future studies.
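As a quick arithmetic check on the breakdown of research methods reported in the Discussion above, the counts can be tallied and converted to shares. The counts are taken from the abstract; the variable names are illustrative only.

```python
# Study-type counts as reported in the Discussion of the abstract above.
study_types = {
    "experimental": 19,
    "applied": 15,
    "analytical": 9,
    "applied and analytical": 6,
    "content analysis": 2,
    "comparative analysis": 2,
    "case study": 1,
}

# The categories should add up to the 54 reviewed studies.
total = sum(study_types.values())

# Share of each study type, as a percentage rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in study_types.items()}

# The most frequent study type.
largest = max(study_types, key=study_types.get)

for name, n in study_types.items():
    print(f"{name}: {n} ({shares[name]}%)")
print(f"total = {total}, largest group = {largest}")
```

The seven categories sum to 54, matching the number of reviewed studies, and the experimental studies form the largest single group at roughly a third of the total rather than an absolute majority.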

Applying Ontologies in Knowledge Management: A Systematic Review

Hoda Homavandi; Yaghoub Norouzi; Shahed Rashidi

Volume 10, Issue 34, April 2023, Pages 225-260

https://doi.org/10.22054/jks.2022.69375.1528

Abstract

Introduction

Ontologies, as tools for visualizing domain knowledge and improving information search and retrieval, have attracted the attention of many researchers in different subject areas. Knowledge management is one domain where the application of ontologies could be the subject of much research, yet it has been little discussed so far. The aim of this research is the qualitative and quantitative analysis of studies related to the application of ontologies in the knowledge management domain, in order to identify these applications, their main subject areas, and their aspects.

Literature Review

Reasons for creating ontologies include the ability to reuse knowledge, sharing a common understanding of the structure of information between humans and software (machines), analyzing domain knowledge, and describing the basic concepts in a subject domain and the relationships between them (Montenegro et al., 2012). Given the application of ontology in knowledge engineering, and since access to knowledge contributes greatly to the progress of societies today, accessing and managing this knowledge amid large amounts of information is both important and challenging. Knowledge management is also a topic of interest to researchers in various fields. Ontologies are one of the tools for organizing and managing knowledge.
Despite the significant amount of context-based research on the use of ontologies in knowledge management, the lack of connection and mutual understanding between ontology research and applied knowledge management will deprive knowledge management of these tools. Therefore, the present research aims to analyze the studies in the field of ontology application in knowledge management. As mentioned, most of the research in this field is context-based. Education (Yang, Chen, & Shao, 2004), architecture (Anumba et al., 2008), agriculture (Zheng et al., 2012), and medicine (Zhou et al., 2020) are among the contexts in which the application of ontology in knowledge management has been studied. Moreover, identifying the obstacles and challenges of using these tools in the context of knowledge management is another aspect that has not been addressed in the research. Therefore, this study focuses on research that discusses, in general terms, the applications and methods of using ontology in knowledge management; research on the application of ontologies in specific knowledge management domains, or on the construction and implementation of ontologies in the field of knowledge management, has been excluded. This research can provide a comprehensive view of the applications and capacities that the use of ontologies offers knowledge management.

Methodology

The present study is descriptive-analytical. It was conducted using Pettigrew and Roberts's (2008) systematic review method. To analyze the texts and visualize the findings, content analysis was applied using MaxQDA and Data Trapper software. Accordingly, after the research questions, search keywords, and inclusion and exclusion criteria were formulated, the selected databases were searched.
In the first stage, 419 studies were retrieved; after review against the defined inclusion and exclusion criteria, 47 studies were selected for final analysis. The results of the analysis of the studies were presented in the form of tables and graphs.

Results

In addition to presenting the frequency of subjects, the citation status of the studies, and subject trends in this field over time, the analysis of studies on the application of ontologies in knowledge management revealed that this research covers five main topics: construction and creation of ontology in the field of knowledge management; design of ontology-based knowledge management systems; use of ontology in the knowledge management process; the role and synergy of ontologies in knowledge management; and challenges and guidelines for using ontology in knowledge management.

Discussion

Analysis of the 47 selected studies showed that Chinese researchers have studied this issue more than others. In Iran, the number of studies was significant compared to other countries. Moreover, the analysis indicates that most of the studies in this field applied an analytical-descriptive approach. In terms of year, most of the studies in this field were conducted in 2009. In Iran, the first study was conducted in 2008, and the others have been conducted since 2019. Accordingly, most of the research on the application of ontologies in knowledge management is analytical-descriptive, followed in frequency by case studies, experimental studies, and literature reviews, which have used tools such as texts, questionnaires, and the Delphi technique. Thematic analysis of the studies shows that most of this research focuses on presenting the methods, models, frameworks, and approaches used to build ontologies in the field of knowledge management.
A number of these studies also dealt with the construction and development of ontology in the field of knowledge management. Ontology-based knowledge management systems, the application of ontology in different parts of the knowledge management process, the role and synergy of ontology in knowledge management, and the guidelines and challenges in the construction of ontologies in the context of knowledge management are other major topics in these studies. Taken together, the results of the keyword analysis and the way subjects have changed over time confirm the relative predominance of technical aspects in the design and construction of ontologies in the field of knowledge management.

Conclusion

Although studies have addressed different dimensions of the application of ontology in knowledge management as a helpful tool to facilitate different stages of knowledge management processes, the focus of most studies on theoretical dimensions has caused a lack of connection and real practical understanding of applying ontology in this field. In this study, the emphasis is on the general ontologies of knowledge management. However, the high importance of user-oriented studies, and of understanding the environment and the user community in the design, construction, and use of technologies such as ontologies, together with their impact on the efficiency and effectiveness of knowledge management, requires researchers to focus on these non-technical issues. The results of this review show a significant lack of focus on them.

Information Retrieval in the Web Environment (Case Study: Iranian Library Software)

Yaghob Nowrozi; Mohammad Maleki; Eisa Zarei

Volume 9, Issue 30, April 2022, Pages 67-92

https://doi.org/10.22054/jks.2021.57974.1402

Abstract

The purpose of this study was to evaluate the Pars Azarakhsh, Simorgh, and Saman library software in terms of information retrieval features in the web environment. The present study is applied research. The statistical population consists of the Pars Azarakhsh, Simorgh, and Saman library software. To measure, ...

Metadata Standards for Organizing in the Digital Library: A Systematic Review

Yaqub Norouzi; Nayere Jafari Far

Volume 7, Issue 24, October 2020, Pages 111-137

https://doi.org/10.22054/jks.2020.50159.1281

Abstract

The purpose of this study is to systematically review metadata standards for organizing, in order to determine the status of research in digital libraries. The research attempted to categorize the standards in terms of the functional area and role that they play in digital libraries. By identifying the status of the studied standards, ...


Author = Norouzi, Yaghoub
