Information Storage and Retrieval
Yaghoub Norouzi; Nayere Jafari Far; Reihaneh DavoodAbadi
Abstract
Introduction
The management of digital resources in national libraries involves significant challenges, particularly in selecting, maintaining, and ensuring long-term accessibility. A crucial element in this process is the storage format of digital resources, which directly affects both the preservation of materials and their usability over time. Below are key reasons why the choice of resource storage formats is important for national libraries:
Long-Term Preservation: National libraries manage vast collections of digital materials, from texts to multimedia resources. Choosing a storage format that ensures long-term preservation is essential to prevent data loss from format obsolescence, a challenge national libraries often face. The right choice of format can minimize the need for complex emulation or migration strategies, which are resource-intensive.
Interoperability and Accessibility: Digital collections in national libraries need to be accessible across various platforms, devices, and operating systems. International standards and guidelines, such as those from the International Federation of Library Associations and Institutions (IFLA), guide the choice of formats that facilitate interoperability. Adopting these standards ensures that digital resources are accessible not only to national users but also internationally. This is especially important for cultural heritage materials that may be used globally.
Efficiency in Storage and Retrieval: When dealing with large-scale collections, efficient storage formats help minimize infrastructure costs. The speed at which data can be retrieved from a storage system is influenced by the format of the resources.
Formats that are optimized for efficient access improve search and retrieval times, facilitating quick access to resources for users.
Metadata Preservation and Integration: National libraries depend heavily on metadata to ensure that digital objects are easily discoverable and appropriately categorized. Standardized formats are crucial for integrating digital collections with national library cataloging systems and facilitating interoperability between different systems. National libraries must ensure that the digital resources they store are secure and protected from corruption.
User Experience and Engagement: National libraries serve diverse audiences, ranging from academic researchers to the general public. Formats that are easily navigable (e.g., HTML or EPUB for e-books) improve user engagement and accessibility. Interactive or multimedia content (e.g., video, audio) in open formats can provide richer, more engaging experiences. Formats like Unicode-based text formats (e.g., UTF-8) allow for the preservation of multiple languages and scripts, making national library collections accessible to diverse communities, including those with specific linguistic or cultural needs.
Legal and Copyright Considerations: Formats may also play a role in how digital rights and licensing information are embedded and protected. Open formats allow for the inclusion of rights management information without reliance on proprietary systems, ensuring proper management of copyrighted materials. The use of appropriate formats for the digital storage of copyrighted content can aid in enforcing digital rights management (DRM) policies. Formats like encrypted PDF or DRM-protected EPUB help libraries protect content according to legal guidelines.
Cost-Effectiveness: Choosing the right storage formats can have a significant impact on cost management for national libraries.
Open-source and standardized formats tend to be cost-effective compared to proprietary systems that may involve licensing fees, additional maintenance costs, or vendor lock-in. National libraries must invest in ensuring the longevity and accessibility of their digital collections. By choosing widely accepted formats with strong community support, national libraries can reduce ongoing maintenance and conversion costs associated with less popular or proprietary formats.
In short, the importance of selecting the right resource storage formats in national libraries cannot be overstated. A well-chosen format ensures the long-term preservation, accessibility, and efficient management of digital collections. It also contributes to the overall mission of national libraries to serve as custodians of cultural heritage, making materials accessible to a global audience while protecting them for future generations. Standardized, open, and widely supported formats are crucial in meeting these objectives, enabling national libraries to optimize their digital storage solutions while maintaining flexibility, cost-effectiveness, and long-term sustainability.
Literature Review
Research by Sullivan (2006), Thomas & Martin (2006), Hodge & Anderson (2007), Rog & Van Wijk (2008), Van der Knijff (2011), Barabucci et al. (2011), Morrissey (2012), Hajtnik (2012), Jackson (2012), Koo & Chou (2013), Rimkus et al. (2014), Uherek et al. (2015), Termens et al. (2015), Delaney & De Jong (2015), Baratè et al. (2015), Rimkus & Witmer (2016), Anyim (2021), and Trianggoro & Prasetyadi (2022) showed that selecting the right format and storage standard for digital collections is a multi-faceted decision that balances accessibility, preservation, interoperability, and user experience. Ensuring that digital collections are presented in a way that aligns with these standards maximizes their utility and longevity.
The review highlights that while digital libraries have evolved from physical spaces, the principles and practices of resource storage have undergone significant changes. Previous research has addressed these issues, but often as sub-components rather than focusing on them comprehensively. The review indicates that long-term preservation and storage have not been fully explored in the literature.
This research aims to identify successful national solutions for preserving and storing resources effectively in digital libraries. In essence, the study seeks to provide a more integrated approach to addressing storage and preservation issues in digital libraries, taking into account both the technological and organizational practices in use today. This effort is important for ensuring that digital resources remain accessible and usable over time, despite the challenges posed by rapid technological change and shifting formats.
Methodology
This article identifies the standard formats used for storing video, text, multimedia, and audio resources in the digital collections of 20 selected national libraries around the world. This is an applied study based on the descriptive-analytical method. For data collection, we used a researcher-made questionnaire. After data collection, descriptive statistical techniques such as frequency distribution and frequency percentage, together with the Chi-square test, were used to analyze the data.
Results
Image sources were used in all studied collections, while cultural objects were less common, with a share of 41%. The most common formats were TIFF for image sources (94%), HTML and XML for text sources (75%), WAV for audio sources (65%), and AVI for multimedia sources (65%). Switzerland used the widest variety of standards.
Also, among the libraries studied, those of Iran, Britain, the United States, Scotland, Qatar, the Netherlands, France, and Spain had the greatest diversity in storing a variety of digital resources. The libraries studied were consistent in using standard formats for visual, textual, and multimedia resources and followed a similar pattern in selecting storage formats.
Conclusion
The libraries studied showed the least diversity in the storage of multimedia resources. The findings recommend EPUB for ease of reading books on e-readers, 3GP (written as GP3 in the source) to increase the usability of the mobile version of the digital library, and WARC as a dedicated format for web archiving and long-term protection of the digital content of national libraries, because these three formats had the lowest usage among the libraries studied.
Acknowledgments
The authors are grateful to Michael Day (Digital Preservation Research Lead) at the National Library of Great Britain for guidance and for sharing scholarly expertise.
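The Methodology of the abstract above names frequency percentages and a Chi-square test as its analysis techniques. The following is a minimal sketch of both calculations in plain Python. It is not the authors' analysis: the function names and every count in the example are hypothetical, and the abstract does not say which variables were cross-tabulated.

```python
from collections import Counter


def frequency_percentages(labels):
    """Return the share of each label as a percentage of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical data: formats reported by four imaginary libraries.
print(frequency_percentages(["TIFF", "TIFF", "JPEG", "TIFF"]))

# Hypothetical 2x2 table: rows = two groups of libraries,
# columns = uses a given format / does not use it.
stat, dof = chi_square_statistic([[12, 8], [5, 15]])
print(stat, dof)
```

The statistic would then be compared against the chi-square distribution with `dof` degrees of freedom; a statistics package is the usual way to obtain the p-value.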
Information Storage and Retrieval
Hoda Homavandi; Yaghoub Norouzi; Bent-ol hoda Khabbazan
Abstract
Introduction
Recently, the development of artificial intelligence and human-computer interaction has highlighted the increasing importance of language challenges in information retrieval. The crucial role of language in disseminating, accessing, and retrieving information cannot be studied independently of syntax and semantics. Explaining and describing research in this field from both quantitative and qualitative perspectives, and understanding researchers' trends, is an important step in comprehending the significance of syntax and semantics in communication structures within modern information search and retrieval environments. Consequently, in this descriptive and analytical study, we conducted qualitative and quantitative analyses of studies in the field of syntax and semantics in information retrieval.
Literature Review
In recent years, there has been a lot of interdisciplinary research focusing on investigating the impact of language on the interaction between users and the web environment. These studies have discussed language from various perspectives and have explored information retrieval across different types of information media, including web databases, search engines, commercial websites, and libraries. Tapsai (2019), Norouzi and Hamavandi (2018), Hammo (2009), Lazarinis (2008), and Ofoghi, Yearwood & Ghosh (2006) have focused on different languages such as Persian, English, Arabic, and Greek. The findings show that the syntax and morphology, as well as the semantics of searched terms and phrases, have a significant impact on the retrieval of results. In addition, search tools tend to rely more on the general form of words instead of focusing on the real needs of users in order to improve the search process.
Due to the huge amount of information on the World Wide Web and the challenges related to information retrieval, researchers and software developers have turned to the Semantic Web to keep up with the changes.
The Semantic Web has provided a large amount of structured and machine-understandable information on a wide range of topics (Guha, McCool & Miller, 2003). Semantic models perform well in identifying and recognizing synonyms, similar words, and semantic frameworks. Therefore, one of the most important challenges in the field of information storage and retrieval is to bridge the gap between the language used by information seekers and information providers (Rezaee Sharifabadi et al., 2010).
The current study aims to systematically review previous research findings on syntax and semantics in information storage and retrieval across different contexts. Each context represents different dimensions of knowledge representation systems, from traditional to semantic. Upon reviewing the research, it was found that no systematic review has been conducted with a focus on syntax and semantics in the field of information retrieval.
Methodology
In this qualitative research using Aveyard’s systematic review method, we aim to address the following questions:
What is the statistical status of studies in the field of syntax and semantics in storing and retrieving information?
What are the main subject areas that researchers have focused on in studies related to syntax and semantics in storing and retrieving information?
What research methods and approaches have researchers employed in this field?
What are the research gaps and areas that require further study in this field?
To gather relevant sources from information databases, we selected search keywords based on the research questions. Then, we used search strategies and various operators to combine the keywords and phrases, ensuring a comprehensive and effective search in Persian databases such as Magiran, Irandoc, SID, NoorMagz, ISC, and Civilica, as well as in databases including Scopus, Emerald, ProQuest, and Google Scholar. There was no time limitation for the search.
We recorded accepted sources such as articles and theses that were relevant and valid. By removing irrelevant and duplicate sources, we selected 12 Persian sources and 42 English sources. After categorizing, the studies were analyzed according to the type of source, research method, and tool. The results of the analysis of the studies were presented in the form of tables and graphs.
Results
Based on an analysis of the keywords and subjects raised in the sources, the selected studies were categorized into three groups, whose characteristics were described in detail: information retrieval, information organization, and information search. The results revealed that among the 54 reviewed studies, Iranian researchers had conducted the most research in the field of syntax and semantics in information retrieval, with 12 studies. The United States followed with 5 studies, and China and Vietnam tied for third place with 4 studies each. The majority of the studies focused on syntax and semantics in information retrieval.
Discussion
Analysis of the 54 selected studies showed that they were conducted over a period of 26 years. The oldest study included in the review dates back to 1997, while the most recent one is from 2022. This shows the dynamic nature of the field under investigation and demonstrates how it is constantly changing and being influenced by the advancement of web technologies. Furthermore, a thematic analysis of the research, based on the studies' keywords, reveals that "Ontology," as a tool of the semantic web, is closely linked to the semantic and syntactic aspects of language in information retrieval.
Moreover, of the 54 studies, the largest group was experimental (19), followed by applied (15) and analytical (9). Additionally, there were 6 studies that combined applied and analytical methods. Content analysis and comparative analysis each had 2 instances, while case studies were the least frequent with only 1 case.
These studies have utilized tools such as ontology, search engines, and techniques including natural language processing, annotation, tagging, and indexing.
Exploring syntax and semantics in relation to information retrieval across different languages is believed to make a significant contribution to the development of future research literature in this field. This is because users’ native language plays a central role in forming search terms for information retrieval, based on subjective meanings, context, and content. Considering this point can effectively enhance information retrieval systems.
Conclusion
Although many studies have addressed various aspects of syntax and semantics in information retrieval, more research is needed to investigate syntax and semantics in information organization. It is also important to delve into and analyze their theoretical aspects in information retrieval, especially through interdisciplinary studies.
Moreover, the interconnectedness of various areas of study demonstrates the close relationship between syntax and semantics and linguistic issues in nearly every field that involves organizing, storing, and retrieving information. These areas include the study of syntax and semantics in relation to environmental sensors, indexing, identification and summarization of texts, plagiarism detection, natural language processing, repositories, improvement of user queries and data retrieval in repositories and search engines, metadata enrichment, and image retrieval. The results of this research, such as its main themes, identified methods and approaches, and research gaps, can offer valuable insights for future studies.
Hoda Homavandi; Yaghoub Norouzi; Shahed Rashidi
Abstract
Introduction
Ontologies, as tools for visualizing domain knowledge and improving information search and retrieval, have attracted the attention of many researchers in different subject areas. One domain where the application of ontologies could be the subject of much research is knowledge management, yet it has been little discussed so far. The aim of this research is the qualitative and quantitative analysis of studies related to the application of ontologies in the knowledge management domain, in order to identify these applications, their main subject areas, and their aspects.
Literature Review
Reasons for creating ontologies include the ability to reuse knowledge, sharing a common understanding of the structure of information between humans and software (machines), analyzing domain knowledge, and describing the basic concepts in a subject domain and the relationships between them (Montenegro et al. 2012). Given the application of ontology in knowledge engineering, and since access to knowledge contributes greatly to the progress of societies in today's world, the way this knowledge is accessed and managed among the large amount of information is very important and challenging. Knowledge management is also a topic of interest to researchers in various fields. Ontologies are one of the tools for organizing and managing knowledge. Despite the significant amount of context-based research on the use of ontologies in knowledge management, the lack of connection and understanding between research in the field of ontology and applied knowledge management will deprive knowledge management of these tools. Therefore, this research aims to analyze the studies in the field of ontology application in knowledge management. As mentioned, most of the research in this field is context-based. Education (Yang, Chen, & Shao, 2004), architecture (Anumba et al, 2008), agriculture (Zheng et al., 2012), and medicine (Zhou et al., 2020) are among the contexts in which the application of ontology in knowledge management has been studied. Moreover, identifying the obstacles and challenges of using these tools in the context of knowledge management is another aspect that has not been addressed in the research.
Therefore, this study focuses on research that discusses, in general terms, the applications and methods of using ontology in knowledge management; research on the application of ontologies in specific knowledge management domains, or on the construction and implementation of ontologies in the field of knowledge management, has been set aside. This research can provide a comprehensive view of the applications and capacities that the use of ontologies provides for knowledge management.
Methodology
The present study is descriptive-analytical. It was conducted using Pettigrew and Roberts's (2008) systematic review method. To analyze the texts and visualize the findings, the content analysis method was applied using MaxQDA and Data Trapper software. After the research questions, search keywords, and inclusion and exclusion criteria were formulated, the selected databases were searched. In the first stage, 419 studies were retrieved; after review against the defined inclusion and exclusion criteria, 47 studies were selected for final analysis. The results of the analysis of the studies were presented in the form of tables and graphs.
Results
In addition to presenting the frequency of subjects, the citation status of studies, and subject trends in this field over time, the analysis of studies on the application of ontologies in knowledge management revealed that this research covers five main topics: construction and creation of ontology in the field of knowledge management; design of knowledge management systems based on ontology; use of ontology in the process of knowledge management; the role and synergy of ontologies in knowledge management; and challenges and guidelines for using ontology in knowledge management.
Discussion
Analysis of the 47 selected studies showed that Chinese researchers have studied this issue more than others. In Iran, the number of studies was significant compared to other countries. In terms of the year of conducting the research, most of the studies in this field were conducted in 2009. In Iran, the first study was conducted in 2008, and the other studies have been conducted since 2019. Most of the research on the application of ontologies in knowledge management is analytical-descriptive, followed in frequency by case studies, experimental studies, and literature reviews, which have used tools such as texts, questionnaires, and the Delphi technique. Thematic analysis of the studies shows that this research focuses mainly on presenting the methods, models, frameworks, and approaches used to build ontologies in the field of knowledge management. A number of these studies also dealt with the construction and development of ontology in the field of knowledge management. Ontology-based knowledge management systems, the application of ontology in different parts of the knowledge management process, the role and synergy of ontology in knowledge management, and the guidelines and challenges in the construction of ontologies in the context of knowledge management are other major topics in these studies. The combination of the results of keyword analysis and the identification of how subjects have changed over time can confirm the relative predominance of technical aspects in the design and construction of ontologies in the field of knowledge management.
Conclusion
Although studies have addressed different dimensions of the application of ontology in knowledge management as a helpful tool to facilitate different stages of knowledge management processes, the focus of most studies on theoretical dimensions has caused a lack of connection and of real practical understanding of applying ontology in this field. In this study, the emphasis is on the general ontologies of knowledge management. However, the high importance of user-oriented studies and of understanding the environment and the user community in the design, construction, and use of technologies, including ontologies, and their impact on the efficiency and effectiveness of knowledge management, requires researchers to focus on these non-technical issues. The results of this review show a significant lack of focus on them.
Yaghob Nowrozi; Mohammad Maleki; Eisa Zarei
Abstract
The purpose of this study was to evaluate the Pars Azarakhsh, Simorgh, and Saman library software in terms of information retrieval features in the web environment. The present study is applied research. The statistical population consists of the Pars Azarakhsh, Simorgh, and Saman library software. To measure and evaluate the dimensions of the research, the status of the library software was reviewed and analyzed against 5 indicators. The data were collected by observing the software websites and using the opinions of software experts, and were analyzed using descriptive statistical methods. Findings showed that for search capabilities in various fields, Pars Azarakhsh obtained the highest score (98.85%), while Saman and Simorgh obtained the lowest (96.55%). For the various search formulations, Simorgh and Pars Azarakhsh obtained the highest score (100%) and Saman the lowest (83.34%). For other types of software searches, Pars Azarakhsh had the highest score (78.43%) and Saman the lowest (76.97%). The results also showed that the studied software, in terms of information retrieval features in the web environment, had improved considerably over previous versions and met more than 80% of the criteria examined in this study.
Yaqub Norouzi; Nayere Jafari Far
Abstract
The purpose of this study is to systematically review metadata standards for organizing resources in order to determine the status of research on them in digital libraries. The research attempted to categorize the standards in terms of their functional area and the role they play in digital libraries. By identifying the status of the studied standards, it is possible to determine the functional place of other standards that have spread in the digital library area and are functionally similar to the standards covered in this research. A systematic review was used for the study. After searching the databases, 42 research sources were selected for final review. Findings showed that the organizing standards, based on their role in the digital library, could be categorized into seven areas (Resource description, Resource structure determination, Resource management, Resource content organization, Local format definition in digital library database management system, Semantic context, and Interoperability). Then, a set of standards in each area was identified (50 standards in total). The results showed that MARC with 60.5%, DC with 44.7%, and OAI and MODS with 42.1% were the best-known standards. Audio-MD and VIDEO-MD, with 5.3%, were the least known.