Applications of Natural Language Processing in Information Science and Knowledge with Emphasis on Digital Libraries

Rabiei, Mahboubeh; Mirzaeian, Vahidreza

doi:10.22054/jks.2022.64237.1478

Document Type: Research Paper

Authors

¹ PhD Candidate in Information Science, Alzahra University, and Data Processing Expert, National Library of Iran, Tehran, Iran

² Associate Professor, Department of English, Faculty of Literature, Alzahra University, Tehran, Iran

Abstract

Natural language processing is a branch of computational linguistics that uses computers to automate the understanding and processing of human natural language, with a focus on human-computer interaction. It has found an important place in various fields of science, including information science and knowledge. The main purpose of this study is to identify the sub-branches and sub-fields of information science and knowledge in which natural language processing has been effective; the study was carried out through library and documentary analysis. It also deals with the role of digital libraries in the field of information science and the application of natural language processing in them. The results show that natural language processing is traceable in many sub-fields related to information science, such as information retrieval, bibliometrics, document management, automatic information extraction, automatic indexing, automatic text summarization, automatic text classification, and question answering systems, as well as in spell-checking technology, correcting users' query phrases and predicting their preferred words, converting speech to text and vice versa, helping users with physical disabilities such as the visually impaired and the blind, and surveying and analyzing sentiment toward libraries and information centers.

DOR: 20.1001.1.2476387.1401.9.33.6.1

Knowledge Retrieval and Semantic Systems

Volume 9, Issue 33 - Serial Number 33
January 2023
Pages 197-262