Document Type : Research Paper

Authors

1 Ph.D. Student in Knowledge and Information Science, Ferdowsi University of Mashhad, Mashhad, Iran

2 Associate Professor, Knowledge and Information Science Department, Ferdowsi University of Mashhad, Mashhad, Iran

Abstract

Data mining detects patterns in the massive volumes of data used in many disciplines. It can also be useful in our field, especially in information retrieval. Information retrieval research first adopted a system-oriented paradigm and later a user-oriented paradigm, which is concerned with users' information needs. In the user-oriented paradigm, inappropriate queries are considered the main reason that relevant documents are not retrieved. One of the main topics of this paradigm is therefore suggesting and expanding appropriate queries in a recommender system, a task to which data mining methods can be applied. Four important methods are used to suggest queries and strengthen the recommender system. The time series rule deals with the frequency of a query in a particular unit of time. The association rule addresses the dependency and association between queries. The association rule with Levenshtein distance considers the order of query terms in addition to the dependency and association between queries. All three of these methods rely on the query log file, whereas the fourth method, based on probability theory, uses the words of the documents to bridge the lexical gap between queries and documents. It therefore seems that using probability theory to suggest queries yields better results.
