Monday, 24 August 2009
Introduction
For my ongoing search engine research project I need to understand much more about search mechanics and to gain a deeper understanding of where search is heading. Search engines have come a long way since their humble beginnings as directory listings, and approachable, accessible keyword search techniques have driven their popularity for finding information on the Internet (Li et al., 2008). However, perform a few searches and you may discover that using keyword searching for non-trivial information is as much a problem today as it was in the early days of search (Finkelstein et al., 2002).
Searching for non-trivial information can be broadly split into three areas (Torrey et al., 2009):
- Locating and navigating to sources of information
- Making sense of the content presented
- Engaging in the process of social seeking of information
When considering the future impact of search engine mechanics in the context of information retrieval, three factors may be useful in measuring success: coverage of information, unbiased content, and user focus. In other words, the information should be presented fairly, be accurate, and be accessible and relevant to the searcher's needs (Datta et al., 2008).
When looking into search using keyword techniques, it may be useful to talk about what we would do if search were not an option. This may offer some insight into how we look for information and how we decide which information is useful to us.
The starting point for the analysis is the question: if the Internet did not exist – what would be the process for finding new information?
For example, if I wanted to know more about black holes – what steps might I take?
I will probably break this down into a few steps and consider what I am looking for, where I might find more information, and why I need this information. Answering these three questions appropriately will offer some useful insight that we may be able to apply later.
Step 1: What…?
So we start with the question What…? – What are we looking for? We already know that – we want to know more about black holes. Where next? The next thing might be to decide the format of the information I need. For example:
- Do I have just a passing interest? If so, I could simply ask someone.
- Am I writing an academic paper? If so, I need researched, peer reviewed material.
- Do I need an image of a black hole? Could I use an image library?
- Is it for a competition? What level of detail do I need?
- Have I seen a black hole and want to find out whether it is dangerous?
- Do I have concerns about black holes in my immediate vicinity?
- Is my interest similar to black holes but not exactly black holes?
The last three points start to clarify our requirement further and point to some new areas which might lead to more information.
Step 2: Where…?
There are many sources of information, ranging from local gossip to academic research papers. The next step in our process would be to decide where to start looking for this information:
- Ask someone close to me for more information
- Buy a book or magazine related to my interest
- Contact a professional who might have detailed knowledge
- Telephone someone – for example the local observatory
- Do a college course, which may take longer but could give a good grounding in what we are looking for
- Call someone out – a builder or pest control perhaps?
- Borrow a book from someone
- Visit the library
- Watch television
You may have noticed the “call a builder or pest control” point. What sort of black holes do I really need more information about? Now we are starting to explore the context of the question. Context and clarification are becoming important factors in finding a solution to our problem.
Step 3: Why…?
After deciding what I am looking for and approaching one of the where locations, I will probably be faced with the clarification question Why…? This is typically asked so that someone else can understand my requirements and so that I can clarify what I expect from any response. For example, let us assume we decided to visit the library for more information. How might the conversation go?
ME: Excuse me, could you point me to the section talking about black holes please?
LIBRARIAN: Depending on what you are looking for, you could start in the Space and Cosmology section over there. May I ask what your specific interest in black holes is?
ME: Thanks, I am interested in black holes because I have a problem at home with lots of tiny black holes appearing on my walls.
LIBRARIAN: Oh, I see. In that case you need the Biology section over there, and if you are being infested by something then try household pests. You should probably telephone the local council and ask for their pest control department.
So now we, and others, are starting to understand my problem further. I started out with a simple word search looking for more information about “black holes”, but once I had clarified and contextualised the request, someone else was able to give me the information I need.
Now, let's bring the Internet back into existence … pop … there it is.
When approaching this problem using current Internet search engine technology, I could start the search again using the initial words, my keywords of “black holes”. These are the only words that I could be sure of at step 1, so I type "black holes" into a search engine. Let's try that in Google:
After clicking on search we are returned part of a list containing nearly 3.7 million results (since I wrote this initial article this figure has jumped to over 4 million results). They are all talking about black holes and, as expected, they cover the space-related black holes that the librarian initially pointed me towards.
However, the black holes are appearing in my imaginary house and I doubt very much that I have a concentrated hub of singularities – otherwise I might be searching for deeper questions about existence.
Now, the results are not meeting my expectations and I have no librarian to talk to, so I need to try attaching clarification terms, more or less at random, to get the information I need. By clarifying my expectation of the results to the librarian I obtained two new keywords/phrases: "infesting" and "household pests".
With these phrases in mind let’s try the keyword search again but this time we will be a little more specific:
Now, this is much more like it, the search engine has returned only 16 thousand potential matches for our search term and the first article talks about termites infesting a house and leaving little black holes - bingo.
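The narrowing effect of adding clarification terms can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the four documents are invented, `search` is a hypothetical helper that simply requires every query word to appear in a document, and the refined query is a guess at the kind of terms used above. It says nothing about how Google actually matches or ranks pages.

```python
# Toy corpus: invented documents, for illustration only.
DOCUMENTS = {
    "astro-1": "black holes form when massive stars collapse in space",
    "astro-2": "astronomers photograph black holes at the centre of a galaxy",
    "pests-1": "termites infesting a house leave tiny black holes in walls",
    "pests-2": "household pests such as woodworm leave small holes in timber",
}


def search(query: str) -> list[str]:
    """Return the ids of documents that contain every word of the query."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if all(term in text.lower().split() for term in terms)
    )


# Keywords alone match both the astronomy and the pest documents.
broad = search("black holes")
# Adding clarification terms (an assumed refinement) narrows the list.
narrow = search("black holes infesting house")

print(len(broad), broad)
print(len(narrow), narrow)
```

With this corpus the broad query matches three documents and the refined query matches only the termite one, which mirrors the drop from millions of results to thousands in the real searches.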
The librarian suggested telephoning the council and asking to speak to their pest control department – we have come a long way from phoning the local observatory.
Summary
This example, albeit a rather simple one, tells us a lot about search using keywords. It tells us that keywords alone are simply not enough for finding information. Keywords require two additional factors, context and clarification, to really work, and these may need to be applied in an iterative fashion to yield the results we need.
In our example above, the context was "my house" and the clarification was "infesting". Without these two I might still be staring into space awaiting an imminent invasion from above. At least now I can deal with the problem at home.
This leads me to think that a two-stage search process might work better for giving the results we need. However, if I type a keyword or keyword phrase into a search engine, would I be put off by a second stage of context and clarification?
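To make the two-stage idea a little more concrete, here is a minimal sketch of what such a process could look like. Everything in it is an assumption made for illustration: the `Document` type, the hand-assigned `context` labels, the three-item corpus and the function names `stage_one`, `contexts_for` and `stage_two` are all hypothetical, and a real engine would have to infer contexts rather than read them from a label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    context: str  # hypothetical, hand-assigned topic label
    text: str


# Invented corpus, for illustration only.
CORPUS = [
    Document("Stellar collapse", "astronomy",
             "black holes form when massive stars collapse"),
    Document("Galaxy cores", "astronomy",
             "most galaxies hold supermassive black holes"),
    Document("Termite damage", "household pests",
             "termites infesting a house leave tiny black holes in walls"),
]


def stage_one(keywords: str) -> list[Document]:
    """Stage 1: plain keyword match (every keyword must appear)."""
    terms = keywords.lower().split()
    return [d for d in CORPUS
            if all(t in d.text.lower().split() for t in terms)]


def contexts_for(results: list[Document]) -> list[str]:
    """List the distinct contexts found, to offer as clarification choices."""
    return sorted({d.context for d in results})


def stage_two(results: list[Document], context: str) -> list[Document]:
    """Stage 2: keep only the results in the context the searcher picked."""
    return [d for d in results if d.context == context]


hits = stage_one("black holes")
options = contexts_for(hits)                  # what the engine could ask about
refined = stage_two(hits, "household pests")  # the searcher's answer

print(options)
print([d.title for d in refined])
```

Here the second stage plays the role of the librarian's "may I ask what your interest is?": the engine offers the contexts it found and the searcher picks one. Whether searchers would tolerate that extra step is exactly the open question.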
References
Datta, R., Joshi, D., Li, J., and Wang, J. Z. 2008. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 40, 2 (Apr. 2008), 1-60. DOI= http://doi.acm.org/10.1145/1348246.1348248
Li, G., Feng, J., Wang, J., and Zhou, L. 2008. An effective and versatile keyword search engine on heterogenous data sources. Proc. VLDB Endow. 1, 2 (Aug. 2008), 1452-1455. DOI= http://doi.acm.org/10.1145/1454159.1454198
Finkelstein, L., et al. 2002. Placing search in context: the concept revisited. ACM Trans. Inf. Syst. 20, 1 (Jan. 2002), 116-131. DOI= http://doi.acm.org/10.1145/503104.503110
Torrey, C., Churchill, E. F., and McDonald, D. W. 2009. Learning how: the search for craft knowledge on the internet. In Proceedings of the 27th international Conference on Human Factors in Computing Systems (Boston, MA, USA, April 04 – 09, 2009). CHI ’09. ACM, New York, NY, 1371-1380. DOI= http://doi.acm.org/10.1145/1518701.1518908



Isn’t the Google algorithm like the Coca-Cola recipe?