Altavista
Griesbaum, J.
The goal of this study was to investigate the retrieval effectiveness of three popular German Web search services. For this purpose the engines Altavista.de, Google.de and Lycos.de were compared with each other in terms of the precision of their top twenty results. The test was based on a collection of fifty randomly selected queries, and relevance assessments were made by independent jurors. Relevance assessments were acquired separately a) for the search results themselves and b) for the result descriptions on the search engine results pages. The basic findings were: 1.) Google reached the best result values. Statistical validation showed that Google performed significantly better than Altavista, but there was no significant difference between Google and Lycos. Lycos also attained better values than Altavista, but again the differences were not statistically significant. In terms of top twenty precision, the experiment showed similar outcomes to the preceding retrieval test in 2002. Google, followed by Lycos and then Altavista, still performs best, but the gaps between the engines have narrowed. 2.) There are big deviations between the relevance assessments based on judgements of the results themselves and those based on judgements of the result descriptions on the search engine results pages.
2004
23-01-2009
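The Griesbaum entry above compares engines by the precision of their top twenty results. As a rough illustration of that measure, here is a minimal Python sketch. The function names, the toy judgements and the choice to divide by k even when fewer than k results come back are assumptions made for this example; they are not taken from the study.

```python
from typing import Sequence


def precision_at_k(judgements: Sequence[bool], k: int = 20) -> float:
    """Fraction of the top-k result slots that were judged relevant.

    Assumption: missing results (fewer than k returned) count as
    non-relevant, so the denominator is always k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for relevant in judgements[:k] if relevant) / k


def mean_precision_at_k(per_query: Sequence[Sequence[bool]], k: int = 20) -> float:
    """Average of precision_at_k over a set of queries."""
    if not per_query:
        raise ValueError("need at least one query")
    return sum(precision_at_k(j, k) for j in per_query) / len(per_query)


# Hypothetical juror judgements for two queries on one engine.
query_1 = [True] * 10 + [False] * 10   # 10 of 20 results relevant
query_2 = [True] * 20                  # all 20 results relevant
engine_score = mean_precision_at_k([query_1, query_2])
```

Running the same calculation once with judgements of the result pages and once with judgements of the result descriptions would give the two sets of values the abstract contrasts.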
Grado-Caffaro, M.
It is well known that search engines and indexes emerged as tools to help find information in the enormous and rapidly growing volume of web pages. Like the Net itself, they originated in an academic context and later moved into the commercial arena.
2000
23-01-2009
Access to the full text of my doctoral thesis, presented at the Universidad de Murcia in July 2002. Published in the Biblioteca Virtual Miguel de Cervantes and in the TDR doctoral thesis repository.
Davis, E.
We are a society obsessed with convenience. We go to extreme lengths to invent devices that promise a simpler or more convenient lifestyle. This paradox is exemplified in our fascination with the Internet, as well as with our attempts to index it for access purposes. The Internet is today a rapidly evolving organism that is almost completely lacking in fundamental organization. The question of whether each individual achieves a net gain from all the effort expended in this process lies somewhere beyond the scope of my project, but I think we all can agree on the need to somehow organize this very unstructured information resource.
1996
26-12-2008
Chignell, Mark H., Gwizdka, J. and Bodner, Richard C.
There was a proliferation of electronic information sources and search engines in the 1990s. Many of these information sources became available through the ubiquitous interface of the Web browser. Diverse information sources became accessible to information professionals and casual end users alike. Much of the information was also hyperlinked, so that information could be explored by browsing as well as searching. Vast amounts of information were now just a few keystrokes and mouse clicks away.
1999
16-12-2008
Bharat, K. and Broder, A.
Search engines are among the most useful and popular services on the Web. Users are eager to know how they compare. Which one has the largest coverage? Have they indexed the same portion of the Web? How many pages are out there? Although these questions have been debated in the popular and technical press, no objective evaluation methodology has been proposed and few clear answers have emerged. In this paper we describe a standardized, statistical way of measuring search engine coverage and overlap through random queries. Our technique does not require privileged access to any database. It can be implemented by third-party evaluators using only public query interfaces. We present results from our experiments showing size and overlap estimates for HotBot, AltaVista, Excite, and Infoseek as percentages of their total joint coverage in mid-1997 and in November 1997. Our method does not provide absolute values. However, using data from other sources, we estimate that as of November 1997 the numbers of pages indexed by HotBot, AltaVista, Excite, and Infoseek were, respectively, roughly 77M, 100M, 32M, and 17M, and the joint total coverage was 160 million pages. We further conjecture that the size of the static, public Web as of November was over 200 million pages. The most startling finding is that the overlap is very small: less than 1.4% of the total coverage, or about 2.2 million pages, were indexed by all four engines.
Computer Networks and ISDN Systems
Volume 30, Issue 1-7 (April 1998)
Pages: 379-388
Year of Publication: 1998
ISSN: 0169-7552
1998
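The Bharat and Broder abstract above describes estimating relative coverage and overlap from samples obtained through public query interfaces. The Python sketch below illustrates one way such an estimate can work and may differ in detail from the authors' procedure: if a fraction of pages sampled from engine A is also found in B, and a fraction sampled from B is also found in A, the ratio of the two fractions approximates the ratio of index sizes. The toy indexes, sample sizes and function names are hypothetical. The last lines only check the arithmetic of the figures quoted in the abstract.

```python
import random
from typing import Iterable, Set


def containment(sample: Iterable[int], other_index: Set[int]) -> float:
    """Share of sampled pages that the other engine also indexes."""
    pages = list(sample)
    if not pages:
        raise ValueError("sample must not be empty")
    return sum(1 for page in pages if page in other_index) / len(pages)


def relative_size(sample_a, index_a, sample_b, index_b) -> float:
    """Estimate |A| / |B|.

    |A and B| ~ containment(A in B) * |A| ~ containment(B in A) * |B|,
    hence |A| / |B| ~ containment(B in A) / containment(A in B).
    """
    a_in_b = containment(sample_a, index_b)
    b_in_a = containment(sample_b, index_a)
    if a_in_b == 0:
        raise ValueError("no overlap observed; ratio is undefined")
    return b_in_a / a_in_b


# Toy data (hypothetical): A indexes 1000 pages, B indexes 500, 400 shared.
rng = random.Random(0)
index_a = set(range(0, 1000))
index_b = set(range(600, 1100))
sample_a = rng.sample(sorted(index_a), 200)
sample_b = rng.sample(sorted(index_b), 200)
estimate = relative_size(sample_a, index_a, sample_b, index_b)  # true ratio is 2.0

# Arithmetic check on the abstract's figures: 2.2M pages shared by all
# four engines out of 160M joint coverage is just under 1.4%.
four_way_share = 2.2e6 / 160e6
```

In a real measurement the membership test would be a query against each engine's public interface rather than a set lookup, and the sampling itself would need care to avoid bias toward pages that are easy to find.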
This post should have been the first on this humble blog, but it is never too late to review the basic concepts of our field in order to clear up any doubts that may arise.
The newspaper El País reports on the night of 26 July that, over the course of today, an attack on several search engines slowed Internet speeds for several hours.

