Search engines are a core component of the internet as we know it. Durkheim suggests in his book “The Division of Labor in Society” that infrastructure such as roads and railways connects people, enables economic development and, indirectly, contributes to the transformation of social solidarity. Seen in that light, it is hard to overestimate the impact of search engines. On any day in 2006, about 60 million adult Americans entered more than 200 million search queries into search engines. In 2005, 84 per cent of Internet users had used search engines, and on any given day 56 per cent of those online used them. 92 per cent of those who use search engines say they are confident about their searching abilities, and over half of them, 52 per cent, say they are “very confident”. 68 per cent of users say that search engines are a fair and unbiased source of information; only 19 per cent say they do not place that trust in search engines (Pew Internet and American Life Project, Search Engine Users, 2005). As of August 2007, Google not only handles the majority of all search queries, it has also increased its volume to 1,200 million searches per day on average worldwide, according to ClickZ reporting on comScore data. Yahoo is way behind at 275 million search queries per day, and MSN at 70 million. Baidu (a Chinese search engine) beats MSN, coming in at 105 million. 2006 figures for the US only put Google at 91 million searches per day. Reason enough to theorize a little about search engines, their importance, their policies and why they give rise to trust concerns online. Search engines are technology, information infrastructure, knowledge infrastructure and a socio-economic phenomenon simultaneously. Hence, if I enter a search query into any search engine, is it social action? Is it economic action? Can social and economic action be separated in a search query at all? And how does that relate to trust concerns?
So I came across “Web Search – Multidisciplinary Perspectives” (Springer, 2008), edited by Amanda Spink and Michael Zimmer. The book is structured into three main sections with five chapters each. Following the introduction, Part II presents social, cultural and philosophical aspects of Web search, Part III presents political, legal and economic aspects of Internet search, and Part IV presents information behavior perspectives. In Part V, the conclusion, the editors draw together the results and discuss avenues for further research.
Essays include “Through the Google Goggles: Sociopolitical Bias in Search Engine Design”, “Reconsidering the Rhizome: A Textual Analysis of Web Search Engines”, “Searching Ethics: The Role of Search Engines in the Construction and Distribution of Knowledge”, “The Gaze of the Perfect Search Engine: Google as an Infrastructure of Dataveillance”, “Search Engine Liability for Copyright Infringement”, “The Democratizing Effects of Search Engine Use: On Chance Exposures and Organizational Hubs”, “Googling Terrorists: Are Northern Irish Terrorists Visible on Internet Search Engines?”, “The History of the Internet Search Engine: Navigational Media and the Traffic Commodity”, “Toward a Web Search Information Behavior Model”, “Web Searching for Health: Theoretical Foundations and Connections to Health Related Outcomes”, and “Web Searching: A Quality Measurement Perspective”.
In his chapter “Through the Google Goggles: Sociopolitical Bias in Search Engine Design”, A. Diaz holds that search engines are the predominant gatekeepers in cyberspace, and that Google handles the majority of search queries. Google thus directs hundreds of millions of users to specific sites and sources, and not to others, which makes it the Internet’s most important gatekeeper. The web is loaded with people’s expectations of democratization, in the sense that underrepresented voices and unusual or heterodox viewpoints can be heard and discussed through the filter of search engines. Hence, the question arises whether Google supports or hinders the “deliberativeness” of the Web. Google operates based on its PageRank algorithm. PageRank estimates the “importance” of an arbitrary page by looking at how many other “important” pages link to it. Thus, being “important” means being popular and being visible. PageRank is biased towards large, famous, technology-driven companies such as Amazon and eBay. In contrast, the millions of “typical” websites and blogs the user occasionally stumbles upon have among the lowest PageRank results. According to the company’s PR literature, Google’s PageRank is not only in tune with democratic principles, it embodies the very process of democracy itself, since web authors “vote” by setting hyperlinks. But from the perspective of deliberative democrats, PageRank is highly problematic. First, it mirrors rather than mitigates the Web’s link inequality. It abandons the goal of reflecting a website’s “importance” or “authority” on a given subject, and instead aims to mirror the common wishes of users. Second, it suppresses controversy by presenting only the sunny side of a topic. Third, Google promotes advertising that directly competes with editorial content.
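The “voting by hyperlink” idea described above can be made concrete with a small sketch. What follows is a simplified, textbook-style power iteration, not Google’s production algorithm. The damping factor of 0.85, the function name `pagerank` and the five-page toy web are assumptions made for illustration only; none of them come from the book.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration (illustrative sketch only).

    links maps each page to the list of pages it links to.
    damping=0.85 is a commonly used value, assumed here.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline, independent of inbound links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its "vote" evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: two heavily linked sites and three small blogs.
toy_web = {
    "big_shop": ["big_auction"],
    "big_auction": ["big_shop"],
    "blog_a": ["big_shop", "big_auction"],
    "blog_b": ["big_shop"],
    "blog_c": ["big_shop", "blog_a"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web the two heavily linked sites end up with most of the rank, while the blogs that nobody links to stay at the baseline. That is the link inequality the deliberative critique points at: the score reflects who is already linked, not the quality of what a page says.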
In the process of commercialization, Google’s founders may have abandoned their original vision of a search engine that is “competitive” and in the “academic realm”, but they have come to see this contradiction. In search of a solution to the problem, some see the answer in technology, others in political regulation and subsidization.
Critical theorists have drawn on Deleuze and Guattari when discussing the potential of the Internet. The Deleuzian notion of the “rhizome” has served as a metaphor for the Internet, denoting its limitless expansion, its randomly intersecting sites and its capacity for rupture and re-growth. The rhizome is associated with structure, hypertext, epistemology and resistance. In his essay “Reconsidering the Rhizome: A Textual Analysis of Web Search Engines”, A. Hess critically discusses the development of the web given the impact of commercial search engines such as Google, Yahoo! and MSN. First, the connections between individuals in the fight against hegemonic practices have been weakened by the commercial restructuring of the web. Second, as a consequence of personalization and the use of “cookies”, the user is confined to prescribed experiences and limited in his or her ability to expand into new areas of learning. Third, search engines are profoundly hierarchical in how they arrange knowledge. While hypertext, as a language, may carry some rhizomatic potential, the frequent use of and reliance on search engines as a gateway to the Internet limits its potential as a global heterarchical sphere of information, knowledge and public discourse.
The quest for the perfect search engine is a Faustian bargain, argues M. Zimmer in “The Gaze of the Perfect Search Engine: Google as an Infrastructure of Dataveillance”. Google’s stated aim is “to organize the world’s information and to make it globally accessible and useful” and to build the “perfect search engine” that produces only intuitive, personalized and relevant results. Yet the perfect search engine comes at a high price: the loss of privacy. Integrating search into one’s daily routines involves the monitoring of all users’ information-seeking activities, and, given the USA PATRIOT Act and other national legislation supporting the war on terrorism, it does not preclude user data from being handed to third parties, namely state agencies. “My goodness, it’s all my personal life … I had no idea somebody was looking over my shoulder”, a woman exclaimed upon being identified by the New York Times solely on the basis of her search terms in the AOL database. Obviously, the woman had entered quite a bit of identifying information such as names, social security numbers, addresses and the like. Search engine providers keep detailed records of users’ searches and are eager to transform the information generated in the course of users’ information-seeking activities into economic value. Google’s perpetual surveillance and the “continuous registration, perpetual assessment and classification” of those under the gaze of personal data accumulation allow the search engine to develop into the center of gravity online. Google, for example, offers dozens of search-related tools for free to help users organize and use information in many contexts, ranging from daily routines to email, office work, academic research, finance, health, shopping and travel.
Consequently, users increasingly search, find, navigate, organize and distribute information based on Google’s information infrastructure, enabling the search engine to accumulate detailed information about the users, their interests, habits, lifestyles, social position and networked relations. Is there a solution to the Faustian bargain? Zimmer discusses state regulation, that is, laws regulating the capture and use of personal information online, and self-regulation on the part of the Internet search industry, that is, strict policies regarding the capture and aggregation of user data. The latter solution in particular seems not very convincing, not even to the author, given the search engines’ vital economic interest in capturing, evaluating and selling user information. Even technological design can be political in character, since the default settings of some products and services enroll users in data-collecting processes while others leave them turned off.
Are search engines constantly violating copyright law just by distributing content on the Web? This question is examined by B. Fitzgerald, D. O’Brien and A. Fitzgerald in “Search Engine Liability for Copyright Infringement”. First, the authors introduce basic principles of copyright law and their application to search engines. Much of the digital content distributed online or available for viewing and downloading at Internet locations is protected by copyright. However, today’s copyright law was established centuries before digital-era technologies such as search engines were invented. Copyright emerged in the 15th and 16th centuries following the invention of the printing press. Over the centuries, the scope of copyright expanded to encompass new forms of knowledge and related goods as well as new ways of distributing those materials. As new ways of expressing and exploiting creative materials have been developed, the exclusive rights conferred on creators have been reformulated and extended with the aim of ensuring that creators will reap the rewards of their efforts. The Berne Convention of 1886 created a set of rules under which the creator or author is automatically the copyright owner at the outset, but can assign the copyright to a commercializing agent, e.g. a publisher, who then becomes the copyright owner. Moral rights stay with the creator or author, while economic rights rest with the copyright owner. Most national jurisdictions also have provisions whereby third parties can be held responsible for authorizing or contributing to copyright infringement. The rationale behind such provisions is that third parties are in a better position to discourage copyright infringement. Importantly for search engines, provisions called “safe harbours” exist in national jurisdictions to immunize search engines and providers from secondary liability for copyright infringement. However, legal issues do arise when copyright-protected material (e.g. text, sound material, image, video) is distributed online. In the subsequent discussion, the authors also discuss cases decided in courts of the United States, Australia, China and Europe regarding the liability of search engines for copyright infringement. The chapter concludes with some considerations for copyright reform that would accommodate and unfold the informative potential search engines have to offer society. One recommendation is to have the World Intellectual Property Organization (WIPO) organize a conference on search engine liability. A more radical proposal, to be decided at such a conference, would be to grant search engines the broadest possible immunity with regard to copyright, acknowledging search engines as an information infrastructure that improves the public’s ability to research, manage and process knowledge. Ultimately, copyright law must strike a balance between the right to own and exploit information and the ability of users to access and reuse information in a knowledge economy.
A. Lev-On expresses an optimistic view of Internet search engines. In his essay “The Democratizing Effects of Search Engine Use: On Chance Exposures and Organizational Hubs”, the author focuses not on Google’s China problem, but rather on China’s Google problem. The author discusses the dilemmas that search engines pose for authoritarian regimes and points to the democratic potential of search engine use. Google’s search algorithm PageRank, the author argues, produces socially relevant results which are driven by the linking decisions of individuals and not compromised by spammers, firms or government agencies; these results amount to a genuine ‘public choice’, a ‘slightly filtered public opinion’. Google has been criticized for not distinguishing clearly enough between organic results and paid content, and for being biased as a consequence of governmental intervention, as in the Chinese case, where ‘forbidden’ content is blocked. Moreover, PageRank has been criticized for transforming the equality of opportunity for which the Internet is so much praised into inequality of outcome, and into the dominance of a small elite of highly linked sites over users’ attention. The author argues that search engines indirectly advance political organization: they prompt people to become active, get organized and create ‘organizational hubs’, and thus make the political marketplace more open, more inclusive and more competitive. Authoritarian governments such as China’s, on the other hand, aim at avoiding unpredictability and exposure to critical information. Their action is oriented toward Hobbes’ advice to monitor and control the online world. Search engines can be used to expose people (e.g. to their government), or they may serve as an infrastructure enabling the formation of political action and social movements that distribute information of which authoritarian regimes disapprove.
With regard to China, the sheer number of Chinese flocking to the Internet raises high hopes for its democratic potential. Yet, given the picture of China that the Western world got in the course of the Olympic Games of 2008, I am rather doubtful.
E. van Couvering traces the economic development of the search engine industry in the chapter “The History of the Internet Search Engine: Navigational Media and the Traffic Commodity”. The chapter sets out by distinguishing the first movers in Internet search, such as Yahoo! (1994), Lycos (1994), WebCrawler (1994), AltaVista (1995) and HotBot (1996), from today’s incumbents such as Google, Yahoo! and MSN. The author discusses how far Internet companies operate like traditional media organizations and how traditional and Internet media relate to each other. In particular, the chapter traces the development of the Internet portal, the first of which appeared in 1997, and the steady increase in portal revenues. Using a theoretical framework based on the political economy of communications, the author argues that web search engines are the purveyors of a new media form, the navigational media, which have taken advantage of a fragmented media market to attract and distribute traffic via the creation of flexible and stable networks. In 2006, a few large search engines held roughly three quarters of the online advertising market. The author distinguishes three periods. In the first period, many new technologies were created, and venture capital systems helped launch new companies into emerging industries. In the second period, specialized content “channels” built around advertiser content developed, in which lucrative sponsorship deals became possible as a result of audience segmentation. The cost-per-click model helped to redefine the online media commodity from audience to traffic. And in the third period, the emphasis on the sale of traffic gave a massive boost to search engine revenues, especially for the first movers Overture and Google. Instead of seeking to acquire and control content, the search engines chose the strategy of distributing traffic-based advertising throughout the web.
The essay teaches that the importance unique visitors, page impressions and clicks have gained is a consequence of the dense interconnection between the search engine as a social function and the advertising business.
S.A. Knight and A. Spink show the immense complexity of users’ empirical information retrieval behavior in their chapter “Toward a Web Search Information Behavior Model”. Some of the factors involved are user-related and include cognitive processes, motivational issues, information needs, and technology attitude and adoption; others are system-related and include algorithms and user interface design. The field lacks a coherent model of Web interaction in the information behavior context, argue Knight and Spink, who go on to explore the range of information behavior, information seeking and retrieval. The end user on the Internet is quite different from the end users of other information databases, who were likely to be specialized professionals: (1) End users are not necessarily the “information professionals” of previous generations of online searchers. (2) Formal training in information search cannot be taken for granted. (3) Users are likely to use a wide set of search strategies, with inconsistent results. (4) Users are cognitively and physically on their own. (5) Users are likely to search for a wider variety of information types and formats than specialized professionals. (6) Users are more likely to be the “information user” of the information they are seeking. The change in users is accompanied by a dramatic change in the online information environment: (1) Open architecture – lack of enforceable quality standards. (2) Open classification and meta-tagging system – Web pages failing to be indexed appropriately by search engines. (3) Highly dynamic use of hypertext – favoring browsing over query. (4) Dynamic/fluid content structure – resulting in pages being “moved” within the directories of a website, leading to frequent 404 errors. (5) Partial representation – at any time, the results of a search query provide only a snapshot of the relevant internet. (6) Volume – due to the sheer size of the internet, a search engine’s snapshot of the Internet on any given day is likely to represent less than 30 % of the known Web. A comprehensive model of internet information behavior needs to take into account both the motivating human aspect (information need) and the setting (information environment) in which individuals try to find the information needed. A further element which needs to be added is the interaction between the user and the web system environment, and between the user and the information.
The book “Web Search – Multidisciplinary Perspectives” is a valuable contribution showing that the Internet is relevant to trust concerns. In Durkheim’s words, the Internet is a social thing, a social fact: it is external, coercive, like a thing, and cannot be altered by a single participant at any given time. The size, growth and complexity of the internet are overwhelming, and the importance of advertising can hardly be overestimated. Because the search engine industry and the advertising business are so densely interconnected, the visibility and the relevance or irrelevance of content online can only be understood if the interrelatedness of search and advertising is taken into account. The book teaches about the dense interrelatedness of social and economic action online: social action (e.g. mutual advice and support, cooperation, communal association) is economically relevant because digital footprints left online unintentionally can be gathered, evaluated, transformed into data for the advertising business and sold to third parties without the knowledge and permission of the Internet user. Beyond the book, it can be added that economic action (e.g. purchase of goods, borrowing and lending) takes place in social settings, mostly on provider platforms, whose development, again, depends on advertising revenues. Thus, any discussion of privacy and trust online is silently accompanied by the advertising business and the business interests of companies who sell ads and distribute content, unless other financial resources become more important. And shrinking advertising activity as a consequence of the global financial and economic crisis will certainly affect the further development of the internet.