SEARCH ENGINE HISTORY
Before the rise of Yahoo!, Facebook, Twitter, and Google, people went to libraries to read articles and books and consume knowledge. Libraries were considered enduring and powerful because they held vast collections of written knowledge; however, they were also seen as a nexus of potential information overload (Halavais 2013, p. 11). As a result, techniques for sorting and finding items in the library (e.g. the Dewey Decimal System) were created to make the collection easier to manage. These techniques classified and indexed books by subject matter, and the records were kept on paper.
Librarians were not the only ones to use data collection and subject-oriented classification. Offices such as insurance companies also needed ways to organise their files. Documents, articles, customer records, and other data recorded on paper produced a data glut. The computing systems first created for indexing library books were expanded and adapted for office use, and documents that would once have lived on paper were soon stored digitally in computer systems rather than in drawers or on shelves (Halavais 2013, p. 13).
The emergence of the World Wide Web produced an even more massive data glut, as large numbers of documents, articles, and other information were uploaded and made available on the internet. Thus began the evolution of search engines.
According to the search engine timeline compiled by WordStream, the first search engine on the internet was Archie. It was created in 1990 (before the World Wide Web) to search FTP sites and index the file names it found. In 1991, Tim Berners-Lee set up the World Wide Web Virtual Library, the oldest catalogue of the web.
As use of the World Wide Web grew in 1993, bots and spiders began indexing titles and URLs on the web; the results were collated and listed without ranking. The following year, WebCrawler became the first bot/spider to index entire pages on the web. Yahoo! was also created, although it only included selected sites and paid commercial listings, and Lycos ranked pages according to relevance, prefix matching, and word proximity.
From 1995 to the present, numerous search engines have been developed, differing in what they offer and how they rank pages. Below are some examples of various web search engines:
HOW DOES A SEARCH ENGINE WORK?
In general, the system gathers pages from around the web that contain the words typed into the search box. The collected results are then presented, ranked by their relevance to the particular keywords.
The search box is where you type the words you want to search for; the engine then shows a list of pages containing the combination of words you entered (Halavais 2013, p. 13).
Search engines on the web would not be efficient without spiders, also called crawlers. Spiders systematically visit pages across the web and index each one into a database, from which the results for a particular query are drawn.
However, search engines do not simply list every page where a keyword occurs. Because many pages on the web are unhelpful, and because of the rapid growth of spam, search engines rank results according to how relevant their content is to the query.
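The crawl-and-index mechanism described above can be sketched as a minimal inverted index: a table mapping each word to the set of pages containing it, which is then intersected to answer a multi-word query. This is a simplified illustration under invented example pages, not how any production engine is actually implemented:

```python
# A minimal inverted index: maps each word to the set of pages in
# which it appears. A toy sketch of the structure a spider builds;
# the example pages are invented for illustration.

from collections import defaultdict

def build_index(pages):
    """pages: dict mapping URL -> page text."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = index.get(words[0], set()).copy()
    for word in words[1:]:
        results &= index.get(word, set())  # keep only pages with all words
    return results

pages = {
    "a.example": "search engines crawl the web",
    "b.example": "libraries index books by subject",
    "c.example": "engines index pages on the web",
}
index = build_index(pages)
print(search(index, "index web"))  # → {'c.example'}
```

Real engines store far richer data per entry (positions, anchor text, link structure), which is what makes ranking, rather than mere matching, possible.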
To show the best results for a particular query, search engines use various algorithms. Google, for example, uses an algorithm called PageRank, which scores a page according to the number and importance of the other pages that link to it, thereby estimating the authority of a given page (p. 18). Other attributes that help a page rank highly are the number of "hits" or clicks it receives and how quickly it loads.
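The core PageRank idea can be illustrated with a toy iteration: each page repeatedly distributes its score among the pages it links to, so pages that attract many links from important pages end up with high scores. The link graph and the damping factor below are illustrative assumptions, not Google's actual data or parameters:

```python
# Toy PageRank: a page's score is spread among the pages it links
# to, and scores are iterated until they stabilise. The damping
# factor and the three-page link graph are illustrative only.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with equal scores
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                     # dangling page: share evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Illustrative graph: both A and C link to B, so B gains the most authority.
links = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
rank = pagerank(links)
print(max(rank, key=rank.get))  # → B
```

The point of the sketch is that authority is recursive: a link from a high-scoring page is worth more than a link from an obscure one, which is what distinguishes PageRank from simply counting links.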
IMPLICATIONS OF SEARCH ENGINES
The web gives us an opportunity to learn and connect with other people, and search engines make finding information on the web much easier. A by-product of search engines, according to Levene (2010, p. 28), is that users spend less time navigating a given website looking for answers, since a search engine hands them a curated list of result pages to choose from. However, negative implications of search engines, especially Google, have also drawn attention.
Database of intentions
John Battelle (2005) described search engines as a database of intentions: through users' search histories, search engines are said to hold a map of each user's interests and personality (Halavais 2013, p. 30). Search engines have slowly learned how people behave. Google, the largest and most popular search engine today, was defined by Battelle as
‘Database of intentions, as a living artifact of immense power…
is holding the world by its thoughts’.
Watch the short clip of John Battelle’s talk about database of intentions here:
To examine further what Battelle meant by a database of intentions, Kylie Jarrett (2014) explored different vignettes and implications of Google's data-mapping. The cost of using free sites is the data: the consumer traffic and demographic information that we, as users, provide for free with every query, every click, every move we make on the web. This information is the backbone of digital media economics (Jarrett 2014, p. 18); it is sold to advertisers and marketers, and the resulting statistics are transformed into a commodity.
Monopolisation of the web
In February 2012, the Pew Internet & American Life Project surveyed more than 2,000 adult users of search engines. Users were generally satisfied with the quality of search results; however, most were anxious about, and disapproved of, the collection of personal information for pre-emptive suggestions and targeted advertising.
In 2010, Mark Levene wrote a book on search engines and web navigation. Interestingly, he sketched a futuristic scenario in which a single search engine dominates the web. He argued (p. 65) that a dominant search engine could monopolise the web by determining which information we can easily see, and could make or break businesses by choosing which websites remain visible. He also noted that it would most likely track each user's queries and serve personalised answers, ads, and versions of the web. As noted above, Google is the dominant search engine right now and is arguably doing what Levene described in 2010.
As our user data becomes a commodity in the advertising market, it also becomes the economic surplus that allows Google to keep growing: to build more server farms and to fund research and development into new tools and systems for capturing still more user data. This system of commodified user data is what makes Google a capitalist enterprise.
Our habits, the knowledge we consume, and the desires Google maps into its database become its property, taken out of our control and commodified to increase its revenue, mainly through advertising.
When you type something into Google, it anticipates your query and proposes suggestions for what you might actually be searching for, drawing on your past behaviour combined with generic data from other, similar users in your area (Jarrett 2014, p. 23).
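The mechanics of this kind of pre-emptive suggestion can be illustrated with a toy example: match the typed prefix against a log of earlier queries and propose the most frequent completions. The query log below is invented for illustration; Google's actual system blends personal history with far more aggregate signals:

```python
# Toy query suggestion: complete a typed prefix from a log of past
# queries, most frequent first. The log is invented; real systems
# combine personal and aggregate behavioural data.

from collections import Counter

def suggest(query_log, prefix, k=3):
    """Return up to k past queries starting with `prefix`, most frequent first."""
    counts = Counter(query_log)
    matches = [q for q in counts if q.startswith(prefix)]
    matches.sort(key=lambda q: (-counts[q], q))  # frequency, then alphabetical
    return matches[:k]

log = [
    "search engine history", "search engine history",
    "search engine optimisation", "sea levels rising",
]
print(suggest(log, "search"))
# → ['search engine history', 'search engine optimisation']
```

Even this crude sketch shows the point Jarrett makes: the suggestions you see are a statistical artefact of what you and others have already searched for.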
The suggestions Google makes are thus shaped by your previous searches and by the data of other users articulated into its database. These pre-emptive assumptions, moreover, can sustain social inequality by producing unfair and biased search results; pre-emption in search engines thereby intervenes in and engineers reality (Parisi and Goodman, cited in Jarrett 2014, p. 25). Google has created an 'algorithmic identity' that grows, changes, and adapts with every new piece of data you enter and every new input from other users.
The notion of algorithmic identity is close to what Mark Poster warned of in 1995: his concern was the digital profile built from credit card use, marketing profiles, insurance information, and the like (Jarrett 2014, p. 24).
This digital identity is constituted in Google's database, and from that database Google controls subjects by pre-emptively ascribing intentions to them and preparing for them as particular intentional subjects. Google's database of identities contributes to biopolitics: each user's algorithmic identity feeds into the categorisations that make up populations, which Google then uses for pre-emptive suggestions and for deciding which ads and pages to show (Jarrett 2014, p. 26).
As information gatekeepers, search engines have the power to choose which websites to include on, and exclude from, their results pages (Levene 2010, p. 65). Search engines therefore influence, and to some extent control, the information we get to access.
SEARCH ENGINE BENEFITS
Despite the negative implications discussed above, search engines are undoubtedly useful: they have shaped how we use the web and transformed how we live, especially in how we learn and access information.
According to the survey conducted by Pew, Google was preferred by 83% of search users in 2012. I myself use and prefer Google, as I believe it shows accurate and more dynamic results.
I have used Google to help me find answers since my high-school assignments, and still do today as I write this essay. Google's search interface gives each user a clean and uncluttered results page; the company seems to have found the right balance between users' needs and commercial needs, so that users can easily distinguish genuine content from advertising (Levene 2010, p. 56).
On the other hand, I use the search functions of social media platforms such as Facebook, LinkedIn, and Instagram to connect with people and follow brands around the world, and job search engines such as Seek to find work to support myself financially. Lastly, I used our university's online library catalogue to access credible research papers for this essay.
However, the number of search engines and tools that claim to be private and not to collect data from your searches is increasing. Some examples are:
Yet at the end of the day, no matter how helpful search engines are, what they are reportedly doing is an invasion of privacy. The way they curate results based on past queries, and on how we use the web, clearly shapes the information we get online. What we might need, and what everyone really wants, is more transparency, access to our own data, and for search engines to leave us alone and let traditional advertising live a little longer.
Halavais, Alexander (2013), 'The Engines', in Search Engine Society, Cambridge and Malden: Polity Press, pp. 5-31.
Jarrett, Kylie (2014), 'A Database of Intention', in König, R and Rasch, M (eds), Society of the Query Reader: Reflections on Web Search, Amsterdam: Institute of Network Cultures, pp. 16-29.
Levene, Mark (2010), An Introduction to Search Engines and Web Navigation. Hoboken, New Jersey: John Wiley & Sons, Inc.
Purcell, K, Brenner, J and Rainie, L (2012), Search Engine Use 2012, Pew Research Center, viewed 10 October 2018, http://www.pewinternet.org/2012/03/09/search-engine-use-2012/
The History of Search Engines, WordStream, viewed 10 October 2018, https://www.wordstream.com/articles/internet-search-engines-history.
All photos are screenshots taken on 12 October 2018 from my personal computer.