The purpose of this chapter is to present a number of socio-economic aspects and to pinpoint issues of interest for further investigation by IPTS in the second year. It is not intended to give a full overview of all socio-economic aspects. The next period's work will be devoted to understanding the details of the business models, from which we intend to derive some pathways for the future and understand their policy implications.
Major search engine providers are large multinationals, offering far more than just a search tool for internet surfers. Google, Yahoo! and Microsoft's MSN Live Search have introduced, and continue to propose, a series of services. Key elements of their 'core business' and major adjacent services are sketched in Figure 2. The free email accounts, instant messaging and voice over IP services of these multinationals are communication services that both complement and compete with traditional means of communication. Search engine providers are also owners of popular social network sites, like YouTube or Flickr, whose members not only upload large amounts of audio-visual content but also classify (tag) and filter information (e.g. ranking by voting). By making use of social networks, search engine providers gain control over huge amounts of structured and unstructured audio-visual content. Although not all of this content is valuable as a resource for (semi-)automatic tagging, mark-up and further processing, such content, together with proprietary content, could be packaged and specifically delivered to users.
An example of how search engines can act as information providers is news syndication. News syndication can be generated automatically by search engines, like Google News or Yahoo News, or in combination with human expertise. Companies like the Finnish M-Brain[1] use a search engine to pick articles from the internet. Media analysts and experts then select the relevant information, summarize it, and provide it to the clients.
Summarizing, search engine providers have a pivotal role in the digital economy and knowledge society, not only because of their famous search tools and their huge online advertising business, but also because of their role as enablers of content creation, as information providers and as communication facilitators. These roles are intertwined. The purpose of this paper is not to discuss this complex structure in detail, but rather to discuss some particular socio-economic issues.
Figure 2: Google, Yahoo and Microsoft operate a number of services that render them key players as enablers of content producers, as information providers and as communication facilitators.
[1] http://www.m-brain.fi/english/
In 1973, Daniel Bell predicted that the economics of goods would be replaced by the economics of information.[1] The amount of information created, stored and replicated in 2006 was estimated to be about 161 billion gigabytes, equivalent to three million times the information in all books ever written. That figure is expected to reach 988 billion gigabytes by 2010.[2] This data comes in a variety of formats, and content has evolved far beyond pure text description.
Following Bell's prediction, does more information also mean more value? Not necessarily, as information also needs to be 'useful'. From an economic point of view, information becomes valuable only if it is both relevant and new to the user, and this is where search engines come into play. As there is an abundance of digital information, search engines add value by filtering relevant and new content for the user. As the degree of relevance and novelty of the information is a critical issue, a main objective of search engine providers is to gather the freshest content and to prioritize information following the priority criteria perceived by the user. To this end, search engines rely on a set of innovations, from both the technological and the business point of view.
[1] "The Coming of Post-Industrial Society", Daniel Bell, Harper Colophon Books, New York 1974.
[2] See Andy McCue, Businesses face data 'explosion', ZDNet, 23rd May 2007, at http://news.zdnet.co.uk/itmanagement/0,1000000308,39287196,00.htm (last visited: 18th December, 2007), referring to the IDC/EMC study The Expanding Digital Universe. The data explosion has also been estimated by other studies, such as the earlier "How Much Information", by Peter Lyman and Hal R. Varian (http://www.sims.berkeley.edu/how-much-info-2003)
1.2.1. An Innovation-based Business
In an econometric study, Prusa and Schmitz examined empirically whether 'first-movers' become market leaders.[3] They conclude that new firms in the PC software industry have an advantage over incumbents in developing new software, while incumbents can have a comparative advantage in improving products in existing categories. The evolution of the search engine market is not a story of a 'first-mover' advantage. Yahoo!, Altavista, Inktomi and Lycos started early, but they did not succeed in maintaining their initial advantage. Google entered the market relatively late but employed a far better technology for ranking relevant results. In addition, users appreciated Google's less intrusive advertising strategy and other features, like its solution to spamming. In a way it is the story of a 'second-mover improvement'.[4] Early players had an advantage, but their technology was not good enough to compete, and the 'brand name' advantage declined over time. Of the first wave of search engines, Yahoo! is the only one still maintaining a prominent role, possibly because it has continuously provided good service and technology.
While in early times the quality of the technology alone determined the survival of a search engine, this is no longer the sole factor. In fact, today many users can hardly perceive any notable quality differences amongst the major engines, while brands (to the point that "googling" has become a synonym of web search) and adjacent services play a more important role. Still, it was not foreseeable that a 'latecomer' would be able to overthrow former market leaders. This market dynamism suggests that the search engine market is not a 'winner-takes-it-all' situation, unlike PC operating systems, desktop applications (like Office), or Internet browsers (Netscape first, and Explorer later).
Although there is a concentration on a few major players, the search engine market is not comparable to the dominance of Amazon for book sales or eBay for auctions. It is a business requiring a steady flow of technological and business innovation. Google has become market leader because it offers an excellent search tool and runs an extraordinarily efficient advertising business model. In addition, it has introduced numerous innovative products and attractive services which have been well received by the public. In fact, over the past years the sources of revenue have been roughly equally divided between advertising on the search portal itself (i.e. AdWords) and on the affiliated sites (i.e. AdSense); see the Google web sites and Google Network web sites in Table 1. Google's revenues other than advertising, such as licensing (i.e. business search solutions), contribute only marginally to the overall result.
[3] "Are new firms an important source of innovation?", Prusa, Thomas J. and James A. Schmitz, Jr., Economics Letters, 35, 1991, 339-342.
[4] "Google: What it is and what it is not", Michael A Cusumano, Communications of the ACM Vol 48, p15 ff. 2005
| In Thousands US$ | 2003 | 2004 | 2005 | 2006 |
| Advertising in Google web sites | 792,063 | 1,589,032 | 3,377,060 | 6,332,797 |
| Advertising in Google Network web site | 628,600 | 1,554,256 | 2,687,942 | 4,159,831 |
| Total advertising revenues | 1,420,663 | 3,143,288 | 6,065,002 | 10,492,628 |
| Licensing and other revenues | 45,271 | 45,935 | 73,558 | 112,289 |
| Total Revenue | 1,465,934 | 3,189,223 | 6,138,560 | 10,604,917 |
Table 1: Revenue for Google in the period 2003 to 2006, in thousand US$. Source: Google Annual Report 2005 and 2006. Information filed with the US Securities and Exchange Commission.
Recently, the share of advertising revenues from the Google web sites (60% in 2006) seems to have risen with respect to the advertising revenues from the Google Network web sites (39% in 2006), as can be seen from Table 2. In the future, the ratio between the two revenue sources may shift, given that Google has acquired the online advertisement company DoubleClick (the acquisition still needs approval from the competition authorities). In any case, Google's own web sites contribute roughly half of Google's searches and revenues. The other half derives from subscribed affiliated sites, which embed the Google search technology (advertising platform or pay-per-click business model) in their sites. In principle, these sites might shift relatively easily to a competitor if they consider another search engine to be more convenient for them.[5] In practice, there are few real alternatives in Europe.
| Revenues | 2003 | 2004 | 2005 | 2006 |
| Advertising: Google web sites | 54 % | 50 % | 55 % | 60 % |
| Advertising: Google Network web sites | 43 % | 49 % | 44 % | 39 % |
| Licensing and other revenues | 3 % | 1 % | 1 % | 1 % |
Table 2: Google's advertising and licensing revenues by share of total revenue. Source: Google Annual Report 2005 and 2006. Information filed with the US Securities and Exchange Commission.
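The shares in Table 2 follow directly from the absolute figures in Table 1. As a minimal sketch (the dictionary keys and the function name `revenue_shares` are illustrative choices, not part of the source), the percentages can be recomputed as each revenue line divided by total revenue, rounded to the nearest whole percent:

```python
# Figures copied from Table 1 (thousand US$). Key names are illustrative only.
REVENUE = {
    2003: {"google_sites": 792_063, "network_sites": 628_600, "licensing_other": 45_271},
    2004: {"google_sites": 1_589_032, "network_sites": 1_554_256, "licensing_other": 45_935},
    2005: {"google_sites": 3_377_060, "network_sites": 2_687_942, "licensing_other": 73_558},
    2006: {"google_sites": 6_332_797, "network_sites": 4_159_831, "licensing_other": 112_289},
}


def revenue_shares(year_figures):
    """Return each revenue line as a whole-number percentage of total revenue."""
    total = sum(year_figures.values())
    return {line: round(100 * value / total) for line, value in year_figures.items()}


if __name__ == "__main__":
    for year, figures in REVENUE.items():
        print(year, revenue_shares(figures))
```

Running the sketch reproduces the rows of Table 2, for instance 54 %, 43 % and 3 % for 2003.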
For new entrants, the entry barrier to becoming a fully-fledged player (offering the whole value chain) is currently huge. A new search engine provider would need considerable investment to set up a state-of-the-art infrastructure, including server farms, and to cover operational costs before it can get into the advertising business. Such state-of-the-art infrastructure is necessary to offer a good search experience, returning relevant results to a very large audience. For the sake of illustration, let us assume that in online advertising the average click-through rate is 2% and the average purchase rate is also 2%.[6] This means that in the best case only four out of ten thousand people who see an advertisement will buy the product. As the purchase rate is low, advertisers need to reach large audiences to sell their products. For this purpose they establish alliances, buy social network sites, etc. In addition, they try to increase the click-through rate by tailoring ads to target users. To this end they analyse user search patterns, trying to build user or group profiles that are as detailed as possible.
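The arithmetic behind this illustration can be sketched in a few lines. Under the stated assumption of a 2% click-through rate and a 2% purchase rate, the combined rate is 0.02 × 0.02 = 0.0004, i.e. about four buyers per ten thousand people who see the advertisement. The function names below are hypothetical and the rates are the illustrative averages from the text, not measured values:

```python
def expected_buyers(viewers, click_through_rate, purchase_rate):
    """Expected number of buyers among the people who see an advertisement."""
    return viewers * click_through_rate * purchase_rate


def viewers_needed(target_sales, click_through_rate, purchase_rate):
    """Audience size needed to expect a given number of sales."""
    return target_sales / (click_through_rate * purchase_rate)


if __name__ == "__main__":
    # Illustrative averages from the text: 2% click-through, 2% purchase rate.
    print(expected_buyers(10_000, 0.02, 0.02))
    print(viewers_needed(100, 0.02, 0.02))
```

With these assumed rates, roughly 250,000 viewers would be needed to expect 100 sales, which illustrates why advertisers depend on very large audiences.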
The search engine business is highly competitive and resource intensive. On the one hand, the operating costs of maintaining a good service are very high. On the other hand, the cost for a user of switching from one search engine to a competing one is very low: it is just one mouse click away. In fact, Fallows[7] points out that 56% of users employ more than one search engine, and it is reasonable to assume that users would change if they were not satisfied with the quality of the search. Similarly, advertisers are only loosely tied to a single search engine and would switch to the one providing them with the largest possible audience and the best offer to place their ads.
Low switching costs for users and advertisers provide the basis for healthy competition amongst search engines. At the same time, the huge investments (infrastructure and operational costs), the requirement of a mass market, and an advertiser-supported business model suggest that the equilibrium market for general purpose search engines is one with few large competitors.[8] This market structure is similar to that of national newspapers, where a few large companies, supported by advertising, compete for readers. A major difference between newspapers and web search engines is, of course, that the search engine market is less language or country specific: it is more straightforward to adapt experience with search engine applications gained in the Anglo-Saxon environment to other languages and countries.
Search engine providers have been successful in attracting larger audiences by diversifying beyond their core business and offering attractive services. The range and the rate of innovative services have been impressive. Major players offer search options for emails, search for mobile phones, and short messaging services for mobile wireless devices. In addition, they integrate novel services into their offers. Google offers print services to search online books, images from satellites, chat groups, news syndication, and a tool to perform price comparison on the web (Froogle). The Google Video store already has 3000 music videos and 300 television programs for sale. According to projections of the research firm IDC, by 2009 more than 30 million wireless subscribers will be watching commercial TV and video on a handheld device.[9] Taken together, these services and innovations turn the portal into a site to which users flow because of habit, market power and indirect externalities.
Summarising, as switching costs for users and advertisers are low, text search engines are forced to innovate continuously on different 'fronts'. First, they need to keep improving their technology. Second, they need to adapt their revenue model. Third, they need to take a series of measures to attract more users. All three factors are important, but not necessarily equally so. One research question is to determine the relative weight of each of the three factors and whether a change is to be expected in the future.
Things may look different in the future. For instance, a major barrier to entry is the expensive server farms needed to support today's main technology approach. Alternative, less expensive technical infrastructures for search engines, like P2P, are currently under exploration. If successful, this may decrease the investment costs and give more room for competition. Further, the AV search market does not need to be as monolithic as the current one for text search. Many players may offer complementary and competing services, with searchers choosing different AV search engine providers because of their particular strengths, e.g. for image or audio search, or for services such as better personalization of the interface. Given user habits in current web search, it seems likely that AV market revenue would also be based on advertising, although the pricing may differ.
[5] "Google: What it is and what it is not", Michael A Cusumano, Communications of the ACM Vol 48, p15 ff. 2005
[6] These are only average figures. In practice the click-through rate and the purchase rate are context dependent. Following AGOF, the conversion rate (i.e. the combination of both looking for a product and buying it over the internet) depends highly on the product or service. For instance, the highest conversion rates are for books (36.1%), followed by theatre and cinema tickets (31.4%) and flight and train tickets (29.9%), while food (2.4%) and beverages (2.7%) are at the other extreme [AGOF 2007]. See Berichtsband – zur internet facts 2007, Arbeitsgemeinschaft Online-Forschung e.V., August 2007, available at www.agof.de/if-2007-i-teil-1-online.download.6033aa53fd516aa8e75adb6e40408d3e.pdf
[7] 'Search Engine Users: Internet searchers are confident, satisfied and trusting – but they are also unaware and naïve.', D. Fallows 2005, PEW Internet & American Life Project
[8] It has been argued that in online businesses the market structure and the entry barriers may lead to a situation where only a few actors can survive. The main argument is that there are inherent limitations to human attention and that, for some internet-based services, possibly including search engines, network effects lead to a winner-take-all situation. In other words: in the long run there is room for only a limited number of Googles, eBays, Explorers or Wikipedias to survive in each of their respective sectors, i.e. search engines, online auctions, browsers or encyclopaedias.
[9] “Google becomes an entertainment company”, Michael Macedonia, January 2006 Computer.org
1.2.1.1. Online Advertising
Search engines offer both traditional advertising services and innovative internet-based services. Traditional services include display advertising, like banners or buttons appearing on the search engine's page, and classified advertisements, like ad listings in a directory. Today, search-specific advertising is dominant. When a query is entered into a search engine, the user receives two result lists. The first list is a web search provided for free in pull mode, ranked by relevance. It is usually called the organic result. The second is an advertising list whose ranking is auctioned. Search-specific advertising is highly efficient, as the user informs the engine of what he or she is looking for, unlike traditional advertising, e.g. in newspapers or on TV. Merchants would spend less on marketing and be able to offer cheaper services or products to end-users.[1]
Possible pricing models include display advertising, paying for the delivery of a targeted visitor to the advertiser's website, and pay-per-click (PPC). In PPC, the advertiser pays according to the number of clicks on the hyperlink. Today's most widespread pricing model is based on the 'click-through rate'. In contrast to 'price-per-click', where the number of user clicks on a specific ad is counted, the 'click-through rate' measures how often an ad prompts a response from users. An advertiser would be prepared to pay more for a click if the click-through rate is high. Part of the success of search-specific advertising is that it allows even small businesses to advertise in the global market, as it can cost less than $5 to open an account.
Similar to the eBay business model, the aim is to capture also parts of the long tail.[2]
An interesting issue is that, in the opinion of some observers,[3] the technology gap between the leader Google and its competitors has narrowed to the point that no significant difference in search quality can be observed amongst the major players, while over the past periods the leader's market share has continued to increase, particularly in Europe. This may indicate that Google is moving into an attractor position in which the search engine's exposure to large audiences attracts more advertisers, who generate more money to provide more services to enlarge the audience.
According to the Interactive Advertising Bureau (IAB) and PricewaterhouseCoopers, internet marketing spending in the US totalled $16.9 billion in 2006,[4] and advertising revenues were nearly $10 billion for the first six months of 2007 (an increase of nearly 27% over the same period of 2006). De facto, internet advertising revenues have grown at double-digit rates over the past years. More importantly, the biggest share of the online advertising business is in the hands of search engine providers. Search advertising formats have a 41% share, followed by display (rich media, banners, display ads, sponsorships and slotting fees) with 31% and classified ads with 17%.[5] The world market for search-related advertising is estimated to rise to over $8 billion for 2007, up from $7 billion in 2006.[6] These estimates may turn out even higher in view of the recent acquisitions of online advertisement firms by search engine providers. In particular, in spring 2007 Google bought DoubleClick for $3.2 billion, Yahoo! bought RightMedia for $680 million and Microsoft bought aQuantive for $6 billion. The huge sums spent on these acquisitions seem to reflect the search engine providers' optimism regarding online advertising as an expanding market.
This optimism seems to be shared by Nielsen/NetRatings, which reports that the number of online ad campaigns increased by 35% in the period April 2006 to April 2007. In addition, search ads and display ads, when combined intelligently, may enhance each other. The role of brand awareness in how users respond to search ads is also gaining attention. Yahoo claims that consumers are more likely to click on a search ad if they have already been exposed to some brand-building banner advertising from the same company.[7]
User-generated, user-complemented and user-volunteered content is taken by the search engines at no direct cost from the IPR owner. This content also includes collective property generated by social networks, like metadata generation through file tagging or data sorting. In exchange, these companies provide servers, software and a set of rules enabling users to share content, while the providers generate revenue through advertising. Value therefore derives both from search engine providers and from the users. In the literature there is a discussion of whether this is equally fair for both parties and whether it is a sustainable business model in the long term. Given that owners of high-value content are reluctant to place their content on the web, alternative business models that better suit the interests of content owners may appear in the future.
[1] 'The Good, the Bad and the Ugly of the Search Business', Kamal Jain, Microsoft Research
[2] All major search engines offer advertising models similar to Google's, including Microsoft Ad Center and Yahoo Search Marketing. Google's ad programs are called AdSense and AdWords. Website operators enrol in the AdSense program to enable text, image and video advertisements on their sites. Google administers these ads and generates revenue on a PPC basis. AdSense has become popular because these ads are less intrusive than most banners and the keyword-based concept makes the ad content relevant to the website.
In the auction-based advertising programme AdWords, specific keywords can be auctioned for a specific time period. Whenever a user types this keyword into the search, the ad will be displayed in the results list as a sponsored link.
[3] 'The Good, the Bad and the Ugly of the Search Business', Kamal Jain, Microsoft Research
[6] "Wikipedians Promise New Search Engine", 16 March 2007, http://www.technologyreview.com/Biztech/18394/page1/?a=f
[7] "Search Advertising", Financial Times, 11th July 2007
1.2.1.2. The Web Search Engines Landscape
Web search is, after sending emails, the second most popular activity on the internet. For example, 85.9% of German internet users use search engines, slightly fewer than the 86.1% who send emails and far more than for any other activity, like reading newspapers online, chatting or participating in social networks.[8] Currently close to a hundred search engines are operational,[9] but the bulk of the searches are performed by only a few service providers. According to the consultancy firm Nielsen/NetRatings, the first three operators control more than eighty percent of the market. In particular, online searches performed in the US in August 2007 were executed by Google (53.6%), Yahoo! (19.9%), MSN (12.9%), AOL (5.6%), Ask (1.7%) and the rest (6.3%). These searches include local searches, image searches, news searches, shopping searches and other types of vertical search activity. More than 5.6 billion searches were carried out in that month alone (August 2007).[10] The ranking of the top three players is undisputed. According to comScore Networks, in December 2006 Google sites captured 47.3% of the U.S. search market, Yahoo! 28.5% and Microsoft 10.5%. Americans conducted 6.7 billion searches in December 2006. Compared with the same month a year earlier, this represents an annual growth rate in search query volume of 30%.
This growth rate is considerable and explains the high expectations for online advertising on search engines as a promising growth market.
Google is the uncontested leader in web search and advertising revenue. Yahoo!, which faced a notable decline some time ago, appears to be slowly regaining some popularity. Some experts believe that this popularity is due to the new advertising strategy and the success of some recently launched services, such as Yahoo! Answers. MSN appears to be in a slow but constant decline. All other search engines are far from the top three. A comparison amongst the three companies is not easy, as some interesting data, such as margins, are not publicly available. Moreover, financial data about MSN Live are embedded in the overall Microsoft accounts. For the sake of simplicity, let us compare Google and Yahoo! as of autumn 2007. Google had a market capitalization of $152.79 billion and 5,680 employees, generating revenue of $9.32 billion and a net income of $2.42 billion. Yahoo!, for its part, had a market capitalization of $39.36 billion, 9,800 employees, $6.22 billion in revenues and $1.17 billion in net income (for an overview see Chapter 6.4.1). Google's revenues and earnings have been sky-rocketing over the past three years; Yahoo!'s earnings have also been increasing, but to a lesser extent.
European internet users make as massive use of search engines as their counterparts on the other side of the Atlantic. The intensive use of search engines explains why they are amongst the most visited pages on the internet and attract a lot of traffic. Google is the most visited search engine in practically all countries of the European Union. For instance, in June 2007 Google reached 88.8% of the UK, 69.5% of the French and 69% of the German online population. This reach is notably higher than that of the Microsoft sites (83.3% UK, 62.3% France, 54% Germany) and Yahoo!
(65.9% UK, 39.6% France and 36% Germany), according to the internet audience measurement company comScore.[11] The consolidation of the search engine market becomes evident when observing the evolution of hits over longer time periods. Figure 3 and Figure 4 show the evolution of the shares for Germany and France, respectively. The evolution in Germany and France is similar to that in other European Member States. In particular, fewer than a handful of search engine providers have a combined market share of over ninety percent, with Google being much bigger than its followers.
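The degree of concentration discussed in this section can be made concrete with a simple k-firm concentration ratio, i.e. the combined share of the k largest providers. The sketch below uses the US search shares for August 2007 quoted earlier in this section (Nielsen/NetRatings); the concentration ratio is a standard descriptive measure that we add here purely for illustration, and the variable and function names are hypothetical:

```python
# US search shares for August 2007 as quoted in the text (percent of searches).
US_SHARES_AUG_2007 = {
    "Google": 53.6,
    "Yahoo!": 19.9,
    "MSN": 12.9,
    "AOL": 5.6,
    "Ask": 1.7,
}
REST = 6.3  # all remaining engines combined; not a single provider


def concentration_ratio(shares, k):
    """Combined share (in percent) of the k largest individual providers."""
    largest = sorted(shares.values(), reverse=True)[:k]
    return round(sum(largest), 1)


if __name__ == "__main__":
    print(concentration_ratio(US_SHARES_AUG_2007, 3))
```

For these figures the three-firm ratio comes to 86.4%, consistent with the statement that the first three operators control more than eighty percent of the market.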
Figure 3: Evolution of WebHits for search engines in Germany in the period 2001 to 2007. Source: WebBarometer,[1] [Speck 2007] and own calculations
Figure 4: Evolution of WebHits for search engines in France in the period October 2001 to September 2007. Source: Baromètre Secrets2Moteurs[1] and own calculations
These data highlight that the market of search engine providers is highly concentrated and that the use of search engines has penetrated our lives. The average German, for instance, uses Google more than forty times a month,[1] and three quarters of internet users reach internet offers through search engines.[2] Although the traffic amongst the search engine providers may vary from one country to another, the user experience is similar for most western countries (see Chapter 6.2.1.2).
Consultancy firms metering the market share, such as Nielsen/NetRatings, Compete, Hitwise or comScore, retrieve data on search behaviour by installing real-time meters on the computers of web surfers (Nielsen states 500,000 people worldwide). The market shares retrieved by these consultancy firms may differ to a certain extent for each of the search engines, because the firms employ different metrics for measurement and the accuracy of the data is not sufficiently clear. This may partially explain why comScore's traffic data for Germany and France differ from the hit counts by WebHits.de (Germany) and Secrets2Moteurs.fr (France). Although the measurement method is not standardized and values may vary amongst consultancy firms, there is consistency with regard to the search engines' top rankings and long-term trends.
How the internet audience is measured is, however, not an academic curiosity. Small differences in market shares matter and have implications for business decisions. Page views, a widely used audience measure that advertisers use to decide where to spend their money, are becoming less significant amid the growing use of audio and video on the internet and the ability of websites to update content automatically. Nielsen's methodology is to add 'total minutes' and 'total sessions' information to better measure the degree to which websites engage their users. In this way, Nielsen expects to measure the use of websites more adequately.
The 'Interactive Advertising Bureau', which represents many of the biggest online publishers, is putting together guidelines with definitions of unique users, time spent and other online measures.[3]
The concentration of the web search engine market appears to be a general trend in the USA and most EU Member States. Why Google is far more dominant in Europe than in the USA may have multiple reasons, including national marketing strategies, better adaptation to market size, better technological adaptation to language, lack of powerful national search engines, etc. An interesting case, although outside the European Union, is Russia, where Google is only third by market share after Yandex and Rambler (see Figure below). Yandex claims to have a superior technology, as it masters the declensions and conjugations of the Russian language better than other search engines. Other Slavic search engines, like the Czech Morfeo or the Polish NetSprint, also claim on their corporate web sites to have an advantageous technology.
[1] comScore German data June 2007
[2] Internetverbreitung in Deutschland: Potenzial vorerst ausgeschöpft? Birgit van Eimeren, Heinz Gerhards and Beate Frees, Media Perspektiven, Vol 8, pages 350-370
[3] "Search Advertising" Financial Times, 11th July 2007
How much of Yandex's high market share of over 55% in Russia can be explained by better linguistic performance is, however, not obvious, as the same search engine provider achieves only 16% in Ukraine, although Russian and Ukrainian are linguistically very close.
One factor that has certainly favoured Google's dominant position is the rate at which innovative services have been introduced. Many of these have been offered to the audience at the development phase (beta versions), rather than as finished services. This user involvement in the development stage is part of the company's culture of learning-by-doing. The company may have benefited from using English, the dominant language of the internet, when testing services and applications in the huge Anglo-Saxon environment before introducing and adapting them to other cultures. The question arises of how European cultural diversity may be turned into an advantage.
1.2.2. Issues with the Advertising Model
As advertising is the business model of all major search engines, some of the threats and challenges, like conflicting interests between actors in the field, have commonalities with its traditional counterpart. Other issues, however, arise from the auctioning model, which is gaining dominance in the internet world.
1.2.2.1. Conflicting Interests
When a merchant subscribes to an ad programme for a given keyword, it is advisable that the sponsored list does not conflict with the organic result of the query. For instance, if a merchant bids for the term 'cell phone', it is not in its interest that the sponsored link appears in response to an 'adverse' query like 'cell phone radiation danger'. A search engine may choose not to show links conflicting with the advertiser. The potential conflict is between the user and the advertiser, and in practice it causes few problems because there is no financial conflict between the two.
The nature of the issue changes when the conflict of interest has financial implications, as in the following case. Every search engine provider is aware that if a merchant does not appear on the search engine result list, then it does, de facto, not exist on the web at all. The search engine may be motivated to intentionally decrease the quality of the search results for the commercial category in order to force merchants to buy advertisements.[1] This would have considerable negative consequences for advertisers and users. Bid prices would keep increasing to the point where only those merchants with large marketing budgets would appear, while less powerful merchants, no matter how good, would not be presented to the audience. At the extreme, only powerful merchants who sell at inflated prices could afford the large marketing budgets. The injured parties in such a scenario are not only the companies that would pay excessive prices for advertising, but also the users, who would in the end have to bear the costs of excessive advertising through the price of the products they buy. A problem is that there is no way to identify whether search engines intentionally decrease the quality of search for the 'commercial' category. Such an abuse of the search mechanism would be even more extreme in the case of a monopolistic position of a search engine, in which users would hardly have a possibility to change provider. Unfortunately, given that the search algorithm is not public, there is no easy way to check whether the quality of search engines for the 'commercial' category has been intentionally decreased to force merchants to buy advertisements.
[1] 'The Good, the Bad and the Ugly of the Search Business', Kamal Jain, Microsoft Research
1.2.2.2. The content quality problem
The biggest asset of a conventional library is not its index (although it is very important), but the books available in the library. Compared with a library, the relative commercial value of index and content seems to be somewhat inverted in the internet environment. While search engines are highly profitable, many content owners make little or no money. Nor do search engine companies share the revenue from the ads on the index directly with the content owners. User-generated, user-complemented and user-volunteered content is taken by the search engine providers at no direct cost. Search engines also take other valuable goods for free, including personal information and file meta-data generated by community file tagging and data sorting. But search engines do not only take; they also give. They provide users, at no direct cost, with server capacity (storage, processing power, etc.), software and a set of rules that enable users to share content. Value therefore derives both from search engine providers and from users. This interplay has certainly increased the amount of content stored on the internet, but how much it has contributed to the quality of that content is less clear. As content owners are often not direct beneficiaries of their intellectual property, many IPR holders choose not to upload quality content to the internet.
If Europe wants to shift more quickly towards a knowledge-based economy, it would be advisable to improve not only the quantity but also the quality of content on the web. The current model has been successful, and it may become even more so, provided the potential quality problem does not become a limitation. More generally, it would be worth reflecting on how the internet economy could best share benefits amongst its stakeholders. Although this may be too ambitious an undertaking, the search engine market may be an important case for studying possible models. Some concepts that were proposed in the past but could not be implemented at the time could be reassessed under current market conditions and technological possibilities. By way of illustration we may cite Laudon's proposal to establish a (national) 'information market for property rights of individuals'.[1] Laudon explored the idea that individuals may sell their own property rights in personal information on markets. As Laudon emphasized in 1996, there is already a large market in personal information, but the property rights are held by those who collect and compile information about individuals and not by the individuals themselves. These third parties buy and sell information that can impose costs on those individuals, without the individuals being directly involved in the transactions. Laudon proposed that pieces of individual information could be aggregated into bundles that would be leased on a public market, which he refers to as the National Information Market. For instance, a person might offer information about himself to a company that aggregates it with that of other persons with similar demographic and marketing characteristics.
Groups of this kind could be targeted as “youngster, male, interested in online computer games” or “30-40 year old males looking for family cars in Andalusia”.[2] Search engines and other companies that wanted to make use of such group information could purchase rights to use these mailing lists for limited periods of time. The payments they made would flow back to the individuals as “dividends”. Individuals who found the annoyance cost of being on such lists greater than the financial compensation could remove their names. Individuals who felt appropriately compensated would remain on the list. Although many practical details would need to be solved to implement Laudon's market, it is important to recognize that information about individuals is already commonly bought and sold by third parties in market-like environments.[3] Such a national or EU-wide information market might help individuals gain an economic stake, which they currently lack, in the transactions that concern them. In addition, it would be worthwhile investigating other policy options to support the generation of content. For instance, a kind of web yellow pages could be encouraged, providing a catalogue of companies with website addresses and a topic hierarchy; ideally the list would organise services within a proper ontology. This list might be contributed by companies during registration or fed from the databases of governmental bodies. Another policy measure could be to push for standards for web services for local transport and mapping, so that citizens may take advantage of them in future mobile applications.
[1] “Markets and Privacy”, Kenneth C. Laudon, Communications of the ACM 39(9), 92-104, 1996
[2] It is worth observing that the Fair Information Practices Principles would automatically be implemented if the property rights in individual information resided solely with individuals: secret information archives would be illegal, individuals could demand the right of review before allowing information about themselves to be used, and those who wanted to utilise individuals' information would have to explicitly request that right from the individual in question or from an agent acting on his or her behalf.
1.2.2.3. Self-bidding and Click-Fraud
Search algorithms are well-kept secrets and will remain so, because this assures companies a competitive advantage. Secrecy also prevents parties with vested interests from manipulating the search engine's advertising results in their favour, as they could if they knew the details of the algorithms used to rank the results. At the same time, non-transparent auction systems carry an inherent risk of fraud through self-bidding. If an eBay seller bids on its own listings through a proxy account, eBay considers this fraud. Similarly, self-bidding would also be possible in the search engine domain, but difficult to prove because of the complex auction system, which some observers consider to be opaque. The opacity results from not revealing the exact terms under which the auction bid is awarded. When bidding for a keyword, the price is an important criterion, but not the only one. Moreover, Google Checkout customers get a discount of about 20% on Google AdWords. This inflates the bids of the bidders who receive the discount. If such a bidder does not win the top slot, then other advertisers end up paying the Google Checkout subsidy instead of Google itself, which thus benefits in two ways.[1] Another important issue is 'click-fraud'. Search engine companies sell specific keywords to advertisers. When a user's search contains such a keyword, a link to the advertiser is displayed on the results page. The advertiser then pays the search engine company a fixed amount for each user who clicks on the advertiser's link. This has given rise to the so-called 'click-fraud' phenomenon, whereby a person, automated script, or computer program repeatedly clicks on a competitor's advertisements in order to drive up the advertising costs paid by that competitor.[2] The average price per click for popular keywords is in the order of $1.70, and in some rare cases it can rise as high as $50.
It is estimated that click fraud generates losses as high as $3.8 billion annually.[3] With regard to click fraud, search engines have a dual role as advertising networks and as publishers on their own search engines. A search engine loses money to undetected click fraud when it pays out to the publisher. In turn, it generates revenue when it collects from the advertiser. It is believed, but not proven, that a search engine collects more than it pays out, so that click fraud indirectly benefits search engines.
[3] "Click Fraud Looms as Search Engine Threat", Michael Liedtke, Associated Press, 11 February 2005
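The two cash flows described above can be made concrete with a small calculation. The following Python sketch is only an illustration under stated assumptions: the function name, the number of fraudulent clicks and the 70% publisher share are hypothetical and do not reflect any search engine's actual terms; only the $1.70 price per click is taken from the text. Amounts are kept in integer cents to avoid rounding issues.

```python
def click_fraud_cash_flows(fraud_clicks, price_per_click_cents, publisher_share_pct):
    """Cash flows caused by undetected fraudulent clicks, in cents.

    Returns a tuple (advertiser_cost, publisher_payout, network_margin).
    publisher_share_pct is a hypothetical revenue share (0-100) that the
    advertising network passes on to the publisher showing the ad.
    """
    if not 0 <= publisher_share_pct <= 100:
        raise ValueError("publisher_share_pct must be between 0 and 100")
    # What the advertiser is billed for the fraudulent clicks.
    advertiser_cost = fraud_clicks * price_per_click_cents
    # What the network pays out to the publisher.
    publisher_payout = advertiser_cost * publisher_share_pct // 100
    # What the network keeps: collected minus paid out.
    network_margin = advertiser_cost - publisher_payout
    return advertiser_cost, publisher_payout, network_margin


# Assumed example: 1,000 fraudulent clicks at $1.70 with a 70% publisher share.
cost, payout, margin = click_fraud_cash_flows(1000, 170, 70)
print(cost / 100, payout / 100, margin / 100)
```

Under these assumed figures the advertiser is billed $1,700, the publisher receives $1,190 and the network retains $510. Whether the real balance is positive for search engines remains, as noted, unproven.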
1.2.3. Adjacent Markets
Web search engines are economic drivers whose technology and business have given rise to adjacent markets. The dynamic sector of search engine optimisation is a direct spill-over from the web search sector, and the technology is also attractive for enterprise search solutions and future mobile search.
1.2.3.1. Search Engine Optimization
Search Engine Optimisation (SEO) is a trend that has generated considerable dynamism and is possibly the biggest side market around the main search engine landscape. SEO aims at improving both the volume and the quality of traffic to a web site from search engines by giving the site a better chance of being ranked highly in the search results. SEO can target image search, local search, and industry-specific vertical search engines. Common to all is that, to increase a site's relevance, SEO needs to consider how search algorithms work and what people search for. Search engine providers publish guidelines on how to design a site's coding and structure so that search engine crawlers can spider and index the site efficiently. Apart from these 'legal' ways of optimizing websites for ranking, some SEO practitioners also use spamdexing techniques. Spamdexing, or so-called black hat methods (examples include link farms and keyword stuffing), aims at increasing a site's ranking at the expense of the search engine users' experience, as users may be directed to less relevant sites. Sites employing these techniques may therefore be removed from the search engine listings. Some marketing experts report that people are increasingly ignoring conventional online advertising.[1] Therefore, considerable effort is spent, in terms of manpower and investment, to make advertising more effective. This has boosted the SEO area. Being ranked high in the organic results list and paying to appear in the sponsored list are two distinct ways for merchants to gain visibility.
The fact that many merchants spend considerable amounts on SEO, rather than spending directly on advertising, may indicate that they consider it a necessary (and possibly the better) option. One reason may be that users perceive the organic results list as 'neutral' and closer to their interests. This may give rise to a kind of economic discrimination, since the richest providers are in a position to put more money into SEO techniques than financially weaker ones, and are consequently more likely to get a return on investment. The increased level of sophistication in search marketing has also raised the barrier to entry for new entrants. These entry costs include high costs for the technology and the (outside) professional support needed to manage online campaigns.[2] This explains why search engine optimization is an expanding market, worth $1.5 billion worldwide in 2005 according to Forrester Research. By 2010, European marketers will spend almost €3bn on search marketing, up from €856m in 2004.[3] The SEO market is very fragmented, and the companies active in this sector are generally small but specialised enterprises. Search algorithms are well-kept secrets, also with the aim of preventing potential spammers from manipulating the ranking of the query results. Also undisclosed is the way the auctioning systems work. In the auction system, search engines use, apart from the price, a number of other factors when deciding which ads are ranked in the sponsored list. As the parameters of the auctions are undisclosed, an advertiser whose ad loses does not know the reason and cannot learn how to optimize better. The advertiser can hardly determine how search engines rate adverts in their systems. The search engines' undisclosed qualitative assessments are basically the root of the 'opaque' search engine optimisation business.
1.2.3.2. Business Search Solutions
In the past, companies have invested heavily in IT infrastructure, and in particular in hardware for information storage and handling. They have gathered the necessary resources and technologies to capture, store and transfer the information the enterprise needs for its operation. A remaining bottleneck is providing employees with a consolidated, user-centred view that eases their jobs and makes them more efficient. This shift from a basically storage-oriented infrastructure to one oriented towards information consumption goes along with a user-centric model rather than a technology-based one. Providing an efficient, interactive and secure way to present user-specific content is complex, because it has to take into account different operating systems, file formats, schemas, etc. Therefore, tailored search solutions for businesses and enterprises are an emerging field. The aim is to identify specific content across the enterprise and enable it to be indexed, searched, and displayed to authorized users. According to a study by the consulting firm IDC, the worldwide market for enterprise search and retrieval software was $976m in 2005, a growth of 32% with respect to the previous year. This sector is notably smaller than the aforementioned web search advertisement market. The three big players, Google, Yahoo! and Microsoft, have some activity in the field, but their revenues from licensing technology are minor. Nevertheless, business search solutions may be an interesting case study for Europe, as many of the key players are European, including FAST (Norway), Autonomy (United Kingdom) and Expert System SpA (Italy). For more company information see Chapter 6.4.4. Some of their products include knowledge management modules on top of the search function. In this way, they aim to uncover meaning from any enterprise information, including documents, emails, entries in relational databases, etc.
Today, the market for 'knowledge management' tools is very distinct from the web search engine market. The more search engines move from text-based search to audio-visual search, the more the technological interests will overlap, as both need to develop solutions for conceptual search, document classification, text mining, and information analysis and correlation. This may drive current web search engines to penetrate further into the 'knowledge management' market. The fact that Microsoft has made an offer to buy FAST may be an indicator of this trend.[4]
1.2.3.3. Mobile Search
Mobile search refers to information retrieval services accessible through mobile devices like phones or PDAs.[5] European telecom operators provide some search options for their 2G, 2.5G and 3G services. For this, telecom operators rely on technology provided by companies like Google or FAST;[6] alternatively, users can access the URL of search engines offering a dedicated interface for handheld devices, like MetaGer.[7] Although still in its infancy, the mobile search market is likely to differ significantly from the web search engine market. The technological context (e.g. small screens, limited bandwidth), the reduced amount of content suitable for mobile devices, the role of the market players (e.g. telecom operators, as providers of mobile internet access, have a more powerful role than internet service providers have for access via a computer) and the user behaviour (e.g. the type of search requested on the move) might call for a different search engine business model. Walled gardens seem to be the currently prevailing model, but the market may become more open in the future. There are discussions on whether flat-rate pricing is possible or whether bandwidth restrictions will force payment per bit downloaded. This would make a difference not only for bandwidth-intensive downloads such as video (e.g. there may be pricing of video by resolution), but would also have implications for location-based services, which are regarded as very promising and would allow us to find the nearest restaurant by typing the question into, or simply speaking into, our mobile telephone.
[1] "Internet advertising: Is anybody watching?", Xavier Drèze and François-Xavier Hussherr, Journal of Interactive Marketing, Vol. 17, No. 4, p. 8
[2] "Search Advertising", Financial Times, 11th July 2007
[4] http://www.01net.com/editorial/368946/microsoft-s-achete-la-place-de-numero-un-de-la-recherche-en-entreprise/
[5] Although laptops are mobile devices, they are not considered within this category, as their technical characteristics, in terms of accessing and displaying audio-visual content, are more similar to those of PCs than to those of mobile telephones or PDAs.
Using search engines is the second most common activity on the internet, preceded only by sending emails. 85.9% of all German internet users make queries with search engines, slightly fewer than the 86.1% who send emails,[1] and more than for any other activity, like reading newspapers, chatting or participating in social networks. Citizens in other European countries use search engines similarly often. The intensive use of search engines explains why they are amongst the most visited pages on the internet and attract a lot of traffic. Google is the most visited property in most countries of the European Union. For instance, in June 2007 Google reached 88.8% of the UK, 69.5% of the French and 69% of the German online population. This audience is notably higher than that of the Microsoft sites (83.3% UK, 62.3% France, 54% Germany) and Yahoo! (65.9% UK, 39.6% France and 36% Germany), according to the internet audience measurement company comScore.[2] These figures highlight that the search engine market is highly concentrated and that the use of search engines has penetrated our lives. The average German, for instance, uses Google more than forty times a month[3] and three quarters of internet users reach internet offers through search engines.[4] Although the traffic amongst the search engine providers may vary from one country to another, the user experience is similar for most western countries.
The user experience and behaviour have been analysed in a recent study, whose main messages are presented in the following chapter.
1.3.1.1. User behaviour patterns
In recent telephone interviews with about 2200 adults, the Pew Internet & American Life Project investigated internet user behaviour with regard to the use of search engines. They conclude that the average user in the USA is content, dependent and naïve.[5] Their survey found that 84% of internet users have used search engines and that 56% of them use search engines on any given day. These data are in line with the analyses of major consulting firms measuring internet data traffic (see previous chapter). Also interesting is the high level of dependency on search engines as perceived by the users. 35% of searchers use a search engine daily and 47% use one once a week. Interestingly, 32% consider themselves "addicts" and say they cannot live without search engines. The dependency is also concentrated on a few providers. 44% of searchers say they regularly use a single search engine, 48% use just two or three search engines and only 7% use more than three. This partially explains the market concentration around Google, Yahoo! and MS Live Search. One explanation for why users are loyal to a few search engines is that internet users are generally very positive about their online search experiences. In particular, 87% of internet users say they have successful search experiences most of the time. More worrying, however, is the fact that 68% of users say that search engines are a fair and unbiased source of information (while only 19% say they do not place that trust in search engines).[6][7] Most users may be naïve about search engines or simply do not fully realize that, and how, search engines make money. Indeed, many of the users interviewed did not realize that search engines make money through advertising.
While practically all interviewees can distinguish between regular programming and infomercials on TV, only slightly more than a third of search engine users are aware of the difference between the paid or sponsored results, on the one hand, and the unpaid or 'organic' results, on the other, presented by search engines. Overall, only about one in six searchers say they can consistently distinguish between paid and unpaid results.[8] The distribution of searchers by gender and age largely follows the pattern of internet users. Generally speaking, men and younger users are more plugged into the world of search than women and older users. In earlier times, when the internet was dominated by young men, two of the most popular search topics were sex[9] and technology. Nowadays, the search landscape has changed because of the demographic broadening of the internet user population, their more diverse interests and the huge growth of online content. A recent study examining search trends finds that the proportion of searches for sex and pornography in particular has declined since 1997, while searches on the tamer topics of commerce and information have grown.[10]
1.3.1.2. Product Search
Practically all internet users search online for products. In Germany alone, 37.5 million users have used this medium to get information about products; this is 97.3% of the online population.[11] The motivation is to prepare the purchase of products, whether in the traditional way or over the internet. More than half of internet users search for information about flight and train tickets (58.9%), holiday planning and last-minute offers (57.8%), books (56.6%), hotels (54.1%), tickets for cinema, theatre or other events (52.9%) and cars (52.3%), followed by music CDs (49.0%), telecommunication products (48.9%), and DVDs and videos (39.9%). How many searches finally materialize into purchases depends on the sector and the specificity of the products.
For instance, books have a conversion rate of 70%, while cars achieve barely 16%. In most cases the initial search to buy a product starts at a search engine, which points to the service provider offering the product sought. Internet users have traditionally performed product comparison on specialized sites like Billiger[12], mySimon[13], Bonprix[14] and Pricegrabber[15], which use software agents to gather and compare product price information. Search engine providers are also entering this domain, with Yahoo! Shopping or, more recently, Google Product Search. Given their huge indexes and their expertise in search technology, it is a natural market for them.
1.3.1.3. Vertical Search Engines
General purpose search engines, such as Google or Yahoo!, are very effective when users search for web sites, web pages, or general information. For search within a specific medium or in specific content categories, specialized search engines perform better. Users are increasingly using these so-called vertical search engines for search in specific categories or media. Examples of category-focused vertical search engines include search engines for shopping (e.g. Froogle or NexTag), government (e.g. searchgov.com), legal matters (e.g. law.com and lawcrawler), travel (e.g. travelocity and Expedia), finance (e.g. business.com and Hoovers), or business (e.g. knuru). Media-focused search engines, on the other hand, focus on search within specific online media. These search engines are used for discussion boards, forums, groups, or answer pages (e.g. Omgili and board-tracker), for scanning news worldwide (e.g. bincrawler, Google groups, knuru), for searching the blogosphere (e.g. Technorati, knuru, and Blog-search-engine), for search in mailing lists (e.g. E-Zine List), or for search in chat rooms (e.g. Chatsearch, Search IRC). A more detailed compendium of search tools is given in the annex.
Specialization also goes along with personalized search. A continuously personalized experience for each user is a core driver for search engines. This applies to any type of search engine, but may become the key differentiating factor for vertical search engines. The user experience is key, irrespective of whether a job seeker is looking for new employment, a client is looking for an integrated travel package, a television viewer is selecting the right news segments or a shopkeeper is advising on the best accessory. One asset is interactivity with the search medium to enhance the search experience. This may change the way we search. For instance, the large proliferation of video may open the possibility of video syndication (similar to netvibes), where new pieces of work may result from picking video fragments and recompiling them in a creative way. Two phenomena seem to occur at the same time. One is the emergence of specialised search engines in different domains. The other is a consolidation of general purpose search engines, triggered by the fact that few search engines can effectively compete in the tough advertising market. These phenomena are not necessarily mutually exclusive. General purpose engines could introduce features (e.g. directories or separate tools) that also cover specialized areas.
[4] 'Internetverbreitung in Deutschland: Potenzial vorerst ausgeschöpft?', Birgit van Eimeren, Heinz Gerhards and Beate Frees, Media Perspektiven, Vol 8, pages 350-370
[5] 'Search Engine Users: Internet searchers are confident, satisfied and trusting – but they are also unaware and naïve.', D. Fallows, 2005, Pew Internet & American Life Project
[6] 'Search Engine Users: Internet searchers are confident, satisfied and trusting – but they are also unaware and naïve.', D. Fallows, 2005, Pew Internet & American Life Project
[7] It seems that there is a growing lack of trust in news media, and Americans believe that news organisations are biased. See 'Voters Believe Media Bias is Very Real', Zogby Poll, 14th March, www.zogby.com
[8] 'Search Engine Users: Internet searchers are confident, satisfied and trusting – but they are also unaware and naïve.', D. Fallows, 2005, Pew Internet & American Life Project
[9] All-time hits include searches for attractive celebrities. Britney Spears and Pamela Anderson have been on the Lycos top 50 for 277 weeks in a row.
[10] 'Web Search: Public Searching of the Web', Amanda Spink and Bernard J. Jansen, Springer Publishers, 2004
1.3.2.1. Communities developing Search Engines
The term Web 2.0 refers to a second generation of web-based communities and hosted services which aim to facilitate collaboration and sharing between users. Examples of such collaborative services are social-networking sites, wikis and folksonomies. Basically, there are two facets of search engines within the Web 2.0 context: the first is what the web-based community can do for search engines, and the second is what search engines can offer (future) Web 2.0 applications. Chris Sherman clusters these applications into different categories, namely shared bookmarks and web pages;[1] tag engines, tagging and searching blogs and RSS feeds;[2] collaborative directories;[3] personalized verticals or collaborative search engines;[4] collaborative harvesters;[5] and social Q&A sites.[6] Of particular interest are those projects and services constructed and maintained in a sustainable manner by a community of volunteers. One example is the Open Directory Project (ODP), also known as dmoz, a multilingual open content directory.[7] In this collaborative directory, the web is catalogued by a user community, which has established a system for handling, organizing and prioritizing millions of inputs. In a way, these web communities have established an operational 'authority model' for their domains, similar to those of traditional communities, such as the 'impact factor' of academic journals. Some web communities are already getting together to provide personalized search engines that offer results from a user-selected collection of trusted sites on any given topic. Rollyo,[8] for example, does this by searching only those sites that have been chosen by a registered user when carrying out a search query. Eurekster's Swicki[9] is another collaborative search results aggregator, whose concept is to adapt a search engine to the user's own needs. For this, a swicki user has to provide information about the topic of interest by selecting relevant keywords, websites, site search, etc.
Based on click patterns, this information is used to learn which results users like the most and to move them to the top. Over time, user feedback will modify the search queries, as the swicki learns from the behaviour of its users which search results are relevant and which filtering techniques work best for the topic. For their operation, both Rollyo and Eurekster use the Yahoo! index. Recently, Google has also offered the possibility of tailoring the search engine specifically to the needs of users such as non-profit, government, or educational organisations. In the above examples, search engines are personalized through the adaptation of the query algorithm, but they still operate on a server-based network principle. Many bottom-up approaches developed by the web community, however, operate on the principle of decentralised technological resources.[10] Making use of such resources is discussed in the literature, and some beta versions are already being tested. OpenSearch,[11] YaCy[12] and Faroo[13] are examples of search engines currently being tested that operate on peer-to-peer principles. One of the major motivations of web communities for developing a search engine is their fear of being manipulated or censored by dominant search engine providers. Therefore, their technology offers more transparency about the search process and complies with high privacy standards. Most of these collaborative projects follow wiki principles and use open source software or reveal their code.[14] They also intend to use the user's search behaviour patterns for the user's own benefit (rather than for adapting the advertising strategies of the search engine providers). Wikipedia is a successful example of how web communities can collaborate effectively; Wikia Search, which aims to create an open global search engine, is another.
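The click-based learning described for swickis can be sketched as a simple re-ranking step. This is a minimal Python illustration, not Eurekster's actual algorithm, which is not described here: the function name, the shape of the click log (a plain list of clicked URLs) and the example URLs are assumptions made for the sake of the example.

```python
from collections import Counter


def rerank_by_clicks(results, click_log):
    """Return results with the most-clicked URLs first.

    results   -- list of URLs in their original ranking order
    click_log -- list of URLs that users clicked (hypothetical log format)
    URLs with equal click counts keep their original relative order,
    because Python's sorted() is stable.
    """
    clicks = Counter(click_log)  # missing URLs count as zero clicks
    return sorted(results, key=lambda url: -clicks[url])


# Assumed example data: three results and a short click history.
original = ["a.example", "b.example", "c.example"]
log = ["c.example", "b.example", "c.example"]
print(rerank_by_clicks(original, log))
```

A real system would probably also discount old clicks and guard against manipulation of the click log, which this sketch does not attempt.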
1.3.2.2. Communities tagging and filtering audiovisual content
Search engines are at the heart of popular multimedia sites like Wikipedia, Flickr, or YouTube. The steady increase in the creation, storage and interchange of audio-visual material renders search engines even more interesting. Search engines making use of user-generated preferences, like ChaCha or Wikia Search[15] (the project announced in 2006 by the Wikipedia founder with investment backing of over $4 million), are just emerging. Basically, there are two major ways to carry out AV search: content-based search and metadata search. Content-based search is a considerable technological challenge. The EU-funded projects gathered under the umbrella of the CHORUS coordination action offer a good view of the spectrum of scientific challenges. If successful, speech and pattern recognition technologies would be able to automate many search processes. The creation of meta-data can be automated only to a certain extent. Researchers are pursuing the development of software that automatically tags audio-visual content. In spite of these efforts, it seems unlikely that it will be possible to dispense completely with human input. The cognitive abilities and semantic understanding of humans make people hard to replace by machines. In early times, search engines such as Yahoo! or Lycos operated with human-edited directories. Today, practically all leading search engine providers perform search by an automated process (drawing on user behaviour in terms of clicks, popular URLs, and link structure), and manual input is limited. Paying people to add meta-data to audio-visual content is financially unviable at large scale. However, a certain level of human input will always be needed; audio-visual search in particular is likely to depend on humans for as long as tagging is necessary. Here, social networks and web communities emerge as an unexpected ally. In Web 2.0 environments, shared bookmarks and web pages,[16] tag engines, and tagging and searching of blogs and RSS feeds[17] are common.
Members of web communities are very active and provide metadata for free, and this information is largely available to search engine providers. The exploitation of this freely available metadata will be a focus of future search engine providers, who aim to offer a better search experience by ranking results according to what users themselves consider relevant. In addition, audio-visual content on social networks, such as Flickr, could be used as data to train high-level automatic object recognisers for image search.
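How community-supplied tags can drive a metadata search may be illustrated with a minimal sketch. It assumes that tagging data are available as (user, item, tag) triples; the data, the function names `build_tag_index` and `search`, and the ranking rule (number of distinct users who applied the tag) are illustrative assumptions and do not describe any provider's actual method.

```python
from collections import Counter, defaultdict

# Hypothetical (user, item, tag) triples, as a photo-sharing community might
# produce them. The data are invented for illustration only.
TAGGINGS = [
    ("ann", "photo_1", "sunset"),
    ("bob", "photo_1", "sunset"),
    ("bob", "photo_1", "beach"),
    ("cat", "photo_2", "sunset"),
    ("ann", "photo_3", "beach"),
    ("cat", "photo_3", "beach"),
    ("dan", "photo_3", "beach"),
]


def build_tag_index(taggings):
    """Map each tag to a Counter of items, counting distinct users per item."""
    seen = set()
    index = defaultdict(Counter)
    for user, item, tag in taggings:
        key = (user, item, tag.lower())
        if key in seen:  # the same user repeating a tag counts only once
            continue
        seen.add(key)
        index[tag.lower()][item] += 1
    return index


def search(index, query):
    """Return items carrying every query term, most widely tagged first."""
    terms = query.lower().split()
    if not terms:
        return []
    # Keep only items that were tagged with all query terms.
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    # Score = total number of users who applied the query tags to the item.
    scores = {item: sum(index[t][item] for t in terms) for item in candidates}
    return sorted(scores, key=lambda item: (-scores[item], item))


index = build_tag_index(TAGGINGS)
print(search(index, "beach"))         # items tagged "beach", by user agreement
print(search(index, "sunset beach"))  # items carrying both tags
```

The point of the design is that no content analysis of the images is required: relevance is inferred entirely from the agreement among users, which is why the quality of such a search depends on how active the community is.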
[7] See Wade Roush, New Search Tool Uses Human Guides, Technology Review, February 2, 2007, at http://www.techreview.com/Infotech/18132.

1.3.3.1. Profiling of Individuals
Whenever a query is entered, the search engine stores the query and associates it with an IP address and a cookie, from which the user's computer might be identified. The threat is that users can be identified, and their habits, hobbies, beliefs and political views could be monitored. The problem is that many users are too naïve or are not aware of the data stored about them. How far better information campaigns might help to raise awareness is unclear. The popular assumption seems to be that privacy has already been irrevocably eroded,[1] because of some prominent negative experiences.

Recorded search queries can easily be used to identify the searcher, as a prominent America Online (AOL) case shows. On 4 August 2006, AOL released a data file on search queries. It contained 20 million search keywords entered by some 650,000 users over a 3-month period. Each user on this list was assigned a unique sequential key, and the user's search history was compiled under that key. The file did not include any personal information per se, but certain keywords could contain personally identifiable information, such as users typing in their own name, their address, their social security number or other data. Although intended for research purposes only, this data file was widely diffused in the blogosphere and on popular sites. The list got into the hands of New York Times journalists, who tested whether it was possible to identify and locate individuals from the 'anonymous' search records. Shortly after, the New York Times discovered the identity of several searchers simply by cross-referencing the data with phonebooks or other public records. AOL responded to this privacy breach by dismissing some of those responsible.
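Why a sequential key offers so little protection can be shown with a small sketch of the cross-referencing step. Everything in it is an assumption made for illustration: the log entries, the directory, and the matching rule (a name and its town both occurring somewhere in one user's history) are invented and far cruder than what a journalist would actually do.

```python
from collections import defaultdict

# Invented query log in the style described above: (sequential key, query).
QUERY_LOG = [
    (1001, "garden centres in Springfield"),
    (1001, "Jane Example"),
    (1001, "best hiking boots"),
    (1002, "cheap flights"),
    (1002, "weather tomorrow"),
]

# Invented public directory, standing in for a phonebook: (name, town).
DIRECTORY = [
    ("Jane Example", "Springfield"),
    ("John Sample", "Shelbyville"),
]


def search_histories(log):
    """Group all queries under the pseudonymous key they were logged with."""
    histories = defaultdict(list)
    for user_key, query in log:
        histories[user_key].append(query.lower())
    return dict(histories)


def candidate_identities(log, directory):
    """Return, per key, directory names whose name AND town occur in the history."""
    matches = {}
    for user_key, queries in search_histories(log).items():
        hits = [
            name
            for name, town in directory
            if any(name.lower() in q for q in queries)
            and any(town.lower() in q for q in queries)
        ]
        if hits:
            matches[user_key] = hits
    return matches


print(candidate_identities(QUERY_LOG, DIRECTORY))
```

No single query in the toy log identifies anyone. It is the compilation of a history under one key that makes the combination of a name and a place available for matching against public records.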
More importantly, the AOL case demonstrates that data collected by search engines can lead to the identification of the user and can be misused to infringe on the private sphere. In fact, in order to target ads better, search engine providers keep user query data indefinitely without giving users any control over it.[2] Even worse, the stored user information is not limited to search queries. The more additional (non-search) services a search engine offers, the more personal information it can gather and combine. Some examples:

In October 2004, Google introduced Desktop Search, which indexes the content of personal computers, including files, emails or web search tracking (optional). The potential, but also the threat, of such a programme is that it permits personalised search. This may tie users to the software provider's solutions. A battle is starting around search behaviour and its technology.[3]

Google offers a service called "My Search History", which allows users to store and retrieve former searches. Recording search histories over long periods may provide insights into what someone is doing, thinking and interested in. The search engine provider would be able to provide advertisers with far more sophisticated consumer profiles if it maintained a comprehensive database of search histories that can be sorted by individual user.

A danger is that such monitoring may be employed to track and eventually suppress political opponents. Such an erosion of personal liberty is not an implausible scenario, given that major search engine companies have already given in to political pressure in the past, as with the filtering of internet content in China.[4]

1.3.3.2. Censorship
The common perception has been that the internet is an unstoppable force for democratization, a force for liberation that cannot be tamed by local governments. While the internet has undeniably helped citizens gain access to information in many parts of the world, this does not hold everywhere. Some search engines have been accused of censorship. The accusers assume political and economic motivations, as the following examples show.

Yahoo!, Google China, Microsoft, AOL, Baidu and others are accused of having cooperated with the Chinese government in implementing a system of internet censorship in mainland China. In fact, Google's Chinese search engine (www.google.cn) filters information perceived to be harmful by the government of the People's Republic of China, including content relating to the Tiananmen Square protests of 1989, sites supporting the independence movements of Tibet and Taiwan, the Falun Gong movement, or, more recently, the Chinese demonstrations against Japan's recent attempts at revisionist history.[5] J. Zittrain and B. Edelman from Harvard Law School, who study exclusions from search engine results all over the world, report that China is not the only country performing censorship; similar filtering of internet documentation is also practised in Saudi Arabia.[6]

Although China has little economic influence on and no political power over Google, it seems that the US search engine provider has accommodated the wishes of the Chinese government. Some US observers, among them Prof. L. Hinman at the University of San Diego, are worried that Google could eventually be much more strongly influenced by the United States government, which has far greater economic and political impact on Google than the government of China does.[7] In fact, the power of search engines lies in the fact that, owing to their key role for the internet, they may contribute to preventing citizens from accessing certain sites on the internet.
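Filtering of this kind only becomes visible when the results of two editions of the same engine are placed side by side. The following minimal sketch assumes that the top result URLs for one query have already been collected from two country editions; the collection step is not shown, and the URLs and the function name `compare_result_lists` are invented placeholders.

```python
def compare_result_lists(results_a, results_b):
    """Summarise how two ranked result lists for the same query differ."""
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    return {
        # URLs that one edition returns and the other does not, in rank order.
        "only_a": [url for url in results_a if url not in set_b],
        "only_b": [url for url in results_b if url not in set_a],
        "shared": [url for url in results_a if url in set_b],
        # Jaccard overlap: 1.0 means identical sets, 0.0 means disjoint sets.
        "overlap": len(set_a & set_b) / len(union) if union else 1.0,
    }


# Invented placeholder results for the same query in two country editions.
EDITION_A = ["news.example/report", "archive.example/1989", "wiki.example/topic"]
EDITION_B = ["wiki.example/topic", "gov.example/statement"]

summary = compare_result_lists(EDITION_A, EDITION_B)
print(summary["only_a"])   # present in edition A only
print(summary["overlap"])  # share of URLs common to both editions
```

A low overlap does not by itself prove censorship, since editions also differ because of language and local relevance. The comparison merely points to the queries that deserve closer inspection.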
A scenario in which search engines help to keep citizens away from certain sites is a frightening prospect for Europeans, whose values include a maximum of personal liberty and access to uncensored information. Freedom of speech on the web has meaning only if the speech can be communicated to the interested audience.

Given that every search engine may have both an intentional and an unintentional bias, it would be desirable to be able to distinguish between the two. Possibly, software algorithms for detecting unintentional bias could be of help for this purpose. A piece of work towards this aim is CenSEARCHip.[8] This tool explores the differences in the results returned by different countries' versions of the major search engines. Web search and image search functions are available for four national sites (United States, China, France, and Germany) of Google and Yahoo! When the "Image Search" button is clicked, each side of the display shows the images that are returned in the first page of search results by only that country's search engine.

Through agreements with the search engine providers, the Chinese government has successfully restricted its citizens' access to undesired political sites. In addition, the government is very strict with citizens who try to circumvent its internet policies. According to Open Search,[9] "in April 2005, Shi Tao, a journalist working for a Chinese newspaper, was sentenced to 10 years in prison by the Changsha Intermediate People's Court of Hunan Province, China (First trial case no 29), for "providing state secrets to foreign entities". The "secret", as Shi Tao's family claimed, refers to a brief list of censorship orders he sent from a Yahoo! Mail account to the Asia Democracy Forum before the anniversary of Tiananmen Square Incident".

1.3.3.3. Racism and the Protection of Youth
Major search engines apply internal rules of conduct to protect against forbidden information or content that endangers young people. Apart from this industry self-regulation, there is at least one case of industry-government co-regulation in the EU. In Germany, all major search engine providers have subscribed to a code of conduct that obliges them not to display URLs that have been marked as endangering by the German authority for youth protection (BPjM, Federal Department for Media Harmful to Young Persons).[10] The working principle is the following: search engine providers become members of the FSM (Freiwillige Selbstkontrolle Multimedia-Diensteanbieter), a registered association founded in 1997 by e-commerce and web-operating companies and dedicated to the protection of young people and minors. The FSM operates a hotline where any person or organisation may report illegal or harmful web content. The governmental BPjM and the FSM are in close contact and inform members about harmful content; the sites concerned are then blanked out by the members. Content subject to restricted distribution under German law on harm to young people includes sites with explicit incitement to hatred or violence against a group of people (proscribed by criminal law as Volksverhetzung), instructions on how to commit a crime, glorification or trivialization of violence, incitement to racial hatred, and content glorifying war or showing minors in an unnatural or harmful situation.

Although all EU countries have regulations and laws protecting minors, Germany seems to be the only Member State where co-regulation for search engine providers is currently in place. This does not mean, however, that the other countries pay no attention to preventing minors from accessing harmful content. Laws on youth protection and against racism, as well as on freedom of expression, may vary from country to country, which explains differences in search results.
For instance, in many EU Member States, anti-Semitic websites are illegal. Therefore, google.de and google.fr do not list these anti-Semitic sites,[11] while this is not the case in the US. In fact, when querying the term 'jew', several of the top-ranked sites on google.com are anti-Semitic. The Google management is aware of the issue and released a note explaining the company's policy in this respect, noting that anti-Semitic sites do not typically appear in a search for 'jewish people', 'jews' or 'judaism', but only in a search for the singular word 'jew'.[12] This also points to a more general problem, namely that harmful or illegal content may lie hidden and surface after queries on unrelated or naïve terms.

[1] "The future of the internet is not the internet: open communications policy and the future wireless grid(s)", Lee W McKnight, NSF/OECD Workshop, Washington, 31st January 2007
[2] 'The good, the Bad and the Ugly of the search business', Kamal Jain, Microsoft Research
[3] "Google: What it is and what it is not", Michael A Cusumano, Communications of the ACM, Vol 48, p 15 ff., 2005
[4] 'Esse est indicato in Google: Ethical and Political Issues in Search Engines', Lawrence M Hinman, International Review of Information Ethics, Vol 3, p 19, June 2005
[7] 'Esse est indicato in Google: Ethical and Political Issues in Search Engines', Lawrence M Hinman, International Review of Information Ethics, Vol 3, p 19, June 2005
[11] A search for the German expression 'Jude' or 'Juden' or the French 'Juif' delivers millions of entries, but the first pages of top-ranked sites are not anti-Semitic.
[12] www.google.com/explanation.html

Search engines tend to penalize sites when they detect the use of methods that do not conform to their guidelines. Search engine providers can reduce the rankings of such sites or eliminate their listings from the search results completely.
The prominent disputes of the past are a result of opposing interests between search engine providers and search engine optimizers. Search engine providers have argued that by penalizing black sheep they are defending the user's interest in not getting a distorted ranking. One potential threat of the practice of down-ranking sites is that it may be applied arbitrarily, or misused by search engine providers in order to force commercial sites to subscribe to the search engines' advertising programmes.

In February 2006, Google found that BMW's German website had influenced search results to ensure a top ranking when users searched for "used car". BMW's German website, which relies on JavaScript code unsearchable by Google, used text-heavy pages liberally sprinkled with keywords to attract the attention of Google's indexing system. Google considered that spiking doorway pages with keywords did not comply with its guideline not to present different content to search engines than is displayed to users. Google therefore reduced BMW's page rank to zero, ensuring that the car manufacturer's site no longer appeared at the top.[1] Similar accusations of manipulating page ranks have been reported in the past, involving SearchKing,[2] Ricoh Germany, or 'September 11th Truth'.[3] Page rank manipulations are not restricted to Google: Baidu is also said to have decreased the rank of the blogging service Sina after Sina published several negative reports on Baidu.[4]

The aforementioned BMW case could be considered a consequence of the fierce battle for the marketing of sites waged by the search engine optimization (SEO) industry (see also chapter 6.2.3.1). SEO aims at improving site architecture in such a way that search engines can index it well, and at optimizing keyword phrases in the site content. The objective is to get high rankings in search engines' organic results.
SEO makes use of techniques that search engines recommend as part of good design (so-called white hat), but may also use spamdexing techniques that search engines do not approve of (so-called black hat). At first glance this distinction appears clear, but on closer inspection it may be neither easy to implement nor always objective. Search engine providers therefore use automatic as well as manual procedures to counteract abuses. At the same time, this leaves room for search engine providers to commit injustice. Basically, we have to understand that if a merchant does not appear in the search engine ranking, then it does not exist on the web. Search engine providers are aware of this and may be tempted to use it to their advantage. A frightening scenario is that market leaders intentionally decrease the quality of search for the "commercial" category in order to force merchants to subscribe to their advertising programmes.[5] Such an abuse of a dominant position would most likely distort the market, with serious consequences. Merchants would be obliged to increase their bids for advertising. In the long term, only merchants that can afford a large marketing budget might survive.

1.3.5.1. Education and Learning

On the one hand, search engines have become crucial to society because they are used by many millions of people. On the other hand, search engines are owned by private companies whose objective is to make a profit. This creates a tension between the corporate mission of serving the shareholders' interest and the public role of search engines.

Computers have entered not only our homes but also our schools, from which the internet can be accessed. Many EU Member States have programmes that aim to connect schools to the internet, to increase students' IT literacy, and to introduce e-learning programmes, internet-based lifelong learning programmes, etc.[6] In a nutshell, education and learning patterns have changed drastically over the past decade.
The services provided by search engines have become central to education and an indispensable tool for pupils and students. Geographic search on online maps has displaced traditional search in paper atlases. Online reference databases like Wikipedia have displaced traditional encyclopaedias. Bibliographic search in libraries has been replaced by online search tools like Google Scholar. In addition, Google's project to scan books and make them publicly available has been an additional asset for accessing information. Undoubtedly, search engines have greatly contributed to making information available to pupils and students. Today, many students probably search Google far more often than they consult books or go to the library. While free access to information is positive, the concentration of information in a few locations controlled by very few companies bears some risks. One potential risk is manipulation; another is bias. The latter can be voluntary (e.g. by systematic omission) or involuntary.

1.3.5.2. Are Search Engines a Public Good?

The web has become the principal source of research information and news for many people in the developed world. It is an increasingly predominant way to learn about the news of the world. The vast amount of information available on the web would be practically useless without search engines. As search engines are gatekeepers of the web, guiding people to their desired destinations, the question arises of how far search engines fulfil a public responsibility and whether a universal service must be assured. Just as telecommunication providers have to offer a minimal universal service to any citizen requiring it, there is an ongoing discussion on whether access to the internet should also be included in a future universal service. In such a future scenario, it is not unlikely that search engine providers would have to play their part in offering such a universal service.
Defenders of a public good view, like Lucas D. Introna and Helen Nissenbaum, see the web, as a conveyor of information, acquiring the elements of a public good. And the way search engines perform news syndication influences how the news is viewed.[7] For them, the ideal web would facilitate associations and communications that could empower and give voice to those who have traditionally been weaker and ignored. They consider that society needs to protect the public interest against encroaching commercial interests. As a consequence, they call for public support for developing more egalitarian and inclusive search mechanisms and for research into search and meta-search technologies that would increase transparency and access.[8]

[5] 'The good, the Bad and the Ugly of the search business', Kamal Jain, Microsoft Research
[6] "The Future of ICT and Learning in the Knowledge Society", Yves Punie, Marcelino Cabrera, Marc Bogdanowicz, Dieter Zinnbauer, Elena Navajas, April 2006, IPTS publication, EUR Number: 22218 EN, www.jrc.es/publications/pub.cfm?id=1407
[7] 'Esse est indicato in Google: Ethical and Political Issues in Search Engines', Lawrence M Hinman, International Review of Information Ethics, Vol 3, p 19, June 2005
[8] 'Shaping the Web: Why the Politics of Search Engines Matters', Lucas D Introna, Helen Nissenbaum, 2000, The Information Society 16:3, p 169 ff

1.4. Annex: Profiles of Selected Search Engine Providers
1.4.1. Overview
The Pandia website provides an extensive list of tools for searching the internet, which on 18 October 2007 comprised over 100 engines.[1] The list comprises tools for web search, directory search, custom search, local search, search in databases, social search and search in reference material and dictionaries, and is presented in Table 3.
Table 3: List of relevant search engines by area of operation. Source: Pandia (www.pandia.com), accessed 18/10/2007.

There are also several engines for finding audio-visual material; see Table 4. The list does not distinguish between content-based search and metadata search technology. Some engines have been discussed in the body of this document.

Table 4: Audio-visual search engines. Source: Pandia (www.pandia.com), accessed 18/10/2007.

In spite of the large number of search engines, only a few of them are large companies. The concentration effect has been discussed in the economic chapter. By way of illustration, Table 5 compares financial data for the major providers.

| Search Engine Provider | Google | Yahoo! | MSN LiveSearch | Ask, Excite, CitySearch | Baidu |
| Parent Organization | | | Microsoft | IAC Search & Media | |
| Headquarters | USA | USA | USA | USA | China |
| Market Cap | $160.79b | $31.74b | $275.77b | $8.73b | $7.10b |
| Employees | 10,674 | 11,400 | | 16,000 | 3,113 |
| Revenue (ttm) | $13.43b | $6.65b | $51.12b | $6.42b | $157.05m |
| Gross Margin (ttm) | 60.26% | 60.20% | 79.08% | 48.43% | 66.87% |
| EBITDA (ttm) | $5.78b | $2.07b | $20.48b | $946.57m | $68.13m |
| Oper Margins (ttm) | 32.45% | 12.99% | 37.23% | 7.23% | 32.06% |
| Net Income (ttm) | $3.69b | $730.19m | $14.07b | $194.23m | $57.59m |
Table 5: Comparison of the major search engine providers in terms of financial data. Note that the data are for the parent organization or for the search engine provider itself; in the case of Microsoft and IAC, the parent organization also has other important business operations. Source: Yahoo! Finance and company reports.

In the following, business summaries of some selected companies are presented. The information is taken from the companies' own sites or is the information they provide to the financial portal of Yahoo! The world-wide players mentioned below play a considerable business role in most Member States of the European Union. Google is the market leader in all countries we have investigated so far; these include the United Kingdom, France, Germany, the Netherlands, Italy and Spain. Given the global supremacy of US players, we have included Baidu (China), Yandex (Russia) and Rambler (Russia) as examples of champions in their respective markets. Finally, we have included a section with a selection of European companies with interesting technology.

1.4.2. World-wide Players

1.4.2.1. Google (USA)

Business Summary

Google, Inc. provides targeted advertising and Internet search solutions worldwide. It offers intranet solutions via an enterprise search appliance. The company's products and services include Google.com that offers Google Base, which lets content owners submit content that they want to share on Google Web sites; personalized homepage and search; and Google Video and YouTube that lets users find, upload, view, and share video content, as well as Web, image, book, and literature search.
It offers communication, collaboration, and communities, such as Gmail that is Google's Web mail service that comes with built-in Google search technology for searching emails; orkut that enables users to search and connect to other users through networks of trusted friends; Blogger, a Web-based publishing tool that lets people publish to the Web using Weblogs; and Google Docs & Spreadsheets, which allow users to create, view, and edit documents and spreadsheets using a browser. The company also offers Google GEO that offers earth and local maps; Google Labs that tests product prototypes and solicits feedback on how the technology could be used or improved; and Google Mobile that lets people search and view both the mobile Web, consisting of pages created specifically for wireless devices, and the entire Google index, including products like Image Search. In addition, it offers AdWords, an online self-service program that enables advertisers to place text-based ads on Google Web sites; AdSense, a program through which Google distributes its advertisers' ads for display on the Web sites of its Google Network members; and Google Checkout, an online shopping payment processing system for consumers and merchants. Further, the company licenses its Web search technology along with Google AdSense service for search to companies. Google Inc.
1600 Amphitheatre Parkway
Mountain View, CA 94043, USA
Phone: 650-253-0000
Fax: 650-253-0001
Web: www.google.com

1.4.2.2. Yahoo! (USA)

Business Summary

Yahoo! Inc. provides Internet services to users and businesses worldwide. It offers online properties and services to users; and various tools and marketing solutions to businesses. The company's search products include Yahoo! Search, Yahoo! Toolbar, and Yahoo! Search on Mobile, Yahoo! Local, Yahoo! Yellow Pages, and Yahoo! Maps that allow users to navigate the Internet and search for information from their computer or mobile device. It also offers marketplace products that comprise Yahoo! Shopping, Kelkoo, and Yahoo! Auctions for shopping; Yahoo! Real Estate for real estate information; Yahoo! Travel, an online travel research and booking site and Yahoo! FareChase, a travel search engine; Yahoo! Autos to price and compare cars online; and Yahoo! Personals and Yahoo! Personals Premier for online dating. Yahoo! provides information products, such as Yahoo! News that aggregates news stories; Yahoo! Finance that offers financial resources; Yahoo! Food, an online food destination; Yahoo! Tech that offers information on consumer electronics; and Yahoo! Health, a healthcare destination. Its entertainment offerings comprise Yahoo! Sports, Yahoo! Music, Yahoo! Movies and Yahoo! TV, Yahoo! Games, and Yahoo! Kids; communications products include Yahoo! Mail and Yahoo! Messenger with Voice; communities offerings include Yahoo! Communities and Yahoo! Photos; and front door products comprise Yahoo! Front Page and My Yahoo!. In addition, it offers Yahoo! Broadband, Yahoo! Digital Home, Yahoo! Mobile, and Yahoo! PC Desktop to access its content and communities across Internet-enabled devices. Further, it provides Yahoo! HotJobs, an online recruitment solution; Yahoo! Small Business to purchase products on the Internet; and Yahoo! Local that offers businesses a service to post company information. It has strategic partnerships with Seven Network Limited; eBay; AT&T, Inc.; and Verizon Communications, Inc.
The company was founded in 1994 and is headquartered in Sunnyvale, California. Yahoo! Inc.
701 First Avenue
Sunnyvale, CA 94089, USA
Phone: 408-349-3300
Fax: 408-349-3301
Web Site: www.yahoo.com
Employees: 11,400

1.4.2.3. MSN Live Search, Microsoft (USA)

Business Summary

Microsoft Corporation engages in the development, manufacture, licensing, and support of software products for various computing devices worldwide. It operates in three divisions: Platforms and Services, Microsoft Business, and Entertainment and Devices. The Platforms and Services division comprises Client, Server and Tools, and Online Services Business segments. Client segment offers operating systems for servers, personal computers (PCs), and intelligent devices. Server and Tools segment offers Windows Server operating systems. Its Windows Server products include the server platform, operations, security, applications, and collaboration software. It also builds software development lifecycle tools for software architects, developers, testers, and project managers; and provides consulting, and training and certification services. Online Services Business segment provides personal communications services, such as email and instant messaging; and online information offerings, such as MSN Search, MapPoint, and the MSN portals and channels. The Microsoft Business division includes Microsoft Office system of programs, services, and software solutions. It also provides financial management, customer relationship management, supply chain management, and analytics applications. The Entertainment and Devices division offers the Xbox video game system, such as consoles and accessories, third-party games, and games published under the Microsoft brand, as well as Xbox Live operations, research, and sales and support. It provides PC software games, online games, and other devices; and consumer software and hardware products, such as learning products and services, application software for Macintosh computers, and PC peripherals. The division also develops and markets products that extend the Windows platform to mobile devices and embedded devices.
Microsoft was founded in 1975 by William H. Gates III and is headquartered in Redmond, Washington. Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399, USA
Tel +1 425-882-8080
Fax: +1 425-936-7329
Web www.microsoft.com

1.4.2.4. Ask.com, Excite, CitySearch (USA)

Business Summary

IAC is a conglomerate operating more than 60 diversified brands in sectors being transformed by the internet, online and offline. Within internet media and advertising, IAC operates the brands Ask.com, CitySearch, Excite and Evite.

Employees: Approximately 20,000 full-time employees as of December 2006.

IAC/InterActiveCorp
555 West 18th Street
8th Floor
New York, NY 10011
Tel +1 212-314-7390
Fax: +1 212-632-9621
Web www.iac.com

1.4.3. Regional Champions

1.4.3.1. Baidu (China)

Business Summary

Baidu.com, Inc. provides Chinese language Internet search services. Its services enable users to find relevant information online, including Web pages, news, images, and multimedia files through its Web site links. The company offers a Chinese language search platform, which consists of Web sites and certain online application software, as well as Baidu Union, which is a network of third-party Web sites and software applications. Its products include Baidu Web Search that allows users to locate information, products, and services using Chinese language search terms; Baidu Post Bar and Baidu Knows, which provide users with a query-based searchable community; and Baidu News that provides links to an extensive selection of local, national, and international news. The company also offers Baidu MP3 Search that provides algorithm-generated links to songs and other multimedia files provided by Internet content providers; Baidu Image Search, which enables users to search millions of images on the Internet; Baidu Space to create personalized homepages in a query-based searchable community; Baidu Encyclopedia; and other online search products and software tools. Baidu.com designs and delivers its online marketing services to its P4P and tailored solutions customers based on their requirements. The company's auction-based P4P services enable its customers to bid for priority placement of their links in keyword search results. Baidu.com primarily serves small and medium enterprises, large domestic corporations, and Chinese divisions or subsidiaries of large multinational corporations in the e-commerce, information technology services, consumer products, manufacturing, health care, entertainment, education, financial services, and real estate and other industries. The company was founded in 2000 and is headquartered in Beijing, China.

Baidu.com, Inc.
12th Floor Ideal International Plaza
No 58 West-North 4th Ring
Beijing, 100080
Tel: +86 10 8262 1188
Fax: +86 10 8260 7007
Web www.baidu.com

1.4.3.2. Yandex (Russia)

Yandex (Russian: Я́ндекс)[1] is a Russian search engine and one of the biggest Russian web portals. It has been online since 1997. Its name can be explained as "Yet Another iNDEXer" (yandex) or "Языково́й (language) Index". In addition, since the Russian word "Я" corresponds to the English pronoun "I", "Яndex" looks a little like a translation of "Index". According to research studies conducted by Gallup Media, FOM and Comcon, Yandex is the largest resource and the largest search engine on the Russian internet, based on audience size and internet penetration.

Yandex LLC became profitable in November 2002. In 2004, Yandex sales increased to $17M, which was 10 times the company's revenue just two years earlier, in 2002. The company's net income in 2004 was $7M. In June 2006, the weekly revenue of the Yandex.Direct context ads system exceeded $1M. All of Yandex's accounting measures have been audited by Deloitte & Touche since 1999.

The closest competitors of Yandex in the Russian market are Rambler and Mail.ru. Although services like Google and Yahoo! are also used by Russian users and have Russian interfaces, Google accounts for about 21-27% of search-engine-generated traffic to Russian sites, while Yandex accounts for around 42-49% (March 2007). In Ukraine, Yandex has a 16 percent share of search traffic, while Google has a 40 percent share. One of the biggest advantages of Yandex for Russian-language users is that it understands Russian inflection in search queries. In March 2007, Yandex acquired Moikrug.ru, a Russian social networking site for finding and maintaining professional and personal contacts.

Yandex Я́ндекс
Address: 1, building 21, Samokatnaya St.,
Moscow 111033
tel. +7 495 739-70-00,
fax +7 495 739-70-70
[1] From Wikipedia
1.4.3.3. Rambler (Russia)
Rambler Media's main website is Rambler.ru, a leading Russian language internet portal and the oldest one, which combines search with email/communication, community activities, and media and entertainment services. Rambler.ru aggregates best-of-class internet media and services in Russia and enables mass audiences to navigate to specific pages according to their interests.
Rambler Media generates revenues primarily from advertising, which includes banner or display advertising, context display advertising, sponsored keyword searches, e-commerce referral and product placement. Rambler Media incorporates a full-service, wholly-owned advertising agency called Index 20, which is in charge of generating sales from display advertising. In 2005, Rambler Media introduced “sponsored links search” and “context advertising” through Begun (meaning “Runner” in Russian). Begun is one of Russia's leading search and contextual text-based advertising platforms, with a network of over 35,000 individual advertisers and over 50,000 partner distribution sites. Rambler Media has a 25.1% interest in Begun.
Rambler Media has been publicly traded on the AIM market of the London Stock Exchange (LSE: RMG) since June 2005. In December 2006, Prof-Media, one of Russia's largest media holding groups and a major private investor in most sectors of the Russian media market, became Rambler Media's majority shareholder by acquiring approximately 55% of Rambler Media.
Rambler
Leninskaya sloboda, 2
115280, Moscow, Russia
Phone/Fax: +7 (495) 745-3619
E-mail: info@ramblermedia.co
1.4.4.1. Fast (Norway)
Business Summary
FAST's business is Enterprise Search, and has been since we set up our company in Norway back in 1997. We are the market leader in Enterprise Search and number one in revenue growth. We have no debt. We have been profitable, exceeding our projections, in every quarter of the last 4 years. And we have made these profits while investing a quarter of our income back into R&D. Performance like this gives us the freedom to invest in innovation and win on value and financial return.
Headquarters: Oslo
Offices: Helsinki, Paris, Frankfurt, München, Milano, Rome, Tromsø, Madrid, Zürich, Amsterdam, London.
1.4.4.2. Exalead (France)
Founded in 2000 by search-engine pioneers, Exalead is a global provider of software designed to simplify all aspects of information search and retrieval for organizations of all sizes. Based on the first and only unified technology platform for desktop, intranet or Web search, Exalead offers easier deployment, administration and use than any other enterprise-type search software. This is true whether for one or thousands of desktops, for a small business or a global enterprise, and the software conforms to any technology environment. It also adapts to user habits for a uniquely satisfying search experience. Exalead software is used by leading banking and financial services, media, consumer packaged goods, research, retailing, sports, entertainment and telecommunications companies. Exalead is an operating unit of Qualis, an international holding company.
1.4.4.3. NetSprint (Poland)
NetSprint.pl, formerly XOR Internet, was established in Warsaw in 2000. Since it began operating, NetSprint has focused on creating precise and efficient search engines. The goal and ambition of NetSprint is to provide users with a quick and intuitive tool for retrieving any type of information, both on the Internet and in closed archives. Thanks to the experience of our IT team and our focus on the needs of local users and customers, the solutions offered by NetSprint are more effective and more attractively priced than the corresponding products of our global competitors.
In November 2004 NetSprint.pl ranked seventh in the Rising Stars category of the prestigious Deloitte Fast 50 ranking of the fastest-growing technology companies in Central Europe. The NetSprint search engine is available on the Polish and Lithuanian markets. In Poland, it is used on the NetSprint.pl site, the Wirtualna Polska portal and over 140 other big Internet sites. NetSprint is also accessible on several thousand amateur pages in its amateur version (the so-called "skin"). In April 2004 the NetSprint search engine won the 4th edition of the "Internet Now" competition in the category "Data-base, catalogues, search engines". In June 2005 NetSprint.pl won again in the 5th edition of the "Internet Now" competition.
Market: Poland and Lithuania (Netsprint.lt)
Websites which use the engine: NetSprint.pl, wp.pl, and 140 others.
NetSprint.pl Sp. z o.o.
ul. Bieżanowska 7
02-655 Warszawa, Poland
tel. (022) 844 49 90, fax (022) 852 20 60
http://firma.netsprint.pl/
1.4.4.4. Morfeo (Czech Republic)
The search engine Morfeo (www.morfeo.cz) was developed by scientists associated with Charles University in Prague, mainly Martin Mareš (http://mj.ucw.cz/) and Robert Špalek (http://www.ucw.cz/~robert/index-en.html). The development has been sponsored by the advertising company Netcentrum s.r.o. (http://www.netcentrum.cz), which is also one of the most important users and acts as the exclusive distributor of the commercial version.
Back in 1997, Martin Mareš wrote the first version, called Sherlock 1.0, as his term project at MFF UK, but it somehow escaped from his control soon – in October 1997 it was indexing the whole .cz domain in cooperation with the Bajt company. Time slowly passed, the author was busy working on other things, Bajt had its own problems, and the whole project would have been almost forgotten were it not for people from Netcentrum, who were building a new Czech portal, wanted to use Sherlock for searching and were willing to sponsor its further development.
After several years of successfully running Sherlock 1.2 on a couple of servers, Robert Špalek joined the "team" and together we decided to rewrite the whole project from scratch and change the whole architecture (confirming the ancient wisdom that every good program, including TeX, has to be rewritten at least once in its lifetime :) ). Unfortunately, we were forced to delay the public release of this version for some time. That was back in 2001.
In September 2002, we resurrected the freely distributable version of Sherlock, but in the meantime Apple had started distributing another program of the same name as part of their OS X, so we decided to rename the whole package to Sherlock Holmes (or Holmes) to avoid both confusion and trademark problems.
Market: Czech Republic, Slovakia, Poland
Websites which use the engine: onet.pl, morfeo.cz, morfeo.sk
Netcentrum S.R.O
Drtinova 557/10
15000 Praha 5, Czech Republic
Phone : +420 227 018 100
Fax : +420 227 018 104
Web site : o.centrum.cz
1.4.4.5. Autonomy (United Kingdom)
Autonomy is the acknowledged leader in the rapidly growing area of Meaning-Based Computing (MBC). Founded in 1996 and utilizing a unique combination of technologies born out of research at Cambridge University, the company has experienced a meteoric rise and currently has a market cap of $4 billion and offices worldwide. Autonomy's position as industry leader is widely recognized by analysts including Gartner Group, Forrester Research and Delphi, which calls Autonomy the fastest growing public company in the space. Autonomy's revenues are twice those of its nearest rival.
Meaning-Based Computing extends far beyond traditional methods such as keyword search, which simply allow users to find and retrieve data. Keyword search engines, for example, cannot comprehend the meaning of information; these products were developed simply to find documents in which a word occurs. Unfortunately, this inability to understand information means that other documents that discuss the same idea (i.e. are relevant) but use different words are overlooked. Equally, documents with a meaning entirely different from the one the user is searching for are frequently returned, forcing users to alter their query to accommodate the search engine.
In addition, some of the key functionality of Meaning-Based Computing, such as automatic hyperlinking and clustering, is simply not available in keyword search engines. For example, automatic hyperlinking, which connects users to a range of pertinent documents, services or products that are contextually linked to the original text, requires that the meaning of the original document be fully understood. Similarly, for computers to automatically collect, analyse and organize information, they have to be able to extract meaning. Only Meaning-Based Computing systems can do this.
Revenue: USD 250.1 million (2006), 116% higher than in 2005
Employees: 1,300
Autonomy Corporation plc
Cambridge Business Park
Cowley Rd
Cambridge CB4 0WZ, United Kingdom
Tel: +44 (0) 1223 448000
Fax: +44 (0) 1223 448001
www.autonomy.com
1.4.4.6. Expert System (Italy)
Expert System S.p.A
Founded: Modena, Italy (1989)
Products: Cogito
Employees: 140 (2007)
Expert System S.p.A
Via Virgilio, 56/Q - Staircase 5
41100 Modena – Italy
Tel: +39 059 894011
Fax: +39 059 894099
info@expertsystem.net
www.expertsystem.net