If any technology company defines the concept of "big data", it is Google. According to comScore, a search research company, Google handled as many as 12.2 billion search queries in March 2012.
Google stores not only the web links that appear in its search results but also the behavior of everyone who searches. It can accurately record when, what and how people search. These data let Google optimize its advertising rankings and turn search traffic into a profit model. Google can not only track people's search behavior but also predict what searchers will do next. In other words, Google can predict your intentions before you realize what you are looking for. This ability to capture, store and analyze massive amounts of human and machine data, and then make predictions from them, is what is called "big data".
2012: big data crossroads?
Why did big data suddenly become so hot? Why did The New York Times call 2012 "the crossroads of big data"?
Big data entered the mainstream because three major trends converged:
First, many high-profile consumer companies have stepped up their use of big data. Facebook, the huge social network, uses big data to track how users behave on its network, and recommends new friends by identifying the people you already know there. The more friends users have, the stickier Facebook becomes for them. More friends mean that users share more photos, post more status updates and play more games.
LinkedIn, a business networking site, uses big data to connect job seekers with open positions. With LinkedIn, headhunters no longer have to cold-call potential candidates and hope for the best; they can find and contact them through a simple search. Similarly, job seekers can market themselves to potential employers simply by connecting with other people on the site.
Second, both of these companies have recently gone public: Facebook on Nasdaq in 2012 and LinkedIn on the New York Stock Exchange in 2011. Like Google, the two look like consumer companies on the surface but are big data companies in essence. Splunk also went public in 2012. It is a big data company that provides operational intelligence to large and medium-sized enterprises. These public listings have increased Wall Street's interest in big data, and that interest has produced an unprecedented boom: Silicon Valley venture capitalists have begun pouring money into big data companies. Big data is expected to trigger the next wave of entrepreneurship, one that some predict will let Silicon Valley displace Wall Street within a few years.
Third, active users of data-centric consumer services such as Amazon, Facebook and LinkedIn have begun to expect the same seamless big data experience at work, not just in their private lives and entertainment. Users keep asking: if Amazon, an Internet retailer, can recommend books to read, movies to watch and goods to buy, why can't their own company do something similar?
For example, car rental companies hold information about customers' past rentals and about the vehicles currently in inventory, so why can't they be smarter about matching the right vehicle to each renter? With new technologies, companies can also draw on public information, such as conditions in a specific market, conferences, and other events that may affect supply and demand. By combining internal supply chain data with external market data, companies can predict more accurately which vehicles will be available and when.
Similarly, retailers should be able to combine external public data with internal data and use the mix for product pricing and placement. They can also take into account the many factors that affect in-stock availability and consumers' shopping habits, including which two products sell better together, so that they can raise the average purchase per customer and earn higher profits.
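The question of "which two products sell better together" can be made concrete with a simple co-occurrence count. The following is a minimal sketch, not a description of any retailer's actual system: the function name `count_product_pairs` and the sample baskets are invented for illustration, and a real analysis would also weigh how popular each product is on its own.

```python
from collections import Counter
from itertools import combinations


def count_product_pairs(baskets):
    """Count how often each unordered pair of products shares a basket."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so ("bread", "milk") and ("milk", "bread")
        # are counted as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical purchase records, made up for this example.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "butter"],
]

pair_counts = count_product_pairs(baskets)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

A retailer could use the most frequent pairs as candidates for joint placement or bundled pricing, then check the effect against actual sales.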
Google's actions
Compared with most other companies, Google's scale and scope give it more ways to apply big data. One of Google's advantages is having an army of software engineers, which enables Google to build big data technology from scratch.
Another advantage of Google lies in its infrastructure. The Google search engine is designed to link thousands of servers seamlessly. If more processing or storage is needed, or a server crashes, Google engineers can handle it easily by adding more servers.
Google's software is designed on the same principle as its infrastructure. MapReduce (a programming model developed by Google for parallel computation over large data sets. -translator's note) and the Google File System are two typical examples. Wired magazine reported in the early summer of 2012 that these two software systems "reshaped the way Google built its search index".
Many enterprises now use Hadoop, an open source derivative of MapReduce and the Google File System. Hadoop allows huge data sets to be processed in a distributed fashion across many computers. By the time other companies started using Hadoop, Google had already been deeply involved in big data technology for many years, which gives it a huge lead in the industry.
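To give a feel for the MapReduce idea, the sketch below counts words in three stages: map each document to (word, 1) pairs, group the pairs by word, then reduce each group to a total. It is a single-process toy under stated assumptions, not Google's implementation or the Hadoop API; the function names `map_phase`, `shuffle` and `reduce_phase` and the sample documents are invented for illustration. In a real cluster, the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input documents; each could live on a different server.
documents = ["big data", "big servers", "data data"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(word_counts)
```

Because each map call looks at only one document and each reduce call at only one key, the work can be spread over as many machines as there are documents or keys.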
Now Google is opening up its data processing capabilities further and sharing them with more third parties. Google recently launched the web service BigQuery, which lets users analyze huge data sets interactively. By Google's standards, "huge" means billions of rows of data. BigQuery is on-demand data analysis that runs in the cloud.
In addition, Google holds a great deal of machine data generated when people search on its site and pass through its network. Every search request a user enters tells Google what that person is looking for, and all human behavior leaves traces on the Internet. Google occupies an excellent position from which to capture and analyze those traces.
Moreover, search is not Google's only source of data. Companies install products such as Google Analytics to track visitors' footprints on their websites, and Google can obtain these data too. Websites also use Google AdSense to display advertisements from Google's advertiser network, so Google can see at a glance how advertisements perform not only on its own sites but also on other publishers' sites.
The result of putting all these data together is that Google benefits not only from the best technology but also from the best information. Many enterprises spend heavily on information technology. Yet in information itself, one of the two components of information technology, Google has made huge investments and achieved great success, and few companies can match it.
Amazon is pressing hard
Google is not the only large technology company that promotes big data. Internet retailer Amazon has taken some radical actions, which may make it the biggest threat to Google.
Some analysts have predicted that Amazon's revenue will exceed 100 billion US dollars in 2015 and that it will soon surpass Wal-Mart to become the world's largest retailer. Like Google, Amazon has to deal with massive amounts of data, but it handles them with a stronger e-commerce focus. Every time consumers search Amazon's website for a TV program they want to watch or a product they want to buy, Amazon learns more about them. From search and purchase behavior, Amazon can work out which products to recommend next. Its cleverness does not stop there. It constantly tests new designs on its website to find the one with the highest conversion rate.
Do you think a page on Amazon just happens to look the way it does? If so, think again. The layout, font size, colors, buttons and other design elements of the whole website are the best results of many careful tests.
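The idea of testing designs and keeping the one with the highest conversion rate can be sketched in a few lines. This is an illustrative sketch only, not Amazon's method: the variant names and visitor numbers are invented, and a real experiment would also check that the difference between variants is statistically significant before declaring a winner.

```python
def conversion_rate(visitors, purchases):
    """Fraction of visitors who went on to buy; 0.0 if nobody visited."""
    return purchases / visitors if visitors else 0.0


def best_variant(results):
    """Return the name of the variant with the highest conversion rate."""
    return max(results, key=lambda name: conversion_rate(*results[name]))


# Hypothetical test results: variant name -> (visitors, purchases).
results = {
    "layout_a": (1000, 30),
    "layout_b": (1000, 42),
    "layout_c": (500, 18),
}

winner = best_variant(results)
print(winner, conversion_rate(*results[winner]))
```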
The data-driven approach is not limited to these areas. According to a former employee, Amazon's corporate culture is coldly data-driven. Data show what works and what does not, and new business investments must be backed by data. Its long-term focus on data allows Amazon to provide better services at lower prices. Consumers increasingly skip search engines such as Google altogether and go directly to Amazon to search for goods and make purchases.
The battle for control of the consumer is still spreading. Apple, Amazon, Google and Microsoft, the four recognized giants, are fighting not only on the Internet but also in mobile. Because consumers spend more and more time on mobile devices such as phones and tablets, and less and less time sitting in front of computers, the companies whose devices are in consumers' hands will have an advantage in selling and in gathering information about consumer behavior. The more enterprises know about consumers as groups and as individuals, the better they can design content, advertisements and products.
From the infrastructure that supports emerging technology companies to the mobile devices on which content is consumed, Amazon's reach now extends across a remarkably wide range of fields. Several years ago Amazon foresaw the value of opening its server and storage infrastructure to others. Amazon Web Services (AWS) is Amazon's well-known public cloud service, providing scalable computing resources to start-ups and established enterprises alike. Although AWS has not been around for long, some analysts estimate that its annual sales exceed $1.5 billion.
The computing resources provided by AWS pave the way for enterprises to pursue big data initiatives. Of course, enterprises can still invest in building their own infrastructure in the form of private clouds, and many will do so. But if an enterprise wants additional, scalable computing resources, it can also use multiple servers on Amazon's public cloud conveniently and quickly. Today, Amazon leads the trend and attracts attention not only through its own website and new mobile devices such as the Kindle, but also through the infrastructure that supports thousands of popular websites.
Thanks to AWS, big data analysis no longer requires companies to sink fixed costs into IT. Acquiring and analyzing data can now be done simply and quickly in the cloud. In other words, enterprises used to have to discard data because they could not store it, but now they can capture and analyze unprecedented volumes of data.
Realize information superiority
The combination of services such as AWS and open source technologies such as Hadoop means that enterprises can finally enjoy the benefits that information technology promised the world many years ago.
For decades, attention to so-called "information technology" was focused on the "technology" part. The CIO's responsibility was only to purchase and manage servers, storage and networks. Nowadays, information, and the ability to store it, analyze it and make predictions from it, is becoming the source of competitive advantage for enterprises.
When information technology was just emerging, the enterprises that adopted it earlier could develop faster and surpass others. Microsoft established its standing in the 1990s not only because it developed the most widely used operating system in the world, but also because it used email as the standard communication mechanism within the company at that time.
While many enterprises were still hesitant to adopt email, it had already become the mechanism through which Microsoft discussed recruitment, product decisions and marketing strategy. Heavy use of email is commonplace now, but at that time it gave Microsoft an advantage in speed and collaboration over companies that had not adopted it. Embracing big data and making it available democratically across the organization will bring enterprises similar advantages. Companies such as Google and Facebook benefit from this "data democracy".
By opening their internal data analysis platforms to all of their own analysts, managers and executives, Google, Facebook and other companies have enabled every member of the organization to ask business-related questions, get answers from the data and act on them promptly. Facebook, for example, offers big data as an internal service, which means the service is designed not only for engineers but also for end users: line-of-business managers who need to run queries to find effective solutions. As a result, managers do not have to wait days or weeks to find out which changes to the website are most effective or which advertising methods work best. They can use the internal big data service, which is designed to meet their needs and to make the results of data analysis easy to share among employees.
The past twenty years were the era of information technology, and information technology remains the theme of the next twenty, with the emphasis shifting to the information itself. Enterprises that can process data faster, and that integrate public data resources with internal ones, will gain unique insights that let them far surpass their competitors. As I wrote in the eight laws of big data, the faster you analyze data, the greater its predictive value. Enterprises are gradually moving away from batch processing (storing data first and analyzing it slowly afterwards) and turning to real-time analysis to gain a competitive advantage.
The good news for executives is that the information advantage from big data no longer belongs only to big companies such as Google and Amazon. Open source technologies such as Hadoop put it within reach of other companies. Established Fortune 100 companies and emerging start-ups alike can use big data to gain a competitive advantage at a reasonable price.
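The difference between batch processing and real-time analysis can be illustrated with something as small as an average. In the sketch below, the batch version needs all the stored data before it can answer, while the running version has an up-to-date answer after every new event. The class name `RunningMean` and the order values are invented for illustration; real streaming systems are far more elaborate.

```python
class RunningMean:
    """Keeps an up-to-date average without storing past values."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        """Fold one new value into the average and return the new average."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values):
    """Batch style: wait until all values are stored, then compute once."""
    return sum(values) / len(values)


# Hypothetical order values arriving one at a time.
order_values = [20.0, 35.0, 50.0, 15.0]

running = RunningMean()
for value in order_values:
    # An answer is available immediately after each event.
    print(running.update(value))

print(batch_mean(order_values))
```

Both approaches end at the same number; the real-time version simply makes it available while there is still time to act on it.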
The disruption of big data
The disruption brought by big data is not only the ability to acquire and analyze more data than before; more importantly, the price of acquiring and analyzing the same amount of data has dropped significantly. The lower the price, the higher the sales volume. The irony implied here resembles the so-called "Jevons paradox". The economist Jevons arrived at this paradox by observing the industrial revolution, and it is named after him (the core of the Jevons paradox is that greater efficiency in using a resource lowers its price, which ultimately increases the use of that resource. -translator's note). Scientific and technological progress has made data storage and analysis more efficient, so companies do more data analysis, and their workload has not fallen. In short, this is the disruption brought by big data.
From Amazon to Google, from IBM to Hewlett-Packard and Microsoft, many large technology companies have committed themselves to big data. Even more start-ups have sprung up around big data solutions, built on open source and delivered through the cloud. Large companies pursue horizontal big data solutions, while small companies focus on applications for important vertical businesses. Some products optimize sales efficiency, while others guide future marketing campaigns by correlating marketing performance across channels with actual product usage data. These big data applications (BDA) mean that small companies do not have to develop or acquire all big data technologies internally; in many cases they can use cloud-based services to meet their data analysis needs. Beyond technology, some of these small companies will develop products that track and record health-related indicators and suggest ways for people to improve their behavior. Products like these are expected to reduce obesity, improve quality of life and lower medical costs.
Big data roadmap
Forrester, an industry analysis and research company, estimates that the total amount of enterprise data is growing at 94% a year. With growth this rapid, every enterprise needs a big data roadmap. At the very least, enterprises should formulate a strategy for capturing data, covering everything from routine machine logs of internal computer systems to records of online user interactions. They should do so even if they do not yet know what the data will be used for; the use may only become apparent later.
The value of data is much higher than your initial expectation, so don't throw it away. Enterprises also need a plan to cope with the exponential growth of data. The number of photos, instant messages and emails is huge, and even more data is released by "sensors" composed of mobile phones, GPS and other devices.
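To see what the 94% annual growth rate cited above implies for planning, it helps to compound it over a few years. The sketch below assumes the rate stays constant, which the source does not claim, and the starting volume of 100 terabytes and the function name `project_volume` are invented for illustration.

```python
def project_volume(initial, annual_growth_rate, years):
    """Compound growth: volume after a number of years at a fixed rate."""
    return initial * (1 + annual_growth_rate) ** years


# Hypothetical starting point: 100 TB of enterprise data today.
initial_tb = 100.0
growth_rate = 0.94  # the 94% annual growth estimate cited in the text

for year in range(4):
    print(year, round(project_volume(initial_tb, growth_rate, year), 1))
```

At this rate the volume nearly doubles every year and grows more than sevenfold in three years, which is why storage and analysis capacity have to be planned ahead rather than added as an afterthought.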
Ideally, an enterprise should have a vision in which data analysis runs through the whole organization and is as close to real time as possible. By observing technology leaders such as Google, Amazon and Facebook, you can see the possibilities that big data opens up. What managers need to do is build a big data strategy into their own organizations.
Companies like Google and Amazon have been using big data to make decisions for several years, and they have achieved extensive success in data processing. Now, you can have the same ability.