These people are called data scientists, a title first coined in 2008 by D.J. Patil and Jeff Hammerbacher, who later became the heads of the data science teams at LinkedIn and Facebook, respectively. The data scientist role has since begun to create value in traditional U.S. industries such as telecommunications, retail, finance, manufacturing, logistics, healthcare, and education.
But in China, the application of big data is only just budding and the talent market is not yet mature. "You can hardly expect a generalist to complete every link in the whole chain. More companies will recruit people who can complement their existing teams, based on the resources and shortcomings they already have," Wang Yuyao, director of business analytics and strategy at LinkedIn China, told CBN Weekly.
What does a data engineer do?
Each company therefore has different requirements for big data jobs: some emphasize database programming, some highlight knowledge of applied math and statistics, some require experience at consulting firms or investment banks, and some hope to find application-oriented people who know products and markets. Because of this, many companies coin new titles and definitions for the people who deal with big data, according to their own business type and division of labor. Data mining engineer, big data specialist, data researcher, and user analysis expert are all titles that often appear at domestic companies. We refer to them collectively as "big data engineers."
Since big data work in China is still at an early stage of development, how much value can be extracted from the data depends entirely on the engineer's personal ability. Experts already in the industry have sketched a broad framework of what is needed: computer coding skills, a background in math and statistics, and a deep understanding of a specific field or industry, which helps in making quick judgments and grasping key factors.
Although some big companies prefer candidates with master's degrees, Alibaba Group researcher Xue Guirong emphasized that education is not the main factor. Experience in handling data at large scale and a curiosity for hunting treasure in the ocean of data matter more for the job.
Beyond that, a good big data engineer has to be able to reason logically and quickly locate the key attributes and determinants of a business problem. "He has to know what's relevant, what's important, what kind of data is most valuable to use, and how to quickly find the core needs of each business," said Shen Zhiyong, a data scientist at the Baidu-United Nations Big Data Joint Laboratory. Learning ability helps big data engineers adapt quickly to different projects and become data experts in a new field within a short time. Communication skills make the work go more smoothly, because a big data engineer's work is driven mainly in two ways: by the marketing department or by the data analysis department. In the former case, the engineer often needs to learn about development needs from product managers. In the latter, the engineer needs to go to the operations department to learn how the data models actually convert.
You can treat these requirements as a direction to work toward if you want to become a big data engineer, because according to Nicole Yan (Yan Liping), Managing Partner at ManpowerGroup, there is a large talent gap. Currently, most big data applications in China are concentrated in the Internet sector, with more than 56% of companies preparing to develop big data research, and "94% of companies will need data scientists in the next five years," Yan said. She also suggests that people who already work in data-related jobs consider making the transition.
In the words of Xue Guirong, a researcher at Alibaba Group, big data engineers are a group of people who "play with data": they play with the business value of data and turn data into productivity. The biggest difference between big data and traditional data is that big data is online, real-time, massive in scale, and irregular in form, with no established rules to follow, so people who "know how to play" with this data are very important.
Shen Zhiyong suggests imagining big data as a mine that keeps accumulating. The job of a big data engineer, he says, then breaks down as follows: "The first step is to locate and extract the data set where the information is located, which is equivalent to prospecting and mining. The second step is to turn it into information that can be directly judged, which is equivalent to smelting. The final step is application, visualizing the data, etc."
So analyzing history, predicting the future, and optimizing choices are the three most important tasks for big data engineers when they "play with data". By working in these three directions, they help organizations make better business decisions.
1. Finding the characteristics of past events
A very important job for big data engineers is to analyze data to find the characteristics of past events. For example, Tencent's data team is building a data warehouse that combs through the huge volume of irregular data across all of the company's web platforms and summarizes it into features that can be queried, supporting the data needs of the company's various businesses, including advertising, game development, and social networking.
The biggest benefit of identifying the characteristics of past events is that it helps companies understand their consumers better. By analyzing a user's past behavioral trajectory, it is possible to understand that person and predict his behavior. "You can know what kind of person he is, his age, his interests, whether he is a paid Internet user, what type of games he likes to play, and what he usually likes to do online," Zheng Lifeng, general manager of Tencent Cloud Computing Co.'s Beijing R&D center, told CBN Weekly. At the business level, the next step is to recommend relevant services to each type of person, such as mobile games, or to derive new business models from different characteristics and needs, such as WeChat's movie ticket business.
2. Predicting what might happen in the future
By introducing key factors, big data engineers can predict future consumption trends. On AliMom's marketing platform, engineers are trying to help Taobao sellers do business by introducing weather data. "For example, if it's not hot this summer, it's likely that certain products won't sell as well as they did last year, and in addition to air conditioners and fans, undershirts and swimsuits may be affected by it. Then we will establish the relationship between weather data and sales data, find the categories related to it, and warn sellers in advance to turn over their inventory," Xue Guirong said.
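The idea of relating weather data to sales data can be sketched in a few lines of Python. This is only an illustration, not Alibaba's actual method: the function names, the 0.7 threshold, and all of the numbers are assumptions made up for the example. It uses a plain Pearson correlation to flag categories whose sales move with temperature.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series shows no linear relationship
    return cov / sqrt(var_x * var_y)


def weather_sensitive_categories(temperatures, sales_by_category, threshold=0.7):
    """Return {category: r} for categories strongly correlated with temperature.

    The threshold is an arbitrary illustrative cut-off, not an industry standard.
    """
    scores = {cat: pearson(temperatures, sales)
              for cat, sales in sales_by_category.items()}
    return {cat: r for cat, r in scores.items() if abs(r) >= threshold}


# Made-up weekly figures, for illustration only.
weekly_avg_temp_c = [24, 27, 30, 33, 35, 31]
weekly_sales = {
    "fans": [120, 180, 260, 340, 400, 290],
    "swimsuits": [80, 110, 150, 210, 240, 170],
    "phone cases": [200, 195, 205, 198, 202, 201],
}

flagged = weather_sensitive_categories(weekly_avg_temp_c, weekly_sales)
for category, r in sorted(flagged.items()):
    print(f"{category}: correlation with temperature = {r:.2f}")
```

With these invented numbers, fans and swimsuits are flagged while phone cases are not. A real system would need much longer histories and would have to separate weather effects from seasonality and promotions before warning sellers.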
At Baidu, Shen Zhiyong supports model development for some of the "Baidu Forecast" products, which try to use big data to serve a wider range of people. Products already online include World Cup predictions, college entrance examination predictions, and tourist attraction predictions. For attraction predictions, for example, big data engineers need to collect all the key factors that may affect visitor flow over a given period, then use them to rate how crowded the country's attractions will be in the coming days: uncrowded, moderately crowded, or crowded.
3. Finding the optimized outcome
Depending on the nature of an organization's business, big data engineers can use data analytics for different purposes.
Taking Tencent as an example, Zheng Lifeng believes the simplest and most direct illustration of a big data engineer's work is the A/B test, which helps product managers choose between two alternatives, A and B. In the past, decision makers could only make judgments based on experience, but today big data engineers can help the marketing department make the final choice by conducting large-scale real-time tests. For a social networking product, for example, half of the users see interface A and the other half use interface B, and click-through rates and conversions are observed and counted over a period of time.
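One common way to judge the outcome of such a test is a two-proportion z-test on the click-through rates of the two groups. The sketch below is not taken from Tencent's practice. The function name and the user and click counts are hypothetical, and the normal approximation it relies on assumes large samples.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns (z, p_value), where p_value is two-sided and based on the
    normal approximation with a pooled conversion rate.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: 10,000 users per interface.
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=580, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference between A and B is unlikely to be chance alone.")
else:
    print("No clear winner yet; keep collecting data.")
```

The 0.05 cut-off is a conventional choice rather than a rule. In practice the sample size and the length of the test should be fixed before looking at the results.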
Alibaba, as an e-commerce company, wants to use big data to target precisely the right people and help sellers market better. "What we look forward to more is finding a group of people who are more interested in the product than the existing users," Xue Guirong said. In one Taobao example, a ginseng seller originally aimed its promotions at women who had just given birth, but after digging into correlations in the data, engineers found that marketing directed at pregnant women had a higher conversion rate.
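Comparing audiences in this way comes down to computing a conversion rate per segment and ranking the segments. The following is a minimal sketch with invented records. The segment labels and the `conversion_by_segment` helper are hypothetical and do not reflect Taobao's actual data or tooling.

```python
from collections import defaultdict


def conversion_by_segment(events):
    """Compute the conversion rate per audience segment.

    `events` is an iterable of (segment, converted) pairs, one per user
    who saw the promotion; `converted` is True if that user bought.
    """
    shown = defaultdict(int)
    bought = defaultdict(int)
    for segment, converted in events:
        shown[segment] += 1
        if converted:
            bought[segment] += 1
    return {segment: bought[segment] / shown[segment] for segment in shown}


# Invented records: 200 impressions per segment.
events = (
    [("existing_target", True)] * 6 + [("existing_target", False)] * 194
    + [("candidate_audience", True)] * 15 + [("candidate_audience", False)] * 185
)

rates = conversion_by_segment(events)
best_segment = max(rates, key=rates.get)
for segment, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{segment}: {rate:.1%}")
```

A ranking like this only suggests where to look. Whether a gap between segments is real still has to be checked with enough data, for example with a significance test.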
Capabilities needed
1. Math and statistics-related background
At the three major Internet companies we interviewed (Baidu, Alibaba, and Tencent, collectively known as BAT), big data engineers are expected to hold a master's or doctoral degree with a background in statistics and math. According to Shen Zhiyong, data workers who lack a theoretical background are more likely to wander into a skills "danger zone": given a pile of numbers, different data models and algorithms can always be made to produce some result, but if you don't know what that result means, it isn't really meaningful, and results like that can easily mislead you. "Only with a certain amount of theoretical knowledge can you understand the model, reuse the model and even innovate the model to solve practical problems," Shen Zhiyong said.
2. Computer coding skills
Practical development skills and the ability to process data at large scale are essential for a big data engineer. "Because much of the value of data comes from the process of mining, you have to get your hands dirty to discover the value of gold," Zheng Lifeng said.
For example, many of the records people generate on social networks are unstructured data, and pulling meaningful information out of all that disorganized text, voice, imagery, and even video requires big data engineers to do the digging themselves. Even on teams where the big data engineer's role is focused on business analytics, it is important to be familiar with the way computers process big data.
3. Knowledge of a specific application domain or industry
For Nicole Yan, it's important that the role of a big data engineer is not divorced from the marketplace, because big data is only valuable when it is combined with applications in a specific domain. Experience in one or more vertical industries therefore helps candidates build up industry knowledge, which is a great help in becoming a big data engineer and a convincing plus when applying for the position.
"He can't just know the data, but also has to have business acumen, whether it's in retail, pharmaceuticals, gaming or tourism, etc., to have a certain understanding of some of these areas, preferably in line with the company's business direction," Xue Guirong said. "In the past, we said that some luxury sales clerks were snobbish, and that they could tell if someone could afford it or not just by looking at them, but this group of people happens to be perceptive, and we consider them to be experts in this industry. Another example would be someone who knows the healthcare industry, who, when thinking about the health insurance business, will correlate it not only with records of people's hospital visits but also with dietary data, all of which is based on knowledge of the field."
Careers
1. How to become a big data engineer
Because big data talent is currently scarce, it's difficult for companies to recruit the right people, who need to be highly educated and, ideally, also experienced in large-scale data processing. So many companies look for talent within their own ranks.
In August 2014, Alibaba held a big data competition: it took data from its Tmall platform, removed sensitive information, and placed it on a cloud computing platform for more than 7,000 teams to compete on, in separate internal and external tracks. "This is a way to incentivize internal employees and also to discover external talent, so that big data engineers from various industries can emerge."
Nicole Yan suggests that people who have long worked in database management, data mining, or programming can try out for the position, including traditional quantitative analysts, Hadoop engineers, and any managers who need to make judgment calls based on data in their work, such as operations managers in certain fields. Professionals in other fields can also become big data engineers if they learn to use data.
2. Compensation
As the "giant pandas" of IT careers, big data engineers can be said to be at the top of their class when it comes to pay. According to Nicole Yan's observations, about 10% of recruitment in China's IT and communications industries is related to big data, and the proportion is still rising. "The era of big data arrived very suddenly, and its development in China has been rapid, while talent is very limited, so demand now completely outstrips supply," Yan said. In the United States, the average annual salary of big data engineers is as high as $175,000, and it is understood that at China's top Internet companies, a big data engineer's salary may be 20% to 30% higher than that of other positions at the same level, and the role is highly valued by employers.
3. Career development path
Because big data talent is scarce, the data department at most companies has a flat structure, roughly divided into three levels: data analyst, senior researcher, and department director. Large companies may split teams by application area, while in small companies you need to wear several hats. Some Internet companies that place special emphasis on big data strategy create another top position, such as Alibaba's chief data officer. "Most people in this position will move toward research and become important data strategy talent," Nicole Yan said. On the other hand, big data engineers understand the business and its products as well as business unit employees do, so they can also move into product or marketing roles, or even rise to senior management.