In recent years the cloud computing market has grown explosively. New technologies and companies built around the cloud have sprung up like mushrooms after rain, and the market is full of competing voices and debates. Every week brings fresh news: new products are released, companies raise funding, and, of course, large-scale system outages make headlines too.
Customers have gradually moved from the background to center stage. A few years ago, customers felt a project was only fashionable if it carried a cloud label. Now more of them are actually practicing and experimenting from different angles: business-driven initiatives, enterprise transformation, experimental testing, and large-scale rollout. The emergence of so many opportunities and ideas is a good thing, because it gives companies at every level a chance to show what they can do and a stage to speak from. These newcomers have completely upset the stable balance that large foreign enterprise vendors such as IOE (IBM, Oracle, EMC) once held in the market. But has a new balance replaced the old one? In fact it has not, because everyone is still exploring and experimenting. I have been both observing and personally taking part in this wave, and I want to share my views on when this wave of cloud, or the opportunity for change, will really arrive, and on what kind of trend suits customers so that the cloud truly integrates with their business and a new stability can form. After all, the IT department of a traditional enterprise customer is not a laboratory. It has neither the staffing and fast-changing business front end of an Internet company nor the huge infrastructure that telecom operators run. For such customers, the core of building a cloud is business integration: the cloud must support innovation, delivery experience, cost, stability, and future transformation. This process is an experience for customers and a test for service providers. Like steelmaking, it takes one refining step after another to turn out a qualified product. My own position and scope of work lead me to focus on building and operating customers' private clouds and industry clouds, that is, on how to help customers refine a good cloud.
Are your organization and enterprise ready to move to cloud computing?
Let's talk about building a private cloud first. A few years ago, when we talked to customers about cloud computing, they mostly cared about virtualization. The main job at that time was to get customers to accept virtualization, so any project that virtualized something could be called a cloud. With the mature experience of large-scale public clouds and the spread of cloud technology in recent years, customers have gradually accepted that cloud is not just virtualization. It is a service-oriented and delivery-oriented system that includes self-service provisioning for end users and business departments, integrated operation and maintenance from network connectivity to service quality assurance, and, for some industry clouds, even the business model the customer needs, from operation to sales management. A real cloud of this kind is a huge investment for the customer. The first decision is therefore whether to build a real private cloud (or industry cloud) or to buy a public cloud or someone else's industry cloud service directly. One consideration is customization and another is security. Public cloud services are like eating in a canteen: there are many dishes, but they are not necessarily tasty or suited to you. Some enterprise-specific requirements need extensive support from the infrastructure or middle tier, so the only option is to build your own cloud service. Companies that pursue the highest level of data security need their own cloud even more. Another question is whether the customer's applications need to move to the cloud at all. Logically, customer applications are generally divided into non-cloud applications, cloud-enabled applications, cloud-adapted applications, and cloud-native applications. The criteria for these categories are not only technical ones such as virtualization, distributed storage, and converged computing. 
They also include the application's elasticity requirements, business delivery requirements, change requirements, and the requirements for interaction between peripheral APIs and other applications. For customers who build private clouds, the elastic resource pool, that is, the underlying platform, cannot be expanded at will, and deploying a resource pool takes time. An important goal of building a private cloud is therefore to get the best input-output ratio from its own pool: the running applications and loads should make full and reasonable use of the existing pool, keep a certain amount of redundancy, and allow the pool to shrink appropriately after a large number of services go offline. From the application point of view, different types of applications can be moved to the cloud by different means, although the cost and technology may differ greatly. Cloud-native applications are generally the new business products and directions that enterprises are currently trying out. If such an application falls within what data and information security policies allow, it is best suited to a public or hybrid cloud. The more widespread cloud-enabled and cloud-adapted applications are the mainstream in enterprises today. For this type of application, the application development departments and teams need to be involved when the cloud is being planned. Simply providing a pure IaaS or virtualization platform that hands out virtual machines and storage servers does not improve the service delivery model. Therefore, starting from the delivery model, can we hand application provisioning to the business departments, and let them start or shut down services, from the application down to the underlying infrastructure, as the business changes? This is what is required of the cloud system. 
Doing so requires the application department and the infrastructure department to cooperate and jointly work out an elasticity plan, so that resources can be scaled out when needed and reclaimed afterwards. This process is called service orchestration.
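To make the idea of an elasticity plan concrete, here is a minimal sketch in Python. It is not taken from any particular cloud management product: the `ElasticPlan` class, its fields, and the service name are hypothetical, and the sketch only models the bounds within which a business department may scale a service out and hand resources back.

```python
from dataclasses import dataclass


@dataclass
class ElasticPlan:
    """Hypothetical elasticity plan agreed by the application and infrastructure teams."""
    service: str
    min_instances: int  # floor that stays running even when the service is idle
    max_instances: int  # ceiling the resource pool has reserved for this service
    step: int = 1       # instances added or reclaimed per action


def scale_out(plan: ElasticPlan, running: int) -> int:
    """Return the instance count after one scale-out step, capped at the ceiling."""
    return min(plan.max_instances, running + plan.step)


def reclaim(plan: ElasticPlan, running: int) -> int:
    """Return the instance count after one reclaim step, never below the floor."""
    return max(plan.min_instances, running - plan.step)


# Example values are illustrative only.
plan = ElasticPlan(service="order-frontend", min_instances=2, max_instances=6, step=2)
running = plan.min_instances
running = scale_out(plan, running)  # business peak: grow by one step
running = scale_out(plan, running)  # grow again, now at the ceiling
running = scale_out(plan, running)  # a further request is capped by the plan
running = reclaim(plan, running)    # peak is over: hand one step back to the pool
```

The point of the design is that both limits are fixed in advance by the two departments together, so self-service scaling can never exceed what the pool has reserved or drop below what the application needs.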
From our experience, two major tasks must be completed to move to the cloud successfully. The first is to build the framework, and the second is to migrate onto the cloud. Of course, the cloud should be built with the whole life cycle in view. Only by staying true to the original goal can we be sure that, however far we go and however many attempts we make, we still reach the finish line. The following figure mainly shows the process of cloud platform construction and the services it requires.
Now let's talk about building the framework. This means planning and evaluating the cloud and designing the delivery model of the whole cloud service. The design has to cover the following:
1. A road map for moving applications to the cloud. Building the cloud platform and transforming, migrating, testing, and expanding applications cannot be achieved overnight. It takes a certain period of time.
2. Requirements for standardized resources and tools. If everything an enterprise uses were standard, it could of course use public or even hybrid clouds. However, because applications were developed differently and have accumulated over time, the deployment architectures, middleware, and databases differ from one enterprise to another. It is therefore necessary to prepare an integrated set of versions and unify them gradually while the cloud platform is being built, which makes later service orchestration possible.
3. Standardized operation and maintenance processes and SLAs. The cloud is a container in which applications are built and delivered to the final departments and users through automated service delivery and self-service. What is delivered, the delivery process and specifications, and especially the SLA should be designed from the beginning, because they directly affect the layout of the technical systems and the cost of operation and maintenance services.
4. The continuous delivery process in the cloud environment, which has to account for issues such as application scalability and business continuity.
With these steps done well, the effort to build a complete cloud system will at least not be blind or directionless. Even so, the framework is not yet complete. The more important step is service orchestration and resource pool design. 
Service orchestration takes the application's point of view. It starts or shuts down a portfolio of applications on demand through an automated process, and it folds the infrastructure completely into the automated operation of the application. The design of service orchestration is very important, and it requires people who understand the applications, people who understand the infrastructure, and even the people who will later build the cloud management platform to work together. The work consists of standardizing and templating the integration of application software and the underlying architecture, which used to be done project by project, so that it finally becomes a workflow. Through this design we learn which automation templates need to be written for a service, which basic resources the underlying architecture must provide, and which technical systems are involved.
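The templating described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the format of any real orchestration engine: the `Component` class, the three-tier template, and the resource figures are all hypothetical. It shows how a template can yield both a start-up order for the application portfolio and the total basic resources the pool must supply. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Component:
    """One templated building block: software plus the resources under it."""
    name: str
    vcpus: int
    memory_gb: int
    depends_on: tuple = ()


# Hypothetical template for a three-tier application; the figures are illustrative.
TEMPLATE = [
    Component("database", vcpus=8, memory_gb=32),
    Component("middleware", vcpus=4, memory_gb=16, depends_on=("database",)),
    Component("web", vcpus=2, memory_gb=4, depends_on=("middleware",)),
]


def start_order(template):
    """Order in which components must be started so dependencies come up first."""
    graph = {c.name: set(c.depends_on) for c in template}
    return list(TopologicalSorter(graph).static_order())


def stop_order(template):
    """Shutting down runs in the opposite direction."""
    return list(reversed(start_order(template)))


def required_resources(template):
    """Total basic resources the underlying pool has to provide for this service."""
    return {
        "vcpus": sum(c.vcpus for c in template),
        "memory_gb": sum(c.memory_gb for c in template),
    }
```

For this template, `start_order(TEMPLATE)` brings up the database, then the middleware, then the web tier, and `required_resources(TEMPLATE)` reports 14 vCPUs and 52 GB of memory.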
When it comes to resource pool design, we have to talk about OpenStack. OpenStack is a very good tool. It helps customers connect the entire infrastructure, including compute, storage, and network, and it keeps shaking the positions held by commercial products from IOE and VMware. Customers can start their open source cloud computing journey with OpenStack. But because it is an open source product and system, it has the characteristics and constraints of open source products. It lacks the stability and maturity of commercial products, a clear product evolution route, and supporting services, and customers have to face all of these problems, especially stability and operation and maintenance. From current experience, OpenStack is more like a set of Lego bricks. Different technicians may deploy and implement it differently at the customer's site, so the coupling between software and hardware becomes very important. When a customer designs a resource pool, the choices of hardware configuration, network, storage server, and software can produce a large number of different combinations. Because of hardware coupling and hardware-specific behavior, problems and failures can become very frequent, and customers cannot be guinea pigs for repeated experiments. This is why a reference architecture is so important: through experiments and projects, it records which brands and configurations of machines meet the requirements when combined. Beyond machine configuration, a standardized design also guides customers through project implementation and greatly shortens implementation time. Customers only need to verify through a POC that the functions fit their needs.
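A reference architecture can be thought of as a catalog of combinations that have already been proven in experiments and projects. The following minimal sketch checks a proposed resource pool design against such a catalog. The catalog contents, component names, and function names are hypothetical placeholders, not real products or a real vendor list.

```python
# Hypothetical catalog of (server, NIC driver, storage) combinations that
# earlier experiments and projects have validated. The names are placeholders.
VALIDATED = {
    ("server-model-a", "nic-driver-1.2", "storage-x"),
    ("server-model-a", "nic-driver-1.3", "storage-y"),
    ("server-model-b", "nic-driver-1.3", "storage-x"),
}


def is_validated(server: str, nic_driver: str, storage: str) -> bool:
    """True if this exact combination appears in the reference architecture."""
    return (server, nic_driver, storage) in VALIDATED


def validated_options(server: str) -> list:
    """List the validated driver and storage pairings for a given server model."""
    return sorted((nic, sto) for srv, nic, sto in VALIDATED if srv == server)
```

A design that is not in the catalog is not necessarily wrong, but it is untested, which is exactly the situation the reference architecture is meant to help customers avoid.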
With the system and the framework in place, you have basically worked out how to build the cloud, and there are many ways to proceed. You can start from scratch, or you can have someone else purchase, manage, and host a private cloud for you. Cost, delivery time, and the ROI of later operation and maintenance can all serve as decision criteria. Some people find all this too complicated and say: I will find an OpenStack vendor to build a pool, or use VMware to provide virtual machines, storage, and networks. Then I have basic IaaS and I am already in the cloud. This can certainly be done, but questions of scale, of whether the cloud serves only business departments, of how it is managed and delivered, and of whether follow-up projects will keep growing cannot be avoided. So, as mentioned earlier, moving to the cloud is a change in business model, and the organization should change accordingly, for example by adding roles such as service delivery specialists and orchestration design architects.
The Importance of the Cloud Management Platform
The cloud management platform is a very important platform. It not only has its own Gartner magic quadrant, but it also appears throughout the life cycle of our construction work and is in fact an important anchor point for the whole cloud build. Gartner's magic quadrant aside, customers in the domestic market currently prefer localized and customized cloud management platforms. This is because the cloud management platform embodies the customer's management philosophy. To fit the customer's own management practices, heavy customization is inevitable, which is why competition among cloud management platforms is the fiercest.
The cloud management platform has a long history. It came into being years ago, when customers found that user-facing automated delivery could not be achieved through vCenter alone. Today the cloud management platform is the point where the horizontal and vertical dimensions converge and the core access point. Downward, it is compatible with and manages infrastructure systems, including various resource pools such as VMware, KVM, and Power. KVM resource pools are generally managed by calling OpenStack, VMware pools through vCenter, and Power pools through PowerVC. By interacting with this underlying management software, the cloud management platform can see the different resource pools. More importantly, it builds a service layer on top of the resource pools, that is, it organizes different resources into services and delivers them to customers through orchestration. That is the downward dimension of the cloud management platform. The upward dimension is its integration with the PaaS application layer. At present, most cloud-enabled and cloud-adapted applications use traditional middleware and databases, so the cloud management platform mainly uses service orchestration to combine middleware software templates with the underlying virtual machine templates, and delivers the result directly as the customer's application system rather than delivering only the underlying resources as a service. This is currently a hot spot in the development of cloud management platforms, especially for enterprise customers. There is now also the Docker model for hosting applications, so the cloud management platform can give customers a deployment environment and capabilities for cloud-native applications by loading Docker services. 
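The downward dimension described above, one platform in front of several pool managers, maps naturally onto an adapter pattern. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, and the adapters return placeholder strings instead of calling the real OpenStack, vCenter, or PowerVC APIs.

```python
from abc import ABC, abstractmethod


class PoolManager(ABC):
    """Common interface the cloud management platform uses for every pool type."""

    @abstractmethod
    def create_vm(self, name: str, vcpus: int) -> str:
        """Create a virtual machine and return an identifier for it."""


class OpenStackPool(PoolManager):
    def create_vm(self, name: str, vcpus: int) -> str:
        # A real adapter would call the OpenStack API here (placeholder only).
        return f"openstack:{name}:{vcpus}"


class VCenterPool(PoolManager):
    def create_vm(self, name: str, vcpus: int) -> str:
        # A real adapter would call vCenter here (placeholder only).
        return f"vcenter:{name}:{vcpus}"


class PowerVCPool(PoolManager):
    def create_vm(self, name: str, vcpus: int) -> str:
        # A real adapter would call PowerVC here (placeholder only).
        return f"powervc:{name}:{vcpus}"


# Pool type as seen by the service layer -> adapter for its management software.
ADAPTERS = {
    "kvm": OpenStackPool(),
    "vmware": VCenterPool(),
    "power": PowerVCPool(),
}


def deliver(pool_type: str, name: str, vcpus: int) -> str:
    """Service-layer entry point: the caller never sees which manager is used."""
    return ADAPTERS[pool_type].create_vm(name, vcpus)
```

Because the service layer depends only on the `PoolManager` interface, adding a new kind of resource pool means adding one adapter rather than changing the orchestration logic above it.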
To realize the full value of the cloud management platform, horizontal collaboration is also needed, namely operation and maintenance monitoring and a hybrid cloud interface. In the past, cloud platforms were usually experimental, and many management processes, especially ITIL, were not built into them. On the one hand, we want the monitoring provided by the cloud management platform to be integrated, so that it helps us understand the performance and status of the basic resource pool, the virtual machines, and even the middleware. On the other hand, we want services to start and stop automatically according to the orchestrated rules, so that the automation and elasticity of the cloud are fully realized. This requires that the cloud management platform either integrates fully with the monitoring system or provides some monitoring functions itself. For the hybrid cloud, deployment mode, data transmission, and security are still problems to be faced, but the cloud management platform can already manage environments purchased from a remote public cloud, and many cloud management platforms have gradually implemented this integrated management. The next step will surely be deployment, monitoring, and billing across clouds. As one of the most important components of the cloud, the cloud management platform therefore plays a central role in service interaction and delivery, and this cannot be achieved with just a Chinese-localized version of OpenStack. Many software vendors work on the cloud management platform because it does not touch the production layer underneath, so the only issues are customization and ease of use. But customization and ease of use should not be underestimated. One aspect is usability design for the actual users. For example, one problem with OpenStack's own design is that its intended users are virtualization administrators, whereas a cloud management platform is a software system designed for end users. 
These are two different design philosophies and two different directions.
Operation and maintenance management
Once customers enter the cloud world, operation and maintenance management changes dramatically. In the past, a customer's business systems were relatively isolated. They basically relied on high-availability setups such as HA and on stable hosts, networks, and storage, which could be discovered and monitored automatically through APIs and ports. At present, most cloud platforms adopt an SDE (software-defined environment) architecture. Under this architecture the hardware is relatively standardized and simplified. x86 servers, for example, are more like household appliances. Software becomes more important in controlling the hardware. From the virtualization core to storage to networking, everything is implemented in software, which inevitably forces customers to deal with the operating system and even kernel parameters. Horizontally, the overall health of network, storage, and compute has to be considered. For example, in one project we ran into a case where virtual machines dropped off the network. It was traced to long-lived connections stalling on the network card in the OpenStack environment, caused by the old version of the Intel network card driver that ships with Red Hat. Upgrading the driver finally solved the problem. Beyond such cases, compatibility with the hardware must also be considered. In another project we encountered repeated shutdowns and finally traced them to the hardware microcode.
Of course, some people will say that the cloud platform we build should be highly available. Yes, this is very important for the architecture and function of the cloud platform. High availability must be designed in at every level, from storage to OpenStack to the cloud management platform, and with monitoring software in place, high availability at every level makes the whole platform stable. In actual production it is also very important that virtual machines support live migration. Suppose, for example, that the microcode of all physical machines in a resource pool has to be upgraded while many key production workloads are already running. To upgrade the physical machines without affecting the business, the cloud platform nodes can be upgraded one after another, because the platform is highly available and stopping the services on one physical machine does not affect its operation. The compute nodes can be upgraded one after another as well, because VM live migration is supported. These underlying scenarios are only part of operation and maintenance. Another part concerns the integration of business applications and middleware. We have also seen repeated packet loss on just one virtual machine, and the investigation finally showed it to be an application problem.
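The rolling upgrade of compute nodes described above can be sketched as a planning routine. This is a simplified model rather than a real OpenStack or VMware workflow: the function name, the one-number capacity model, and the host and VM names are assumptions for illustration. Each host is emptied by migrating its VMs to hosts with spare capacity, upgraded, and then becomes available as a migration target again.

```python
def rolling_upgrade(placement, capacity):
    """Plan a host-by-host upgrade that never stops a running VM.

    placement: dict mapping host name -> list of VM names currently on it
    capacity:  maximum number of VMs a single host can carry (simplified model)
    Returns (steps, final_placement).
    """
    placement = {host: list(vms) for host, vms in placement.items()}  # work on a copy
    steps = []
    for host in list(placement):
        # Evacuate the host: live-migrate every VM to another host with room.
        for vm in list(placement[host]):
            target = next(
                (h for h in placement if h != host and len(placement[h]) < capacity),
                None,
            )
            if target is None:
                raise RuntimeError(f"no spare capacity to evacuate {host}")
            placement[host].remove(vm)
            placement[target].append(vm)
            steps.append(("migrate", vm, host, target))
        # The host is now empty and can be upgraded without business impact.
        steps.append(("upgrade", host))
    return steps, placement


# Illustrative pool: three hosts, each able to carry three VMs.
steps, final = rolling_upgrade({"h1": ["a", "b"], "h2": ["c"], "h3": []}, capacity=3)
```

The sketch also shows why the pool needs redundancy: if no other host has room, the evacuation fails and the upgrade cannot proceed without downtime.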
So the cloud service is a big container. It may not be a pressure cooker, but the pressure inside is considerable. In today's cloud operation and maintenance services, the whole tool chain is constantly evolving. The tools our cloud operations team commonly uses are a mix of open source and commercial products. SaltStack, Nagios, ELK, and LPAR2RRD are all open source. There are also ITM and PowerShell CLI tools, and in the end you still have to write your own scripts. Customers who want to run the cloud well by themselves should integrate the cloud management platform with these monitoring tools, and they should also invest time in learning and research. Compared with traditional integrated operation and maintenance, cloud operation and maintenance differs in the following ways:
1. It pays attention not only to the hardware and physical layer but also to the whole system and the application layer.
2. It needs more automated deployment of software and systems.
3. It manages applications across their life cycle, for example through automatic software updates.
4. It uses load balancing and automatic capacity expansion to keep the best input-output ratio for the private cloud.
5. It reduces operation and maintenance costs.
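Keeping the best input-output ratio while preserving some redundancy comes down to simple capacity arithmetic. The sketch below is illustrative only: the function names, the vCPU-only capacity model, and the 20 percent redundancy margin are assumptions, not figures from any project.

```python
import math


def hosts_needed(demand_vcpus: float, vcpus_per_host: int, redundancy: float = 0.2) -> int:
    """Hosts required to carry the demand plus a redundancy margin."""
    return math.ceil(demand_vcpus * (1 + redundancy) / vcpus_per_host)


def utilization(demand_vcpus: float, hosts: int, vcpus_per_host: int) -> float:
    """Fraction of the pool's vCPUs that the current demand actually uses."""
    return demand_vcpus / (hosts * vcpus_per_host)


# Illustrative numbers: 64 vCPUs per host, 20 percent redundancy.
peak_hosts = hosts_needed(400, 64)      # size of the pool at peak load
offline_hosts = hosts_needed(200, 64)   # size after many services go offline
```

With these numbers the pool needs 8 hosts at a demand of 400 vCPUs and can shrink to 4 hosts once demand falls to 200 vCPUs. Leaving it at 8 would cut utilization roughly in half.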
Judging from current projects, customers with VMware-based cloud platforms are more concerned with the application layer. Customers with cloud platforms built on OpenStack have to rely more on service providers, or gradually take over operation and maintenance by integrating their own tools and resources. The more fashionable model at present is to provide customers with managed cloud services. A managed cloud service is a butler-style service in which professionals do the professional work. It helps customers maintain and operate a stable OpenStack infrastructure by providing professional operation and maintenance tools or management services. This kind of business can free customers completely, and the professional team brings a shared knowledge base and shared personnel. Of course, this business model gives up full customization, because the customer's hardware and even software, especially OpenStack, must be managed by the operations vendor, and the managed service vendor must monitor and operate the environment remotely in order to meet a given level of service and response.
Cloud security
Cloud computing manages data centrally through resource pools, which is more secure than the traditional approach of scattering data across a large number of terminals. Because the data is concentrated, security audits, security assessments, and security operations are simpler, and fault tolerance, high availability, redundancy, and disaster recovery are easier to achieve. However, traditional IT security threats still exist, so traditional security protection schemes still have a role to play.
Cloud platform security can be divided into two levels: technology and management. Technically, we should apply layering and security domains, and protect every level comprehensively: physical infrastructure, virtualization layer, network, system, application, and data. In terms of management, we need to manage the cloud platform, cloud services, the cloud data life cycle, security events, operation and maintenance monitoring, and measurement and evaluation. A secure cloud is built by taking the changes and risks introduced by cloud computing into account and ensuring the security of the system as a whole. In addition to infrastructure security, virtualization layer security, virtual network boundary security, host security, application security, and data confidentiality and leakage prevention, we should also pay attention to security operations management and to legal and compliance requirements.
At present, most of the projects we have implemented have customers add security controls for multiple applications and for the virtualization layer on top of traditional network and security protection. Because open source software, especially application software, is now widely used, vulnerability scanning and patch management must be carried out before a business system goes online. When bare-metal machines join the cloud platform, protection software must be installed at the same time so that they are not left exposed on the Internet side. Generally speaking, security problems in cloud computing are ubiquitous and severe. The best approach is for security devices to form a pooled resource, as storage devices do, so that when users apply for cloud servers, security resources can be allocated to them on demand together with compute and storage resources. In the public cloud market, some cloud service providers already deliver security to users as a basic attribute. When users buy cloud computing services, they get secure ECS, CDN, RDS, and OSS. I believe that in the near future all kinds of pooled security resources will also be used in private cloud environments. By then, security will be one of the many Lego bricks for building cloud services, which you can choose and control as you wish. The only limits will be your security needs and your imagination.
Summary
Having said all this, what I really want to share are some ideas from years of doing projects. Building a good cloud is like carrying out an engineering project. You must design the drawings from the beginning. With the drawings in hand, you can choose to build it yourself or buy a suitable cloud service directly. Because market competition is fierce and there are no clear standards, it is hard for customers to choose.
My personal suggestion is to evaluate by category and to treat cloud management separately from the underlying layer. Cloud management can eventually be made to work step by step through customization and project work, but replacing the underlying layer later is very time-consuming and laborious. How well the underlying layer fits operation and maintenance is also very important. An underlying layer that is hard to monitor and manage is nothing less than a black hole, so one that offers good manageability, or even its own management tools, becomes very valuable. Beyond that, what you are really evaluating is the team. It is difficult for any vendor's cloud construction team to be good at everything, because conditions vary: skill level, the complexity of the customer site, product maturity, and so on. You can hardly expect a carpenter to lay bricks well. OpenStack implementation services are already developing in the direction of specialization. To solve problems comprehensively, the service and support system is also very important. It is normal to run into problems in the early stage of moving to the cloud. You have built a highly available business environment out of open and relatively cheap software and hardware. Given the pressure on the system and its capabilities, failures and problems are to be expected, so recognizing this and setting up a support service team or purchasing the related services as early as possible can greatly improve the customer experience. Personally, I think the changes brought by SDE will make software more complex and more open, and the business model will gradually shift toward services. The era of trying to conquer the market with a single product version is slowly coming to an end. 
Therefore, the core of buying a cloud, whether public or private, is the service that is purchased. The end user buys and uses a service, and the operator buys services from the manufacturer. For this reason, hosted private clouds will develop across the board within the framework of open source and open systems.