Challenges to digging deeper into genetic big data

As a new type of genetic testing technology, gene sequencing can analyze and determine the full sequence of a person's genes from blood or saliva and predict the likelihood of developing various diseases, as well as the individual's behavioral characteristics and the reasonableness of his or her behavior. It can also pinpoint an individual's disease-related genes so that prevention and treatment can begin early. For this reason, the listing of UBM this year triggered enthusiastic interest from the capital market.

At the Fourth National Functional Genomics Summit Forum held in Beijing recently, many experts exchanged views on the development direction of gene technology and the opportunities and challenges it faces.

Gene sequencing has a wide range of uses

Gene sequencing products and technologies have now moved from laboratory research into clinical application. Some scholars even believe that gene sequencing may be the next world-changing technology, since it plays an irreplaceable role in the natural world and even in the human world.

In May this year, a joint research team led by the Kunming Institute of Botany of the Chinese Academy of Sciences (CAS) overcame the difficulties of sequencing the tea tree genome through a series of key techniques, including genome library construction and sequencing, and became the first in the world to obtain a high-quality tea tree genome sequence.

Gao Lizhi, a researcher at the Kunming Institute of Botany of the Chinese Academy of Sciences, said that this is an important contribution to revealing the genetic basis that determines the fitness, flavor, and quality of tea, as well as the global ecological adaptability of the tea tree.

As another example, Zhang Xianlong's team at Huazhong Agricultural University resequenced the whole genomes of both cultivated and wild cotton varieties and found a clear pattern of asymmetric selection between subgenomes during the artificial selection of cotton. "More than 10 years of functional genomic research has identified more than 20 genes related to the formation of important traits, which will play an important role in cotton molecular design breeding," Wang Maojun, a member of Zhang Xianlong's team, told China Science Daily.

Gene sequencing also plays an important role in the development of human medicine. Chen Runsheng, a researcher at the Institute of Biophysics of the Chinese Academy of Sciences and an academician of the Chinese Academy of Sciences, said that precision medicine based on omics big data is an epoch-making industry that many countries have written into their strategic plans. It has the potential to directly address many of the difficulties currently facing the healthcare industry and will see explosive growth in the next few years, with the global market expected to reach $223.8 billion by 2018.

The era of genetic big data begins

Zheng Hongkun, former head of UW Genetics' science and technology services and chairman of Beijing Baimaker Biotechnology Co. Ltd, pointed out that as gene sequencing technology continues to develop and its cost falls sharply, and with the country's strong support for and investment in genetic research, scientists are probing ever deeper into genes and accumulating ever more genetic big data. "The world has cumulatively spent tens of billions of dollars and has produced nearly 20Pb of massive genetic data," he said.

"The development of sequencing technology has allowed genetic data to accumulate at a rate far exceeding Moore's Law, and the massive amount of data puts new demands on researchers," said Zhang Zhang, a researcher at the Beijing Institute of Genomics, Chinese Academy of Sciences.

According to Zhang Zhang, incomplete statistics show that China produces about 40% of the world's life omics data, yet these valuable data resources are handed over to others to manage, mainly because China has long lacked a biological big data center covering multi-omics data resources. To address this, the Life and Health Big Data Center of the Beijing Institute of Genomics, Chinese Academy of Sciences, has focused on omics data relating to national precision medicine and important strategic biological resources. It has established a system for the storage, integration, mining, and analysis of massive life omics data, and has initially built a platform for aggregating and sharing life and health multi-omics data.

Urgent need for deep mining and scientific interpretation

Compared with other countries, China's progress in genomics and gene sequencing has not been slow. On the academic side, institutions such as the Beijing Institute of Genomics of the Chinese Academy of Sciences and the Institute of Genomics of the Chinese Academy of Agricultural Sciences are strong, and companies engaged in gene sequencing, such as Huada Genetics and Baimaike, are gradually growing. In the view of experts, however, the challenges facing genomics remain considerable: with rapid development in fields such as information technology and instrumentation, the data are growing ever larger and more complex, and new indicators and parameters keep being added.

"In the face of massive sequencing results, the serious challenges in deep data mining and interpretation are becoming increasingly obvious. How to make use of these data resources in the era of genetic big data has become an important issue in the new era of biological research," Zheng Hongkun said.

Chen Runsheng also pointed out that the rapidly accumulating data have not yet been interpreted efficiently, that the integration of highly heterogeneous data is still in its infancy, and that challenges at the sample end directly threaten data quality. But he added, "These challenges often mean opportunities, and the large amount of uninterpreted data also brings the possibility of unlimited innovation."