A heat map is a data visualization technique that uses color to show the density and distribution of data. In a heat map, darker colors mark more densely populated areas and lighter colors mark less densely populated areas.
Here are the steps for drawing a heat map:
1. First, determine the data and the areas to be displayed. For example, show the house prices in each district of a city.
2. Categorize the regions according to the data to be displayed, for example by dividing house prices into grades or sales into levels. This makes the differences between regions clear.
3. Assign a value to each region based on the categorized data. For example, convert the average house price or the sales rank of each region into a numerical value.
4. Set the colors according to the range of values. Choose a color scheme with enough contrast that different values are easy to tell apart.
5. Match the colors to the values and fill each area on the map with its corresponding color. This can be done with a variety of tools, such as Photoshop, Illustrator, Python, or R.
6. Create a legend that shows the correspondence between values and colors.
7. Check the heat map to make sure the correspondence between colors and values is correct and easy to understand, then publish and share it.
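The steps above can be sketched in a few lines of Python. This is only a minimal illustration: the district names, prices, grade boundaries, and palette are all made-up assumptions, and actually filling shapes on a map is left to whatever drawing tool is used.

```python
from bisect import bisect_right

# Hypothetical average house prices per district (illustrative values only)
prices = {"North": 8200, "South": 15400, "East": 23100, "West": 31000}

# Assumed grade boundaries, plus a light-to-dark palette with one more
# color than there are boundaries
BREAKS = [10000, 20000, 30000]
PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def grade(value, breaks=BREAKS):
    """Steps 2 and 3: return the index of the grade a value falls into."""
    return bisect_right(breaks, value)


def color_for(value):
    """Steps 4 and 5: map a value to the fill color of its grade."""
    return PALETTE[grade(value)]


def legend(breaks=BREAKS, palette=PALETTE):
    """Step 6: build (label, color) pairs describing each grade."""
    labels = [f"< {breaks[0]}"]
    labels += [f"{lo} to < {hi}" for lo, hi in zip(breaks, breaks[1:])]
    labels.append(f">= {breaks[-1]}")
    return list(zip(labels, palette))


# Fill color for every district
fills = {region: color_for(price) for region, price in prices.items()}

# Step 7: check that a higher price never gets a lighter grade
ordered = sorted(prices, key=prices.get)
assert all(grade(prices[a]) <= grade(prices[b])
           for a, b in zip(ordered, ordered[1:]))
```

Here `fills` holds one color per district and `legend()` returns the label and color pairs to draw next to the map.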
In short, the heat map is a data visualization technique that shows the density and distribution of data intuitively, which makes it an indispensable tool in data analysis and decision-making.
The most complete guide ever to heatmap drawing tools and workflow (Part 1)
The term heatmap (heat map) is surely familiar to you, and heatmaps are very common in high-impact scientific papers. A heatmap can easily show the relationship or correlation between multiple components, and can also show before-and-after differences in gene expression. A heatmap actually packs in a lot of analysis, so how do you produce such a sophisticated figure?
Heatmaps have a wide range of applications. Before introducing the heatmap drawing tools, I will first cover the basic concepts, history, and uses of heatmaps.
Basic concept of heatmaps
A heatmap, also written heat map, uses changes in color to represent the data in a two-dimensional matrix or table, so that the magnitude of each value can be read directly from the shade of its color. Heat maps allow complex data to be visualized and understood at a glance. With species-abundance data, for example, the data are usually clustered according to the similarity of abundance between species or samples, and the clustered data are then drawn on the heatmap, so that high-abundance and low-abundance species gather into blocks. The color gradient and the degree of similarity reflect the similarities and differences in community composition among multiple samples at each taxonomic level. The results are commonly rendered in either a rainbow color scheme or a black-and-red scheme.
There are two types of heatmaps: the cluster heatmap and the spatial heatmap. In a cluster heat map, values are laid out in a matrix of rows and columns with a fixed cell size; the cell size itself can be chosen freely. In a spatial heat map, the size and position of each value are fixed by its location in space.
The principle behind generating a heatmap can be summarized as follows. First, set a radius for each discrete point to create a buffer. Then fill each point's buffer with a progressive grayscale band (the complete band runs from 0 to 255), shading gradually from light at the center to dark at the edge. Gray values can be added together, and the larger the value, the brighter the color, that is, the whiter it appears in the grayscale band. In practice, any channel of the ARGB model can be used to hold the accumulated gray value. Where buffers overlap, their gray values are summed, so the more buffers overlap, the larger the gray value and the "hotter" the region. Finally, the accumulated gray value is used as an index into a 256-color band, and the image is recolored accordingly to produce the heat map.
[Figure: grayscale band]
[Figure: color band]
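The generation principle described above can be tried out with a short Python sketch. It rests on several assumptions that are not in the original description: the buffer fades linearly from a peak stamp of 128 at the center to 0 at the edge, the accumulated gray value is capped at 255, and the 256-color band is a simple blue-to-green-to-red ramp. The point coordinates are made up.

```python
import math


def accumulate(points, width, height, radius):
    """Stamp a radial buffer around each point and sum the gray values."""
    gray = [[0] * width for _ in range(height)]
    for px, py in points:
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                d = math.hypot(x - px, y - py)
                if d <= radius:
                    # Assumed falloff: brightest at the center, 0 at the edge
                    stamp = int(128 * (1 - d / radius))
                    # Overlapping buffers add up, capped at the band maximum
                    gray[y][x] = min(255, gray[y][x] + stamp)
    return gray


def make_palette():
    """Assumed 256-entry color band: blue (cold), green, red (hot)."""
    palette = []
    for i in range(256):
        t = i / 255
        r = int(255 * max(0.0, 2 * t - 1))
        g = int(255 * (1 - abs(2 * t - 1)))
        b = int(255 * max(0.0, 1 - 2 * t))
        palette.append((r, g, b))
    return palette


def colorize(gray, palette):
    """Use each accumulated gray value as an index into the color band."""
    return [[palette[v] for v in row] for row in gray]


# Hypothetical discrete points (x, y); the first two buffers overlap
points = [(10, 10), (12, 10), (30, 25)]
gray = accumulate(points, width=40, height=40, radius=6)
image = colorize(gray, make_palette())
```

In `gray`, the pixels where the first two buffers overlap end up with a larger value than the center of the isolated third point, which is exactly what makes that region "hotter" after recoloring.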
History of heat maps
Heat maps are actually not a new concept, dating back to the 19th century.
Heat maps originated from the display of two-dimensional values in data matrices. Larger values were represented by small dark gray or black squares (pixels) and smaller values by lighter squares. In 1873, Loua used a shaded matrix to visualize social statistics across the districts of Paris. In 1957, Sneath displayed the results of a cluster analysis by permuting the rows and columns of a matrix so that similar values were placed near each other according to the clustering. Later, Jacques Bertin used a similar method to display data that conformed to a Guttman scale. The idea of joining cluster trees to the rows and columns of a data matrix came from Robert Ling in 1973, who used printer characters to represent different shades of gray, with each pixel one character wide. In 1991, software designer Cormac Kinney registered "heatmap" as a trademark for a tool he invented to display real-time financial market information in 2D graphics. In 1994, Leland Wilkinson developed the first computer program (SYSTAT) to produce cluster heat maps with high-resolution color graphics. Today, heatmaps can still be created by hand, in Excel spreadsheets, or with specialized software such as Hotjar.
Four types of heat maps
The first, the biological heat map, is typically used in molecular biology to show the expression levels of many genes across a large number of comparable samples (cells in different states, samples from different patients) obtained from DNA microarrays.
The second, the treemap, is a 2D hierarchical partitioning of the data that visually resembles a heat map.
The third, the mosaic plot, is a tiled heat map used to represent a two-way or higher-way table of data. As in a treemap, the rectangular regions of a mosaic plot are organized hierarchically, which means that the regions are rectangles rather than squares.
The fourth, the density function visualization, is a heat map that represents the density of points on a map, so that the density can be perceived independently of the zoom level. In a method proposed by Perrot et al. in 2015, billions of points can be visualized with density functions by using big data infrastructure such as Spark and Hadoop.
Uses of heat maps in various fields
Heat maps condense the information in many data points into an intuitive color display. They are now widely used in many fields, such as weather forecasting, medical imaging, and machine-room temperature monitoring, and even in data analysis for competitive sports.
During a World Cup soccer match, commentators often use heat maps to show where the goalkeepers, defenders, midfielders, and strikers of the champion team ran, so that the differences in the movement of multiple players during the match can be seen at a glance.
A meteorological bureau can also use heat maps to plot the locations of earthquake epicenters, making it clear which areas are most prone to earthquakes (those with the highest frequency).
Heat maps can also be combined with Baidu Maps to locate a city's financial and business district: the coordinates of merchants are collected and the points are clustered by coordinate. In the following chart, red marks the places with the most businesses, which shows where the financial and business district is.
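As a rough illustration of the merchant example, the sketch below counts points per grid cell and picks the densest cell. Note that this is simple grid binning, a stand-in for the clustering mentioned above rather than a real clustering algorithm, and the coordinates and cell size are invented for the example.

```python
import math
from collections import Counter


def grid_density(coords, cell_size):
    """Count how many points fall into each square grid cell."""
    counts = Counter()
    for lon, lat in coords:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        counts[cell] += 1
    return counts


# Hypothetical merchant coordinates as (longitude, latitude)
merchants = [
    (116.401, 39.901),
    (116.402, 39.903),
    (116.404, 39.902),
    (116.455, 39.951),
    (116.321, 39.882),
]

counts = grid_density(merchants, cell_size=0.01)

# The cell with the most merchants would be drawn in the hottest color
hottest_cell, hottest_count = counts.most_common(1)[0]
```

Each cell's count can then be turned into a color in the same way as any other heat map value.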
So what are heat maps used for in biology?
Heat maps are often used to show the expression levels of multiple genes in different samples, which are then clustered to see how the experimental and control groups differ.
As shown above, each column represents a sample, each row represents a gene, and the color represents the amount of expression (the legend of this image shows that the redder the color, the larger the value, and the higher the amount of gene expression).
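To make the clustering step more concrete, here is a small Python sketch that reorders the rows (genes) of an expression matrix so that similar rows sit next to each other. It is a deliberately simplified average-linkage agglomerative clustering on Euclidean distance, written from scratch for illustration; the gene names and values are made up, and real analyses normally rely on library routines such as `scipy.cluster.hierarchy` or `seaborn.clustermap` instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_order(rows):
    """Return row indices ordered by average-linkage agglomerative clustering."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters
        total = sum(euclidean(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them
        a, b = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


# Hypothetical expression matrix: rows are genes, columns are samples
# (two control samples followed by two treated samples)
genes = ["geneA", "geneB", "geneC", "geneD"]
expression = [
    [9.0, 8.5, 1.0, 1.2],  # geneA: high in control
    [1.1, 0.9, 8.8, 9.1],  # geneB: high in treated
    [8.7, 9.2, 0.8, 1.1],  # geneC: high in control
    [0.9, 1.3, 9.0, 8.6],  # geneD: high in treated
]

order = cluster_order(expression)
ordered_genes = [genes[i] for i in order]
ordered_matrix = [expression[i] for i in order]
```

After reordering, geneA and geneC end up adjacent, as do geneB and geneD, so the heat map shows two clean color blocks. Columns can be ordered the same way by passing the transposed matrix, `list(zip(*expression))`.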
Heat maps can also be used to show the abundance of other substances, such as the relative abundance of a particular bacterium, or the amount of different substances in a metabolome. Of course, another important use of heat maps is to show correlations between different metrics, samples, and so on.
The figure above is a correlation heat map in which color represents the size of the correlation coefficient: the closer the color is to white, the weaker the correlation, while bluer cells indicate a stronger negative correlation and redder cells a stronger positive correlation. Besides the correlation coefficient, we also want to know whether the p-value from the correlation calculation is significant. To show the p-value, an asterisk (*) or the specific value can be placed in the cell. Because every pair of indicators appears twice in the figure above, it is often enough to show only half of the matrix (above or below the diagonal), as shown in the figure below:
Well, that is all for this installment; the next one will reveal more of the mysteries of the heat map.
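A half-matrix of this kind is easy to build by hand. The following Python sketch computes Pearson correlation coefficients and keeps only the cells on or below the diagonal; the indicator names and values are invented, and the p-value annotation is left out because it would need a statistics library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lower_triangle_corr(columns):
    """Correlation table with cells above the diagonal set to None."""
    names = list(columns)
    table = []
    for i, a in enumerate(names):
        row = [
            round(pearson(columns[a], columns[b]), 2) if j <= i else None
            for j, b in enumerate(names)
        ]
        table.append(row)
    return names, table


# Hypothetical indicators measured on five samples
data = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.1, 5.9, 8.2, 9.8],  # rises with x
    "z": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as x rises
}

names, table = lower_triangle_corr(data)
```

Cells holding `None` are simply left blank when drawing, positive values are colored toward red, and negative values toward blue.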
How to make a heat map of China with Excel
1. The first step is to prepare a vector map before creating the heat map. Take the administrative map of China as an example; the shape of each province must be separately editable.
2. Next, make sure macros are enabled in Excel and add the "Developer" tab to the ribbon: "File" - "Options" - "Customize Ribbon" - "Main Tabs" - check "Developer".
3. After adding it, return to the Excel worksheet; the "Developer" tab now appears in the menu bar.
4. Then set up a cell range that temporarily stores the data. It should contain three values: the region name, the data value for that region, and the color for that region. Any empty cells can be used.
5. Next, name each range or cell: select it and type a name in the Name Box. For example, select cell $J$3, type "ActReg" in the Name Box, and press Enter. The other ranges below are named in the same way.
6. Name the first cell ActReg; it temporarily stores the pinyin name of the "current region". Name the second cell ActRegValue; it temporarily stores the indicator value of the "current region".
7. Next, set the formulas for cells $J$4 and $J$5, as follows:
$J$4: =VLOOKUP(ActReg, RegData, 2, 0)
8. Then insert a "Button" from the "Developer" tab and place it above the map.
9. Select the button and double-click it to open the Visual Basic editor, then enter the following code:
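Private Sub CommandButton1_Click()
    Dim i As Long
    For i = 4 To 34  ' region rows on Sheet1
        Range("ActReg").Value = Range("Sheet1!B" & i).Value
        ' Assumed reconstruction: each map shape is named after its region
        ActiveSheet.Shapes(Range("ActReg").Value).Select
        ' Assumed reconstruction: use the fill color of the cell named in ActRegCode
        Selection.ShapeRange.Fill.ForeColor.RGB = Range(Range("ActRegCode").Value).Interior.Color
    Next i
End Sub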
10. Finally, save and close the VBA editor and return to Excel. Click the button, and each region of the map is filled with the color that corresponds to its value range.