
I expected geom_tile and heatmap to be interchangeable, but they give completely different results on my data.

I think I understand why: the variables are expressed in different units. The heatmap function accounts for that and only compares values within the same variable (the same column), whereas geom_tile requires all the variables in the dataset to be expressed in the same unit.

1) Is my assumption wrong? 2) Is there a way to use geom_tile and obtain the same result that heatmap produces?

Example using heatmap function:

library(ggplot2)
library(RColorBrewer)
library(readr)
url_soccer <-  'https://raw.githubusercontent.com/frm1789/soccer_ea/master/Example_Data_Matrix_heatmap.csv'

df_matrix <- read_csv(url_soccer)
# Order rows by number of titles (ascending)
df_matrix <- df_matrix[order(df_matrix$Titles, decreasing = FALSE), ]
df_matrix <- data.frame(df_matrix)

# Move the team names into the row names and drop the Team column
row.names(df_matrix) <- df_matrix$Team
df_matrix <- df_matrix[, -1]

options(digits=2)
df_matrix$Points_1 <- sub(',', '.', df_matrix$Points_1)
df_matrix$Points_1 <- as.double(df_matrix$Points_1)

# Convert "Performance" to numeric: drop the trailing character,
# replace the decimal comma with a point, then take the log
df_matrix$Performance <- substr(df_matrix$Performance, 1, nchar(df_matrix$Performance) - 1)
df_matrix$Performance <- sub(',', '.', df_matrix$Performance)
df_matrix$Performance <- as.double(df_matrix$Performance)
df_matrix$Performance <- log(df_matrix$Performance)

small_matrix <- data.matrix(df_matrix)

# Create the heatmap, scaling each column separately
america_heatmap <- heatmap(small_matrix, Rowv = NA, Colv = NA,
                           col = brewer.pal(9, "Blues"),
                           scale = "column",
                           margins = c(2, 6))

Result using heatmap

Example using geom_tile function:

url_soccer <- 'https://raw.githubusercontent.com/frm1789/soccer_ea/master/Example_Data_format_ggplot_geom_tile.csv'

df_exa <- read_csv(url_soccer)
# Refer to columns by bare name inside aes(), not df_exa$...
ggplot(data = df_exa, aes(x = country, y = metric)) +
  geom_tile(aes(fill = value)) +
  coord_flip() +
  theme_minimal()

Result obtained using geom_tile

A89
  • `heatmap()` does a lot of the ordering for you. You have to do that deliberately in ggplot2 – hrbrmstr Nov 24 '18 at 02:28
  • @hrbrmstr I didn't know that. But am I right that heatmap lets you compare variables with different units, while geom_tile requires a single unit for the whole dataset? – A89 Nov 24 '18 at 05:29

1 Answer


Point 1) The initial assumption is correct.

In this example, it is not possible to pass the raw values to geom_tile and get the same picture. geom_tile draws one rectangle (a tile) per observation and maps the fill variable through a single colour scale shared by the whole plot. It has no parameter for scaling by column or by row, so it effectively assumes that the whole dataset is expressed in the same unit.

Here the variables are expressed in different units (goals, performance, points), and their values are not directly comparable.

On the other hand, heatmap has a "scale" parameter. In this case we use scale = "column", which tells it to center and scale the values within each column before assigning colours.
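To make the effect of scale = "column" concrete, here is a minimal sketch in base R. The matrix `m` and its numbers are made up for illustration (they are not taken from the question's CSV); the point is only that each column is standardised on its own, so columns in very different units end up on a comparable range.

```r
# Made-up matrix with two columns in very different units (illustrative only).
m <- cbind(Titles = c(14, 9, 15), Points = c(310, 280, 250))
rownames(m) <- c("Team A", "Team B", "Team C")

# Roughly what heatmap(m, scale = "column") does before picking colours:
# subtract each column's mean and divide by its standard deviation.
m_scaled <- scale(m, center = TRUE, scale = TRUE)
m_scaled
```

After this step every column has mean 0 and standard deviation 1, which is why a single colour scale works across all of them.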

Point 2) There is a way to do that; see "Heat map per column with ggplot2".
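One way to get there, not necessarily the exact approach in the linked answer, is to scale the values within each metric yourself and then map the scaled column to fill. The sketch below assumes long-format data with the same column names as df_exa (country, metric, value); the data frame `df_demo` and its numbers are hypothetical.

```r
library(ggplot2)

# Hypothetical long-format data with the same column names as df_exa;
# the values are made up for illustration.
df_demo <- data.frame(
  country = rep(c("Team A", "Team B", "Team C"), times = 2),
  metric  = rep(c("Titles", "Points"), each = 3),
  value   = c(14, 9, 15, 310, 280, 250)
)

# Center and scale the values within each metric separately,
# mirroring the per-column scaling of heatmap(..., scale = "column").
df_demo$scaled <- ave(
  df_demo$value, df_demo$metric,
  FUN = function(v) (v - mean(v)) / sd(v)
)

# Fill by the scaled value, so one colour scale is meaningful for every metric.
p <- ggplot(df_demo, aes(x = metric, y = country, fill = scaled)) +
  geom_tile() +
  scale_fill_distiller(palette = "Blues", direction = 1) +
  theme_minimal()
p
```

Row order is still up to you in ggplot2: to match the heatmap's ordering by titles you would need to set the factor levels of country explicitly.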

A89