
Sunday, March 23, 2014

R (16) : Order of legend entries in ggplot2



I'm struggling to get the right ordering of variables in a graph I made with ggplot2 in R.
Suppose I have a dataframe such as:
set.seed(1234)
my_df<- data.frame(matrix(0,8,4))
names(my_df) <- c("year", "variable", "value", "vartype")
my_df$year <- rep(2006:2007)
my_df$variable <- c(rep("VX",2),rep("VB",2),rep("VZ",2),rep("VD",2))
my_df$value <- runif(8, 5,10) 
my_df$vartype<- c(rep("TA",4), rep("TB",4))
which yields the following table:
  year variable    value vartype
1 2006       VX 5.568517      TA
2 2007       VX 8.111497      TA
3 2006       VB 8.046374      TA
4 2007       VB 8.116897      TA
5 2006       VZ 9.304577      TB
6 2007       VZ 8.201553      TB
7 2006       VD 5.047479      TB
8 2007       VD 6.162753      TB
There are four variables (VX, VB, VZ and VD), belonging to two variable types (TA and TB).
I would like to plot the values as horizontal bars on the y axis, ordered vertically first by variable groups and then by variable names, faceted by year, with values on the x axis and fill colour corresponding to variable group. (i.e. in this simplified example, the order should be, top to bottom, VB, VX, VD, VZ)
1) My first attempt has been to try the following:
ggplot(my_df,        
    aes(x=variable, y=value, fill=vartype, order=vartype)) +
       # adding or removing the aesthetic "order=vartype" doesn't change anything
     geom_bar(stat = "identity") +  # bar heights come from `value`
     facet_grid(. ~ year) + 
     coord_flip()
However, the variables are listed in reverse alphabetical order rather than grouped by vartype: the order=vartype aesthetic is ignored.
2) Following an answer to a similar question I posted yesterday, I tried the following, based on the post Order Bars in ggplot2 bar graph:
my_df$variable <- factor(
  my_df$variable, 
  levels=rev(sort(unique(my_df$variable))), 
  ordered=TRUE
)
This approach does get the variables in vertical alphabetical order in the plot, but ignores the fact that the variables should be ordered first by variable groups (with TA-variables on top and TB-variables below).
3) The following gives the same as 2 (above):
my_df$vartype <- factor(
  my_df$vartype, 
  levels=sort(unique(my_df$vartype)), 
  ordered=TRUE
)
... which has the same issues as the first approach (variables listed in reverse alphabetical order, groups ignored)
4) Another approach, based on the original answer to Order Bars in ggplot2 bar graph, also gives the same plot as 2, above:
my_df <- within(my_df, 
                vartype <- factor(vartype, 
                levels=names(sort(table(vartype),
                decreasing=TRUE)))
                ) 
I'm puzzled by the fact that, despite several approaches, the aesthetic order=vartype is ignored. Still, it seems to work in an unrelated problem: http://learnr.wordpress.com/2010/03/23/ggplot2-changing-the-default-order-of-legend-labels-and-stacking-of-data/
I hope that the problem is clear and welcome any suggestions.
Matteo
I posted a similar question yesterday but, unfortunately, I made several mistakes when describing the problem and providing a reproducible example. I've listened to several suggestions since, thoroughly searched Stack Overflow for similar questions and applied, to the best of my knowledge, every suggested combination of solutions, to no avail. I'm posting the question again hoping to be able to solve my issue and, hopefully, be helpful to others.

    
    
It's not a duplicate of stackoverflow.com/q/5208679/602276 . Please read the question carefully. –  MatteoS Sep 4 '11 at 13:48
    
It is indeed the same question. You need to specify the levels of your factor in the order that you want them in your plot. The linked answer tells you how to do that. –  Andrie Sep 4 '11 at 13:54
+1 for learning to provide reproducible code. –  Roman Luštrik Sep 4 '11 at 13:58
More generally, I believe there is an issue related to coord_flip() when ordering variables. In my original data frame (not the one shown above), the order of groups in the legend is correct and corresponds to that of the dataframe, but the vertical order of variables is upside-down. (although the plot is conceptually different, the issue is similar to this learnr.files.wordpress.com/2010/03/… ). As far as I can see, this is something beyond an order issue of the dataframe, but an issue concerning the order reversal in ggplot2, possibly related to coord_flip. –  MatteoS Sep 4 '11 at 14:41

1 Answer


This has little to do with ggplot, but is instead a question about generating an ordering of variables to use to reorder the levels of a factor. Here is your data, implemented using the various functions to better effect:
set.seed(1234)
df2 <- data.frame(year = rep(2006:2007), 
                  variable = rep(c("VX","VB","VZ","VD"), each = 2),
                  value = runif(8, 5,10),
                  vartype = rep(c("TA","TB"), each = 4),
                  stringsAsFactors = TRUE)  # explicit: no longer the default since R 4.0.0
Note that variable and vartype need to be factors here. Before R 4.0.0, data.frame() converted strings to factors by default; from 4.0.0 onwards that only happens with stringsAsFactors = TRUE. If they aren't factors, ggplot() will coerce them and then you get left with alphabetical ordering. I have said this before and will no doubt say it again: get your data into the correct format first before you start plotting / doing data analysis.
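A quick way to confirm the column types before plotting is sketched below. It assumes the same column layout as df2; the name df_chk is hypothetical and only used for illustration, and stringsAsFactors = FALSE is set deliberately so the example behaves the same in any R version.

```r
# Hypothetical check data frame with the same layout as df2 (minus `value`)
df_chk <- data.frame(year = rep(2006:2007, times = 4),
                     variable = rep(c("VX", "VB", "VZ", "VD"), each = 2),
                     vartype = rep(c("TA", "TB"), each = 4),
                     stringsAsFactors = FALSE)

# Which columns are factors right now?
sapply(df_chk, is.factor)

# Convert the character columns to factors explicitly
chr_cols <- sapply(df_chk, is.character)
df_chk[chr_cols] <- lapply(df_chk[chr_cols], factor)

# Default factor levels are alphabetical, which is what ggplot() would use
levels(df_chk$variable)
```

Once the columns are factors, the remaining task is only to put their levels in the desired order.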
You want the following ordering:
> with(df2, order(vartype, variable))
[1] 3 4 1 2 7 8 5 6
where you should note that we get the ordering by vartype first and only then by variable within the levels of vartype. If we use this to reorder the levels of variable we get:
> with(df2, reorder(variable, order(vartype, variable)))
[1] VX VX VB VB VZ VZ VD VD
attr(,"scores")
 VB  VD  VX  VZ 
1.5 5.5 3.5 7.5 
Levels: VB VX VD VZ
(ignore the attr(,"scores") bit and focus on the Levels). This has the right ordering, but ggplot() will draw them bottom to top and you wanted top to bottom. I'm not sufficiently familiar with ggplot() to know if this can be controlled, so we will also need to reverse the ordering using decreasing = TRUE in the call to order().
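One caveat: order() returns a permutation of row indices, not a rank for each row. Using it as the score for reorder() works on this data because the permutation happens to coincide with the ranks, which is an observation about this example and not a general property. A minimal sketch of a more direct alternative follows, assuming the same layout as df2; the names df_alt, ord and lev are hypothetical.

```r
# Same layout as df2 (minus `value`); factors requested explicitly
df_alt <- data.frame(year = rep(2006:2007, times = 4),
                     variable = rep(c("VX", "VB", "VZ", "VD"), each = 2),
                     vartype = rep(c("TA", "TB"), each = 4),
                     stringsAsFactors = TRUE)

# Row order: by vartype first, then by variable within vartype
ord <- with(df_alt, order(vartype, variable))

# Variable names in the order they first appear in the sorted rows
lev <- unique(as.character(df_alt$variable[ord]))

# Reverse the levels so the first entry ends up at the top after coord_flip()
df_alt$variable <- factor(df_alt$variable, levels = rev(lev))
levels(df_alt$variable)
```

This sets the levels directly from the sorted rows, so it does not depend on the scores that reorder() averages per level.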
Putting this all together we have:
## reorder `variable` on `variable` within `vartype`
df3 <- transform(df2, variable = reorder(variable, order(vartype, variable,
                                                         decreasing = TRUE)))
Which when used with your plotting code:
ggplot(df3, aes(x=variable, y=value, fill=vartype)) +
       geom_bar(stat = "identity") +  # bar heights come from `value`
       facet_grid(. ~ year) + 
       coord_flip()
produces this:
[Figure: reordered barplot]

I thank you for your solution! It works. However, I've also found, with a thorough search, that my original issue is a particular case of a common nuisance when using coord_flip(). –  MatteoS Sep 4 '11 at 15:38
@MatteoS Do you understand now why people felt this was another duplicate? The answer is the same - reorder the levels of the factor in the order you want them. The issue here was how to derive that ordering. All the ggplot code was superfluous and distracting. It does help to boil problems down to their base level and also tell us exactly what you want. Andrie's Answer was almost spot on until you happened to mention in comments you didn't want to enter the ordering by hand. –  Gavin Simpson Sep 4 '11 at 15:43
Now I see, but ggplot2 is the issue here. With coord_flip(), the axes are flipped, and the variables that were originally ordered L -> R are then ordered B -> T, while the legend does not match them. –  MatteoS Sep 4 '11 at 15:44
@MatteoS Ask away, but I don't see the need for this given the general solution of getting the factor levels in the order you want. –  Gavin Simpson Sep 4 '11 at 17:56
@MatteoS scale_fill_discrete(guide = guide_legend(reverse=TRUE)) would be the equivalent for top.down=TRUE to reverse the order in legend. –  mlt Dec 6 '12 at 6:16
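A sketch of how that suggestion might be combined with the reordered factor from the answer, assuming ggplot2 is installed. The data and df3 are rebuilt so the block runs on its own; the object name p is hypothetical. geom_bar(stat = "identity") is used because the bar heights come from value.

```r
library(ggplot2)

set.seed(1234)
df2 <- data.frame(year = rep(2006:2007, times = 4),
                  variable = rep(c("VX", "VB", "VZ", "VD"), each = 2),
                  value = runif(8, 5, 10),
                  vartype = rep(c("TA", "TB"), each = 4),
                  stringsAsFactors = TRUE)

# Reorder `variable` within `vartype`, reversed for coord_flip()
df3 <- transform(df2, variable = reorder(variable,
                                         order(vartype, variable,
                                               decreasing = TRUE)))

p <- ggplot(df3, aes(x = variable, y = value, fill = vartype)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ year) +
  coord_flip() +
  # reverses only the legend entries; bar order is set by the factor levels
  scale_fill_discrete(guide = guide_legend(reverse = TRUE))

print(p)
```

Whether the reversal is wanted depends on the data. In this small example the default legend already lists TA above TB, matching the bars, so the scale_fill_discrete() line is shown only to illustrate the mechanism.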
