Wednesday, September 4, 2013

Poll: R top language for data science three years running

Via R-bloggers:
I am not surprised that R is the top language. I am a little let down by the visualization.
1. It uses bar length to represent a percentage and compares it year over year. Length should be used to visualize a quantity (e.g., population). It is understandable that the poll cannot get an accurate count of each language's user base (only ~700 votes), but the bars imply growth in the number of users, not a change in proportion.
Given the growing interest in "data science", it is likely that the user base of every language is growing and that more of the newcomers simply come from the R camp.
2. Grouping the bars by language makes it hard to compare one language against another.
3. I don't like the color scheme. :)
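
To make point 1 concrete, here is a rough Python sketch of how a language's share can shrink while its user base still grows. The community sizes and percentages are made up for illustration (they are not from the poll), and `implied_users` is just a hypothetical helper name.

```python
# Hypothetical numbers for illustration only; none of these come from the poll.

def implied_users(total_users, share_pct):
    """Number of users implied by a community size and a percentage share."""
    return total_users * share_pct / 100

# Assume the overall community doubles while one language's share drops.
users_year1 = implied_users(total_users=1000, share_pct=30)
users_year2 = implied_users(total_users=2000, share_pct=25)

share_change = 25 - 30                     # change in percentage points
user_change = users_year2 - users_year1    # change in head count

print("share change (points):", share_change)
print("user change (count):", user_change)
```

In this made-up case the share falls by 5 points while the user count goes up, which is why a shorter bar should not be read as a shrinking user base.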

Also, the poll itself is questionable. The questionnaire seems to ask for only one language. However, many data analysts use more than one language to handle different tasks: R + Python, SQL + SAS, etc.

