Friday, August 30, 2013

Plot Weekly or Monthly Totals in R

Via R-bloggers
library(ggplot2)
library(scales)

# load data:
log <- data.frame(Date = c("2013/05/25","2013/05/28","2013/05/31","2013/06/01","2013/06/02","2013/06/05","2013/06/07"),
  Quantity = c(9,1,15,4,5,17,18))
log
str(log)

> log
        Date Quantity
1 2013/05/25        9
2 2013/05/28        1
3 2013/05/31       15
4 2013/06/01        4
5 2013/06/02        5
6 2013/06/05       17
7 2013/06/07       18

> str(log)
'data.frame': 7 obs. of  2 variables:
 $ Date    : Factor w/ 7 levels "2013/05/25","2013/05/28",..: 1 2 3 4 5 6 7
 $ Quantity: num  9 1 15 4 5 17 18
# convert date variable from factor to date format:
log$Date <- as.Date(log$Date,
  "%Y/%m/%d") # format of the input strings; see ?strptime for all the codes
str(log)
> str(log)
'data.frame': 7 obs. of  2 variables:
 $ Date    : Date, format: "2013-05-25" "2013-05-28" ...
 $ Quantity: num  9 1 15 4 5 17 18
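The format string is the part that usually trips people up, so here is a minimal sketch of a few common codes. Only "2013/05/25" comes from the data above; the other two input strings are made up for illustration. The full list of codes is in ?strptime.

```r
# The same day written three ways; only the format string changes.
# The second and third input strings are illustrative, not from the data above.
d1 <- as.Date("2013/05/25", format = "%Y/%m/%d") # %Y = 4-digit year, %m = month, %d = day
d2 <- as.Date("25.05.2013", format = "%d.%m.%Y") # day first, dots as separators
d3 <- as.Date("05/25/13", format = "%m/%d/%y")   # %y = 2-digit year

# The same codes work in the other direction with format():
format(d1, "%Y-%m") # keeps year and month only
```

Sticking to numeric codes such as %m avoids surprises with month names, which depend on the locale.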
# create variables of the week and month of each observation:
log$Month <- as.Date(cut(log$Date,
  breaks = "month"))
log$Week <- as.Date(cut(log$Date,
  breaks = "week",
  start.on.monday = FALSE)) # changes weekly break point to Sunday
log

> log
        Date Quantity      Month       Week
1 2013-05-25        9 2013-05-01 2013-05-19
2 2013-05-28        1 2013-05-01 2013-05-26
3 2013-05-31       15 2013-05-01 2013-05-26
4 2013-06-01        4 2013-06-01 2013-05-26
5 2013-06-02        5 2013-06-01 2013-06-02
6 2013-06-05       17 2013-06-01 2013-06-02
7 2013-06-07       18 2013-06-01 2013-06-02
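To see what start.on.monday actually changes, here is a small sketch that labels the same dates both ways. The names dates, week_mon, week_sun and week_compare are mine, not from the original post.

```r
# Same dates as in the post, rebuilt so this block runs on its own.
dates <- as.Date(c("2013/05/25", "2013/05/28", "2013/05/31", "2013/06/01",
                   "2013/06/02", "2013/06/05", "2013/06/07"), "%Y/%m/%d")

# Week label when weeks start on Monday (the default) vs. on Sunday.
week_mon <- as.Date(cut(dates, breaks = "week"))
week_sun <- as.Date(cut(dates, breaks = "week", start.on.monday = FALSE))

# Side-by-side comparison of the two labelings.
week_compare <- data.frame(Date = dates, Week_Mon = week_mon, Week_Sun = week_sun)
week_compare
```

The two labelings never agree, since one always lands on a Monday and the other on a Sunday. What matters is the grouping: 2013-06-02 is a Sunday, so it opens a new week under the Sunday rule but closes the week of 2013-05-27 under the Monday rule.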
# graph by month:
ggplot(data = log,
  aes(Month, Quantity)) +
  stat_summary(fun = sum, # adds up all observations for the month (use fun.y with ggplot2 < 3.3.0)
    geom = "bar") + # or "line"
  scale_x_date(
    labels = date_format("%Y-%m"),
    breaks = "1 month") # custom x-axis labels
# graph by week:
ggplot(data = log,
  aes(Week, Quantity)) +
  stat_summary(fun = sum, # adds up all observations for the week (use fun.y with ggplot2 < 3.3.0)
    geom = "bar") + # or "line"
  scale_x_date(
    labels = date_format("%Y-%m-%d"),
    breaks = "1 week") # custom x-axis labels
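The bar heights in both plots are just grouped sums, so they can be checked without drawing anything. A minimal sketch using base R's aggregate(); the names monthly and weekly are mine, and the data frame is rebuilt so the block runs on its own.

```r
# Rebuild the example data with the Month and Week columns from above.
log <- data.frame(
  Date = as.Date(c("2013/05/25", "2013/05/28", "2013/05/31", "2013/06/01",
                   "2013/06/02", "2013/06/05", "2013/06/07"), "%Y/%m/%d"),
  Quantity = c(9, 1, 15, 4, 5, 17, 18))
log$Month <- as.Date(cut(log$Date, breaks = "month"))
log$Week <- as.Date(cut(log$Date, breaks = "week", start.on.monday = FALSE))

# Sum Quantity within each month and within each (Sunday-based) week.
monthly <- aggregate(Quantity ~ Month, data = log, FUN = sum)
weekly <- aggregate(Quantity ~ Week, data = log, FUN = sum)

monthly # May: 9 + 1 + 15 = 25; June: 4 + 5 + 17 + 18 = 44
weekly  # 9, then 1 + 15 + 4 = 20, then 5 + 17 + 18 = 40
```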

Tuesday, August 27, 2013

Moved from old blog: Eric Schmidt on Android history

Google CEO: Android Predated iPhone

A wise move, considering what the company’s prototype Android handset looked like before the debut of the iPhone, and what the first Android smartphone — the HTC Dream — looked like when it finally arrived at market.
I would add: “and what today’s Android phones look like.”
[Image: Android before and after the iPhone]

Monday, August 26, 2013

Data Scientists vs. Statisticians

According to Harvard Business Review, data scientist is the sexiest job of the 21st century.

Nate Silver, the rock star of statisticians/data scientists, said "Data scientist is just a sexed up word for statistician." Of course statisticians love this, but data scientists don't necessarily agree. Some just want us all to get along. And sure, we need a Venn diagram:

Two sorts of frankness in a blog

Via Andrew Gelman: obnoxiousness and openness.
The freedom to offend others, and the freedom to expose one’s own uncertainty or weakness. I agree with Andrew that even though we see the first more often in the blogosphere, the second is important too. I would add that exposing one’s own uncertainty voluntarily, even in a blog, takes courage.

Excel Error, Again!

Via Business Insider:
Bombshell Paper Claims That Microsoft Excel Coding Error Is Behind The Reinhart-Rogoff Study On Debt


Coding Error. As Herndon-Ash-Pollin puts it: "A coding error in the RR working spreadsheet entirely excludes five countries, Australia, Austria, Belgium, Canada, and Denmark, from the analysis. [Reinhart-Rogoff] averaged cells in lines 30 to 44 instead of lines 30 to 49..."

Lessons learned?
1. Excel is a terrible tool for data analysis.
2. Reproducible research needs more advocates! Like Biostatistics Ryan Gosling?


In Excel, formulas are hidden unless you select the cells. This error is pretty straightforward. Now imagine real "code" that involves VBA programming, or just several layers of formulas: error checking becomes next to impossible.
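For contrast, here is a minimal sketch of the same kind of range mistake in R. The numbers are placeholders (1 to 20), not the Reinhart-Rogoff data, and all the variable names are made up. The 20 values stand in for lines 30 to 49, of which the spreadsheet formula covered only the first 15.

```r
# Placeholder values standing in for 20 spreadsheet rows (NOT the real data).
growth <- 1:20

# The mistake: averaging only the first 15 rows, like a formula range that stops short.
used <- growth[1:15]
partial_mean <- mean(used)

# Averaging the whole vector needs no hand-typed range at all.
full_mean <- mean(growth)

# Counting the dropped rows makes the omission visible.
dropped <- length(growth) - length(used)
dropped
```

The range is sitting right there in the script where anyone reading it can see it, and the row count is one line to check.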

Folks, please learn R! Even better, learn knitr!