Friday, November 28, 2014

Helping Santa's Helpers Kaggle Competition

You can help the elves in Santa's Workshop pack toys in the most efficient way and win $20,000.
Only 40 days left!
More about the competition: http://www.kaggle.com/c/helping-santas-helpers

Friday, November 21, 2014

What is Spark?

Here you can read a slightly old but great article from IBM Developers.

Wednesday, November 12, 2014

Plotting multiple graphs on one page

R + ggplot2 are my favourite tools for building plots.
Today I needed to put a few graphs on one page.
A solution was found.

You can easily build plots like this:
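
The original example is not preserved here, so below is only a minimal sketch of one common way to do it. It assumes the gridExtra package (its grid.arrange function) is installed alongside ggplot2; the built-in mtcars dataset and the four plots are purely illustrative choices, not the ones from the original post.

```r
# Assumption: ggplot2 and gridExtra are installed;
# mtcars ships with R and is used only for illustration.
library(ggplot2)
library(gridExtra)

# Build each plot separately and keep it in a variable.
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  ggtitle("Weight vs. MPG")

p2 <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  ggtitle("Cars by number of cylinders")

p3 <- ggplot(mtcars, aes(x = hp)) +
  geom_histogram(binwidth = 25) +
  ggtitle("Horsepower distribution")

p4 <- ggplot(mtcars, aes(x = factor(gear), y = mpg)) +
  geom_boxplot() +
  ggtitle("MPG by number of gears")

# Arrange the four plots on one page in a 2 x 2 grid.
grid.arrange(p1, p2, p3, p4, ncol = 2)
```

Changing ncol (or passing nrow) changes the layout of the grid, so the same call works for two plots side by side or for a single column of plots.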


Wednesday, November 5, 2014

Coursera: Mining Massive Datasets

An extremely useful course for data scientists: Mining Massive Datasets by Jure Leskovec, Anand Rajaraman and Jeff Ullman.

Course Syllabus

Week 1:
MapReduce
Link Analysis -- PageRank

Week 2:
Locality-Sensitive Hashing -- Basics + Applications
Distance Measures
Nearest Neighbors
Frequent Itemsets

Week 3:
Data Stream Mining
Analysis of Large Graphs

Week 4:
Recommender Systems
Dimensionality Reduction

Week 5:
Clustering
Computational Advertising

Week 6:
Support-Vector Machines
Decision Trees
MapReduce Algorithms

Week 7:
More About Link Analysis -- Topic-specific PageRank, Link Spam
More About Locality-Sensitive Hashing

In addition, you can buy the Mining Massive Datasets book or download it for free from the Mining Massive Datasets web page.