A nice presentation about search queries:
And an amazing article about search experience optimization.
Thursday, October 29, 2015
Wednesday, September 30, 2015
Memory Networks is on GitHub now
Great news! Facebook has made its Memory Networks project public. Memory Networks is a research project that implements a kind of long-term memory inspired by human memory.
Talk about Memory Networks:
Tuesday, September 1, 2015
Thursday, July 16, 2015
ICML 2015 Word Cloud
Nice visualisation from Andrew Collier: a word cloud of the 300 most popular words from accepted ICML 2015 papers.
Methodology of this word cloud generation: http://www.exegetic.biz/blog/2015/07/constructing-word-cloud-for-icml-2015/
List of presented papers can be found here: http://icml.cc/2015/?page_id=825
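The word-counting step behind such a cloud can be sketched in a few lines. This is a minimal illustration on hypothetical paper titles (the real methodology in the linked post scrapes the actual ICML 2015 paper list):

```python
from collections import Counter
import re

# Hypothetical paper titles standing in for the real ICML 2015 list.
titles = [
    "Deep Learning with Limited Numerical Precision",
    "Scalable Deep Learning for Structured Prediction",
    "Learning Deep Structured Models",
]

# A tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"with", "for", "the", "a", "of", "and"}

words = [w for t in titles for w in re.findall(r"[a-z]+", t.lower())
         if w not in STOPWORDS]
top = Counter(words).most_common(3)
print(top)
```

The resulting frequency list is what a word-cloud library scales the font sizes by.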
Tuesday, July 14, 2015
Recommendation papers from ArXiv
Sometimes you come across an idea and wonder why you didn't implement it earlier.
The idea: arXiv is a repository of over 1 million preprints in physics, mathematics, and computer science, so it is possible to train a recommender for papers and streamline the search process.
Here you can find the full description: https://blog.lateral.io/2015/07/harvesting-research-arxiv/
And motivating image:
[source http://physicsbuzz.physicscentral.com/2012/08/risks-and-rewards-of-arxiv-reporting.html]
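A minimal content-based variant of this idea can be sketched with TF-IDF vectors over abstracts and cosine similarity. The abstracts below are toy stand-ins, not the actual arXiv data or Lateral's method:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy abstracts standing in for harvested arXiv preprints.
abstracts = [
    "neural networks for image classification",
    "convolutional neural networks and deep learning",
    "quantum field theory and particle physics",
]

# Vectorize the corpus and compute pairwise similarities.
tfidf = TfidfVectorizer().fit_transform(abstracts)
sims = cosine_similarity(tfidf)

# Recommend the most similar other paper for paper 0.
best = max((j for j in range(len(abstracts)) if j != 0),
           key=lambda j: sims[0, j])
print(best)
```

Paper 1 wins because it shares the "neural networks" vocabulary with paper 0, while paper 2 shares nothing.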
Monday, May 11, 2015
Emoji natural language processing
On the Instagram Engineering Blog you can read about NLP techniques for discovering the "context" of emoji
(like the one in the picture). They use word2vec to map every emoji into a metric space and t-SNE as a visualisation tool.
Full texts of articles:
Emojineering Part 1: Machine Learning for Emoji Trends
Emojineering Part 2: Implementing Hashtag Emoji
Emoji Wiki
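The key trick is treating each emoji as an ordinary token, so it gets a vector like any word. A minimal sketch with hand-made toy vectors (a real pipeline would learn these with word2vec on caption text):

```python
import numpy as np

# Toy 3-d "embeddings"; real word2vec vectors are learned, not hand-set.
vectors = {
    "😂": np.array([0.9, 0.1, 0.0]),
    "lol": np.array([0.8, 0.2, 0.1]),
    "😢": np.array([-0.7, 0.6, 0.1]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The laughing emoji should sit closer to "lol" than to the crying emoji.
print(cosine(vectors["😂"], vectors["lol"]) >
      cosine(vectors["😂"], vectors["😢"]))
```

Once every emoji lives in the same metric space as words, t-SNE can project the whole vocabulary to 2-D for the visualisations shown in the articles.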
Friday, April 17, 2015
Words similarity
Finding words (or sentences, or documents) with the same meaning is a general problem in NLP (Natural Language Processing). Deep learning is helping to advance this area.
For example, the word2vec approach lets you derive from a text corpus relationships like "man is to king as woman is to ?", where "?" should be "queen". It's amazing stuff. In addition, you can train not just word-to-word similarity but also word-to-phrase similarity.
Here are some examples from a model trained on the Google News corpus:
Paper with description:
"Distributed Representations of Words and Phrases and their Compositionality"
Open-source implementation:
https://code.google.com/p/word2vec/
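The analogy trick is plain vector arithmetic: the word closest to vec("king") - vec("man") + vec("woman") should be "queen". A sketch with hand-crafted 2-d toy embeddings (real Google News vectors are 300-dimensional and learned):

```python
import numpy as np

# Hand-crafted toy embeddings; real word2vec vectors behave similarly
# at much higher dimension.
emb = {
    "king":  np.array([0.8, 0.9]),
    "man":   np.array([0.7, 0.1]),
    "woman": np.array([0.2, 0.2]),
    "queen": np.array([0.3, 1.0]),
    "apple": np.array([-0.5, -0.4]),
}

def analogy(a, b, c):
    """Word closest to vec(b) - vec(a) + vec(c), excluding the inputs."""
    target = emb[b] - emb[a] + emb[c]
    def cos(v):
        return float(v @ target / (np.linalg.norm(v) * np.linalg.norm(target)))
    return max((w for w in emb if w not in (a, b, c)), key=lambda w: cos(emb[w]))

print(analogy("man", "king", "woman"))  # "queen"
```

With a trained model, gensim exposes the same operation as `model.most_similar(positive=["king", "woman"], negative=["man"])`.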
Tuesday, April 7, 2015
Deep Learning for Natural Language Processing
Videos, slides, and tutorials from the Stanford University course
CS224d: Deep Learning for Natural Language Processing can be found here: http://cs224d.stanford.edu/syllabus.html.
Thursday, April 2, 2015
Spark for Data Science
In June, you can learn on edX "how to apply data science techniques using parallel programming in Apache Spark to explore big (and small) data" in the "Introduction to Big Data with Apache Spark" course from Berkeley.
Friday, March 13, 2015
Data Quality Checklist for Process Mining
Nice paper from Fluxicon about common data problems in a process mining workflow, and how to discover and fix them.
Tuesday, January 20, 2015
Process Mining: Data Science in Action by Coursera
This is my short feedback on the Coursera Process Mining course (by Wil van der Aalst from Eindhoven University of Technology).
The name of the course sounds very interesting, but the main task is quite simple.
It's about building a behaviour model based on an event log (you need to consider overfitting and underfitting). That means your model should explain the majority of cases while being general enough to explain new ones.
The main tools recommended in the course are Disco and ProM. They allow building models in different notations (e.g. BPMN) and making visualisations.
Process mining covers two main aspects, organisational and social:
Organisational tasks:
- discover typical workflow actions (for customers, employees, etc.)
- analyse the time spent on each task
- mine for "bottlenecks"
Social tasks:
- discover user groups and user relations within a process
- analyse the time spent by each worker, customer, etc.
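The model-building step the course describes can be illustrated with a directly-follows graph, the core of simple discovery algorithms, computed here over a toy event log (not a construct from Disco or ProM themselves):

```python
from collections import Counter

# Toy event log: each trace is the ordered list of activities of one case.
log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
]

# Count directly-follows pairs (a, b): activity b right after activity a.
dfg = Counter()
for trace in log:
    for a, b in zip(trace, trace[1:]):
        dfg[(a, b)] += 1

print(dfg[("register", "check")])  # 3
print(dfg[("check", "approve")])   # 2
```

Edge counts like these are what the tools render as process maps, and low-frequency edges are where bottleneck and deviation analysis starts.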
Lecturer's SlideShare: http://www.slideshare.net/wvdaalst
Next session: April-May 2015
Tuesday, January 13, 2015
Visualizing Data using t-SNE
If you need a solution for visualizing high-dimensional datasets, t-SNE is a great choice.
Video:
Read more about t-SNE
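A minimal usage sketch with scikit-learn's implementation, on random data standing in for a real high-dimensional dataset:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(0)
X = rng.rand(50, 20)  # 50 points in 20 dimensions

# Embed into 2-D for plotting; perplexity must be below the sample count.
X2 = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
print(X2.shape)  # (50, 2)
```

The 2-D coordinates in `X2` can then be passed straight to a scatter plot.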
Friday, January 9, 2015
Dive into Deep Learning
If you are interested in deep learning, you can try the UFLDL (Unsupervised Feature Learning and Deep Learning) tutorial from Stanford University.
If this topic is really new to you, it is better to start with https://www.coursera.org/course/ml. After this session, the course will change format to self-study.