Resources

Note: for non-topic-specific resources, you can add them below with topic 'General'
Relevance scores: 5 = very relevant; 1 = slightly relevant
| Topic | Medium | Ref/link to Material | Difficulty | Relevance | Summary |
|---|---|---|---|---|---|
| Big Data | Article | TechCrunch | Low | 2 | Article about organizations leveraging big data systems. Discusses pitfalls. |
| Big Data | Blog | LinkedIn | Low | 2 | The 10 business areas where big data is receiving the greatest buy-in, according to Bernard Marr. |
| Big Data | Blog | LinkedIn | Low | 2 | The 4 data layers of a big data system, according to Bernard Marr. |
| Big Data | Article | McKinsey | Low | 2 | Short article about the personnel firms should deploy to harness 'big data.' |
| Big Data | Document | Buyer's Guide | Low | 3 | Comprehensive guide to how an organization should go about selecting a 'big data' analytics solution. Note: written by Datameer, a solution vendor. |
| Big Data | Document | Datameer | Low | 2 | Top five high-impact use cases for big data analytics, according to one of the vendors. |
| Big Data | Video | Intel | Low | 2 | Introductory video to 'Big Data' and links to further resources from Intel. |
| Big Data | Web | | Low | 2 | Google offers various big data solutions on its cloud platform, in particular 'BigQuery', which allows fast queries on enormous datasets. |
| Big Data | Article | Information Age | Low | 2 | Argues that 'big data' (e.g. billions of rows) is not necessarily as important as 'smart data' (e.g. a few million rows and a well-trained machine learning algorithm). |
| Big Data | Article | McKinsey | Low | 2 | Getting big impact from big data. |
| Big Data | Article | TechCrunch | Low | 2 | Article about various big data companies. Talks about the lag in being able to capitalize on big data technology. |
| Big Data | Web | MIT | Low | 2 | Research into a compression algorithm for very large datasets: reduces the overall number of records while guaranteeing that mathematical/informational properties remain the same. |
| Big Data | Book | Book description | Medium | 3 | Practical introduction to big data for business with case studies from the author's experience. By Bernard Marr. |
| Big Data | Document | Kimball Group | Medium | 3 | Describes strategic best practices for a firm's big data policies (management, architecture, modelling, data governance). |
| Big Data | Discussion | Quora | Medium | 3 | How the various frequently mentioned frameworks in the big data field all fit together into a technology stack. |
| General | University Course | Coursera | Low | 5 | 10-part Johns Hopkins class. Free or certificate ($29-49/section), 4-9 hrs work per week. |
| General | Podcast | Data Skeptic | Low | 3 | Weekly podcast including interviews with data science professionals. Covers hot topics from a skeptical/critical perspective. Also features 'mini' episodes that cover fundamental concepts like regression. Website has accompanying notes. |
| General | Article | TechRepublic | Low | 1 | Piece about managing data science staff. |
| General | Article | TechRepublic | Low | 2 | Extols the benefits of pair programming for data science projects (e.g. two people, generalist + specialist, working at the same PC). |
| General | Article | TechCrunch | Low | 3 | Scoop on what services secretive startup Palantir has been offering to the public, finance and legal sectors. Its software appears to be very good at connecting existing data sets and providing a 'revolutionary' interface to users. |
| General | Article | The Verge | Low | 2 | How, with the advent of science, anonymity does not guarantee privacy (e.g. anonymized Netflix film scores, released to the public, were compared against non-anonymous IMDB reviews, and the correlations were enough to reveal Netflix user identities. Netflix got sued.) |
| General | Article | Medium | Low | 3 | An investor has mapped out the landscape of 'Machine Intelligence' startups and created a taxonomy of where each company falls across a few distinct categories. Interesting article to go with the taxonomy/diagram. |
| General | Book | PDF | Low | 5 | An Introduction to Data Science: free textbook, a gentle introduction to the topic, including tutorials for getting started with R, plus questions/exercises. |
| General | Article | kdnuggets | Low | 3 | As data science has spread through the mainstream, so too has a dense vocabulary of ill-defined jargon. Article offers several perspectives on many of data science's most confused terms. |
| General | Article | kdnuggets | Low | 2 | List of the 'top 30 people' in Big Data and analytics. |
| General | Article | kdnuggets | Low | 3 | How predictive analytics reinvents industries. Links to case studies. |
| General | Article | LinkedIn | Low | 2 | Article about IBM Verse, a new email platform which applies data science techniques and leverages Watson. |
| General | Magazine | Analytics | Low | 3 | Online magazine about data analytics. |
| General | Article | Popular Mechanics | Low | 3 | Article about a Google program that learns how to play any Atari game with no prior knowledge of the rules (it just needs a points-reward system in the game to give it feedback). Uses a combination of neural network learning and something called 'Q-learning'. Full paper was published in Nature. |
| General | Tutorial | Data Science Masters | Low | 4 | Comprehensive list of skills, tutorials and open source courses in data science. |
| General | Web | Quora | Low | 5 | Data Science FAQ/roundup of Quora content (a question and answer website). |
| General | Blog | John Foreman | Low | 2 | 'The $30 data scientist': article about the various levels of expertise in 'data science' and what they bring to an organization. |
| General | Web | Computer World | Low | 3 | 3-part article about the data science ecosystem. Explains how it evolved and what tools have emerged for supporting each stage in the process of data storage, data wrangling and data applications. |
| General | Web | Deloitte | Low | 1 | On the need within organizations for 'analytic translators' or 'lite quants' who can focus on the write-up and communication of data science results. |
| General | Newsletter | Data Elixir | Medium | 3 | Updated once a week with data science articles and resources from around the web. |
| General | Article | kdnuggets | Medium | 3 | Roundup of influential research papers on data science. |
| General | Web | Articles | Varies | 3 | News portal on emerging trends and solutions in big data. Filter for Financial Sector articles. |
| General | Q&A | Reddit | Varies | 2 | Reddit AMA ('ask-me-anything') hosted by prominent data scientist Andrew Ng. |
| Machine Learning | Paper | Google | High | 4 | A neural network framework to monitor and optimize operations in data centres. |
| Machine Learning | Datasets | UCI | High | 4 | Repository of datasets for use in machine learning. |
| Machine Learning | Web | Google | High | 2 | Site that hosts research publications from Google's 'DeepMind' research group. |
| Machine Learning | Blog | Laura D Hamilton | Low | 3 | Fighting fraud with an open source machine learning tool called repsheet. |
| Machine Learning | Blog | Laura D Hamilton | Low | 3 | Speculation about the proprietary technology Google uses for spam blog filtering/detection. |
| Machine Learning | Blog | Laura D Hamilton | Low | 3 | Ten surprising machine learning applications. |
| Machine Learning | Web | IBM Research | Low | 2 | Predictions on what breakthroughs machine learning will bring to our lifestyles over the next 5 years. |
| Machine Learning | Article | Datafloq | Low | 3 | Introductory piece on the different flavours of machine learning and their potential to change businesses. |
| Machine Learning | Web | Wikipedia | Low | 5 | Good overview of the machine learning field with links to more in-depth articles. |
| Machine Learning | Article | Forbes | Low | 2 | Six novel machine learning applications. |
| Machine Learning | University course | Coursera | Low | 5 | Andrew Ng (Stanford / Baidu) lecture series. Sign up for the course here. |
| Machine Learning | Article | BusinessInsider | Low | 2 | Personality profiling using Facebook 'likes' to train the algorithm. |
| Machine Learning | Article | Microsoft | Low | 4 | How an elevator operator/company used Microsoft's Azure platform to dramatically increase uptime. Lots of sensors feeding into a real-time monitoring system which had the ability to predict problems and recommend pre-emptive service/maintenance. |
| Machine Learning | Video | YouTube | Low | 3 | 'How machines learn.' Good talk from Peter Norvig (Google's director of research) on how Google conquers machine learning problems by harnessing the vast quantities of data it collects from indexing the internet for search. Not too technical. |
| Machine Learning | Tutorial | Kaggle | Low | 4 | An introduction to machine learning with scikit-learn: video series about doing ML with the scikit-learn Python library. |
| Machine Learning | Web | Phys.org | Low | 3 | On 'probabilistic programming': a new approach to programming that makes it easy to model a problem probabilistically and explore it through experimentation. |
| Machine Learning | Web | Atomic Object | Low | 5 | An introduction to the mean-shift clustering algorithm. |
| Machine Learning | Slides | BNP Paribas | Medium | 2 | Presentation about using machine learning methods for risk monitoring and investment recommendations in quantitative finance. |
| Machine Learning | Blog | Tim Hopper | Medium | 2 | Discusses the common ground between the fields of operations research and machine learning: optimization. More interesting material on this can be found via a Google search. |
| Machine Learning | Book | Book description | Medium | 5 | Book about semi-supervised learning, a technique that is useful when you have inadequately labelled data. |
| Machine Learning | Book | PDF | Medium | 5 | An Introduction to Statistical Learning with Applications in R. Full book and supporting material online. |
| Machine Learning | Slides | Harvard | Medium | 3 | Research project using image recognition and machine learning to monitor physical lecture attendance and then draw conclusions from the resulting statistics. |
| Machine Learning | Book | Stanford | Medium | 4 | The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman. This book covers important concepts of data mining, machine learning, and bioinformatics. It is available as a .pdf file. |
| Machine Learning | Tutorial | Document | Medium | 5 | A step-by-step guide on how the document's author got started in machine learning. |
| Machine Learning | Article | kdnuggets | Medium | 3 | Review of fbcunn, a deep learning module for Torch (open source deep learning library) which was released by Facebook. |
| Machine Learning | Discussion | Reddit | Medium | 3 | Highest voted machine learning posts from the Reddit discussion forum. |
| Machine Learning | Article | kdnuggets | Medium | 4 | Deep learning in a nutshell: what it is, how it works, why care. |
| Machine Learning | Article | kdnuggets | Medium | 2 | Discusses two research papers on flaws/shortcomings of deep learning neural networks. |
| Machine Learning | Tutorial | Andrew Moore - CMU | Medium | 4 | Excellent college-level tutorial across the breadth of ML concepts. |
| Machine Learning | Web | Edureka | Medium | 4 | A very good overview (coupled with a few examples) of K-means clustering. |
| Machine Learning | Web | Kaggle | Varies | 4 | Competition to develop models to accurately predict the trade price of a US corporate bond, given the last 10 trades. |
| Machine Learning | Web | GitHub | Varies | 4 | Curated list of high quality machine learning libraries across all programming languages, organized by categories. |
| Machine Learning | Web | | Low | 2 | A programmer presents MarI/O: a program that learned how to play Super Mario World through machine learning. |
| Statistics | Paper | University of North Carolina | High | 3 | Paper with statistical analysis of nominal categorical data: looking through highway accident records and determining which variables and their interactions had significant effects on the outcome (i.e. fatal / non-fatal injury). |
| Statistics | University course | Harvard | Low | 4 | Harvard Statistics 110: Probability. Materials on website, lectures on YouTube/iTunes. |
| Statistics | Article | kdnuggets | Low | 5 | 10 ideas from applied statistics which are relevant for big data analysis. |
| Statistics | Article | Harvard Business Review | Low | 3 | A Predictive Analytics Primer: discusses how a business could collect the right data and apply regression analysis. |
| Statistics | Web | University of Minnesota | Low | 4 | Very useful breakdown of what statistical tests you should apply in different situations. |
| Statistics | Book | Thomas Haslwanter | Low | 3 | An Introduction to Statistics: free online textbook with illustrated examples in Python. |
| Statistics | Book | Statsoft | Low | 4 | A free introductory textbook to statistics and data mining. |
| Statistics | Web | MIT | Medium | 3 | Links to groups that do statistics-related research. |
| Statistics | Book | Amazon | Medium | 4 | All of Statistics by Larry A. Wasserman: a comprehensive textbook for probability and statistics. Highly rated but quite advanced. |
| Statistics | Book | Amazon | Medium | 4 | Categorical Data Analysis by Alan Agresti. This is about doing statistical analysis on non-numerical data (which we have a lot of in Aladdin). |
| Statistics | Book | NN Taleb | Medium | 2 | Silent Risk: unfinished draft of the 'technical companion' to The Black Swan, Fooled by Randomness and Antifragile. Basically about avoiding pitfalls in risk management and the interpretation of statistics & probability. |
| Tool | University Course | Course homepage | High | 4 | Online content for a course on multi-agent semantic web systems (Semantic web = turning the unstructured information on the web into a more unified knowledge store). Discusses the enabling technologies and ideas. |
| Tool | Web | python.org | High | 5 | Python For Artificial Intelligence. Links to AI-related libraries/packages on the official site. |
| Tool | Book | | High | 4 | Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. Well-known book. |
| Tool | Tutorial | Michael Czerny | High | 3 | Modern methods for sentiment analysis in Python. |
| Tool | Article | Nature | Low | 3 | Introductory piece about doing statistical analysis with R and tips on getting started. |
| Tool | Blog | Lullabot | Low | 4 | Introduction and explanation of Calais (see Calais in the Existing Solutions section also). |
| Tool | Video | IBM | Low | 4 | Demo of Watson analytics. |
| Tool | Video | YouTube | Low | 3 | Talk about how Watson works. |
| Tool | Article | ExtremeTech | Low | 3 | New Watson features: speech-to-text, text-to-speech, visual recognition, concept insights, and tradeoff analytics. |
| Tool | Tutorial | DataQuest | Low | 5 | Online tutorial that takes you through various data science 'missions' using real Python code and real data sets in your browser. |
| Tool | Forum | PyData | Low | 3 | Google group to help people doing data science in Python. |
| Tool | Notebook | iPython | Low | 5 | Crash Course in Python For Scientists: uses the iPython Notebook format so you can interactively run and toy with the code. |
| Tool | Web | Home | Medium | 3 | Shiny is a web application framework for R (Examples). Used by FMG. |
| Tool | Paper | Academic paper | Medium | 4 | Research paper on using Calais to help create an ontology for business news, in the context of trading based off news stories/events. |
| Tool | Tutorial | ipython.org | Medium | 4 | Walks through how to build a basic document summarization program in Python using the Natural Language Toolkit. |
| Tool | Book | Amazon | Medium | 4 | Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Book from the author of the 'pandas' library for Python. |
| General | Book | Quora data scientist | Varies | 5 | 16 free books on data science topics. |
| Statistics | Tutorial | GitHub | High | 3 | Series of iPython Notebooks to teach you about Bayesian methods and probabilistic programming. |
| Machine Learning | Paper | Jason Yosinski | Medium | 3 | Understanding Neural Networks Through Deep Visualization: research paper trying to better understand the inner workings of Deep Neural Networks. |
| Machine Learning | Paper | Ding et al | High | 2 | Yinyang K-Means: an improvement on the original K-Means algorithm. |
| Big Data | Blog | Netflix | Medium | 2 | This is the blog of the Netflix technology team. They write about how big data and machine learning techniques are used in their work. |
| General | Blog | DataQuest | Low | 3 | "How to actually learn data science": by using an interesting dataset as a starting point rather than theory; also contains a list of useful resources. |
| Machine Learning | University Course | ComputerVisionTalks.com | Medium | 5 | Complete classroom lecture video series on Machine Learning. This site also has many other videos of talks/lectures on machine learning and computer vision. It is curated by a group of researchers and academics. |
| General | Blog | GitHub | Low | 3 | A curated list of data science blogs. |
| General | Web | LearnDataSci | Low | 5 | A curated list of freely available data science books. |
| General | Blog | Medium | Low | 3 | Doing data science at Twitter: a blog by one of their engineers. |
| General | Tutorial | Greg Reda | Medium | 3 | Doing cohort analysis in Python (a technique for tracking shared behavior of groups). |
| Statistics | Tutorial | DeepLearning4Java | Medium | 2 | A Beginner’s Guide to Eigenvectors, PCA, Covariance and Entropy. |
| Big Data | Paper | Arxiv.org | High | 1 | Using a Power Law Distribution to Describe Big Data: discusses a method for removing the large amounts of uninteresting data from a dataset. |
| Statistics | Blog | Coursera | Low | 3 | Common Probability Distributions: The Data Scientist's Crib Sheet. |
| Machine Learning | Lecture | Yale | Low | 4 | Video of the Machine Learning lecture of Yale's computer science course: focuses on an example of a Naive Bayes spam filter. |
| General | Document | Deutsche Bank | Low | 5 | 40-page overview covering the Big Data technology stack from machine learning models to cloud computing. From Deutsche Bank's quantitative trading group. |
| Machine Learning | Paper | Journal of Machine Learning Research | Medium | 4 | An Introduction to Variable and Feature Selection. Gives some background on the history of feature selection procedures and examines what works best in different circumstances. |
| General | Book | Microsoft Research | High | 3 | The Foundations of Data Science. University-level textbook covering data science. Aimed at computer science students. Quite technical. |
| Machine Learning | Lecture | Lumiverse | Low | 3 | Neural Networks Demystified. 6-part video lecture series with lots of explanatory diagrams/illustrations. |
| Machine Learning | Web | Jupyter | Medium | 4 | A comparison of clustering algorithms in Python (in iPython Notebook form). |