Data Science Technology Solutions

    Technology | Barrier to entry | Ease of use | Functionality | Limitations | General Comments | Sign-up sheet (leave your call sign)
    Watson Analytics | Low | High
    • Multiple regression analysis of tabular data sets (called "predictions")
    • Visualizations of relationships in tabular data sets (called "explorations")
    • Free version limited to 100K input rows
    • Cloud-based so must anonymize data
    • GUI has very limited customization available
    Probably not flexible enough for most projects, but would allow a group of non-technical users to do some "brute force" analysis on spreadsheets of data | CLew
    • Programming language with dozens of free libraries that can be used for data analysis
    Not a pre-built application, so some work is needed to acquire, clean, analyze and report on data | AMcN
    • R is an integrated suite of software facilities for data manipulation, calculation and graphical display
    Cheat Sheet | DMM
    • indexes any machine data you throw at it
    • determines relationships between entries from disparate sources
    • builds monitoring, dashboard, search, and reporting capabilities
    • also has 'automated detection of interesting patterns in your data.'
    • not free but there is a trial period
    A commercial enterprise system.
    • add semantic metadata to content you submit
    • Perl and Python APIs
    • cloud-based so must anonymize data
    • collection of algorithms optimized for fast nearest neighbor search in large datasets and for high dimensional features
    • C++
    An interface to FLANN (Fast Library for Approximate Nearest Neighbours). A technique used for matching similar images.
    • Python package
    • Simple and efficient tools for data mining and data analysis
    • classification, regression, clustering, dimensionality reduction, model selection, preprocessing
    • Requires NumPy and SciPy packages
    Open source. Website has lots of example applications.
    • visualization library for drawing attractive statistical graphs
    • requires NumPy, SciPy, matplotlib and pandas packages
    Site has a tutorial on how to use it, plus an examples gallery
    • short for classification and regression training
    • R package with wrappers for lots of other ML packages
    • provides a simple interface to leverage all these other packages
    • requires other packages installed in order to call them
    Essentially a set of functions that attempt to streamline the process for creating predictive models. Apparently popular in industry.
    Swirl | Low | Low | The swirl R package makes it fun and easy to learn R programming and data science | Requires R to be installed | Install guide
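
The nearest-neighbour search described in the FLANN entry can be illustrated with a minimal sketch. The version below is an exact brute-force search in plain Python, not FLANN itself and not its API; FLANN-style libraries approximate this result much faster on large, high-dimensional data. The function name `nearest_neighbour` and the sample feature vectors are hypothetical, chosen only for illustration.

```python
import math


def nearest_neighbour(query, points):
    """Return (index, distance) of the point closest to query.

    Exact brute-force search: every point is compared with the query,
    so the cost grows linearly with the number of points.
    """
    best_index, best_dist = -1, math.inf
    for i, point in enumerate(points):
        d = math.dist(query, point)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


# Hypothetical 4-dimensional feature vectors, e.g. tiny image descriptors.
features = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0, 1.0),
    (0.9, 1.1, 1.0, 0.8),
    (5.0, 5.0, 5.0, 5.0),
]

index, distance = nearest_neighbour((1.0, 1.0, 1.0, 0.9), features)
print(index, round(distance, 3))
```

Matching similar images amounts to running this kind of query for each descriptor of one image against the descriptors of another; the brute-force loop is fine for a few thousand vectors but is the part an approximate library would replace.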


