Big Data Platforms

Sherlock's Innovation Accelerator Platforms give organizations quick access to on-demand, elastic, secure big data platforms for working with large amounts of data. Sherlock’s first offering as part of this capability is the Amazon Elastic MapReduce (EMR) platform, a turnkey HIPAA-compliant Hadoop platform configured with Apache Spark.

[Figure: Sherlock Cloud EMR V1 Tech]

Value Proposition

    • Reduce deployment time and eliminate the first 20-30% of the setup and configuration effort, all running in a highly secure Cloud
    • Platform ready for users to begin working with data
    • Leverage Cloud compute capability to scale up and down as needed
    • Spin up compute for specific use cases (transient vs. persistent)
    • Draw on an extensive open source and Cloud marketplace of user apps and libraries that support data science, machine learning, and more
    • Option to “build your own data pipeline”, fully customizable
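
The transient vs. persistent choice mentioned above can be illustrated with a minimal sketch of the `Instances` section of an Amazon EMR `RunJobFlow` request. The `KeepJobFlowAliveWhenNoSteps` flag is part of the EMR API; the helper name `instances_config`, the instance types, and the node counts are assumptions made for illustration and do not describe how Sherlock itself provisions clusters.

```python
def instances_config(persistent: bool, core_nodes: int = 2) -> dict:
    """Build the `Instances` part of an EMR RunJobFlow request (illustrative).

    persistent=False -> transient cluster: terminates once its steps finish.
    persistent=True  -> persistent cluster: stays up waiting for more work.
    """
    return {
        "InstanceGroups": [
            {
                "Name": "primary",
                "InstanceRole": "MASTER",
                "InstanceType": "m5.xlarge",  # assumed instance type
                "InstanceCount": 1,
            },
            {
                "Name": "core",
                "InstanceRole": "CORE",
                "InstanceType": "m5.xlarge",  # assumed instance type
                "InstanceCount": core_nodes,
            },
        ],
        # The flag that separates transient from persistent clusters.
        "KeepJobFlowAliveWhenNoSteps": persistent,
    }


transient = instances_config(persistent=False)
persistent = instances_config(persistent=True, core_nodes=4)
```

A transient configuration suits a one-off batch job, while a persistent one suits interactive or recurring work where cluster startup time matters more than idle cost.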

Core Functionality

    • Provides a managed Hadoop framework that makes it easy, fast, and cost-effective to process vast amounts of data across dynamically scalable EC2 instances.
    • Leverages multiple data stores, including Amazon S3 and the Hadoop Distributed File System (HDFS). Additionally, with the EMR File System (EMRFS), EMR can efficiently and securely use Amazon S3 as an object store for Hadoop.
    • Includes Apache Spark, a unified analytics engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    • Securely and reliably handles a broad set of big data use cases, including log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation, and bioinformatics.
    • Deploys in minutes, with automation that handles infrastructure provisioning, security group setup, encryption, cluster setup, Hadoop configuration, and cluster tuning, all operating within the secure boundaries of Sherlock Cloud.
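
To make the pieces above more concrete, here is a hedged sketch of what a request for a Spark-enabled EMR cluster with one S3-backed step might look like. It only builds the request dictionaries and does not call AWS. The field names follow the EMR `RunJobFlow` API; the helper names, bucket paths, release label, instance types, and the security configuration name are all hypothetical, and the sketch is not a description of Sherlock's actual automation.

```python
def build_spark_step(script_uri: str, input_uri: str, output_uri: str) -> dict:
    """Describe a Spark job that reads from and writes to S3 through EMRFS."""
    for uri in (script_uri, input_uri, output_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return {
        "Name": "spark-etl",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            # command-runner.jar lets an EMR step invoke spark-submit.
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                input_uri,
                output_uri,
            ],
        },
    }


def build_cluster_request(name: str, step: dict, security_configuration: str) -> dict:
    """Assemble RunJobFlow parameters for a transient Spark cluster."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed release; pick a current one
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance types
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": False,  # terminate after the step
        },
        "Steps": [step],
        # Name of a pre-created EMR security configuration (encryption settings).
        "SecurityConfiguration": security_configuration,
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


step = build_spark_step(
    "s3://example-bucket/jobs/etl.py",   # hypothetical paths
    "s3://example-bucket/raw/",
    "s3://example-bucket/curated/",
)
request = build_cluster_request("example-cluster", step, "example-security-config")
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as `boto3.client("emr").run_job_flow(**request)`. Keeping the request as plain data makes it easy to review the encryption and role settings before anything is provisioned.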