Pro Microsoft HDInsight Hadoop on Windows
Pro Microsoft HDInsight is a complete guide to deploying and using Apache Hadoop on the Microsoft Windows Azure platform. The information in this book enables you to process enormous volumes of structured as well as unstructured data easily using HDInsight, which is Microsoft’s own distribution o...
Main author: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Berkeley, CA : Apress, 2014. |
Edition: | 1st ed. 2014. |
Series: | The Expert's Voice in Big Data |
Subjects: | |
View at Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629537006719 |
Table of Contents:
- Contents at a Glance; Contents; About the Author; About the Technical Reviewers; Acknowledgments; Introduction; Chapter 1: Introducing HDInsight; What Is Big Data, and Why Now?; How Is Big Data Different?; Is Big Data the Right Solution for You?; The Apache Hadoop Ecosystem; Microsoft HDInsight: Hadoop on Windows; Combining HDInsight with Your Business Processes; Summary; Chapter 2: Understanding Windows Azure HDInsight Service; Microsoft's Cloud-Computing Platform; Windows Azure HDInsight Service; HDInsight Versions; Cluster Version 2.1; Cluster Version 1.6; Storage Location Options
- Azure storage accounts; Accessing containers; Understanding the Windows Azure Storage Blob; Uploading Data to Windows Azure Storage Blob; Windows Azure Flat Network Storage; Summary; Chapter 3: Provisioning Your HDInsight Service Cluster; Creating the Storage Account; Creating a SQL Azure Database; Deploying Your HDInsight Cluster; Customizing Your Cluster Creation; Configuring the Cluster User and Hive/Oozie Storage; Choosing Your Storage Account; Finishing the Cluster Creation; Monitoring the Cluster; Configuring the Cluster; Summary; Chapter 4: Automating HDInsight Cluster Provisioning
- Using the Hadoop .NET SDK; Adding the NuGet Packages; Connecting to Your Subscription; Coding the Application; Using the PowerShell cmdlets for HDInsight; Command-Line Interface (CLI); Summary; Chapter 5: Submitting Jobs to Your HDInsight Cluster; Using the Hadoop .NET SDK; Adding the References; Submitting a Custom MapReduce Job; Adding the MapReduce Classes; Running the MapReduce Job; Submitting the wordcount MapReduce Job; Submitting a Hive Job; Adding the References; Creating the Hive Queries; Running the Hive Job; Monitoring Job Status; Using PowerShell; Writing Script; Executing The Job
- Using MRRunner; Summary; Chapter 6: Exploring the HDInsight Name Node; Accessing the HDInsight Name Node; Hadoop Command Line; The Hive Console; The Sqoop Console; The Pig Console; Hadoop Web Interfaces; Hadoop MapReduce Status; The Name Node Status Portal; The TaskTracker Portal; HDInsight Windows Services; Installation Directory; Summary; Chapter 7: Using Windows Azure HDInsight Emulator; Installing the Emulator; Verifying the Installation; Using the Emulator; Future Directions; Summary; Chapter 8: Accessing HDInsight over Hive and ODBC; Hive: The Hadoop Data Warehouse; Working with Hive
- Creating Hive Tables; Loading Data; Querying Tables with HiveQL; Hive Storage; The Hive ODBC Driver; Installing the Driver; Testing the Driver; Connecting to the HDInsight Emulator; Configuring a DSN-less Connection; Summary; Chapter 9: Consuming HDInsight from Self-Service BI Tools; PowerPivot Enhancements; Creating a Stock Report; Power View for Excel; Power BI: The Future; Summary; Chapter 10: Integrating HDInsight with SQL Server Integration Services; SSIS as an ETL Tool; Creating the Project; Creating the Data Flow; Creating the Source Hive Connection
- Creating the Destination SQL Connection