Big data architect's handbook: a guide to build proficiency in tools and systems used by leading big data experts

A comprehensive end-to-end guide that gives hands-on practice in big data and Artificial Intelligence. About This Book: Learn to build and run a big data application with sample code. Explore examples to implement activities that a big data architect performs. Use Machine Learning and AI for structured...

Bibliographic Details
Other Authors:	Fahad Akhtar, Syed Muhammad (author)
Format:	eBook
Language:	English
Published:	Birmingham, England : Packt Publishing 2018.
Edition:	1st edition
Subjects:	Big data > Handbooks, manuals, etc.
See on Biblioteca Universitat Ramon Llull:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630633306719

Table of Contents:

Cover
Title Page
Copyright and Credits
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Why Big Data?
What is big data?
Characteristics of big data
Volume
Velocity
Variety
Veracity
Variability
Value
Solution-based approach for data
Data - the most valuable asset
Traditional approaches to data storage
Clustered computing
High availability
Resource pooling
Easy scalability
Big data - how does it make a difference?
Big data solutions - cloud versus on-premises infrastructure
Cost
Security
Current capabilities
Scalability
Big data glossary
Big data
Batch processing
Cluster computing
Data warehouse
Data lake
Data mining
ETL
Hadoop
In-memory computing
Machine learning
MapReduce
NoSQL
Stream processing
Summary
Chapter 2: Big Data Environment Setup
Oracle VM VirtualBox installation
Ubuntu installation
Hadoop prerequisite installation
Java installation
SSH installation and configuration
Hadoop system user
Apache Hadoop installation
Hadoop configuration
Path configuration for Hadoop commands
Hadoop server start and stop
Summary
Chapter 3: Hadoop Ecosystem
Apache Hadoop
Hadoop Distributed File System
HDFS hands-on
Creating a directory in HDFS
Copying files from a local file system to HDFS
Copying files from HDFS to a local file system
Deleting files and folders in HDFS
Hadoop MapReduce
Job Tracker and Task Tracker
The execution flow of MapReduce
Mapper
Shuffle and Sort
Reducer
Example program
Preparing the data file for analysis
Program code
Driver program
Mapper program
Reducer program
Observations and results
YARN
Resource Manager
Node Manager
Container
Application Master
Apache Projects related to big data
Apache Zookeeper
Apache Kafka
Apache Flume
Apache Cassandra
Apache HBase
Apache Spark
Summary
Chapter 4: NoSQL Database
What is NoSQL?
Benefits of NoSQL databases
NoSQL versus RDBMS
The CAP theorem
The ACID properties
Data models in NoSQL
Key-value data stores
Document store
Column stores
Graph stores
Apache Cassandra
Installation
Starting Cassandra
The Cassandra Query Language - CQL
The help command
Basic commands
Data manipulation
Creating, altering, and deleting a keyspace
Creating, altering, and deleting tables
Inserting, updating, and deleting data
The MongoDB database
Installing MongoDB
Starting MongoDB
Working on MongoDB
The help command
Basic commands
Data manipulation
Creating and deleting databases
Creating and deleting collections
The create, retrieve, update, delete operations
Neo4j database
Installing Neo4j
Starting Neo4j
The cypher query language
Help
Basic operations in Cypher
Creating nodes, relationships, and properties
Updating nodes, relationships, and properties
Deleting nodes, relationships, and properties
Reading nodes, relationships, and properties
Summary
Chapter 5: Off-the-Shelf Commercial Tools
Microsoft Azure
Building a practical application
Microsoft Azure account
The Azure Event Hub
IoT simulation application
Setting up an Azure Stream Analytics job
Input
Query
Output
Dashboard in Power BI
Summary
Chapter 6: Containerization
Virtualization
Hypervisors
Hardware-based hypervisors
Software-based hypervisors
What is containerization?
Benefits of containers
Docker
Docker workflow
Installation
Basic commands
Docker images
Building a Docker image
Running and verifying Docker images
Importing and exporting Docker images
Docker Swarm
Setting up Docker Swarm
Creating service containers
Replicating containers
Removing container services
Kubernetes
Key components
Pods
ReplicaSets
Deployments
PetSets
Installation
Deployment
Kubernetes Dashboard
Summary
Chapter 7: Network Infrastructure
Network
Local area networks
Metropolitan area networks
Wide area networks
Network connectivity
Wired
Wireless
Network visualization
Gephi
Installation
Java installation
First run
Practical example
Summary
Chapter 8: Cloud Infrastructure
Companies moving to cloud
Driving factors
Infrastructure
Locality of data
Requirements
Design considerations
Open source versus commercial
Commodity hardware versus purpose build
Cloud versus on-premises
Scale up and down
Application architecture
Cost decision
Summary
Chapter 9: Security and Monitoring
Simple Network Management Protocol
Benefits of SNMP
Security
Agents and Traps
Netflow
Nagios
Key benefits
Security Onion
Deployment scenarios
The Standalone model
The Server-Sensor model
Hybrid model
Preconfigured tools
Wireshark
Key features
Summary
Chapter 10: Frontend Architecture
React JS
Key concepts
Node.js
JSX
Unidirectional dataflow
Getting started with ReactJS
Single page application
React application project
React app directory structure
Components
Properties
Event handling
State
Redux
Architecture of Redux
Key concepts
Single store
Action
Reducers
Guestbook application
Installation
Create a store
Setting up Reducer
Setting up Dispatcher
Connect function
Setting up Subscribers
Final output
Summary
Chapter 11: Backend Architecture
API
RESTful API
HTTP request methods
GET
POST
PUT
DELETE
Authentication
Basic authentication
JSON Web Token
Header
Payload
Signature
Practical
RESTful web service
Java client
Redis
Installation
Redis server
Redis client
Working with Redis
Redis data types and structures
String
HashMap
List
Set
Redis Publish/Subscribe
Common key operations
Summary
Chapter 12: Machine Learning
Machine learning
Types of algorithms
Parametric algorithms
Non-parametric algorithms
Supervised learning
The classification model
Binary classification
Multi-class classification
The regression model
Linear regression
Polynomial regression
Unsupervised learning
Clustering, k-means
Neural networks
Feedforward neural network
Recurrent neural network
Symmetrically connected neural network
Deep neural networks
Decision tree classifiers
Summary
Chapter 13: Artificial Intelligence
Artificial intelligence
Convolutional neural networks
Deep learning using TensorFlow
TensorFlow
Installation
TensorFlow program
Uninstalling TensorFlow
TensorBoard
Program
Launching TensorBoard
TensorBoard graph
Object detection using YOLO
Installation
Compiling YOLO library
Trained weights
Detecting objects in an image
Summary
Chapter 14: Elasticsearch
Installing Elasticsearch
Starting the Elasticsearch server
Auto starting the Elasticsearch service
Stopping the Elasticsearch server
Uninstalling Elasticsearch
Kibana
Installation
Starting Kibana
Uninstalling Kibana
Security
Securing Elasticsearch
Securing Kibana
Understanding queries - CRUD commands
Creating
Reading
Updating
Deleting
Summary
Chapter 15: Structured Data
Data analysis
Installing MySQL
Importing data
Analyzing the data model
HBase
Installation
Starting an HBase instance
Stopping an HBase instance
Preparing an HBase for migration
Sqoop
Installation
Verifying the installation
MySQL JDBC driver
Importing data
Verifying the imported data
Summary
Chapter 16: Unstructured Data
Moving data into Hadoop
Downloading Flume
Environment configuration
Configuring agent and sink
Running Apache Flume
Transferring a log file
Converting images into text for analysis
Tesseract OCR
Installing Tesseract
Practical example
Complete code
Program execution
Summary
Chapter 17: Data Visualization
Matplotlib
Installing Matplotlib
Line chart
Bar charts
Stack charts
Scatter charts
Pie charts
Geographic projections
D3.js
Installation
Practical example
Output
Summary
Chapter 18: Financial Trading System
What is algorithmic trading?
Benefits of algorithmic trading
Big data in the financial market
Algorithmic trading strategies
Building an Expert Advisor
MetaTrader
Downloading and setting up MetaTrader
MetaQuotes language
Trading bot objective
Practical
Trading pattern - moving average
Decision time: buy or sell
Complete program
Backtesting in MetaTrader 4
Summary
Chapter 19: Retail Recommendation System
Types of recommendation system
Collaborative filtering
Content-based filtering
Demographic-based system
Utility-based system
Knowledge-based system
Hybrid model
Commercial tools
Barilliance
Softcube
Strands
Monetate
Nosto
Book recommendation system
Dataset
Directory structure
Code
Reading the dataset
Verifying the dataset
Data analysis
Age group
Commutative rating
Algorithms

Similar Items