Java for Data Science: Examine the techniques and Java tools supporting the growing field of data science
About This Book: Your entry ticket to the world of data science with the stability and power of Java. Explore, analyse, and visualize your data effectively using easy-to-follow examples. Make your Java applications more...
Other Authors: | , |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham, England ; Mumbai, [India] : Packt, 2017. |
Edition: | 1st edition |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630293406719 |
Table of Contents:
- Cover
- Copyright
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Getting Started with Data Science
- Problems solved using data science
- Understanding the data science problem-solving approach
- Using Java to support data science
- Acquiring data for an application
- The importance and process of cleaning data
- Visualizing data to enhance understanding
- The use of statistical methods in data science
- Machine learning applied to data science
- Using neural networks in data science
- Deep learning approaches
- Performing text analysis
- Visual and audio analysis
- Improving application performance using parallel techniques
- Assembling the pieces
- Summary
- Chapter 2: Data Acquisition
- Understanding the data formats used in data science applications
- Overview of CSV data
- Overview of spreadsheets
- Overview of databases
- Overview of PDF files
- Overview of JSON
- Overview of XML
- Overview of streaming data
- Overview of audio/video/images in Java
- Data acquisition techniques
- Using the HttpURLConnection class
- Web crawlers in Java
- Creating your own web crawler
- Using the crawler4j web crawler
- Web scraping in Java
- Using API calls to access common social media sites
- Using OAuth to authenticate users
- Handling Twitter
- Handling Wikipedia
- Handling Flickr
- Handling YouTube
- Searching by keyword
- Summary
- Chapter 3: Data Cleaning
- Handling data formats
- Handling CSV data
- Handling spreadsheets
- Handling Excel spreadsheets
- Handling PDF files
- Handling JSON
- Using JSON streaming API
- Using the JSON tree API
- The nitty gritty of cleaning text
- Using Java tokenizers to extract words
- Java core tokenizers
- Third-party tokenizers and libraries
- Transforming data into a usable form
- Simple text cleaning
- Removing stop words
- Finding words in text
- Finding and replacing text
- Data imputation
- Subsetting data
- Sorting text
- Data validation
- Validating data types
- Validating dates
- Validating e-mail addresses
- Validating ZIP codes
- Validating names
- Cleaning images
- Changing the contrast of an image
- Smoothing an image
- Brightening an image
- Resizing an image
- Converting images to different formats
- Summary
- Chapter 4: Data Visualization
- Understanding plots and graphs
- Visual analysis goals
- Creating index charts
- Creating bar charts
- Using country as the category
- Using decade as the category
- Creating stacked graphs
- Creating pie charts
- Creating scatter charts
- Creating histograms
- Creating donut charts
- Creating bubble charts
- Summary
- Chapter 5: Statistical Data Analysis Techniques
- Working with mean, mode, and median
- Calculating the mean
- Using simple Java techniques to find mean
- Using Java 8 techniques to find mean
- Using Google Guava to find mean
- Using Apache Commons to find mean
- Calculating the median
- Using simple Java techniques to find median
- Using Apache Commons to find the median
- Calculating the mode
- Using ArrayLists to find multiple modes
- Using a HashMap to find multiple modes
- Using Apache Commons to find multiple modes
- Standard deviation
- Sample size determination
- Hypothesis testing
- Regression analysis
- Using simple linear regression
- Using multiple regression
- Summary
- Chapter 6: Machine Learning
- Supervised learning techniques
- Decision trees
- Decision tree types
- Decision tree libraries
- Using a decision tree with a book dataset
- Testing the book decision tree
- Support vector machines
- Using an SVM for camping data
- Testing individual instances
- Bayesian networks
- Using a Bayesian network
- Unsupervised machine learning
- Association rule learning
- Using association rule learning to find buying relationships
- Reinforcement learning
- Summary
- Chapter 7: Neural Networks
- Training a neural network
- Getting started with neural network architectures
- Understanding static neural networks
- A basic Java example
- Understanding dynamic neural networks
- Multilayer perceptron networks
- Building the model
- Evaluating the model
- Predicting other values
- Saving and retrieving the model
- Learning vector quantization
- Self-Organizing Maps
- Using a SOM
- Displaying the SOM results
- Additional network architectures and algorithms
- The k-Nearest Neighbors algorithm
- Instantaneously trained networks
- Spiking neural networks
- Cascading neural networks
- Holographic associative memory
- Backpropagation and neural networks
- Summary
- Chapter 8: Deep Learning
- Deeplearning4j architecture
- Acquiring and manipulating data
- Reading in a CSV file
- Configuring and building a model
- Using hyperparameters in ND4J
- Instantiating the network model
- Training a model
- Testing a model
- Deep learning and regression analysis
- Preparing the data
- Setting up the class
- Reading and preparing the data
- Building the model
- Evaluating the model
- Restricted Boltzmann Machines
- Reconstruction in an RBM
- Configuring an RBM
- Deep autoencoders
- Building an autoencoder in DL4J
- Configuring the network
- Building and training the network
- Saving and retrieving a network
- Specialized autoencoders
- Convolutional networks
- Building the model
- Evaluating the model
- Recurrent Neural Networks
- Summary
- Chapter 9: Text Analysis
- Implementing named entity recognition
- Using OpenNLP to perform NER
- Identifying location entities
- Classifying text
- Word2Vec and Doc2Vec
- Classifying text by labels
- Classifying text by similarity
- Understanding tagging and POS
- Using OpenNLP to identify POS
- Understanding POS tags
- Extracting relationships from sentences
- Using OpenNLP to extract relationships
- Sentiment analysis
- Downloading and extracting the Word2Vec model
- Building our model and classifying text
- Summary
- Chapter 10: Visual and Audio Analysis
- Text-to-speech
- Using FreeTTS
- Getting information about voices
- Gathering voice information
- Understanding speech recognition
- Using CMUSphinx to convert speech to text
- Obtaining more detail about the words
- Extracting text from an image
- Using Tess4j to extract text
- Identifying faces
- Using OpenCV to detect faces
- Classifying visual data
- Creating a Neuroph Studio project for classifying visual images
- Training the model
- Summary
- Chapter 11: Mathematical and Parallel Techniques for Data Analysis
- Implementing basic matrix operations
- Using GPUs with Deeplearning4j
- Using map-reduce
- Using Apache's Hadoop to perform map-reduce
- Writing the map method
- Writing the reduce method
- Creating and executing a new Hadoop job
- Various mathematical libraries
- Using the jblas API
- Using the Apache Commons math API
- Using the ND4J API
- Using OpenCL
- Using Aparapi
- Creating an Aparapi application
- Using Aparapi for matrix multiplication
- Using Java 8 streams
- Understanding Java 8 lambda expressions and streams
- Using Java 8 to perform matrix multiplication
- Using Java 8 to perform map-reduce
- Summary
- Chapter 12: Bringing It All Together
- Defining the purpose and scope of our application
- Understanding the application's architecture
- Data acquisition using Twitter
- Understanding the TweetHandler class
- Extracting data for a sentiment analysis model
- Building the sentiment model
- Processing the JSON input
- Cleaning data to improve our results
- Removing stop words
- Performing sentiment analysis
- Analysing the results
- Other optional enhancements
- Summary
- Index