Data analysis and applications 1: clustering and regression, modeling-estimating, forecasting and data mining

"This series of books collects a diverse array of work that provides the reader with theoretical and applied information on data analysis methods, models, and techniques, along with appropriate applications. Volume 1 begins with an introductory chapter by Gilbert Saporta, a leading expert in th...

Bibliographic Details
Other Authors: Skiadas, Christos H., editor; Bozeman, James R., editor
Format: E-book
Language: English
Published: London, England ; Hoboken, New Jersey : ISTE, 2019.
Edition: 1st edition
Series: Innovation, entrepreneurship and management series. Big data, artificial intelligence and data analysis set ; 2.
THEi Wiley ebooks.
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630637206719
Table of Contents:
  • Cover
  • Half-Title Page
  • Title Page
  • Copyright Page
  • Contents
  • Preface
  • Introduction: 50 Years of Data Analysis: From Exploratory Data Analysis to Predictive Modeling and Machine Learning
  • I.1. The revolt against mathematical statistics
  • I.2. EDA and unsupervised methods for dimension reduction
  • I.2.1. The time of syntheses
  • I.2.2. The time of clusterwise methods
  • I.2.3. Extensions to new types of data
  • I.2.4. Nonlinear data analysis
  • I.2.5. The time of sparse methods
  • I.3. Predictive modeling
  • I.3.1. Paradigms and paradoxes
  • I.3.2. From statistical learning theory to empirical validation
  • I.3.3. Challenges
  • I.4. Conclusion
  • I.5. References
  • PART 1: Clustering and Regression
  • 1. Cluster Validation by Measurement of Clustering Characteristics Relevant to the User
  • 1.1. Introduction
  • 1.2. General notation
  • 1.3. Aspects of cluster validity
  • 1.3.1. Small within-cluster dissimilarities
  • 1.3.2. Between-cluster separation
  • 1.3.3. Representation of objects by centroids
  • 1.3.4. Representation of dissimilarity structure by clustering
  • 1.3.5. Small within-cluster gaps
  • 1.3.6. Density modes and valleys
  • 1.3.7. Uniform within-cluster density
  • 1.3.8. Entropy
  • 1.3.9. Parsimony
  • 1.3.10. Similarity to homogeneous distributional shapes
  • 1.3.11. Stability
  • 1.3.12. Further aspects
  • 1.4. Aggregation of indexes
  • 1.5. Random clusterings for calibrating indexes
  • 1.5.1. Stupid K-centroids clustering
  • 1.5.2. Stupid nearest neighbors clustering
  • 1.5.3. Calibration
  • 1.6. Examples
  • 1.6.1. Artificial data set
  • 1.6.2. Tetragonula bees data
  • 1.7. Conclusion
  • 1.8. Acknowledgment
  • 1.9. References
  • 2. Histogram-Based Clustering of Sensor Network Data
  • 2.1. Introduction
  • 2.2. Time series data stream clustering
  • 2.2.1. Local clustering of histogram data
  • 2.2.2. Online proximity matrix updating
  • 2.2.3. Off-line partitioning through the dynamic clustering algorithm for dissimilarity tables
  • 2.3. Results on real data
  • 2.4. Conclusions
  • 2.5. References
  • 3. The Flexible Beta Regression Model
  • 3.1. Introduction
  • 3.2. The FB distribution
  • 3.2.1. The beta distribution
  • 3.2.2. The FB distribution
  • 3.2.3. Reparameterization of the FB
  • 3.3. The FB regression model
  • 3.4. Bayesian inference
  • 3.5. Illustrative application
  • 3.6. Conclusion
  • 3.7. References
  • 4. S-weighted Instrumental Variables
  • 4.1. Summarizing the previous relevant results
  • 4.2. The notations, framework, conditions and main tool
  • 4.3. S-weighted estimator and its consistency
  • 4.4. S-weighted instrumental variables and their consistency
  • 4.5. Patterns of results of simulations
  • 4.5.1. Generating the data
  • 4.5.2. Reporting the results
  • 4.6. Acknowledgment
  • 4.7. References
  • PART 2: Models and Modeling
  • 5. Grouping Property and Decomposition of Explained Variance in Linear Regression
  • 5.1. Introduction
  • 5.2. CAR scores
  • 5.2.1. Definition and estimators
  • 5.2.2. Historical criticism of the CAR scores
  • 5.3. Variance decomposition methods and SVD
  • 5.4. Grouping property of variance decomposition methods
  • 5.4.1. Analysis of grouping property for CAR scores
  • 5.4.2. Demonstration with two predictors
  • 5.4.3. Analysis of grouping property using SVD
  • 5.4.4. Application to the diabetes data set
  • 5.5. Conclusions
  • 5.6. References
  • 6. On GARCH Models with Temporary Structural Changes
  • 6.1. Introduction
  • 6.2. The model
  • 6.2.1. Trend model
  • 6.2.2. Intervention GARCH model
  • 6.3. Identification
  • 6.4. Simulation
  • 6.4.1. Simulation on trend model
  • 6.4.2. Simulation on intervention trend model
  • 6.5. Application
  • 6.6. Concluding remarks
  • 6.7. References
  • 7. A Note on the Linear Approximation of TAR Models
  • 7.1. Introduction
  • 7.2. Linear representations and linear approximations of nonlinear models
  • 7.3. Linear approximation of the TAR model
  • 7.4. References
  • 8. An Approximation of Social Well-Being Evaluation Using Structural Equation Modeling
  • 8.1. Introduction
  • 8.2. Wellness
  • 8.3. Social welfare
  • 8.4. Methodology
  • 8.5. Results
  • 8.6. Discussion
  • 8.7. Conclusions
  • 8.8. References
  • 9. An SEM Approach to Modeling Housing Values
  • 9.1. Introduction
  • 9.2. Data
  • 9.3. Analysis
  • 9.4. Conclusions
  • 9.5. References
  • 10. Evaluation of Stopping Criteria for Ranks in Solving Linear Systems
  • 10.1. Introduction
  • 10.2. Methods
  • 10.2.1. Preliminaries
  • 10.2.2. Iterative methods
  • 10.3. Formulation of linear systems
  • 10.4. Stopping criteria
  • 10.5. Numerical experimentation of stopping criteria
  • 10.5.1. Convergence of stopping criterion
  • 10.5.2. Quantiles
  • 10.5.3. Kendall correlation coefficient as stopping criterion
  • 10.6. Conclusions
  • 10.7. Acknowledgments
  • 10.8. References
  • 11. Estimation of a Two-Variable Second-Degree Polynomial via Sampling
  • 11.1. Introduction
  • 11.2. Proposed method
  • 11.2.1. First restriction
  • 11.2.2. Second restriction
  • 11.2.3. Third restriction
  • 11.2.4. Fourth restriction
  • 11.2.5. Fifth restriction
  • 11.2.6. Coefficient estimates
  • 11.3. Experimental approaches
  • 11.3.1. Experiment A
  • 11.3.2. Experiment B
  • 11.4. Conclusions
  • 11.5. References
  • PART 3: Estimators, Forecasting and Data Mining
  • 12. Displaying Empirical Distributions of Conditional Quantile Estimates: An Application of Symbolic Data Analysis to the Cost Allocation Problem in Agriculture
  • 12.1. Conceptual framework and methodological aspects of cost allocation
  • 12.2. The empirical model of specific production cost estimates
  • 12.3. The conditional quantile estimation
  • 12.4. Symbolic analyses of the empirical distributions of specific costs
  • 12.5. The visualization and the analysis of econometric results
  • 12.6. Conclusion
  • 12.7. Acknowledgments
  • 12.8. References
  • 13. Frost Prediction in Apple Orchards Based upon Time Series Models
  • 13.1. Introduction
  • 13.2. Weather database
  • 13.3. ARIMA forecast model
  • 13.3.1. Stationarity and differencing
  • 13.3.2. Non-seasonal ARIMA models
  • 13.4. Model building
  • 13.4.1. ARIMA and LR models
  • 13.4.2. Binary classification of the frost data
  • 13.4.3. Training and test set
  • 13.5. Evaluation
  • 13.6. ARIMA model selection
  • 13.7. Conclusions
  • 13.8. Acknowledgments
  • 13.9. References
  • 14. Efficiency Evaluation of Multiple-Choice Questions and Exams
  • 14.1. Introduction
  • 14.2. Exam efficiency evaluation
  • 14.2.1. Efficiency measures and efficiency weighted grades
  • 14.2.2. Iterative execution
  • 14.2.3. Postprocessing
  • 14.3. Real-life experiments and results
  • 14.4. Conclusions
  • 14.5. References
  • 15. Methods of Modeling and Estimation in Mortality
  • 15.1. Introduction
  • 15.2. The appearance of life tables
  • 15.3. On the law of mortality
  • 15.4. Mortality and health
  • 15.5. An advanced health state function form
  • 15.6. Epilogue
  • 15.7. References
  • 16. An Application of Data Mining Methods to the Analysis of Bank Customer Profitability and Buying Behavior
  • 16.1. Introduction
  • 16.2. Data set
  • 16.3. Short-term forecasting of customer profitability
  • 16.4. Churn prediction
  • 16.5. Next-product-to-buy
  • 16.6. Conclusions and future research
  • 16.7. References
  • List of Authors
  • Index
  • Other titles from ISTE in Innovation, Entrepreneurship and Management
  • EULA.