Guerrilla analytics: a practical approach to working with data

Doing data science is difficult. Projects are typically very dynamic, with requirements that change as data understanding grows. The data itself arrives piecemeal, is added to and replaced, contains undiscovered flaws, and comes from a variety of sources. Teams also have mixed skill sets and tooling is o...

Bibliographic Details
Other Authors: Ridge, Enda (author); Rogers, Mark (designer)
Format: E-book
Language: English
Published: Waltham, Massachusetts : Morgan Kaufmann, 2015.
Edition: 1st edition
Series: Savvy manager's guides
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628695506719
Table of Contents:
  • Cover; Title Page; Copyright Page; Contents; List of Figures; Table of War Stories; Preface; Why this book?; What this book is and what it is not; Who should read this book?; How this book is organized; Disclaimer; Part 1 - Principles; Chapter 1 - Introducing Guerrilla Analytics; 1.1 - What is data analytics?; 1.1.1 - Data Analytics Definition; 1.1.2 - Examples of Data Analytics; 1.2 - Types of data analytics projects; 1.3 - Introducing Guerrilla Analytics projects; 1.4 - Guerrilla Analytics definition; 1.4.1 - Changing Data; 1.4.2 - Changing Requirements; 1.4.3 - Changing Resource
  • 1.4.4 - Limited Time; 1.4.5 - Limited Toolsets; 1.4.6 - Analytics Results Must be Reproducible; 1.4.7 - Work Products must be easily explained; 1.5 - Example Guerrilla Analytics projects; 1.6 - Some terminology; 1.7 - Wrap up; Chapter 2 - Guerrilla Analytics: Challenges and Risks; 2.1 - The Guerrilla Analytics workflow; 2.2 - Challenges of managing analytics projects; 2.2.1 - Tracking Multiple Data Inputs; 2.2.2 - Versioning Multiple Data Inputs; 2.2.3 - Tracking Multiple Data Work Products; 2.2.4 - Data Generated by People; 2.2.5 - External Data; 2.2.6 - Version Control of Analytics
  • 2.2.7 - Creating Analytics that is Reproducible; 2.2.8 - Testing and Reviewing Analytics; 2.2.9 - Foreign Data Environment; 2.2.10 - Upskilling a Team Quickly; 2.2.11 - Reskilling a Team Quickly; 2.3 - Risks; 2.3.1 - Losing the Link Between Data Received and its Storage Location; 2.3.2 - Losing the Link Between Raw Data and Derived Data; 2.3.3 - Inability to Reproduce Work Products Because Source Datasets have Disappeared or been Modified; 2.3.4 - Inability to Easily Navigate the Analytics Environment; 2.3.5 - Conflicting Changes to Datasets; 2.3.6 - Changing of Raw Data
  • 2.3.7 - Out of Date Documentation Misleads the Team; 2.3.8 - Failure to Communicate Updates to Team Knowledge; 2.3.9 - Multiple Copies of Files and Work Products; 2.3.10 - Fragmented Code that Cannot be Executed Without the Author's Input; 2.3.11 - Inability to Identify the Source of a Dataset; 2.3.12 - Lack of Clarity Around Derivation of an Analysis; 2.3.13 - Multiple Versions of Tools and Libraries; 2.4 - Impact of failure to address analytics risks; 2.5 - Wrap up; Chapter 3 - Guerrilla Analytics Principles; 3.1 - Maintain data provenance despite disruptions; 3.2 - The principles
  • 3.2.8 - Principle 7: Prefer Analytics Code that Runs from Start to Finish