Summary: | In this course, we will cover the different aspects of building and vetting datasets to minimize problems in AI training. We'll talk about compiling datasets, "fixing" bad or incomplete data, profiling data to find biases, data labeling and annotation, data quality considerations, data privacy and security, and more. We will start by covering how to properly assemble and curate datasets: how to vet and catalog the data, how to preprocess them, and how to ensure a certain level of data quality from the get-go. Then, we'll talk about ensuring data quality: how to put policies and rules in place so that future data meet quality standards, and how to remediate data that already fall below the expected thresholds. Next, our focus will be on de-biasing data. Even if data seem objectively "of quality," they may still lack variety along several dimensions, which can cause warped and harmful model outputs. We'll cover how to tease out, and how to deal with, various types of bias. Finally, we'll cover data security and privacy, because even if your models don't harm users, the data itself can. We'll talk about how to implement access control, how to put appropriate security controls in place, and how to protect personally identifiable information (PII) when handling training datasets. All of this so that, at the end of the day, data becomes an accelerator instead of a hurdle in your training efforts.