Train Word embeddings from scratch with Nessvec and PyTorch


Bibliographic Details
Corporate Author: Manning (Firm), publisher
Other Authors: Lane, Hobson, presenter
Format: Video
Language: English
Published: [Place of publication not identified] : Manning Publications [2022]
Edition: [First edition]
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009825927106719
Description
Summary: Hobson and his colleagues try to figure out how to train word embeddings from scratch using the WikiText2 dataset in PyTorch. The WikiText2 dataset contains redacted words, but they were unable to find the "labels" that reveal the words masked with the symbol `<unk>`. If you try to use the `Wikipedia` package to retrieve Wikipedia pages directly, you may hit the `suggest` bug. There are more than 100 unanswered issues on the project, and the maintainer has not pushed any changes for many years. The Tangible AI fork on GitLab fixes this search suggestion bug so we could easily crawl Wikipedia. Unfortunately, the Wikipedia-API package is not very useful for searching and crawling Wikipedia to retrieve text.
Physical Description: 1 online resource (1 video file (41 min.)) : sound, color