Summary: | Execute SQL with CSV datasets. SQL is usually reserved for interacting with databases, but in this video I show how you can use Databricks to run SQL queries against a CSV dataset. A few defaults can make working with a CSV dataset problematic, such as schema inference being disabled and the first row not being treated as a header. These matter when running SQL against the CSV, because with the defaults every single value is treated as a string. Although this video uses Azure Databricks, the same concepts should apply to any Databricks cluster. In this video you will learn how to: upload a CSV dataset to Databricks; create a notebook to work with the CSV dataset; find potential pitfalls in the default options for CSV and SQL. Useful resources: Azure Databricks, Pandas, and Opendatasets; Free Azure Certification for Students; Learn Azure Databricks fundamentals; Try Azure for Free; Introduction to Azure Databricks.
|
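To make the string-typing pitfall concrete, here is a minimal sketch that runs without a cluster. It is not the notebook from the video: it swaps in Python's built-in `csv` and `sqlite3` modules for Spark, and the sample data, the `fruit` table, and the helper names `load_rows` and `max_price` are all hypothetical. In a Databricks notebook the corresponding knobs are the `header` and `inferSchema` options of the Spark CSV reader.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an uploaded CSV file.
CSV_TEXT = "name,price\napple,9\nbanana,10\ncherry,100\n"


def load_rows(csv_text, has_header):
    """Parse CSV text; return (column_names, rows). Every value is a string."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if has_header:
        return rows[0], rows[1:]
    # Without a header the first line is kept as data and the columns get
    # positional names (Spark's defaults are _c0, _c1, ...).
    names = [f"_c{i}" for i in range(len(rows[0]))]
    return names, rows


def max_price(rows, price_type):
    """Load rows into an in-memory SQLite table and return MAX(price)."""
    conn = sqlite3.connect(":memory:")
    # price_type is interpolated only because it is a fixed literal here.
    conn.execute(f"CREATE TABLE fruit (name TEXT, price {price_type})")
    conn.executemany("INSERT INTO fruit VALUES (?, ?)", rows)
    (result,) = conn.execute("SELECT MAX(price) FROM fruit").fetchone()
    conn.close()
    return result


names, rows = load_rows(CSV_TEXT, has_header=True)
text_max = max_price(rows, "TEXT")     # compared as strings  -> '9'
int_max = max_price(rows, "INTEGER")   # compared as numbers  -> 100
print(names, text_max, int_max)
```

The same `MAX(price)` query returns `'9'` when the column is stored as text and `100` when it is stored as an integer, which is the kind of silent error that untyped CSV columns can cause in SQL.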