launchflow.gcp.bigquery
BigQueryDataset
```python
class BigQueryDataset(GCPResource[BigQueryDatasetConnectionInfo])
```
A BigQuery Dataset resource.
Example usage:
```python
from google.cloud import bigquery

import launchflow as lf

# Automatically configures / deploys a BigQuery Dataset in your GCP project
dataset = lf.gcp.BigQueryDataset("my-dataset")

schema = [
    bigquery.SchemaField("name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="REQUIRED"),
]
table = dataset.create_table("table_name", schema=schema)

dataset.insert_table_data("table_name", [{"name": "Alice", "age": 30}])

# You can also use the underlying resource directly.
# For example, for a table with columns name, age:
query = f"""
SELECT name, age
FROM `{dataset.dataset_id}.table_name`
WHERE age > 10
ORDER BY age DESC
"""

for row in dataset.client().query(query):
    print(row)
```
__init__
```python
def __init__(name: str, *, location="US") -> None
```
Create a new BigQuery Dataset resource.
Args:
- `name`: The name of the dataset. This must be globally unique.
- `location`: The location of the dataset. Defaults to `"US"`.
dataset_id
```python
@property
def dataset_id() -> str
```
Get the dataset id.
Returns:
- The dataset id.
get_table_uuid
```python
def get_table_uuid(table_name: str) -> str
```
Get the table UUID, in the form `{project_id}.{dataset_id}.{table_id}`.
Args:
- `table_name`: The name of the table.
Returns:
- The table UUID.
client
```python
@lru_cache
def client() -> "bigquery.Client"
```
Get the BigQuery Client object.
Returns:
- The BigQuery Client object.
dataset
```python
@lru_cache
def dataset() -> "bigquery.Dataset"
```
Get the BigQuery Dataset object.
Returns:
- The BigQuery Dataset object.
create_table
```python
def create_table(table_name: str,
                 *,
                 schema: "Optional[List[bigquery.SchemaField]]" = None
                 ) -> "bigquery.Table"
```
Create a table in the dataset.
Args:
- `table_name`: The name of the table to create.
- `schema`: The schema of the table. Optional; defaults to `None`.
Returns:
- The BigQuery Table object.
delete_table
```python
def delete_table(table_name: str) -> None
```
Delete a table from the dataset.
Args:
- `table_name`: The name of the table to delete.
load_table_data_from_csv
```python
def load_table_data_from_csv(table_name: str, file_path: Path) -> None
```
Load data from a CSV file into a table.
Args:
- `table_name`: The name of the table to load the data into.
- `file_path`: The path to the CSV file to load.
insert_table_data
```python
def insert_table_data(table_name: str,
                      rows_to_insert: List[Dict[Any, Any]]) -> None
```
Insert in-memory data into a table. Note: there appears to be a bug in BigQuery where, if a table name is reused (created and then deleted recently), streaming inserts to it won't work. If you encounter an unexpected 404 error, try changing the table name.
Args:
- `table_name`: The name of the table to insert the data into.
- `rows_to_insert`: The data to insert into the table.
Raises:
- `ValueError`: If there were errors when inserting the data.