Member-only story
How to select data from columns contain a substring from BigQuery
Recently I’m dealing with a situation where the requests ask for data from columns that contain a user-defined substring in the column name in our BigQuery table. In other words, it’s something like this:
SELECT columns contain this pattern FROM <YOUR_TARGET_TABLE>;
If we are querying the actual data that matches with a pattern, we can use the LIKE
operator.
SELECT * FROM <YOUR_TARGET_TABLE> WHERE name LIKE "%Abc" limit 10;
However, this does not work for the column names. So how do we solve it? In this post, we will look at two solutions:
- Using dynamic SQL
- Using the
Table
API
Solution one: use dynamic SQL
Since the user-defined pattern can vary, the SQL query itself becomes dynamic in nature. BigQuery provides a way to execute this type of query through scripting: https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting.
Specifically, we need to construct the columns based on the user input. This fits into the use case of scripting to “declare a variable, assign a value to it, and then reference it in a third statement” as stated in the documentation.
We can declare a string variable to represent the columns matching the pattern.
DECLARE columns STRING;