If you are a Firebase or Google Analytics 4 user and you have setup data imports into your BigQuery data warehouse then you might want to create real-time custom reports with your data in intraday schema. The problem is that this integrated dataset is being deleted automatically by Google every day. So if you choose to connect it as a datasource to your report in Google Data Studio you won't find it the day after.
For example, I want to do real-time analytics on intraday tables in our BI solution built with Data Studio. Due to it's being constantly deleted I can't connect it as a dataset to Google Data Studio. Also when it's deleted it stops being a partitioned table and you can't use wildcard suffix for date querying.
Having said that, I would need the following:
I'm sure it's very common.
I have raised a feature request with Google Engineering to enable manual deletion if needed but at the moment of writing this article it is still in progress.
"intraday tables shouldn't be deleted automatically. We would want to do it manually." It might take a while though. So here is the solution to overcome this.
Ideally you would want to run a scheduled script where parameter is an intraday table's date suffix.
However it is not that simple...
So you don't know the exact time to schedule the script. Alternatively you would use a daily export table but again you would have to wait unitl that extract is ready. I have been looking for information about intraday tables deletion time, and I found that it is possible for the intraday table from the past day to still exist for a couple of hours. One of the reason is the process requiring more time for large amounts of data Stackoverflow. So there is no set time for the past table to be deleted, but generally, it can still exist a couple of hours after the new day. The timezone you set in GA also affects this.
BigQuery scripting is a powerful tool but at the moment (2021-11-20) Google's BigQuery scripts can't use parameters for table names. According to [Parameterized queries in BigQuery documentation5, query parameters can't be used for SQL object identifiers
Parameters cannot be used as substitutes for identifiers, column names, table names, or other parts of the query.
You would want to use this script returning true
if data exists:
Then you add into you script that would perform an update:
You could:
This script wil copy whole table:
You would want to use the same SQL script as in Copy and Replace intraday events table but change copyAndReplaceQuery to selectAndInsert query. Let's create this script that would do the following:
Solution:
Intraday tables might have minor discrepancies in comparison to the daily session tables. These discrepancies are due to the following:
In general if those fields don't make any difference in your data stack this is the solution to improve data availability with *Firebase or Google Analytics 4 integrations.
Comments
Jen Lopez November 13, 2021
Very useful.