Frequently Asked Questions

Why is the data from Data Warehouse tables not exactly the same as what I see in the GameAnalytics tool?

The GameAnalytics tool uses HyperLogLog, a method for estimating distinct counts. This may introduce small discrepancies when comparing against non-approximated results, such as those from queries run in BigQuery.
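As a rough illustration of the kind of gap an approximation can introduce, BigQuery itself offers both an exact and an approximate distinct count. The sketch below assumes a placeholder table `your_dataset.your_table` with the player_id and checkpoint columns used elsewhere in this FAQ, and an arbitrary example date. It is not necessarily how the GameAnalytics tool computes its figures.

```sql
-- Placeholder table name: replace with one of your Data Warehouse tables.
SELECT
  COUNT(DISTINCT player_id) AS exact_players,         -- exact distinct count
  APPROX_COUNT_DISTINCT(player_id) AS approx_players  -- approximate count, may differ slightly
FROM `your_dataset.your_table`
WHERE checkpoint = "2024-01-01"  -- example date, chosen only for illustration
```

On large tables the two columns are usually close but not identical, which is the same effect you may see between the tool and your own queries.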

When is the data updated?

Data Warehouse is updated daily, around 8 am, with data from the previous day.

Can I control my BigQuery spending on Data Warehouse?

Yes. By default, we apply a quota on queries that caps your spending. Depending on your needs, we can decrease it, increase it, or remove it altogether.

Can I create new tables and datasets in my Data Warehouse project?

Yes.
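A minimal sketch of what this could look like with BigQuery DDL. The names my_analysis, daily_players, and `your_dataset.your_table` are hypothetical, and the example assumes the new dataset is created in the same location as the source data.

```sql
-- Hypothetical dataset for your own derived tables.
CREATE SCHEMA IF NOT EXISTS my_analysis;

-- Hypothetical summary table built from one day (one partition) of source data.
CREATE TABLE IF NOT EXISTS my_analysis.daily_players AS
SELECT
  checkpoint,
  COUNT(DISTINCT player_id) AS players
FROM `your_dataset.your_table`   -- placeholder source table
WHERE checkpoint = "2024-01-01"  -- example date
GROUP BY checkpoint;
```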

How can I provide access to the Data Warehouse for my team?

You’ll need to give us your Google Group email address, which we will use to grant access to the BigQuery project. Anyone you add to that group on your side will automatically get access to the service.

Can you provide Data Warehouse service accounts to connect visualization tools or for programmatic access?

Please get in touch with us, and we’ll share the relevant details for you to connect to any visualization tool of your choice.

I get an “Access Denied” error when running a query

If you can see the project, the permissions are most likely correct. The most common root cause for this is querying the dataset live_my_studio_name instead of live_my_studio_name_checkpoints. If this is not the case, reach out to us.
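For illustration, assuming a hypothetical table named some_table, the two queries differ only in the dataset name:

```sql
-- May return "Access Denied": dataset name without the _checkpoints suffix.
-- SELECT COUNT(*) FROM `live_my_studio_name.some_table`

-- Same query against the _checkpoints dataset.
SELECT COUNT(*) AS nr_rows
FROM `live_my_studio_name_checkpoints.some_table`  -- some_table is a hypothetical name
WHERE checkpoint = "2024-01-01"                    -- example date
```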

I get a “Custom quota exceeded” error when running a query

The query you’re trying to run processes more data than the established quota allows. The tables in Data Warehouse contain vast amounts of data, so if you query all of it, you will most likely hit the threshold. This most often happens with queries like:

SELECT * FROM table LIMIT 100

In BigQuery, applying a LIMIT clause to a query does not affect the amount of data read, as explained in the BigQuery documentation.

Instead, we recommend taking advantage of the partitioning in place. All tables are partitioned by date (checkpoint), so queries like the following are more efficient:

SELECT DISTINCT player_id FROM table WHERE checkpoint = "YYYY-MM-DD"
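The same idea extends to a range of days. The sketch below assumes a placeholder table `your_dataset.your_table` and example dates. It names only the columns it needs, since BigQuery bills by the columns read, and restricts the scan to one week of partitions.

```sql
-- Placeholder table name and example dates.
SELECT
  checkpoint,
  COUNT(DISTINCT player_id) AS players
FROM `your_dataset.your_table`
WHERE checkpoint BETWEEN "2024-01-01" AND "2024-01-07"  -- only these partitions are scanned
GROUP BY checkpoint
ORDER BY checkpoint
```

Before running a query like this, the BigQuery console shows an estimate of the bytes it will process, which is a quick way to check that the partition filter is taking effect.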

How do I work with JSON strings in BigQuery?

Out of the box, BigQuery supports multiple functions to retrieve and transform JSON data. See JSON Functions in the BigQuery documentation.

We suggest using JSON_VALUE, since it removes the outermost quotes and unescapes the values. The following example retrieves the number of events per "ball_color", where "ball_color" is a field in custom_fields:

SELECT
  JSON_VALUE(custom_fields, "$.ball_color") AS ball_color,
  COUNT(*) AS nr_events
FROM `events.design_event`
GROUP BY 1
ORDER BY 2 DESC
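To see why the choice of function matters, the following self-contained sketch runs both functions on a literal JSON string, so it reads no table data. The sample value "red" is made up for illustration.

```sql
SELECT
  -- JSON_QUERY keeps the JSON formatting, so the result includes the quotes: "red"
  JSON_QUERY('{"ball_color": "red"}', '$.ball_color') AS with_json_query,
  -- JSON_VALUE returns the scalar as a plain string: red
  JSON_VALUE('{"ball_color": "red"}', '$.ball_color') AS with_json_value
```

Grouping on the JSON_VALUE result gives clean labels, whereas grouping on the JSON_QUERY result would carry the surrounding quotes into every group key.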