Frequently Asked Questions
Why is the data from Player Warehouse tables not exactly the same as what I see in the GameAnalytics tool?
Our GameAnalytics tool uses HyperLogLog approximation methods to count distinct values, which is common among analytics tools. The error margin is low and should always be below 5%, but it means results can differ slightly between the GameAnalytics tool and the queries you run in Player Warehouse.
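To get a feel for the size of this gap, the sketch below compares an exact distinct count with BigQuery's built-in approximate aggregate for a single day. It is only an illustration under assumptions: `your_dataset.your_table` is a placeholder, the player_id and checkpoint columns are taken from the examples further down, and APPROX_COUNT_DISTINCT is BigQuery's own approximation, which may not match the exact method used inside the GameAnalytics tool.

```sql
-- Placeholder table name; replace with one of your Player Warehouse tables.
-- Assumes checkpoint is the DATE partition column and player_id identifies a player.
SELECT
  COUNT(DISTINCT player_id)        AS exact_players,   -- exact count
  APPROX_COUNT_DISTINCT(player_id) AS approx_players   -- approximate count
FROM `your_dataset.your_table`
WHERE checkpoint = 'YYYY-MM-DD'
```

The two columns will usually be close but not identical, which is the same kind of difference you may see between the tool and your own queries.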
When is the data updated?
Player Warehouse is updated daily, around 8 am, with data from the previous day.
Can I control my BigQuery spending on Player Warehouse?
Yes. By default, we apply a quota on queries, with an upper limit on your spending. Depending on your needs we can either decrease it, increase it or remove it altogether.
Can I create new tables and datasets in my Player Warehouse project?
Yes, you can create new datasets which you’ll own.
How can I provide access to the Player Warehouse for my team?
You’ll need to give us your Google Group email address, which we will use to grant access to the BigQuery project. Anyone you add to that group on your side will automatically get access to the service.
Can you provide Player Warehouse service accounts for connecting visualization tools or for programmatic access?
Please get in touch with us, and we’ll share the relevant details for you to connect to any visualization tool of your choice.
I get an error “Access Denied” when performing a query
If you can see the project, the permissions are most likely correct.
The most common root cause for this is querying the dataset
live_my_studio_name instead of
If this is not the case, reach out to us.
I get an error “Custom quota exceeded” when performing a query
The query you’re trying to run processes more data than the established quota allows. The tables in Player Warehouse contain vast amounts of data, so if you query all of it, you will most likely hit the threshold. The typical query where we see this happening is:
SELECT * FROM table LIMIT 100
In BigQuery, applying a LIMIT clause to a query does not affect the amount of data read, as explained in this section of BigQuery Docs.
Instead, we recommend taking advantage of the partitioning in place. All tables are partitioned by date (the checkpoint column), so queries like the following will be more efficient:
SELECT DISTINCT player_id FROM table WHERE checkpoint = 'YYYY-MM-DD'
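The same idea extends to a date range. The following is a minimal sketch, assuming checkpoint is a DATE column; `your_dataset.your_table` and the dates are placeholders chosen for illustration. Filtering on the partition column restricts the scan to the matching daily partitions, and listing only the columns you need reduces the bytes read further, since BigQuery stores data by column.

```sql
-- Placeholder table name and example dates; adjust to your own data.
-- Only the partitions for the seven selected days are scanned,
-- and only the checkpoint and player_id columns are read.
SELECT
  checkpoint,
  COUNT(DISTINCT player_id) AS daily_players
FROM `your_dataset.your_table`
WHERE checkpoint BETWEEN '2024-01-01' AND '2024-01-07'
GROUP BY checkpoint
ORDER BY checkpoint
```

Before running a query, the BigQuery console shows an estimate of the bytes it will process, which is a quick way to check it against your quota.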
How to work with JSON strings in BigQuery
Out of the box, BigQuery supports multiple functions to retrieve and transform JSON data; see JSON Functions in BigQuery.
Our suggestion is to use JSON_VALUE since it already removes the outermost quotes and unescapes the values.
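The difference is easiest to see on a literal. The sketch below runs against a hard-coded JSON string rather than any Player Warehouse table, and compares JSON_VALUE with JSON_QUERY, another standard BigQuery JSON function; the sample document is made up for illustration.

```sql
-- Self-contained: no table is read, so this processes 0 bytes.
SELECT
  JSON_VALUE('{"ball_color": "red"}', '$.ball_color') AS with_json_value,  -- red
  JSON_QUERY('{"ball_color": "red"}', '$.ball_color') AS with_json_query   -- "red"
```

JSON_VALUE returns the bare scalar, which is usually what you want for grouping or filtering, while JSON_QUERY keeps the JSON quoting.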
Example to retrieve the number of events per "ball_color", where "ball_color" is a field in custom_fields:
SELECT
  JSON_VALUE(custom_fields, '$.ball_color') AS ball_color,
  COUNT(*) AS nr_events
FROM table
GROUP BY 1
ORDER BY 2 DESC