Switch to bigquery-public-data dataset from the-psf (#39)

Dustin Ingram
2021-06-30 19:29:09 -05:00
committed by GitHub
parent e44f3944da
commit da4d4aa8da
4 changed files with 5 additions and 5 deletions

@@ -297,11 +297,11 @@ def get_query(date):
details.python AS python_version,
details.system.name AS system
FROM
-`the-psf.pypi.file_downloads`
+`bigquery-public-data.pypi.file_downloads`
WHERE
DATE(timestamp) = '{date}'
AND
-(REGEXP_CONTAINS(details.python,r'^[0-9]\.[0-9]+.{{0,}}$') OR
+(REGEXP_CONTAINS(details.python,r'^[0-9]\.[0-9]+.{{0,}}$') OR
details.python IS NULL)
)
SELECT
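
Note on the hunk above: the `'{date}'` placeholder and the doubled braces in `.{{0,}}` suggest the query text is rendered with `str.format`, where `{{` and `}}` are escapes for literal braces. A minimal sketch of that behaviour follows, assuming a hypothetical `build_query` helper and a trimmed-down template. It is not the project's actual `get_query`; only the table name, the date filter, and the regex are taken from the diff.

```python
# Hypothetical, trimmed-down template for illustration only.
# A raw string keeps the regex backslash intact; str.format still
# processes {date} and turns the escaped {{0,}} into a literal {0,}.
QUERY_TEMPLATE = r"""
SELECT
  details.python AS python_version,
  details.system.name AS system
FROM
  `bigquery-public-data.pypi.file_downloads`
WHERE
  DATE(timestamp) = '{date}'
  AND (REGEXP_CONTAINS(details.python, r'^[0-9]\.[0-9]+.{{0,}}$') OR
       details.python IS NULL)
"""


def build_query(date: str) -> str:
    """Render the template for a single day (hypothetical helper)."""
    return QUERY_TEMPLATE.format(date=date)


if __name__ == "__main__":
    print(build_query("2021-06-30"))
```

The rendered SQL contains `.{0,}$` with single braces, which is what BigQuery would receive under this assumption.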

@@ -8,7 +8,7 @@
Index in lieu of having to execute queries against raw download records in Google BigQuery.</p>
<h3>Data</h3>
<p>Download stats are sourced from the Python Software Foundation's publicly available
-<a href="https://bigquery.cloud.google.com/table/the-psf:pypi.downloads">download stats</a>
+<a href="https://bigquery.cloud.google.com/table/bigquery-public-data:pypi.downloads">download stats</a>
on Google BigQuery. All aggregate download stats ignore known PyPI mirrors (such as
<a href="{{ url_for('general.package_page', package='bandersnatch') }}">bandersnatch</a>) unless noted
otherwise.</p>

@@ -27,7 +27,7 @@
</p>
<p>
You are much better off extracting the data directly from the Google
-BigQuery <a href="https://bigquery.cloud.google.com/table/the-psf:pypi.downloads">pypi downloads tables</a>. You
+BigQuery <a href="https://bigquery.cloud.google.com/table/bigquery-public-data:pypi.downloads">pypi downloads tables</a>. You
can query up to 1TB of data FREE every month before having to pay. The volume of data queried for this website
falls well under that limit (each month of data is less than 100 GB queried) and you will have your data
in a relatively short amount of time. <a

@@ -8,7 +8,7 @@
</h3>
<p>
PyPI provides download records as a publicly available dataset on Google's BigQuery. You can access the data
-with a Google Cloud account <a href="https://bigquery.cloud.google.com/table/the-psf:pypi.downloads">here</a>.
+with a Google Cloud account <a href="https://bigquery.cloud.google.com/table/bigquery-public-data:pypi.downloads">here</a>.
</p>
<h3>
When is the website data updated?