---
orphan: true
---
# Scripts for GitHub CI
A set of `gh_*.py` scripts work together to produce size comparisons for PRs.
## Reports on Pull Requests
The scripts' results are presented as comments on PRs.
**Note** that a comment may be updated by the scripts as CI run results become
available.
**Note** that the scripts will not create a comment for a commit if there is
already a newer commit in the PR.
A size report comment consists of a title followed by one to four tables. A
title looks like:
> PR #12345678: Size comparison from `base-SHA` to `pr-SHA`
The first table, if present, lists items with a large increase, according to a
configurable threshold.
The next table, if present, lists all items that have increased in size.
The next table, if present, lists all items that have decreased in size.
The final table, always present, lists all items.
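The table layout described above can be sketched as follows. This is only an illustration, not the scripts' actual code: the `Item` tuple, the table titles, the function names, and the choice of `>=` for the threshold comparison are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical item: (name, size at base commit, size at PR commit), in bytes.
Item = Tuple[str, int, int]


def percent_change(old: int, new: int) -> float:
    """Percent growth from old to new; growth from zero counts as infinite."""
    if old == 0:
        return float("inf") if new > 0 else 0.0
    return 100.0 * (new - old) / old


def build_tables(items: List[Item], threshold: float) -> Dict[str, List[Item]]:
    """Group items into report tables, leaving out optional tables that are empty."""
    increased = [i for i in items if i[2] > i[1]]
    decreased = [i for i in items if i[2] < i[1]]
    # Assumption: an increase is 'large' when it meets or exceeds the threshold.
    large = [i for i in increased if percent_change(i[1], i[2]) >= threshold]
    optional = [
        ("large increases", large),
        ("increases", increased),
        ("decreases", decreased),
    ]
    tables = {title: rows for title, rows in optional if rows}
    tables["all"] = list(items)  # the final table is always present
    return tables
```

With no size changes at all, only the final table remains, which matches the "one to four tables" range.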
## Usage in CI
The original intent was to have a tool that would run after a build in CI, add
its sizes to a central database, and immediately report on size changes from the
parent commit in the database. Unfortunately, GitHub provides no practical place
to store and share such a database between workflow actions. Instead, the
process is split; builds in CI record size information in the form of GitHub
[artifacts](https://docs.github.com/en/actions/advanced-guides/storing-workflow-data-as-artifacts),
and a later step reads these artifacts to generate reports.
### 1. Build workflows
#### gh_sizes_environment.py
The `gh_sizes_environment.py` script should be run once in each workflow that
records sizes, _after_ checkout and _before_ any use of `gh_sizes.py`. It takes a
single argument, a JSON dictionary of the `github` context. Typically run as:
```
steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
          submodules: true
    - name: Set up environment for size reports
      if: ${{ !env.ACT }}
      env:
          GH_CONTEXT: ${{ toJson(github) }}
      run: scripts/tools/memory/gh_sizes_environment.py "${GH_CONTEXT}"
```
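The artifact name used in the upload step refers to `GH_EVENT_PR`, `GH_EVENT_HASH`, and `GH_EVENT_PARENT`. How `gh_sizes_environment.py` derives these is not documented here, so the sketch below is only a guess at how such values could be read from a `github` context dump. The function name, the field mapping, and the `"0"` placeholder for non-PR builds are assumptions.

```python
import json
from typing import Dict


def event_values(github_context_json: str) -> Dict[str, str]:
    """Derive PR number, commit hash, and parent hash from a `github` context dump."""
    github = json.loads(github_context_json)
    event = github.get("event", {})
    if github.get("event_name") == "pull_request":
        pull = event["pull_request"]
        return {
            "GH_EVENT_PR": str(pull["number"]),
            "GH_EVENT_HASH": pull["head"]["sha"],
            "GH_EVENT_PARENT": pull["base"]["sha"],
        }
    # Assumed push (trunk) handling: no PR, parent is the commit before the push.
    return {
        "GH_EVENT_PR": "0",
        "GH_EVENT_HASH": github["sha"],
        "GH_EVENT_PARENT": event.get("before", ""),
    }
```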
#### gh_sizes.py
The `gh_sizes.py` script runs on a built binary (executable or library) and
produces a JSON file containing size information.
Usage: `gh_sizes.py` _platform_ _config_ _target_ _binary_ [_output_]
Where _platform_ is the platform name, corresponding to a config file in
`scripts/tools/memory/platform/`.
Where _config_ is a configuration identification string. This has no fixed
meaning, but is intended to describe a build variation, e.g. a particular target
board or debug vs release.
Where _target_ is a readable name for the build artifact, identifying it in
reports.
Where _binary_ is the input build artifact.
Where _output_ is the name for the output JSON file, or a directory for it, in
which case the name will be
_platform_`-`_config_name_`-`_target_name_`-sizes.json`.
Example:
```
scripts/tools/memory/gh_sizes.py \
    linux arm64 thermostat-no-ble \
    out/linux-arm64-thermostat-no-ble/thermostat-app \
    /tmp/bloat_reports/
```
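The rule for the _output_ argument can be sketched as below. This is not the script's own code: the helper name is hypothetical, and it assumes that a directory is recognized by already existing on disk and that the name components are the _config_ and _target_ strings used unchanged.

```python
from pathlib import Path


def output_path(platform: str, config: str, target: str, output: str) -> Path:
    """Resolve the JSON output path, deriving a file name if `output` is a directory."""
    path = Path(output)
    if path.is_dir():
        # Assumption: components are joined as given, without further sanitizing.
        return path / f"{platform}-{config}-{target}-sizes.json"
    return path
```

Under these assumptions, the example above would write `/tmp/bloat_reports/linux-arm64-thermostat-no-ble-sizes.json`, provided `/tmp/bloat_reports/` exists.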
#### Upload artifacts
The JSON files generated by `gh_sizes.py` must be uploaded with an artifact name
of a very specific form in order to be processed correctly.
Example:
```
Size,Linux-Examples,${{ env.GH_EVENT_PR }},${{ env.GH_EVENT_HASH }},${{ env.GH_EVENT_PARENT }},${{ github.event_name }}
```
Other builds must replace `Linux-Examples` with a label unique to the workflow,
but otherwise use the form exactly.
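To make the form concrete, here is a minimal sketch that builds and parses such a name as six comma-separated fields. The `SizeArtifact` type and both function names are hypothetical; only the field order comes from the example above.

```python
from typing import NamedTuple, Optional


class SizeArtifact(NamedTuple):
    label: str   # workflow-unique label, e.g. "Linux-Examples"
    pr: str      # value of GH_EVENT_PR
    commit: str  # value of GH_EVENT_HASH
    parent: str  # value of GH_EVENT_PARENT
    event: str   # value of github.event_name


def format_artifact_name(artifact: SizeArtifact) -> str:
    """Join the fields into the 'Size,...' artifact name."""
    return ",".join(("Size",) + tuple(artifact))


def parse_artifact_name(name: str) -> Optional[SizeArtifact]:
    """Return the parsed fields, or None if the name does not have the expected form."""
    fields = name.split(",")
    if len(fields) != 6 or fields[0] != "Size":
        return None
    return SizeArtifact(*fields[1:])
```

Because the fields are split on commas, a label containing a comma would not round-trip in this sketch.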
### 2. Reporting workflow
Run a periodic workflow calling `gh_report.py` to generate PR comments. This
script has full `--help`, but normal use is probably best illustrated by an
example:
```
scripts/tools/memory/gh_report.py \
    --verbose \
    --report-increases 0.2 \
    --report-pr \
    --github-comment \
    --github-limit-artifact-pages 50 \
    --github-limit-artifacts 500 \
    --github-limit-comments 20 \
    --github-repository project-chip/connectedhomeip \
    --github-api-token "${{ secrets.GITHUB_TOKEN }}"
```
Notably, the `--report-increases` flag provides a _percent growth_ threshold for
calling out ‘large’ increases in GitHub comments.
When this script successfully posts a comment on a GitHub PR, it removes the
corresponding PR artifact(s) so that a future run will not process it again and
post the same comment. Only PR artifacts are removed, not push (trunk)
artifacts, since those may be used as a comparison base by many different PRs.
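The removal rule can be illustrated with a small filter over artifact names. This is a sketch only: the function name is hypothetical, and it assumes that the last field of the artifact name carries `github.event_name` and that "corresponding" means matching both the PR number and the commit hash.

```python
from typing import Iterable, List


def artifacts_to_delete(names: Iterable[str], pr: str, commit: str) -> List[str]:
    """Select pull-request size artifacts for a PR commit that has been commented on."""
    selected = []
    for name in names:
        fields = name.split(",")
        if len(fields) != 6 or fields[0] != "Size":
            continue  # not a size artifact
        _, _label, a_pr, a_commit, _parent, event = fields
        # Push (trunk) artifacts are kept; they may serve as a base for other PRs.
        if event == "pull_request" and a_pr == pr and a_commit == commit:
            selected.append(name)
    return selected
```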
## Using a database
It can be useful to keep a permanent record of build sizes.
### Updating the database: `gh_db_load.py`
To update an SQLite file of trunk commit sizes, periodically run:
```
gh_db_load.py \
    --repo project-chip/connectedhomeip \
    --token ghp_ThIsIsNoTMyReAlGiThUbToKeNSoDoNoTtRy \
    --db /path/to/database
```
Those interested in only a single platform can add the `--github-label` option,
providing the same name as in the size artifact name after `Size,` (e.g.
`Linux-Examples` in the upload example above).
See `--help` for additional options.
_Note_: Transient 4xx and 5xx errors from GitHub's API are very common. Run
`gh_db_load.py` frequently enough to give it several attempts before the
relevant artifacts expire.
### Querying the database: `gh_db_query.py`
While the database can of course be used directly, the `gh_db_query.py` script
provides a handful of common queries.
Note that this script (like others that show tables) has an `--output-format`
option offering (among others) CSV, several JSON formats, and any text format
provided by [tabulate](https://pypi.org/project/tabulate/).
Two notable options:
- `--query-build-sizes PLATFORM,CONFIG,TARGET` lists sizes for all builds of
the given kind, with a column for each section.
- `--query-section-changes PLATFORM,CONFIG,TARGET,SECTION` lists changes for
the given section. The `--report-increases PERCENT` option limits this to
changes over a given threshold (as is done for PR comments).
(To find out what PLATFORM, CONFIG, TARGET, and SECTION exist:
`--query-platforms`, then `--query-platform-targets=PLATFORM` and
`--query-platform-sections=PLATFORM`.)
See `--help` for additional options.