author Ben Wagner <benjaminwagner@google.com> 2017-09-27 14:45:47 -0400
committer Skia Commit-Bot <skia-commit-bot@chromium.org> 2017-09-28 13:45:37 +0000
commit 05572fa6e784b3037e94966c74c8a358348b2245
tree 97d1584708ded030488b988157cd47e7d9c3ddaf /site/dev
parent 420c4cfcd75189c03c735b1f02dee360e705c3e9
Update docs on Skia bots.
No-Try: true
Docs-Preview: https://skia.org/?cl=52163
Change-Id: I2bcd73bc7597219e4748c28e9120b5138a0eb3d1
Reviewed-on: https://skia-review.googlesource.com/52163
Reviewed-by: Eric Boren <borenet@google.com>
Reviewed-by: Ravi Mistry <rmistry@google.com>
Commit-Queue: Ben Wagner <benjaminwagner@google.com>
Diffstat (limited to 'site/dev')
-rw-r--r-- site/dev/sheriffing/trooper.md 27
-rw-r--r-- site/dev/testing/skialab.md 225
-rw-r--r-- site/dev/testing/swarmingbots.md 69
3 files changed, 75 insertions, 246 deletions
diff --git a/site/dev/sheriffing/trooper.md b/site/dev/sheriffing/trooper.md
index bcd456051f..e49d7cfb0e 100644
--- a/site/dev/sheriffing/trooper.md
+++ b/site/dev/sheriffing/trooper.md
@@ -50,14 +50,10 @@ Tips for troopers
- Install the Skia trooper Chrome extension (available [here](https://chrome.google.com/webstore/a/google.com/detail/alerts-for-skia-troopers/fpljhfiomnfioecagooiekldeolcpief)) to be able to see alerts quickly in the browser.
- Where machines are located:
- - Machine name like "skia-gce-NNN", "ct-gce-NNN" -> GCE
- - Machine name ends with "a3", "a4", "m3" -> Chrome Golo
+ - Machine name like "skia-gce-NNN", "skia-i-gce-NNN", "ct-gce-NNN", "skia-ct-gce-NNN", "ct-xxx-builder-NNN" -> GCE
+ - Machine name ends with "a9", "m3" -> Chrome Golo/Labs
- Machine name ends with "m5" -> CT bare-metal bots in Chrome Golo
- - Machine name starts with "skiabot-" -> Chapel Hill lab
- - Machine name starts with "win8" -> Chapel Hill lab (Windows machine
- names can't be very long, so the "skiabot-shuttle-" prefix is dropped.)
- - slave11-c3 is a Chrome infra GCE machine (not to be confused with the Skia
- Buildbots GCE, which we refer to as simply "GCE")
+ - Machine name starts with "skia-e-", "skia-i-" (other than "skia-i-gce-NNN"), "skia-rpi-" -> Chapel Hill lab
- The [chrome-infra hangout](https://goto.google.com/cit-hangout) is useful for
questions regarding bots managed by the Chrome Infra team and to get
@@ -76,23 +72,12 @@ Tips for troopers
username is chrome-bot and the password can be found on
[Valentine](https://valentine.corp.google.com/) as "chrome-bot (Win GCE)".
+- To log in to other bots, see the [Skolo maintenance doc](https://docs.google.com/document/d/1zTR1YtrIFBo-fRWgbUgvJNVJ-s_4_sNjTrHIoX2vulo/edit#heading=h.2nq3yd1axg0n) remote access section.
+
- If there is a problem with a bot in the Chrome Golo or Chrome infra GCE, the
best course of action is to
[file a bug](https://code.google.com/p/chromium/issues/entry?template=Build%20Infrastructure)
- with the Chrome infra team. But if you know what you're doing:
- - To access bots in the Chrome Golo,
- [follow these instructions](https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/ssh.md).
- - Machine name ends with "a3" or "a4" -> ssh command looks like `ssh
- build3-a3.chrome`
- - Machine name ends with "m3" -> ssh command looks like `ssh build5-m3.golo`
- - Machine name ends with "m5" -> ssh command looks like `ssh build1-m5.golo`.
- [Example bug](https://bugs.chromium.org/p/chromium/issues/detail?id=638193) to file to Infra Labs.
- - For MacOS and Windows bots, you will be prompted for a password, which is
- stored on [Valentine](https://valentine.corp.google.com/) as "Chrome Golo,
- Perf, GPU bots - chrome-bot".
- - To access bots in the Chrome infra GCE -> command looks like `gcutil
- --project=google.com:chromecompute ssh --ssh_user=default slave11-c3` (or
- use the ccompute ssh script from the infra_internal repo).
+ with the Chrome infra team.
- Read over the [Skolo maintenance doc](https://docs.google.com/document/d/1zTR1YtrIFBo-fRWgbUgvJNVJ-s_4_sNjTrHIoX2vulo/edit) for more detail on
dealing with device alerts.
diff --git a/site/dev/testing/skialab.md b/site/dev/testing/skialab.md
deleted file mode 100644
index 5ddb0deaf2..0000000000
--- a/site/dev/testing/skialab.md
+++ /dev/null
@@ -1,225 +0,0 @@
-SkiaLab
-=======
-
-Overview
---------
-
-Skia's buildbots are hosted in three places:
-
-* Google Compute Engine. This is the preferred location for bots which don't
- need to run on physical hardware, ie. anything that doesn't require a GPU,
- stable performance numbers, or a specific hardware configuration. Most of our
- compile bots live here, along with some non-GPU test bots on Linux and
- Windows.
-* Chrome Golo. This is the preferred location for bots which require specific
- hardware or OS configurations that are not supported by GCE. We have several
- Mac, Linux, and Windows bots in the Golo.
-* The local SkiaLab in Chapel Hill. Anything we can't get in GCE or the Golo
- lives here. This includes newer or uncommon GPUs and all Android, ChromeOS,
- and iOS devices.
-
-This page covers the local SkiaLab in Chapel Hill.
-
-
-Layout
-------
-
-The SkiaLab consists of three wireframe racks which hold machines connected to
-two KVM switches. Each KVM switch has a monitor, mouse, and keyboard and is the
-primary mode of access to the lab machines. In general, the machines are on the
-same rack as the KVM switch used to access them. The switch nearest the door
-(labeled "DOOR"), is connected to machines on its own rack as well as a smaller
-rack closer to the door.
-
-Each machine is labeled with its hostname and the number or letter used to
-access it on the KVM switch. Android devices are located on the rack nearest
-the interior of the office (the KVM switch is labeled "OFFICE"). They are
-labeled with their serial number and the name of the buildslave they are
-associated with. Each device connects to a host machine, either directly or
-by way of a powered USB hub.
-
-**Disclaimer: Please ONLY make changes on a lab machine as a last resort, as it
-is disruptive to the running bots and can leave the machines in a dirty state.
-If you must make changes, such as cloning a copy of Skia to run tests and debug
-failures, be sure to clean up after yourself. If a permanent change needs to be
-made on the machine (such as a driver update), please contact an infra team
-member.**
-
-
-Common Tasks
-------------
-
-### Locating the host machine for a failing bot
-
-Sometimes failures can only be reproduced on a particular hardware
-configuration. In these cases, it is sometimes necessary to log into the host
-machine where a failing bot is running in order to debug the failure.
-
-From the [Status](https://status.skia.org/) page:
-
-1. Click on the box associated with a failed build.
-2. A popup will appear with some information about the build, including the
- builder and buildslave. Click the "Lookup" link next to "Host machine". This
- will bring you to the [SkiaLab Hosts](https://status.skia.org/hosts) page,
- which contains information about the machines in the lab, pre-filtered to
- select the machine which runs the buildslave in question.
-3. The information box will display the hostname of the machine as well as the
- KVM switch and number used to access the machine, if the machine is in the
- SkiaLab.
-4. Walk over to the lab. While standing at the KVM switch indicated by the host
- information page, double tap \<ctrl\> and then press the number or letter from
- the information page. It may be necessary to move or click the mouse to wake
- the machine up.
-5. Log in to the machine if necessary. The password is stored in
- [Valentine](https://valentine/) as "Chapel Hill buildbot slave password".
-
-### Rebooting a problematic Android device
-
-Follow the same process as above, with some slight changes:
-
-1. On the [Status](https://status.skia.org/) page, click the box for the failed
- build.
-2. Click the "Lookup" link for the host machine. Remember the name of the
- buildslave which ran the build.
-3. The hosts page will display the information used to access the host machine
- for the device as well as the serial number for the device next to the name
- of its buildsave.
-4. Walk over to the lab and find the Android device with the serial number from
- the hosts page. Hold the power and volume-up buttons until the device
- reboots.
-5. Access the host machine for the device, per the above instructions. Use the
- `which_devices.py` script to verify that the device has re-attached. From
- the home directory:
-
- $ python buildbot/scripts/which_devices.py
-
-
-Maintenance Tasks
------------------
-
-### Bringing up a new buildbot host machine
-
-This assumes that we're just adding a host machine for a new buildbot slave,
-and doesn't cover how to make changes to the buildbot code to change the
-behavior of the builder itself.
-
-1. Obtain the machine itself and place it on the racks in the lab. Connect
- power, ethernet, and KVM cables.
-2. If we already have a disk image appropriate for this machine, follow the
- instructions for flashing a disk image to a machine below. Otherwise, follow
- the instructions for bringing up a new machine from scratch.
-3. Power on the machine. Be sure to kill any buildbot processes that start up,
- eg. `killall python` on Linux and Mac, and just close any cmd instances which
- pop up on Windows.
-4. Set the hostname for the machine.
-5. Ensure that the machine is labeled with its hostname and KVM number.
-6. Add the new slave to the slaves.cfg file on the appropriate master, eg.
- https://chromium.googlesource.com/chromium/tools/build/+/master/masters/master.client.skia/slaves.cfg,
- and upload the change for code review.
-7. Add an entry for the new host machine to the slave_hosts_cfg.py file in the
- Skia infra repo: https://skia.googlesource.com/buildbot/+/master/site_config/slave_hosts_cfg.py,
- and upload it for review.
-8. Commit the change to add the slave to the master. Once it lands, commit the
- slave_hosts_cfg.py change immediately afterward.
-9. Restart the build master. Either ask borenet@ to do this or file a
- [ticket](https://code.google.com/p/chromium/issues/entry?template=Build%20Infrastructure&labels=Infra-Labs,Restrict-View-Google,Infra-Troopers&summary=Restart%20request%20for%20[%20name%20]&comment=Please%20provide%20the%20reason%20for%20restart.%0A%0ASet%20to%20Pri-0%20if%20immediate%20restarted%20is%20required,%20otherwise%20please%20set%20to%20Pri-1%20and%20the%20restart%20will%20happen%20when%20the%20trooper%20gets%20a%20free%20moment.) for a trooper to do it.
-10. Reboot the machine and monitor the build master to ensure that it connects.
- This can take some time, since the bot needs to sync Chrome.
-
-
-### Bringing up a new Android bot
-
-1. Locate or add a host machine. We generally want to keep the number of
- devices attached to each host below 5 or so. If a new host machine is
- required, follow the above instructions for bringing up a new buildbot
- host machine, with the exception that the slave corresponds to the Android
- device, not the host machine itself.
-2. Ensure that the buildslave is not yet running:
-
- $ killall python
-
-3. Disable MTP and PTP on the device. Some devices require one or the other to
- be enabled; in that case, select PTP and choose to 'do nothing' when
- attaching to the host machine.
-4. Connect the device to the host machine, either through a powered USB hub or
- directly to the machine.
-5. Make sure that the device is in developer mode and that USB debugging is
- enabled.
-6. Authorize the device for USB debugging on the host machine by checking the
- "always allow" box on dialog box which appears on the Android device after
- plugging it into the host.
-7. Ensure that the device appears as "connected" when you run the
- `which_devices.py` script:
-
- $ python buildbot/scripts/which_devices.py
-
-8. Reboot the machine to start the buildslave.
-
-
-### Bringing up a new machine from scratch
-
-TODO(borenet): Migrate from Google Docs.
-
-OS-specific instructions are available in a
-[Google Doc](https://docs.google.com/document/d/1X7Hvsj33AlBmj-KEWfFbmdCArUJJAICLkB7ipDcxRV8/edit)
-
-
-### Flashing a disk image to a machine
-
-1. Find the USB key labeled, "Clonezilla" in the SkiaLab and insert it into the
- machine.
-2. Turn on the machine and load the boot menu. For Shuttle machines, press
- \<del\> or \<esc\>. Mac machines require that you plug in the Mac keyboard and
- press the \<option\> key at boot. Boot from the USB key. It's typically UEFI
- and named something like "FlashBlu" or "Kanguru".
-3. At the Clonezilla menu, choose the "to RAM" option.
-4. Choose your preferred language.
-5. "Don't touch keymap".
-6. "Start Clonezilla".
-7. "device-image".
-8. "local_dev".
-9. Unplug the flash drive and plug in the external hard drive labeled, "Disk
- images." Wait for the "Attached Enclosure device" message to appear, then
- hit \<enter\>.
-10. Select the external drive to use for /home/partimag, something like,
- "1000GB_ntfs_My_Passport".
-11. Select the bot_img directory.
-12. Hit \<enter\> to continue.
-13. "Beginner"
-14. "restoredisk"
-15. Select the image to use. Make sure that it's compatible with this machine.
-16. Choose the hard drive in the machine. It should be the only option.
-17. "y" and "y"
-18. Choose "reboot" after flashing the image to the machine.
-19. Set the hostname of the machine so that it doesn't conflict with any
- existing machines.
-
-### Capturing a disk image
-
-1. Make sure that the machine is in a clean state: no pre-existing buildslave
- checkouts, extra software, etc.
-2. Find the USB key labeled, "Clonezilla" in the SkiaLab and insert it into the
- machine.
-3. Turn on the machine and load the boot menu. For Shuttle machines, press
- \<del\> or \<esc\>. Mac machines require that you plug in the Mac keyboard and
- press the \<option\> key at boot. Boot from the USB key. It's typically UEFI
- and named something like "FlashBlu" or "Kanguru".
-4. At the Clonezilla menu, choose the "to RAM" option.
-5. Choose your preferred language.
-6. "Don't touch keymap".
-7. "Start Clonezilla".
-8. "device-image".
-9. "local_dev"
-10. Unplug the flash drive and plug in the external hard drive labeled, "Disk
- images." Wait for the "Attached Enclosure device" message to appear, then
- hit \<enter\>.
-11. Select the external drive to use for /home/partimag, something like,
- "1000GB_ntfs_My_Passport".
-12. Select the bot_img directory.
-13. "Beginner"
-14. "savedisk"
-15. Choose a name for the disk image. The convention is:
- `skiabot-<hardware type>-<OS>-<disk image revision #>`
-12. Choose the hard drive in the machine. It should be the only option.
-13. "y"
-14. Choose "reboot" or "shut down" when finished.
diff --git a/site/dev/testing/swarmingbots.md b/site/dev/testing/swarmingbots.md
new file mode 100644
index 0000000000..75cf38b3d4
--- /dev/null
+++ b/site/dev/testing/swarmingbots.md
@@ -0,0 +1,69 @@
+Skia Swarming Bots
+==================
+
+Overview
+--------
+
+Skia's Swarming bots are hosted in three places:
+
+* Google Compute Engine. This is the preferred location for bots which don't need to run on physical
+ hardware, ie. anything that doesn't require a GPU or a specific hardware configuration. Most of
+ our compile bots live here, along with some non-GPU test bots on Linux and Windows. We get
+ surprisingly stable performance numbers from GCE, despite very few guarantees about the physical
+ hardware.
+* Chrome Golo. This is the preferred location for bots which require specific hardware or OS
+ configurations that are not supported by GCE. We have several Mac, Linux, and Windows bots in the
+ Golo.
+* The Skolo (local Skia lab in Chapel Hill). Anything we can't get in GCE or the Golo lives
+ here. This includes a wider variety of GPUs and all Android, ChromeOS, iOS, and other devices.
+
+[go/skbl](https://goto.google.com/skbl) lists all Skia Swarming bots.
+
+Adding new jobs
+---------------
+
+See [Skia Automated Testing](automated_testing) for an overview of how jobs and tasks are executed
+by the Skia Task Scheduler.
+
+If you would like to add jobs to build or test new configurations, please file a [New Bot
+Request](https://bugs.chromium.org/p/skia/issues/entry?template=New+Bot+Request).
+
+If you know that the new jobs will need new hardware or you aren't sure which existing bots should
+run the new jobs, assign to jcgregorio. Once the Infra team has allocated the hardware, we will
+assign back to you to complete the process.
+
+Generally it's possible to copy an existing job and make changes to accomplish what you want. You
+will need to add the new job to
+[infra/bots/jobs.json](https://skia.googlesource.com/skia/+/master/infra/bots/jobs.json). In some
+cases, you will need to make changes to recipes:
+
+* If there are new GN flags or compiler options:
+ [infra/bots/recipe_modules/flavor/gn_flavor.py](https://skia.googlesource.com/skia/+/master/infra/bots/recipe_modules/flavor/gn_flavor.py)
+* If there are modifications to dm flags:
+ [infra/bots/recipes/test.py](https://skia.googlesource.com/skia/+/master/infra/bots/recipes/test.py)
+* If there are modifications to nanobench flags:
+ [infra/bots/recipes/perf.py](https://skia.googlesource.com/skia/+/master/infra/bots/recipes/perf.py)
+
+If you need to do something more complicated, or if you are not sure how to add and configure the
+new jobs, please ask for help from borenet, benjaminwagner, or mtklein.
+
+Debugging
+---------
+
+If you need a physical machine/device to debug an issue, the [current
+Trooper](http://skia-tree-status.appspot.com/trooper) can loan one from the Skolo. For Internet
+access, you can connect to GoogleGuest WiFi.
+
+If you need to make changes on a Skolo device, please check with an Infra team member. Most can be
+flashed/imaged back to a clean state, but others can not.
+
+If a permanent change needs to be made on the machine (such as an OS or driver update), please [file
+a bug](https://bugs.chromium.org/p/skia/issues/entry?template=Infrastructure+Bug) and assign to
+jcgregorio for reassignment.
+
+
+Maintenance Tasks
+-----------------
+
+See the [Skolo maintenance
+doc](https://docs.google.com/document/d/1zTR1YtrIFBo-fRWgbUgvJNVJ-s_4_sNjTrHIoX2vulo/edit).
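
As an illustration of the machine-location rules that this change adds to trooper.md, the lookup could be sketched in Python roughly as follows. This is not part of the commit or of any Skia tooling. The function name `locate_machine` and the regular expressions are hypothetical, and the sketch assumes that "NNN" stands for digits and that "xxx" in "ct-xxx-builder-NNN" stands for a single hyphen-free token. The GCE patterns are checked first so that "skia-i-gce-NNN" is not mistaken for a Chapel Hill "skia-i-" machine; the order of the remaining checks is also an assumption.

```python
import re

# GCE name patterns transcribed from the updated trooper.md tips.
# Assumption: "NNN" means one or more digits, and "xxx" is any hyphen-free token.
_GCE_PATTERNS = [
    re.compile(r"^skia-gce-\d+$"),
    re.compile(r"^skia-i-gce-\d+$"),
    re.compile(r"^ct-gce-\d+$"),
    re.compile(r"^skia-ct-gce-\d+$"),
    re.compile(r"^ct-[^-]+-builder-\d+$"),
]

# Prefixes that the tips map to the Chapel Hill lab.
_CHAPEL_HILL_PREFIXES = ("skia-e-", "skia-i-", "skia-rpi-")


def locate_machine(name: str) -> str:
    """Return the hosting location suggested by a machine name (hypothetical helper)."""
    # GCE first, so "skia-i-gce-NNN" wins over the generic "skia-i-" prefix.
    if any(pattern.match(name) for pattern in _GCE_PATTERNS):
        return "GCE"
    if name.startswith(_CHAPEL_HILL_PREFIXES):
        return "Chapel Hill lab"
    if name.endswith("m5"):
        return "CT bare-metal bots in Chrome Golo"
    if name.endswith(("a9", "m3")):
        return "Chrome Golo/Labs"
    # The tips do not cover every name, so report that explicitly.
    return "unknown"
```

For example, `locate_machine("skia-i-gce-100")` returns `"GCE"`, while `locate_machine("build1-m5")` (a hostname taken from the removed ssh examples) returns `"CT bare-metal bots in Chrome Golo"`. Names that match none of the rules come back as `"unknown"` rather than being guessed at.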