% Automated testing with FiveUI

# Introducing "Headless"

-------------

The FiveUI distribution comes with a Java application called `Headless` that is
designed to make automated testing with FiveUI easy.

Headless can take a collection of FiveUI rule sets and a target
(a single web page, an entire website, or a filtered part of one)
and automate running the rule sets on the target(s). Headless
will then output a text- or HTML-based report summarizing the run.

"Headless runs" are specified using text-based run description files that are
written in JSON format. The exact form of these descriptions is given below.
Headless supports two modes for automating FiveUI. In the first mode,
Headless executes one FiveUI rule set per URL line in the run description and
outputs a report indicating which rule sets passed or failed (and how) for each
URL. In the second mode, Headless uses each URL line to specify a seed from
which a web crawl is started.  Parameters can be given to control the extent of
the crawl.

In what follows, `<FiveUI>` refers to the directory where you have installed the
FiveUI distribution.

## Quickstart

-------------

### Install the dependencies

You will need the following dependencies installed on your system in order
to use Headless. All of the dependencies can be found and easily installed
on most major platforms (Linux, Mac OS X, Windows). The dependencies are:

 - [Java runtime environment](http://www.java.com)
 - [Firefox](http://www.mozilla.org/en-US/firefox/organizations/all.html) 17 (the
   Extended Support Release, or E.S.R.)
 - [Maven](http://maven.apache.org/download.cgi)
 - an included Java library called `webdrivers` (see next step)

Note that on some platforms (e.g. Mac OS X) the Java runtime and Maven are
pre-installed.
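
As a quick sanity check before continuing, the sketch below reports which command-line dependencies are missing from your `PATH`. The function name `missing_tools` is made up for this illustration, and the sketch assumes the Java runtime and Maven are invoked as `java` and `mvn`. Firefox is not checked here because its location is configured separately in the run script.

```shell
# Print the names of the given commands that are not found on PATH.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space left by the concatenation above.
  echo "${missing# }"
}

missing=$(missing_tools java mvn)
if [ -n "$missing" ]; then
  echo "missing dependencies: $missing"
else
  echo "java and mvn found"
fi
```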

### Install the included webdrivers dependency

The following command will instruct Maven to install
the webdrivers Java library to your local Maven repository.

```
$ cd <FiveUI>/webdriver
$ mvn install
```

### Edit the run script

Edit variables at the start of `<FiveUI>/bin/runHeadless.sh` to
reflect your Firefox installation and FiveUI installation directory.

```
$ cat runHeadless.sh
export FIVEUI_ROOT_PATH=$HOME/galois/FiveUI
export FIREFOX_BIN_PATH=$HOME/myapps/Firefox17/Contents/MacOS/firefox
...
```
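
A wrong path in either variable is an easy mistake to make, so it can help to verify them before the first run. The following is a minimal sketch, not part of the FiveUI distribution: the function name `check_headless_env` is hypothetical, and it assumes the two variables have been exported in the current shell with the same names used in `runHeadless.sh`.

```shell
# Verify that the two paths used by runHeadless.sh point at real things.
check_headless_env() {
  if [ ! -d "${FIVEUI_ROOT_PATH:-}" ]; then
    echo "FIVEUI_ROOT_PATH is not a directory: ${FIVEUI_ROOT_PATH:-<unset>}" >&2
    return 1
  fi
  if [ ! -x "${FIREFOX_BIN_PATH:-}" ]; then
    echo "FIREFOX_BIN_PATH is not an executable: ${FIREFOX_BIN_PATH:-<unset>}" >&2
    return 1
  fi
  echo "paths look OK"
}

check_headless_env || echo "fix the variables in runHeadless.sh before running"
```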

### Invoke the script

The `runHeadless.sh` script can be invoked from the command line with options.

```
$ runHeadless.sh -h
usage: headless <input file 1> [<input file 2> ...]
 -h                      print this help message
 -o <outfile>            write output to file
 -r <report directory>   write HTML reports to given directory
 -v                      verbose output
 -vv                     VERY verbose output
```

### Try one of the included example runs

Here we execute the example headless run located in
`<FiveUI>/exampleData/headlessRuns/basicRun.json`.

```
$ cd <FiveUI>/exampleData/headlessRuns
$ runHeadless.sh basicRun.json -v -o reports/basic.out -r reports/basic
com.galois.fiveui.HeadlessRunner  - report directory already exists!
com.galois.fiveui.HeadlessRunner  - invoking headless run...
com.galois.fiveui.BatchRunner  - initializing BatchRunner ...
com.galois.fiveui.CrawlParameters  - setting doNotCrawl = True
com.galois.fiveui.BatchRunner  - setting seed URL for crawl: http://www.whitehouse.gov
com.galois.fiveui.BatchRunner  - skipping webcrawl
com.galois.fiveui.BatchRunner  - building webdrivers ...
com.galois.fiveui.BatchRunner  - built: [FirefoxDriver: firefox on MAC (acbc5ed2-2f3e-fc42-8d4d-08a3e31e30d6)]
com.galois.fiveui.BatchRunner  - registering new webdriver...
com.galois.fiveui.BatchRunner  - root path for webdriver is /Users/galois/FiveUI/
com.galois.fiveui.BatchRunner  - loading http://www.whitehouse.gov for ruleset run ...
com.galois.fiveui.BatchRunner  - running ruleset "Color Guidelines"
com.galois.fiveui.BatchRunner  - runRule: url=http://www.whitehouse.gov/, ruleSet="Color Guidelines"
com.galois.fiveui.BatchRunner  - being polite for 1000 millis...
```

After the run is complete you should see a text log of the run in
`reports/basic.out` and an HTML summary report in
`reports/basic/summary.html`. Note that there are many errors
reported on this particular run because the rule sets used (particularly
the color guidelines) were not designed with `whitehouse.gov` in
mind.

### Write your own run configuration

A run configuration is a text file in JSON format that determines the behavior
of a Headless run, in particular:

 - the location of rule set files
 - web crawl parameters
 - URLs to test (seeds for the crawl)

The format of a run configuration is as follows:

```javascript
/*
 * Comments
 */
{
  'rulePath'  : '<path>',                         // path = the path where rule set files referenced below live.
  'crawlType' : '<d> <n> <p> <pat>',              // d = crawl depth, n = max number of pages to retrieve.
                                                  // p = politeness delay (ms), pat = URL glob pattern
                                                  // (crawlType can also be 'none').
  'runs': [
  { 'url': '<url>', 'ruleSet': '<rule_file>' },   // Each line here corresponds to a separate webcrawl
  { 'url': '<url>', 'ruleSet': '<rule_file>' },   // and rule execution pass.
  ...                                             // Many URL lines may follow.
  ]
}
```
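
As a concrete illustration of the format above, the following shell snippet writes a small run configuration to disk. The file name `myRun.json`, the rule path `../ruleSets`, the rule file `colorRules.json`, and the glob pattern are placeholders invented for this example, and the exact pattern syntax accepted by the crawler is an assumption. Substitute values that match your installation.

```shell
# Write a sample run configuration (all paths and file names are hypothetical).
cat > myRun.json <<'EOF'
{
  'rulePath'  : '../ruleSets',
  'crawlType' : '1 10 1000 *whitehouse.gov*',
  'runs': [
    { 'url': 'http://www.whitehouse.gov', 'ruleSet': 'colorRules.json' }
  ]
}
EOF

# To run it (requires a configured FiveUI installation):
#   runHeadless.sh myRun.json -v -o reports/my.out -r reports/my
```

Here `crawlType` asks for a crawl of depth 1 that retrieves at most 10 pages, waits 1000 ms between requests, and follows only URLs matching the pattern. Setting `crawlType` to `'none'` instead skips the crawl, so each rule set runs only against the URL listed on its line.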

## Building Headless From Source

----------------

### Install Maven

Headless is a [Maven](http://maven.apache.org/)-managed Java project. The
project's top-level directory is `<FiveUI>/headless`.  You'll need Maven
installed on your system to continue.

 - On Debian-based Linux systems: `sudo apt-get install maven`
 - On Mac OS X: Maven 3.x comes pre-installed (OS X 10.7 or later)

### Compile Headless

Once you have Maven installed, you can compile the project. This will trigger
the dependencies to be downloaded and installed in your local Maven repository
(for a quick intro to using Maven, see [Maven in Five
Minutes](http://maven.apache.org/guides/getting-started/maven-in-five-minutes.html)).

```
$ mvn compile
```

### Get Firefox E.S.R.

In order to use Headless you also need a copy of Firefox 17 (the current
extended support release, or E.S.R.). More recent versions of Firefox may also work, but
they are not supported for use with FiveUI. Download and install Firefox 17
[here](http://www.mozilla.org/en-US/firefox/organizations/all.html). Note that
Firefox 17 can be installed alongside other versions of Firefox on
your system or isolated to your user directory by simply moving its
installation directory.

Now that you've installed Firefox 17, it's time to tell Headless where the
binary lives. For example, when I install Firefox 17 to `~/myapps/Firefox17` on a
Mac OS X system, the firefox binary lives at

```
~/myapps/Firefox17/Contents/MacOS/firefox
```

Locate your firefox binary and remember it for the next step.

### Configuration

In the top-level `headless` directory, copy the configuration file
`programs.properties.example` to `programs.properties`:

```
$ cp programs.properties.example programs.properties
```

Now, modify the first entry in `programs.properties`
to point to the location of your Firefox binary from the previous section.

At this point, Headless is ready to run; see the Quickstart section above for
usage examples. The "batteries-included" JAR file has been built and lives at
`<FiveUI>/headless/target/HeadlessRunner-0.0.1-SNAPSHOT.one-jar.jar`. Use `java`
to execute the JAR file manually:

```
$ java -jar HeadlessRunner-0.0.1-SNAPSHOT.one-jar.jar ...
```

### Testing

To run the project's unit tests and verify that things are working as they should on
your installation and system, use the Maven test target:

```
$ mvn test
```