XQuery/Uptime monitor
Motivation
You would like to monitor the service availability of several web sites or web services. You would like to do this all with XQuery and store the results in XML files. You would also like to see "dashboard" graphical displays of uptime.
There are several commercial services (Pingdom, Host-tracker) which will monitor the performance of your web sites in terms of uptime and response time.
Although the production of a reliable service requires a network of servers, the basic functionality can be performed using XQuery in a few scripts.
Method
This approach focuses on the uptime and response time of web pages. The core approach is to use the eXist job scheduler to execute an XQuery script at regular time intervals. This script performs an HTTP GET on a URI and records the statusCode of the site in an XML data file.
The operation is timed to gather response times from elapsed time (valid on a lightly used server) and the test results are stored. Reports can then be run from the test results and alerts sent when a site is observed to be down.
Even as a prototype, the access to fine-grained data has already revealed some response time issues on one of the sites at the University.
Conceptual Model
This ER model was created in QSEE, which can also generate SQL or XSD.
In this notation the bar indicates that Test is a weak entity with existence dependence on Watch.
Mapping ER model to Schemas
Watch-Test relationship
Since Test is dependent on Watch, the Watch-Test relationship can be implemented as composition, with the multiple Test elements contained in a Log element which itself is a child of the Watch element. Tests are stored in chronological order.
Watch Composition
Two possible approaches:
- add the Log as an element amongst the base data for the Watch
Watch
  uri
  name
  Log
    Test
- construct a Watch element which contains the Watch base data as WatchSpec and the Log
Watch
  WatchSpec (the Watch entity)
    uri
    name
  Log
The second approach preserves the original Watch entity as a node, and also fits with the use of XForms, allowing the whole WatchSpec node to be included in a form. However it introduces a difficult-to-name intermediate, and results in paths like
$watch/WatchSpec/uri
when
$watch/uri would be more natural.
Here we choose the first approach on the grounds that it is not desirable to introduce intermediate elements in anticipation of simpler implementation of a particular interface.
Watch entity
A Watch entity may be implemented as a file or as an element in a collection. Here we choose to implement Watch as an element in a Monitor container in a document. However this is a difficult decision and the XQuery code should hide this decision as much as possible.
Attribute implementation
Watch attributes are mapped to elements. Test attributes are mapped to attributes.
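As an illustration of this mapping, here is a minimal sketch of a Monitor document holding one Watch, written as an XQuery element constructor. The URI, name and test values are invented for the example; only the element and attribute names come from the model above.

```xquery
xquery version "1.0";

(: Sample data only: the uri, name and Test values are made up. :)
let $monitor :=
  <Monitor>
    <Watch>
      <uri>http://www.example.org/</uri>
      <name>Example site</name>
      <Log>
        <Test at="2009-01-01T10:00:00Z" responseTime="120" statusCode="200"/>
        <Test at="2009-01-01T10:05:00Z" responseTime="95" statusCode="200"/>
        <Test at="2009-01-01T10:10:00Z" responseTime="4021" statusCode="500"/>
      </Log>
    </Watch>
  </Monitor>
return
  (: Watch base data are child elements; Test data are attributes :)
  <summary name="{$monitor/Watch/name}" tests="{count($monitor/Watch/Log/Test)}"/>
```

With the first composition approach the base data are reached directly, as in $monitor/Watch/name, with no intermediate WatchSpec step.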
Schema
Model Generated
QSEE will generate an XML Schema. In this mapping, all relationships are implemented with foreign keys, with key and keyref used to describe the relationship. In this case, the schema would need to be edited to implement the Watch-Test relationship by composition.
By Inference
This schema has been generated by Trang (in Oxygen) from an example document, created as the system runs.
- Compact Relax NG
element Monitor {
  element Watch {
    element uri { xsd:anyURI },
    element name { text },
    element Log {
      element Test {
        attribute at { xsd:dateTime },
        attribute responseTime { xsd:integer },
        attribute statusCode { xsd:integer }
      }+
    }
  }+
}
- XML Schema
Designed Schema
Editing the QSEE generated schema results in a schema which includes the restriction on statusCodes.
Test Data
An XQuery script transforms an XML Schema (or a subset thereof) to a random instance of a conforming document.
The constraint that Tests are in ascending order of the attribute at is not defined in this schema. The generator needs help to produce useful test data, in the form of additional information about the length of strings and the probability distribution of enumerated values, iterations and optional elements.
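Since the ordering constraint cannot be expressed in the schema, it could be checked in XQuery instead. The following is a minimal sketch; local:log-in-order is a hypothetical helper, not part of the monitor module, and the sample Watch is invented.

```xquery
xquery version "1.0";

(: True when every Test is no earlier than the Test before it.
   Hypothetical helper, not part of the monitor module. :)
declare function local:log-in-order($watch as element(Watch)) as xs:boolean {
  let $times := for $t in $watch/Log/Test return xs:dateTime($t/@at)
  return
    every $i in 2 to count($times) satisfies $times[$i - 1] le $times[$i]
};

(: Sample data only :)
let $watch :=
  <Watch>
    <uri>http://www.example.org/</uri>
    <name>Example site</name>
    <Log>
      <Test at="2009-01-01T10:00:00Z" responseTime="120" statusCode="200"/>
      <Test at="2009-01-01T10:05:00Z" responseTime="95" statusCode="200"/>
    </Log>
  </Watch>
return local:log-in-order($watch)
```

A Log with no Tests or a single Test is treated as ordered, because the range 2 to count($times) is then empty.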
Equivalent SQL implementation
CREATE TABLE Watch(
uri VARCHAR(8) NOT NULL,
name VARCHAR(8) NOT NULL,
CONSTRAINT pk_Watch PRIMARY KEY (uri)
) ;
CREATE TABLE Test(
at TIMESTAMP NOT NULL,
responseTime INTEGER NOT NULL,
statusCode INTEGER NOT NULL,
uri VARCHAR(8) NOT NULL,
CONSTRAINT pk_Test PRIMARY KEY (at,uri)
) ;
ALTER TABLE Test
ADD INDEX (uri),
ADD CONSTRAINT fk1_Test_to_Watch FOREIGN KEY(uri)
REFERENCES Watch(uri)
ON DELETE RESTRICT
ON UPDATE RESTRICT;
In the relational implementation the primary key uri of Watch is the foreign key of Test. There would be an advantage to adding a system-generated id to use in place of this meaningful URI, both to remove the redundancy created and to reduce the size of the foreign key. However a mechanism is then needed to allocate unique ids.
Implementation
Dependencies
eXistdb modules
- xmldb for database update and login
- datetime for date formatting
- util for the system-time function
- httpclient - for HTTP GET
- scheduler - to schedule the monitoring task
- validation - for database validation
other
- Google Charts
Functions
Functions in a single XQuery module.
module namespace monitor = "http://www.cems.uwe.ac.uk/xmlwiki/monitor";
Database Access
Access to the Monitor database which may be a local database document, or a remote document.
declare function monitor:get-watch-list($base as xs:string) as element(Watch)* {
doc($base)/Monitor/Watch
};
For example, the list of Watch entities in the production database:
let $wl:= monitor:get-watch-list("/db/Wiki/Monitor3/Monitor.xml")
A specific Watch entity is identified by its URI; once retrieved, the Watch element itself is passed to other functions by reference:
declare function monitor:get-watch-by-uri($base as xs:string, $uri as xs:string) as element(Watch)* {
monitor:get-watch-list($base)[uri=$uri]
};
Executing Tests
The test does an HTTP GET on the uri. The GET is bracketed by calls to util:system-time() to compute the elapsed wall-clock time in milliseconds. The test report includes the statusCode.
declare function monitor:run-test($watch as element(Watch)) as element(Test) {
  let $uri := $watch/uri
  let $start := util:system-time()
  let $response := httpclient:get(xs:anyURI($uri),false(),())
  let $end := util:system-time()
  let $runtimems := (($end - $start) div xs:dayTimeDuration('PT1S')) * 1000
  let $statusCode := string($response/@statusCode)
  return
    <Test at="{current-dateTime()}" responseTime="{$runtimems}" statusCode="{$statusCode}"/>
};
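One caveat with timing by time of day: if util:system-time() yields an xs:time, as its use here suggests, a test that straddles midnight would produce a negative elapsed time. The sketch below shows the same duration arithmetic on xs:dateTime values. local:elapsed-ms is a hypothetical helper, and the two timestamps are invented; in the monitor they would have to come from a date-time clock function, which is assumed rather than shown.

```xquery
xquery version "1.0";

(: Elapsed time between two instants, in whole milliseconds.
   Hypothetical helper, not part of the monitor module. :)
declare function local:elapsed-ms($start as xs:dateTime, $end as xs:dateTime) as xs:integer {
  (: dividing one duration by another yields a plain number :)
  xs:integer(round(($end - $start) div xs:dayTimeDuration("PT0.001S")))
};

(: Sample instants either side of midnight; the result is 250 :)
local:elapsed-ms(xs:dateTime("2009-01-01T23:59:59.900Z"),
                 xs:dateTime("2009-01-02T00:00:00.150Z"))
```

Rounding to an integer also keeps the stored responseTime consistent with the xsd:integer type in the inferred schema.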
The generated test is appended to the end of the log:
declare function monitor:put-test($watch as element(Watch), $test as element(Test)) {
update insert $test into $watch/Log
};
To execute the test, a script logs in, iterates through the Watch entities and for each, executes the test and stores the result:
import module namespace monitor = "http://www.cems.uwe.ac.uk/xmlwiki/monitor" at "monitor.xqm";
let $login := xmldb:login("/db/","user","password")
let $base := "/db/Wiki/Monitor3/Monitor.xml"
for $watch in monitor:get-watch-list($base)
let $test := monitor:run-test($watch)
let $update :=monitor:put-test($watch,$test)
return $update
Job scheduling
A job is scheduled to run this script every 5 minutes.
let $login := xmldb:login("/db","user","password")
return scheduler:schedule-xquery-cron-job("/db/Wiki/Monitor/runTests.xq" , "0 0/5 * * * ?")
Index page
The index page is based on a supplied Monitor document, by default the production database.
import module namespace monitor = "http://www.cems.uwe.ac.uk/xmlwiki/monitor" at "monitor.xqm";
declare option exist:serialize "method=xhtml media-type=text/html";
declare variable $heading := "Monitor Index";
declare variable $base := request:get-parameter("base","/db/Wiki/Monitor3/Monitor.xml");
<html>
<head>
<title>{$heading}</title>
</head>
<body>
<h1>{$heading}</h1>
<ul>
{for $watch in monitor:get-watch-list($base)
return
<li>{string($watch/name)}&#160;
<a href="report.xq?base={encode-for-uri($base)}&amp;uri={encode-for-uri($watch/uri)}">Report</a>
</li>
}
</ul>
</body>
</html>
In this implementation, the URI of the monitor document is passed to dependent scripts in the URI. An alternative would be to pass this data via a session variable.
Reporting
Reporting draws on the log of Tests for a Watch:
declare function monitor:get-tests($watch as element(Watch)) as element(Test)* {
$watch/Log/Test
};
Overview Report
The basic report shows summary data about the watched URI and an embedded chart of response time over time. Up-time is the ratio of tests with a status code of 200 to the total number of tests.
import module namespace monitor = "http://www.cems.uwe.ac.uk/xmlwiki/monitor" at "monitor.xqm";
declare option exist:serialize "method=xhtml media-type=text/html";
let $base := request:get-parameter("base",())
let $uri:= request:get-parameter("uri",())
let $watch :=monitor:get-watch-by-uri($base,$uri)
let $tests := monitor:get-tests($watch)
let $countAll := count($tests)
let $uptests := $tests[@statusCode="200"]
let $last24hrs := $tests[position() >($countAll - 24 * 12)]
let $heading := concat("Performance results for ", string($watch/name))
return
<html>
<head>
<title>{$heading}</title>
</head>
<body>
<h3>
<a href="index.xq">Index</a>
</h3>
<h1>{$heading}</h1>
<h2><a href="{$watch/uri}">{string($watch/uri)}</a></h2>
{if (empty($tests))
then ()
else
<div>
<table border="1">
<tr>
<th>Monitoring started</th>
<td> {datetime:format-dateTime($tests[1]/@at,"EE dd/MM HH:mm")}</td>
</tr>
<tr>
<th>Latest test</th>
<td> {datetime:format-dateTime($tests[last()]/@at,"EE dd/MM HH:mm")}</td>
</tr>
<tr>
<th>Minimum response time </th>
<td> {min($tests/@responseTime)} ms </td>
</tr>
<tr>
<th>Average response time</th>
<td> { round(sum($tests/@responseTime) div count($tests))} ms</td>
</tr>
<tr>
<th>Maximum response time </th>
<td> {max($tests/@responseTime)} ms</td>
</tr>
<tr>
<th>Uptime</th>
<td>{round(count($uptests) div count($tests) * 100) } %</td>
</tr>
<tr>
<th>Raw Data </th>
<td>
<a href="testData.xq?base={encode-for-uri($base)}&amp;uri={encode-for-uri($uri)}">View</a>
</td>
</tr>
<tr>
<th>Response Distribution </th>
<td>
<a href="responseDistribution.xq?base={encode-for-uri($base)}&amp;uri={encode-for-uri($uri)}">View</a>
</td>
</tr>
</table>
<h2>Last 24 hours </h2>
{monitor:responseTime-chart($last24hrs)}
<h2>1 hour averages </h2>
{monitor:responseTime-chart(monitor:average($tests,12))}
</div>
}
</body>
</html>
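The report calls monitor:average($tests,12), which is not listed on this page. The following is a minimal sketch of one possible implementation, written as local:average to make clear that it is an assumption and not the module's own code. It averages the response time over consecutive groups of $n tests and returns one Test per group, so that the result can be passed to the chart function; with tests every 5 minutes, groups of 12 give 1 hour averages.

```xquery
xquery version "1.0";

(: Hypothetical stand-in for monitor:average.
   Each returned Test carries the time of the first test in its group
   and the rounded mean response time of the group. :)
declare function local:average($tests as element(Test)*, $n as xs:integer) as element(Test)* {
  for $g in 1 to xs:integer(ceiling(count($tests) div $n))
  let $group := subsequence($tests, ($g - 1) * $n + 1, $n)
  return
    <Test at="{$group[1]/@at}" responseTime="{round(avg($group/@responseTime))}"/>
};

(: Sample data only: three tests averaged in groups of two :)
let $tests := (
  <Test at="2009-01-01T10:00:00Z" responseTime="120" statusCode="200"/>,
  <Test at="2009-01-01T10:05:00Z" responseTime="95" statusCode="200"/>,
  <Test at="2009-01-01T10:10:00Z" responseTime="4021" statusCode="500"/>
)
return local:average($tests, 2)
```

The last group may hold fewer than $n tests; it is averaged over the tests it does contain. An empty log yields an empty sequence.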
Response time graph
The graph is generated using the Google Chart API. The default vertical scale from 0 to 100 fits the typical response time. In this simple example, the graph is neither decorated nor explained.
declare function monitor:responseTime-chart($test as element(Test)* ) as element(img) {
let $points :=
string-join($test/@responseTime,",")
let $chartType := "lc"
let $chartSize := "300x200"
let $uri := concat("http://chart.apis.google.com/chart?",
"cht=",$chartType,"&amp;chs=",$chartSize,"&amp;chd=t:",$points)
return
<img src="{$uri}"/>
};
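Because the default scale runs from 0 to 100, response times above 100 ms would run off the top of the chart. One option is to rescale the points so that the largest value maps to 100. The sketch below does this; local:scaled-points is a hypothetical helper, not part of the module, and the sample tests are invented.

```xquery
xquery version "1.0";

(: Comma-separated response times rescaled to the range 0 to 100.
   Hypothetical helper, not part of the monitor module. :)
declare function local:scaled-points($tests as element(Test)*) as xs:string {
  let $times := for $t in $tests return xs:double($t/@responseTime)
  let $max := max($times)
  return
    string-join(
      for $v in $times
      (: guard against a maximum of zero :)
      return string(if ($max gt 0) then round($v * 100 div $max) else 0),
      ",")
};

(: Sample data only; the result is "25,50,100" :)
local:scaled-points((
  <Test at="2009-01-01T10:00:00Z" responseTime="50" statusCode="200"/>,
  <Test at="2009-01-01T10:05:00Z" responseTime="100" statusCode="200"/>,
  <Test at="2009-01-01T10:10:00Z" responseTime="200" statusCode="200"/>
))
```

The cost of this choice is that the vertical axis no longer reads directly in milliseconds, so the maximum would need to be shown alongside the chart.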
Response Time Frequency Distribution
The frequency distribution summarises the response times. First the distribution itself is computed as a sequence of groups. The interval calculation is crude and uses 11 groups to fit with Google Chart.
declare function monitor:response-distribution($test as element(Test)* ) as element(Distribution) {
let $times := $test/@responseTime
let $min := min($times)
let $max := max($times)
let $range := $max - $min
let $step := round( $range div 10)
return
<Distribution>
{
for $i in (0 to 10)
let $low := $min + $i * $step
let $high :=$low + $step
return
<Group i="{$i}" mid="{round(($low + $high ) div 2)}" count="{ count($times[. >= $low] [. < $high]) }"/>
}
</Distribution>
};
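One edge case in the crude interval calculation: when all response times are equal, or the range is under 5 ms, round($range div 10) is 0, so every group has zero width and every count is 0. A minimal sketch of a guard follows; local:step is a hypothetical helper and the floor of 1 ms is an arbitrary choice.

```xquery
xquery version "1.0";

(: Group width for the distribution, never less than 1.
   Hypothetical helper, not part of the monitor module. :)
declare function local:step($times as xs:double*) as xs:double {
  let $range := max($times) - min($times)
  return max((1, round($range div 10)))
};

(: Three identical response times: the step is 1 instead of 0 :)
local:step((120, 120, 120))
```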
This grouped distribution can then be charted as a bar chart. Scaling is needed in this case.
declare function monitor:distribution-chart($distribution as element(Distribution)) as element(img) {
let $maxcount := max($distribution/Group/@count)
let $scale :=100 div $maxcount
let $points :=
string-join( $distribution/Group/xs:string($scale * @count),",")
let $chartType := "bvs"
let $chartSize := "300x200"
let $uri := concat("http://chart.apis.google.com/chart?",
"cht=",$chartType,"&amp;chs=",$chartSize,"&amp;chd=t:",$points)
return
<img src="{$uri}"/>
};
Finally, a script to create the page:
import module namespace monitor = "http://www.cems.uwe.ac.uk/xmlwiki/monitor" at "monitor.xqm";
declare option exist:serialize "method=xhtml media-type=text/html";
let $base := request:get-parameter("base",())
let $uri:= request:get-parameter("uri",())
let $watch := monitor:get-watch-by-uri($base,$uri)
let $tests := monitor:get-tests($watch)
let $heading := concat("Distribution for ", string($watch/name))
let $distribution := monitor:response-distribution($tests)
return
<html>
<head>
<title>{$heading}</title>
</head>
<body>
<h1>{$heading}</h1> {monitor:distribution-chart($distribution)} <br/>
<table border="1">
<tr>
<th>I </th>
<th>Mid</th>
<th>Count</th>
</tr> {for $group in $distribution/Group return <tr>
<td>{string($group/@i)}</td>
<td>{string($group/@mid)}</td>
<td>{string($group/@count)}</td>
</tr> } </table>
</body>
</html>
Validation
The eXist validation module provides functions for validating a document against a schema. The Monitor document links to a schema:
let $doc := "/db/Wiki/Monitor3/Monitor.xml"
return
<report>
<document>{$doc}</document>
{validation:validate-report(doc($doc))}
</report>
Alternatively, a document can be validated against any schema:
let $schema := "http://www.cems.uwe.ac.uk/xmlwiki/Monitor3/trangmonitor.xsd"
let $doc := "/db/Wiki/Monitor3/Monitor.xml"
return
<report>
<document>{$doc}</document>
<schema>{$schema}</schema>
{validation:validate-report(doc($doc),xs:anyURI($schema))}
</report>
This is used to check that the randomly generated instance is valid:
let $schema := request:get-parameter("schema",())
let $file := doc(concat("http://www.cems.uwe.ac.uk/xmlwiki/XMLSchema/schema2instance.xq?file=",$schema))
return
<result>
<schema>{$schema}</schema>
{validation:validate-report($file,xs:anyURI($schema))}
{$file}
</result>
Downtime alerts
The purpose of a monitor is to alert those responsible for a site to its failure. Such an alert might be by SMS, email or some other channel. The Watch entity will need to be augmented with configuration parameters.
Check if failed
First it is necessary to calculate whether the site is down. monitor:failing() returns true() if none of the tests in the past $watch/failMinutes minutes has returned a statusCode of 200.
declare function monitor:failing($watch as element(Watch)) as xs:boolean {
  let $now := current-dateTime()
  let $lastTestTime := $now - $watch/failMinutes * xs:dayTimeDuration("PT1M")
  let $recentTests := $watch/Log/Test[@at > $lastTestTime]
  return
    (: statusCode is an attribute of Test, so it is addressed with @ :)
    every $t in $recentTests satisfies
      not($t/@statusCode = "200")
};
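Note that an every expression over an empty sequence is true, so a Watch with no tests in the period is reported as failing. If that is not wanted, the check can also require at least one recent test. The sketch below is a variant under that assumption; local:failing-at is a hypothetical name, and it takes the current time as a parameter so that it can be tried against fixed sample data.

```xquery
xquery version "1.0";

(: Variant of the failing check: true only if there is at least one
   recent test and none of the recent tests returned 200.
   Hypothetical helper, not part of the monitor module. :)
declare function local:failing-at($watch as element(Watch), $now as xs:dateTime) as xs:boolean {
  let $since := $now - xs:integer($watch/failMinutes) * xs:dayTimeDuration("PT1M")
  let $recent := $watch/Log/Test[xs:dateTime(@at) gt $since]
  return
    exists($recent) and (every $t in $recent satisfies $t/@statusCode != "200")
};

(: Sample data only: the two tests inside the 10 minute window both failed :)
let $watch :=
  <Watch>
    <uri>http://www.example.org/</uri>
    <name>Example site</name>
    <failMinutes>10</failMinutes>
    <Log>
      <Test at="2009-01-01T10:00:00Z" responseTime="120" statusCode="200"/>
      <Test at="2009-01-01T10:05:00Z" responseTime="30000" statusCode="500"/>
      <Test at="2009-01-01T10:10:00Z" responseTime="30000" statusCode="500"/>
    </Log>
  </Watch>
return local:failing-at($watch, xs:dateTime("2009-01-01T10:12:00Z"))
```

Whether an empty window should count as failing is a design decision: it may equally indicate that the monitoring job itself has stopped.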
Check if alert already sent
If this test is executed repeatedly by a scheduled job, an Alert message on the appropriate channel can be generated. However, the Alert message would be sent every time the condition is true. It would be better to send an Alert less frequently. One approach would add Alert elements to the log, interspersed with the Tests. This does not affect the code which accesses Tests, but allows us to inhibit Alerts when one has been sent recently. monitor:alert-sent() will be true if an alert has been sent in the last $watch/alertMinutes minutes.
declare function monitor:alert-sent($watch as element(Watch)) as xs:boolean {
  let $now := current-dateTime()
  let $lastAlertTime := $now - $watch/alertMinutes * xs:dayTimeDuration("PT1M")
  let $recentAlerts := $watch/Log/Alert[@at > $lastAlertTime]
  return
    exists($recentAlerts)
};
Alert notification task
The task to check the monitor log iterates through the Watches and for each checks if it is failing but no Alert has been sent in the period. If so, a message is constructed and an Alert element is added to the Log. The use of the Log to record Alert events means that no other state needs to be held, and the period with which this task is executed is unrelated to the Alert period.
import module namespace monitor = "http://www.cems.uwe.ac.uk/xmlwiki/monitor" at "monitor.xqm";
let $login := xmldb:login("/db/","user","password")
let $base := "/db/Wiki/Monitor3/Monitor.xml"
for $watch in monitor:get-watch-list($base)
return
if (monitor:failing($watch) and not(monitor:alert-sent($watch)))
then
(: the message wording is illustrative; monitor:send-alert is not defined on this page and depends on the chosen channel :)
let $message := concat(string($watch/name), " (", string($watch/uri), ") appears to be down")
let $update := update insert <Alert at="{current-dateTime()}"/> into $watch/Log
let $alert := monitor:send-alert($watch,$message)
return true()
else false()
Discussion
Alert events could be added to a separate AlertLog but it is arguably easier to add a new class of Events than create a separate sequence for each. There may also be cases where the sequential relationship between Tests and Events is useful.
[ Re-designed Schema]
To do
- add create/edit Watch
- detect missing tests
- Support analysis for date ranges by filtering tests by date prior to analysis
- improve the appearance of the charts
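As a starting point for the date-range item, tests could be filtered on their at attribute before being passed to the existing analysis functions. A minimal sketch follows; local:tests-between is a hypothetical helper, and the half-open interval (from inclusive, to exclusive) is an assumption.

```xquery
xquery version "1.0";

(: Tests taken at or after $from and before $to.
   Hypothetical helper, not part of the monitor module. :)
declare function local:tests-between(
    $tests as element(Test)*,
    $from as xs:dateTime,
    $to as xs:dateTime) as element(Test)* {
  $tests[xs:dateTime(@at) ge $from and xs:dateTime(@at) lt $to]
};

(: Sample data only: selects the two tests from 10:05 onwards :)
let $tests := (
  <Test at="2009-01-01T10:00:00Z" responseTime="120" statusCode="200"/>,
  <Test at="2009-01-01T10:05:00Z" responseTime="95" statusCode="200"/>,
  <Test at="2009-01-01T10:10:00Z" responseTime="4021" statusCode="500"/>
)
return
  local:tests-between($tests,
    xs:dateTime("2009-01-01T10:05:00Z"),
    xs:dateTime("2009-01-01T11:00:00Z"))
```

Because the filter returns Test elements, its result can be given directly to functions such as monitor:responseTime-chart or monitor:response-distribution.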