Ken Stewart, October 2011
First, some background:
In March last year I began comparing adjusted temperature data from Australia’s 134 “High Quality” sites with the raw figures. I published my analysis of the results for Non-urban and Urban sites in July and August 2010 (with an update in March this year). I repeatedly requested information from the Bureau of Meteorology on the reasons for the discrepancy between raw and adjusted data, and the apparent Australia-wide warming bias, with no success.
After nothing in reply but an anonymous “go away and write a scientific paper” email, I wrote to Minister Tony Burke on 26 October with a formal complaint and eventually received a reply written on 10 February 2011 by Dr Greg Ayers, head of the Bureau of Meteorology, which did very little to answer my queries. However, he did promise to send me information with the reasons for the large adjustments at the worst nine sites.
On 1 June, after another letter of complaint (with a copy to Greg Hunt, the Shadow Minister), I was sent the following anonymous email from Web Climate Requests:
Dear Ken Stewart,
thank you for your recent request for information about the Australian annual high-quality temperature data. Attached is a detailed description of station histories, adjustments and relevant information as promised in the Letter from Dr Greg Ayers dated 10 February 2011.
and soon after, an apology from Dr Ayers.
Here is the information that I had requested:
8. Please provide complete details, including station metadata, of the reasons for the large adjustments to the temperature records of the following sites: Omeo, Deniliquin Post Office, Nhill, Wagga Wagga AMO, Kellerberrin, and of the following Urban sites (not used in climate analyses but adjusted): Wangaratta Aero, Echuca Aerodrome, Benalla Shadforth St, Dubbo Airport AWS.
I had also included in my original reply to Tony Burke the following: Thank you in anticipation, as I hope this will answer some questions. I trust that it will include some real explanations for the “subjective decisions” used to make adjustments at these sites.
The attachment with the station metadata and adjustments is included in full in the Appendix.
My reason for this request was that the papers cited by Dr Ayers, namely Torok and Nicholls (1996) and Della-Marta et al. (2004), both plainly state that while discontinuities in the data series indicating a need for adjustments were identified by “objective” methods, the decisions about whether to adjust, and by how much, were made subjectively after visual analysis. Further, gaps in the records (including at sites with no overlapping comparative data) were infilled with estimates based on subjective techniques.
Della-Marta et al. go on to lament that it was impossible to reproduce Torok and Nicholls’ adjustments exactly, as slightly different techniques, reference stations, and source data “can apparently produce different results”. It seems they were surprised by this.
It is these “subjective decisions” about the magnitude of the adjustments that perplex me.
The adjustment details given by BOM are almost exactly the same as those given by Torok in his 1995 thesis. The reasons for adjustments (with my interpretation of their meaning) are given as:
| Reason code | My interpretation of the method |
|---|---|
| o = objective test | Change points in the difference series (between candidate and neighbouring reference stations) automatically detected. |
| f = median | Comparison with median inter-annual temperature of all non-urban sites within 6 degrees with at least 10 years of data and with similar climatic conditions. |
| r = range | Visual analysis of diurnal temperature range. |
| d = detect | Differences between anomalies of the candidate site and 2 neighbours showing spurious trends or discontinuity. |
| l = overlap data | Comparison of data between sites with 2 or more years overlap. |
These indicate how the need for adjustment was detected, when supported by metadata documenting changes to the station. However, remember that every decision to adjust, and by how much, even when based on the above “reasons” (including objective test and median), was made by visual reference and undisclosed “subjective” techniques.
It is not difficult to find when adjustments were applied, and their magnitude: all you have to do is subtract the raw maximum or minimum data from the adjusted data.
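This check is trivial to script. Here is a minimal sketch of the idea; the station figures below are invented for illustration, not real BOM data:

```python
# Sketch only: recover the net adjustment applied in each year by
# subtracting the raw annual mean from the adjusted (HQ) annual mean.
# All figures below are invented for illustration.
def adjustment_series(raw, adjusted):
    """Per-year adjustment (adjusted minus raw), for years present in both."""
    return {yr: round(adjusted[yr] - raw[yr], 1)
            for yr in raw if yr in adjusted}

raw_max = {1948: 14.2, 1949: 14.5, 1950: 14.1, 1951: 14.4}
hq_max  = {1948: 13.4, 1949: 13.7, 1950: 14.1, 1951: 14.4}

print(adjustment_series(raw_max, hq_max))
# {1948: -0.8, 1949: -0.8, 1950: 0.0, 1951: 0.0}
```

A step of -0.8 applied to 1948-49 and nothing afterwards shows up immediately, which is all the method amounts to.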
So, we know when adjustments were made, how big they were, and how the need for them was identified- all of this anyone could find from an analysis of the data and from Torok’s thesis. The only new information supplied was the station metadata and the adjustments since Torok and Nicholls’ paper (1996).
However, the critical information is missing:
- NO explanation for the size of the adjustments;
- NO criteria for thresholds for any objective test, diurnal range, anomaly detect, or median tests;
- NO lists of neighbouring non-urban sites;
- NO reference temperature sets.
I shall now examine each site in turn, examining the adjustments and their reasons, with reference to the supporting metadata now supplied.
Omeo, in the Victorian high country
BOM gives as reasons Objective test, Median, and Detect (see Appendix). It is very difficult to work out which sites were used for these methods, as many nearby stations are disqualified by Torok’s (and presumably BOM’s own) criteria, being over the ranges, on the coast, much higher up mountains, or far to the north (over more mountainous country). This is especially a problem as the Detect method requires comparison of anomalies with two neighbouring sites. Anomalies have usually been calculated from the 1961-90 mean, and data for these years are very scarce- indeed for any 30 year period in this region. Only Orbost (near the coast) and Beechworth (inland Victoria) have records which overlap in the second half of the century, and only Hotham Heights and Kosciusko Hotel are mountainous sites with records in the early part of the century. I compared both sets.
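For readers unfamiliar with the term: an anomaly series is simply each year’s value minus the mean over a base period (1961-90 in BOM’s case). A minimal sketch, with invented figures:

```python
# Sketch: anomalies relative to a base-period mean (BOM typically uses
# 1961-90). Raises an error when the base period is empty, which is
# exactly the scarce-data problem at sites like Omeo.
def anomalies(annual_means, base=(1961, 1990)):
    base_vals = [t for yr, t in annual_means.items()
                 if base[0] <= yr <= base[1]]
    if not base_vals:
        raise ValueError("no data in the base period")
    base_mean = sum(base_vals) / len(base_vals)
    return {yr: round(t - base_mean, 2)
            for yr, t in sorted(annual_means.items())}

# Invented figures for illustration:
data = {1961: 14.0, 1962: 14.4, 1990: 14.2, 2000: 14.8}
print(anomalies(data))
# {1961: -0.2, 1962: 0.2, 1990: 0.0, 2000: 0.6}
```

When a station has little or no data in the base period, the anomalies cannot be computed directly at all- the situation at many of the potential Omeo neighbours.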
The spurious data for 1936-38 is obvious and needs correcting. Most of the time Omeo’s anomalies are quite different from Orbost, but appear to track fairly closely with Beechworth, especially in the late 1950s. The differences show that Omeo is too high relative to Orbost and Beechworth in the 1950s, too low from the mid 60s to mid 70s, and too high in the late 70s. While the 1962 adjustment may well have been needed, it should not have been applied to all previous years. Similarly the Kosciusko record is close to Omeo’s in the late 1920s, while Hotham Heights is completely different. This does not indicate any need for adjustment, and certainly not for all previous years. Further, the 1930 adjustment is for a move- but there is no move documented in the metadata.
Notice that the HQ record appears to be based on Beechworth, then Orbost. The 1978 adjustment of -0.4 appears to be the wrong sign- the adjustment should have been up- and should not have been applied to previous years. There is not enough evidence to evaluate the 1934 adjustment. It appears that the 1917 adjustment makes the anomalies similar to those of Kosciusko so seems appropriate.
The 1996 adjustment is small, and for a move, for which it appears there is no parallel data. Della-Marta et al. describe the problems of adjusting station data where a station has moved. Generally, at least 2 years of parallel comparative observations are needed, and adjustments calculated from this would be compared with the objective test method; however, the objective method only detects discontinuities caused by site moves of at least 0.5 degrees Celsius. Here the adjustment was +0.1! They continue:
For site moves where no parallel data were available the shift value calculated by the objective technique was used. If this was not available an estimated value was used based on subjective techniques.
Therefore, even though at least 2 years of comparative data are needed, sites such as this with no overlap at all are still used and are adjusted by “subjective” methods.
The best we can say about Omeo is that we don’t know, as there are no nearby overlapping stations with a similar climate. The metadata and “reasons” do not give an explanation for the enormous adjustments at Omeo.
Nhill, in western Victoria, is another example of a station that has moved.
The metadata identifies moves in 1930, 1949, 1966, 1970, 1976, a “small” move in 1992, and 1995. Adjustments are made for the 1930 move (in 1931), 1949, and 1995 moves. There are no overlaps for any of these moves. In fact, there are gaps.
In December 1994 the observer died and the station closed. There were no observations from 17th December until 17th January. When observations resumed in January 1995, the station had moved 5 km. Normally BOM does not give a monthly mean if there are more than 7 days missing, and does not give an annual mean for any year with a missing month. Therefore, 1994 and 1995 should not have annual means, and there is a gap of two months with no overlaps at all. (Adjustments for site moves are made by examination of monthly overlapping data for two years.) This site should not have been considered for the “High Quality” network.
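Those completeness rules are easy to express in code. A sketch of my reading of the rules described above (not BOM’s actual implementation):

```python
# Sketch of the completeness rules described above (my reading, not
# BOM's actual code): no monthly mean with more than 7 days missing,
# and no annual mean for a year with a missing month.
def monthly_mean(daily_obs, max_missing=7):
    missing = sum(1 for v in daily_obs if v is None)
    if missing > max_missing:
        return None
    present = [v for v in daily_obs if v is not None]
    return sum(present) / len(present)

def annual_mean(monthly_means):
    if len(monthly_means) != 12 or any(m is None for m in monthly_means):
        return None
    return sum(monthly_means) / 12.0

# December 1994 at Nhill: roughly half the month unobserved
# (temperatures invented; only the gap matters).
december = [15.0] * 16 + [None] * 15
print(monthly_mean(december))   # None -> so 1994 gets no annual mean
```

By this logic a two-month gap propagates upward: no December mean means no 1994 annual mean, and no January mean means no 1995 annual mean either.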
However, it has been included as HQ, so let’s continue looking at its adjustments, starting with the Diurnal Temperature Range (DTR). Some minima adjustment is possibly necessary in 1994, given the large jump in annual diurnal range (not the method used, but an indication of possible discontinuity), so we will look at this in more detail shortly. There appears to be no such need, however, for an equivalent (-0.8) adjustment in 1948. (And notice the adjustment is made to maxima in 1950, but to minima in 1948. Shouldn’t they be together?)
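The annual DTR used here as a discontinuity indicator is just the mean daily spread between maxima and minima. A minimal sketch, with invented figures:

```python
# Sketch: mean diurnal temperature range (DTR) from paired daily maxima
# and minima. A sudden jump in annual DTR between years can flag a
# possible discontinuity, as described above. Figures are invented.
def mean_dtr(tmax, tmin):
    days = sorted(set(tmax) & set(tmin))
    return round(sum(tmax[d] - tmin[d] for d in days) / len(days), 2)

tmax = {1: 28.0, 2: 30.0, 3: 27.0}
tmin = {1: 14.0, 2: 15.0, 3: 13.0}
print(mean_dtr(tmax, tmin))   # 14.33
```

Computed year by year, a step change in this series is what suggests something happened to the site or instruments, even though DTR was not BOM’s detection method here.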
BOM gives one of the reasons for the 1948 adjustment as Detect, so I used a similar (manual) method: I plotted the differences between anomalies of Nhill, Rainbow PO, and Jeparit, the closest sites with good overlap over 1949. Torok says: “Any spurious trends or discontinuities at the candidate station should be apparent in the two series involving data from the candidate station, but not in the series of difference between the two neighbours.”
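Torok’s test, as applied manually here, amounts to building three difference series and seeing where a shift turns up. A sketch with synthetic anomaly data (invented for illustration):

```python
# Sketch of the manual "Detect" comparison: a spurious shift at the
# candidate station should show in both candidate-minus-neighbour
# difference series, but not in the neighbour-minus-neighbour series.
def diff_series(a, b):
    return {yr: round(a[yr] - b[yr], 2) for yr in sorted(set(a) & set(b))}

def manual_detect(candidate, nbr1, nbr2):
    return (diff_series(candidate, nbr1),
            diff_series(candidate, nbr2),
            diff_series(nbr1, nbr2))

# Synthetic anomalies: the candidate jumps by +1.0 from 1948 onward.
years = range(1945, 1952)
nbr1 = {y: 0.1 for y in years}
nbr2 = {y: 0.2 for y in years}
cand = {y: (1.1 if y >= 1948 else 0.1) for y in years}

d1, d2, d3 = manual_detect(cand, nbr1, nbr2)
print(d1[1947], d1[1948])   # 0.0 1.0  (shift visible)
print(d3[1947], d3[1948])   # -0.1 -0.1  (neighbours agree throughout)
```

The shift appears in both candidate series but the neighbour-neighbour series stays flat, which is the signature of a problem at the candidate rather than at a neighbour.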
The difference series shows Nhill was too low from 1942-45 and too high from 1947-54 (except 1953), so individual year adjustments are needed; but the difference series also fluctuate wildly, especially in the 1930s, so any adjustments should not have been applied to all previous years. Using this method for 1994, by comparing with Stawell, Horsham, and Ararat Prison, it is plain that 1992-94 need adjusting, but the records around 1989-92 are fairly close; when the -0.8 adjustment is applied, the anomalies are closer to Ararat Prison and Stawell, but appear too high by 0.3-0.4. An adjustment of 0.8 is too much.
You could go mad…
BOM’s adjustments at Nhill are based on the Objective Test (Torok’s median method), which apparently detected a discontinuity of more than 0.5 degrees (see above); however they have not supplied any information which will allow this to be checked.
Although BOM’s anonymous spokesperson assures me that all adjustments have now been explained to me, there are several examples of adjustments with no supporting metadata, such as the Nhill minimum in 1915.
Deniliquin, in south western NSW:
The 1971 minimum adjustment of -0.8: using a manual Detect method, with Falkiner Memorial and Echuca as neighbouring stations (Echuca is urban, I know, but it has a long record), the anomalies show some discrepancy in 1971, but large ones in 1972, and to a lesser extent 1973. However, the years before this, except for 1960-61, do not appear to have any marked discrepancies. Again, without access to the data for the Objective test, I would therefore contend that the 1971 adjustment should be a single year adjustment, and should not apply to all previous years. Although Detect is not used for the 1951 adjustment, the method does not indicate any discontinuity at Deniliquin; moreover, there doesn’t appear to be anything unusual about the diurnal range around 1951. A plot of Deniliquin/ Echuca/ Kerang differences shows Deniliquin compares well with Echuca, with Kerang being too low from the 1920s to the 1960s. The Deniliquin adjustments do not appear to be justified.
Wagga Wagga, in inland NSW
The combining of the records for Kooringal and the AMO, with the good overlap in the 1940s, produces a maximum record whose adjustments have little impact on the temperature trend. The minimum overlap appears to show a UHI signal, as the minima for Kooringal diverge from the initially identical AMO. I therefore combined the minima records by reducing Kooringal by 0.1C- the average difference for the first 5 years of overlap- whereas BOM adjusts down by 1.0C. BOM’s adjustment (made in 1949) thus includes the UHI signal, and applies to all previous years. We do not have any information to examine or replicate the Objective Test used to identify the need for this or the 1969 adjustment, so we must rely on Detect using overlapping nearby stations, Adelong PO, Deniliquin, and Cootamundra, as well as the Resource Centre.
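The splicing method I used can be sketched as follows: take the mean difference over the first five overlapping years and remove it from the older record. The station figures below are invented, not the actual Kooringal/AMO data:

```python
# Sketch: splice two station records by removing the mean difference
# over the first n years of overlap (the approach used above, which
# gave -0.1C at Wagga Wagga versus BOM's -1.0C). Figures are invented.
def splice_offset(old_rec, new_rec, n_years=5):
    overlap = sorted(set(old_rec) & set(new_rec))[:n_years]
    if not overlap:
        raise ValueError("no overlapping years")
    return round(sum(old_rec[y] - new_rec[y] for y in overlap)
                 / len(overlap), 1)

old = {1940: 10.5, 1941: 10.6, 1942: 10.4, 1943: 10.7, 1944: 10.5}
new = {1940: 10.4, 1941: 10.5, 1942: 10.3, 1943: 10.6, 1944: 10.4}
print(splice_offset(old, new))   # 0.1
```

Using only the earliest overlap years is deliberate: if the old site develops a UHI signal later in the overlap, an average over the whole overlap (or a larger subjective adjustment) bakes that signal into all previous years.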
The plot of differences of anomalies shows that from 1940-1949 Wagga Wagga is too high; and again in 1967; too low 1973-1975; and far too low in 1917 and 1918. From 1950 to 1970 the AMO anomalies are fairly similar in most years, with some years higher- but which data record is correct? However, BOM adjusts down by 0.7C in 1969 and applies this to all previous years. Without additional information it is impossible to determine the necessity for these adjustments, or their size, but the 1949 adjustment at least appears to be unnecessarily large, and should not be applied to all previous years, especially as BOM metadata shows “documentation unclear”.
Kellerberrin is a small town in the West Australian wheat belt.
The main adjustment to maxima occurs in the 1930s, following a period of poor observation practice. The data from the 1930s look poor and the adjustment is necessary- but for all previous years? The 1996 adjustment upwards is more concerning. This followed a move about 1.5km north to the airstrip, as the town site was unsuitable, being bitumen surfaced. So why were the maxima increased, not decreased? Is bitumen cooler than grass under the midday sun? Detect shows nothing in 1996, and nothing in 1986 either.
Looking at minima, the 1996 adjustment appears necessary, but the size is questionable. Here’s the Diurnal Temperature Range, used to justify the maxima and minima adjustments, compared with those of neighbouring Merredin, Cunderdin and Corrigin. All four show little difference in range before 1996, but increase by different amounts after 1996. Now here’s a plot of daily DTR for 1995, 1996, and 1997. Notice the gaps in September and October 1995: there are large gaps in observations, more than 14 days each in either maxima or minima, so September and October should both show null by BOM’s standards, and therefore so should 1995. The daily range appears to be less before this discontinuity, so I suspect this is when the station moved. If so there is a gap with no comparative monthly data, and Kellerberrin should not be a HQ site. Alternatively, was this when the site was bitumenised? The comparison of anomalies shows Kellerberrin definitely needs adjusting. Here’s a plot with the -0.8 adjustment:
Here’s a plot with an adjustment of -0.4 instead of -0.8. This shows closer agreement with the other sites’ anomalies and with BOM’s HQ anomalies. The move from the town to the airport was about 1 to 1.5km, and despite the documented problems with the site, there are no overlapping comparative data.
The 1979 move to Telecom land shows in the DTR and anomalies, and the corrections are probably justified- but are not documented at all.
It is obvious that Kellerberrin is different from Merredin and Corrigin after 1995, but in some previous years Merredin is more like Cunderdin whereas Kellerberrin and Corrigin are close. Merredin is the outlier in the 1950s. There is confusion in the 1970s; in 1971 and 1972 Merredin is again different. Anomaly comparison may suggest that from 1950 to 1965 Kellerberrin may need adjustment down by 0.2 to 0.4 in individual years; however, after 1965 this would increase the already high Diurnal Range even further. Yet the minimum series was adjusted down for all previous years. This is an example of how the need for adjustment can depend on which anomalies are being compared. Without BOM’s list of nearby sites used for comparison, it is impossible to replicate the need for and size of this adjustment, and we can only assume that they used sites further away than these.
Apart from the 1930s, the undocumented 1979-80 adjustments, and 1996, Kellerberrin’s adjustments do not appear to be justified, and should not be applied to all previous years.
The following four sites are classified as Urban, and therefore are not used in the official climate analyses, however they also have been subject to large adjustments.
Wangaratta Aero in inland Victoria
BOM’s information indicates “Strong urban warming” and the metadata shows numerous site moves; recent moves are not documented. It is also immediately obvious that Wangaratta should not be a High Quality site because of the two year gap between the two site records, with BOM infilling with estimates. It is therefore necessary to use nearby overlapping data for comparison.
The major adjustments were in 1994 and 1960, with smaller ones in 1951/1953 and 1988, with cooling adjustments in 1918, 1939, and 1974. With no information about data used for Objective test or median, we can only use a manual Detect method to examine differences from neighbouring sites, which compares the differences in anomalies between the target station and two neighbours.
Comparison of minimum anomalies of Wangaratta (including Aero), Benalla, Rutherglen, and Albury Air with Wangaratta HQ shows that the 1994 adjustment is about right, as are the 1986-87 estimates- which still doesn’t make the site HQ. The adjustment should not have been applied generally to years before 1986. In the mid 1940s, early 1950s and mid 1960s Wangaratta is too high and/or Benalla is too low; from the 1910s to the mid 1930s Wangaratta is low/ Benalla is high. It can be quite clearly seen that the HQ anomaly data is spuriously as much as 1 degree below that of all the others for all years up to 1950, with the possible exception of 1939. Similarly, the maximum anomalies show the closeness of the records, and that the HQ adjustment produces spurious data between 1919 and the late 1950s.
Wangaratta definitely should not be a High Quality site, and the adjustments wrongly increase the warming trend.
Echuca Aerodrome on the Murray River in northern Victoria
Odd, eh? We’ll look at this later.
Once again without necessary information for replicating the Objective Test, we must rely on comparison of anomalies to manually replicate Detect.
The minimum anomalies of Echuca, Deniliquin and Kerang show much disagreement from 1910 until 1924, then close agreement (0.1 to 0.3) between Echuca and Deniliquin until 1960 (with a couple of exceptions, such as 1930 when Deniliquin is 0.6 higher, 1953-56 when it is 0.2 to 0.3 lower, and 1961 when it is 0.7 lower). Deniliquin is then up to 0.8 higher from 1972-79, in fair agreement from 1980 to 1983, and 0.2 to 0.4 different from 1984 to 1991, after which the records diverge. It appears that Kerang is spuriously different from Echuca and Deniliquin from 1910 until the 1970s, then diverges again after 1985 until 2000. The HQ adjusted data appears to favour Kerang or Kyabram in recent decades, and appears to be spurious from 1934 to 1968.
Kerang maximum anomalies are much closer to the others than the minima, except 1942-1952. Echuca is spuriously low from 1921 to 1938, 1950-55, 1968-71, 1986, 1995-97, and 2000.
From 1910 to 1958 the adjustments make the BOM HQ maximum anomalies spuriously low, except for a couple of years. From 1992 HQ appears to be 0.1 to 0.2 too high.
Finally, in all adjustment records there are numerous small “ticks” up and down of about 0.1C, due perhaps to rounding, which are annoying but of no great consequence. However, in Echuca’s minimum in 2005 there’s an enormous undocumented adjustment of +3.4 that sticks out like the proverbial. This disparity is surely evidence that these adjustments have NOT been checked at all, and that there is little quality control- or perhaps evidence of just plain incompetence.
Benalla Shadforth Street, in northern Victoria.
There is a gap in the monthly data- although there are daily observations (“Not quality controlled or uncertain, or precise date unknown” is BOM’s description of this data)- from September 1986 until observations resume in October 1987 (except for February 1987). Yet the metadata says the station moved in 1985: this indicates some doubt about the metadata’s accuracy. If there are no comparative observations and there is a gap in the record when the station moved, then Benalla should not be HQ.
Without information allowing replication of the Objective test, we can only use anomaly differences to examine the adjustments. Benalla’s maximum record appears too high up to 1936, too low from the mid 1940s to mid 1950s, and from the mid 1960s to mid 1980s is similar to or lower than its neighbours. HQ minimum anomalies appear spuriously low from 1910 to 1959. The 1960 adjustment, applying to all previous years as well, does not appear justified by comparison with other sites. Some of the 1938-1940 adjustments are justified, but not at that size. The 1976 adjustments are not documented in the metadata or the adjustment “reasons”, and anomaly differences are not conclusive. HQ maximum anomalies appear spuriously low from 1922-1925, 1929-1938, and 1941 to 1959.
Benalla should not be HQ, as it appears to have a 2 year gap with no comparative data; and without the supporting information regarding Objective Test criteria, the Benalla adjustments do not appear justified.
Dubbo Airport, in inland NSW.
Dubbo is another combined site, but with many moves (see Appendix). Once again, without information about reference sites for Objective Test and Median, we must rely on anomaly comparisons.
The maximum anomalies are messy until about 1937. The 1998 adjustment appears correct from comparing the raw data for Darling St and AWS, but the anomalies show that the 1987 adjustment of +0.5, while appearing to match anomalies back to 1954, makes the HQ record after 1987 spuriously low. The 1954 correction of -0.5 is not warranted (1951-52 maybe) and the HQ record before 1954 is spuriously low. The 1924 adjustment is not justified.
The record is messy until 1950; after that the anomalies are fairly close. The 1987 5km site move is clearly evident in the minimum anomalies, and when the 0.8 adjustment is made to previous years, the anomalies closely track back to 1970.
Comparison of differences of the Dubbo 2 splice (adjusted by -0.8) with Wellington, Agrow, Molong, and Mudgee shows how messy the records are. By comparison with the decades before the 1950s, the difference between Molong and Dubbo 2 is relatively minor (-0.4 to +0.4), suggesting that the Dubbo record from 1957-1964 is fairly sound.
There appears to be no need for the adjustments of +0.3 in 1977, -0.7 in 1970, or -0.4 in 1952, applying to all previous years. The late 1960s are messy, but without further evidence we cannot confirm the need for the 1970 adjustment to all previous years. The effect of these is clearly visible, as the HQ anomalies appear to be spuriously below the others apart from a few years (e.g. 1916-1918, which have been adjusted correctly). 1943-1946 may need adjusting, but not 1940-41. The 1929 adjustment is not warranted, and the 1924 one appears to be the wrong sign. And good luck to anyone trying to make sense of the 1910-1915 data.
The minimum adjustments lead to a warming trend of 1.3, compared with the trend of 0.9 when including only the 1987 correction.
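For reference, trends like the 1.3 and 0.9 quoted here can be computed as an ordinary least-squares slope scaled to degrees per century. A sketch (the series below is synthetic and invented; it is not the Dubbo record):

```python
# Sketch: ordinary least-squares warming trend, scaled to degrees per
# century. The series below is synthetic, not the actual Dubbo data.
def trend_per_century(series):
    years = sorted(series)
    n = len(years)
    x_mean = sum(years) / n
    y_mean = sum(series[y] for y in years) / n
    slope = (sum((y - x_mean) * (series[y] - y_mean) for y in years) /
             sum((y - x_mean) ** 2 for y in years))
    return round(slope * 100, 2)

# A perfectly linear series warming at 0.9 degrees per century:
synthetic = {y: 10.0 + 0.009 * (y - 1910) for y in range(1910, 2011)}
print(trend_per_century(synthetic))   # 0.9
```

Running this on the raw, partially adjusted, and fully adjusted versions of a record is how the effect of the adjustments on the trend can be quantified.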
The Dubbo record is a mess.
These nine sites were identified as the ones with the largest adjustments of the 134 High Quality stations. The timing of the adjustments, their size, and the method of identifying the need for the pre-1996 ones have for some years been publicly, if not readily, available. Although the Bureau has been repeatedly requested to provide an explanation for the large adjustments, the only new information provided was the station metadata and the reasons for identifying the need for adjustments post 1996. It is impossible to replicate or justify adjustments without:
- detailed explanation for the size of the adjustments;
- the criteria for thresholds for objective tests, diurnal range, anomaly detect, or median tests;
- lists of neighbouring non-urban sites;
- reference temperature sets.
However, using the limited information available, an analysis of the adjustments at these nine sites was conducted. To summarise some of the problems found:
Omeo: we don’t know: not enough supporting information, no nearby overlapping sites.
Nhill: gap in the record, should not be HQ.
Deniliquin: no apparent significant anomaly differences from nearby sites; no evidence to justify the large adjustments.
Wagga Wagga: large adjustment takes no account of overlapping data showing UHI effect; little supporting information; blanket adjustments to all previous years.
Kellerberrin: UHI wrongly accounted for (wrong sign); no overlapping data; gaps in the record not documented; some adjustments needed but not for all previous years: not enough information. Should not be HQ.
Wangaratta: 2 year gap during move with no overlapping data- should not be HQ; adjustments wrongly increase warming.
Echuca Aerodrome: gross undocumented adjustment shows lack of quality control.
Benalla Shadforth Street: gap in monthly data suggests metadata problematic, possible lack of comparative data- should not be HQ; some undocumented adjustments; not enough information.
Dubbo Airport: a mess. Adjustment of -0.8 justified, others doubtful.
We can conclude that:
- A small number of the adjustments do indeed appear to be justified.
- Many of the adjustments were wrongly applied to all previous years.
- Some are the wrong sign.
- The adjustments, being made subjectively, are mostly impossible to replicate, and often questionable and too large.
- Some adjustments have no supporting metadata.
- Some metadata appear inaccurate.
- Some adjustments are not documented at all.
- It is difficult to locate neighbouring overlapping sites with sufficiently long records for anomaly comparison.
- Some adjustments appear to make Urban Heat Island effect worse.
- Nearly all of the sites have a poor station history. Some have multiple moves, and others reveal poor location, dubious observations, and gaps in the record infilled with estimates.
- Wangaratta, Benalla, Kellerberrin and Nhill should not be High Quality as there are site moves with gaps in the record. Kellerberrin and Nhill (2 of the 5 examined) are however used for climate analyses.
- There is evidence of lack of checking or quality control.
- Several sites feature adjustments that wrongly increase warming.
Nine of the 134 High Quality sites feature adjustments that are problematic and not explained; without complete detailed information and tedious analysis it is not possible to assess the validity of adjustments at the remaining 125- their adjustments remain unexplained.
The Bureau of Meteorology has failed to provide adequate explanation for the large adjustments to the HQ data for even this small subset of the High Quality sites.
The whole saga of BOM’s delays and lack of response to quite fundamental questions reveals an arrogant, don’t-care attitude.
The Australian High Quality Temperature Network has a poor and patchy record. Instead of claiming it to be High Quality (meaning the best they can manage with the poor resources), BOM should admit that the record is a mess. Rather than trying to defend it, BOM should immediately agree to an independent audit.
Another thing: Dr Ayers also promised that he would send me copies of (1) a journal paper reviewing potential bias in warming trends, to be published “later in the year”, and (2) an updated summary of operational adjustments “after this is published in the scientific peer review literature”. Well, it is much later in the year, and I’m still waiting. Perhaps I should remind him again?
Della-Marta, P., Collins, D. and Braganza, K.: Updating Australia’s high-quality annual temperature dataset. Australian Meteorological Magazine, 53 (2004), pp. 75-93.
Torok, S.J.: The development of a high quality historical temperature database for Australia. PhD Thesis, School of Earth Sciences, University of Melbourne, 1996.
Torok, S.J. and Nicholls, N.: A historical annual temperature dataset for Australia. Australian Meteorological Magazine, 45 (1996), pp. 251-260.
Climate data: http://www.bom.gov.au/climate/data/