Sunday, August 14, 2011

Lightsum and Darksum are Calculated, not Measured

In last year's post The Tiljander Data Series: Data and Graphs, I explained that the four Tiljander data series were actually three: Darksum is calculated as (Thickness minus Lightsum).

I've since discovered that there are actually two Tiljander data series rather than four.

Thickness and XRD are measured values.

Lightsum and Darksum are values that Tiljander et al. calculated by multiplying Thickness and XRD.

Here are the formulas. Varve thicknesses are measured in microns (thousandths of a millimeter, um).

Lightsum = Thickness * XRD * 0.003937

Darksum = Thickness * ( 1 - ( XRD * 0.003937 ))

Adding these two equations together yields

Thickness = Lightsum + Darksum

The calculated values of Lightsum are within 0.01% of the values archived at NCDC. For Darksum, the calculated values are consistently 0.5% to 0.8% too low. Presumably, this is a rounding error.
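
For readers who would rather see this as code than as spreadsheet cells, here is a minimal Python sketch of the two formulas. The function name and the sample varve (600 um thick, XRD of 100) are made up for illustration; they are not values from the archived series.

```python
# Scaling constant taken from the formulas in this post.
SCALE = 0.003937


def light_dark(thickness_um, xrd):
    """Split a varve thickness into Lightsum and Darksum (post's formulas)."""
    lightsum = thickness_um * xrd * SCALE
    darksum = thickness_um * (1 - xrd * SCALE)
    return lightsum, darksum


# Hypothetical varve: 600 um thick, XRD grey-scale value of 100.
ls, ds = light_dark(600.0, 100.0)
print(ls, ds, ls + ds)  # the two parts add back up to the thickness
```

Under these formulas the two parts always sum to the measured thickness, whatever the XRD value.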

[UPDATE Aug 15, 2011 -- Commenter HaroldW figured out the exact formulas by which Lightsum and Darksum are calculated. His result strongly suggests that Tiljander et al. made a minor arithmetic error in their formulae, such that

Thickness = ( 254/255 ) * ( Lightsum + Darksum )

"Exact" means that the calculated values of LS and DS agree with the archived values to within 0.001%. I've updated the Excel file at BitBucket to reflect HaroldW's insight.]

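HaroldW's formulas (spelled out in his comment below) can be sketched in the same way. This is only an illustration: the function name and the sample varve are hypothetical, and I use 1/254 in place of the rounded constant 0.003937.

```python
def light_dark_exact(thickness_um, xrd):
    """Lightsum and Darksum as reconstructed by HaroldW in the comments."""
    lightsum = thickness_um * xrd / 254
    darksum = thickness_um * (255 - xrd) / 254
    return lightsum, darksum


# Hypothetical varve: 600 um thick, XRD grey-scale value of 100.
thickness = 600.0
xrd = 100.0
ls, ds = light_dark_exact(thickness, xrd)

# The two parts overshoot the measured thickness by the factor 255/254.
ratio = (ls + ds) / thickness

# Darksum from the simpler formula, Thickness * (1 - XRD/254), falls short
# of the exact value by 1/(255 - XRD).
ds_simple = thickness * (1 - xrd / 254)
shortfall = 1 - ds_simple / ds

print(ratio, shortfall)
```

For this made-up XRD value the shortfall is 1/155, or about 0.65%, which sits inside the 0.5% to 0.8% range that I first put down to rounding.
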
"Discovered" as used above is tongue-in-cheek. Obviously, the authors of Tiljander03 have known from the outset that this was their procedure. However, this finding is new to me. Presumably, it is also news to the authors of Mann08, Mann09, Kaufman09, and to other people who take an interest in paleoclimate reconstructions.

"Does it matter?" From a statistical point of view, yes, it does.

Before going into why it's important, here are comparisons of Downloaded versus Calculated Lightsum and Darksum. Clearly, the Calculated series are essentially identical to the series as they were archived. (Graphed data available for download as Excel file Tiljander03-Calculated_LS+DS.xls at BitBucket.) The two graphs are followed by a description of the methods used by Mia Tiljander and her colleagues.

Tiljander et al.'s methods led to the measurement of Thickness and XRD. In their fieldwork, they recovered drill cores containing thousands of years of varved sediments from the bottom of Lake Korttajarvi. Back in the lab, they stabilized and preserved these cores by a set of procedures that are commonly used for biological specimens. First they infused the waterlogged mud with acetone. Once all the water was removed, they impregnated the mud with liquid epoxy, which hardened as it cured. Once solidified, the cores could be cut with a bandsaw to yield a specimen with the desired 2-millimeter thickness. (Reference: Tiljander, Ojala, Saarinen, & Snowball, 2002; abstract.)

The first analytical step was to determine the thickness of each annual varve, likely with a caliper.

Second, the epoxied core segments were placed atop X-Ray film, and a pre-determined burst of X-Rays illuminated the specimen. As with dental X-Rays, some materials absorb more X-Rays than do others, exposing the film less, or more. After development, the X-Ray film was scanned and digitized. Minerals (e.g. silica) absorb more X-Rays, leaving the underlying film relatively less exposed. Organic matter absorbs less of the X-Ray energy, causing the underlying film to be relatively more exposed. Thus, high XRD is indicative of a high proportion of mineral matter; low XRD indicates a high proportion of organic matter.

The final step in calculating how much mineral matter and how much organic matter was deposited at the bottom of Lake Korttajarvi each year is to combine the Thickness and XRD information. The thicker the varve, the greater the total of Mineral matter plus Organic matter. The less-exposed the X-Ray film underlying a varve, the higher that varve's X-Ray Density, and the higher its proportion of Mineral matter. (And, the lower its proportion of Organic matter.)

The equations for Lightsum and Darksum near the top of this post represent these relationships quantitatively.

These procedures are well-described in publications by co-authors of Tiljander03. For example, on page 20 of his 2001 PhD dissertation (1 MB PDF), Antti E.K. Ojala wrote:
our general procedure (Papers III; IV; V) has been to digitise X-ray radiographs with 1000 dpi optical resolution, providing an average of approximately 24 grey-scale data points per one 0.6 mm thick varve... The acquisition of comparable high-quality grey-scale images of the varved section is usually the most critical and time-consuming phase in digital image analysis. Owing to the considerable density difference between a minerogenic spring lamina and organic matter deposited during the summer, autumn and winter, X-ray radiography is an important and useful tool in documenting thinly (< 1 mm) laminated clastic-organic varves. Dense minerogenic layers have a greater ability to absorb X-rays than organic layers, therefore showing a lighter shadow in the X-ray film (Fig. 3). By using a 19-step standard glass sample with known density (Bresson & Moran, 1998), the comparability of the X-ray radiographs (grey-scale) of 2 mm thick slabs of embedded sediment was facilitated (Paper III).
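
Ojala's figure of roughly 24 grey-scale points per varve follows directly from the scan resolution. A quick check in Python (the function name is my own):

```python
MM_PER_INCH = 25.4


def pixels_per_varve(dpi, varve_mm):
    """Number of scanned pixels that span a varve of the given thickness."""
    pixel_mm = MM_PER_INCH / dpi  # width of one pixel in millimeters
    return varve_mm / pixel_mm


points = pixels_per_varve(1000, 0.6)
print(round(points, 1))  # 23.6
```

At 1000 dpi each pixel covers 0.0254 mm, so a 0.6 mm varve spans a little under 24 of them.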

So: why does it matter that two of the Lake Korttajarvi data series are calculated from the measured values of the other two?

The answer lies in the idea of Degrees of Freedom.

From Wikipedia's entry, here is one definition of the concept:
A common way to think of degrees of freedom is as the number of independent pieces of information available to estimate another piece of information. More concretely, the number of degrees of freedom is the number of independent observations in a sample of data that are available to estimate a parameter of the population from which that sample is drawn.
Think of it this way: suppose I wished to use a set of proxies to estimate a time series of something. That 'something' could be anything: temperature, precipitation, or kangaroo population, for instance. Since my proxies are noisy, I'll have more confidence in an estimate that is derived from a larger number of proxies -- all things being equal. But suppose I decided to increase the proxy count by copying-and-pasting columns in an Excel spreadsheet. One proxy can become two! Two can become four!

Obviously, this sort of copy-paste activity can't improve my results, because I haven't increased the number of independent observations in the data I am using to estimate the parameter of interest (temperature/precipitation/kangaroos). In other words: bigger spreadsheet, but unchanged degrees of freedom.
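
A toy example makes the point. The four proxy readings below are invented, and the "naive" standard error is simply the textbook formula applied as if every column were independent; this is a sketch of the pitfall, not of anything in Mann08.

```python
import math
import statistics

# Hypothetical proxy readings for a single year (arbitrary units).
proxies = [0.2, 0.9, -0.4, 0.5]
pasted = proxies * 2  # the same columns, copied and pasted once


def naive_standard_error(values):
    """Standard error of the mean, treating every value as independent."""
    return statistics.stdev(values) / math.sqrt(len(values))


mean_original = statistics.mean(proxies)
mean_pasted = statistics.mean(pasted)
se_original = naive_standard_error(proxies)
se_pasted = naive_standard_error(pasted)

print(mean_original, mean_pasted)  # the estimate itself is unchanged
print(se_original, se_pasted)      # the naive error bar shrinks anyway
```

The estimate does not move, yet the naive error bar gets narrower, because the formula has been told there are eight independent observations when there are still only four.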

Returning to Tiljander-in-Mann08:

If Lightsum and Darksum are used as "proxies," then Thickness and XRD cannot be used without specifically reducing the d.f. in all calculations -- they aren't independent.

Conversely, if Thickness and XRD are used as "proxies," then Lightsum and Darksum cannot be used without specifically reducing the d.f. in all calculations -- same reasoning.

There are two possible results if these cautions are not observed. First, the Tiljander data series will be overweighted -- there seems to be twice as much independent data from Lake Korttajarvi as is actually the case. Second, confidence intervals will be drawn too narrowly, as degrees of freedom always enter into such calculations. How much overweighting? Since none of the Tiljander data series can be directly calibrated to the instrumental temperature record -- Mann08's sole approach -- that question can't be answered. (As discussed in other posts, the proper weighting of the Tiljander data series is "zero".) How much underestimation of confidence intervals? There doesn't seem to be a clear answer to this question, either, for the same reason.
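
The dependence behind these cautions can be made concrete: given Lightsum and Darksum for a varve, Thickness and XRD can be recovered exactly, so the second pair adds no new information. The sketch below assumes HaroldW's formulas from the comments; the function names and the sample varve are hypothetical.

```python
def to_light_dark(thickness_um, xrd):
    """Forward calculation, using HaroldW's formulas."""
    lightsum = thickness_um * xrd / 254
    darksum = thickness_um * (255 - xrd) / 254
    return lightsum, darksum


def to_thickness_xrd(lightsum, darksum):
    """Invert the calculation: recover the two measured series."""
    total = lightsum + darksum
    thickness_um = total * 254 / 255
    xrd = 255 * lightsum / total
    return thickness_um, xrd


# Hypothetical varve: 600 um thick, XRD grey-scale value of 100.
ls, ds = to_light_dark(600.0, 100.0)
thickness, xrd = to_thickness_xrd(ls, ds)
print(thickness, xrd)  # the measured values come back, up to rounding
```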

In my opinion, there's no evidence and no likelihood of any intent to cut statistical corners by the authors of Mann08 or Mann09. The simple and obvious explanation is inadequate due diligence. This appears to be a common shortcoming of the "proxyhopper" approach favored by these and other researchers engaged in paleotemperature reconstruction.


  1. AMac -
    Thanks for the description of the series. A couple of questions:
    1. The file at the NCDC archive indicates archival in 2009. Was 2009 the first time this dataset was archived? or just the time it was archived at that particular location?

    2. What's going on at year 1326 -- in the NCDC file the values are off the scale. The Mann file replaces this with the average of the previous & succeeding year, and this is copied into your "as downloaded" sheet.

    3. The scaling constant .003937 presumably is an approximation to 1/254, leading one to suppose that the XRD gray scale runs from 0 to 254. Then LS = thickness * (XRD/254). However, the formula for DS is apparently DS = thickness * ((255-XRD)/254). Perhaps this is a simple programming error, as one would imagine that the intent was to divide the varve thickness into LS & DS components, that is, LS+DS = thickness. As it is, LS+DS = 255/254*thickness. One could guarantee LS+DS=thickness by replacing the "255" in the DS formula with 254; or alternatively by replacing the "254"s in the formulas with 255.

    4. The comments at the top of the NCDC file mention "a distinct positive anomaly in mineral matter accumulation between 907 and 875 BC." However, the data only go back as far as year 0.

    Perhaps you could ask Dr. Tiljander about the 255/254 discrepancy and the reason for the limited extent of the archival.

  2. HaroldW -

    Thanks for the careful read.

    1. I don't know when Ojala first archived their data, or when an archive was first easily accessed online. The NOAA/NCDC files do not make this clear. Here is an NOAA archive... but as you note, it only goes back to the year 0 (sic).

    2. 1326 -- I don't know. That's an interesting observation.

    3. I've added an Update to the post to reflect your insight, and revised the Excel file at BitBucket. What you write is correct.

    8-bit greyscale images -- the standard -- have 2^8 gradations of gray, and 2^8 = 256. If "black" or "white" are (accidentally) excluded, one is left with 255 levels. If both are excluded, one is left with 254. I suspect something like that explains the discrepancy you discovered.

    4. Tiljander03's data goes many centuries prior to AD 1. I am certain that I've pulled the earlier data from an online archive; it is somewhere in the files at BitBucket. I'll try and find the link to its online source.

    I've done a little exploration of different ways of analyzing the Tiljander data series. In short, (1) I don't think averaging or smoothing are the best procedures; (2) taking median values over a 20-year interval gives an interesting pattern that does not appear to be a proxy of temperature, pre-1720; (3) I suspect that both Lightsum and Darksum are positively-correlated proxies for precipitation. Email me if you'd like to follow up.

    As far as I know, the authors of Tiljander03 have not addressed the issues that are raised by Mann08's use of their work. They are, I believe, aware of these issues (and this blog).

  3. AMac --
    Thanks for the responses. I was going to ask about the likelihood that the Tiljander series are valid temperature proxies. I can't see any plausible direct connection between darksum, which is mineral, and temperature. Tiljander says of the darksum (in the archived file) "The accumulation of the mineral lamina is most likely a short-term event and the thickness of the mineral layer is directly related to the duration and strength of the spring flood." So perhaps there's a relation to precipitation, although seasonally limited.

    As to the lightsum (organic material), one might suspect that to the extent that higher temperatures engender greater growth, there might be a connection to the lightsum values. But wouldn't that connection be made in later years, when the plants die and the decayed products are washed back into the lake? Also, it seems more plausible that the quantity of organic accumulation would be more affected by the flow rates into the lake, which would mean that lightsum is connected more to precipitation. [As you say.]

    You've read Dr. Tiljander's thesis -- what are her comments on the utility of these series as proxies?

    In the end, I suppose this is a moot point, as regardless of what relationships one might suppose that the light/darksums hold to climatic parameters, the post-1720 data don't maintain that relationship. At least, they can't be used in a method which requires calibration against actual (modern) temperature data.

  4. HaroldW -

    Your observations are good, though I think you've swapped Lightsum (high X-Ray density; mineral) and Darksum (low X-Ray density; organic) in the 8:13pm comment.

    As far as I know, Tiljander and her colleagues didn't try to construct a quantitative model of possible effects of temperature on LS or DS. They seemed to have limited themselves to gazing at multicentury curves of smoothed data. Not that there's anything wrong with that.

    The point they missed has also been missed by Mann and coauthors, Kaufman and coauthors, Gavin Schmidt, and, in the recent thread at Bart's, Jim Bouldin. Namely, if a data series is indeed a proxy for temperature, it should have characteristics that indicate that it's a proxy for temperature. One would think that an obvious statement like this would be readily grasped. One would think that it could be sensibly discussed as a matter of routine, as scientist/TCO, MikeN, myself, and others did at the below-linked post. But this is climate science.

    In that regard, there are data series for Southern Finland that do appear to be proxies for temperature. See the figure of temperature reconstruction from T.P. Luoto's dissertation on fossil Chironomids from Lake Hamptrask (here; search for "I just looked at Fig. 14"). It's widely accepted that the Little Ice Age extended to Scandinavia -- and a downward excursion in Luoto's Chironomid-derived series can be discerned for those dates. If there is an analogous temperature-derived signal in LS or DS, it is a subtle one.

  5. HaroldW --

    You are correct about the entries for 1326. For that year, in the data as archived at NCDC by Ojala, values for Lightsum and Darksum are anomalous, over tenfold higher than any neighboring year. The same pattern thus holds for Thickness. Somehow, in the files archived at the Mann website, the numbers for all four data series have been overwritten by the average of the values for 1325 and 1327.

    This makes the graphs look prettier than they should.

    As far as I know, this substitution is not noted in Mann08, Mann09, or either of these papers' S.I.s.

    Thanks for the catch.

  6. What is the impact of the double counting? Can we quantify it? I wonder: if the series are already linearly dependent, does the algorithm end up nixing that aspect of the double counting, or does it really look like there are double the number of "strong responders"?

    I guess we ought to quantify the DOF issue as well. I suspect it is small. (Although maybe not when you get back to the MWP with few series, or in the "non dendro" variations.)


    There is another upside proxy in the same paper.

  8. A year later and I am still banned at CA. Do you really think that I am a detriment and the cheering choir a complement?

    Watch out for McI. He has an inner dishonesty. The guy you should grok with is Zorita. I would trust him to tell the truth even when it went against his interests (RARE!)

  9. posted at Steve's blog:

    Your comment is awaiting moderation.
    So when are we going to see the redone post-MMH calculations, Steve? You locked a post where you made a mistake (stopping people from discussing it…so much for free speech and criticism…looks more like you use your control of the forum to your advantage shamefully) and promised to rework your math. It has been over a year now.

  10. I have replaced the original (15 Aug 2011) version of "Tiljander03-Calculated_LS+DS.xls" with one dated 14 Nov 2011. The revision highlights the two different sets of data for the year 1326 (Ojala submission cf. Mann08 archive). Thanks to HaroldW for spotting the apparent discrepancy.

  11. Hey...didn't I have some role in discovering the lack of independence? I know that my intuitions were that the two metrics were percent X-ray transmission (can be converted mathwise to absorbance) as well as sample thickness. I think this ended up being all they really had for starts...just transformations after.

  12. > Hey...didn't I have some role in discovering the lack of independence?

    Hmmm. IIRC you did state that that seemed to be the case. I'll do a read-through at some point and highlight any relevant passages.

    For me, the aha moment (small one) was seeing that the formula worked all the way through, except the infilled years.

  13. Can you find an entry on Tiljander at Skeptical Science? I have only found a list of links.