A Canadian database is emerging as the planet’s biggest repository of radiocarbon dates. Its overseer dreams of a single global database that will revolutionize archaeology.

In a remote set of islands on BC’s northern coast, home to the Tsimshian people, archaeologist Andrew Martindale drove yet another percussion corer into the ground. Martindale and his team were taking cores, some of them nine metres long, from dozens of sites around Prince Rupert, looking for the crushed mussel shell beds that supported villages here more than 1,000 years ago. The cores would be examined in their lab, and the top and bottom of the shell layers sent away for radiocarbon dating, to pinpoint exactly when people lived here—and to confirm an oral history about how locals abandoned their territory for a couple of generations after an invasion.

Martindale’s team drives a corer into the ground to investigate thousand-year-old settlements in northern BC. Photo credit: Andrew Martindale

Then, finally, comes the most mundane, but arguably most important part of the process: filing all 300 of those dates away in a database.

The Canadian Archaeological Radiocarbon Database (CARD) is emerging as the world’s biggest repository of carbon dates for all things archaeological, from bones to teeth to artefacts. Martindale, who became director of this initiative in 2014, hopes it will form the backbone of a single, global repository, helping archaeology to enter the modern world of Big Data.

Locations of the archaeological sites in the CARD and the sampling intensity base map. Photo credit: M.A. Chaput et al

Sorting and cataloguing data, as simple as it sounds, can have transformative effects on science. The National Center for Biotechnology Information’s GenBank, for example, has revolutionized biology by organizing and opening up genetic sequencing information for the masses. Researchers like Martindale likewise dream of saving their colleagues time and money by putting all carbon dates measured by anyone, anywhere, into a single database, which would make previous work easier to find and correct. This should also make the dates themselves easier to mine, thereby creating a whole new technique for tracking how human populations have moved, ebbed and flowed over millennia.

There are other, smaller radiocarbon databases out there: Italy maintains a database of Mediterranean dates, and the Euro Evol project has radiocarbon dates tracking agriculture in Europe.

“Everyone else has just been concerned with their neck of the woods,” notes Robert Kelly, a professor in the University of Wyoming’s Department of Anthropology. He supports the CARD effort, which now includes some 70,000 radiocarbon records from 70 countries. Kelly alone has half a dozen students now compiling tens of thousands of dates from all across the United States, with the aim of adding them to CARD soon.

One of University of Wyoming anthropologist Robert Kelly’s excavation sites, which will yield radiocarbon dating information to be filed in CARD. Photo credit: R. L. Kelly

Perhaps the greatest benefit of a centralized radiocarbon database, he argues, is that it opens the door to data mining. “The dates themselves become data,” says Martindale.

The general idea is to assume that the number of dated objects in any given area is proportional to the number of people who were living there at the time. So if you find a cluster of radiocarbon dates, there was likely a large group living there then; if there’s a sudden dip or gap in the radiocarbon record, then whoever was living there likely moved away or died out.
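The counting idea can be illustrated with a minimal sketch. It assumes each date is a single age in whole years before present (BP) and simply tallies dates into fixed-width time bins; the function name, the bin width and the sample ages are all hypothetical. Published analyses typically work with the full probability distribution of each calibrated date rather than a single number, so this shows only the principle.

```python
from collections import Counter


def count_dates_per_bin(ages_bp, bin_width=500):
    """Tally radiocarbon ages (whole years BP) into fixed-width time bins.

    Returns a dict mapping each bin's lower edge to the number of dates
    in it, including empty bins between the youngest and oldest dates.
    """
    counts = Counter((age // bin_width) * bin_width for age in ages_bp)
    if not counts:
        return {}
    first, last = min(counts), max(counts)
    return {
        start: counts.get(start, 0)
        for start in range(first, last + bin_width, bin_width)
    }


# Hypothetical ages, invented for illustration only.
sample_ages = [1150, 1220, 1300, 1480, 2100, 2950, 3010, 3120]

for start, n in count_dates_per_bin(sample_ages).items():
    print(f"{start}-{start + 499} BP: {'#' * n} ({n})")
```

With these invented ages, the 1000-1499 BP bin holds four dates and the 1500-1999 BP bin holds none. Under the proportionality assumption, that is the kind of cluster and gap a researcher would read as a large settlement preceded by an absence.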

Such work is controversial. Some argue that there is too much bias in the data. If one researcher was particularly interested in, say, pottery from the American Southwest in AD 1100, there might be a blip in radiocarbon dates resulting from the emphasis of their work, rather than from any demographic change. But the database is getting large enough to iron out such blips, argues Kelly.

Controversial or not, researchers have started to use this strategy to map out population movements. In 2015, when CARD was only half its current size, Martindale and colleagues used the dataset to make the first continent-wide map of human occupation of the Americas over the past 13,000 years. This year they used it to track demographic changes during the black plague, and to confirm those oral histories of the Tsimshian near Prince Rupert. The more radiocarbon dates in the database, the better this method works.

Martindale and his colleagues travel by boat around Tsimshian territory near Prince Rupert. Photo credit: Andrew Martindale

There are still hurdles for CARD to overcome. Some information, like precise location, must be kept secret in some countries to avoid looting of archaeological sites. This is what is delaying Kelly from adding his US dates to CARD while he works on agreements with various states on how best to “fuzz” out the location data.
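The article does not say how the location data would be fuzzed. One simple possibility, sketched here purely as an assumption, is to snap each site to the centre of a coarse grid cell, so that every site in the same cell reports the same coordinates. The function name, the half-degree cell size and the example coordinates are hypothetical.

```python
import math


def fuzz_location(lat, lon, cell_deg=0.5):
    """Replace a precise position with the centre of its grid cell.

    cell_deg is the width of a cell in degrees (an assumed value);
    larger cells hide the true position more thoroughly.
    """
    def snap(value):
        return (math.floor(value / cell_deg) + 0.5) * cell_deg

    return snap(lat), snap(lon)


# Illustrative coordinates only, not a real site record.
print(fuzz_location(54.3150, -130.3208))  # (54.25, -130.25)
print(fuzz_location(54.3000, -130.3000))  # (54.25, -130.25)
```

Both example points fall in the same cell and so become indistinguishable. The trade-off is that coarser cells protect sites better but blur the spatial patterns researchers want to study.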

Another problem is trying to standardize all the information that goes into a carbon date. For any given data point researchers might want to know a host of information to judge the date’s quality, says Tom Higham, Deputy Director of the Oxford Radiocarbon Accelerator Unit. “For bones, for example, it is very important to know the pretreatment chemistry applied, the amount of collagen extracted, and the stable isotope values for carbon and nitrogen,” he says.

Martindale acknowledges that it is impossible to fit all of this information into a single database. Some contextual information, like the type of site, has no standardized description. He hopes that a clone system, in which each lab or group runs its own copy of CARD’s software, will help to get around some of these problems by allowing each client to customize their version for their own purposes.

In the meantime, Martindale keeps going out into the field to dig up more artefacts, building up his own small repository of carbon dates. Each one will get filed away into CARD—as he finds the time to enter the data. “Even I have data published over the last 6 months that aren’t yet in CARD,” he laughs. “Now that I have embarrassed myself, I will do that today.”

Building a definitive database – one lab at a time

The idea behind radiocarbon dating is simple. Living things soak up carbon from the environment as they eat or grow. That carbon naturally contains a mix of different isotopes, including radioactive carbon-14. Once the organism dies, it ceases to take up new carbon, and its radioactive carbon starts to decay. By tracking the amount of radioactive carbon left in a sample, one can calculate how long ago that creature died, up to about 50,000 years ago.
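The underlying arithmetic can be sketched in a few lines, assuming ideal exponential decay and a carbon-14 half-life of roughly 5,730 years. The function names are hypothetical, and the result is an uncalibrated estimate: laboratories conventionally report ages using the older Libby half-life of 5,568 years and then calibrate them.

```python
import math

HALF_LIFE_YEARS = 5730  # approximate half-life of carbon-14


def age_from_fraction(fraction_remaining):
    """Years elapsed, given the fraction of original carbon-14 still present."""
    if not 0 < fraction_remaining <= 1:
        raise ValueError("fraction_remaining must be in (0, 1]")
    return -HALF_LIFE_YEARS * math.log2(fraction_remaining)


def fraction_after(years):
    """Fraction of the original carbon-14 left after the given time."""
    return 0.5 ** (years / HALF_LIFE_YEARS)


print(age_from_fraction(0.5))            # one half-life
print(age_from_fraction(0.25))           # two half-lives
print(f"{fraction_after(50_000):.4f}")   # what is left near the method's limit
```

After 50,000 years only about 0.2 percent of the original carbon-14 remains, which helps explain why the method runs out of signal at around that age.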

Age determination is complicated by many factors. The environment’s natural 14C levels have changed over time, thanks, for example, to wobbles in the planet’s magnetic field and the strength of the Sun. And mankind has altered it through the testing of nuclear weapons (which boosts 14C) and the burning of fossil fuels (which dilutes 14C). Additionally, some processes, like a plant soaking up carbon from the air, tend to take up more of one isotope than another. This fractionation needs to be accounted for. The carbon isotopic content of the air is often different from that of the ocean, or the soil, so where an organism was living (or what it ate) also makes a difference to the date calculation.

As researchers learn more about these processes, they publish new calibration calculations allowing scientists to match the right date to their data. The last such publication, in 2013, tweaked some of Martindale’s previous results by about 5 years. “It’s just a part of life,” says Martindale. “You have to recalculate every 5 years.”

CARD aims to capture enough contextual information — or at least link out to papers that do — so that dates can be re-calibrated or re-assessed in the light of future research. But perhaps the clearest benefit of a central database is that it simply makes data easier to find. Martindale, who has been working in coastal BC for 20 years, recently stumbled on a relevant dataset from another researcher that he didn’t know existed.

“And I thought I had it all,” Martindale laughs.

In addition, putting it all in one spot saves researchers the expense of replicating the work of others. According to Carley Crann, who works at the University of Ottawa’s high-precision isotope lab, the André E. Lalonde Accelerator Mass Spectrometry Laboratory, carbon dating in Canada costs hundreds of dollars per sample.

The Lalonde lab is the only one in Canada that does accelerator mass spec radiocarbon dating. That means they can analyze small samples, like bone the size of a thumb, and do so in an hour rather than weeks or months. They crunch through about 2,000 dates a year for researchers who send them their samples. Nevertheless, Crann notes, the lab does not have a good database to store it all. “We currently don’t have an efficient way to store all the data we collect, so we need to build a database,” she says.

The accelerator mass spectrometer at the University of Ottawa’s Lalonde Lab is one of a kind in Canada. Photo credit: Lalonde Lab

Crann is now working with Martindale to make a clone of CARD, copying its software so the Lalonde lab does not have to start from scratch — an effort that might otherwise cost upward of six figures. For its part, CARD is happy to give Lalonde the code for free, since it will encourage that facility’s customers to donate their data to the CARD central database. The easier they make the interface—with just a simple button that says “yes donate my data”—the better the chances that people will contribute.

If it works out, the Lalonde lab will be the guinea pig for this strategy, which CARD hopes to roll out for others around the globe. This win-win strategy sidesteps an awkward question of funding that has vexed CARD since it was started in the 1980s on a shoestring budget. Martindale brought a $15,000 grant with him when he took over as director, but he has learned that prospective donors do not find supporting an archaeological cataloguing service to be as appealing as supporting work in the field.

Martindale’s plan gets around this problem by decentralizing the funding. Each group or lab gets to customize its own version of CARD to suit its individual needs or research questions, but will then be responsible for raising its own cash to do so. In return, CARD gets access to ever-more data.