IntChron is an indexing service for chronological data such as radiocarbon dates (Bronk Ramsey et al. 2019). It specifies a standard exchange format and provides a consistent API for querying databases that use its schema. The rintchron package provides a simple interface for querying these databases with intchron(), explained in this vignette.

The package also includes low-level functions for interacting with the IntChron API directly, described in vignette("intchron-api").

Querying IntChron

Use intchron() to query databases indexed by IntChron. At a minimum, you will need to specify which databases or ‘hosts’ you want to query (see note 1). Use intchron_hosts() to see a list of currently available databases:

intchron_hosts()
#> # A tibble: 5 x 2
#>   host     database                            
#>   <chr>    <chr>                               
#> 1 egyptdb  Egyptian Radiocarbon Database       
#> 2 intimate INTIMATE Database                   
#> 3 nrcf     NERC Radiocarbon Facility (Oxford)  
#> 4 oxa      Oxford Radiocarbon Accelerator Unit 
#> 5 sadb     Southern Africa Radiocarbon Database

The first argument to intchron() should be the ‘host’ code of the database you want to query, or a vector of host codes to query more than one (e.g. intchron(hosts = c("oxa", "nrcf"))). For example, to return the entire Southern Africa Radiocarbon Database (‘sadb’):

intchron("sadb")
#> # A tibble: 2,575 x 24
#>    record_site record_country record_name record_longitude record_latitude
#>    <chr>       <chr>          <chr>                  <dbl>           <dbl>
#>  1 Basinghall  Botswana       Basinghall              27.1           -23.5
#>  2 Basinghall  Botswana       Basinghall              27.1           -23.5
#>  3 Basinghall  Botswana       Basinghall              27.1           -23.5
#>  4 Basinghall  Botswana       Basinghall              27.1           -23.5
#>  5 Basinghall  Botswana       Basinghall              27.1           -23.5
#>  6 Basinghall  Botswana       Basinghall              27.1           -23.5
#>  7 Bisoli      Botswana       Bisoli                  27.6           -21.0
#>  8 Bisoli      Botswana       Bisoli                  27.6           -21.0
#>  9 Bobonong R… Botswana       Bobonong R…             28.8           -22.2
#> 10 Bosutswe    Botswana       Bosutswe                26.6           -22.0
#> # … with 2,565 more rows, and 19 more variables: series_type <chr>,
#> #   labcode <chr>, country <chr>, region <chr>, longitude <dbl>,
#> #   latitude <dbl>, site <chr>, sample <chr>, material <chr>, species <chr>,
#> #   d13C <dbl>, r_date <int>, r_date_sigma <int>, qual <chr>,
#> #   environment <chr>, context <chr>, period <chr>, subperiod <chr>, refs <chr>

You can further refine your query by specifying the locations you are interested in with the countries and sites parameters. Like hosts, these can also accept a vector of locations. For example, to download records from Jordan in the ORAU (oxa) and NERC-RF (nrcf) databases:

jordan <- intchron(c("oxa", "nrcf"), countries = "Jordan")
jordan
#> # A tibble: 156 x 19
#>    record_site record_country record_name record_longitude record_latitude
#>    <chr>       <chr>          <chr>                  <dbl>           <dbl>
#>  1 Araq ed-Du… Jordan         Araq ed-Du…             32.3            35.7
#>  2 Ayn Qasiyah Jordan         Ayn Qasiyah             36.8            31.8
#>  3 Ayn Qasiyah Jordan         Ayn Qasiyah             36.8            31.8
#>  4 Ayn Qasiyah Jordan         Ayn Qasiyah             36.8            31.8
#>  5 Ayn Qasiyah Jordan         Ayn Qasiyah             36.8            31.8
#>  6 Ayn Qasiyah Jordan         Ayn Qasiyah             36.8            31.8
#>  7 Azraq 31    Jordan         Azraq 31                36.8            31.8
#>  8 Azraq 31    Jordan         Azraq 31                36.8            31.8
#>  9 Azraq 31    Jordan         Azraq 31                36.8            31.8
#> 10 Burqu' 02   Jordan         Burqu' 02               37.8            32.7
#> # … with 146 more rows, and 14 more variables: series_type <chr>,
#> #   labcode <chr>, longitude <dbl>, latitude <dbl>, sample <chr>,
#> #   material <chr>, species <chr>, d13C <dbl>, r_date <int>,
#> #   r_date_sigma <int>, qual <chr>, F14C <dbl>, F14C_sigma <dbl>, refs <chr>

Use intchron_countries() to get a list of available countries on IntChron:

intchron_countries()
#> # A tibble: 109 x 1
#>    country   
#>    <chr>     
#>  1 Albania   
#>  2 Algeria   
#>  3 Andorra   
#>  4 Angola    
#>  5 Antarctica
#>  6 Argentina 
#>  7 Armenia   
#>  8 Australia 
#>  9 Austria   
#> 10 Bahrain   
#> # … with 99 more rows

Or in specific databases:

intchron_countries(c("intimate", "egyptdb"))
#> # A tibble: 13 x 2
#>    host     country                
#>    <chr>    <chr>                  
#>  1 intimate ""                     
#>  2 intimate "France"               
#>  3 intimate "Greenland"            
#>  4 intimate "Ireland"              
#>  5 intimate "Italy"                
#>  6 intimate "Norway"               
#>  7 intimate "Romania"              
#>  8 intimate "Slovenia"             
#>  9 intimate "Switzerland"          
#> 10 intimate "UK"                   
#> 11 egyptdb  "Egypt"                
#> 12 egyptdb  "Palestinian Territory"
#> 13 egyptdb  "Sudan"
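
The host and country table can also help decide which hosts a query needs. The following is a minimal sketch in base R, not part of rintchron: the data frame is typed in from the output above so that it runs offline, and the names available, wanted, listed and hosts_needed are introduced only for illustration.

```r
# Stand-in for intchron_countries(c("intimate", "egyptdb")),
# typed in from the output shown above so the sketch runs offline
available <- data.frame(
  host = c(rep("intimate", 10), rep("egyptdb", 3)),
  country = c("", "France", "Greenland", "Ireland", "Italy", "Norway",
              "Romania", "Slovenia", "Switzerland", "UK",
              "Egypt", "Palestinian Territory", "Sudan"),
  stringsAsFactors = FALSE
)

# Countries of interest (chosen only for illustration)
wanted <- c("Egypt", "Sudan")

# Ignore blank country entries, then keep the hosts that list a wanted country
listed <- available[available$country != "", ]
hosts_needed <- unique(listed$host[listed$country %in% wanted])
hosts_needed

# With a network connection, the result could be passed straight on:
# intchron(hosts_needed, countries = wanted)
```

Naming only the hosts that list the countries you need keeps the number of API requests down.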

Working with IntChron records

With the default setting tabulate = TRUE, intchron() returns a table of records, ready for you to use in your analysis:

library("dplyr", warn.conflicts = FALSE)

# Summarise radiocarbon dates available from sites in Jordan
jordan %>% 
  distinct(labcode, .keep_all = TRUE) %>% 
  group_by(record_site) %>% 
  summarise(n_dates = n(), .groups = "drop_last")
#> # A tibble: 20 x 2
#>    record_site            n_dates
#>  * <chr>                    <int>
#>  1 Araq ed-Dubb                 1
#>  2 Ayn Qasiyah                  5
#>  3 Azraq 31                     3
#>  4 Burqu' 02                    1
#>  5 Burqu' 03                    1
#>  6 Burqu' 27                    3
#>  7 Burqu' 35                    3
#>  8 Dahikiya, Badia Region       2
#>  9 Dhuweila                     4
#> 10 Kharaneh IV                  7
#> 11 Shuna Project               22
#> 12 Tell Abu Al-Kharaz          18
#> 13 Tell Abu en-Niaj             3
#> 14 Tell el-Hayyat               4
#> 15 Tell el-Hibr                 1
#> 16 Tell Hesban                  1
#> 17 Wadi Jilat                  11
#> 18 Wadi Jilat 13                1
#> 19 Wadi Jilat 22                2
#> 20 Wadi Jilat 25                1

Note the use of distinct(labcode, .keep_all = TRUE) above, which keeps one row per laboratory code. The data from IntChron usually requires some cleaning; for example, the ORAU and NERC-RF databases contain many duplicate radiocarbon dates. The c14bazAAR package (Schmid, Seidensticker, and Hinz 2019) includes many useful functions for tidying radiocarbon data.
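
Before dropping duplicates it can be worth checking how many rows are affected. The following is a minimal sketch in base R on a hand-made data frame. The site names, lab codes and dates are invented for illustration and do not come from IntChron; only the column names follow the intchron() output shown earlier.

```r
# Toy stand-in for an intchron() result; all values are invented
dates <- data.frame(
  record_site = c("Site A", "Site A", "Site B", "Site B"),
  labcode = c("Lab-001", "Lab-001", "Lab-002", "Lab-003"),
  r_date = c(9500L, 9500L, 8200L, 8350L),
  r_date_sigma = c(60L, 60L, 45L, 50L),
  stringsAsFactors = FALSE
)

# Count the rows that repeat a lab code already seen further up the table
n_duplicates <- sum(duplicated(dates$labcode))

# Keep the first row for each lab code; this retains the same rows as
# dplyr's distinct(labcode, .keep_all = TRUE)
dates_unique <- dates[!duplicated(dates$labcode), ]
```

Matching on labcode alone assumes that a lab code identifies a single measurement. If two databases report the same lab code with different values, only the first row is kept, so it may be worth inspecting the dropped rows first.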

You may find the stratigraphr and rcarbon (Crema and Bevan 2020) packages useful for further analysis of radiocarbon dates in R.

In some situations you might want to access the full records returned by IntChron. Setting tabulate = FALSE will return the raw JSON responses as a named list. See vignette("intchron-api") for some tips on how to work with these objects.

References

Bronk Ramsey, Christopher, Maarten Blaauw, Rebecca Kearney, and Richard A Staff. 2019. “The Importance of Open Access to Chronological Information: The IntChron Initiative.” Radiocarbon 61 (5): 1–11. https://doi.org/10.1017/RDC.2019.21.

Crema, Enrico R, and Andrew Bevan. 2020. “Inference from Large Sets of Radiocarbon Dates: Software and Methods.” Radiocarbon, 1–17. https://doi.org/10.1017/RDC.2020.95.

Schmid, Clemens, Dirk Seidensticker, and Martin Hinz. 2019. “c14bazAAR: An R Package for Downloading and Preparing C14 Dates from Different Source Databases.” Journal of Open Source Software 4 (43): 1914. https://doi.org/10.21105/joss.01914.


  1. There are several reasons for this. First and foremost, it reduces the number of requests a given query has to make to the IntChron API. Querying all hosts isn’t usually necessary because IntChron indexes different types of database (e.g. radiocarbon dates from ORAU, palaeoclimate records from INTIMATE), which are rarely combined in a single analysis. Also, since IntChron is an indexing service, it is designed to include more databases over time, so analysis code that is not explicit about which hosts it needs is likely to break, or at least become much less efficient, in the future. If you do need to, you can query all hosts by setting hosts = "all".