The National Archives Labs

Datasets – UPDATED

We have updated our range of datasets, which are based on records and information held by The National Archives. We encourage web developers to experiment with them to build new applications, online tools and ways of visualising data.

We will continue to add more datasets to this post, so do revisit Labs for new data to use. We will also update the data currently available, so please check the date on which each dataset was added to make sure you have the latest version. All the information about the data is below. Do let us know what you’ve done with the data by posting a comment on Labs; it would be great to hear from you. To access all the datasets, go to: data.nationalarchives.gov.uk/records/.

The data we have made available through Labs is licensed under the Open Government Licence.

Middlesex Appeal Tribunal Case Papers, 1916-1919 (MH 47)

Type: Fields extracted from The National Archives’ catalogue

File format: Excel Spreadsheet

Size: 967k

Date added: 22 January 2014

Catalogue details of the case papers from series MH 47 of individuals appealing against conscription to the Middlesex Appeal Tribunal, giving names, addresses and occupations of appellants, as well as grounds for appeal. It would support georeferencing and statistical analysis of appellants’ occupations and grounds of appeal.
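As a starting point for that kind of statistical analysis, here is a minimal sketch of counting how often each occupation or ground of appeal appears. It assumes you have saved the spreadsheet as CSV first. The column names (`name`, `occupation`, `grounds`) and the sample rows are made up for illustration, so check them against the real file before use.

```python
import csv
import io
from collections import Counter

# Made-up sample rows standing in for a CSV export of the spreadsheet.
# The real column names and values may differ.
SAMPLE_CSV = """name,occupation,grounds
Appellant A,Baker,Ground X
Appellant B,Clerk,Ground Y
Appellant C,Baker,Ground X
"""


def tally(rows, column):
    """Count how often each non-empty value appears in one column."""
    values = ((row.get(column) or "").strip() for row in rows)
    return Counter(value for value in values if value)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
occupation_counts = tally(rows, "occupation")
grounds_counts = tally(rows, "grounds")

# Most frequent values first.
print(occupation_counts.most_common())
print(grounds_counts.most_common())
```

To run this against your own export, replace `io.StringIO(SAMPLE_CSV)` with an opened file, for example `open("your-export.csv", newline="", encoding="utf-8")`.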

Right-click to save this dataset to your computer.

Women in the National Register of Archives

Type: Authority data

File format: Excel Spreadsheet

Size: 4.1mb

Date added: 20 March 2012

The National Register of Archives (NRA) is a central point for the collection and dissemination of information about the nature and location of manuscripts relating to British history. It currently consists of over 44,000 unpublished lists and catalogues that describe archival holdings in the UK and overseas.

This dataset contains details of all the collections in the NRA relating to women. It can be combined with data from ARCHON to allow georeferencing.

The individuals listed in this dataset form part of the NNAF (National Name Authority File).

Right-click to save this dataset to your computer.

Government films accessioned by The National Archives

Type: List of films

File format: Excel Spreadsheet

Size: 928k

Date added: 20 March 2012

A list of over 4800 government films accessioned by The National Archives and held as public records in the BFI’s National Film Archive, including wartime and documentary classics such as Night Mail and Listen to Britain, and colonial films such as Amenu’s Child and Morning on Mount Kenya.

The National Archives holds paper files on the making of many of these films in the INF class.

Right-click to save this dataset to your computer.

New arrivals: files accessioned by The National Archives 2000-11

Type: File list

File format: TSV

Size: 1mb

Date added: 20 March 2012

Gives details of files received by The National Archives over the past decade including department of origin, year of arrival, catalogue reference and physical width on our shelves.

This dataset would support visualisation.
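For example, before charting the data you might total the shelf width received from each department per year. The sketch below reads tab-separated text and does exactly that. The header names (`department`, `year`, `reference`, `shelf_width`) and the sample rows are assumptions made for illustration, not the real file's layout, and the unit of width is not stated here.

```python
import csv
import io
from collections import defaultdict

# Made-up tab-separated sample; the real header names may differ.
SAMPLE_TSV = (
    "department\tyear\treference\tshelf_width\n"
    "Dept A\t2001\tXX 1/1\t0.25\n"
    "Dept A\t2001\tXX 1/2\t0.5\n"
    "Dept B\t2002\tYY 2/1\t1.5\n"
)


def width_by_department_and_year(lines):
    """Sum shelf width for each (department, year) pair."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines, delimiter="\t"):
        key = (row["department"], int(row["year"]))
        totals[key] += float(row["shelf_width"])
    return dict(totals)


totals = width_by_department_and_year(io.StringIO(SAMPLE_TSV))
for (department, year), width in sorted(totals.items()):
    print(department, year, width)
```

The resulting dictionary can be passed to whichever charting library you prefer.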

Right-click to save this dataset to your computer.

ARCHON Directory of UK Archives

Type: Address data

File format: CSV

Size: 505k

Date added: 26 July 2011

Postal and web addresses for around 2400 UK record repositories listed in the National Register of Archives (NRA). This dataset would support georeferencing.

Right-click to save this dataset to your computer.

ARCHON Directory of International Archives

Type: Address data

File format: CSV

Size: 135k

Date added: 26 July 2011

A list of around 700 archives from around the world that hold collections noted under the indexes of the National Register of Archives (NRA).

The dataset gives postal and web addresses for these repositories and would support georeferencing.

Right-click to save this dataset to your computer.

NDAD: National Digital Archive of Datasets

Type: Collection of government databases

File format: Zipped CSV files

Size: Various

Date added: 26 July 2011

The National Digital Archive of Datasets is accessible through Documents Online. It contains hundreds of datasets from over 30 government departments, ranging from crime and judicial statistics to listed buildings and bat population data.

Learn more and download datasets from The National Archives’ website.

Hospital Records Database

Type: Database

File format: Zipped tab delimited files

Size: 578k

Date added: 26 July 2011

The database provides information on the existence, location and administration of the records of UK hospitals. There are currently over 2,800 entries, compiled by the Wellcome Library for the History and Understanding of Medicine, which continues to develop and improve the database.

This data is released under a Creative Commons non-commercial licence by kind permission of the Wellcome Library.

See the live database on The National Archives’ website.

Ancient Correspondence (1175-1538) (SC 1)

Type: Fields extracted from The National Archives’ catalogue

File format: CSV

Size: 233k

Date added: 26 July 2011

Catalogue details of a selection of letters from the 12th to the 16th centuries giving details of writers, recipients and the subject of their correspondence.

Right-click to save this dataset to your computer.

Papal Bulls (SC 7)

Type: Fields extracted from The National Archives’ catalogue

File format: CSV

Size: 408k

Date added: 26 July 2011

This file series contains official correspondence from the Pope (some with attached metal ‘bullae’) between the 1130s and the break with Rome 400 years later.

This catalogue data might permit relatively sophisticated analysis of the Pope’s wide interests in British politics and ecclesiastical life.

See a visualisation based on this dataset.

Right-click to save this dataset to your computer.

Equity Pleadings (C6)

Type: Database

File format: Zipped tab delimited files

Size: 14.4mb

Date added: 26 July 2011

Covers details of Equity cases from the court of Chancery: manorial, domestic and trading disputes, disputes over land purchase, apprenticeship agreements and much more.

This dataset gives names, places and the subject of the litigation.

Right-click to save this dataset to your computer.

Poor Law Union and Workhouse Records (MH 12)

Type: Extended document descriptions

File format: Zipped XML files

Size: 10.3mb

Date added: 26 July 2011

Extremely detailed, name-rich record descriptions of a wide range of correspondence between central government and selected regional Victorian Poor Law Unions across England and Wales. You will find letters, memos, reports and accounts alongside details of individual paupers, many with stories of extreme hardship.

This dataset might support relatively sophisticated modelling of disease and mortality statistics amongst many other applications.

Right-click to save this dataset to your computer.

Domesday Places

Type: WFS dataset

File format: XML

Size: 4mb

Date added: 11 January 2011

This dataset lists places mentioned in William the Conqueror’s Domesday Book. Other data shown includes current place names, latitude/longitude, and an ID number which is keyed against our placename gazetteer (The National Archives Places), supplied separately below.

NOTE: Place ID in this dataset matches Place ID in The National Archives Places.
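Because the two datasets share a Place ID, they can be joined on that key once each has been parsed into records. The sketch below shows one way to do the join and to list any Domesday entries with no gazetteer match. The field names (`place_id`, `domesday_name`, `modern_name`) and the sample records are hypothetical, and parsing the XML and the gazetteer file is left out.

```python
# Hypothetical, already-parsed records; field names are illustrative only.
domesday_places = [
    {"place_id": "1", "domesday_name": "Example A"},
    {"place_id": "2", "domesday_name": "Example B"},
    {"place_id": "9", "domesday_name": "Example C"},
]
gazetteer_places = [
    {"place_id": "1", "modern_name": "Place One"},
    {"place_id": "2", "modern_name": "Place Two"},
]


def join_on_place_id(left, right, key="place_id"):
    """Merge each left record with the right record sharing the same key.

    Returns (matched, unmatched): merged records, and left records
    that have no counterpart on the right.
    """
    lookup = {row[key]: row for row in right}
    matched, unmatched = [], []
    for row in left:
        other = lookup.get(row[key])
        if other is None:
            unmatched.append(row)
        else:
            # Fields from the left record win if both sides share a name.
            matched.append({**other, **row})
    return matched, unmatched


matched, unmatched = join_on_place_id(domesday_places, gazetteer_places)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

Keeping the unmatched records separate makes it easy to spot IDs that are missing from one side.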

Right-click to save this dataset to your computer.

Medieval Petitions 1189-1577 (SC 8)

Type: Fields extracted from The National Archives’ Catalogue

File format: Excel spreadsheet

Size: 32mb

Last updated: 26 July 2011

Contained in this spreadsheet are details of petitions to the Crown and other state officials from across the medieval period.

The Catalogue entries list the petitioners and the nature of their request as well as related place names, addressees and other data. (This information is split across two columns and will need to be recombined.)
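A minimal sketch of the recombining step follows. It assumes the text simply runs on from the first column into the second, so that joining the two with a space restores the full entry. How the real spreadsheet splits the text has not been checked here, so treat `recombine` as a hypothetical helper and adjust it to the actual layout.

```python
def recombine(first_part, second_part):
    """Join two halves of a split description into one string.

    Empty or missing halves are skipped, so a row with text in only
    one column is returned unchanged apart from trimmed whitespace.
    """
    parts = [(part or "").strip() for part in (first_part, second_part)]
    return " ".join(part for part in parts if part)


# Made-up example values, not taken from the dataset.
print(recombine("Petitioners: Example petitioner.", " Nature of request: Example."))
print(recombine("Only the first column is filled", None))
```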

Right-click to save this dataset to your computer.

Series Statistics

Type: Numerical

File format: Excel spreadsheet

Size: 305k

Date added: 11 January 2011

This spreadsheet gives the numbers of pieces and documents for every series in The National Archives’ Catalogue – from A 1 to ZWEB 7. The related pivot table (943kb) gives the numbers of pieces in each class.

The data has the potential to allow visualisations showing the relative scale of our holdings across government.

Right-click to save this dataset to your computer.

Serious Crimes 1962-76 (DPP 2)

Type: Fields extracted from The National Archives’ Catalogue

File format: Excel spreadsheet

Size: 1.8mb

Date added: 11 January 2011

This file series contains records from the Director of Public Prosecutions. This dataset is extracted from the most fully catalogued portion of the series and gives details of defendants, offences committed and some geographical information.

NOTE: The naming of a defendant within this dataset does not imply guilt.

Right-click to save this dataset to your computer.

@ukwarcabinet

Type: Extracted Twitter feed

File format: CSV

Size: 1.2mb

Last updated: 26 July 2011

These tweets are from our @ukwarcabinet Twitter feed, which uses British War Cabinet papers to follow the progress of the Second World War in near real time.

The spreadsheet contains tweets covering events from 1 January 1940 to 17 November 1940. Each tweet links to an original Cabinet paper through our DocumentsOnline service.

Right-click to save this dataset to your computer.

Victorian Photographs (COPY 1)

Type: Fields extracted from The National Archives’ Catalogue

File format: Excel spreadsheet

Size: 42mb

Date added: 11 January 2011

COPY 1 is a copyright register which contains examples of Victorian art and design sent to Stationers’ Hall in London by their creators, as proof of their ownership of the work.

This dataset is extracted from the best catalogued portion of COPY 1. Each entry gives a description of a photograph, the photographer’s name and their address.

This dataset would support georeferencing.

Right-click to save this dataset to your computer.

Discovery Taxonomy

Type: Categorisation/taxonomy

File format: XML

Size: 270kb

Date added: 22 March 2012

Every item in Discovery is assigned one of around 120 categories according to a series of rules described in this document.

These categories aid searching and filtering of search results.

Right-click to save this dataset to your computer.

Selected Government Social Media channels

Type: List

File format: Excel spreadsheet with multiple worksheets

Size: 28kb

Date added: 22 March 2012

A broadly representative selection of central government social media presences covering Flickr, Facebook, Twitter and YouTube.

This data represents a small subset of the channels which The National Archives is required to archive on behalf of central government. We estimate this will reach 200 for this year.

Right-click to save this dataset to your computer.

UK Government Web Traffic Statistics (2009-12)

Type: Web stats

File format: Excel spreadsheet

Size: 56kb

Date added: 22 March 2012

Traffic statistics for the UK Government Web Archive first became available to us in January 2009. Since that time we’ve seen a massive increase in the number of ‘hits’ received and some interesting trends have emerged.

Please note that, due to the complexities of recording ‘hits’ to such a large collection, the figures may not be entirely accurate; however, they are broadly reliable and the trends are accurately reflected.

The statistics are divided into two categories: ‘Normal’ (which counts ‘hits’ resulting from users browsing to the web archive through search engines or our pages on the TNA website) and ‘Recent’ (which counts ‘hits’ resulting from users being redirected into the web archive through an automated redirection component; see http://nationalarchives.gov.uk/information-management/policies/web-continuity.htm).

Right-click to save this dataset to your computer.

Comments (10)

  • Aylarja

    A number of the graphics above are missing because it appears that the URL is pointing to a development server, which is likely not accessible by the general public.

    The National Archives reply:

    Thanks very much for reporting this, everything should now be fully accessible.

  • Richard Boulton

    What licenses are the datasets mentioned above available for use under? I can think of many uses for the data, but without a license which permits reuse I’m a bit stuck.

    The National Archives reply:

    Thanks for your interest Richard. The data we have made available through Labs is licensed under the Open Government Licence.

  • The 2011 Census: History and Research for Liverpool (or, Why fill in the census? A historian’s perspective)

    [...] If you’re more technically minded, you might even make use of the National Archives’ Domesday on a Map, or its Domesday Places dataset. [...]

  • Marie

    Hi

    I’ve tried to access the Domesday tool but here’s the message I’ve got :

    gml:featureMembers>1North Benfleet

    sorry about that. keep up the good work!

  • Chris Willis

    Why is there a series (presumably several), with a range of references, but constant data, e.g. 267083,Felsted School,Auto-transferred from GeoKnowledgeBase,0.435533241379148,51.858308678426,NULL - 267095?

    The National Archives reply:

    Hi Chris, we need a bit more information in order to answer your question – which dataset are you referring to please?
    Thanks very much,

  • Chris Willis

    Apropos my previous comment. The data quoted is from TNAplaces.csv. It is far from the most serious instance. From memory the example quoted a range of 12 identifiers with identical data.

  • Barnaby

    Would it be possible to have more granular information down to the series level?
