Manifest file

Batch data integrations


Manifest files accompany every batch dataset pushed to the UDP for ingestion. They describe the contents of the full data dump, enabling the UDP to validate the completeness and integrity of the dataset before ingestion.

Manifest Version 1

Filename conventions

Manifest filenames must follow the convention <system>_daily_<date>.done. Here, <system> corresponds to the loading schema; for example, sis for SIS data or oc for Originality Checker data. <date> is the date only, in ISO 8601 format; e.g., 2019-01-01.

An example filename is: oc_daily_2019-01-15.done.
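
As an illustration of the convention, the following sketch splits a Version 1 manifest filename into its system and date parts. The function name parse_v1_manifest_name and the set of characters allowed in <system> are assumptions made for this example; the convention above only gives sis and oc as sample values.

```python
import re
from datetime import date

# Pattern for <system>_daily_<date>.done. Restricting <system> to lowercase
# letters and digits is an assumption, not something the convention states.
V1_NAME = re.compile(r"(?P<system>[a-z0-9]+)_daily_(?P<date>\d{4}-\d{2}-\d{2})\.done")


def parse_v1_manifest_name(filename: str) -> tuple[str, date]:
    """Return (system, date), or raise ValueError if the name does not match."""
    match = V1_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a v1 manifest filename: {filename!r}")
    # date.fromisoformat also rejects impossible dates such as 2019-02-30.
    return match.group("system"), date.fromisoformat(match.group("date"))


print(parse_v1_manifest_name("oc_daily_2019-01-15.done"))
# prints ('oc', datetime.date(2019, 1, 15))
```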

File contents format

Manifests are YAML files that must provide the following key-value pairs.

  # The version of the manifest file used
  # in this dataset.
  manifest_version: "v1"

  # The name of the system whose data is ingested,
  # where values may include 'peoplesoft', 'banner',
  # 'canvas-data', etc. depending on the application.
  source: "peoplesoft"

  # The UCDM version number this dataset's format
  # corresponds to.
  data_schema: "1.0"

  ## The ISO 8601 datetime stamp describing when
  ## the dataset was produced.
  ##
  ## The timezone must be UTC.
  datetime: "2017-11-19T21:11:31-0000"

  ## A UUID for the dataset
  dump_id: "b4f8eec7-7adc-47a1-83a4-238f1032da00"

  ## An array of the files included in the dataset.
  ## For each file, an MD5 checksum is provided.
  files:
  - name: "academic_term_2019-02-04.csv"
    checksum: "9c9e18230ff048f2837889f41e1faba2"
  - name: "course_offering_2019-02-04.csv"
    checksum: "64388cb12350d966fbdc37b9e6c02014"
  - name: "course_section_2019-02-04.csv"
    ...
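
The checksums in the files array allow each file to be verified before loading. The sketch below shows one way such a check could look; it is not the UDP's own validation logic. It assumes the manifest has already been parsed into a dictionary (for instance by a YAML parser) and that the data files sit in a single directory. The names md5_of and find_v1_problems, and the demo file names, are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_v1_problems(manifest: dict, data_dir: Path) -> list[str]:
    """List files that are missing or whose MD5 differs from the manifest."""
    problems = []
    for entry in manifest["files"]:
        path = data_dir / entry["name"]
        if not path.is_file():
            problems.append(f"missing: {entry['name']}")
        elif md5_of(path) != entry["checksum"].lower():
            problems.append(f"checksum mismatch: {entry['name']}")
    return problems


# Demo in a throwaway directory; file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "good.csv").write_bytes(b"id,name\n1,Fall\n")
    (data_dir / "tampered.csv").write_bytes(b"id,name\n2,Spring\n")
    manifest = {
        "files": [
            {"name": "good.csv", "checksum": md5_of(data_dir / "good.csv")},
            {"name": "tampered.csv", "checksum": "0" * 32},
            {"name": "absent.csv", "checksum": "0" * 32},
        ]
    }
    problems = find_v1_problems(manifest, data_dir)

print(problems)
# prints ['checksum mismatch: tampered.csv', 'missing: absent.csv']
```

An empty list means every listed file is present and matches its checksum. Files in the directory that are not listed in the manifest are not detected by this sketch.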


Manifest Version 2

Filename conventions

Manifest filenames must follow the convention <system>_daily.done. Here, <system> corresponds to the loading schema; for example, sis for SIS data or oc for Originality Checker data. Unlike Version 1, the filename carries no date; the date is inferred from the datetime field inside the file.

An example filename is: oc_daily.done.
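
Because the date is no longer in the filename, a consumer has to read it from the manifest's datetime field. How the UDP performs this inference is not documented here; the sketch below is one plausible approach, with the function name v2_dataset_date chosen for illustration. It also enforces the requirement that the timestamp be in UTC.

```python
from datetime import date, datetime, timedelta


def v2_dataset_date(datetime_field: str) -> date:
    """Derive the dataset date from a manifest datetime such as
    "2017-11-19T21:11:31-0000", rejecting non-UTC offsets."""
    stamp = datetime.strptime(datetime_field, "%Y-%m-%dT%H:%M:%S%z")
    if stamp.utcoffset() != timedelta(0):
        raise ValueError("manifest datetime must be in UTC")
    return stamp.date()


print(v2_dataset_date("2017-11-19T21:11:31-0000"))
# prints 2017-11-19
```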

File contents format

Manifests are YAML files that must provide the following key-value pairs.

  # The version of the manifest file used
  # in this dataset.
  manifest_version: "v2"

  # The name of the system whose data is ingested,
  # where values may include 'peoplesoft', 'banner',
  # 'canvas-data', etc. depending on the application.
  source: "peoplesoft"

  # The UCDM version number this dataset's format
  # corresponds to.
  data_schema: "1.0"

  ## The ISO 8601 datetime stamp describing when
  ## the dataset was produced.
  ##
  ## The timezone must be UTC.
  datetime: "2017-11-19T21:11:31-0000"

  ## A UUID for the dataset
  dump_id: "b4f8eec7-7adc-47a1-83a4-238f1032da00"

  ## An array of the files included in the dataset.
  ## For each file, an MD5 checksum is provided.
  files:
  - academic_term: "9c9e18230ff048f2837889f41e1faba2"
  - course_offering: "64388cb12350d966fbdc37b9e6c02014"
  - course_section: "656dfg45s1dfg564r64r51gr231sd89e"
    ...
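
In Version 2, each entry under files is a single-key mapping from a name to its MD5 checksum, rather than a name/checksum pair as in Version 1. The sketch below flattens such a list into one dictionary and checks that each value has the shape of an MD5 hex digest (32 hexadecimal characters). The function name flatten_v2_files is hypothetical, and the input is assumed to be the already-parsed files list. A placeholder value that is not valid hexadecimal would be rejected by this check.

```python
import re

# An MD5 digest in hex form is exactly 32 hexadecimal characters.
MD5_HEX = re.compile(r"[0-9a-fA-F]{32}")


def flatten_v2_files(files: list[dict]) -> dict[str, str]:
    """Turn [{name: checksum}, ...] into {name: checksum}, validating each entry."""
    checksums: dict[str, str] = {}
    for entry in files:
        if len(entry) != 1:
            raise ValueError(f"expected one key per entry, got {entry!r}")
        ((name, checksum),) = entry.items()
        if name in checksums:
            raise ValueError(f"duplicate entry: {name}")
        if not MD5_HEX.fullmatch(checksum):
            raise ValueError(f"{name}: not a 32-character hex MD5 digest")
        checksums[name] = checksum.lower()
    return checksums


files = [
    {"academic_term": "9c9e18230ff048f2837889f41e1faba2"},
    {"course_offering": "64388cb12350d966fbdc37b9e6c02014"},
]
print(flatten_v2_files(files))
```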