
WARC (Web ARChive) is a file format used to store ‘web crawls’ as sequences of content blocks harvested from the web. Third-party applications are available that can crawl websites to capture this content. The steps below explain how to ingest WARC files.
1.1    Log in to Starter using your email address and password.
The Home screen displays.
1.2    Navigate to the folder or sub-folder where you want to upload the content and select it.
1.3    Select the WARC file(s) you want to upload and drag and drop them onto the main window. You can select multiple files or a zip file.
Note: If the WARC files are in a folder, use the Add > Upload a folder option.

1.4    The upload begins; you can follow its progress in the bottom right-hand corner. An informational message displays, advising that you are ingesting a WARC file and that, once the ingest has completed, you must enter the URL of the corresponding website in the URL field on the Metadata Properties page (see below).
Note: The upload may take some time, depending on the number and size of the files.

1.5    Click ‘OK, continue’. To avoid being prompted again, select the ‘Don’t show me this message again’ checkbox.
1.6    When the upload completes, Preservation begins automatically.

Note: By default, the folder and asset have the access view setting Private and therefore do not display on the Access Portal. If you want the folder and/or asset(s) to display, change the access view setting to Public by right-clicking the folder or asset(s) and choosing Public.
1.7    When the asset is visible, click on it. A reminder displays that you must add the URL.
Note: If you don’t enter the URL, the WARC file will not render and the website will not display.

1.8    Click ‘OK, continue’, then click the pen icon in the highlighted URL field.
1.9    Type in the full URL of the website that corresponds to the WARC file, then click the tick icon to save.

1.10    To display the preserved website, click on the asset and navigate around the site.
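
If you are unsure which URL corresponds to a WARC file (step 1.9), the file itself can give a hint: each WARC record begins with a block of header lines, and records of type response carry the address of the captured resource in a WARC-Target-URI header. The sketch below is not part of Starter; it is a minimal, standard-library-only Python illustration that counts the sites appearing in those headers. The function names (open_warc, iter_record_headers, count_sites, likely_site_urls) and the file name in the usage line are hypothetical, and the parsing assumes a well-formed WARC file, optionally gzip-compressed.

```python
import gzip
from collections import Counter
from urllib.parse import urlsplit


def open_warc(path):
    """Open a WARC file for reading, handling gzip (.warc.gz) transparently."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":  # gzip magic bytes
        return gzip.open(path, "rb")
    return open(path, "rb")


def iter_record_headers(stream):
    """Yield a dict of lower-cased header fields for each WARC record."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.startswith(b"WARC/"):
            continue  # blank separator lines between records
        headers = {}
        for raw in iter(stream.readline, b""):
            raw = raw.strip()
            if not raw:
                break  # an empty line ends the header section
            name, _, value = raw.partition(b":")
            key = name.decode("utf-8", "replace").strip().lower()
            headers[key] = value.decode("utf-8", "replace").strip()
        # Skip the content block so payload bytes are never read as headers.
        stream.read(int(headers.get("content-length", 0)))
        yield headers


def count_sites(stream):
    """Count scheme-plus-host prefixes across all 'response' records."""
    sites = Counter()
    for headers in iter_record_headers(stream):
        if headers.get("warc-type") != "response":
            continue
        # Older WARC files may wrap the URI in angle brackets.
        uri = headers.get("warc-target-uri", "").strip("<>")
        parts = urlsplit(uri)
        if parts.scheme in ("http", "https") and parts.netloc:
            sites[f"{parts.scheme}://{parts.netloc}/"] += 1
    return sites


def likely_site_urls(path, top=3):
    """Return the most frequently captured sites in the WARC file at 'path'."""
    with open_warc(path) as stream:
        return count_sites(stream).most_common(top)
```

For example, `likely_site_urls("crawl.warc.gz")` returns a list of (site, count) pairs, most frequent first. Treat the result only as a starting point: the URL you enter in the URL field should be the full address of the crawled website, which may include a path that this sketch does not report.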



