
When using OPEX Incremental Ingest, please be aware of the following:

 

Folders with non-ASCII characters or question marks in their names, and their accompanying OPEX files, can be ingested using OPEX Incremental Ingest. Files with non-ASCII characters or question marks in their names, and their accompanying OPEX files, are not ingested. These files do not break the ingest: OPEX Incremental Ingest carries on after encountering them and ingests the other folders and files. We advise customers to avoid using these characters in file names.

 

Non-ASCII characters are any characters that fall outside the standard 7-bit ASCII character set, that is, anything beyond the basic English letters, digits and common punctuation. This includes accented letters and characters from languages such as Chinese, Japanese, Korean, Hebrew, Farsi and Arabic.

Examples of non-ASCII characters can be found below:

  • Accented letters like é, à, ö, ñ
  • Characters from non-Latin scripts like 漢 (Chinese), こんにちは (Japanese), or به متنی (Farsi)
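
As a rough illustration, a file name could be checked locally before it is placed in the ingest. This is only a minimal sketch: the function name `has_non_ascii_or_question_mark` is hypothetical and is not part of Preservica or OPEX Incremental Ingest. It relies on Python's built-in `str.isascii()` (Python 3.7+).

```python
def has_non_ascii_or_question_mark(name: str) -> bool:
    """Return True if the name contains a character outside 7-bit ASCII or a '?'.

    Hypothetical local pre-check, not a Preservica API.
    """
    return (not name.isascii()) or ("?" in name)


if __name__ == "__main__":
    # Example names only; replace with the file names you plan to ingest.
    for candidate in ["report.pdf", "café.txt", "漢.txt", "what?.txt"]:
        flagged = has_non_ascii_or_question_mark(candidate)
        print(candidate, "-> rename before ingest" if flagged else "-> ok")
```

Files flagged by a check like this could be renamed before the ingest so that they are not left out.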

 

If the name of any folder or file, or of its accompanying OPEX file, contains one of the following four characters, the entire OPEX Incremental Ingest fails, even if the other folders and files in the ingest have valid names. We advise customers not to use any of these characters in folder or file names:

  • : Colon
  • < Less than
  • > Greater than
  • | Vertical bar, Pipe
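
Because a single name containing one of these four characters fails the whole ingest, it may be worth scanning every folder and file name beforehand. The sketch below is an assumption-laden example rather than anything provided by Preservica: `FATAL_CHARS` and `fatal_chars_in_path` are hypothetical names, and paths are assumed to be relative and to use forward slashes.

```python
from pathlib import PurePosixPath

# The four characters listed above that fail the entire ingest.
FATAL_CHARS = set(":<>|")


def fatal_chars_in_path(relative_path: str) -> set:
    """Return the fatal characters found in any folder or file name of the path.

    Hypothetical local pre-check; assumes a relative, forward-slash path.
    """
    found = set()
    for part in PurePosixPath(relative_path).parts:
        found |= FATAL_CHARS.intersection(part)
    return found


if __name__ == "__main__":
    # Example paths only; replace with the paths you plan to ingest.
    for path in ["reports/summary.txt", "reports/2024:q1/summary.txt"]:
        bad = fatal_chars_in_path(path)
        print(path, "-> contains", sorted(bad) if bad else "no fatal characters")
```

An empty result for every path suggests that none of the four characters is present in the names being ingested.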

 

Files that contain any of these characters in their text or contents, rather than in their names, are ingested successfully using OPEX Incremental Ingest, as long as the folder and file naming restrictions described above are followed. Their contents can also be searched within the Preservica application once the post-ingest full text indexing task has completed.
