Hi community,

 

I'd like to find large web archive (WARC) files in our 5.11 EE repository, e.g. all those larger than 1 GB.

Alternatively, I'd like to sort all WARCs by file size.

The reason is that we are experimenting with how to handle the metadata extracted from objects embedded in WARCs.

The larger WARCs in particular currently slow down our repository, because all extracted metadata seems to be stored in a single (CLOB) database field.

Is this search/sort possible, and does anyone know how to do it?

 

Thanks,

Remco van Veenendaal

Digital Preservation Officer @ National Archives of the Netherlands

I don’t think this is possible through Preservica itself, but since you are on EoP you can probably do it with a Solr or database query.
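
To give an idea of what the Solr route might look like, here is a rough sketch that builds the parameters for a size-filtered, size-sorted query. Note that the field names (`file_size`, `format_name`), the format value `WARC`, and the core URL are all assumptions made for illustration; I don't know the actual Preservica Solr schema, so they would need to be replaced with whatever your index really uses. Only the range syntax (`field:[N TO *]`) and the `sort` parameter are standard Solr.

```python
from urllib.parse import urlencode

GIB = 1024 ** 3  # 1 GiB in bytes


def build_large_file_query(min_bytes=GIB,
                           size_field="file_size",      # hypothetical field name
                           format_field="format_name",  # hypothetical field name
                           format_value="WARC",         # hypothetical value
                           rows=100):
    """Return Solr select parameters for files of at least min_bytes, largest first."""
    return {
        "q": f"{format_field}:{format_value}",
        # Standard Solr range filter: size >= min_bytes, no upper bound.
        "fq": f"{size_field}:[{min_bytes} TO *]",
        # Sort descending so the largest files come first.
        "sort": f"{size_field} desc",
        "fl": f"id,{size_field}",
        "rows": rows,
    }


def build_url(base_url, params):
    """Append the URL-encoded parameters to the core's /select handler."""
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core URL; adjust host, port, and core name for your install.
    print(build_url("http://localhost:8983/solr/hypothetical_core",
                    build_large_file_query()))
```

Dropping the `fq` entry would give you the plain "sort all WARCs by size" variant. If file size turns out not to be indexed in Solr at all, the database query is probably the only option.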

