The e-discovery landscape is full of powerful tools for slicing and dicing your data. At first glance, a native format web archive, which captures the richness and depth of a dynamic and complex website, can seem a poor fit for tools designed to work with stand-alone data files such as emails and documents.
Not so with Hanzo’s Web Archive Connector. It’s designed to give you the best of both worlds. Web Archive Connector outputs a stream of ‘page’ documents that can then be read into a third-party solution. Each page document contains a raft of metadata, including additional metadata extracted from the page content (e.g. social media content pulled out into posts with authors and posting dates). Each page document also carries a textual representation of the page content and an alternative rendition of the page, such as a PDF, TIFF or PNG, with links back to the native format archive. This ensures complete preservation of the data, so none of the important context is lost.
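To make the idea of a ‘page’ document more concrete, here is a minimal sketch in Python of what one record in such a stream might look like, and how it could be written out as XML. This is an illustration only: the class name `PageDocument`, the function `page_to_xml`, every field name, and the XML element names are assumptions made for this example, not Web Archive Connector’s actual schema or API.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class PageDocument:
    """Hypothetical model of one 'page' document (not Hanzo's real schema)."""
    url: str
    captured_at: str      # capture timestamp, ISO 8601
    text: str             # textual representation of the page content
    rendition_path: str   # alternative rendition, e.g. a PDF, TIFF or PNG
    archive_link: str     # link back to the native format archive
    metadata: dict = field(default_factory=dict)  # extracted metadata


def page_to_xml(page: PageDocument) -> str:
    """Serialize a page document to an illustrative XML string."""
    root = ET.Element("page", {"url": page.url, "capturedAt": page.captured_at})
    meta = ET.SubElement(root, "metadata")
    # Sort the keys so the output is stable from run to run.
    for name, value in sorted(page.metadata.items()):
        ET.SubElement(meta, "field", {"name": name}).text = str(value)
    ET.SubElement(root, "text").text = page.text
    ET.SubElement(root, "rendition", {"href": page.rendition_path})
    ET.SubElement(root, "native", {"href": page.archive_link})
    return ET.tostring(root, encoding="unicode")


# Made-up sample values, for illustration only.
example = PageDocument(
    url="https://example.com/post/1",
    captured_at="2012-01-01T00:00:00Z",
    text="Hello",
    rendition_path="renditions/post-1.pdf",
    archive_link="archive/example-site#post-1",
    metadata={"author": "A. Writer", "posted": "2011-12-31"},
)
print(page_to_xml(example))
```

The point of the sketch is that each record is self-contained, carrying its metadata, text and a rendition, while the `native` link keeps it tied to the original archive, so a downstream tool can treat it like any other stand-alone document.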
Web Archive Connector uses a range of output formats, including XML, to connect to third-party information management systems, such as Symantec Enterprise Vault. I’m also pleased to announce that this product now supports the EDRM XML data exchange format, as a way of getting the Web Archive Connector page stream into e-discovery applications that support this format. This enables a wide range of software to work directly with Hanzo Web Archive content.