Web content is constantly evolving, and there is a growing need to preserve and manage it properly. This resource answers the questions companies most frequently ask about web content preservation.
Essentially, Hanzo’s software visits a website and collects what a person would see, and we store the content exactly as it was delivered from the target site. We navigate through the site, so we get all of the web pages and related content you need, including text and metadata from each page. We also make a PDF of each page and store that alongside the native web content.
Once the collection is complete, we make a working replica (links work, videos play, etc.) of the site available so you can see how the site performed when it was live. We also create exports using the PDFs of each page and the native content.
Any website, message forum or blog, plus:
ISO 28500 is the standard for web content collection and preservation. It was designed by the IIPC, an international body of experts in digital preservation whose members include national archives and libraries such as The National Archives of the UK and the Library of Congress. It specifies a methodology for collection and a storage format called a WARC file.
No matter what your reason for capturing web content, there are two things you don’t want:
ISO 28500 WARCs make sure you avoid both of those issues. Virtually all other web capture methods are susceptible to those problems, and that’s what Hanzo wants to avoid for our clients.
A WARC file is an industry-standard format for storing collected web content and associated data. A WARC file is a container that provides structure to the data for processing, indexing and access. More importantly, a WARC file will preserve original web content exactly as it was delivered from the target site. It contains all of the metadata that allows a forensic examiner to verify the integrity of captured web content.
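To make the container idea concrete, here is a minimal sketch of a single WARC "response" record built with only the Python standard library. It is an illustration, not Hanzo's implementation: the function names are hypothetical, and a production archive would normally use a dedicated WARC library and record more fields. The header names (WARC-Type, WARC-Record-ID, WARC-Date, WARC-Target-URI, WARC-Payload-Digest, Content-Length) come from the WARC format itself. The stored payload digest is what lets an examiner later confirm the content has not changed.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def payload_digest(payload: bytes) -> str:
    """Return a SHA-1 digest in the 'sha1:<base32>' form commonly seen in WARC files."""
    return "sha1:" + base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")


def build_warc_response(target_uri: str, http_response: bytes) -> bytes:
    """Wrap raw HTTP response bytes (hypothetical input) in one WARC 'response' record."""
    # The payload is everything after the blank line that ends the HTTP headers.
    _, _, payload = http_response.partition(b"\r\n\r\n")
    headers = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')}",
        f"WARC-Target-URI: {target_uri}",
        "Content-Type: application/http; msgtype=response",
        f"WARC-Payload-Digest: {payload_digest(payload)}",
        f"Content-Length: {len(http_response)}",
    ]
    header_block = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # A record is its header block, the content block, then two CRLFs.
    return header_block + http_response + b"\r\n\r\n"


def verify_record(record: bytes) -> bool:
    """Recompute the payload digest and compare it with the one stored in the record."""
    header_block, _, rest = record.partition(b"\r\n\r\n")
    fields = dict(
        line.split(": ", 1) for line in header_block.decode("utf-8").split("\r\n")[1:]
    )
    http_response = rest[: int(fields["Content-Length"])]
    _, _, payload = http_response.partition(b"\r\n\r\n")
    return fields["WARC-Payload-Digest"] == payload_digest(payload)
```

Calling verify_record on an untouched record returns True; if even one byte of the stored page is altered afterwards, the recomputed digest no longer matches and the check returns False.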
There is a huge amount of web technology that makes it easy for people to use websites, but also makes it very difficult for many capture tools (except Hanzo, of course) to capture. Essentially, it is content that requires interaction with a web page; think drop-down list selections, mouse-overs, pop-ups, multimedia, etc.
Proper preservation methods are critical in avoiding spoliation of web evidence. It is vital that preserved web content be sealed off from any live web content so that the risk of alterations or changes to the original content is eliminated. Be sure to check with your provider to make sure they’re following proper preservation methods for web content.
Both. Hanzo provides options to use the software as a service (SaaS) and under a license (on-premise).
Hanzo uses Amazon Web Services, which gives us unmatched reliability and scalability.
The short answer is yes, Hanzo can capture virtually anything you can see in a browser. Want to give us a test? Show us the hardest, most complex content on your site. We’ll show you how Hanzo’s technology can give you the most complete, accurate and defensible captures available.
Hanzo provides a number of viewing options, including native format, where you can view the site just as it appeared when it was live online, plus a variety of other export formats, including offline working replicas of the captured sites.
You have a number of options, ranging from exported PDFs, which are always instantly available with Hanzo, to e-discovery industry-standard load files and a variety of native format productions.
Not sure how big the site is that you want to capture? No problem. Hanzo uses a number of tools to provide our clients accurate page counts, and our experience across thousands of web capture projects helps clients make sure they’re getting the correct capture scope in place.
As many as you want. Hanzo doesn’t charge for users.
Hanzo stores content as long as you need it. Our clients set the retention schedule to meet regulatory requirements or litigation needs. For many clients in the financial services industry, the retention period is seven years.
In general, no. Hanzo looks like a user on your website, so we impact the performance of the site like any other user would. Hanzo’s professional services team works with our clients to make sure we have the smallest footprint.
Yes. Many of our clients opt to have Hanzo run in overnight hours when usage of the website is lowest.
No. Hanzo’s professional services team uses a variety of techniques to make sure Hanzo isn’t impacting our clients’ site analytics at all. When you’ve done as many website captures as we have, this kind of attention to detail comes naturally.
Yes. It takes some serious sophistication to perform accurate, defensible captures behind a login, and the good news is that Hanzo does logged-in captures all the time.
Yes. And you can control the content that’s available to each user. You don’t need a support call to Hanzo to add users or manage permissions.
Yes. You can manage users through Hanzo’s admin features in the app. You don’t need a support call to Hanzo to add users or manage permissions. Plus, ask Hanzo about LDAP and SAML integration.
Yes. Hanzo provides LDAP and SAML integration to support easier user management for many customers.
Yes. Hanzo supports retention and records management, including exceptions for legal holds, plus reporting and notifications, such as upcoming records due for disposition.
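The interaction between a retention schedule, a legal hold and a disposition notice can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Hanzo's actual logic: the record fields (id, captured_on, retention_years, legal_hold) and the 30-day notice window are hypothetical.

```python
from datetime import date


def disposition_date(captured_on: date, retention_years: int) -> date:
    """Date on which a record reaches the end of its retention period."""
    try:
        return captured_on.replace(year=captured_on.year + retention_years)
    except ValueError:
        # Captured on Feb 29 and the target year is not a leap year.
        return captured_on.replace(year=captured_on.year + retention_years, day=28)


def due_for_disposition(records, today: date, notice_days: int = 30):
    """List (id, date) pairs for records due within the notice window.

    Records under legal hold are skipped: the hold overrides the schedule.
    """
    due = []
    for record in records:
        if record["legal_hold"]:
            continue
        when = disposition_date(record["captured_on"], record["retention_years"])
        if (when - today).days <= notice_days:
            due.append((record["id"], when))
    return due


records = [
    {"id": "A", "captured_on": date(2017, 3, 1), "retention_years": 7, "legal_hold": False},
    {"id": "B", "captured_on": date(2017, 3, 1), "retention_years": 7, "legal_hold": True},
    {"id": "C", "captured_on": date(2020, 6, 1), "retention_years": 7, "legal_hold": False},
]
upcoming = due_for_disposition(records, today=date(2024, 2, 15))
```

Here only record A appears in the notice list: B has the same dates but is on legal hold, and C is still years away from the end of its retention period.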
Yes. Interactive elements, like mouse-overs, image carousels, drop-down lists and pop-ups, will play back like the original, as will all the links on the site, including video and other multimedia content.
No, not if you’re using Hanzo as a service. You can view content using a browser (Chrome, Firefox or Safari – we don’t recommend Internet Explorer). You can also download our viewer app, which many customers find easier.
If you’re using the on-premise instance, Hanzo is software running on your organization’s network.
It’s easy. You can add new sites through the Hanzo app, or our support team can add the sites for you. We do the heavy lifting for you.
Hanzo provides a variety of analytics and reports to help customers create a detailed picture of their web portfolios. You can monitor changes to sites, including text and image changes, plus quickly pinpoint critical items within your web content, like external links or forms. Additionally, you’ll receive a full suite of reports for compliance support.
Yes, although you don’t need much training at all to use Hanzo. We do the heavy lifting behind the scenes so you get a clean, easy-to-use app.
For most users, Hanzo requires little to no training. Of course, admin and engineering users will get much more training.
Yes. Hanzo stores all captured web content in WORM storage.
Yes. These letters are standard parts of the Hanzo agreement.
Yes. Hanzo has provided dozens of affidavits and declarations, and has been called on to testify as an expert on numerous occasions.
Yes. Hanzo can trigger geolocation content so you can see how the site appeared to someone in a specific location.
Hanzo uses a variety of methods to trigger all kinds of personalization characteristics of websites, including things like browser history triggers, preferences and A/B site direction.
Hanzo generally captures sites including what we call +1 hop, meaning we will follow all links outside the target site to one hop away, and then stop. You control the number of hops, and, therefore, how much content you want to include in the capture.
You control the number of hops and how many levels deep you want to go. Hanzo generally recommends +1 hop. For example, if a Facebook profile has links in posts or comments to sites outside of Facebook, Hanzo will follow each link and capture the resulting web page, but no links from that web page.
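The hop rule described above can be sketched as a small breadth-first walk over a link graph. This is an illustrative model, not Hanzo's crawler: the links dictionary is a hypothetical stand-in for fetching pages and extracting their links, and "on the target site" is simplified to "same host name".

```python
from collections import deque
from urllib.parse import urlparse


def capture_scope(start_url, links, max_hops=1):
    """Return {url: hops} for every page that falls inside the capture scope.

    Pages on the target site are hop 0. Every link that leaves the target
    site, or that starts from an off-site page, costs one hop.
    """
    target_host = urlparse(start_url).netloc
    hops_by_url = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        hops = hops_by_url[url]
        for link in links.get(url, []):
            on_target = urlparse(link).netloc == target_host
            # Staying on the target site from an on-site page is free.
            next_hops = hops if (on_target and hops == 0) else hops + 1
            if next_hops > max_hops:
                continue  # beyond the configured hop limit, so not captured
            if link not in hops_by_url or next_hops < hops_by_url[link]:
                hops_by_url[link] = next_hops
                queue.append(link)
    return hops_by_url


links = {
    "https://example.com/": ["https://example.com/about", "https://other.example.org/post"],
    "https://example.com/about": ["https://example.com/"],
    "https://other.example.org/post": ["https://third.example.net/page"],
}
scope = capture_scope("https://example.com/", links, max_hops=1)
```

With max_hops=1 the external post is captured, but the page it links to is not. Raising max_hops to 2 would bring that further page into scope.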
Yes. Hanzo provides a Relativity .DAT file as a standard part of every capture. You can load content into virtually any e-discovery review platform.
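For readers who have not seen one, a .DAT load file is a delimited text file with one row per document. The sketch below writes one using the delimiters that Concordance-style load files conventionally default to. The field names (DocID, URL, CaptureDate) and the sample row are hypothetical, and the fields and delimiters in an actual Hanzo export may differ.

```python
FIELD_SEP = "\x14"   # ASCII 20, the conventional column delimiter
QUOTE = "\u00fe"     # þ (ASCII 254), the conventional text qualifier
NEWLINE = "\u00ae"   # ® (ASCII 174), conventional stand-in for line breaks in a field


def dat_line(values):
    """Format one row: each value wrapped in the qualifier, joined by the delimiter."""
    cleaned = (str(v).replace("\r\n", NEWLINE).replace("\n", NEWLINE) for v in values)
    return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in cleaned)


def build_dat(fields, rows):
    """Build the load file text: a header row, then one row per document."""
    lines = [dat_line(fields)]
    lines += [dat_line([row.get(name, "") for name in fields]) for row in rows]
    return "\r\n".join(lines) + "\r\n"


fields = ["DocID", "URL", "CaptureDate"]
rows = [{"DocID": "DOC0001", "URL": "https://example.com/", "CaptureDate": "2020-01-01"}]
dat_text = build_dat(fields, rows)
```

The resulting text can be written to disk with an encoding the review platform expects; which encoding that is depends on the platform's import settings.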
No, it is much more than a screenshot. With Hanzo, any captured web content looks and works like it did when it was live on the web. Screenshots give you no interactivity, and they miss critical content on modern web pages.
Yes. Captures can be customized in a variety of ways. The methods fall into these general categories: