Research and Development

Hanzo’s continuous innovation in web archiving techniques and technology ensures that its customers always have access to the most effective, innovative methods of meeting regulatory and legal requirements.

Information technology, and the web in particular, changes continuously and often rapidly. As more corporate information migrates to web technologies, it becomes difficult to maintain continuous access to corporate memory, even during the active period of a document’s lifecycle. Moreover, the licensing and maintenance costs of sustaining a corporation’s digital memory keep increasing. To this end, Hanzo is engaged in finding solutions to the complex archiving issues arising from these developments.

At Hanzo, we work constantly to improve website capture, archive access and archival search. Our world-class engineers research and develop state-of-the-art technologies and products across a number of internal projects and external collaborative projects, including:

Living Web Archives

Living Web Archives is a three-year European project funded by the European Commission through the Seventh Research Framework Program.

The project is developing next-generation web archive technologies.

Web content plays an increasingly important role in the knowledge-based society, and the preservation and long-term accessibility of Web history have significant value. This is highly relevant to the preservation of commercial websites and social media, and to Hanzo’s commercial web archiving services.

The nature of websites, intranets and social media makes quality Web archiving a challenge: highly dynamic content, volatile content and links, a wide range of technical and software infrastructure, and a huge variety of file formats all conspire against it.

Hanzo is strictly focussed on commercial web archiving: creating native-format web archives for legal and compliance applications and for highly valuable corporate history. It must therefore meet this challenge, which is why Hanzo has contributed to LiWA, focussing on improving web archive crawling technology, streaming media archiving, and social media archiving.

More Information: Living Web Archives (LiWA)

World Wide Web of Humanities

The World Wide Web of Humanities project (Oxford Internet Institute and Hanzo Archives Ltd in the UK, and Internet Archive in the USA) was funded through a transatlantic collaboration between JISC and the US National Endowment for the Humanities (NEH), under the first JISC/NEH Transatlantic Digitization Collaboration Grants. The project was one of five digitisation projects to be awarded funding of around £600,000 ($1,150,000).

The World Wide Web of Humanities created a suite of open source tools for data collection and curation, to support new methodologies for Internet research built around large collections of web data, using automated tools to extract, index, and analyze the data. The collection was designed to help researchers and policy makers gain an understanding both of the state of the art of e-Humanities and of historical trends and developments in the field.

Hanzo developed experimental search and analytics software and APIs, which were used to provide access to, and visualisation of, a large web archive.

Longitudinal Analytics of Web Archive Data

Longitudinal Analytics of Web Archive data (LAWA) is a three-year European project funded by the European Commission through the Seventh Research Framework Program under the theme [ICT-2009.1.6] Future Internet experimental facility and experimentally-driven research.

To support innovative Future Internet applications, we need a deep understanding of Internet content characteristics (size, distribution, form, structure, evolution, dynamics). The LAWA project on Longitudinal Analytics of Web Archive data will build an Internet-based experimental testbed for large-scale data analytics. Its focus is on developing a sustainable infrastructure, scalable methods, and easily usable software tools for aggregating, querying, and analyzing heterogeneous data at Internet scale. Particular emphasis will be given to longitudinal analysis, along the time dimension, of Web data that has been crawled over extended time periods.

A Virtual Web Observatory will be created to support data-intensive experimentation with Web content analytics. A demonstrator is planned that will allow citizens at large to interactively browse, search, and explore born-digital content along the time dimension.

Hanzo is developing tools and APIs for its web archive to enable analytics and visualisation of the archives at very large scale. In addition to this development role, Hanzo will also collect websites and social media as a contribution to the Virtual Web Observatory.

More Information: Longitudinal Analytics of Web Archive Data (LAWA)

 


Last modified Jan/21/2011