Today we’re announcing the launch of the CloudPreservation Public API. It’s designed to make it even easier for you to get web-accessible data into CloudPreservation. What’s particularly great about this API is how easy it makes taking fine-grained control over your web preservations, either through a handy browser-based bookmarklet tool, or by having your own development team programmatically add webpages to your feeds. Let’s take a look at two ways to use this new feature.
Keeping a website feed up to date
Say you run a website that provides lots of content and has lots of updates — naturally, you have a CloudPreservation instance pointing at it to track all the changes. To keep tabs on everything that’s happening on your site, you’ve probably got the crawl frequency of that CP instance cranked up as fast as it goes too.
Unfortunately, that’s pretty inefficient. Not every page changes in a given week, so crawling all of them to find the few that did wastes time. And if you make an important change on Monday, it won’t be preserved until the next crawl, which might not be until the following Sunday.
The CloudPreservation Public API makes it really easy to get these “between crawl” changes into your feed as they happen. Simply install the bookmarklet for your website feed by dragging it to your bookmarks bar (or right-clicking and choosing “Add Favorite” if you use Internet Explorer).
Then on every page you’d like to add or update, just click the bookmarklet and we’ll go fetch the newest version of it.
It really is that simple.
If you have a small development team available to you, you could even go one step further and integrate your content management system with our API. This would let you preserve copies of new or updated pages in CloudPreservation as they’re published or edited, in near real time.
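As a rough illustration of what such a CMS integration might look like, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the endpoint URL, the `feed_id` and `url` field names, and the bearer-token header are placeholders, not the actual CloudPreservation API. Check the API documentation for the real request format.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL is in the CloudPreservation API documentation.
API_ENDPOINT = "https://api.example.com/v1/preserve"


def build_preserve_request(feed_id: str, page_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request asking for one page to be fetched.

    The JSON field names and the Authorization scheme are assumptions.
    """
    payload = json.dumps({"feed_id": feed_id, "url": page_url}).encode("utf-8")
    return urllib.request.Request(
        API_ENDPOINT,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def on_page_published(feed_id: str, page_url: str, api_key: str) -> int:
    """Call this from a CMS publish/edit hook; returns the HTTP status code."""
    request = build_preserve_request(feed_id, page_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the request construction separate from the network call makes the hook easy to test without contacting any server.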
Storing only the pages you want to store
This new ability to fetch and preserve single pages also lends itself to a new type of feed, which we’re calling a Public API/Bookmarklet Feed. This feed takes in only the pages you specifically send it through the API or the bookmarklet.
Let’s say your company just launched an amazing new feature that’s being covered by all the major news outlets. The world is buzzing about your product and you want to preserve what they’re saying. Simply set up a Public API/Bookmarklet Feed — in this case we’ll call it “Launch Buzz” — and install its bookmarklet. Then, browse to any of the articles that you want to preserve and click the bookmarklet. CloudPreservation will see your request, and preserve a copy of that page in the “Launch Buzz” feed.
Public API/Bookmarklet feeds are fantastic for preserving this sort of research, and for any other situation where you want to keep track of a collection of very specific web pages without crawling and storing entire websites. Collecting and preserving single webpages has never been easier.
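If you are collecting pages programmatically rather than with the bookmarklet, it may help to tidy the list of URLs before sending it. The sketch below is one possible approach in Python: it drops exact duplicates and anything that is not an http or https address. The function name and the `feed`/`url` record shape are hypothetical and only illustrate the idea. They are not part of the CloudPreservation API.

```python
from urllib.parse import urlsplit


def plan_submissions(feed_name: str, urls: list[str]) -> list[dict[str, str]]:
    """Return one submission record per unique, well-formed http(s) URL.

    The record layout here is an illustrative assumption.
    """
    seen: set[str] = set()
    planned: list[dict[str, str]] = []
    for raw in urls:
        url = raw.strip()
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip anything that is not a web address
        if url in seen:
            continue  # skip exact duplicates
        seen.add(url)
        planned.append({"feed": feed_name, "url": url})
    return planned
```

For example, calling `plan_submissions("Launch Buzz", article_urls)` would give you a clean list to loop over when submitting each article to the feed.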
More information
Bookmarklets are available today for all webpage and Public API/Bookmarklet feeds. Look for them on your feeds listing page, along with instructions to get you started.
For the more programmatically inclined, the public API and its associated documentation are also available starting today. The documentation contains examples of how to send us webpages in various programming languages, as well as instructions on how to move beyond those examples to build your own custom solutions.
We think the API is going to be a great tool in your preservation arsenal. As always, we love hearing your feedback. Feel free to get in touch with us if you have any comments or questions.








