Approaches to Soliciting Opinions for Institutional Responses to Formal Consultations

One of the things we didn’t put into the original JISCPress bid – though in hindsight we might have – was a use case for commentable documents in the context of government consultations soliciting formal responses from higher education institutions (for example, Universities UK: Review of External Examining Arrangements in the UK).

From a chat with Alison Nash in the OU’s recently reorganised Strategy Unit (I think?), it seems that candidate consultations are fielded by a member of that unit, who then emails likely suspects (identified quite how, I’m not sure) with either a link to, or a copy of, the consultation document; these are typically Word or PDF documents. As with many of the consultations we have looked at in the context of Write To Reply, they typically have a set of questions associated with them that are distributed throughout the consultation document as a whole. Comments and responses to questions are then returned by email (I didn’t ask whether this is typically in the body of an email message, in a Word document, or as comments or highlighted changes on a copy of the original consultation), collated (again, I’m not sure how; one way would be to use a spreadsheet, with rows for respondents and columns for each question, or vice versa), and used to frame the institutional response. (I’m not sure whether a draft of the institutional response is then circulated to the original commenters for final comment…?) The question that was then asked was: would a WriteToReply style approach be appropriate for managing returns of comments and answers to consultation questions in a rather more organised way than is currently the case?

(If anyone from the OU, or other HEIs that engage in producing formal institutional responses to consultations, would like to provide further detail about the workflow for soliciting internal comments, producing draft and final versions of institutional responses, and then tracking the impact of comments made in the response, please post a comment to this post…)

Here are some thoughts/matters arising relating to how the WriteToReply/JISCPress/digress.it approach might apply:

– comments may need to be private; this could be achieved by hosting WordPress within the firewall, by limiting who can view comments to members of the institution, or by not making comments public (e.g. by moderating them, meaning that only the blog owner could see them). Limiting who can make comments can be achieved by requiring users to log in to the blog, and only providing certain users with login accounts.

– it may not be appropriate to allow commenting on all paragraphs, instead requiring users to comment only on actual questions. This might be achieved by disabling comments on all pages except a single summary page that contains one paragraph per question, maybe with links back to the actual posts that contain each question in context.

– if comments are solicited throughout the document, a dashboard tool such as Netvibes can be used to aggregate comments from different sections of the document; tools like Yahoo Pipes can also be used to aggregate comments from separate areas of the document and display them in a single view. Views over comments by individual commenters are also available and may be collated together on dashboard pages (for example, with separate pages aggregating comments from different sorts of commenter – allowing views over responses by Faculty, say).

– once a formal response has been produced, it may be appropriate to post it on the consultation site to allow commenters to see how their comments were or weren’t integrated into the official response (maybe leaving it open to them to submit a personal response to the consultation if they feel their views were not appropriately reflected, if at all). The more I think about the process of these document based consultations, the more I feel a feedback loop is required that allows folk to see what sort of impact, if any, their comments may have had. (I also briefly touched on this in On the Different Roles Documents and Comments May Take in a Commentable Document.) The consultation document site then becomes an important part of institutional memory, archiving as it does the original consultation, individual comments from members of the institution, and the institution’s formal response. It might also be the case that a draft of the institutional response is placed on the same site and comments on it solicited. (The site would then be hosting documents in two modes – the original consultation mode document, and then a draft mode document; again, this distinction appears in the Different Roles blog post.)

In many cases, it might be that the paragraph level commenting approach is not appropriate – unless comments are limited to just the consultation questions themselves, each as a separate commentable item. Where it is appropriate to isolate consultation questions from the surrounding text, a simple form may provide the best way of capturing comments.

In the OU, where I believe we are about to start rolling out Google Apps for Education to at least some of our students over the next month or two, it might be appropriate to look at using a Google form as a platform for capturing comments. As well as satisfying the immediate goal (capturing comments in a centralised way), this approach would also provide a legitimate and low risk use case for exploring how we might make use of the Google Apps environment as part of internal business processes.

The simplest case, then, would be for the internal staff member responsible for gathering comments to create a Google form. I don’t know if internal staff members have yet been issued with login details for accessing Google Apps on the open.ac.uk domain, but in the interim they can either create a personal Google account (or I could let them have an account on one of my Google Apps domains!). Creating a form can be done either from the main docs menu, or from within a Google spreadsheet (the posted form results are collated within a spreadsheet).

[Image: Google docs - create new form]

For most consultations based around a set of specific questions, the format of the form would look something like this:

[Image: Creating a Google form to field consultation question replies]

That is, a copied and pasted version of each consultation question (with minor tweaks so the question makes sense in a standalone questionnaire) as a separate form item, with a Paragraph text element for the response.

If additional commentary is required, the section head (which includes a description component) can be used to display it:

[Image: Google form - section head]

It might also be worth capturing “any other comments” in a final paragraph text comment at the end of the questionnaire.
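
By way of illustration, here’s a minimal Google Apps Script sketch of how such a form could be built programmatically (the form title, section header and question text are placeholder inventions of mine; FormApp is the standard Apps Script forms service):

function createConsultationForm() {
  // Create the form and add some framing text
  var form = FormApp.create('Consultation response form');
  form.setDescription('Please respond to the questions below.');

  // A section header can carry any additional commentary
  form.addSectionHeaderItem().setTitle('Background')
      .setHelpText('Context copied over from the consultation document.');

  // One paragraph-text item per (lightly tweaked) consultation question
  var questions = [
    'Q1: First consultation question, reworded to stand alone',
    'Q2: Second consultation question, reworded to stand alone'
  ];
  for (var i = 0; i < questions.length; i++) {
    form.addParagraphTextItem().setTitle(questions[i]);
  }

  // Catch-all comments box at the end
  form.addParagraphTextItem().setTitle('Any other comments?');

  Logger.log('Form link to circulate: ' + form.getPublishedUrl());
}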

Although the form, once published, would be open to anyone on the apps domain (if they knew the URL), a further “security” measure would be to prompt the user for a consultation “pass phrase” emailed to them as part of the request for comments (“please enter this keyphrase when you complete the form so we can put your responses into the class of ‘high priority’ responses” ;-). This might even be a required element.

Alternatively, a keyphrase element could be used to sort the responses in the results spreadsheet or, as suggested above in the context of digress.it, to sort responses by Faculty, for example. (Alternatively, an optional unique key code could be generated for each invited respondent to identify their responses. Or we could request an OU identifier, name, email address etc to track who made what comment (though these approaches are gameable and don’t necessarily imply that the person with a given identifier is the person who submitted the form…).) If users are logged in to the Google Apps environment, it may be that their identity is recorded anyway…? Hmm….

[Image: Google form responses]

For just collecting responses, pretty much anyone could set up the form and then email the link to the potential commenters. With the availability of Google Apps Script, and a little bit of developer time, it would also be possible to provide alerts to the internal consultation organiser whenever a form submission is made, provide automated collation of responses by question and pop these into a Google wordprocessor doc (I think…?!), and even manage a circulation list – so for example, a list of respondents could be created in a spreadsheet, used to mail out invitations to complete the form, and then used to track responses. In the event that someone doesn’t respond within a certain period, an automated reminder could be sent out. (I’m guessing it would take about a day to build and test such a workflow, which once created would be reusable.)
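
As a sketch of how that workflow might hang together (untested, and the function names, sheet layout and email addresses are all my own invention), a spreadsheet-bound Apps Script could email the organiser on each submission and chase up non-respondents:

// Run once, with the script bound to the form's response spreadsheet,
// to wire up a form submission trigger
function installTrigger() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  ScriptApp.newTrigger('notifyOrganiser').forSpreadsheet(ss).onFormSubmit().create();
}

// Alert the internal consultation organiser whenever a response arrives
function notifyOrganiser(e) {
  MailApp.sendEmail('organiser@example.ac.uk', 'New consultation response',
      'Submitted answers:\n' + JSON.stringify(e.namedValues));
}

// Remind anyone on a hypothetical 'Circulation' sheet (columns: email, responded)
// who hasn't yet sent in a response
function sendReminders() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Circulation');
  var rows = sheet.getDataRange().getValues();
  for (var i = 1; i < rows.length; i++) { // skip the header row
    if (rows[i][1] !== 'yes') {
      MailApp.sendEmail(rows[i][0], 'Reminder: consultation response due',
          'A gentle nudge: please complete the consultation form.');
    }
  }
}

The sendReminders function could itself be run from a time-driven trigger to implement the automated reminder after a set period.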

Another advantage of using the Google Apps approach would be that the response spreadsheet (or an automagically maintained Google wordprocessor doc version of it) could be shared to other members of the team providing the formal institutional response as an online shared document appearing in each individual’s Google docs “inbox”.

PS it seems that within a Google Apps for Edu environment, it may now also be possible for users to edit their form responses if they want to revise their answers…

PPS it’s also worth noting here a couple of practical considerations about how to write a consultation document, bearing in mind that someone might put together a form to collate the responses. Firstly, each question should make sense as a standalone item (i.e. out of context) or very clearly identify what it is referring to, rather than just “the above”. Secondly, if the questions are collated together in a single appendix, it makes it easier to check off that each question has been included in the form. (It’s also handy as a one page item for someone who is putting together the response.) Links to the original context also help (in a sense, this sort of Appendix is like a “List of Tables” or “List of Figures” that acts as a contents page for locating questions within the document). Reading over the questions in an Appendix will also make it obvious whether or not a question was written in such a way that it implicitly refers to content surrounding it in the original embedded context (“see the above” again…). Note that I’m not saying questions shouldn’t be embedded, just that when they are taken out of context, they should still make sense and read well. In the example I give above about external examiners, the questions had to be tweaked so that they made sense as standalone items.

On the Different Roles Documents and Comments May Take in a Commentable Document

Chatting over possible use cases for digress.it in a meeting at UKOLN yesterday, it struck me that there are at least three different roles we might expect a commentable document to play in an open discussion context:

  • draft document – in which comments are solicited on different parts of a document, with a view to producing a revised version of the document that takes into account the various comments made on the commentable version of it. For example, the publication of draft standards (e.g. British Standards – commentable drafts) or draft policy documents (e.g. Leicester University social media policy). Users may be able to see the consequences of comments by comparing final versions of the document with the original commentable version, and the comments associated with it.
  • consultation document – in which issues are discussed and a series of consultation questions asked, often embedded within the various sections of the body of the document. For example, HEFCE REF Consultation. If a summary of responses is provided around the consultation, along with a review of what actions were taken that relate back to the consultation questions, commenters will be able to judge whether or not their comments appeared to influence the direction of post consultation outcomes.
  • guidance document – in which comments may be made around guidance, either requesting or providing clarification of particular points, or collecting examples of how others have practically implemented the guidance. For example, COI Guidance on open source software. This sort of document can act as a hub for aggregating practical advice on implementing guidance. In contrast to the previous two document/comment types, the comments themselves can become a means of sharing practical advice around the guidelines, and may effectively deliver practical guidance themselves. The outcomes of requests for clarification may also be trackable, if for example they result in revisions to the original guidance, or indeed if they result in a further comment that clarifies the matter; (in this case, we might see clarifying comments as providing a similar role as do comments on a draft document?).

Combining elements of all three types listed above, we might also consider an amplification, or discussion, document, such as documents published in support of a meeting (“meetings without borders”, or “semi-permeable meetings”; for example Using WriteToReply to Publish Committee Papers. Is an Active Role for WTR in Meetings Also Possible?). [Added: I guess that educational materials might also be regarded as discussion documents?] Rather than being the focus of a conversation, these documents are part of an ongoing process, or conversation, where comments raised may either be seen as a continuation of a discussion held in a meeting, or as part of a conversation that may be continued in a follow-on meeting. Feedback to commenters about how comments are received may be realised through mentions of matters raised in comments appearing in the minutes of later meetings (which may even reference back to the original comment).

Looking at these various document types, it seems to me that commenters can look for evidence in later, follow-on documents about the extent to which their comments may or may not have directly influenced the content of those documents, and that those later documents in turn have the opportunity to link directly back to the comments that influenced them.

If a consultation platform can start to highlight the impact comments may have on practice or policy development through appropriate feedback, such as “follow-on feedback” (i.e. the demonstration of how a comment on one document influenced the content of another), then it feels right to me that it is more likely that people will start to see it as a tool that supports “real” involvement in a process?

PS Seems like I’m too late to add this distinction in as feedback to the COI draft guidance on commentable docs.

Paragraph Level Search Results on WordPress Using Digress.it and Yahoo Pipes

One of the many RSS related feature requests I put in when we were working on the JISCPress project was the ability to get a page level RSS feed out in which each paragraph is represented as a separate item in the page feed.

WordPress already delivers a single item RSS feed for each page containing just the substantive content of the page (i.e. the content without the header, footer and sidebar fluff), which means you can do things like this, but what I wanted was for the paragraphs on each page to be atomised as separate feed elements.

Eddie implemented support for this, but I didn’t do anything with it at the time, so here’s an example of just why I thought it might be handy – paragraph level search.

At the moment, searching a document on WriteToReply returns page level results – that is, you get a list of search results detailing the pages on which the search term(s) appear. As you might expect with WordPress, we can get access to these results as a feed by shoving feed in the URI, like this:
http://ouseful.wordpress.com/feed?s=test

Paragraph level feeds, as implemented in the Digress.it WordPress theme we were developing, are keyed by URLs of the form:
http://writetoreply.org/legaldeposit/feed/paragraphlevel/annex-c-online-content-to-be-published/#56

That is:

http://writetoreply.org/DOCNAME/feed/paragraphlevel/PAGENAME/#PARA_NUMBER

So can you guess what I’m gonna do yet…?

First of all, grab the search feed for a particular query on a particular document into a Yahoo Pipe:

Rewrite the URI of each page linked to in the results feed as the full fat, itemised paragraph feed for the page, and emit those items (that is, replace each original search results item with the set of paragraph items from that page).

The next step is to filter those paragraph feed items for just the paragraphs that contain the original search terms:

We need to rewrite the link because (at the time of writing) the page paragraphs feed doesn’t link to each paragraph; it links to the parent page (a bug report has been made;-)

You can find the pipe here: Double dip JISCPress search

Note that at the time of writing, there’s also a problem with the paragraph number reported in the link (again a report has been made), a workaround patch for which is included in this pipe.

What this means is that we now have a workaround for indexing into individual paragraphs using a search term. If we tag content at the paragraph level (e.g. by running a page-level paragraph feed, or a double dip search results feed, through OpenCalais), we can generate related search links into the document, or other documents on the platform, at a paragraph level, increasing the relevance, or resolution (in terms of increased focus), of the returned results.

Just by the by, the approach shown above is based on a search, expand and filter pattern (cf. a search within results pattern), in which a search query is used to obtain an initial set of results, which are then expanded to give higher resolution detail over the content, and then filtered using the original search query to deliver the final results. If a patent doesn’t already exist for this, then if I worked for Google, Yahoo, etc etc you could imagine it being patented. B*****ds.
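
Outside of Yahoo Pipes, the same pattern is easy enough to sketch in JavaScript (a Node-flavoured sketch using the global fetch, with a deliberately naive regex standing in for a proper RSS parser; the URL patterns follow the ones described above):

// Search–expand–filter: page-level search results are expanded into
// paragraph feeds, then filtered back down to matching paragraphs
async function paragraphSearch(docBase, query) {
  // 1. Search: fetch the page-level search results feed
  var searchFeed = await (await fetch(docBase + '/feed?s=' + encodeURIComponent(query))).text();

  // 2. Expand: rewrite each page link as its paragraph-level feed and fetch it
  var links = [...searchFeed.matchAll(/<link>(.*?)<\/link>/g)].map(m => m[1]).slice(1); // skip the channel-level link
  var hits = [];
  for (var link of links) {
    var slug = link.replace(/\/$/, '').split('/').pop();
    var paraFeed = await (await fetch(docBase + '/feed/paragraphlevel/' + slug + '/')).text();

    // 3. Filter: keep only the paragraph items that mention the original query
    for (var m of paraFeed.matchAll(/<item>([\s\S]*?)<\/item>/g)) {
      if (m[1].toLowerCase().indexOf(query.toLowerCase()) !== -1) hits.push(m[1]);
    }
  }
  return hits;
}

// e.g. paragraphSearch('http://writetoreply.org/legaldeposit', 'deposit').then(console.log);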

PS here’s a trick I picked up from Joss’ blog somewhere for reversing the order of feed items published by WordPress:
http://writetoreply.org/legaldeposit/feed/?orderby=ID&order=ASC
I assume these parameters also work?

Using JISCPress/Digress.it for Reading List Publication

One of the things I’ve been doodling with but not managing to progress much thinking wise (not enough dog walking time lately!) is how we might be able to use the digress.it WordPress theme to support various course related functions in ways that exploit the disaggregating features of the theme.

Chatting with Huw Jones last week about his upcoming Arcadia seminar on “The Problem of Reading Lists” (this coming Tuesday, Nov 24th – all welcome;-) I started thinking again about the potential for using digress.it as a means of publishing, and collecting comments on, reading lists.

So for example, over on the doodlings WriteToReply site I’ve posted an example of how a reading list posted under the theme is automatically disaggregated into separate, uniquely identified references:

The reading list was generated simply by copying and pasting a PDF based reading list into a WordPress blog post. Looking at the format of the list, one could imagine adding further comments or notes relating to each reference using a blog comment. Given that the basis of each paragraph is a citation to a particular work, it might be possible to parse out enough information to generate a link to a search on the University OPAC for the corresponding work (and if so, pull back an indication of the availability of the book as, for example, my Library Traveler script used to do for books viewed on Amazon).
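
As a crude sketch of that OPAC linking idea (the catalogue URL and its query parameter are placeholders, and the “parsing” is no more than grabbing the leading chunks of the citation):

// Build a speculative OPAC search link from a citation paragraph
function opacSearchLink(citationText) {
  // Naive parse: use the first two '.'-delimited chunks (roughly author + title)
  var phrase = citationText.split('.').slice(0, 2).join(' ').trim();
  return 'http://library.example.ac.uk/search?q=' + encodeURIComponent(phrase);
}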

Under the current in-testing digress.it theme, each paragraph on the page can be made available as a separate item in an RSS feed; that is, as well as the standard ‘single item’ RSS page feed that WordPress generates automatically, we can get an N-item feed from the page for the N-paragraphs contained on a page.

Which in turn means that to generate an itemised RSS feed version of a reading list, all I need to do is paste the reading list – with each reference in a separate paragraph – into a single blog post. (The same is true for disaggregating/feed itemising previous exam papers, for example, or I guess video links in order to generate a DeliTV programme bundle…?!)

(For more details of the various ways in which digress.it can automatically disaggregate/atomise a document, see Open Data: What Have We Got?.)

PS just a reminder again – Huw’s Reading List project talk, which is about far more than just reading lists, is on Tuesday in the Old Combination Room, Wolfson College, Cambridge, at 6pm.

Measuring Website Usage With Google Analytics, Part I

Knowing where to get started with reporting website statistics can often provide new webmasters with something of a challenge. In this post, I’ll quickly review the guidance provided by the Central Office of Information on Measuring Website Usage which:

describes a common approach to measuring website traffic [for central government]. This enables departments to answer Parliamentary Questions and Freedom of Information Requests about website usage consistently and reliably

I’ll also start to explore how to generate reports that satisfy those guidelines using Google Analytics.

The proposed metrics “are defined according to industry standards set by the Joint Industry Committee for Web Standards (JICWEBS)” and specify the following minimal level of reporting (Measuring Website Usage – Reporting requirements):

  1. The following web metrics, as defined by the Joint Industry Committee for Web Standards (JICWEBS), must be measured for each and every publicly accessible website operated by an organisation:
    • Unique User/Browsers
    • Page Impressions
    • Visits
    • Visit Duration
  2. Central government departments must measure Unique User/Browsers, Page Impressions, Visits and Visit Duration starting from 1 April 2009 for every website open on 1 April 2010.
  3. Executive agencies and non-departmental public bodies (NDPBs) must measure Unique User/Browsers, Page Impressions, Visits and Visit Duration starting from 1 April 2010 for every website open on 1 April 2011.
  4. The following information must be provided to COI at the end of each quarter:
    • Number of monthly Unique User/Browsers
    • Number of monthly Page Impressions
    • Number of monthly Visits
    • Number of Visits of at least two Page Impressions
    • Total time in seconds for all Visits of at least two Page Impressions
  5. Each report should contain figures for each of the previous three months. This information should be provided in the format shown in the reporting template in Appendix A (COI Website usage reporting template: http://coi.gov.uk/guidance.php?page=237).
  6. All figures should exclude internal web development activity, performance monitoring, automated broken link detection and other types of non-human activity (e.g. robots and spiders). Further details on what to exclude are found in the Page Impressions section.

So what does Google Analytics offer “out of the box”?

[Image: Headline report - Google Analytics]

The Visitors Overview repeats these figures and additionally provides an indication of the number of ‘unique’ visitors:

[Image: Visitors Overview]

At face value then, it would appear that Google Analytics is providing at least some of the required stats (though we need to check that the numbers as recorded by Google Analytics conform to what the COI has in mind for those reports, as described in their guidance on the Minimum standard for web metrics!). But what does that guidance relating to “at least two web pages” mean?

To understand the emphasis on “at least two pages”, it’s worth reflecting on the notion of bounces and the bounce rate. Bounce rate refers to the proportion of visitors to a site who only visit one page on a website before leaving that site, and as such tend to leave no meaningful analytics behind.

According to the ClickTale blog (What Google Analytics Can’t Tell You – Part 1), Google Analytics “has no way of knowing how long a bounced visitor, who only visits one page, spent on your website”. That is, it appears that the time spent looking at a page is not based on the difference between the time when a page has fully loaded (and generated a trackable onload event) and its unload event; instead, it is calculated as the time between loading one page and clicking through to and loading a second page on the same site.

Which is why the emphasis on collecting stats from at least two pages: given that the current crop of analytics tools struggle to do anything meaningful with single page visits, specifying a two page visit means not only that the visits reported are likely to be meaningful ones, but also that the reports are more likely to contain meaningful data. (There is an obvious problem here: if visitors visit two pages, and quickly click through from the first to the second before exiting the site from the second page, the time spent on the second page won’t be captured. See for example Time on Site & Time on Page – Google Analytics metric mystery.)

One of the nice things about Google Analytics is that it lets you create custom views, or “segments” of the data in which you can specify things such as the minimum number of pages visited when generating a particular report. In order to do this, you specify an “Advanced Segment”. Here’s what an Advanced Segment for a “minimum of two pages visited report” might look like:

[Image: GA Advanced Segment - visited at least two pages]

Applying this segment to the same data charted above gives these results:

[Image: Segmented Google Analytics stats]

[Image: GA segmented view]

So for example, in this version of the report we see that the average number of page views and the average time on site have gone up.

Something I don’t think Google Analytics reports is the total time on site. Bearing in mind the lack of data regarding the time spent on exit pages, the best we can do is multiply the number of visits by the average time on site to get an estimate of the total time on site.

With just this single advanced segment, a simple calculation, and the out of the can reports from Google Analytics, I think we can deliver on the suggested stats based on a literal reading of the headings, though in a follow up post I’ll check to see if the more detailed spec on the metrics matches the way that Google Analytics defines its metrics.

PS Unfortunately, the segmented report appears to have lost the number of absolute unique visitors (although I think the recommended report wanted the number of uniques, including bounces, to the site?) Anyway, let’s play: the number of visits gives the upper bound on the number of unique visitors, but can we also estimate the lower bound? One heuristic might be to look at the number of visits and uniques in the original report (176 uniques, 245 visits), see how many visits were lost in discounting the bounces (245-104 = 141), assume these were all unique and subtract these from the original number of uniques (176-141=35). I think this gives the lower bound on uniques as recorded by Google Analytics for non-bouncing visitors?
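
To make that heuristic concrete, here it is as a throwaway JavaScript function, using the figures from the report above:

// Lower bound on uniques among non-bouncing visitors, assuming the worst
// case in which every bounced visit came from a distinct unique visitor
function uniquesLowerBound(uniques, visits, nonBouncedVisits) {
  var bouncedVisits = visits - nonBouncedVisits; // 245 - 104 = 141
  return uniques - bouncedVisits;                // 176 - 141 = 35
}

// uniquesLowerBound(176, 245, 104) => 35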

Google Analytics, Feedburner and Google Reader

Over the last couple of weeks, it seems as if the Goog has been doing a bit of reconciliation on the old analytics front, in particular the ability to track traffic driven back to your website from links contained within a feed published from that site using Feedburner…

The first thing I’d noticed as being different was the appearance of Google Analytics tracking codes on Feedburner powered posts that I was reading in Google Reader – opening such a post in a new window seems to display it with a full blown set of GA tracking attributes. So for example, opening a post from the Feedburnered OUseful.Info feed results in a URI like this:

http://ouseful.wordpress.com/2009/11/18/under-the-radar/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+ouseful+%28OUseful+Info%29&utm_content=Google+Reader

…and I’m pretty sure I didn’t put those tracking codes in there explicitly…

In “Campaign” Tracking With Google Analytics, I started sketching out how it might be possible to use Google Analytics campaign tracking codes to track the spread of referrer links to documents or document fragments hosted on WriteToReply or JISCPress, so let’s see how the Feedburner annotations are structured:

  • utm_source=feedburner (that is, the originator of the feed);
  • utm_medium=feed (that is, the means by which the content was transported/syndicated);
  • utm_campaign=Feed: ouseful (OUseful Info) (that is, the name of the Feedburner feed (I think: the feed URL is http://feedburner.com/ouseful), followed by the feed title (OUseful Info));
  • utm_content=Google Reader (that is, the place where I viewed the link).

Compare this with the suggestion I made for annotating WriteToReply links:

  • utm_source=twitter.com (that is, the place a link was ‘launched’);
  • utm_medium=question (that is, the type of slug content used to qualify the link);
  • utm_campaign=jiscri (that is, the consultation document linked to, e.g. for the link http://writetoreply.org/jiscri/2009/03/11/rapid-innovation-projects/);
  • utm_content=slug3 (that is, a unique ID to identify the text used to qualify the syndicated link).

So how can you get Googalytics tracking codes on your Feedburner feeds? Details are still sketchy (e.g. see the original announcement on the Google Analytics blog here: An Integration With Feedburner, and the Google AdSense for Feeds blog here: “Afternoon, Frank.” “Hey howdy, George.”), but this Google FAQ post on How do I set up my FeedBurner feed to report feed clicks in Google Analytics? explains:

If you use Google Analytics to track web site visitors, you can see feed clicks originating from your FeedBurner feed by activating an option on the Analyze tab.

When someone clicks one of your feed items and ends up back on your web site, Google Analytics will track that activity and include it in the “Traffic Sources” section.

The post also tells you where you can set up the tracking details – from the Configure Stats menu option. And selecting that, I can now see why my feed links are annotated as they are:

(I’m not sure how the $distributionEndpoint is treated for non-Google properties?)

The Google AdSense for Feeds post suggests that:

By default, these analytics will show up in the “All Traffic Sources” and “Campaigns” views in Google Analytics. You can filter the results just to only the traffic that comes from Google FeedBurner by filtering on “feedburner” on the All Traffic Sources page or “Feed:” on the campaigns view. You can also use these sources in the Advanced Segments views.

which suggests that for sites like JISCPress/WriteToReply that use Google Analytics on the main site and Feedburner for the public/promoted feeds, the Feedburner integration will automatically annotate feed links with tracking codes that can be tracked from the site’s Google Analytics dashboard.

“Campaign” Tracking With Google Analytics

Of the very many things that it’s possible to provide webstats reports about, such as tracking visitors arriving from organisational websites, one of the most useful is being able to track how much traffic has been driven back to your website from a particular link – such as a link included in a particular tweet, or in a particular email announcement, and so on.

If a link to a JISCPress document appears on a third party webpage, and somebody clicks on that link and then lands on the corresponding JISCPress page, Google Analytics will capture where that incoming visitor came from via the Referring Sites report. At the top level this is organised by domain:

[Image: Google Analytics - Referring sites]

We can then tunnel down to the page level:

[Image: More referrers]

This is all well and good, but sometimes we also might want to know where the person who posted the referring link on their web page got hold of it. Did they capture it from a tweet, for example, or via an email list? When we release a URI into the wild via some sort of marketing campaign, what sort of life does that URI have, and where will it end up sending traffic back from?

In the Google Analytics FAQ answer How do I tag my links?, a method is described for adding additional tags to a referral URL (that is, a URL that you publish and/or distribute more widely that refers back to your website) that Google Analytics can use to segment traffic referred from that URL. Five tags are available (as described in Understanding campaign variables: The five dimensions of campaign tracking):

Source: Every referral to a web site has an origin, or source. Examples of sources are the Google search engine, the AOL search engine, the name of a newsletter, or the name of a referring web site.
Medium: The medium helps to qualify the source; together, the source and medium provide specific information about the origin of a referral. For example, in the case of a Google search engine source, the medium might be “cost-per-click”, indicating a sponsored link for which the advertiser paid, or “organic”, indicating a link in the unpaid search engine results. In the case of a newsletter source, examples of medium include “email” and “print”.
Term: The term or keyword is the word or phrase that a user types into a search engine.
Content: The content dimension describes the version of an advertisement on which a visitor clicked. It is used in content-targeted advertising and Content (A/B) Testing to determine which version of an advertisement is most effective at attracting profitable leads.
Campaign: The campaign dimension differentiates product promotions such as “Spring Ski Sale” or slogan campaigns such as “Get Fit For Summer”.

(For an alternative description, see Google Analytics Campaign Tracking Pt. 1: Link Tagging.)

The recommendation is that campaign source, campaign medium, and campaign name should always be used.

Elsewhere (Library Analytics, Part 7 – from which elements of this post have been taken), I considered how these codes might be used to track course referrals to Library resources from a VLE (something I need to revisit, now I’ve had a little more time to consider the possible role(s) of these tracking codes). But it also seems to me to be reasonable to raise a few questions about how we might use these tracking codes in the context of a document on JISCPress or WriteToReply in order to track referrals back to the site from social media campaigns highlighting a particular document or section of a document.

So, what are sensible mappings/interpretations for the campaign variables? Remember, these tracking variables are parameters that we might add to a link that we have posted somewhere with the intention of driving traffic back to the site. The tracking variables are there to allow us to see how different links are performing. Thinking about how we might use these five tracking dimensions, whether or not we use them in the “intended” Google Analytics way, may also provide us with some ideas about how to use links to drive traffic back to our site.

To try and ground the exercise, consider this example: a new document is published on JISCPress and we want to compare how well links posted on Facebook compare with links posted on Twitter for driving traffic back. For tracking to be most effective, we hope that if a link is rebroadcast or shared, the tracking variables are carried along with it. This means that if a link is posted to Twitter, then gets shared onto Facebook and onto a blog, we can look at the traffic that comes back, and from where (via the Referral tracking described at the start of this post), for each of the separately released URIs. A second example might relate to a campaign intended to drive traffic back to a particular section or paragraph of a document. This campaign might involve publishing a link back to the same paragraph in a series of separate posts or status updates, each with a different slug or call to action message. That is, each link+message may be published in the same place (and hence have the same referrer information), but at different times and with different link text, or contextual information. A third example might be where there is more than one link back to the same document on a web page, and we want to track how effective each link is compared to the others.

Here are the supported variables again:

  • source: the obvious thing to use this variable for is the domain or URI of the page where the link is published to. So if we tweet a link, twitter.com might be sensible. If we blog it, actually might be best?
  • medium: this is intended to refer to the sort of link that has generated the traffic, such as a banner ad. In our case, we might clarify the intent with which the link was posted, such as announcement, or question;
  • term: this is an optional parameter, and I’m not sure how it should be used or whether it conflicts with other Google services. If we post something with a hashtag on twitter, or a set of tags on delicious, might we use those tags as terms?
  • content: the second optional variable, this is often used to discern A/B test ads. If we tweet the same link with different call to action/prompting questions, maybe this differential content should be uniquely identified with the content field?
  • campaign: typically used for tracking a promotion or campaign, this field might be used to identify a particular document when, for example, a link to the top level JISCPress site appears in an announcement about that document?

So for example, we might have something like:
http://writetoreply.org/?utm_campaign=ukgovurisets&utm_medium=announcement&utm_source=actually
appearing as the link for WriteToReply in an announcement about the hosting of the UK Government URI Sets document.

Or maybe a call to action on twitter relating to a particular part of a document:
What benefits would you like to see from #JISCRI calls? http://writetoreply.org/jiscri/2009/03/11/rapid-innovation-projects/?utm_campaign=jiscri&utm_medium=question&utm_term=JISCRI&utm_source=twitter.com&utm_content=slug3#3

To support the generation of tracking URIs, a URL generator tool (like the official Tool: URL Builder) that accepts a tweet, for example, along with a JISCPress/WriteToReply URL, and then automatically creates the tracking variable values, might be worth considering?
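
Here’s a minimal sketch of the core of such a builder in JavaScript (the function and option names are my own; only the utm_* parameters themselves are Google’s):

// Append Google Analytics campaign tracking parameters to a URI
function buildTrackedUri(baseUri, opts) {
  // opts may carry any of the five campaign dimensions discussed above
  var params = new URLSearchParams();
  if (opts.campaign) params.set('utm_campaign', opts.campaign);
  if (opts.medium)   params.set('utm_medium', opts.medium);
  if (opts.source)   params.set('utm_source', opts.source);
  if (opts.term)     params.set('utm_term', opts.term);
  if (opts.content)  params.set('utm_content', opts.content);
  return baseUri + (baseUri.indexOf('?') === -1 ? '?' : '&') + params.toString();
}

// e.g. buildTrackedUri('http://writetoreply.org/jiscri/2009/03/11/rapid-innovation-projects/',
//   {source: 'twitter.com', medium: 'question', campaign: 'jiscri', content: 'slug3'});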

Thoughts on JISCPress

As we come to the final month of the JISCPress project, we had some great news over on WriteToReply last week where we were able to announce that Eduserv would be covering our hosting costs for the immediate future (Eduserv funds hosting for WriteToReply, eFoundations: Write To Reply).

So what exactly does the platform we’ve been working on have to offer? Here’s one of the ways I think of it…

A document publishing platform that automatically atomises documents to the paragraph level, allows aggregated commenting at the paragraph and ‘user’ level, and supports the republication and re-presentation of documents in a variety of standard formats at the document level.

The first part of the process is the (manually assisted) ingress stage, in which documents are imported into the WordPress environment such that each substantive document section ideally maps onto a single WordPress “blog post”:

An RSS feed for the document as a whole, with one item per section, is generated automatically by the WordPress platform. A single item RSS feed is also generated for each page (so the content of each page can be easily transported around the web).

The second part of the process is the atomisation of each post, carried out automatically by the Digress.It theme, in which each paragraph in the document is given its own unique URI, derived from the URI of the web page (“blog post”) the paragraph appears on:

Potentially, an RSS feed can also be produced for each page in which each paragraph is a separate feed item, thus allowing a page/section to be transported around the web via a single feed, but in atomised form.

The paragraph level chunks produced by the atomisation process can be transcluded as independent elements in other web documents by a variety of means (as an embeddable object, via XML, txt, JSON, etc):

The default nature of the WordPress platform allows comments to be made at the level of each web page, with an RSS feed of comments for each page being published ‘for free’. JISCPress extends this functionality by allowing comments to be associated with discrete paragraphs. Views over the comments are also available at the user level (that is, grouped according to the user who made the comments, wheresoever they are made in the document). An additional RSS feed of comments by user is also available, which means that a document on the platform can actually be used as a scaffold for a critical response to the document by a particular user.

A further level of innovation is based on the automated generation of ‘semantic tags’ at the page level. Once generated, tag based collections of posts can be syndicated in the normal way via WordPress generated tag based RSS feeds:

JISCPress also benefits from the Trackback mechanism implemented by WordPress. When a page or paragraph URI is linked to from a third party web page, a trackback to the originating page may be captured, which we interpret as the automated capture of links to remote annotations or comments about the document.

When considered in these terms, the JISCPress/WriteToReply platform is seen to provide a powerful means of publishing documents in which individual sections may carry their own unique URI, and individual paragraphs within a section also contain their own unique URI (which in many situations may be rooted on the section URI).

The platform can also be regarded as republishing – or re-presenting – each section (i.e. page) and each paragraph as an independent entity. That is, whenever a document is published via the platform, each separate paragraph may also be thought of as being independently published “for free”, in the sense that:

– each paragraph is independently addressable,
– each paragraph is independently commentable, and
– each paragraph is independently republishable/syndicatable.

So, given that, can you think of any ways in which the JISCPress/WriteToReply platform can support your document publishing and comment gathering strategy?

Paragraph Embedding from JISCPress

One of the things I was keen to explore within the context of the JISCPress project was the potential for using WordPress as a platform for publishing paragraph level fragments that could be embedded in third party web pages.

As Joss announced on the JISCPress blog, We’ve got paragraph data output switches! These expose paragraph level content through a unique URI in a variety of formats (xml, txt, html, rss and json), as well as object embed codes for each paragraph, though I’m not sure if this is going to be maintained…? E.g. at the moment, I think we’re trialling literal text blockquote embeds:

[Image: Blockquote embed]

(If the object embed does disappear, similar functionality could be achieved using the JSON feed and a Javascript function, though I guess we’d need JSON-P (i.e. support for something like &callback=foo) to make that really easy.)
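
For what it’s worth, a JSON-P version might look something like this sketch – the callback and format parameters are hypothetical (the platform doesn’t support them yet), as is the shape of the returned JSON:

// Hypothetical JSON-P loader for a paragraph fragment
function embedParagraph(targetId, paraUri) {
  // Assumes the payload carries the paragraph text in data.content
  window.wtrEmbedCallback = function (data) {
    document.getElementById(targetId).innerHTML =
        '<blockquote>' + data.content + '</blockquote>';
  };
  var s = document.createElement('script');
  s.src = paraUri + '&format=json&callback=wtrEmbedCallback';
  document.body.appendChild(s);
}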

See also: A Quick Update for a review of the latest feature releases within the digress.it theme we’re using.

To demonstrate one possible use case for object embedding, see the post Engaging With the Issues Raised By The Google Book Settlement which includes three embedded paragraphs from the JISC’s current consultation around the Google books settlement.

[Image: Embedding content from WriteToReply]

Here’s the actual HTML:

[Image: Embedding content from WriteToReply – the HTML]

Note that currently there is an issue with sizing the embed container (can any CSS gurus out there give us a fix?

[Image: Object sizing issue with WTR embeds]

Ideally we need to identify the container height and then size it automatically so there are no scrollbars? I’m guessing .scrollHeight might have a role to play in autodetecting this?)
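
Something along the following lines might do it, assuming the embedded document is served from the same origin so that the script is allowed to look inside the frame:

// Resize an embed to fit its content once it has loaded (same-origin only)
function autosizeEmbed(el) {
  el.onload = function () {
    if (el.contentDocument && el.contentDocument.body) {
      el.style.height = el.contentDocument.body.scrollHeight + 'px';
    }
  };
}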

One thing you might notice is that the URIs for the embedded consultation questions follow a similar pattern – only the paragraph number identifier changes:
http://writetoreply.org/googlebooks?p=8&digressit-embed=4

What this means is that we should be able to pull in a random paragraph by constructing a URI with a randomly generated paragraph number. So for example:

If you reload the page, you have an 80% chance of seeing a different question…

Here’s the Javascript snippet:

// Pick a random paragraph number in the range 2..6 (the five questions)
var n=2+Math.floor(Math.random()*5);
// Build an <object> element to hold the embed
var o=document.createElement('object');
o.setAttribute('style','width: 100%; height:70px;');
o.setAttribute('id','61c197964762012d4819093ebeee4fcf');
// Point it at the paragraph embed URI for the randomly chosen paragraph
var p='http://writetoreply.org/googlebooks?p=8&digressit-embed='+n;
p=p.replace(/#038;/,''); //get round WordPress escaping everything...
o.setAttribute('data',p);
document.getElementById('wtr_embed').appendChild(o);

//There’s a div with an appropriate id attribute (‘wtr_embed’) also added to the page…
//Note that the div needs to be placed before any inline Javascript in the page;-)

I’m not sure yet if we can track the use of embeds (certainly server logs should be able to track calls, but these probably can’t be captured using Google Analytics?), but it’s still early days…

Innovation as a Side Effect… JISCPress and the JISC Strategy Review

The eagle eyed amongst you may have noticed that we recently republished the JISC Strategy Review 2010-2012 on WriteToReply in part as a way of field testing the new digress.it theme that has been under development as part of our JISCPress project.

Some time ago, I remember reading a book by Gary Hamel (with Bill Breen) on “The Future of Management” that included a model referred to as the innovation stack.

The model was pyramidal, and comprised four layers – at the bottom, operational innovation; sitting on top of that was product or service innovation, followed by strategic innovation, and at the top, management innovation.

Now I’ve never done an MBA, so my reading of this book may be out of line with a ‘traditional’ reading of it, but here’s what came to my mind when we originally floated the idea of republishing the JISC Strategy Review on WriteToReply, offered as a straw man…

  • operational innovation: Dev8D and the development approaches encouraged in JISCRI projects represent operational innovations; publishing documents on JISCPress is an operational innovation aimed at helping JISC programme managers clarify project calls and JISC project teams shape their bids and disseminate their results;
  • product/service innovation: in many cases, the JISC calls for projects seek to encourage product or service innovations, as well as operational innovations; as a hosted service, the JISCPress platform can be seen as a service innovation, running either as a centrally hosted service, or as a document platform in its own right hosted by an institution itself.
  • strategy innovation: to a certain extent, programmes like the #ukoer programme represent operational steps that may support a strategic innovation in the way HEIs disseminate the fruits of their scholastic endeavours. The idea of Open Repositories and Open Science also operates at the level of strategic innovation. I think I’d be pushing a little more than I already am to find a strategy innovation role for JISCPress!
  • management innovation: JISC Reviews are often disseminated to PVCs and research managers on an “I2I” (institution-to-institution) basis. JISCPress breaks that… badly. JISCPress allows anyone to comment and provide their own response directly to JISC, rather than necessarily representing the traditional response from the top of the strategy/research/IT management hierarchy within the institutions.

So, with that warm up exercise over, it’s time for me to get stuck in to reading the JISC Strategy Review properly… Hmm, now I wonder, does Hamel’s innovation pyramid map in any way onto JISC’s strategy for innovation across the HE and FE sector….?!

Scholarly publishing with WordPress

Working on the JISCPress project, I’ve been thinking quite a lot about scholarly publishing on the web, and in particular with WordPress. This morning, I read a post over on the ArchivePress blog about some WordPress plugins which are useful additions for creating a scholarly blog and it got me thinking a bit more about what features WordPress would need to support scholarly publishing.

JISCPress does away with the idea that WordPress is a blogging tool, and instead uses WordPress Multi-User as a document publishing platform, where one site or ‘blog’ is a document. The way WPMU is structured means that despite serving multiple (potentially millions of) document sites, the platform remains relatively ‘lightweight’, as each document site generates just a handful of additional database tables while sharing the same administrative core as a single WordPress install. So, 100 WordPress blogs on WPMU is nothing like the equivalent of running 100 separate WordPress blogs, both from the point of view of resource requirements and of administration. In fact, quite soon, there will be no such thing as WPMU, as the two products are going to be merged and, because they share 90%+ of the same code already, it’s not too difficult to achieve.1

Anyway, my point here is to discuss whether WordPress can be extended to accommodate most conventions found in scholarly publishing and, where it is lacking, to identify the development work required to meet the needs of most academics who wish to write on and publish to the web.2

Scholarly publishing extends to a wide variety of published outputs. As a Content Management System (CMS) and technology development platform, I believe that WordPress has the potential to support any type of scholarly publishing that the web supports. It is extremely extensible, as can be seen from the 6000+ plugins that are available. However, what I’m interested in is what can be done now, by an academic wishing to publish their work through the use of WordPress acting as a CMS. What can be achieved with a few quid3 to self-host WordPress so that a few plugins can be installed and a well structured, typical, scholarly paper can be published.

My Dissertation

For some time, I’ve been meaning to publish my MA dissertation. Back in 2002, I undertook some unique research which has not, to my knowledge, been repeated and I think there is some value in having it easily accessible on the web. I have an OpenOffice file and a PDF and, in the course of a morning, have published it under my own domain. The reason I did not publish it on the university WPMU platform is because I have been experimenting with different plugins and did not want to install plugins that were untested or we may not support long-term.  In this case, I’ve used a single WordPress installation, but ideally an individual researcher, group of researchers or research institution, would run a WPMU installation which allowed multiple documents to be authored individually or collaboratively4 and published directly to the web as XHTML.

BuddyPress, by the way, can make the experience even more natural, not only because it is based around a community of like-minded people writing together  on the same web publishing platform, but also because, with a few tweaks here and there, we can move away from the language of blogs and towards the language of documents.


[Image: BuddyPress admin bar]

[Image: Profile menu]

Enough of BuddyPress on WPMU for now and back to my dissertation. I set up the site in ten minutes, without using FTP or a command line because I use a host that provides a one-click install of WordPress and WordPress allows you to search for and install plugins from its Dashboard, rather than having to use FTP. Once the site was installed, I then  made some basic changes to the settings, turning on XML-RPC and AtomPub, so that, if I decided to, I could publish to the site using my Word Processor.5 I didn’t use this in the end, but trust me, it works very well using recent versions of MS Word, Open Office (free) and other blogging clients such as MS Live Writer (free).

So, what are the common characteristics of an academic paper? What does WordPress have to support to provide functionality that meets most scholars’ publishing requirements? I scratched my head (and asked on Twitter) and came up with the following:

  • footnotes/endnotes
  • citations
  • use of LaTeX (sciences)
  • tables
  • images
  • bibliography
  • sub-headings
  • annexes
  • appendices
  • dedication
  • abstract
  • table of contents
  • index to figures
  • introduction
  • exposition
  • conclusion

Many of these are supported in WordPress by default and don’t require any additional plugins (tables, images, sub-headings, annexes, appendices, dedication, abstract, introduction, exposition, conclusion, are all either basic literary conventions or just part of a simply structured document).

For additional support, I installed digress.it, which we have funded through the JISCPress project. This is a WordPress plugin which allows readers to comment on the paragraphs of a document, rather than at the document section level. We’re adding a lot more functionality to meet the objectives of the JISCPress project, but I chose digress.it, principally for the reason that it is designed to turn a WordPress blog into a document site. I could have used any other WordPress theme, but digress.it automatically creates a Table of Contents and allows you to re-order WordPress posts when they are read so that you don’t have to author your document in reverse or adjust the publication dates so the document sections appear in the correct order.

[Image: My dissertation published using digress.it]

I added the abstract for my dissertation to the ‘about’ page, so it shows up on the front of the site. I also uploaded a PDF version so that people can download it directly. You’ll see that I also added some links to a related book and DVD, which will certainly appeal to people who are interested in my dissertation. The links pull an image and some basic metadata from Amazon, using the Amazon Machine Tags plugin. This could be used to link to the book in which your article is published and earn you money in click referrals. An alternative, would be the Open Book Book Data plugin, which retrieves a book cover and metadata from Open Library, where your book may already be catalogued. If it’s not on Open Library, catalogue it!

After setting this up, I installed a few more plugins:

Dublin Core for WordPress: Automatically adds ten Dublin Core metadata elements to the document mark up.

wp-footnotes: This allows you to easily add footnotes to your document by enclosing your footnote in double parentheses.6

OAI-ORE Resource Map: Automatically marks up the document sections with a OAI-ORE 1.0 resource map.

Google Analyticator: Adds Google Analytics support so you can collect statistics on the readership of your document.

WP Calais Archive Tagger: Analyses your entire document and automatically keywords each section, using the Open Calais API.

Search API: WordPress comes with search built in, but there is a new search API which will eventually make its way into the WordPress core. I’ve installed the plugin to provide full-text search across the document. It can also add Google Search to your document site.

wp-super-cache: This is simple to install and will significantly speed up your document site, making it a pleasure to navigate through and read :-)

Plugins I didn’t use

wp-latex: Although I didn’t need it for my dissertation, it’s worth noting that WordPress supports the use of LaTeX.

Academic Citation: You need to add a line of code to your theme for this to display. It supports the concept of an article being a single blog post, rather than a ‘document site’ and displays a variety of citation formats for readers to use.

Do you know of any other plugins for a scholarly blog?

The Beauty of Feeds

The other useful thing about managing a document using WordPress, and in particular using digress.it, is that you automatically get RSS/Atom feeds for the document. I’ve already discussed these in detail. It means that I was able to read my document in my feed reader, with footnotes and images displayed correctly.

[Image: Document in Google Reader]

See how nicely the formatting is preserved. LaTeX is also rendered correctly in feed readers.

[Image: Document formatted nicely in Google Reader]

[Image: Reading my dissertation in Google Reader]

You’ll see that the document sections are listed in order; that is, first section on top. As I noted above, blogs list posts in reverse (most recent first), so I ran the feed through Yahoo Pipes and sorted the items in ascending order. Yahoo Pipes exports as RSS, and it’s that feed that I subscribed to in Google Reader. Wouldn’t it be nice if I could import my document feed into an Institutional Repository? Wait a minute, I can! :-)

Importing an RSS feed into EPrints

Click to see the item in the repository

When importing the default feed, the HTML output was accurate but in reverse order, while the RSS output from Yahoo Pipes didn’t import into EPrints very cleanly at all. I’ll work on this. UPDATE: Forget Yahoo Pipes. WordPress feeds can be sorted with a switch added to the URL: http://example.com/feed/?orderby=post_date&order=ASC

So there it is. An academic paper, published to the web using a modern CMS which supports most authoring and publishing requirements. I would favour an institutional WPMU platform that academics can author directly to, publishing their pre-prints to the web for open access and detailed comment, and importing their RSS feeds into the repository. As a proof of concept, I’m quite pleased with this. We are currently developing a widget that can be embedded in a web page or WordPress sidebar and allow a member of staff to upload a document or zipped folder of documents to the Institutional Repository. I wonder if we can also support the import of a feed from the widget?

So, what would your requirements be? Tell me and I’ll do my best to test WordPress against them.

  1. Has anyone done a diff on the two code bases to measure exactly what percentage of the code is shared between WP and WPMU?
  2. Actually, I think I’ll save the discussion of its shortfalls for my next post. This one is already long enough.
  3. I pay $5/year for my domain name and as many sub-domains as I need. I pay $10/month for my hosting with unlimited storage and bandwidth.
  4. Like any decent CMS, WordPress supports role-based authoring and editing and maintains a revision history of edits, auto-saved once per minute. Revisions can be compared alongside each other.
  5. On a scholarly WPMU installation, plugins could be pre-installed and activated, a default theme selected and settings tweaked so very little work is required by the academic author prior to writing her document.
  6. I am using the plugin on this blog!

Testing new site features with the Amazon Kindle License Agreement

We’re really pleased to help promote the launch of digress.it, the evolution of CommentPress which WriteToReply uses to allow you to comment on document paragraphs.

We’ve been in touch with Eddie Tejeda, the original developer of CommentPress, since March, and have been working with him to find funding for a complete rewrite and re-release of the original CommentPress project. You can read more about digress.it on the community website, but here’s a rundown of the new features, a bit of a roadmap for forthcoming features, and a shout-out to anyone who wants to get involved in the project.

New Features

The original features of CommentPress can be summarised as follows:

  • A Table of Contents
  • Paragraph-level URIs
  • Paragraph-level commenting
  • A scrolling comment box
  • Page filters that allow you to read comments by document section or by commenter

CommentPress was a WordPress theme. digress.it is a WordPress plugin and a complete rewrite of the original CommentPress code. It adds the following features:

Floating comment box. You can now resize and position the comment box anywhere on the page.

Threaded comments. Real discussion.

Highly configurable. Accepts different stylesheets.

RSS feeds for comment authors. Feeds for individual comment authors are a first for WordPress.

Paragraph embedding. You can embed a paragraph on your own site. Paragraph content is available as HTML, JSON or TXT.

Real-time onsite notifications. If someone else comments on a section you are reading, the comment box will ‘pulse’ to alert you.

There will still be bugs. We’re still working on browser compatibility issues with the comment box, for example. This is a first release using the version 2 codebase and we’d really appreciate your feedback from testing it on Amazon’s Kindle License Agreement. :-)

RoadMap

To achieve the objectives of the JISCPress project, we’ll be continuing to fund the refinement of digress.it until November. The features we’re currently considering can be seen on our UserVoice page (please add more as you think of them). Here are some highlights, specific to digress.it:

  • Compatibility with IntenseDebate. This would provide a number of multimedia and reputational features.
  • Compatibility with PollDaddy. The ability to include polls in a document would be useful for consultations.
  • Paragraph and ‘comment here’ links in the RSS feed. Convenient if you read the document in your news reader or embed a feed elsewhere.
  • WCAG Accessibility. Required for use by the Public Sector.
  • Compatibility with XML-RPC clients for remote document authoring. Convenient for document authors to publish from MS Word, etc.

Content Transclusion: One Step Closer

Following a brief exchange with @lesteph last night, I thought it might be worth making a quick post about the idea of content or document transclusion.

Simply put, transclusion refers to the inclusion, or embedding, of one document or resource in another. To a certain extent, embedding an image or YouTube video in a page is a form of transclusion. (Actually, I’m not sure that’s strictly true? But it gets the point across…)

Whilst doing a little digging around for references to fill out this post, I came across a nicely worked example of transclusion from Wikipedia – Transclusion in Wikipedia

Content transclusion in Wikipedia

The idea? You can embed the content of any Wikipedia page in any other Wikipedia page. And presumably the same is true within any MediaWiki installation.

That is, in a MediaWiki wiki:

you can embed the content of any one page in any other page (via the {{:Page name}} transclusion syntax).

(I’m not sure if one MediaWiki installation can transclude content from any other MediaWiki installation, although interwiki transclusion does appear to be supported if the wiki administrator switches it on.)

It’s also possible to include (that is, transclude) MediaWiki content in a WordPress environment using the Wiki Inc plugin. A compelling demonstration of this is provided by Jim Groom, who has shown how to republish documentation authored in a wiki via a WordPress page, an approach we adopted in our WriteToReply Digital Britain tinkerings.

One of the things we’ve started exploring in the JISCPress project is the ability to publish each separate paragraph in a document (each with its own URI) in a variety of formats – txt, JSON, HTML, XML. That is, we have (or soon will have) an engine in place that supports the “publishing” side of paragraph-level transclusion of content from reports published via the JISCPress/WTR platform. Now all we need is the transclusion part – the re-presentation of transcluded content – to be able to embed content from one document in another. (See Taking the Conversation Elsewhere – Embedded Quotes; see also Image Based Quotes from WriteToReply Using Kwout for a related mashup.)

(Hmm, although Joss won’t like this, I do think we need a [WTR-include=REF] shortcode handler installed by default in WTR/JISCPress that will pull paragraph-level content into one document from a document elsewhere on the local platform?)

Now this is really what hypertext is about – URIs (that is, links) that can act as portals, pulling content into one location from another. It may be, of course, that the idea of textual transclusion is just too confusing for people. But it’s something we’re going to explore with WriteToReply.

And one of the things we’re looking at for both WriteToReply and JISCPress is the use of semantic tagging to automatically annotate parts of the document (at the paragraph level, if possible?) so that content on a particular topic (i.e. tagged in a particular way) in one document can be automatically transcluded in – or alongside – a related paragraph in a separate document. (Hmm – maybe we need a ‘related paragraphs’ panel, cf. the comments panel, that can display transcluded, related paragraphs from elsewhere in the document or from other documents?)

PS If you have an hour, here’s the venerable Ted Nelson giving a Google Tech Talk on the topic of transclusion:

Enjoy…

PPS Here’s an old library that provides a more general-case framework for content transclusion: Purple Include. I’m not sure if it still works, though?

PPPS Here’s the scary W3C take on linking and transclusion 😉 This is also interesting: auto/embed is not node transclusion

PPPPS for another take on including content by reference, see Email By Reference, Not By Value, or “how I came up with the idea for Google Wave first”;-)

PPPPPS Seems like EPrints may also support transclusion… E-prints – VLit Transclusion Support.

Image Based Quotes from WriteToReply Using Kwout

One of the things we discussed with respect to embedding WriteToReply/JISCPress quotes in third party applications was whether or not we should support an “imagified” embedding – that is, convert a paragraph to a JPG or PNG image format that can then be easily embedded in the third party site.

The advantage? Even if the third party site disallows script, object or embed tags, it will probably allow img tags…

So for example, extending the range of output formats suggested in Taking the Conversation Elsewhere – Embedded Quotes, we might consider something like an &output=png switch that allows us to construct an image embedding code along the lines of:

<img src="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER&output=png" longdesc="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER">

Once again, there’s a trackback issue, although it’s easy enough to wrap the image tag in an appropriate anchor tag:

<a href="http://docserver.example.com?p=POSTNUMBER&para=PARANUMBER"><img src="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER&output=png" longdesc="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER"></a>

However, this facility was seen as non-essential, so I looked on the web for a solution – and found it in the form of the kwout API, which can be used to generate an image-based representation of text found in a specified div tag (identified by ID) on a given web page, which can then in turn be embedded in an arbitrary web page. Although the image may be hard to read, this can work to our advantage: it might drive traffic back to the site that originated the quote :-)

The following JavaScript snippet uses the Kwout API to generate an image-based representation of a single paragraph from a WriteToReply republished document:

javascript:window.location='http://kwout.com/grab?address='+encodeURIComponent("http://writetoreply.org/pluralnews/2009/07/03/section-1-securing-plural-sources-of-news-in-the-nations-locally-and-in-the-regions/")+'&block=contentblock_10'

In the API call, “contentblock_10” is the id of the block element to be quoted. Here’s what the kwouted image looks like:

Kwouting a paragraph from WriteToReply: http://kwout.com/quote/nbj4nife

And here’s the original paragraph on WriteToReply:

http://writetoreply.org/pluralnews/2009/07/03/section-1-securing-plural-sources-of-news-in-the-nations-locally-and-in-the-regions/#10 WriteToReply original quote

Note that in the above case the link the kwout script generates points back to the top of the page, so to link back to the actual paragraph we’d need to specify this in the link:

javascript:window.location='http://kwout.com/grab?address='+encodeURIComponent("http://writetoreply.org/pluralnews/2009/07/03/section-1-securing-plural-sources-of-news-in-the-nations-locally-and-in-the-regions/#10")+'&block=contentblock_10'

A step on the road to full integration (a use of the Kwout API which may or may not be in line with the stated terms and conditions? I don’t know, I haven’t read them…!) is this bookmarklet, which should let you highlight a paragraph number on a WriteToReply document and then take you straight to the Kwout embed page for that paragraph:

javascript:(function(){var l=location.href; window.location='http://kwout.com/grab?address='+encodeURIComponent(l)+'&block=contentblock_'+window.getSelection();})()

Actually, that looks a little cluttered, and the usability is a little off. So a better solution may be to suggest that the user clicks on the paragraph link to get the “paragraph in focus” page, and then clicks on the following bookmarklet:

javascript:(function(){var l=location.href;l=l.split('#');window.location='http://kwout.com/grab?address='+encodeURIComponent(l[0])+'&block=contentblock_'+l[1];})()

(What this does is pull the paragraph identifier out of the URI and then construct the Kwout API call from it.)

Or if you want the link to go to the “paragraph in focus” page, rather than the top of the page:

javascript:(function(){var l=location.href;window.location='http://kwout.com/grab?address='+encodeURIComponent(l)+'&block=contentblock_'+l.split('#')[1];})()

(Note that neither of these bookmarklets is ideal – a production-stable bookmarklet should be able to cope with, or fail gracefully on, the lack of a hash-separated paragraph identifier in the URI.)
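By way of illustration, here’s a minimal sketch of a more defensive variant of that last bookmarklet (untested, and shown formatted for readability – it would need collapsing back on to a single line for use as a bookmarklet); the alert message is just a placeholder:

javascript:(function(){
  // Fail gracefully if there is no hash-separated paragraph identifier.
  var parts = location.href.split('#');
  if (parts.length < 2 || parts[1] === '') {
    alert('No paragraph identifier found in this URL.');
    return;
  }
  window.location = 'http://kwout.com/grab?address=' +
    encodeURIComponent(parts[0]) + '&block=contentblock_' + parts[1];
})()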

Hmm, maybe we need a “labs” area on WriteToReply where we can collect these micro-utilities?

Taking the Conversation Elsewhere – Embedded Quotes

As part of the JISCPress effort, one of the things we’ve been considering is the granularity of appropriate “consultation elements” or “discussion elements” – those pieces of content that people might actually want to reference, question or chat around, as compared to a whole 200-page document, for example.

The page and paragraph levels fall out of the CommentPress theme (and its descendants) quite naturally – WordPress gives us the page level (along with a single item RSS feed at the page level), and the theme gives us URIs at the paragraph level.

(Hmmm… I wonder – would it also be useful to provide a multi-item RSS feed, at the page level, with a separate item for each paragraph on that page? Or do we do that already?!)

In many cases, the paragraph level seems to be the most natural chunk for discussion, particularly in an ongoing conversation about a particular document. So a major question for us is how to put those paragraphs to work?

One of the features that Eddie’s been working on as part of the JISCPress project is the ability to embed paragraphs from a document in a third-party web page. This feature will allow us to increase the surface area of the document by allowing third parties to re-present that content elsewhere, whilst also (hopefully) providing a means to link that external conversation directly back to the original document.

So what benefits does embedding have to offer to:

a) the person grabbing and using the embed code;
b) the publisher/whoever’s running the consultation from which the embed code was grabbed?

In a discussion on the JISCPress group, Joss suggested the following:

For the user:

1. More portable transformation of document content into raw data.
2. Personalisation, presentation and ‘ownership’ of documents within their own publishing environment (which is one of the benefits of slideshare/scribd).
3. Direct joined up quoting rather than copying. More aligned with the ideals of the web and linking data. This could also be a benefit to publishers concerned about unattributed copying.

For the publisher:

1. Greater possibilities of content dissemination
2. Greater potential of attracting engagement via trackbacks
3. Further possibility of using JISCPress as an underlying ‘document store’ where authoring, dissemination and engagement occurs mostly remotely via XML-RPC, syndication, embeds and trackbacks.
4. Possibility of site analytics being hooked into embeds so the reach is measurable???? (Analytics can track document types, I’m not sure whether they are used to track embeds…)

So where are we at? Embedding is currently in testing and has the following mechanic. Hovering your mouse cursor over one of the paragraph numbers in a document raises a floating panel that contains a link to the current paragraph, and an embed code. (The panel remains open whilst the cursor is over it, so you can easily grab a copy of the code.)

Embedding in digress.it

Using the embed code in a third party page embeds the corresponding paragraph in that page.

For testing purposes, the pattern we are using for the embed URL is of the form:

http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER

The POSTNUMBER identifies the actual page (i.e. http://docserver.example.com?p=POSTNUMBER is a valid page URI) and the PARANUMBER identifies the paragraph to be embedded. Note that this is subject to change.
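By way of a worked example (with made-up numbers), if a document section lives at http://docserver.example.com?p=123, its tenth paragraph would be embedded using http://docserver.example.com?p=123&digress-embed=10.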

Unfortunately, the simple embed strategy does not trivially generate a linkback (such as a trackback or pingback) to the original document. For these reverse links to be generated automatically, an actual anchor tag linking back to the original page must be present in the page creating the linkback. One commonly used strategy for achieving this is to provide an embed code of the form:

<div>
<object />
<a>Quoted from etc…</a>
</div>

That is, a link is explicitly included in the embed code, although it is easy enough for the person embedding the quote to strip that anchor tag out.

(Although it complicates matters, as the embedded object is being pulled from the document server, I guess that means we could, in principle, generate a linkback by observing the referrer page URIs for requests made on the server for particular embeddable objects and checking those against the current list of trackbacks? Or maybe the embedded object could generate an XML-RPC call back to the trackback server itself whenever the page it is embedded in is loaded? [Note to self: can we easily get analytics on third party embeds?] I think Eddie is working on this, so I won’t embarrass myself further wittering on about things I don’t know anything about! ;-) )

Note that a similar problem arises when using a JavaScript (<script> tag) based embed code: there is no explicit anchor link present. Script tags also have the additional problem that they are often sanitised (i.e. stripped out of web pages) by many institutional web publishing systems. (In some circumstances, a workaround may be possible in the institutional case. For example, if a variant of WTR/JISCPress was running as a white-label solution in an institution, a shortcode plugin could be provided that allowed authors to embed paragraphs from documents in that environment within other documents in that environment. See the WordPress shortcode API for more details.)

As well as the straightforward embed code, we’ve also been considering other ways in which paragraph level content can be published so that third parties have convenient access to it in a format that is appropriate for their needs.

And this is what we came up with – an output switch that can be appended to the end of a paragraph URI, allowing the paragraph-level content to be published in a variety of formats:

  • &output=html
  • &output=rss
  • &output=txt
  • &output=js
  • &output=json

As and when these come on stream, we’ll publish use-case examples for each of them.
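In the meantime, here’s a minimal sketch of how a third-party page might consume one of these outputs with a little JavaScript. Note that the JSON structure (and the ‘text’ field in particular) is entirely hypothetical, since the formats are still coming on stream; note too that a cross-domain request like this would be blocked by the browser’s same-origin policy, which is presumably where the &output=js variant (or a server-side proxy) would come in:

// A sketch only: the response structure and its 'text' field are assumptions.
var url = 'http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER&output=json';
var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function () {
  if (xhr.readyState === 4 && xhr.status === 200) {
    var para = JSON.parse(xhr.responseText);
    // Drop the quoted paragraph text into a placeholder element on the page.
    document.getElementById('quotedPara').textContent = para.text;
  }
};
xhr.open('GET', url, true);
xhr.send();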

If you have any comments on our “paragraph republishing” strategy, please post a comment below.

JISCPress: A document discussion platform

We’re very pleased to announce that JISC have agreed to fund JISCPress, a six-month, £32,500 project led by the University of Lincoln, in partnership with the Open University and based on WriteToReply. JISCPress will provide a scalable community platform for publishing and discussing project calls and final reports, in order to support the grant bidding and project dissemination processes.

As you may know, WriteToReply is run in our spare time – lots of late nights and busy lunchtimes. Since launching the re-publication of the Digital Britain – Interim Report, we’ve been looking for ways to bring benefits from our work on WriteToReply into the Higher Education community where we work. JISC fund much of the UK development and innovation in the use of ICT in teaching and research and, in March, announced their Rapid Innovations funding call.

We quickly re-published the call on WriteToReply to demonstrate the benefits of publishing funding calls in this way and then went on to submit a bid which proposed a community platform for the JISC funding call process, based on our experience of setting up and running WriteToReply. As with WriteToReply, this will be an open, public project and all documentation and code will be available under open licenses.

JISCPress is a platform aimed at people working in UK Higher Education, but the platform itself could be easily adapted for other uses, just as WriteToReply is primarily focused on government consultation documents. The final platform will be available as an Amazon Machine Image, so anyone will be able to host their own multi-document discussion platform with all the benefits you see on WriteToReply, plus the additional features we’ll be developing throughout this project. We’re already advocating the use of the platform in our own universities for the open (and closed) discussion of institutional strategies, for the critique of texts by students and for the peer review of research papers. What might you use it for?

Over on the JISCPress project blog, you’ll find links to a mailing list, wiki and code repository. Feel free to join us if this WriteToReply spin-off appeals to you. If you know anyone that might be interested, please do let them know.

You’re probably already aware that WriteToReply uses WordPress Multi-User and CommentPress. Eddie Tejeda, the developer of CommentPress, will be working with us on the project, and this will result in significant further development of CommentPress 2. So, if you’re interested in WPMU and CommentPress (as many people are), please consider following, contributing to and testing JISCPress.

We should also note that while the project is a spin-off of our work on WriteToReply, neither Tony nor Joss is personally receiving any funds from JISC. The contributions from JISC to cover our time on this project are paid directly to our employers and do not result in any financial benefit to us or WriteToReply (which is in the process of being formalised as a non-profit business). In other words, while WriteToReply is a personal project, JISCPress is part of our normal work as employees of our universities (both Tony and I are expected to routinely bid for and win project funds – you get used to it after a while!). Money has been allocated to fund dedicated developer time on the project, which will pay Eddie and Alex, a student at the University of Lincoln, for their work as freelancers.

Anyway, on with the project! Here’s the outline from our original bid document:

This project will deliver a demonstrator prototype publishing platform for the JISC funding call and dissemination process. It will seek to show how WordPress Multi-User (WPMU) can be used as an effective document authoring, publishing, discussion and syndication platform for JISC’s funding calls and final project reports, and demonstrate how the cumulative effect of publishing this way will lead to an improved platform for the discovery and dissemination of grant-related information and project outputs. In so doing, we hope to provide a means by which JISC project investigators can more effectively discover, and hence build on, related JISC projects. In general, the project will seek to promote openness and collaboration from the point of bid announcements onwards.

The proposed platform is inspired and informed by WriteToReply, a service developed by the principal project staff (Joss Winn and Tony Hirst) in Spring 2009, which re-publishes consultation documents for public comment and allows anyone to re-publish a document for comment by their target community. In our view, this model of publishing meets many of the intended benefits and deliverables of the Rapid Innovation call and Information Environment Programme. The project will exploit well-understood and popular open source technologies to implement an alternative infrastructure that enables new processes of funding-related content creation, improves communication around funding calls and enables web-centric methods of dissemination and content re-use. The platform will be extensible and could therefore be the object of further development by the HE developer community through the creation of plugins that provide desired functionality in the future.

Subject to user requirements, our planned project deliverables are:

  • A WordPress Multi-User based platform for authoring and publishing JISC funding calls in a form that allows paragraph-level comment and discussion either locally or remotely.
  • A meta-site that aggregates all document data into a single site for search, navigation by categories and tags and can syndicate searches, tags and categories.
  • Development of CommentPress to meet WCAG 2.0 accessibility guidelines, meeting public sector requirements.
  • Evaluation and integration of “related content” utilities to dynamically link related project calls and reports based on content and/or semantic analysis.
  • Evaluation and possible integration of remote, realtime messaging services such as Twitter and XMPP.
  • Evaluation and possible integration of enterprise authentication services such as LDAP and Shibboleth.
  • Evaluation and possible integration of OpenCalais, a semantic tagging service.
  • Documentation on how to exploit the benefits of AWS and clone the project instance for other uses.
  • A documented suggested workflow for document authors.
  • Documented examples of how to fully exploit the platform for data extraction and syndication.
  • Documented ‘user stories’ for the JISC funding call process.

If this sounds interesting, please do take a look at the full project proposal and join us on the mailing list.
