Why it is the way it is - Hypertext Design Issues

This post is an analysis of an early document on Hypertext Design Issues.

The key ideas discussed in the document concern Hypertext - should links be monodirectional or bidirectional, should links be typed, and so on.

These discussions were conducted in the early days of the web. It is interesting to see how things have evolved since this design was made.

Let's first get some facts right:
Hypertext links today:

  • Are Two-ended

  • Are Monodirectional

  • Have one link per anchor

  • Are Untyped

  • Contain no ancillary information

  • Don't have preview information

What are the implications of this design?
  • Hyperlinks are not multi-ended. A single link cannot point to multiple destinations. There are, however, cases where one-to-many, many-to-one and many-to-many 'links' might make sense. These types of connections among information nodes are what RDF/OWL help achieve.

  • An advantage of bidirectional links is that, often, when a link is made between two nodes, it is made in one direction in the mind of its author, while another reader may be more interested in the reverse link.
    Bloggers want to track the pages that have linked to their posts. Google's index lets us search for links to a particular page, and Linkback mechanisms have evolved in the blogging world to serve precisely this purpose. In general, however, we never know who has linked to our page.

  • It may be useful to have bidirectional links from the point of view of managing data. For example, if a document is destroyed or moved, one is aware of what dangling links will be created, and can possibly fix them.
    This problem has not yet been solved. Since links are monodirectional, dangling links cannot be detected: when the information linked to changes or moves, there is no way to find and clean up the links pointing to it.

  • About anchors having one or more links: this is still debatable. There are utilities that make every word a hyperlink and allow executing a host of 'commands' on the word: performing a Google search for it, looking it up on dictionary.com, mapping it (if it is a city) or looking it up on Wikipedia. However, I am not a big fan of these utilities, since they clutter the screen and the context detection is not yet great.

  • Typed links: I feel this is the single most important thing missing from hyperlinks on the WWW. While making types mandatory would have complicated the issue, a standard way to provide 'types' for links should have been provided. Anyway, it's the way it is. So how are people solving this issue? Microformats and RDFa are two approaches I know of. The data is mostly read silently by browsers and tools, and users are usually unaware that it is in the page. In other words, the user interface for typed links is still not great.

  • Meta information associated with links. Interesting! I am aware of Wikipedia articles containing the date a linked page was last visited, but as far as I know this is pretty much updated manually.

  • Preview information: Snap solves this very issue.
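Coming back to typed links for a moment: the nearest thing the web did end up with is the humble rel attribute on anchors, which Microformats piggyback on (rel="nofollow", XFN's rel="friend", and so on). A throwaway sketch - the HTML snippet is made up for illustration - of pulling these link 'types' out of a page with standard Unix tools:

```shell
# A made-up page fragment with rel-typed links (XFN-style and nofollow).
snippet='<a rel="friend met" href="http://example.com/alice">Alice</a>
<a rel="nofollow" href="http://example.com/spam">spam</a>'

# Print just the link 'types' declared on each anchor.
printf '%s\n' "$snippet" | grep -o 'rel="[^"]*"'
# prints:
# rel="friend met"
# rel="nofollow"
```

It is a poor man's type system - flat strings, no schema - which is exactly the gap Microformats and RDFa try to fill.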

  • The conclusion?
    Well, it's tough to say how optimal the design of hypertext on the WWW was. Introducing multidirectional links and typed links would definitely have helped the technical people out there, but would have introduced complexity that would perhaps have made it so tough for the web to flourish that it wouldn't be what it is today.

    Google search making use of Social Graph information???

    Today, while I was searching for my name in Google, I saw that towards the end, it showed my Delicious bookmarks in the search results. There is definitely no occurrence of my name in the pages that were listed, and the only connection I could see between 'me' and these results is that I had bookmarked the links in Delicious.

    Is Google making use of the Social Graph information that I provided while I was trying out Google Friendconnect? Doesn't this raise privacy concerns?

    Why it is the way it is - an analysis of the proposal by TimBL of the WWW

    Ever wonder why hyperlinks in the World Wide Web (WWW) are unidirectional? Why are links not typed? Why are links many to one and not many to many? Why do browsers have the restrictions that they have today? Why is the web the way it is?

    A lot of the answers to these questions are hidden somewhere deep in the web itself. Having come across several technical issues with the web, I began to wonder what the initial creators of the web perceived it to be. What was running through their minds when they came up with the idea of the web?

    I started tracing back into history to the very beginning of the WWW. That's how I came across the 'original proposal of the WWW'.

    So here are some of my notes on the paper:
    (Content in italics is from the paper.)

    Use cases for the WWW

    The initial use-cases for the WWW were related to project management - communicating project ideas, storing technical details for retrieval later, finding out who wrote a piece of code, fetching all related documents for the current task. Most of the proposal revolves around the system to allow for multiuser hypertext access which is non-centralized and non-hierarchical.

    Relationship to relational databases

    Linked information systems have entities and relationships. There are, however, many differences between such a system and an "Entity Relationship" database system. For one thing, the information stored in a linked system is largely comment for human readers. For another, nodes do not have strict types which define exactly what relationships they may have. Nodes of similar type do not all have to be stored in the same place.

    What does this mean?
    We do have entities and relationships, but there are no fixed rules. Entities don't need to have types and any two entities can be related to each other. There is also no restriction on where the entities are stored.


    The key ideas around Hypertext were put down by Vannevar Bush in 1945 in the form of the Memex. There were several attempts to implement Hypertext and also Hypermedia (linking images, video etc.). Ted Nelson coined the word Hypertext in 1965 and subsequently also coined the term Hypermedia. The first implementation of Hypertext in some form seems to be from Doug Engelbart in 1968. The buzz around Hypertext picked up during the late 1980s - there was a dedicated Usenet newsgroup, a series of conferences starting with Hypertext'87, several ACM papers, workshops etc. All this happened even before the WWW was born. There were several commercial products too, like HyperCard from Apple.

    TimBL had also tried his hand at building a hypertext system, which he called Enquire. TimBL claims to have built it as early as 1980, although the first mention of Enquire seems to be in this proposal, made in 1989.

    When I started researching HyperCard's features, I realized one thing: these products are easily 20 years old, and technology has changed a lot in that time. It is really hard to imagine what many of these products looked like. Either the source is not available in its entirety, or it is tough to compile. This reminds me of what Grady Booch said about having an archive of source code similar to the archives of books, videos, music and web pages.

    Anyway, the most important difference I see between Enquire and HyperCard is that Enquire was more of a 'programmer's playtool', while HyperCard was targeted towards end users.

    So while HyperCard had 'fancy graphics', Enquire had typed links and was available for multi-user access.

    WWW requirements

    About the requirements that TimBL put down for the WWW:
    * Remote access across networks, Heterogeneity, Non-Centralisation - These are now taken for granted. The WWW is ubiquitous, it never breaks as a system, and it can be accessed from just about any device that is Internet-aware.
    * Access to existing data - This was one of the reasons why the WWW became popular. It was easy to get existing data onto the web with minimal effort.
    * Private links -
    One must be able to add one's own private links to and from public information. One must also be able to annotate links, as well as nodes, privately.
    Frankly, I am not sure what TimBL means by private links 'from' public information.
    * Bells and Whistles - Graphical access to the web was considered optional.
    * Data analysis - This is one thing that has not taken off.
    It is possible to search, for example, for anomalies such as undocumented software or divisions which contain no people. It is possible to generate lists of people or devices for other purposes, such as mailing lists of people to be informed of changes.
    It is also possible to look at the topology of an organisation or a project, and draw conclusions about how it should be managed, and how it could evolve. This is particularly useful when the database becomes very large, and groups of projects, for example, so interwoven as to make it difficult to see the wood for the trees.

    The Semantic Web is showing this promise.
    * Live links - These are what are now called 'Dynamic pages' and most popular pages on the web are 'live' in that sense.

    The implementation

    Much of the academic research is into the human interface side of browsing through a complex information space. Problems addressed are those of making navigation easy, and avoiding a feeling of being "lost in hyperspace". Whilst the results of the research are interesting, many users at CERN will be accessing the system using primitive terminals, and so advanced window styles are not so important for us now.

    As I read this, I get the feeling that TimBL was not thinking of making the WWW a 'public' web that would be used by just about everyone - one where even a non-techie could build a page of content and hook it onto the web. Usability seemed to be of least importance.

    The only way in which sufficient flexibility can be incorporated is to separate the information storage software from the information display software, with a well defined interface between them.

    This division also is important in order to allow the heterogeneity which is required at CERN (and would be a boon for the world in general).

    A client/server split at this level also makes multi-access more easy, in that a single server process can service many clients, avoiding the problems of simultaneous access to one database by many different users.

    'Information display software' - now that's what the browser is! This is also what created the need for HTTP, HTTP servers and HTML.


    Do we still visualize the web as just content linked via Hypertext? How can we accommodate social networking and the whole realm of developments around Web 2.0 and social network applications?

    The web has surely come a long way!

    (Note: Draft content - subject to change)

    YQL - Yahoo's query language for the web

    This post is a part of the AfterThoughts series of posts.

    Post: A query language for searching websites
    Originally posted on: 2005-01-27

    I blogged about the idea of a query language for websites back in 2005. Today, when I was doing my feed sweep, I came across YQL, a query language with SQL-like syntax from Yahoo that allows you to query for structured data from various Yahoo services.

    There is one thing that I found interesting: the ability to query 'any' HTML page for data at a specific XPath. There are some details on the official Yahoo Developer blog.

    The intent of YQL is not the same as what I had blogged about. While YQL allows you to get data from a specific page, what I had intended was something more generic - the ability to query a set of pages, or the whole web, for specific data, which is a tougher problem to solve.

    In order to fetch specific data from an HTML page using YQL, all you have to do is:
    1. Go to the page that you want to extract data from.
    2. Open up Firebug and point to the data that you want to extract (using Inspect).
    3. Right click the node in Firebug and click on 'Copy XPath'.
    4. Now create a query in YQL like this:
    select * from html where url="" and xpath=""

    Although the idea seems promising, I wasn't able to get it to work for most XPaths.

    I guess the reason is the difference between the way the browser interprets the HTML and the way a server would interpret it. For example, if there is no 'tbody' tag in your table, Firefox inserts one, and it will be present in the XPath you copy, while a server that interprets the HTML after tidying it wouldn't see one. One way to solve this is to have the same engine interpret the XPath on the server side, or to be as lenient as possible when matching XPaths. I had similar discussions with the research team at IRL when I was working on my idea of MySearch, which had similar issues, and there were some interesting solutions that we discussed.
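As a crude illustration of the 'be lenient' option: the browser-inserted tbody elements can simply be stripped out of a copied XPath before it is handed to a server-side engine. The XPath below is made up for illustration:

```shell
# A Firebug-copied XPath containing the tbody that Firefox silently inserts.
xpath='/html/body/div[2]/table/tbody/tr[3]/td[1]'

# Strip every /tbody step so a server-side parse of the raw HTML can match it.
normalized=$(echo "$xpath" | sed 's|/tbody||g')
echo "$normalized"
# prints: /html/body/div[2]/table/tr[3]/td[1]
```

A real fix would live inside the XPath matcher itself, but the idea is the same: treat the browser's DOM repairs as optional steps.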

    I would say it is only a matter of time before someone cracks the problem of fetching structured data from the semi-structured data on the web and making it available to other services. Tools like Dapper, Yahoo Pipes, YubNub and YQL are just the beginning.

    I have made several attempts at this, right from using one of these tools to building my own using Rhino, Jaxer etc., but until now the solution I am most content with is a combination of curl, grep, awk and sed.
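For the curious, the kind of pipeline I mean looks roughly like this. The page content is inlined here so the sketch is self-contained; in practice the first stage would be curl -s fetching the real URL:

```shell
# Stand-in for: page=$(curl -s "$url")
page='<html><head><title>Hypertext Design Issues</title></head><body>...</body></html>'

# Cut the <title> element out, then strip its tags.
echo "$page" | grep -o '<title>[^<]*</title>' | sed -e 's|<title>||' -e 's|</title>||'
# prints: Hypertext Design Issues
```

It is fragile (regexes over HTML always are), but for one-off extraction jobs it beats standing up a whole toolchain.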

    weRead - what's new?

    Ever since I blogged about iRead back in April, a lot has changed. We have introduced tons of new features, and there is really not one place where we have captured all of them.

    So this is my attempt to describe the features to our readers.

    • iRead is now called weRead and we have partnered with Lulu
      This post from our official blog has more details.

    • We now have a destination site
      You don't have to log in to Facebook or some other social network to access weRead. You can access your bookshelf directly from our destination site. If you have already used weRead on Facebook or one of the other social networks, you can link your account and access the same account from the destination site.

    • Connections - find people like you
      This Facebook feature allows you to find people who have book tastes similar to yours. You can look for people of a specific gender, people in your network and people in specific age groups.

    • We now have friend activities on the homepage
      We now show activities from your friends on weRead on the homepage. This helps you keep track of which books your friends have been reading, and whether they have participated in any discussions.

      Activity of friends on weRead

    • Book discussion boards
      This is the place to discuss your favorite books with your friends and network - what you liked, what you didn't like, and why someone should or shouldn't read a book.

    • Author discussion boards
      If you want to discuss a specific author - talk about which of their works are good, or what you expect their next book to be like - this is the place to do it. Check out the latest discussions here.

      AC Discussion Board

    • Author profile claim
      Are you an author? Then you should be on weRead. weRead makes it ultra simple for you to set up a profile and interact with your readers. Writing a new book? Want to know who might like it? Want to get suggestions from your readers? Want to promote your book on various social networks? Start here.

      weRead for authors

    • New catalogs
      We now have catalogs from Amazon, Google and OCLC integrated into weRead. This means you have a whole range of books to choose from. More catalogs are coming soon.

    • weRead is now available in multiple languages
      weRead is now available in 6 different languages - English(US), English(UK), German, French, Spanish (on Hi5 only) and Portuguese (on Orkut only). We have more languages being added soon. Want weRead in a local language? Help us translate weRead here.

    • We now have limited previews of books from HarperCollins and Google Books, and full previews of some books from Gutenberg
      This will give you some sort of a 'bookstore experience' by allowing you to preview books.

    • See how a book fares in your network
      Curious to know how a book has been rated by people in your network? We now give you near-realtime statistics about a book - how people in your network have rated it, how many people own it, how many have marked it a favorite etc.
      Find who has read a book in your network

    • Readers now have a profile page which displays their bookshelf
      Each weRead user gets his/her own personal page that they can then share with their friends, bookmark, etc. In order to set up your own profile page, link your account from Facebook to our destination site and click on the "Profile" link in the top blue bar. Check out my profile page here.

    • Readers can showcase their bookshelf in their blogs and other sites
      Want to advertise your bookshelf in your blog? It's simple! Go to your profile page and then click on 'Take weRead with you', get the code and put it in your blog. You also have some customization that you can do before you get the code. Check out a demo here.

    • The Facebook Wall application allows you to post information about books, write reviews etc directly from the Facebook Wall.
      You can now chuck a book at your friends directly from the Facebook Wall. Go to your Facebook profile page: http://www.facebook.com/profile.php. Under the Wall tab, you should see the Books iRead option. Clicking this opens a dialog that allows you to pick a book from your shelf, or search for one, and chuck it at your friend.

      Facebook weRead Wall application

    • Similar authors
      On every book detail page, we show similar authors, helping you discover authors who write books similar to the one you are viewing.

    • Misspelt searches
      weRead now has built-in suggestions in case you misspell some word while typing your query.

    • See more like this
      We have launched a 'StumbleUpon'-like feature. When you are viewing a book on weRead, you will see a 'See more like this' button; clicking it takes you to a random but related book.

    • External integration with OCLC
      We now power OCLC's related books and reviews.

    • We have also moved to bigger and more powerful servers, which means a better user experience for all our readers.

    As you see, we have been busy! We have tons of new and exciting features lined up and we promise to provide feature updates as frequently as possible. A lot of these features revolve around making weRead a truly social application.

    By the way, you can get some quick updates on weRead in our Twitter page.

    Happy reading!

    PS: Features and feature names are subject to change.

    Who do we believe?

    As information becomes cheaper every day and we get access to more and more of it, I see one problem. Certain 'well known theories' are being proven untrue, and some 'facts' turn out to describe things that never really occurred. Yet these are things that we studied during our schooling as 'facts'.

    On one side, this is a good thing. It makes you question everything you read or hear and not just accept things blindly. But on the other side, it makes you feel, well, then, what do we believe?

    Wikipedia is a classic example of information accuracy and the arguments around it. Do you trust Wikipedia? Take a controversial article - say Scientology, Crop Circles or the Nazca lines. Would you believe what Wikipedia has to say? Isn't there a slight possibility that the accepted theory is wrong, especially when there are mathematicians, archaeologists, physicists and historians who subscribe to either side of these controversies?

    What if a vast majority of people believe something that is actually not true? Wasn't the earth once believed to be at the center of the solar system, with the sun revolving around it?

    Here are some things that I came across in recent days:
    1. The theory of evolution versus the theory of Intelligent Design.
    2. The Sphinx mystery - is the Sphinx older than initially thought, and does it have connections to Mars?
    3. The Aryan invasion theory - did it really happen?
    4. Is global warming a myth?
    5. Aliens and UFOs - has anyone really spotted them?
    6. Did man really land on the moon?

    Well, the list is endless. If you look for information on any of these, you will find tons of material that can convince you either way.

    Not all of us are mathematicians or theoretical physicists, nor do we have the time to verify every single 'fact' we come across.

    So the question is: how do we decide what to believe, and whom do we trust?!

    It's official - Lulu partners with weRead

    So finally the news has been made official.

    Lulu today announced a partnership with weRead (iRead).

    Lulu is a platform that enables wannabe authors, musicians and other creators to bring their work directly to their audience. Publishing is free, and the lack of middlemen means that the freedom lies in the hands of the creator. Lulu was founded by Bob Young, co-founder of Red Hat and an extremely successful entrepreneur. Lulu is the world's fastest-growing provider of print-on-demand books.

    With this partnership, there are several exciting things that we are looking at.

    With weRead, Lulu users now get a simple way to make their creations available on all popular social networking sites and promote their work. weRead users, in turn, get a much larger catalog of books, some of which are not available anywhere else.

    Well, this is definitely just the tip of the iceberg and we see several other exciting things ahead.

    News about the partnership from the Lulu site:
    "Lulu (www.lulu.com), the world's largest marketplace for individual, educational, and corporate authors and publishers to bring their books directly to market, announced today an alliance with weRead (www.weread.com), the leading social networking application for books where readers can easily discover and recommend books to their friends on social networks and therefore, the world."

    Over the next few weeks, you should see several new features on weRead. There is one theme that we are concentrating on - making weRead more social - which is why we thought it made better sense to name it weRead rather than iRead.

    The future now looks promising!

    The Afterthoughts - If Google came up with an RSS Reader

    So here is another post in The Afterthoughts series.

    Post: If Google came up with an RSS Reader
    Originally posted on: 2005-01-30

    This post was made long before Google came up with Google Reader. I was experimenting with RSS readers and started wondering what it would be like if Google came up with an RSS reader.

    Now that we have one from Google, it is time to look back and see how my expectations matched with the actual product.

    > * It would first buy the domain "greader" or something similar.
    This didn't happen. However, Google Reader is popularly called GReader. I guess I made this comment because of Gmail.
    On a side note, Google does own greader.net.

    > * It would have an index of more than 8 million different feeds.
    This is not how RSS readers have evolved. Google Reader does have recommendations based on the feeds you already subscribe to. It would be good to see an integration of Google Blog Search or even Google News with Google Reader. The only integration I see is the ability to subscribe to search results from both of these in Google Reader (a 'new' feature).

    > * It would offer 1 GB space for storing posts.
    The storage in most online readers is unlimited.

    > * It would have an excellent search feature for searching posts.
    This was a surprise! The feature came in so late. Totally unexpected.

    > * The interface would be simple, but at the same time powerful.
    You bet this has been true. The keyboard shortcuts are just superb. The speed with which you can navigate and read feeds is extremely good. (You will need my script to make it even faster. :))

    > * We would be able to mail any post just at the click of a button.
    I guess this feature has been around for quite some time now.

    > * It would allow us to filter posts and also label them for future reference.
    With tagging and folders, this has been better than expected.

    > * It would also allow us to make blog entries (of course the service would be integrated with Blogger.)
    Again, this is a surprise: Google has not provided any integration with Blogger. However, Google recently added a feature to share an item with notes. With the microblogging revolution, and Google having acquired Jaiku, I guess that integration will happen first.

    > * It would integrate greader with other offerings like mail, groups etc.
    The integration is not that great as of now. It would be cool to see posts related to a mail, or a message in a group etc.

    > * It would be Beta forever. :)
    Surprise! This isn't true!

    Final thoughts:
    More than 3 years after the original post (which is a lot of time in technological evolution), I should say Google did match most of the expectations I had back then, and some features were developed much better than I had expected. However, integration with other services is one area where it could have done better.

    Getting Rosegarden to work in Ubuntu (Gnome)

    Updates to this post at my new blog: http://buzypi.in/2008/07/15/getting-rosegarden-to-work-in-ubuntu-gnome/

    I am one of the many people out there who had trouble getting Rosegarden to "sing" in Ubuntu under the Gnome desktop. Finally, after trying a lot of permutations and combinations, I got Rosegarden to work. I made this post to share what I did, so that others don't have to go through the same trouble I did!

    So let's proceed.

    Required software

    Rosegarden requires some other applications to be installed on your system. So before you fire up Rosegarden, ensure that you have the following:
    1. qjackctl
    2. qsynth
    3. rosegarden
    4. fluid-soundfont-gm (a General MIDI soundfont)
    If you don't have any of these, you can install them with:
    $ sudo apt-get install qjackctl qsynth rosegarden fluid-soundfont-gm

    Ok, now we have everything we need. Let's proceed to the configuration steps:

    Start the Jack server

    (Somehow not using sudo gave me problems)
    $ sudo qjackctl &

    Jack Audio Connection Kit setup

    Click on Setup
    Here are the settings I used:
    Setup - JACK

    Start the Jack server

    JACK Audio Connection Kit

    Start the synthesizer

    $ sudo qsynth &

    QSynth setup

    MIDI Setup

    Qsynth: Setup - Midi

    Audio Setup

    Qsynth: Setup - Audio

    Soundfonts Setup

    Qsynth: Setup - Soundfonts

    Start rosegarden

    Ok, it's time to fire up Rosegarden.
    $ sudo rosegarden &

    Configuring Rosegarden

    Go to Settings - Configure Rosegarden.

    Configure Rosegarden - General

    Configure Rosegarden - Midi

    Ensure that the connections are right in Jack Audio Connection Kit (Connect):
    Connections - JACK - ALSA

    Connections - JACK - Audio

    Play one of the sample files and you should hear music!
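Once the one-time configuration above is done, the startup sequence can be wrapped in a small shell function. This is just a convenience sketch of the steps described in this post, not anything Rosegarden itself provides:

```shell
#!/bin/sh
# Start the JACK server front-end, the synthesizer and Rosegarden in order.
# Pass 'echo' as the first argument to preview the commands without running
# them (handy for checking the sequence); by default each runs under sudo,
# matching the commands used earlier in this post.
start_stack() {
  run="${1:-sudo}"
  $run qjackctl &      # JACK front-end: start the server from its window
  sleep 2              # give JACK a moment before the synth tries to connect
  $run qsynth &        # the software synthesizer
  sleep 2
  $run rosegarden &    # finally, Rosegarden itself
}
```

The two-second pauses are arbitrary; the point is only that JACK should be up before QSynth connects to it, and both before Rosegarden.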


    In case your Jack server is not running, you might want to execute this command and then start the Jack server:
    $ sudo /sbin/alsa force-reload