July 5, 2014 · Weekend Testing

WTA-52 Experience Report

Here's a quick(ish!) summary of my notes from Weekend Testing Americas #52: "Going Deep with 'Deep Testing'", facilitated by Justin Rohrman. (I have a habit of promising quick reports, and then taking two hours to write them up, for which I make no apologies!)

We were introduced to the application-under-test for this session, TitanPad, a real-time collaborative document editor. We saw an introduction to deep testing, and Michael Bolton provided the definition as it is stated in the Rapid Testing Intensive course:

Testing is “deep” to the degree that it reliably and COMPREHENSIVELY fulfills its mission AND to the degree that substantial skill, effort, preparation, time, or tooling is required to do so.

Each participant/group was asked to pick an area of the application on which they would perform deep testing. I decided to focus on the Chat area of the application, as this seemed to be an important part of document collaboration.

Sending and receiving chat messages

With TitanPad seeming to deal solely in plain-text messaging, I looked into how I could vary the content/format of these posts to cause unexpected results. (A few such problems can be found in the Issues section below.)

In order to "get deep", I harnessed some tools which I had at my fingertips, to allow me to easily perform some tasks which might otherwise have some setup overhead:

  • Range of characters supported: I took advantage of Perlclip's built-in ability to generate a wide range of character codes, and sent these in a chat message. In a different browser, I then copied this message out of the chat window and back into Notepad, diffing the string against the value which I originally pasted-in, to confirm that the characters remained intact after a send/receive.
  • Size of message supported: Again, I used Perlclip (this time using its "counterstring" function) to generate unnaturally large chat messages. TitanPad happily handled a 200,000-character message without any delay or performance degradation, which was quite impressive. When I increased the message size by an order of magnitude (2 million characters), I received a "413 Request Entity Too Large" response from the server, so there appeared to be a server-enforced limit at play.
  • Message timestamps: I fired-up a virtual machine on my desktop, and set it to operate in a different timezone. I confirmed that (on my desktop and VM) message timestamps were converted correctly to the user's region.
  • Post format: I used Firebug to analyse the requests/responses which occur when a chat message is sent. I noticed that POSTs were occurring in the format shown below. I spent a bit of time seeing whether I could easily manipulate this, e.g. whether I could spoof user IDs. I wasn't successful in the time available, but with more time or foresight, I would've installed Fiddler to make this analysis much easier.
m 	{"type":"COLLABROOM","data":{"type":"CLIENT_MESSAGE","payload":{"type":"chat","userId":"g.sp0d8sm4k0zav46d","lineText":"dummy","senderName":"NeilCHANGED","authId":"g.sp0d8sm4k0zav46d"}}}
  • Testability: Through the Firebug console, I determined that there was a range of pad-prefixed library functions which I could use to interact with the application (see below). The padchat library seemed as though it could prove particularly useful for the task at hand, but its functionality was fairly limited. The only pertinent commands related to reloading the panel and scrolling it to the top/bottom; had it included the ability to post chat messages, the testability of the application (and particularly my ability to automate chat-related checks in the future) would have been vastly increased.
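For anyone without Perlclip to hand, the counterstring and round-trip ideas from the list above can be roughly sketched in Python. This is not Perlclip's own code: the function names `counterstring` and `first_difference` are illustrative assumptions, and the comparison simply reports the first position where the sent and received strings diverge.

```python
from typing import Optional


def counterstring(length: int, marker: str = "*") -> str:
    """Build a self-describing string of exactly `length` characters.

    Each number states the 1-based position of the marker that follows it,
    so a truncated message reveals roughly where it was cut off.
    """
    parts = []
    pos = length
    # Work backwards from the end so the final marker lands on `length`.
    while pos > 0:
        token = f"{pos}{marker}"
        if len(token) > pos:
            # Not enough room left at the front: keep only the tail.
            token = token[-pos:]
        parts.append(token)
        pos -= len(token)
    return "".join(reversed(parts))


def first_difference(sent: str, received: str) -> Optional[int]:
    """Return the 0-based index of the first mismatch, or None if identical."""
    for index, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return index
    if len(sent) != len(received):
        # One string is a strict prefix of the other.
        return min(len(sent), len(received))
    return None
```

A 200,000-character message would come from `counterstring(200000)`. After copying the received text back out of the chat window, `first_difference(sent, received)` returning `None` means the message survived the send/receive intact.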

Firebug console
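With a proxy tool in place, varying the fields of the chat POST could be done systematically rather than by hand. The sketch below takes the payload quoted above and produces variants with selected fields changed. The helper name `with_payload_fields` is hypothetical, and nothing here sends a request; it only prepares the JSON that a tool could then submit.

```python
import json

# The message body exactly as observed in Firebug during the session.
OBSERVED = (
    '{"type":"COLLABROOM","data":{"type":"CLIENT_MESSAGE","payload":'
    '{"type":"chat","userId":"g.sp0d8sm4k0zav46d","lineText":"dummy",'
    '"senderName":"NeilCHANGED","authId":"g.sp0d8sm4k0zav46d"}}}'
)


def with_payload_fields(message_json: str, **overrides: str) -> str:
    """Return a copy of the message with some payload fields replaced.

    Only fields already present in the observed payload may be overridden,
    so a typo in a field name fails loudly instead of adding a new key.
    """
    message = json.loads(message_json)
    payload = message["data"]["payload"]
    unknown = set(overrides) - set(payload)
    if unknown:
        raise KeyError(f"not in observed payload: {sorted(unknown)}")
    payload.update(overrides)
    # Compact separators keep the output in the same shape as the original.
    return json.dumps(message, separators=(",", ":"))
```

For example, `with_payload_fields(OBSERVED, senderName="Someone Else")` changes only the display name, leaving `userId` and `authId` untouched, which makes it easier to see which single field the server actually trusts.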

Performance and connectivity

Given that the application is focused around multi-user editing, and with the chat panel therefore expected to handle users entering and leaving at regular intervals, I decided that I wanted to perform a stress-test of this functionality. With only a short amount of time for the session, I didn't have time to create a script for this purpose, but I had a different brainwave.

As TitanPad deals with public documents (no login required), I utilised a tool which normally serves a completely different purpose: Browsershots. It's a handy tool for quickly previewing your document's layout/design in a wide range of browsers, firing-up 148 different connections in a short space of time, and then (after automatically capturing a screengrab of the loaded page) disconnecting again. By giving Browsershots the URL to my TitanPad document, I was able to effectively add and remove roughly 150 users to my document with no effort on my part. (Ideally I would've liked for each of those users to submit one or more chat messages; I'd need to invest some time to create a harness for this, but it would seem to be worthwhile.)
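A purpose-built harness for this kind of join/leave churn might be shaped something like the Python sketch below. Everything in it is an assumption for illustration: `FakePad` is a stand-in that only counts connections, and a real harness would replace `visit` with code that opens the pad, optionally posts a chat message, and disconnects.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakePad:
    """Hypothetical stand-in for a pad; it only tracks connection counts."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.current = 0      # users connected right now
        self.peak = 0         # highest simultaneous count seen
        self.total_joins = 0  # every join since the start

    def join(self) -> None:
        with self._lock:
            self.current += 1
            self.total_joins += 1
            self.peak = max(self.peak, self.current)

    def leave(self) -> None:
        with self._lock:
            self.current -= 1


def visit(pad: FakePad, dwell_seconds: float) -> None:
    """One simulated user: connect, linger briefly, disconnect."""
    pad.join()
    try:
        time.sleep(dwell_seconds)
    finally:
        pad.leave()


def churn(pad: FakePad, visitors: int, max_parallel: int = 20,
          dwell_seconds: float = 0.01) -> None:
    """Run `visitors` short visits, at most `max_parallel` at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(visit, pad, dwell_seconds)
                   for _ in range(visitors)]
        for future in futures:
            future.result()  # re-raise any error from a visit
```

Calling `churn(FakePad(), visitors=150)` mimics the Browsershots burst. Capping `max_parallel` gives control over how many users are connected simultaneously, which Browsershots itself did not offer.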

Shortly after I started this Browsershots test, I noticed that my Firefox document session had lost its connection. It appeared that other participants were experiencing the same problem...

[17:36:17] amy.kroh: anybody seeing load issues?
[17:36:26] Neil Studd: Yes, me too
[17:36:31] Neil Studd: "Lost connection with the EtherPad synchronization server." in top-right
[17:36:37] Neil Studd: Um... I think I may have caused it ;)
[17:36:39] richardsbradshaw: we saw this too
[17:36:39] amy.kroh: reconnect
[17:36:40] justin_rohrman: I am having that same problem
[17:36:44] Neil Studd: I will be repeating in a minute when it settles down.
[17:36:49] justin_rohrman: reconnect fixed it :)
[17:36:54] jayshreejrathod: yes I am too seeing same connection error

I repeated the same test near the end of the session, and again my fellow participants reported connectivity issues. This TitanPad support topic suggests that connection loss is a known ongoing issue, so even if I was the guilty party, I may simply have triggered a problem that was already known.

In fact, at the very end of the session, while investigating something completely different, I stumbled across the TitanPad "Limitations" Help page, and slapped my head when I saw this:

All Pads have a limit of 64 simultaneous users. Please note that usability degrades quite harshly after more than a dozen users since coordination of many people editing the same text isn't an easy task.

Doh!

I wished that I'd looked for this at the start of the task. It's the only Help page on the site, so it wasn't exactly hard to find. It would've recontextualised much of my testing, as I would've shifted focus away from these known performance problems.

Issues

During my one-hour investigation, these are the issues that I discovered related to the Chat functionality:

  • The Chat panel doesn't notify (visually or audibly) when a new message is posted. So if you're reading through the chat history, you don't see that there has been an update.
  • The Chat panel is fixed-width, and the inability to resize the panel could cause issues, especially when dealing with long strings, or strings which don't wrap (such as URLs).
  • It's not possible to insert linebreaks into chat messages, and if you try to paste text which contains linebreaks, they will be stripped-out. This suggests that users are expected to use the chat for short bites of information, rather than needing any structured formatting.
  • Similarly, chat messages cannot be given custom font style/weight/colour; everything is sent/received as unformatted text.
  • When the Chat history is retrieved, the URL contains "start" and "end" parameters which specify an integer range of history items to retrieve. If either is changed to a non-integer value, an error page is displayed.
  • When I attempted to post a 2,000,000-character chat message, I received a "413 Request Entity Too Large" response in Firebug, and the message was not sent. My feeling is that users are unlikely to enter such large messages, but if an error like this occurs, it would be beneficial to surface it to the user (rather than hoping they are looking at the correct debug panel).
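The "start"/"end" finding above lends itself to a small generator of probe URLs. In this Python sketch the base URL is a placeholder (the real history URL is not recorded here), and the list of probe values is only a starting set of assumptions about what might be interesting.

```python
from typing import List, Sequence
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Values that are not plain in-range integers; extend as ideas arise.
PROBES = ["", "abc", "-1", "1.5", "99999999999999999999", " 1"]


def vary_query(url: str, **overrides: str) -> str:
    """Return `url` with the named query parameters replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(overrides)
    return urlunsplit(parts._replace(query=urlencode(query)))


def probe_urls(url: str,
               params: Sequence[str] = ("start", "end")) -> List[str]:
    """Vary one parameter at a time so any failure points at one input."""
    return [vary_query(url, **{param: value})
            for param in params
            for value in PROBES]
```

Given a placeholder such as `https://example.invalid/chat-history?start=0&end=50`, `probe_urls` yields twelve URLs, each differing from the original in exactly one parameter.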

Here are some other things which I noted in passing, which I didn't investigate further, as I didn't want to distract from my Chat-testing mission:

  • The Help page states a Deletion Policy for documents, but I found plenty of instances of documents not being deleted in a timely fashion. For instance, this Google search seems to suggest that all exported documents gain a permanent URL which Google is able to crawl, which might contribute to keeping exported documents alive longer than expected.
  • Similarly, whilst trying to narrow-down other Google-cached pages, I came across this set of results which seems to contain a solitary document from 2010 which, for some reason, hasn't been deleted. I'm immediately curious as to what's "special" about this one document.
  • If I look in the Firebug net panel when I load the page, there is always a "xhrXdFrame" load failure. This could point to a wider issue.
  • When experimenting with URL manipulation for message-sending, I managed to generate a "So such socket" message, which I think was supposed to say "No such socket" (not that the latter is any clearer to any end-user who sees it).
  • In various places (especially error pages), the site header still displays the old EtherPad product name.

Wrap-up

We discussed whether up-front planning was necessary to perform deep testing. I think that an element of planning is certainly useful; had I performed the necessary due diligence, I would've reviewed the help page (discovering the documented 64-user limitation) and browsed the support forum (discovering that document deletion issues are important to users, and that connectivity issues have been reported by others) which could have limited my testing in certain areas, freeing up some time for increased depth elsewhere.

I posited that "You can't go deep until you know how deep deep is". While you certainly can't cater for all of this up-front, I think that there is value in some up-front analysis (e.g. discovering the parameters that you will be able to vary during your testing, and lowering the barriers to testability). You will certainly discover more layers (and more opportunities to "go deep") as you proceed further down. Even if you don't describe these as preparatory activities, it's very possible that these will be the first activities that you perform when you begin deep testing.

In conclusion, it was a very interesting session which provoked some different ways of thinking, and some introspection in an area which I often take for granted. I found some potentially valuable issues, made some classic mistakes, and gained an insight into the activity of deep testing which will stand me in good stead for the future.

Next, my focus shifts to the forthcoming Weekend Testing Europe session #47 which I will be co-facilitating. More about that in the weeks to come...
