Some of you might be attending DevConf [1] in Brno in February 2013. I heard there would be some Anaconda talks.
I was contacted by Ben Levenson, who's trying to make sure we have enough interesting QA-related talks and labs. One of his ideas is to set up a usability lab for Anaconda NewUI, and he asked me to contact you.
The idea is that we would have a set of goals we would ask people to achieve, and we would watch how they handle the tasks. There might be interesting findings about the various program dialogs/screens and how different people cope with them. Of course, if you have other ideas about how the lab should work, it can be done another way.
The audience at the conference is mainly technical, with lots of Red Hatters, but there is also a good chance of participation from university students and other less technical people. Most of the attendees have probably installed some OS in the past, but I think we might get some people who have only ever installed Windows or Ubuntu, nothing else (some of them might even be Red Hatters, hah, we're not all Linux developers). It might be interesting to see them tackle Anaconda.
The main question is whether you would like to implement this idea. Our local Fedora QA team is willing to help you in any way with the preparation and execution of this lab.
It would be great to have some answers by the end of next week.
Thanks.
There was a short off-list discussion and it seems this could be done, provided there is enough interest among anaconda developers. We have a few QA guys ready to help you.
Some further details: I have talked to Radek Vokál, who organizes DevConf.cz. The schedule is expected to be pretty full and the talk submission deadline has already passed, but you can still submit one if you do it fast.
The most important piece of information I received is that we will be able to get only 1.5 hours of lab time. Is that enough? It might be if we test several people in parallel: 4 people in parallel, 30 minutes each, that's 12 people tested. Is that achievable? I have never run this kind of lab testing before.
We also have to prepare an abstract so that our session can go into the selection process and be announced properly. The abstract should contain:
* session name
* names of the people leading it
* short description
* some links, if appropriate
The abstract should ideally be submitted this week (to Radek Vokál).
We also need to handle the required hardware. It seems obvious we will have to have some computers prepared; this can't be done on laptops that people bring with them. I can talk to the right people to make sure the required hardware is ready, but I will need to know how many computers to request (and whether we have any specific needs, like laptops + external displays, etc.).
The most important piece of information I received is that we will be able to get only 1.5 hours of lab time. Is that enough? It might be if we test several people in parallel: 4 people in parallel, 30 minutes each, that's 12 people tested. Is that achievable? I have never run this kind of lab testing before.
We'll be able to get an awful lot more testing done if we have people test the DVD instead of the netinst. True, this will cut down on the testing we give the source spoke, but it's probably worth it to crank through more people doing other stuff. We can always come back and test netinst when we are less constrained for time.
We also have to prepare an abstract so that our session can go into the selection process and be announced properly. The abstract should contain:
Is there a template?
- session name
anaconda usability test
- names of the people leading it
I think I can be there to help, but ideally I shouldn't lead it for two reasons:
(1) I know too much about how things are supposed to work, and probably don't want to watch.
(2) It seems to me that a native Czech speaker would be a better leader. That way we eliminate any possible translation problems.
- short description
For Fedora 18, the installation program ("anaconda") has undergone a complete user interface redesign. It has moved from the older wizard style to a hub-and-spoke model where the user can do tasks in whatever order they choose. We are looking for data on the usability of this new interface by observing people running through the installer and taking notes on what problems they run into.
Something like that?
- some links if appropriate
We could link to the feature page, or mizmo's blog, or any of the other pages linked to from there.
The abstract should ideally be submitted this week (to Radek Vokál).
Ouch, limited time.
We also need to handle the required hardware. It seems obvious we will have to have some computers prepared; this can't be done on laptops that people bring with them. I can talk to the right people to make sure the required hardware is ready, but I will need to know how many computers to request (and whether we have any specific needs, like laptops + external displays, etc.).
I think we only have one camera setup here, which would make it difficult to film multiple people installing at the same time. However, that's probably a limitation we will have to live with. Thus, we should probably have a small number of computers so we can get one observer for each person doing the install. The observer can take detailed notes on what the installer is doing and where they get hung up.
As for specific needs, I think answering that first means we need to decide what we want out of this. Are we trying to get as broad a base of testing of anaconda as possible, or are we trying to get a larger sample size of people doing similar things? If the former, we will want specific setups (pre-installed Windows, pre-installed other Linux, etc.) on VMs. If the latter, the setup doesn't matter all that much and we could just use blank disks on VMs.
Probably whatever we do, we want VMs though.
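For the blank-disks case, setting up one such VM per station could be as simple as the following sketch (assuming libvirt/KVM on the lab machines; the VM name, disk path, and ISO filename below are just placeholders):

  # hypothetical example only -- adjust names, sizes, and paths to the real lab setup
  # creates a VM with an empty 20 GB qcow2 disk and boots it from the F18 DVD image
  virt-install \
      --name usability-test-01 \
      --ram 2048 --vcpus 2 \
      --disk path=/var/lib/libvirt/images/usability-test-01.qcow2,size=20,format=qcow2 \
      --cdrom /path/to/Fedora-18-DVD-x86_64.iso \
      --graphics spice

A blank qcow2 disk would also make it easy to snapshot the VM once and reset it between test runs.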
- Chris
On Thu, 2012-12-13 at 16:33 -0500, Chris Lumens wrote:
The most important piece of information I received is that we will be able to get only 1.5 hours of lab time. Is that enough? It might be if we test several people in parallel: 4 people in parallel, 30 minutes each, that's 12 people tested. Is that achievable? I have never run this kind of lab testing before.
We'll be able to get an awful lot more testing done if we have people test the DVD instead of the netinst. True, this will cut down on the testing we give the source spoke, but it's probably worth it to crank through more people doing other stuff. We can always come back and test netinst when we are less constrained for time.
We also have to prepare an abstract so that our session can go into the selection process and be announced properly. The abstract should contain:
Is there a template?
- session name
anaconda usability test
- names of the people leading it
I think I can be there to help, but ideally I shouldn't lead it for two reasons:
(1) I know too much about how things are supposed to work, and probably don't want to watch.
(2) It seems to me that a native Czech speaker would be a better leader. That way we eliminate any possible translation problems.
I believe all the Czech members of the team can be listed as leading the lab, as we will probably all be there.
(1) I know too much about how things are supposed to work, and probably don't want to watch.
Do you mean you don't want to watch people performing the tasks?
I don't understand. The sole purpose of this experiment is for anaconda developers to get more insight into possible interface deficiencies, as seen by users. Watching is a much stronger experience than reading a transcript (bugzilla is full of those). Why else would we do it?
(I certainly don't want to sound like I'm pushing you into this. It was just an idea, and QA will definitely help you with implementing it, but only if you want to do it, of course.)
Kamil Paral (kparal@redhat.com) said:
(1) I know too much about how things are supposed to work, and probably don't want to watch.
Do you mean you don't want to watch people performing the tasks?
I don't understand. The sole purpose of this experiment is for anaconda developers to get more insight into possible interface deficiencies, as seen by users. Watching is a much stronger experience than reading a transcript (bugzilla is full of those). Why else would we do it?
This was in the context of leading the session. While I don't want to speak for Chris, having the complete SME in the room leading the session with the testers may have the issue of consciously/subconsciously giving clues as to the 'proper' way to do things, distorting the feedback, at least as I understand how this testing is supposed to work. Mo?
Bill
On Fri, 2012-12-14 at 10:59 -0500, Bill Nottingham wrote:
This was in the context of leading the session. While I don't want to speak for Chris, having the complete SME in the room leading the session with the testers may have the issue of consciously/subconsciously giving clues as to the 'proper' way to do things, distorting the feedback, at least as I understand how this testing is supposed to work. Mo?
This is exactly right, Bill. In the same way you don't want developers QAing their own code, you don't want the folks who developed the code running the usability tests either. When you're knee deep in it for so long, it's really easy to miss things and have biases that will negatively impact the effectiveness of the testing.
This brings up another point. We do have a bit of an issue in that I lead the design/mockup work for the UI and I'm going to be working concurrently on the planning for the tests we're going to be running in Westford (and potentially remotely, on-site at our intern's campus). For actually administering the tests here, I can talk to Ryan Lerch and see if he can do it, so that I'm not the one talking to the test participants and am instead the person watching from the other side of the screen, chewing on my fist. ;-)
~m
On Thu, 2012-12-13 at 16:33 -0500, Chris Lumens wrote:
I think we only have one camera setup here,
The portable lab I have has 3 cameras, but it can only record one stream at a time (if that makes sense), so the extra cameras would do us no good.
Having audio of the tests, though, would be very useful... I can produce transcripts based on that if it's in English.
which would make it difficult to film multiple people installing at the same time. However, that's probably a limitation we will have to live with. Thus, we should probably have a small number of computers so we can get one observer for each person doing the install. The observer can take detailed notes on what the installer is doing and where they get hung up.
I can put together a worksheet that could be printed out and handed out to observers to make it easier for them to write up their observations.
As for specific needs, I think answering that first means we need to decide what we want out of this. Are we trying to get as broad a base of testing of anaconda as possible, or are we trying to get a larger sample size of people doing similar things? If the former, we will want specific setups (pre-installed Windows, pre-installed other Linux, etc.) on VMs. If the latter, the setup doesn't matter all that much and we could just use blank disks on VMs.
My intern and I will be putting together a usability test plan (hopefully with help from everybody here; we'll start the discussions as we begin that work, here and in #anaconda), and it should be ready in time for all of this. I have two categories of tasks I'd like included in the test plan:
1) Testing the general ability of users to install Fedora across scenarios typical for Fedora's target users (on a clean bare-metal machine, on a VM, with pre-installed Windows, with pre-installed other Linux, and on a Mac)
2) Testing of a set of tasks specific to the custom partitioning UI
We could focus on tasks from just one of those two categories for DevConf if we want. I think it'll be a more technical audience, so it might be worth focusing on the custom partitioning UI tasks? We could have tasks like, 'Here's a diagram showing a specific disk/partition layout. Re-create it using Anaconda.' And we could have a set of those: some layouts using LVM, some using BTRFS, etc.
I think we should emphasize coverage of features over having different users repeat the same task, if possible. I think we will get a lot more useful data covering more areas for improvement that way. (When you have multiple people doing the same tasks, there's typically a sharp diminishing of returns after about 5 or 6 users; even if only 1 user goes through a particular set of tasks in the test plan, that 1 user will uncover a good chunk of the most egregious problems.)
~m
On Thu, 2012-12-13 at 06:51 -0500, Kamil Paral wrote:
The most important piece of information I received is that we will be able to get only 1.5 hours of lab time. Is that enough? It might be if we test several people in parallel: 4 people in parallel, 30 minutes each, that's 12 people tested. Is that achievable? I have never run this kind of lab testing before.
If you have 1.5 hours, and we target the tests to be 30 minutes per participant, you're really only going to be able to do 2 runs. You'll probably have to do some cleanup work in between participant sessions on the same machine (e.g., delete the old VM, save out the recordings, etc.). But that's okay. If you could do 4 people in parallel, that's 8 people tested, and that's absolutely useful.
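(Just a thought on the cleanup step: if the stations run the installs in libvirt VMs as Chris suggested, reverting to a snapshot is probably quicker than deleting and re-creating the VM between participants. A rough sketch, assuming a qcow2-backed VM named usability-test-01 -- the name is only a placeholder:

  # once, before the first session: snapshot the freshly prepared, powered-off VM
  virsh snapshot-create-as usability-test-01 pristine
  # between participants: discard whatever the previous person did
  virsh destroy usability-test-01        # hard power-off if it's still running; fine for a throwaway test VM
  virsh snapshot-revert usability-test-01 pristine

That should keep the turnaround between sessions short.)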
Do you have access to four laptops that could be devoted to this testing during that time?
Having video recordings of the tests is honestly not all that critical - if we could get audio recordings of the tests as they were run and a sheet of observer notes per participant, that's a great body of data - we could have the audio transcribed, compare it against the notes, and get good data from that.
If you're running 4 tests concurrently, are there 4 QA folks who could each commit to administering one of the tests and writing down the observations?
~m