Sunday, December 16, 2012

Open browser test automation at WMF


Almost a year ago I started working as QA Lead for the Wikimedia Foundation.  Among other things, I had a mandate to create an automated browser testing practice at WMF.

From the start I wanted this project to be a world-class, completely open reference implementation, using the best and most modern tools and practices I could find.  I wanted a project that anyone could read, anyone could run, and to which anyone could contribute: an industry-standard example of a well-designed, well-implemented, working browser test automation project.

Around 2006 my career had veered off into a test automation approach that, while valid and useful in certain circumstances, would be inappropriate for the WMF project.  And in the years since 2006, the tools and practices that were immature at the time had grown into mature, stable, powerful projects of their own. 

I set out to educate myself about the details of the cutting edge of browser test automation in every language.  I visited Austin TX twice and San Francisco several times in the past year to discuss approaches to the project in person with experts on the subject. 

WMF hired Željko Filipin in October 2012 specifically for the browser test automation project, and just this week we opened the curtain and turned on the lights.  The WMF browser test automation project is in Ruby, using

* page-object gem
* watir-webdriver
* selenium-webdriver
* Cucumber
* RSpec
* rake
* Jenkins integration
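
To give a flavor of how these pieces fit together, here is a minimal sketch of a page-object/watir-webdriver test.  This is illustrative only, not actual WMF test code; the page URL and element ids are assumptions based on a stock MediaWiki login form.

require 'watir-webdriver'
require 'page-object'

# A page object encapsulates the details of one page, so tests read in
# terms of user intentions rather than raw HTML elements.
class LoginPage
  include PageObject

  page_url 'https://test.wikipedia.org/wiki/Special:UserLogin'

  # These element ids are assumptions based on a stock MediaWiki login form
  text_field(:username, :id => 'wpName1')
  text_field(:password, :id => 'wpPassword1')
  button(:login, :id => 'wpLoginAttempt')

  def login_with(user, password)
    self.username = user
    self.password = password
    login
  end
end

browser = Watir::Browser.new :firefox
login_page = LoginPage.new(browser)
login_page.goto
login_page.login_with('Some_user', 'some_password')
browser.close

In the full stack, Cucumber scenarios call step definitions that drive page objects like this one, RSpec supplies the assertions, rake ties the suite together, and Jenkins runs it continuously.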

The initial announcement is here.
A first pass at technical documentation is here, including instructions to run the tests locally if you want to see them in action on your own machine.
Some community concerns are here.
The main code base is managed in gerrit but there is a more accessible read-only mirror on github.
Our Jenkins instance is running on the CloudBees service using Sauce Labs hosts, and the current test results are visible here.
If you would like to follow the project or contribute to it yourself, we are starting a community group you can join; feel free to add yourself to the page here.

Many people helped me along the way to making this project what it is.  Hopefully I haven't left out anyone, feel free to remind me if I did!

Jeff Morgan for creating the page-object gem and for answering tons of my questions
Jari Bakken, especially for this presentation about webdriver in Ruby.
Brahma Ghosh
Charley Baker
Alister Scott and his Watirmelon blog
Marlena Compton and Matt Brandt for discussing their experience at Mozilla WebQA with me.
Bret Pettichord for hosting the Test Automation Bazaar
Jim Holmes of Telerik for hosting the Telerik Test Summit

And especially Željko for actually building the project! (And for kicking my ass along the way; I still have a lot to learn.)

Friday, June 01, 2012

Wikimedia Foundation hiring QA staff

The Wikimedia Foundation currently has two open positions for QA staff.  One is for a QA Engineer and the other for a Volunteer QA Coordinator.

I want to point out what a unique opportunity this is.

Before I was hired at WMF four months ago as "QA Lead", there had never been anyone working on the Wikipedias and related projects whose only focus was testing and quality.  There was no UI test automation, and there was no program of community testing.  Actually, there still isn't.  That's where these new staff come in.  No exaggeration: this is an opportunity to create from scratch the quality and testing practices for one of the great achievements in human history.  But WMF has only about 100 staff, and only about half of those are technical.  At the moment, there are about 50 regular contributors to WMF software (and millions and millions of users!), so this QA staff will be outnumbered and outgunned.  We need help, both from automation and from a community of people interested in testing.

WMF has two priorities this year for QA and testing:  one is to implement some browser-level test automation.  The other is to involve both the testing community and the Wikipedia community in more testing activities. 

As QA Lead, I figured it was my job to lay some groundwork for these activities.  So I did a lot of research on the current state of browser automation (I'd not been following recent developments there for some time) and built a demonstration and example of what I think the best available browser test "stack" should be.   (Note: there is some bad code in there right now, for a purpose)  I think Ruby is the way to go for this, and this is my justification for that.  I am really looking forward to working with our new QA Engineer to expand these tests, bring in community contributions, hook it up to Jenkins, run it against our brand new test environment, etc. etc.  My little spike is only an example, and I look forward to having my mind changed about how it will ultimately be of use. 

The other thing I've done is to begin having "test events" where WMF invites outside groups to help test some aspect of the Wikipedia software.  On May 5 2012 we teamed up with Weekend Testing to validate the new frequent deployment scheme when we began updating all the Wikipedia software every two weeks.  On June 9 we'll be teaming up with OpenHatch.org for a "shakedown cruise" of the new Wikipedia Article Feedback system that will be rolled out to all of Wikipedia in just a few weeks.  I'm looking forward to working with the new Volunteer QA Coordinator to expand community testing, both in partnership with other organizations and within the Wikipedia community itself.

As for me, to the extent that I am a "Lead", I tend to lead from behind.  I'll be working with new and evolving WMF projects to bring testing and quality work where it can be the most valuable.  I'll be working with the test automation communities to improve development practices through testing.  I'll be looking for cool stuff out in the world for Wikipedia to make use of.  It's a brilliant future. 

Tuesday, May 29, 2012

Join the Wikipedia/OpenHatch.org test event 9 June

Wikipedia allows users to leave feedback on each article.  Experienced Wikipedians analyze this feedback in myriad different ways to improve the Wikipedia user experience and to improve the encyclopedia itself. 

The Wikimedia Foundation has been creating a new Article Feedback system, and on Saturday 9 June from 10AM-noon Pacific time, WMF invites testers and anyone else interested to participate in a "shakedown cruise" to test a near-final version of the new Article Feedback system before the system is rolled out to all of Wikipedia.

Following on May's successful test event with Weekend Testing Americas, WMF is teaming up with the Open Source fans at OpenHatch.org for this event.  I am hoping that having experienced exploratory testers plus people interested in improving Wikipedia articles will be a killer combination of expertise and interest to shake out any final issues in this critical aspect of the Wikipedia experience.

Note: anyone who shows up on 9 June will have access to some of the Article Feedback features, but if you create an account on Wikipedia no later than 4 June and also make at least ten edits to Wikipedia, your account will be "autoconfirmed" and you will have access to many more Article Feedback features than a casual user would.  (Feel free to edit, for instance, the Wikipedia article on your home town, or an article on software testing or dice games; or, if you're shy, just update your User page and your Talk page a few times, since those edits count toward the ten as well.)

This is the announcement, and there is an optional sign-up sheet.  The Article Feedback Test Plan has all the details.


Monday, May 14, 2012

Testing (automation or whatever) 101: ask a good question.

I tried to do A, and I really don't understand the response I got, X.  Does this make sense?

I know it should be possible to do A, but I tried it and X happened.  What sort of conditions would cause that?

I tried to do A, and X, Y, and Z happened.  X makes sense, but I don't understand Y, what's going on here?

It doesn't really matter whether you're asking about automation or any other kind of testing.  The tricky part is that before asking the question, you had better be pretty familiar with A, and you had better be able to report X, Y, and Z in a reasonable way. 

I have a corollary, and I have a (counter) example.

I have seen any number of people in the software world complain about testers who submit bad bug reports.  I'm sure it's true, I've seen the evidence, and it boggles my mind.  A good bug report will explain A and explain X, and a great bug report will phrase the issue in terms of a question. 

Not long ago I got an email from someone asking about a little script I wrote some time ago.  He asked me to give it to him.  I have not replied. I was astonished.  For one thing, a cursory google search would turn up the 30 lines of code in question.  But even worse than that:  why don't you WRITE IT YOURSELF? 

It's quite possible my script no longer works.  It's quite possible that there are better ways to accomplish what the script does than what I wrote.  But I absolutely refuse to copy'n'paste 30 lines of code in an email response.

Eric Raymond (if you don't know that name, google it) long ago wrote an essay called How To Ask Questions The Smart Way.  I'm guessing that many readers of my blog are not familiar with it.  This is a travesty.

NB: the last time I pointed a software tester to Raymond's essay, I was accused of misanthropy and worse.  Testing might be dead.

Sunday, May 06, 2012

Testing Summit at Telerik


I attended the Test Summit peer conference this weekend at the invitation of Jim Holmes of Telerik.

It was outstanding, as such peer conferences tend to be, and I and others will be posting a lot of information as a result of what went on there. 

But I want to talk about the conference itself. 

Software testing has a long history of peer conferences, starting (to the best of my knowledge) with the Los Altos Workshop on Software Testing (LAWST).  Bret Pettichord borrowed the LAWST format for his Austin Workshop on Test Automation (AWTA) in the mid-2000s, and I borrowed from AWTA for my Writing About Testing (WAT) conferences in 2010 and 2011.  I think other examples exist.  The format has gotten looser over the years.  LOTS looser, as we find that motivated participants are pretty good at self-organization.

As far as I am aware, the Telerik Test Summit (TTS?) is the first such software testing peer conference created and sponsored by a commercial company.  I think this is significant, and that TTS has important implications.

As a general rule, I will not endorse or support commercial companies, even the ones I work for.  I don't even link from blog posts to my own writing for articles that are behind a registration-wall or pay-wall.  Now that I work for the Wikimedia Foundation, I have even more motivation to be as impartial as I possibly can. 

And yet I accepted the invitation from Telerik and Jim easily.  I'd like to explain that. 

I knew everyone in the room at TTS, and many of the attendees are good friends.  I had never met Jim Holmes. 

But I knew his work.  Jim created the CodeMash conference, which is notable not only because of its radical growth in popularity over the last several years, but also because of its reputation for friendliness and hospitality.   I'm not involved in CodeMash and I've never attended it, but I know a lot of people who have, and Jim has a stellar reputation among my friends and acquaintances. 

I didn't go to TTS because Telerik invited me;  I went because Jim Holmes has an unimpeachable reputation for integrity and hospitality. 

We were not pitched. We were not marketed to. There were no Telerik tools being discussed in the room.  We were given a tour of the Telerik office in Austin;  some attendees chose to pair up with Telerik staff to see what they were working on, some did not.   I consider this just hospitality. 

TTS should be a lesson to other companies.  Many of the best people working in software will give away their best ideas and even their best work freely.  But the moment we suspect we are being used or co-opted for some purpose other than moving the cause forward, we will turn on you with a vengeance you cannot imagine.

So my advice to any companies who might want to sponsor their own peer conferences in the future:  first, get yourself a Jim Holmes; second, listen to what he has to say; then, do what he tells you.  That's probably harder than you imagine.





Tuesday, May 01, 2012

Weekend Testing for Wikipedia May 5


On Saturday May 5 at 10:00 AM Pacific Time, Weekend Testing Americas will be investigating the new release of MediaWiki on Wikimedia Foundation sites before WMF rolls out the new version to all of Wikipedia.

I am really excited about this project; I do hope you will consider joining in.

Details of how to join are on the official Weekend Testing site:
1. Add “weekendtestersamericas” to your Skype contacts if you haven’t already.

2. Fifteen minutes prior to the start of the session, please message “weekendtestingamericas” and ask to be added to the chat session. Once we see you, we will add you to the session.

The test plan is here.


See you on Saturday!

Monday, March 26, 2012

conf report: Test Automation Bazaar

I went to the Test Automation Bazaar because one of the (many) things I want to do at Wikimedia Foundation is to start a browser test automation project, open to the greater software testing community, in support of Wikipedia and related projects.  I have been out of the loop on this front (forgive the mixed metaphor) for some time, and I particularly wanted to learn about two things:

* what an attractive, modern, well-designed browser test automation framework looks like
* page objects and their use

I got what I came for.  Particular thanks to Brahma Ghosh and Jeff Morgan/Cheezy for the great discussion.  I am putting on the white belt for this project, but at the same time I have years of valuable UI test design experience to bring to bear.

But beyond that, TAB was full of friends, many of whom I had never met. 

Kudos to Bret Pettichord for making it happen; I wish we'd had more time to chat, since I've known Bret longer than anyone else there, through some weird times back in the day.  Tiffany Fodor I had met once, about three years ago, and Rick Hower likewise.  Željko Filipin, Alister Scott, and Marek Jastrzebski I had known for years but never met in person.  And the amazing Jari Bakken, whom I met for the first time in San Francisco a few weeks ago.  Charley Baker was sorely missed by many.

The kindness and generosity of people in the Ruby community is just remarkable.

And there was ukulele playing.  It was my fault.

Sunday, February 26, 2012

deja vu: code, culture, and QA



Some years ago I had the privilege of making some suggestions for Brian Marick's book Everyday Scripting based on the first article I ever wrote for Better Software magazine.  That article appeared in 2004, and I just recently ran into a similar situation at work. 

Wikipedia is localized for well over 100 languages.  I had only been working at Wikimedia Foundation a couple of weeks when I heard that discrepancies between the localized message files from version to version could cause problems when upgrading.  I didn't know what kind of problems, but since we're upgrading all the Wikipedia wikis to version 1.19, that sounded like sort of a big deal, so I followed up.

It turns out that changes to the localization files are essentially undocumented, no tools exist to monitor such changes, and we simply did not know anything about discrepancies in those files.  So I decided it would be useful to look into that.

You can find the Wikipedia localization files for version 1.19 here  and for version 1.18 here if you want to follow along.  

Since there are well over 100 files in each directory and each file has thousands of lines, checking for discrepancies manually is impossible.  From one of the senior people on the Wikimedia dev staff I got a few examples of certain places in these files where discrepancies would cause big problems.  (See the technical note at the end.)  Although I've cleaned the code up quite a bit (one-off scripts don't have to be DRY, right?), here's what I did to find discrepancies for one of the examples:

In a directory called 'mediawiki' I have one directory 'lang118' and another 'lang119'.  In those directories are all of the Messages*.php files for each version.  What I want to do is read each file in each version, identify the contents of the $namespaceNames array, and compare those contents for every file in each directory. 

path119 = 'mediawiki/lang119/'
path118 = 'mediawiki/lang118/'

r119namespaceNames_array = []
r118namespaceNames_array = []

def get_values(path, array_name)
  Dir.foreach(path) do |name|
    unless File.directory?("#{path}#{name}")
      text = File.read("#{path}#{name}")
      text.scan(/namespaceNames.+?\)/m)
      array_name << name + $~.to_s
    end # unless
  end # do
end

get_values(path119, r119namespaceNames_array)
get_values(path118, r118namespaceNames_array)

mismatch = r119namespaceNames_array - r118namespaceNames_array
disc = mismatch.length.to_s
puts "number of files with discrepancies in $namespaceNames array is #{disc}"

mismatch.each do |string|
  file = string.split(".php")
  puts file[0]
end
 
 
This script runs from the directory above 'mediawiki'.  It defines the paths to where the localization files live, and defines two arrays to hold the values to be compared.  For each directory it calls the 'get_values' method, and puts the name of the file and the contents of the $namespaceNames array of that file into the appropriate array.  Subtracting one array from the other yields a set of all mismatches, and with that the script knows how many files have mismatches, and what the names of those files are.  

Reading this script should be fairly straightforward for anyone who knows a little bit of Ruby.  Note a few things, though: 

* 'unless' is equivalent to 'if not', and the script needs to skip directories and check only files
* File.read is the same as Perl's "slurp": it puts the entire contents of the file into the variable 'text'
* the 'scan' method takes a regular expression as an argument.  Here the regular expression says "give me all the text that begins with the string 'namespaceNames' and ends with the string ')'".  I had forgotten that '.+' is 'greedy' and will match past the terminating string, so using '.+?' prevents that; thanks to Charley Baker for the reminder.  The 'm' at the end of the regex tells it to match across multiple lines, which is necessary because each value of the $namespaceNames array is on its own line and I want to match all of them in one fell swoop.  (See the short illustration just below.)
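
To make the greedy versus non-greedy distinction concrete, here is a tiny illustration; the sample text is made up and hypothetical, not taken from an actual Messages file:

text = "$namespaceNames = array(\n  NS_MEDIA => 'Media',\n);\nwfMsg( 'foo' )"
text.scan(/namespaceNames.+\)/m)   # greedy: matches all the way to the last ')'
puts $~.to_s                       # includes the unwanted wfMsg( 'foo' ) text
text.scan(/namespaceNames.+?\)/m)  # non-greedy: stops at the first ')'
puts $~.to_s                       # just the $namespaceNames array, as intended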

The output from the full script looks like this:


number of files with discrepancies in $namespaceNames array is 16
MessagesEn_ca
MessagesEn_rtl
MessagesFrp
MessagesIg
MessagesMk
MessagesMzn
MessagesNb
MessagesNds_nl
MessagesNo
MessagesOr
MessagesOs
MessagesQug
MessagesSa
MessagesSr_ec
MessagesWar
MessagesYue
 
At this point it made sense to just look at the problem files with my eyeballs and see what was in their $namespaceNames arrays.  With a little help from diff, that's what I did.  I reported the discrepancies I found on a public mail list for Wikimedia tech issues.

A couple of interesting things happened because of that.  Again, keep in mind that I am a total n00b with these systems.  While I have a little more information now, at the time I had no idea what the consequences of such discrepancies would be.

I got an answer on the mail list from a senior Wikimedia dev person who analyzed the discrepancies I reported and said in effect "everything's fine, we are good to upgrade based on these examples".  And while there are several other areas in these localization files that could cause issues, my example demonstrates that the technical risk for upgrading to 1.19 seems low.

But then some days later, in a conversation on IRC, a different senior Wikimedia dev person said in effect "whoa, whoa, whoa, if we release these changes without at least some review from the language communities affected, we are going to be in for big trouble".

As I write this I do not know if the localization files for Wikipedia will be upgraded next week or not; that decision is not in my hands. However, I am immensely pleased that as a total n00b I was able to provide true concrete examples of the data in question to inform that decision.

I decided to write about this for a number of reasons:

To my mind, nothing in this story has anything to do with "testing".  For some time now I have been saying that "QA is not evil", and to me, this was an exercise in pure Software Quality Assurance.  Since my official title at Wikimedia is "QA Lead", this makes me happier than you would imagine.

One of the great neglected areas of software projects is the state of the actual data in applications, be it held in files or databases or whatever.  One of the most important skills QA/testing people can bring to bear on a software project is the ability to isolate critical chunks of data from enormous data stores.  That was true when I wrote "Is Your Haystack Missing a Needle" in 2004, it was true when Brian published "Everyday Scripting" in 2007 and it remains true today.  If as a QA/testing person you don't know how to read a bunch of files and do regular expressions (and for that matter do SQL queries too), you owe it to yourself and to your projects to learn. (Frankly, I hadn't done this kind of thing in a long, long time, and it felt great to get back on that horse.)

Finally, I wrote this because all of the data and all of the conversations we had were completely open and public.  I could give you a link to the email thread where I published the detailed discrepancies and got the reply, I could publish a link to the IRC log where people discussed the cultural risks of upgrading the localization files.  The only reason I don't is because they're not germane to the story.  I so enjoy working in an open culture.

Technical notes:

My original script checked for discrepancies among four arrays:  $namespaceNames, $namespaceAliases, $magicWords, and $specialPageAliases.  The $magicWords array was trickier, and I had to do this:

text = File.read("#{@@path118}#{name}")
text.scan(/magicWords.+?\);/m)
if $~.to_s.length > 0
  array = $~.to_s
  array_no_space = array.gsub(/\s+/,"")
  @@nsn118magicWords_array << name + array_no_space
end

For one thing, $magicWords is an array-of-arrays, so I check for a terminating string of ');' instead of just ')'.  For another, some of the files didn't contain the $magicWords array at all.  Finally, I found random differences in whitespace between versions for many, many files, so I eliminated all the whitespace in the strings in question with 'array.gsub(/\s+/,"")'.  Only after all of that did the comparison become valid.

Saturday, February 04, 2012

Who I Am and Where I Am, early 2012

I've been pretty quiet in recent times, but that's going to change somewhat in 2012, so I thought I'd write this(*) to catch up.

As of last week, I am the QA Lead for the Wikimedia Foundation. My job there will be to create, codify, and execute the software testing and quality assurance regimes for the software that powers Wikipedia and its associated properties.

I've worked some other interesting places, among them Thoughtworks and Socialtext. I like open source and wikis. I have been a dedicated telecommuter/remote worker since 2006. Depending on when you read this, I'm in either Santa Fe NM or Durango CO, or somewhere else.

I have written about software a lot. Most of my writing in recent times has been for SearchSoftwareQuality.com  (warning: registration wall), but I've also written a lot for StickyMinds.com  and a couple of articles for PragPub.   I wrote a chapter for Beautiful Testing.

I created the Writing About Testing peer conference and associated mail list and wiki. I like to think WAT has had some influence on the testing/QA field over the last couple of years. I've given presentations at a couple of Agile conferences, once at PNSQC, a couple of AWTAs, and some smaller peer conferences from time to time. I attended two GTACs.

I've been using browser automation tools (Selenium and Watir) for as long as they have existed. I know a lot about their history and something about how to use them well. As a programmer I am slow and simple. What programming I do is usually in Ruby.

I play fretless electric bass guitar with a couple of jazz bands, but outside of the WAT conference I'm better known for playing irreverent tunes on a cheap green ukulele at software conferences.  I'm a pretty good musician.

I'm @chris_mcmahon on Twitter, christopher.mcmahon at gmail and gplus. I ignore LinkedIn and I don't have a Facebook page, don't try to reach me there.

* props to Warren Ellis, from whom I stole the idea of this post.