22 April 2013

Automated Verification of VOIP Audio


I've created a presentation that goes over these points as well:
http://prezi.com/-29ebxieb4ek/copy-of-rtp-re-assembly/

*UPDATE*
I found this awesome work, which uses Google's speech API to transcribe the audio to text:
http://cheateinstein.com/category-shell/using-google-voice-api-to-transcribe-audio/

I now use this at the end of the process to verify that the text heard is what is expected!

I've been working on this VOIP/SIP automation framework for a few months now. I started with a Cucumber framework, and then added on with some VOIP/SIP specific tools like SIPP and SIPCLI.

I got to where the test harnesses I built with these tools would use Jenkins (at the push of a button, on a schedule, or on a build commit) to drive traffic to a phone number and verify that the call reached it, based on the acknowledgements sent back.  But what if the phone number was routed to the wrong destination, which still sent back acknowledgements?

At that point I used TollFreeForwarding.com's technology to set an email alert as an endpoint on a phone number. For example, you call 888-888-8888 and you get an IVR. You press 1 and are sent to a voicemail; you pass in audio and hang up. Then TollFreeForwarding.com emails the recording to the email address configured on the account.

It was better, but it required a voicemail-to-email application at every endpoint. It also didn't verify that audio actually played on the call. What if no audio played back? Or what if there was enough jitter to make it unintelligible?

To further this testing, I started thinking of recording the call and using some sort of analysis of the recording to verify it's what was expected.  

This is my first draft at answering that need.  It can be improved.  But it's a step in the right direction.

What I'm doing

  1. This automation dials a number, with a known IVR or greeting.  
  2. It does a packet capture during the call.
  3. It filters out the RTP channels from the packet capture and then creates a wav out of the pcap file.
  4. Once there is a wav file, it runs diagnostics on it... generating some visual graphs like the image on this blog... but more importantly (and more useful) it generates audio information that I use as a footprint for the audio playback.
  5. This audio is also sent to Google, which transcribes it and sends back the text, which is then compared to the expected string.

Tools used

  1. sipp to drive an automated command line sip call
  2. tshark (command line version of wireshark)
  3. jenkins (for the GUI to drive and schedule these tests)
  4. sox (linux based audio conversion and analysis tool)
  5. some shell scripting

How it Works

The test has a parent job that kicks off two sub-jobs, which run simultaneously.  One places a call to a phone number with a recorded greeting/IVR.  The other runs the shell script that does the actual verification: it uses tshark to record the packets and filter the RTP, then uses sox to convert the raw audio to a wav and analyze it.
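
Outside of Jenkins, the same "two jobs at once" arrangement can be sketched in plain shell with background jobs. This is only a sketch, not what my Jenkins setup actually runs: capture_packets and place_call are hypothetical stand-ins (here they just sleep) for the tshark capture and the sipp call.

```shell
# Hypothetical stand-ins: replace the bodies with the real tshark and sipp commands.
capture_packets() { sleep 1; echo "capture finished"; }
place_call()      { sleep 1; echo "call finished"; }

# Run the capture and the call side by side, and fail if either one fails.
run_call_with_capture() {
  status=0
  capture_packets &            # start the capture first so no packets are missed
  capture_pid=$!
  place_call &                 # place the call while the capture is running
  call_pid=$!
  wait "$call_pid"    || status=1
  wait "$capture_pid" || status=1
  return "$status"
}
```

Calling run_call_with_capture returns once both background jobs are done, so the analysis steps can safely follow it.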

The Shell Script

First I set tshark to record for a specific duration, that I think will encompass the call:
tshark -a duration:20 -w /jenkins/userContent/sip_1call.pcap

I assign a variable to a tshark command that scans the RTP packets and finds the SSRC, the hex identifier of an RTP stream (I learned these three parts from an online tutorial, but lost the bookmark):
ssrc=$(tshark -n -r /jenkins/userContent/sip_1call.pcap -R rtp -T fields -e rtp.ssrc -Eseparator=, | sort -u | awk 'FNR ==1 {print}')

The above would return a hex value like:
0x344292302

Which is followed by:
sudo tshark -n -r /jenkins/userContent/sip_1call.pcap -R rtp -R "rtp.ssrc == $ssrc" -T fields -e rtp.payload | tee payloads

The above filters for the packets carrying the SSRC captured previously, and writes their payloads (as colon-separated hex bytes, one packet per line) to a file named payloads.

Finally, we have a for loop in the shell script to convert the payload lines from above to a raw audio file:
for payload in `cat payloads`; do IFS=:; for byte in $payload; do printf "\\x$byte" >> /jenkins/userContent/sip_1call.raw; done; done
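
Two things to watch with the loop above: it calls printf once per byte, which gets slow on longer captures, and because it appends with >>, a leftover .raw file from an earlier run ends up in front of the new audio. Here is a variation as a rough sketch. The function name payloads_to_raw is made up for this example; it has awk turn each hex byte into an octal escape, so a single printf handles a whole packet.

```shell
# payloads_to_raw: read colon-separated hex payload lines on stdin
# (e.g. "ff:7f:fe") and write the corresponding raw bytes to stdout.
payloads_to_raw() {
  awk -F: '
    function hexval(h,    i, n) {          # "7f" -> 127
      h = tolower(h); n = 0
      for (i = 1; i <= length(h); i++)
        n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
      return n
    }
    NF {                                   # skip empty lines
      line = ""
      for (f = 1; f <= NF; f++) line = line sprintf("\\0%03o", hexval($f))
      print line                           # e.g. "48:65" becomes \0110\0145
    }
  ' | while IFS= read -r escaped; do
    printf '%b' "$escaped"                 # each \0ooo escape becomes one byte
  done
}

# Usage (">" truncates the output file, so reruns do not append to old audio):
#   payloads_to_raw < payloads > /jenkins/userContent/sip_1call.raw
```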

At this point I had a raw audio file. I found a Linux tool called sox that was a good fit for this conversion, so I installed it and added this line to my script to convert the raw audio to a wav:
sox -t raw -r 8000 -v 4 -c 1 -U /jenkins/userContent/sip_1call.raw /jenkins/userContent/sip_1call.wav


Then I run a couple more Sox commands:
This one creates stats, which Jenkins captures in the log file of the test run:
sox /jenkins/userContent/sip_1call.wav -n stat

The stats generated will look like this:
Samples read:             15680
Length (seconds):      1.960000
Scaled by:         2147483647.0
Maximum amplitude:     0.425659
Minimum amplitude:    -0.285034
Midline amplitude:     0.070313
Mean    norm:          0.043354
Mean    amplitude:    -0.000055
RMS     amplitude:     0.070984
Maximum delta:         0.243896
Minimum delta:         0.000000
Mean    delta:         0.019919
RMS     delta:         0.034190
Rough   frequency:          613
Volume adjustment:        2.349
 
Two of these values, Rough frequency and Maximum amplitude, seem to be consistent for the same audio.  At this point, that's what the test assertion is based on.  I have a better plan in the works for a future upgrade to this test.  But for now, I'm using the rough frequency and max amplitude to determine the pass / fail criteria.

Is it perfect? No. There's potential for false negatives. The rough frequency *could* change, but so far it hasn't for the same audio I expect.
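
As a rough sketch of how such an assertion could be scripted: the two helper functions below are made up for this example, and the tolerances in the usage comment are placeholder assumptions, not values I've tuned. One thing I am sure of is that sox prints the stat report to stderr, so it has to be redirected before it can be parsed.

```shell
# stat_value LABEL_REGEX: read `sox ... -n stat` text on stdin and print the
# number that follows the first label matching LABEL_REGEX.
stat_value() {
  awk -F: -v label="$1" '$1 ~ label { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# within_tolerance ACTUAL EXPECTED TOLERANCE:
# succeed if |ACTUAL - EXPECTED| <= TOLERANCE.
within_tolerance() {
  awk -v a="$1" -v e="$2" -v t="$3" \
    'BEGIN { d = a - e; if (d < 0) d = -d; exit !(d <= t) }'
}

# Usage sketch (tolerances 10 and 0.05 are assumptions):
#   stats=$(sox /jenkins/userContent/sip_1call.wav -n stat 2>&1)
#   freq=$(printf '%s\n' "$stats" | stat_value 'Rough +frequency')
#   peak=$(printf '%s\n' "$stats" | stat_value 'Maximum amplitude')
#   within_tolerance "$freq" 613 10 || exit 1
#   within_tolerance "$peak" 0.425659 0.05 || exit 1
```

Comparing within a tolerance rather than for exact equality leaves a little room for the values to drift between calls.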

Spectrograms
If you're into spectrograms (and who isn't?), then sox will also output one if you like. I end the shell script with this:

sox /jenkins/userContent/sip_1call.wav -n spectrogram -y 2 -l -o /jenkins/userContent/sip_1call.png

If anyone has any other tools that can pull out more data, please let me know.

The Upshot?

One shell script, called by Jenkins and running three tools, gets this job done.

Verify Audio via Speech To Text

A few people approached me and mentioned rough frequency may not remain constant as the test call goes through different hops.  So I began to investigate this some more... I found this guy:
http://cheateinstein.com/category-shell/using-google-voice-api-to-transcribe-audio/

He had created a way to use a shell script to send audio files to Google for transcription.

I modified his script a little to work for my needs, and added a text assertion.  If the text fails the comparison, then I exit the script with an error code, which forces Jenkins to regard this as a total failure.

Here's the part I added to the bottom of my previous script:
echo "1 - Translate with SOX - Convert WAV to FLAC with 16000"
sox /jenkins/userContent/sip_1call.wav input.flac rate 16k
echo "2 - Submit to Google Voice API"
wget -q -U "Mozilla/5.0" --post-file input.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" > output.ret
echo "3 - Extract recognized text"

cat output.ret | sed 's/.*utterance":"//' | sed 's/","confidence.*//' > output.txt
echo "4 - Display text"
a=`cat output.txt`
echo "$a"
b="tollfreeforwarding.com"
if [ "$a" = "$b" ]; then
        echo "Verified audio is $b"
else
        echo "FAIL audio is not $b"
        # exit codes only go from 0 to 255 (666 would wrap around to 154), so use 1
        exit 1
fi


In my scenario, I've seeded the greeting on the number that is called with an announcement that says "Toll Free Forwarding Dot Com", which Google correctly turns into "tollfreeforwarding.com", and I validate against that.
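
One weakness of the two sed calls in the script: if the response contains no utterance at all, they pass the whole response through unchanged. Here is a slightly more defensive sketch. The function names are made up, and the JSON shape in the usage comment is my assumption of what a response looks like, inferred from the "utterance" and "confidence" keys the sed expressions look for.

```shell
# extract_utterance: print the "utterance" value from the JSON on stdin, or
# nothing at all if there is none. With several hypotheses on one line the
# greedy match picks the last one, as the original sed does.
extract_utterance() {
  sed -n 's/.*"utterance":"\([^"]*\)".*/\1/p' | head -n 1
}

# normalize TEXT: lower-case TEXT and trim leading/trailing spaces.
normalize() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/^ *//; s/ *$//'
}

# same_phrase A B: succeed if A and B match after normalizing both.
same_phrase() {
  [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# Usage sketch, assuming output.ret holds something like
#   {"status":0,"hypotheses":[{"utterance":"tollfreeforwarding.com","confidence":0.9}]}
#   heard=$(extract_utterance < output.ret)
#   same_phrase "$heard" "tollfreeforwarding.com" || exit 1
```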

19 April 2013

Converting RTP in pcap to Audio Wav files

After following a lot of different tutorials (some of which worked, some of which didn't), I came up with a shell script that uses a couple of tools to scrape a packet capture file, pull out the RTP packets, and then convert them back into audio.

For me this will be useful in automated testing.  I currently drive automated SIP calls via SIPCLI and ruby for a variety of tests at work.  But how do I know I get the right end point?  In the past, I'd have the phone number I dial, record voice mail and send me an email, and the sipcli client would send over text to speech audio.

But I don't always have the luxury of being able to configure the phone number to voice mail.

I've been wanting to do a packet capture during the test and convert it back to audio afterwards, then do a wav comparison on the expected audio vs. the captured audio.

Tools used:

tshark
sox

These are both Linux tools.
tshark is a command line version of Wireshark.  It's installed on CentOS boxes using yum install wireshark-gnome.
sox is installed via yum install sox.

Sox is an audio conversion and analysis tool that is run from the command line.

Test Script:

After looking at some examples online of different tools, I pieced this together from other people's examples, with a few modifications. It seems to work for me:

Contents of pcap_to_wav.sh:


# keep only the first SSRC; a two-way call has one per direction
ssrc=$(sudo tshark -n -r capture.pcap -R rtp -T fields -e rtp.ssrc -Eseparator=, | sort -u | head -n 1)

echo $ssrc

sudo tshark -n -r capture.pcap -R rtp -R "rtp.ssrc == $ssrc" -T fields -e rtp.payload | tee payloads

for payload in `cat payloads`; do IFS=:; for byte in $payload; do printf "\\x$byte" >> sound.raw; done; done

sudo sox -t raw -r 8000 -c 1 -U sound.raw capture3d.wav
echo 'sox has converted pcap to wav file'

That's it!

Basically, if you have sudo access, you can run this and it will take the pcap, find the RTP packets, and turn them into a raw audio file. Sox is then used to convert the raw file to a wav file.

At this point, you can further use sox to compare one wav to another wav.

08 April 2013

Selenium Grid Up and Running with Cucumber Automation

Setting up Selenium Grid with Cucumber


This is more of an advanced topic I think. I did a lot of research to come up with the solution I use here at TollFreeForwarding.com

I have tests that would take hours to run sequentially.  To bring this back to normalcy, I use Selenium Grid to farm the jobs out to multiple VMs simultaneously.  This way the total time for all tests to complete is the length of the longest test I have (5 min).

To get to this point, the Cucumber tests have to be modular.  They have to be broken up. You can't have one giant cucumber test.  Selenium Grid can't take one giant test and farm out each Scenario.  Instead you have to have multiple features.

For example:
If you  have a UI you automate and you have coverage for areas like -
  • Account Creation
  • Profile CRUD actions
  • Forum Post CRUD actions
  • Calendar Schedules
  • Call Customer Support (WebPhone)
  • Admin: Create Menus for Customers
  • Admin: Make new announcements for Customers
  • Admin: CRUD actions on gallery uploads/edits/deletes
Then I'd make each of these its own feature.

Imagine each of the eight features above took 5 min to complete.  That's a total of 40min if they are run sequentially.  Meaning you'd start the tests and come back in 40min.  What if you could get that down to 5 min?

You can!

Run them all at the same time, across multiple VMs.  That way they all run in parallel and finish in 5min.
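
The arithmetic behind that, as a quick sketch. It assumes the grid has enough nodes and browser instances to run all eight features at once; the durations are the eight 5-minute features from the example.

```shell
# Duration in minutes of each of the eight example features.
durations="5 5 5 5 5 5 5 5"

sequential=0
parallel=0
for d in $durations; do
  sequential=$((sequential + d))      # one after another: the times add up
  if [ "$d" -gt "$parallel" ]; then   # all at once: the slowest feature wins
    parallel=$d
  fi
done

echo "sequential: ${sequential} min, parallel: ${parallel} min"
# prints: sequential: 40 min, parallel: 5 min
```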

This is where Selenium Grid comes in.

Install the Grid

On the Main VM that will run this job, you will install the Selenium Grid Hub.  This is rather simple to do.  You basically have to have java installed, and you run a command like:
java -jar [path to your selenium server standalone jar]/selenium-server-standalone-2.31.0.jar -role hub

I found our VMs might sometimes restart (IT restarts and whatnot), so I added a batch file on this Windows VM and used the Windows Task Scheduler to create a task that runs it on restart. The batch file has this content:
@echo off
"C:\Program Files (x86)\Java\jdk1.7.0_17\bin\java.exe" -jar "C:\Selenium-grid\selenium-server-standalone-2.31.0.jar" -role hub

Which tells the server to run Java to execute the Selenium server standalone jar with the role parameter set to "hub."

Node VM's must be configured

Each VM that will get jobs from the Grid hub (referred to as a 'node') needs to be set up.  These boxes/VMs again need to have Java installed and need to have the same selenium-server-standalone jar.  But it will be run with the role 'node.'

Here's an example:
java -jar selenium-server-standalone-2.31.0.jar -role node -hub http://qa1.ifn.com:4444/grid/register -browser browserName=chrome,maxInstances=5

Similar to the Hub, I also created a batch file and used the windows scheduler to make sure that on a restart the batch file executes... it has this code:

"C:\Program Files (x86)\Java\jre7\bin\java.exe" -jar "C:\Selenium-grid\selenium-server-standalone-2.31.0.jar" -role node -hub http://mydomain.com:4444/grid/register -browser browserName=chrome,maxInstances=5

Note that http://mydomain.com:4444 is the domain that the hub is running on.  This command registers the Chrome browser and says it has 5 max instances (meaning it will run up to 5 Chrome tests simultaneously on this box).  You can of course change that.


Setting Up Cucumber

Since I use Jenkins to run all the Cucumber jobs, I want to be able to specify parameters from a command line that will:
  • Run the Cucumber Features
  • Specify the Browser for the test
  • Specify if the grid will be used or not
To do that, I use the env.rb file in Cucumber's features/support directory.

Here's an example of what I did to create these parameters to be used from the command line....
First I added some code:

def browser_name
  (ENV['BROWSER'] ||= 'firefox').downcase.to_sym
end

def environment
  (ENV['ENVI'] ||= 'int').downcase.to_sym
end

This defines the BROWSER and ENVI (for environment) parameters that can be passed on the command line, and gives each a default value.  By default the browser is Firefox and the environment is int (for integration).

Below that code, I wrote this before statement:
Before do  |scenario|
  p "Starting #{scenario}"
  if environment == :int
    @browser = Watir::Browser.new(:remote, :url=>"http://mydomain.com:4444/wd/hub", :desired_capabilities=> browser_name)
    @browser.goto "http://integration.env.com:8080"
  elsif environment == :local
    @browser = Watir::Browser.new browser_name
    @browser.goto "http://integration.env.com:8080"
  end
end

The block above says that before each Cucumber scenario runs, we must define a few things.  If the environment is passed in as 'int' (our default), @browser is set to Watir::Browser.new(:remote, :url=>"http://mydomain.com:4444/wd/hub", :desired_capabilities=>browser_name)

So if the environment is int, @browser sends all its traffic through Selenium Grid, and whichever browser was passed in is sent along as well.  If the environment is local, the browser is started directly on the machine running the test.

Test this out by running a command from your project directory, like:
cucumber ENVI=int BROWSER=firefox

You can watch the jobs get picked up by the different VMs from the Grid console.

Jenkins Configuration

Once you have Selenium Grid set up, go to Jenkins and make a new basic job. 

This job will just run an "Execute Windows batch command" build step.

You'll want to use the same command line parameters that worked when you tested above. For example, if you have a feature called "Account Creation" you would do something like this:
cucumber features/account_creation.feature BROWSER=firefox ENVI=int

If you did that for each job, you'd have all your jobs going to the grid.  You could make a parent job that runs all features simultaneously.

To do that, you create a project that has only one function: to kick off downstream projects. Each feature would be its own project/job.  So the parent project would launch downstream jobs of: account creation, profile crud actions, forum crud actions, etc.  That way they all run in parallel and are sent to the grid.
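
For trying the fan-out without Jenkins, here is a rough shell sketch of the same idea. It assumes a Unix-like shell (my Jenkins jobs run Windows commands), and run_feature is a hypothetical wrapper that only echoes here; the feature names are examples too.

```shell
# run_feature is a hypothetical wrapper; in a real job it would invoke cucumber, e.g.
#   cucumber "features/$1.feature" BROWSER=firefox ENVI=int
run_feature() { echo "ran $1"; }

# run_all FEATURE...: start every feature in the background, then wait for
# each one and return non-zero if any of them failed.
run_all() {
  pids=""
  for feature in "$@"; do
    run_feature "$feature" &
    pids="$pids $!"
  done
  failures=0
  for pid in $pids; do
    wait "$pid" || failures=$((failures + 1))
  done
  [ "$failures" -eq 0 ]
}

# Usage:
#   run_all account_creation profile_crud forum_crud || exit 1
```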

Gotchas

There's a gotcha.  IE webdriver doesn't like to be used remotely.  For this one outlier, I use IE in a serial fashion.

So in my Jenkins I have my jobs on tabs.  The first tab is Functional Tests (these run in Firefox), the second tab is Chrome Tests, the third is IE tests.

On the first tab I have the jobs configured to run with FF only.  On the second to run with Chrome only. The third tab runs in IE only AND does NOT SEND the jobs to the grid. 

This way, I get the Firefox functional tests finished in 5 min, Chrome finished in 5 min, and IE takes the normal time of 40 min.  It's still better than 40 min a pop.