Tuesday, December 4, 2012

G-WAN, Nginx, Apache2 benchmark (serving static files)

This is just a personal, informal, non-rigorous experiment.
The results may be misleading or even incorrect.

You can reproduce the test cases in your own environment easily.

Some benchmark articles:


All those reports show G-WAN being much faster than Nginx, but those comparisons are more than a year old, and the versions they used (Nginx 0.7 & G-WAN 2.1) are out of date.

I ran my tests with currently popular versions: Nginx 1.1.19 & G-WAN 3.3.28.

Hardware and Software:

CPU: 4 core, Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
Memory: 5GB
OS: Ubuntu server 12.04.1 x64 
OS Kernel: 3.2.0-34-generic #53-Ubuntu SMP
I made a mistake in the first several tests: I put the server and client on different machines. Although they were connected over a 1Gb LAN, under this kind of heavy load the 1Gb network becomes the bottleneck.
In those tests, Nginx, G-WAN, and even Apache2 got almost the same score. That confused me for a while until I figured out the problem.
I then reran the cases with server and client on the same machine.

Test cases:

3 static files of different sizes: 17KB, 100B, 623MB
case 1: use ab.c (http://gwan.ch/source/ab.c), provided by G-WAN, to download the 17KB file at different concurrency levels
case 2: download the 100-byte file at different concurrency levels
case 3: 100 clients concurrently download the 623MB file.


1, Tune up system
* append to </etc/security/limits.conf>:
*    soft nofile 200000
*    hard nofile 200000
root soft nofile 200000
root hard nofile 200000

* append to </etc/pam.d/common-session>:
session required pam_limits.so

* append to or modify in </etc/sysctl.conf>:
fs.file-max = 5000000
net.core.netdev_max_backlog = 400000
net.core.optmem_max = 10000000
net.core.rmem_default = 10000000
net.core.rmem_max = 10000000
net.core.somaxconn = 100000
net.core.wmem_default = 10000000
net.core.wmem_max = 10000000
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_congestion_control = bic
net.ipv4.tcp_ecn = 0
net.ipv4.tcp_max_syn_backlog = 12000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_mem = 30000000 30000000 30000000
net.ipv4.tcp_rmem = 30000000 30000000 30000000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_wmem = 30000000 30000000 30000000 
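The values above take effect after a reboot (or `sysctl -p`). Each sysctl key maps to a file under /proc/sys, so a small Python helper can double-check them; this snippet is an illustration added here, not part of the original setup:

```python
import os

def read_sysctl(name, root="/proc/sys"):
    # each sysctl key maps to a file: "fs.file-max" -> /proc/sys/fs/file-max
    path = os.path.join(root, *name.split("."))
    with open(path) as f:
        return f.read().strip()

# e.g. read_sysctl("fs.file-max") should report 5000000 once the
# settings above have been loaded
```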

2, Static Files on server
Copy the test files into the G-WAN & Nginx folders.
For G-WAN, put the files under the gwan installation path, e.g. gwan_linux64-bit/

For Nginx, the files can live anywhere; alias the path to a URL in the Nginx configuration.

3, Configurations
For G-WAN, no additional configuration is necessary at all; let it work on its default port 8080.
For Nginx, increase the acceptable connections and let the app serve on port 8989.
* </etc/nginx/nginx.conf>

user www-data;
worker_processes 8;
pid /var/run/nginx.pid;

events {
        worker_connections 10000;
        multi_accept off;    # MUST be off in this test; otherwise you will see lots of non-2xx failures
        use epoll;
}
* application configuration </etc/nginx/sites-enabled/ANYNAME>
server {
        listen 8989;
        server_name localhost;
        location / {
                alias /var/tf/;
        }
        access_log off;
        error_log off;
}

For Apache2, make it work on port 8888 and add some extra settings in </etc/apache2/apache2.conf>:

StartServers          2
MaxClients          150
MinSpareThreads      25
MaxSpareThreads      75
ThreadLimit          64
ThreadsPerChild      25
MaxRequestsPerChild   0

4, Reboot & verify
Reboot the server to make sure all the changed configuration takes effect.
After the reboot, run "ulimit -a" to check the max open file number.

Start G-WAN by simply running ./gwan in its installation folder.
Nginx should already be started as a service.

Try downloading a file from each server in a browser (port 8080 for G-WAN, 8989 for Nginx, 8888 for Apache) to confirm all three work.


5, Benchmark tool
First, we have to modify some macros in <ab.c> to fit our needs.
1, select bench-marking tool

//#define HP_HTTPERF 

2, indicate target server
#define IP   ""
#define PORT "8080" //"8989" for Nginx, "8888" for Apache2

3, number of concurrent clients, and increasing steps,
#define FROM     0 // range to cover (1 - 1,000 concurrent clients)
#define TO      1000 // range to cover (1 - 1,000 concurrent clients)
#define STEP      200 // number of concurrency steps we actually skip

4, indicate target URL
#define URL "/17k.rar" //Or other file names

5, Build
gcc ab.c -O2 -o testapp -lpthread

Test case 1:

download 17KB file

G-WAN Result:
G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 2.97/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,      9420,     10636,     13093,
       200,     12987,     13035,     13095,
       400,     12222,     12254,     12299,
       600,     11701,     11882,     12022,
       800,     11717,     11808,     11934,
      1000,     11407,     11499,     11627,
min:69454   avg:71114   max:74070 Time:62 second(s) [00:01:02]

Nginx Result:

G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 4.55/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,      8722,      9065,      9569, 
       200,     16120,     16299,     16506, 
       400,     14652,     14987,     15110, 
       600,     14092,     14220,     14314, 
       800,     13540,     13759,     13986, 
      1000,     13356,     13492,     13722, 
min:80482   avg:81822   max:83207 Time:61 second(s) [00:01:01]

Apache Result:

G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 4.54/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,      5512,      5639,      5761,
       200,     13004,     14481,     15457,
       400,     11569,     13759,     14052,
       600,     13420,     13485,     13532,
       800,     11723,     12788,     13341,
      1000,      5570,     10521,     12729,
min:60798   avg:70673   max:74872 Time:61 second(s) [00:01:01]

Quite unexpected: Apache2's RPS is close to G-WAN's, while Nginx serves about 10,000 more requests per second!!!
This result differs from what I had read on the Internet.
A friend of mine asked me to test the performance of JBoss serving static files.
Here's the result for JBoss serving the same 17KB file:
G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 0.41/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,      6459,      6550,      6666, 
       200,      8089,      9262,      9704, 
       400,      8427,      9287,      9581, 
       600,      8433,      9208,      9373, 
       800,      7625,      9020,      9544, 
      1000,      7407,      9206,      9559, 
min:46440   avg:52533   max:54427 Time:62 second(s) [00:01:02]
Obviously, an application server is not good at serving static files.

Test case 2:

100-byte file download test.

G-WAN Result:

G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 4.37/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,     10649,     13385,     15629, 
       200,     16514,     17171,     17398, 
       400,     15559,     15838,     16155, 
       600,     15231,     15444,     15582, 
       800,     14978,     15133,     15360, 
      1000,     14700,     14895,     15101, 
min:87631   avg:91866   max:95225 Time:61 second(s) [00:01:01]

Nginx Result:

G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 4.34/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,      9343,      9776,     10148, 
       200,     17679,     17964,     18336, 
       400,     16568,     16991,     17304, 
       600,     16132,     16385,     16582, 
       800,     15955,     16168,     16350, 
      1000,     15580,     15787,     16083, 
min:91257   avg:93071   max:94803 Time:62 second(s) [00:01:02]

Apache Result:

G-WAN ab.c ApacheBench wrapper, http://gwan.ch/source/ab.c.txt
Machine: 1 x 4-Core CPU(s) Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
RAM: 4.35/4.77 (Free/Total, in GB)
Linux x86_64 v#53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 3.2.0-34-generic
Ubuntu 12.04.1 LTS \n \l

ab -n 1000000 -c [0-1000 step:200 rounds:10] -S -d  ""

  Client           Requests per second               CPU
-----------  -------------------------------  ----------------  -------
Concurrency     min        ave        max      user     kernel   MB RAM
-----------  ---------  ---------  ---------  -------  -------  -------
         1,      3524,      3570,      3604, 
       200,      8280,      8606,      9162, 
       400,      8246,      8466,      8665, 
       600,      5381,      7998,      8482, 
       800,      1503,      8043,     10969, 
      1000,         0,      6814,     11594, 
min:26934   avg:43497   max:52476 Time:62 second(s) [00:01:02]

Well, Nginx is still the fastest, but G-WAN is close behind. Apache is very slow in this test, serving only about half as many requests as Nginx.

Test case 3:

623MB file download test, run with the command:
ab  -c 100 -n 100

Server Software:        G-WAN
Requests per second:    2.01 [#/sec] (mean)
Time per request:       49747.564 [ms] (mean)
Time per request:       497.476 [ms] (mean, across all concurrent requests)
Transfer rate:          1283066.55 [Kbytes/sec] received

Server Software:        Apache/2.2.22
Requests per second:    2.59 [#/sec] (mean)
Time per request:       38638.219 [ms] (mean)
Time per request:       386.382 [ms] (mean, across all concurrent requests)
Transfer rate:          1651976.61 [Kbytes/sec] received

Server Software:        nginx/1.1.19
Requests per second:    1.69 [#/sec] (mean)
Time per request:       59245.971 [ms] (mean)
Time per request:       592.460 [ms] (mean, across all concurrent requests)
Transfer rate:          1077363.21 [Kbytes/sec] received

This shocked me again... The RPS ranking:
Apache > G-WAN > Nginx
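As a sanity check on the ab output above (an added aside, not part of the original test run): requests per second multiplied by the file size should roughly reproduce the reported transfer rate, and it does for all three servers:

```python
# sanity-check the ab numbers above: requests/sec * file size (~623 MB)
# should land close to the reported transfer rate in KB/s
size_kb = 623 * 1024  # file size in KB
reported = {"G-WAN":  (2.01, 1283066.55),
            "Apache": (2.59, 1651976.61),
            "nginx":  (1.69, 1077363.21)}
for name, (rps, rate_kbs) in reported.items():
    estimate = rps * size_kb
    print("%-6s est=%9.0f reported=%9.0f" % (name, estimate, rate_kbs))
```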

Tuesday, October 23, 2012

Looking for Adobe flash player?

The Android market (Google Play) no longer provides Flash Player, but some applications and web pages still require it.

Here's the download link:

Tuesday, September 18, 2012

Google translation API emulation(Free API :P)

1, I miss the free API

I felt sad when Google started charging for their translation API. But after some research, I think Google is still open and friendly to technical otaku like me :P

2, Emulate the translation request

Here's the capture showing the AJAX request when I translate the English word "china" to Chinese on the Google Translate page.

There're several parameters in this GET operation, including 
"sl": source language, 
"tl": target language,
"text": text to be translated to target language
You will get a 403 error if you just send the query parameters in the request; you should disguise your code as a browser to "cheat" Google.

import urllib, urllib2

queryArgs = {'hl':'zh-CN',
        "client": "t",
        'text':text, "sl": sl, "tl":tl,
        "multires":1, "ssel":0, "tsel":0, "sc":1}

req = urllib2.Request("http://translate.google.com/translate_a/t", urllib.urlencode(queryArgs))
req.headers["Referer"] = "http://translate.google.com/"
req.headers["Host"] = "translate.google.com"
req.headers["Connection"] = "Close"
req.headers['User-Agent'] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1"

response = urllib2.urlopen(req)
tres = response.read()

3, understand the response

The response is a JavaScript list with 10 elements inside.

Let's check out the detail:
index 0: [["中国","china","Zhōngguó",""]]
Summary in the target language; position 2 in the picture.

index 1: [["noun",["中国","瓷器","华","中华","瓷"],[["中国",["China"]],["瓷器",["porcelain","china","chinaware"]],["华",["China","flower","flora"]],["中华",["China"]],["瓷",["porcelain","china","chinaware"]]]],["adjective",["中国的","瓷的"],[["中国的",["China","Chinese"]],["瓷的",["china"]]]]]
Detail in the target language. There can be one or more items inside; in this sample there are 2, one for the adjective senses and one for the noun senses.

e.g. the noun entry has 3 parts inside:

  • parts of speech: noun, adj ...  position 3 in the picture
  • explains: position 4 in the picture
  • Synonyms of the explains. position 5 in the picture

index 2: en
target language

index 3: None

index 4: [["中国",[5],0,0,1000,0,1,0]]

index 5: [["china",4,,,""],["china",5,[["中国",1000,0,0],["瓷器",0,0,0]],[[0,5]],"china"]]

index 6: None

index 7: None

index 8: [['en']]
source language

index 9: 3

Although some slices remain unknown, the information we're interested in is located at index 0 & index 1.

4, write code to parse the response

4.1 by JavaScript parser

This was my first reaction: no doubt a JavaScript array can be parsed by a JavaScript parser, and there are definitely third-party libraries for it.
First I found WebKitGTK+, a GTK implementation of WebKit. Unfortunately, I gave up on it after reading its documentation; it's too heavy to use. The same goes for QtWebKit.
Then I found PyV8 (http://code.google.com/p/pyv8/), a Python binding to Google's V8 JavaScript engine. It's genuinely easy to use.

res = tres.decode("utf8")
import PyV8 as v8
ctxt = v8.JSContext()
x= ctxt.eval(res)

summaryJSObj = x[0]
detailJSObj = x[1]

summary = summaryJSObj[0][0]
detail = ""

if detailJSObj:
    for dx in detailJSObj:
        detail += "%s."%dx[0] # noun, adj...
        detail += "%s\n"%str(dx[1])
print summary, "\n", detail

4.2 convert to a Python list

It's a bit of a trick. Comparing the JavaScript array to the Python list format, the only difference is that JavaScript grammar allows empty elements, as in "[,,,]", while a Python list only allows a single trailing comma after an element, as in "[1,]".
So when we force the JavaScript array string into a Python list, we need to get rid of the consecutive commas.

Let's replace them! Our goals, or rather our test cases:
JavaScript array -> Python list
[1,] -> [1,] or  [1,None]
[1,1] -> [1,1]
[1,,,] -> [1,None,None,] or [1,None,None,None]
[1,,,2,3] -> [1,None,None,2,3]
[,1] -> [None,1]
[,,1] -> [None,None,1]

A single regular expression is the key to all the test cases above:
re.sub("(?<=,),|(?<=\[),", "None," , JS_ARRAY_STRING)
It takes care of 2 grammar cases:
a comma after a comma, e.g. ",," => ",None,"
a comma after a '[', e.g. "[," -> "[None,"
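The substitution can be exercised against the test cases listed above; this little self-check is an addition for illustration:

```python
import re

def js_array_to_python(src):
    # insert None before any comma that follows another comma or '['
    return re.sub(r"(?<=,),|(?<=\[),", "None,", src)

cases = {"[1,]": [1],
         "[1,1]": [1, 1],
         "[1,,,]": [1, None, None],
         "[1,,,2,3]": [1, None, None, 2, 3],
         "[,1]": [None, 1],
         "[,,1]": [None, None, 1]}
for src, expected in cases.items():
    assert eval(js_array_to_python(src)) == expected
print("all cases pass")
```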

res = re.sub("(?<=,),|(?<=\[),","None,",tres)
l = eval(res)

summaryJSObj = l[0]
detailJSObj = l[1]
summary = summaryJSObj[0][0]
detail = ""

if detailJSObj:
    for dx in detailJSObj:        
        detail += "%s."%dx[0] # noun, adj...
        for exp in dx[1]:
            detail += "%s,"%str(exp)
        detail += "\n"
print summary, "\n", detail


Thursday, August 30, 2012

Config SSL for WebLogic

I'm not a Java developer at all; I just helped my colleague finish this configuration.

Suppose you have 3 necessary files already:
<privkey.pem>: private key
<COMPANY.com.crt>: certificate
<gd_bundle.crt>: root CA

### Generate Identify.jks ###
openssl pkcs8 -topk8 -nocrypt -in privkey.pem -inform PEM -out key.der -outform DER
openssl x509 -in COMPANY.com.crt  -inform PEM -out cert.der -outform DER

javac ImportKey.java
java ImportKey key.der cert.der
keytool -import -file gd_bundle.crt -alias -trustcacerts -keystore keystore.ImportKey -storepass importkey

You can rename the generated file <keystore.ImportKey> to xxx.jks if you like.

keytool -v -list -keystore keystore.ImportKey -storepass importkey 

### Generate Trusts.jks ###
keytool -import -v -trustcacerts -alias importkey -file gd_bundle.crt -keystore bundle.jks  -storepass importkey

### Config weblogic ###
Enter the WebLogic admin console.
Point to these 2 JKS files in the "Keystore" tab.
Fill in the password in the "SSL" tab.

### Verify ###
The domain name should be *.COMPANY.com; otherwise your browser will give a warning.

Wednesday, August 22, 2012

ArrayWritable as Reduce input

1, As the input for a Reducer

ArrayWritable is a concrete class, but you have to create a subclass that indicates the proper element type if you want to use it in Map/Reduce tasks.
With the snippet below, you can use it as a Mapper's output value (a Reducer's input value):

public static class TextArrayWritable extends ArrayWritable {
    public TextArrayWritable() {
        super(Text.class);
    }
}
2, As the input Key for a Reducer

Well, to be a key for a Reducer, a class must be comparable, because a Mapper's output is sorted before it becomes the input of the following Reducer.
We have 2 solutions to make the ArrayWritable subclass comparable:

1, use setOutputKeyComparatorClass in JobConf (old-style API)
2, add the WritableComparable interface to the existing TextArrayWritable
I'm not good at Java, my apologies :(

public class TextArrayWritable extends ArrayWritable implements WritableComparable<TextArrayWritable> {
    public TextArrayWritable() {
        super(Text.class);
    }

    public TextArrayWritable(Text[] values) {
        super(Text.class, values);
    }

    @Override
    public int compareTo(TextArrayWritable o) {
        try {
            Writable[] self = this.get();
            Writable[] other = o.get();

            if (self == null) self = new Text[]{};
            if (other == null) other = new Text[]{};

            if (self.length == other.length) {
                for (int i = 0; i < self.length; i++) {
                    int r = ((Text) self[i]).compareTo((Text) other[i]);
                    if (r != 0) return r;
                }
            } else {
                return (self.length < other.length) ? -1 : 1;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return 0;
    }
}

Tuesday, June 26, 2012

HBase + Thrift performance test 2

Test purpose and design
Nginx works as a balancer, with 8 tornado instances serving at the back end. Each tornado instance owns one thrift connection to HBase. Since tornado is a single-threaded web server, there's no "thread safety issue" like the one mentioned in the previous post.

Code & Configuration file
Server side code: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/tornado_1.py
Test driver code: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/emu_massdata.py
Nginx configuration: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/Nginx%20setting/hbasetest
Supervisord configuration: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/Supervisord%20setting/supervisord.conf

CPU: Intel(R) Xeon(R) CPU            5150  @ 2.66GHz (4 core)
Memory: 4GB
Network: LAN

deploy tornado application

configuration file for supervisord
We'll start 8 tornado instances; they will listen on ports 8870~8877.
command=python /root/tornado_1.py 887%(process_num)01d
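The %(process_num)01d placeholder is ordinary printf-style substitution, which is why the 8 instances land on ports 8870~8877; a quick illustration (not from the original configuration):

```python
# supervisord substitutes %(process_num)01d into the command line;
# with numprocs=8 the instances end up on ports 8870..8877
commands = ["python /root/tornado_1.py 887%(process_num)01d" % {"process_num": n}
            for n in range(8)]
ports = [c.rsplit(" ", 1)[1] for c in commands]
print(ports)
```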

verify working processes
root@fdcolo8:/etc/nginx/sites-enabled# supervisorctl
hbasewstest:hbasewstest_0        RUNNING    pid 2020, uptime 18:27:55
hbasewstest:hbasewstest_1        RUNNING    pid 2019, uptime 18:27:55
hbasewstest:hbasewstest_2        RUNNING    pid 2034, uptime 18:27:53
hbasewstest:hbasewstest_3        RUNNING    pid 2029, uptime 18:27:54
hbasewstest:hbasewstest_4        RUNNING    pid 2044, uptime 18:27:51
hbasewstest:hbasewstest_5        RUNNING    pid 2039, uptime 18:27:52
hbasewstest:hbasewstest_6        RUNNING    pid 2054, uptime 18:27:49
hbasewstest:hbasewstest_7        RUNNING    pid 2049, uptime 18:27:50

Nginx configuration
create new server profile under /etc/nginx/sites-enabled
upstream backends {
    server 127.0.0.1:8870;
    server 127.0.0.1:8871;
    server 127.0.0.1:8872;
    server 127.0.0.1:8873;
    server 127.0.0.1:8874;
    server 127.0.0.1:8875;
    server 127.0.0.1:8876;
    server 127.0.0.1:8877;
}

server {
    listen 8880;
    server_name localhost;
    location / {
        proxy_pass_header Server;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Scheme $scheme;
        proxy_pass http://backends;
        proxy_next_upstream error;
    }
    access_log /var/log/nginx/hbasewstest.access_log;
    error_log /var/log/nginx/hbasewstest.error_log;
}

Verify Nginx worked
After the new Nginx profile is created, make sure Nginx is listening on port 8880:
service nginx reload
lsof -i:8880

The test driver application starts 10 threads at the beginning, which continually send 300KB data packages by HTTP POST.
The web application splits each 300KB JSON payload into hundreds of 1KB records and transforms them into HBase rows. It uses batch write mode: each incoming JSON payload triggers only one write.
Check the source code (links above) for more detail.

Test Result
Data size                | web app detail                 | time
60K records (60MB)       | 1 instance (port 8870)         | 12 seconds
60K records (60MB)       | nginx (8 instances, port 8880) | 6.22 seconds
6 million records (6GB)  | nginx (8 instances, port 8880) | 768.79 seconds (12.8 mins)

Web server status: CPU time (charts omitted)



Friday, June 22, 2012

HBase + Thrift performance test 1

The Thrift transport is not thread safe!
At the beginning, I used only 1 global thrift connection in my test app, with 10 concurrent threads sending data to HBase through this one connection.

Then lots of confusing exceptions came at me!
- Server side printed exceptions:
"java.lang.OutOfMemoryError: Java heap space"

- Client side printed exceptions:
"... broken pipe..."

Performance test 
Data size:
  • write 60K records to the database
  • each record is about 1024 bytes => 60MB of data in total
  • each record owns 3 columns "f1:1", "f1:2", "f1:3" in 1 column family "f1"; the values in each column are formatted as "value_%s_endv" % ("x"*(1024/3))
  • the row key is formatted as "RK_%s_%s" % (random.random(), time.time())

Write mode
  • 10 concurrent threads => each thread is in charge of writing 6K records (6MB)
  • write to database every 300 records (mutateRows)
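For illustration, the record layout described above can be sketched in a few lines of Python (this generator is an approximation of the test code, not taken from it):

```python
import random, time

def make_record():
    # layout from the description above: 3 columns in family "f1",
    # each value roughly 1024/3 bytes, so a record totals about 1 KB
    row_key = "RK_%s_%s" % (random.random(), time.time())
    value = "value_%s_endv" % ("x" * (1024 // 3))
    return row_key, {"f1:%d" % i: value for i in range(1, 4)}

row_key, columns = make_record()
print(row_key[:3], sorted(columns))
```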

4 boxes in cluster:
  1. NameNode, Secondary NameNode, HBase Master, Zookeeper Server
  2. DataNode, Region server, Thrift
  3. DataNode, Region server
  4. DataNode, Region server

They're all Ubuntu 12.04 x64 servers with Intel Core2 Quad @2.66GHz CPUs. #1, #3, and #4 are equipped with 8GB memory; #2 has 16GB because thrift runs on it.

HBase Configuration:
Most preferences keep their defaults after the Cloudera CDH4 Manager installation. The only two modifications:
HBase Master's Java Heap Size in bytes: 1073741824 -> 2147483648
HBase Client Write Buffer 2097152 -> 8388608

Create the test table "testdb1" with column family "f1":
> create 'testdb1','f1'
0 row(s) in 1.6290 seconds

[Test 1]
Each thread owns its private connection to Thrift, so there are 10 connections in total in this test.
test code: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/connectioninthread.py

result: 6.9139 seconds -> 60 000 records (60MB)

[Test 2]
Use one global connection; each thread must acquire the global reentrant lock before writing.
test code: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/sharedconnection.py

result: 16.345 seconds -> 60 000 records(60MB)

Uh... of course, 10 connections (test 1) are much faster than a single connection (test 2). Thrift is not the bottleneck in this test; more thrift connections bring better performance.
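The pattern from Test 1, one connection per thread, can be sketched with threading.local. Here make_connection is a hypothetical stand-in for the real Thrift transport setup (TSocket + TBufferedTransport + Hbase.Client in the test code):

```python
import threading

created = []

def make_connection():
    # stand-in for opening a Thrift transport to HBase;
    # the real code would build TSocket/TBufferedTransport/Hbase.Client
    conn = object()
    created.append(conn)
    return conn

_local = threading.local()

def get_connection():
    # each thread lazily opens its own connection, so no lock is needed
    # around writes and threads never share a transport
    if not hasattr(_local, "conn"):
        _local.conn = make_connection()
    return _local.conn

def worker():
    conn = get_connection()
    assert conn is get_connection()  # stable within a thread

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(created))  # -> 10, one connection per thread
```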

Next week, I'll add a tornado web application in front of the thrift interface to collect mass data, and try to reach the best performance I can.

I also did a load test with 6 million records. It took 896 seconds (~15 mins), so the average time to store a record is about 0.15ms. Impressive performance!!

Here's the server status (the regionserver where thrift is located): CPU time (chart omitted)



Thursday, June 14, 2012

Deploy tornado application

Goal: deploy tornado application with Nginx.

Nginx + Tornado is absolutely a perfect combination for high-performance web services. I usually configure Nginx as a reverse proxy and load balancer in front of multiple tornado instances.

And I use supervisord, a Linux process management tool, to make sure the tornado applications are up and running.
supervisord: http://supervisord.org/

1, install supervisor via Python setuptools:
$easy_install supervisor
Since Ubuntu 12.04, supervisor has been available in the apt repositories; install it with
apt-get install supervisor
and put your configuration files (with a .conf extension) in /etc/supervisor/conf.d/

2, generate a config file by template
echo_supervisord_conf > /etc/supervisord.conf

3, append configuration at end of the file
(here goes a simple example)
[program:mytornado]                              ; program name: mytornado
command=python /home/feifan/Desktop/tor.py 888%(process_num)01d  ; provide a port parameter in the application's entry point
process_name=%(program_name)s_%(process_num)01d  ; process name format: "mytornado_8880" and "mytornado_8881"
numprocs=2                                       ; how many instances

4, start processes
We've registered the tornado processes in the supervisord configuration; now start the service.
(To run supervisord at startup, check out item #8.)
If you installed via apt-get, make sure supervisord is running:
$service supervisor status
Reload newly added supervisor configuration files:
$supervisorctl update
Check that the processes are up and running:
$supervisorctl status

5, check tornado application running on specified ports
$lsof -i:8880
$lsof -i:8881
Kill one of the instances, and a new instance will start up :)

6 config Nginx
upstream backends {
 server 127.0.0.1:8880;
 server 127.0.0.1:8881;
}

server {
 listen 8878;
 server_name localhost;
 location /favicon.ico {
  alias /var/www/favicon.ico;
 }
 location / {
  proxy_pass_header Server;
  proxy_set_header Host $http_host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Scheme $scheme;
  proxy_pass http://backends;
  proxy_next_upstream error;
 }
 access_log /var/log/nginx/tor.access_log;
 error_log /var/log/nginx/tor.error_log;
}

7 done
$service nginx reload
Now Nginx is listening on port 8878, and requests sent to this port are proxied to the tornado applications on ports 8880 and 8881.

8 extra
If you installed supervisor via easy_install rather than apt-get, you probably need to make supervisor work as a service (upstart/init system).
a) (Ubuntu) create a start script at /etc/init.d/supervisord
#! /bin/bash -e

# adjust these paths if your supervisord lives elsewhere
SUPERVISORD=/usr/local/bin/supervisord
PIDFILE=/var/run/supervisord.pid
OPTS="-c /etc/supervisord.conf"

test -x $SUPERVISORD || exit 0

. /lib/lsb/init-functions

export PATH="${PATH:+$PATH:}/usr/local/bin:/usr/sbin:/sbin"

case "$1" in
  start)
    log_begin_msg "Starting Supervisor daemon manager..."
    start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $SUPERVISORD -- $OPTS || log_end_msg 1
    log_end_msg 0
    ;;
  stop)
    log_begin_msg "Stopping Supervisor daemon manager..."
    start-stop-daemon --stop --quiet --oknodo --pidfile $PIDFILE || log_end_msg 1
    log_end_msg 0
    ;;
  restart)
    log_begin_msg "Restarting Supervisor daemon manager..."
    start-stop-daemon --stop --quiet --oknodo --retry 30 --pidfile $PIDFILE
    start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $SUPERVISORD -- $OPTS || log_end_msg 1
    log_end_msg 0
    ;;
  *)
    log_success_msg "Usage: /etc/init.d/supervisord {start|stop|restart}"
    exit 1
    ;;
esac

exit 0

b) make script executable
chmod +x /etc/init.d/supervisord

c) add to the startups
update-rc.d supervisord defaults

Tuesday, June 5, 2012



Both CDH4 & Cloudera Manager now support Ubuntu 10.04 & 12.04!

It's really great news for anyone who considers Ubuntu their primary deployment platform :P

Monday, June 4, 2012

Cross domain AJAX

Same-origin policy
The same-origin policy prevents a script loaded from one domain from getting or manipulating properties of a document from another domain.
This policy dates all the way back to Netscape Navigator 2.0.
You will see failures when getting a resource from a different host, a different port, or even a different protocol.

For example, an error occurs when a page on one origin tries to get a resource from another origin. Following is the exception description in the Chrome dev tools:
XMLHttpRequest cannot load Origin http://localhost:8094 is not allowed by Access-Control-Allow-Origin.

Classic cross-domain communication: suppose we have server/domain A and server/domain B. The client (usually a browser) loads a web page from server B and wants to get JSON data from server A.

Solution 1: insert script element dynamically (JSONP)
This solution may seem like a hack ("旁門左道", an unorthodox path), but it's currently the most popular and most compatible solution, and it is nicely supported by the jQuery API.
It works because the same-origin policy doesn't prevent dynamic script insertion and treats such scripts as if they were loaded from the domain that provided the web page.

Server A is written in Python and runs on GAE. The handler serves both same-domain GET requests and cross-domain requests, and has one parameter "y".
def makeJSONP(fn):
    def wrapped(itself):
        callback = itself.request.get("callback")
        response = fn(itself)
        if callback:  # cross-domain: wrap the JSON in the callback function
            itself.response.headers['Content-Type'] = 'application/x-javascript'
            itself.response.write( "%s(%s)" % (callback, response) )
        else:         # same-domain: return the JSON directly
            itself.response.write(response)
    return wrapped

class GetMStat(webapp2.RequestHandler):     #supports JSONP
    @makeJSONP
    def get(self):
        tyear = int(self.request.get("y"))
        yrecords = dbmodel.MonthlyStat.all().filter("ryear =", tyear).fetch(None)
        x = []
        for record in yrecords:
            pass  # (body elided in the original post) collect fields from each record into x
        return json.dumps(x)
A JSONP request URL carries an extra "callback" parameter.
The server doesn't return the JSON data directly; it returns a slice of JavaScript code like "func(jsondata)", where func comes from the "callback" parameter.

GetMStat is the HTTP handler; it passes the raw JSON data to the Python decorator makeJSONP for further processing.
The decorator makeJSONP determines whether there is a "callback" among the GET parameters. If there is, it returns the JavaScript-style string we just described and sets the response's Content-Type header to JavaScript (otherwise the browser gives a MIME mismatch warning).
If there is no "callback" parameter, the request is treated as same-domain and the JSON data is returned directly.

When the client (JavaScript) gets the response, the function "func" is invoked with jsondata as its parameter.
client B:
$.getJSON('http://localhost:8118/getmstatistics?callback=?', {y: 2012},
    function(data) {
        // consume the JSON data here
    } //end function
);
The jQuery getJSON API handles the JSONP callback function without additional work.

In the Chrome dev tools, you can see the client (browser) access a URL with an auto-generated callback name; that callback function name comes from the 3rd (anonymous function) parameter of getJSON.

Cross-domain communications with JSONP, Part 1: Combine JSONP and jQuery to quickly build powerful mashups
jQuery API: getJSON 

Solution 2: HTTP access control (CORS) 
This solution is much easier than JSONP, but it works for modern browsers only. You can try it in Chrome, but not the IE series.

Check out the tech Draft from W3: http://www.w3.org/TR/cors/
And to implement it in your code, just one new line of code :)
Server A
Add HTTP header "Access-Control-Allow-Origin" in JSON response.
response["Access-Control-Allow-Origin"] = "*"
That's all. It means the resource can be accessed from any origin in a cross-domain manner.
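For a self-contained illustration (using Python 3's standard library rather than the webapp2 code above), here is a toy server that sets the header, plus a request that confirms it arrives:

```python
import json, threading, urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class CORSHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"ok": True}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # the single extra line that allows any origin:
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

srv = HTTPServer(("127.0.0.1", 0), CORSHandler)  # port 0: pick a free port
threading.Thread(target=srv.serve_forever, daemon=True).start()
resp = urllib.request.urlopen("http://127.0.0.1:%d/" % srv.server_address[1])
allow_origin = resp.headers["Access-Control-Allow-Origin"]
data = resp.read()
print(allow_origin)  # -> *
srv.shutdown()
```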