Tuesday, June 26, 2012

HBase + Thrift performance test 2

Test purpose and design
Nginx works as a load balancer, with 8 tornado instances serving at the back end. Each tornado instance owns its own Thrift connection to HBase. Since tornado is a single-threaded web server, the "thread safety" issue we discussed in the previous blog post does not apply here.
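The one-connection-per-process pattern can be sketched as follows. This is illustrative only (the class and function names are my own, not from the linked tornado_1.py): each process creates a single Thrift connection at startup, and because tornado's event loop runs handlers on one thread, every request can reuse it without any locking.

```python
class FakeThriftConnection:
    """Stand-in for a real TSocket/TBufferedTransport HBase client."""
    def __init__(self):
        self.calls = 0

    def mutate_row(self, row):
        self.calls += 1  # a real client would write one row to HBase here


CONNECTION = FakeThriftConnection()   # created once, when the process starts

def handle_post(row):
    # every "request" reuses the same process-wide connection;
    # no lock is needed because requests run on a single thread
    CONNECTION.mutate_row(row)

for r in range(5):                    # simulate 5 sequential requests
    handle_post(r)
print(CONNECTION.calls)               # all 5 went through one connection
```

Nginx then spreads load across the 8 processes, so the setup gets parallelism between processes while each process stays lock-free inside.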

Code & Configuration file
Server side code: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/tornado_1.py
Test driver code: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/emu_massdata.py
Nginx configuration: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/Nginx%20setting/hbasetest
Supervisord configuration: https://github.com/feifangit/hbase-thrift-performance-test/blob/master/web%20service%20test/Supervisord%20setting/supervisord.conf

CPU: Intel(R) Xeon(R) CPU 5150 @ 2.66GHz (4 cores)
Memory: 4GB
Network: LAN

Deploy the tornado application

Configuration file for supervisord
We'll start 8 tornado instances, listening on ports 8870~8877:
command=python /root/tornado_1.py 887%(process_num)01d
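A plausible reconstruction of the surrounding program section is shown below (the linked supervisord.conf is authoritative; apart from the command line above, the option values here are assumptions). `numprocs=8` makes supervisord expand `%(process_num)01d` to 0..7, which yields the ports 8870~8877 and the process names seen in the next step:

```ini
[program:hbasewstest]
command=python /root/tornado_1.py 887%(process_num)01d
numprocs=8
process_name=%(program_name)s_%(process_num)01d
autostart=true
autorestart=true
```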

Verify the worker processes
root@fdcolo8:/etc/nginx/sites-enabled# supervisorctl
hbasewstest:hbasewstest_0        RUNNING    pid 2020, uptime 18:27:55
hbasewstest:hbasewstest_1        RUNNING    pid 2019, uptime 18:27:55
hbasewstest:hbasewstest_2        RUNNING    pid 2034, uptime 18:27:53
hbasewstest:hbasewstest_3        RUNNING    pid 2029, uptime 18:27:54
hbasewstest:hbasewstest_4        RUNNING    pid 2044, uptime 18:27:51
hbasewstest:hbasewstest_5        RUNNING    pid 2039, uptime 18:27:52
hbasewstest:hbasewstest_6        RUNNING    pid 2054, uptime 18:27:49
hbasewstest:hbasewstest_7        RUNNING    pid 2049, uptime 18:27:50

Nginx configuration
Create a new server profile under /etc/nginx/sites-enabled:
upstream backends {
    server;
    server;
    server;
    server;
    server;
    server;
    server;
    server;
}

server {
    listen 8880;
    server_name localhost;
    location / {
        proxy_pass_header Server;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Scheme $scheme;
        proxy_pass http://backends;
        proxy_next_upstream error;
    }
    access_log /var/log/nginx/hbasewstest.access_log;
    error_log /var/log/nginx/hbasewstest.error_log;
}

Verify Nginx worked
After the new Nginx profile is created, reload nginx and make sure it is now listening on port 8880:
service nginx reload
lsof -i:8880

The test driver application starts 10 threads at the beginning and continually sends 300KB data packages by HTTP POST.
The web application splits each 300KB JSON payload into hundreds of 1KB pieces and transforms them into HBase records. It uses batch write mode, so each incoming JSON payload triggers only one write.
Check the test driver source code (linked above) for more detail.
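The split-and-batch idea can be sketched as below. This is illustrative only; the function name and chunk size are my assumptions, not taken from the linked tornado_1.py. The real handler turns each ~1KB piece into an HBase row and writes the whole batch in a single Thrift call (e.g. `mutateRows`), so one incoming POST costs one round trip to HBase.

```python
def split_payload(raw: bytes, chunk_size: int = 1024) -> list:
    """Cut one large POST body into ~1KB pieces, one per HBase row."""
    return [raw[i:i + chunk_size] for i in range(0, len(raw), chunk_size)]


body = b"x" * (300 * 1024)   # a 300KB payload, like the test driver sends
chunks = split_payload(body)
print(len(chunks))           # 300 pieces of 1KB each
```

Batching is what makes the single Thrift connection per process sufficient: instead of hundreds of tiny writes per request, each request costs one buffered write.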

Test Result
Data size               | Web app detail                 | Time
60K records (60MB)      | 1 instance (port 8870)         | 12 seconds
60K records (60MB)      | nginx (8 instances, port 8880) | 6.22 seconds
6 million records (6GB) | nginx (8 instances, port 8880) | 768.79 seconds (12.8 min)

Web server status
CPU time (charts not reproduced)