Thursday, March 29, 2012

Write Data to HBase over Thrift (Python)

[[Find Thrift interface file]]  

The Thrift interface file (hbase.thrift) should be located under
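Where exactly the file sits depends on the installation. As a minimal sketch, assuming the file is named hbase.thrift (the name used in the generate step) and that you know roughly where HBase is installed, a short search can locate it. The root "/usr/lib/hbase" below is only a hypothetical example; replace it with your own install directory.

```python
import os

def find_thrift_files(root, filename="hbase.thrift"):
    """Return every path under root whose basename equals filename."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    return sorted(matches)

if __name__ == "__main__":
    # "/usr/lib/hbase" is a hypothetical install root; change it to yours.
    for path in find_thrift_files("/usr/lib/hbase"):
        print(path)
```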

[[Install Thrift]]
I tested on both Ubuntu 11.04 (32-bit) and CentOS 5 (64-bit).
  tar -xzvf thrift-0.8.0.tar.gz
+Compile & Install
  cd thrift-0.8.0
  ./configure
  make
  sudo make install

  //Run the "thrift" command; you should get the usage information.
+Install Thrift library for your language
  Thrift provides libraries for many different languages.
  I'll use Python for this example, so install the Python library for Thrift first.
  cd thrift-0.8.0/lib/py
  sudo python setup.py install

  //Verify the installation by running "import thrift" in the Python interactive shell.
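The same check can be scripted rather than typed into the interactive shell. This is a small sketch; the helper name is_installed is made up for illustration and is not part of Thrift.

```python
import importlib

def is_installed(module_name):
    """Return True if module_name can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False

if __name__ == "__main__":
    print("thrift installed: %s" % is_installed("thrift"))
```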

[[Generate HBase library "header files"]]
  thrift --gen py hbase.thrift
  You'll get a folder named "gen-py"; it contains the Python "header files" (the generated client modules).
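To confirm that generation worked, you can check that the modules the script imports are really there. The sketch below assumes the generator wrote a package named "hbase" inside "gen-py", which is what "from hbase import Hbase" in the script relies on. The helper name and file list are illustrative.

```python
import os

# Module files the script depends on, inside the generated "hbase" package.
EXPECTED = ["__init__.py", "Hbase.py", "ttypes.py", "constants.py"]

def missing_generated_files(gen_dir, package="hbase"):
    """Return the expected module files that are absent from gen_dir/package."""
    pkg_dir = os.path.join(gen_dir, package)
    return [name for name in EXPECTED
            if not os.path.isfile(os.path.join(pkg_dir, name))]

if __name__ == "__main__":
    missing = missing_generated_files("gen-py")
    print("missing files: %s" % (missing or "none"))
```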

[[write a script]]
  Let's write a script to: 1, create a table; 2, show table names; 3, insert some data; 4, read it back.
   import sys
   # Folder generated by "thrift --gen py hbase.thrift"
   sys.path.append('/root/Desktop/working/gen-py')

   from thrift.transport.TSocket import TSocket
   from thrift.transport.TTransport import TBufferedTransport
   from thrift.protocol import TBinaryProtocol
   from hbase import Hbase

   # The Thrift server runs on the same machine in this sample, on port 9090
   transport = TBufferedTransport(TSocket('localhost', 9090))
   transport.open()
   protocol = TBinaryProtocol.TBinaryProtocol(transport)
   client = Hbase.Client(protocol)

   # 1, create a table with one column family, "data"
   columns = []
   col = Hbase.ColumnDescriptor()
   col.name = "data:"
   columns.append(col)
   client.createTable("test", columns)
   # 2, show table names
   print(client.getTableNames())

   # 3, insert one cell into row "row1"
   mutations = [Hbase.Mutation(column="data:1", value='value1')]
   client.mutateRow("test", "row1", mutations)
   # 4, read the row back
   print(client.getRow('test', 'row1'))
   transport.close()
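If one of the calls raises an exception, the connection to the Thrift server may be left open. A small context manager is one way to guard against that. This is a sketch that assumes only that the transport has open() and close() methods, as the Thrift transports do; the name opened is made up for illustration.

```python
from contextlib import contextmanager

@contextmanager
def opened(transport):
    """Open a Thrift-style transport and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()

# Possible usage with the objects from the script (host is a placeholder):
#   with opened(TBufferedTransport(TSocket(host, 9090))) as transport:
#       client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
#       print(client.getTableNames())
```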

[[test the script]]
  Make sure the Thrift server is running. (In this sample script, the Thrift server is running on the same machine.)
  If you cannot get your Thrift server to run in a Cloudera-Manager-managed cluster, look at the tail of
  Run the script with "python"; the stdout result is:
   ['test']
   [TRowResult(columns={'data:1': TCell(timestamp=1333062795476L, value='value1')}, row='row1')]
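The timestamp in the TCell is in milliseconds since the Unix epoch, so it is not very readable as printed. The sketch below converts it to a UTC datetime and flattens a row result into a plain dict. The helper names cell_time and row_to_dict are made up for illustration; they rely only on the columns, value and timestamp attributes visible in the output above.

```python
import datetime

def cell_time(timestamp_ms):
    """Convert a cell timestamp (ms since the Unix epoch) to a naive UTC datetime."""
    return datetime.datetime(1970, 1, 1) + datetime.timedelta(milliseconds=timestamp_ms)

def row_to_dict(row_result):
    """Map each column name of a TRowResult-like object to (value, UTC time)."""
    return dict((name, (cell.value, cell_time(cell.timestamp)))
                for name, cell in row_result.columns.items())

print(cell_time(1333062795476))  # 2012-03-29 23:13:15.476000
```

With the script's client, row_to_dict(client.getRow('test', 'row1')[0]) would give a dict keyed by 'data:1'.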

 There's an article that compares the performance of the Thrift Python client with the HBase native Java API used through Jython.

[[Verify stability when a region server goes down]]
  As we know, HBase stores its data on the HDFS file system, and HDFS keeps replicas of each data block on several data nodes.
  You can read more in "Hadoop: The Definitive Guide", Chapter 3 > Data Flow.
  The setting that controls the number of replicas is "dfs.replication" in hdfs-site.xml. The default value is 3, which means every data block in HDFS is stored three times: the first replica plus 2 copies.
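To see which value a cluster actually uses, the setting can be read from the configuration file. This is a minimal sketch assuming the usual Hadoop layout of property elements with name and value children. The SAMPLE text is an invented stand-in for a real hdfs-site.xml, and the fallback of 3 is the default mentioned above.

```python
import xml.etree.ElementTree as ET

def get_replication(hdfs_site_xml, default=3):
    """Return dfs.replication from hdfs-site.xml text, or default if unset."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == "dfs.replication":
            return int((prop.findtext("value") or "").strip())
    return default

# Invented sample content, standing in for a real hdfs-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>"""

replicas = get_replication(SAMPLE)
print("replicas per block: %d, copies besides the first: %d" % (replicas, replicas - 1))
```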
  Let's set up a test case to verify that it works as we expect.
  1, On region server "REGIONSRV3", create a table named "test" and write some data to it.
  2, Check the table status on the HBase master page: "http://HBASEMASTER:60010/table.jsp?name=test"
  It shows that the "Table regions" are located on "REGIONSRV3" and that the table is enabled.
  3, Then shut down the region server "REGIONSRV3".
  4, We expect to still be able to query the table content from the cluster, because there are 2 copies of the data on other live nodes.
   Run "scan 'test'" in the hbase shell and we can see the result. That's what we expected :)

   hbase(main):025:0> scan 'test'
   row1 column=data:1, timestamp=1332891318009, value=value1
   row2 column=data:2, timestamp=1333049644415, value=value2
   row3 column=data:3, timestamp=1333053002019, value=value3
   3 row(s) in 0.0890 seconds

  5, Check the table properties at the URL "http://HBASEMASTER:60010/table.jsp?name=test"
   The "Table regions" of the table have moved to "REGIONSRV4".
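The same check could be run from the Python client instead of the hbase shell. The sketch below rests on an assumption this test did not measure: that a read may fail for a short while until the region has been reassigned, so it retries a few times. The helper read_with_retry and its parameters are made up for illustration.

```python
import time

def read_with_retry(fetch, attempts=5, delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or attempts run out; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)

# Possible usage with the client from the script:
#   rows = read_with_retry(lambda: client.getRow('test', 'row1'))
```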

