Subject:Re: Unique row ID constraint
From:Tatsuya Kawano (tats@snowcocoa.info)
Date:Apr 29, 2010 1:33:36 am
List:org.apache.hadoop.hbase-user

Hi Stack and Ryan,

Thanks for your advice. I knew using a row lock wasn't ideal, but I couldn't find an appropriate atomic operation to do Compare And Swap.

So, thanks Stack for helping me find it. The incrementColumnValue() atomic operation works for me because it automatically initializes the column value to 0 when the column doesn't exist. I can try to increment the column value by 1, and if it returns 1, I can be sure that I'm the first one to have created the column and row.

So, my updated code is much simpler and now lock-free.

===============================================
def insert(table: HTable, put: Put): Unit = {
  val count = table.incrementColumnValue(put.getRow, family, uniqueQual, 1)

  if (count == 1) {
    table.put(put)

  } else {
    throw new DuplicateRowException("Tried to insert a duplicate row: " + Bytes.toString(put.getRow))
  }
}
===============================================
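To illustrate why the "first increment returns 1" check gives uniqueness, here is a minimal self-contained sketch that does not use the HBase client at all. FakeTable, its increment/put/get methods, and UniqueInsert are hypothetical in-memory stand-ins; the only assumption carried over from the message above is that the increment starts a missing counter at 0 and returns the new value atomically.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Hypothetical in-memory stand-in for a table; this is NOT the HBase client API.
final class FakeTable {
  private val counters = new ConcurrentHashMap[String, AtomicLong]()
  private val rows = new ConcurrentHashMap[String, String]()

  // Mimics the increment described above: a missing counter starts at 0,
  // and the post-increment value is returned atomically.
  def increment(rowId: String, amount: Long): Long = {
    val fresh = new AtomicLong(0L)
    val existing = counters.putIfAbsent(rowId, fresh)
    val counter = if (existing == null) fresh else existing
    counter.addAndGet(amount)
  }

  def put(rowId: String, value: String): Unit = {
    rows.put(rowId, value)
    ()
  }

  def get(rowId: String): Option[String] = Option(rows.get(rowId))
}

final class DuplicateRowException(msg: String) extends RuntimeException(msg)

object UniqueInsert {
  // Only the caller that observes the counter going from 0 to 1 may write the row.
  def insert(table: FakeTable, rowId: String, value: String): Unit = {
    if (table.increment(rowId, 1L) == 1L) {
      table.put(rowId, value)
    } else {
      throw new DuplicateRowException("Tried to insert a duplicate row: " + rowId)
    }
  }
}
```

In this sketch a second insert for the same row ID throws DuplicateRowException and leaves the first value in place, because its increment returns 2 rather than 1.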

Thanks, Tatsuya

2010/4/29 Ryan Rawson <ryan@gmail.com>:

I would strongly discourage people from building on top of lockRow/unlockRow.  The problem is if a row is not available, lockRow will hold a responder thread and you can end up with a deadlock because the lock holder won't be able to unlock.  Sure, the expiry system kicks in, but 60 seconds is kind of infinity in database terms :-)

I would probably go with either ICV or CAS to build the tools you want.  With CAS you can accomplish a lot of things locking accomplishes, but more efficiently.
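As a rough illustration of the CAS idea, the sketch below models a compare-and-swap write on an in-memory map. CasTable and its checkAndPut method are hypothetical stand-ins written for this example; they only borrow the name of the operation mentioned in this thread and make no claim about the signature or semantics of the HBase client call.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical in-memory stand-in for a table with a compare-and-swap primitive;
// this is NOT the HBase client API.
final class CasTable {
  private val cells = new ConcurrentHashMap[String, String]()

  // Writes newValue only if the cell currently matches `expected`.
  // None means "the cell must not exist yet". Returns true if the write happened.
  def checkAndPut(rowId: String, expected: Option[String], newValue: String): Boolean =
    expected match {
      case None      => cells.putIfAbsent(rowId, newValue) == null
      case Some(old) => cells.replace(rowId, old, newValue)
    }

  def get(rowId: String): Option[String] = Option(cells.get(rowId))
}
```

A unique insert is then a single call with expected = None: it succeeds for the first writer and returns false for everyone else, with no lock held between the check and the write.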

On Wed, Apr 28, 2010 at 9:42 AM, Stack <sta@duboce.net> wrote:

Would the incrementValue [1] work for this?
St.Ack

[1] http://hadoop.apache.org/hbase/docs/r0.20.3/api/org/apache/hadoop/hbase/client/HTable.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long%29

On Wed, Apr 28, 2010 at 7:40 AM, Tatsuya Kawano <tats@snowcocoa.info> wrote:

Hi,

I'd like to implement a unique row ID constraint (like the primary key constraint in an RDBMS) in my application framework.

Here is a code fragment from my current implementation (HBase 0.20.4rc), written in Scala. It works as expected, but is there a better (shorter) way to do this, such as checkAndPut()?  I'd like to pass a single Put object to my function (method) rather than passing rowId, family, qualifier and value separately. I can't do this now because I have to supply the rowLock object when I instantiate the Put.

===============================================
def insert(table: HTable, rowId: Array[Byte], family: Array[Byte],
           qualifier: Array[Byte], value: Array[Byte]): Unit = {

  val get = new Get(rowId)

  val lock = table.lockRow(rowId) // will expire in one minute
  try {
    if (table.exists(get)) {
      throw new DuplicateRowException("Tried to insert a duplicate row: "
              + Bytes.toString(rowId))

    } else {
      val put = new Put(rowId, lock)
      put.add(family, qualifier, value)

      table.put(put)
    }

  } finally {
    table.unlockRow(lock)
  }

}
===============================================

Thanks,

twitter: http://twitter.com/tatsuya6502