
Entity Graph in HBase

posted Dec 22, 2011, 6:16 PM by James Kraemer
Data Intelligence Technologies now connects with HBase for off-line entity graph analytics.

The Entity Analytical Platform comes packaged with a specialized view designed to be imported with Cloudera's Sqoop tool. Platform users can load every entity link stored in the Entity Repository straight into HBase with a single Sqoop command.

We plan to continue working with the Hadoop ecosystem of tools. One of Data Intelligence's core values is to facilitate entity discovery by sharing data across platforms. In the near future we plan to add access through the Entity Analytical Platform to some of the latest Hadoop graph processing platforms based on Google's Pregel model.

Sqoop example from a Cloudera CDH3u2 VM:
[cloudera@localhost ~]$ sqoop import --connect "jdbc:mysql://${databaseip}/dataintelligence?useUnicode=true&characterEncoding=utf-8&sessionVariables=FOREIGN_KEY_CHECKS=0" --username ${username} -P --table HBASE_ENTITY_LINK --hbase-table ENTITY_LINK --direct --split-by ENTITY_LINK_ID --column-family default --hbase-create-table
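For readers who want to see what each option does, here is the same import as a small annotated script. Treat it as a sketch rather than a packaged tool: the run_import function, the DRY_RUN switch, and the fallback values for databaseip and username are hypothetical names added for illustration. The Sqoop flags themselves are the ones used in the command above.

```shell
#!/bin/sh
# Placeholder connection settings (assumptions; replace with real values).
databaseip="${databaseip:-127.0.0.1}"
username="${username:-dbuser}"

# Hypothetical helper: builds the import one option at a time.
run_import() {
  jdbc_url="jdbc:mysql://${databaseip}/dataintelligence?useUnicode=true&characterEncoding=utf-8&sessionVariables=FOREIGN_KEY_CHECKS=0"

  set -- import
  set -- "$@" --connect "$jdbc_url"       # source MySQL database
  set -- "$@" --username "$username" -P   # -P prompts for the password
  set -- "$@" --table HBASE_ENTITY_LINK   # source table or view to import
  set -- "$@" --hbase-table ENTITY_LINK   # destination HBase table
  set -- "$@" --direct                    # MySQL fast path, kept from the original command
  set -- "$@" --split-by ENTITY_LINK_ID   # column used to split work across mappers;
                                          # also the default row key when
                                          # --hbase-row-key is not given
  set -- "$@" --column-family default     # HBase column family for all columns
  set -- "$@" --hbase-create-table        # create the HBase table if it is missing

  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Printed for review only; the URL is not re-quoted for copy and paste.
    printf 'sqoop'
    printf ' %s' "$@"
    printf '\n'
  else
    sqoop "$@"
  fi
}

run_import
```

By default the script only prints the command it would run. Setting DRY_RUN=0 in the environment makes it call sqoop for real.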

Verify: Scan the Entity Graph

[cloudera@localhost ~]$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.4-cdh3u2, r, Thu Oct 13 20:32:26 PDT 2011

hbase(main):001:0> scan 'ENTITY_LINK'
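A full scan prints every row, which can be a lot of output after a complete load. As a rough sketch, the check can be bounded by piping a limited scan and a row count into the HBase shell. The verify_cmds helper and the limit of 5 are hypothetical choices made for illustration; scan with LIMIT and count are standard HBase shell commands.

```shell
#!/bin/sh
table="ENTITY_LINK"

# Hypothetical helper: emits HBase shell commands for a quick sanity check.
# $1 = table name, $2 = maximum number of rows to display
verify_cmds() {
  printf "scan '%s', {LIMIT => %s}\n" "$1" "$2"
  printf "count '%s'\n" "$1"
}

if command -v hbase >/dev/null 2>&1; then
  # Feed the commands to the HBase shell on standard input.
  verify_cmds "$table" 5 | hbase shell
else
  # No hbase binary on this machine: just show what would be sent.
  verify_cmds "$table" 5
fi
```

The count gives a number to compare against the row count of the source view in MySQL, while the limited scan shows a few sample rows and their column layout.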