Published on October 25, 2007
Evaluation of Objectivity Databases on the Sony HSM Software and Tape Robotics

Michael Athanas, Cornell Univ.* and Nobu Katayama, Atsushi Manabe, KEK
* on leave to Cereon Genomics

Motivation
- Upcoming HEP experiments must manage huge volumes of complex data.
- For handling such data, Objectivity/DB is becoming a popular choice in HEP.
- Goal: a better understanding of how Objectivity/DB works with a rather conventional Hierarchical Storage Management (HSM) system.

HSM pros and cons
- Popular software, offered by many vendors, and affordable.
- The unit of transfer between disk and tape is a file, which brings access-time overhead and space inefficiency.
- The user cannot choose which tape a file is stored on.

Prototyping Test
- Schema: based on the Cornell Nile project objyDB (presented at CHEP97).
- Database software: Objectivity/DB 4.0.10
- Hardware: part of the KEKB computer system.

Test Bed System (1)
- Server: Sun UE 6000 (UltraSPARC I, 166 MHz x 4 CPUs)
- Tape library: Sony PetaSite*, 30 TB, with Sony DTF drives (8 MB/s) x 8
- Disk: MSS RAID, 80 GB
- HSM software: Sony PetaServe* (based on OSM ver. 2.1)
- The workstation is connected to the tape library and the RAID via SCSI-2/W.
* http://www.sony.com/professional

Test Bed System (2)
- HSM disk layer: 10 GB
- HSM tape layer: 100 GB
- Tape media capacity: 10 GB/tape
- HSM watermarks (adjustable parameters):
  High (forces data migration to tape) = 8 GB
  Low (migrate until this size) = 4 GB
  Shadow = 1 GB

Test Bed System (3)
- Disk max. read/write rate: 8/8 MB/s
- Tape max. read/write rate: 12 MB/s; media loading time 30-60 s (with positioning and mounting)
- Network: DB clients and the server run on the same machine; disk and tape are attached locally.

Class association
- DataSetCollection -> DataSet(Hadron) ... DataSet(Taus)
- DataSet -> RunCollection(1) ... RunCollection(100)  (x100)
- RunCollection -> Run(1) ... Run(9)  (x10)
- Run -> Event(1) ... Event(1000)  (x1000)
- Event -> EventRecord ...
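The multiplicities in the class hierarchy, together with the per-object sizes quoted in the Data Size slide, imply the DB sizes the deck reports. A minimal back-of-the-envelope sketch in Python (the constant names are ours, and a 12 kB average event record is an assumption within the 4-16 kB range given):

```python
# Sketch of the sizes implied by the test schema's containment hierarchy.
# Multiplicities and per-object sizes come from the slides; the names and
# the 12 kB average record size are illustrative assumptions.

KB, MB, GB = 1024, 1024**2, 1024**3

RUNCOLLECTIONS = 100       # RunCollection(1) .. RunCollection(100)
RUNS_PER_RC = 10           # x10 multiplier from the slides
EVENTS_PER_RUN = 1000      # Event(1) .. Event(1000)
EVENT_RECORD = 12 * KB     # event record, 4-16 kB (12 kB assumed)

# One RunCollection DB (one UNIX file, the unit of HSM migration):
rc_db_size = RUNS_PER_RC * EVENTS_PER_RUN * EVENT_RECORD

# The whole DataSet, spread over 100 such DB files:
dataset_size = RUNCOLLECTIONS * rc_db_size

print(f"RunCollection DB: {rc_db_size / MB:.0f} MB")   # ~120 MB per file
print(f"DataSet total:    {dataset_size / GB:.1f} GB") # ~12 GB
```

The arithmetic matches the deck's figures: ~120 MB per RunCollection DB file and ~12 GB per DataSet, i.e. slightly more than one 10 GB tape.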
Each DB is a UNIX file and the unit of HSM migration.

Data Size
- Event info: ~50 B
- Event record: 4-16 kB
- Run info: ~50 B
- RunCollection DB size: ~120 MB
- Total DataSet size: ~12 GB
- Tape media size: 10 GB (small-type media)

Population (object creation)
- Page size = 64 kB
- One DataSet (12 GB) populated.
- Single-process and multi-process concurrent population.

Population flow diagram
- Define the DB file.
- RunCollection loop: begin; Run object write (~50 B); Run loop; end; commit.
- Run loop: event header write (~50 B); event record write (~12 kB) and close; commit.
- When the HSM disk (10 GB) goes over the high watermark, data migrate to tape (100 GB).

Population test result
- Average write rate: 1.2 MB/s for one DataSet (12 GB) creation.
- Tape migration was negligible; no HSM overhead was seen.
- Population in parallel (multi-process): saturation in scalability was seen (memory shortage?).

Population speed
- (Chart: population speed.)

Database access (1)
- The access pattern can make a big difference in performance.
- The data reloading time (tape -> disk) has a large effect.

Database access (2)
- Deep scan: load all data into memory.
- Light scan: only the event header data is scanned.
- Index scan: 1/200 of the data accessed by event number via index search (ooEqualLookup).
- In each test, all data are first migrated to tape, then the access starts.

Deep Scan
- 6 GB full scan (500 runs)
- On HSM: 3.4 MB/s, 337 obj/s, 1482 s
- On normal disk (reference): 7.3 MB/s, 725 obj/s, 690 s
- Tape reloading time (reference): 50 x 120 MB files, 1100 s

Light Scan
- 250 MB scanned out of 6 GB (500 runs)
- On HSM: 28 kB/s, 568 obj/s, 880 s (= 5.7 MB/s sequential-access equivalent)
- On normal disk (reference): 830 kB/s, 16k obj/s, 30 s (= 166 MB/s sequential-access equivalent)
- Tape reloading time (reference): 50 x 120 MB files, 5.5 MB/s, 1100 s

Index Scan
- 30 MB (0.5%) scanned out of 6 GB (500 runs)
- On HSM: 27 kB/s, 2.7 obj/s, 950 s (= 5.3 MB/s sequential-access equivalent)
- On normal disk (reference): 116 kB/s, 12 obj/s, 210 s (= 23 MB/s sequential-access equivalent)
- Tape reloading time (reference): 100 x 120 MB files, 6 MB/s, 1002 s

Seq. access equivalent speed
- The speed a conventional sequential tape scan would need in order to fetch the same data in the same elapsed time.
- Speed = total scanned object size / elapsed time
- Seq. access equiv. speed = total data size / elapsed time

(Diagram: a conventional tape access scans Run(i), Run(i+1), ... sequentially to search for EventAtr = XXX, while objyDB/HSM uses an index table to fetch the matching event records directly.)

Index scan access speed
- (Chart: index-scan access speed on HSM vs. on pure disk.)

Conflicting tape access
- Index search with multiple processes: each searches 1% (~0.05 GB) of 6 GB (500 runs).
- 1 process / HSM: 38 kB/s (0.05 GB in 1300 s)
- 2 processes / HSM, with access conflict: 16 kB/s total (0.1 GB in 6300 s)
- 2 processes / HSM, without access conflict: 51 kB/s total (0.1 GB in 2000 s)
- 2 processes / disk (reference): 220 kB/s total (0.1 GB in 450 s)

Typical conflict situation
- Two processes simultaneously accessing a single tape force it to seek back and forth between their files.

Summary (1)
- Objectivity/DB + PetaServe (HSM) generally works well; it could be a cost-effective solution.
- The HSM overhead in DB writes (population) was negligible in our program.
- Higher efficiency was achieved by concurrent population with multiple processes.

Summary (2)
- The performance of accessing data directly from the HSM was measured under three access patterns.
- The access tests showed that objyDB/HSM can reach speeds comparable to a conventional sequential tape scan, while keeping the merits of an OODB.

Summary (3)
- Conflicting access to a single tape by multiple processes greatly degrades DB access speed.
- Access scheduling is one solution to avoid such situations. Strategy:
  - Access DB files in the order of their file position on tape.
  - Avoid concurrent access to DB files on a single tape.
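The scheduling strategy in Summary (3) amounts to sorting pending DB-file requests by tape and by position on tape, so each tape is read front-to-back and two clients never pull the same tape back and forth. A minimal sketch in Python; the `TapeFile` record, its fields, and the `schedule` function are illustrative assumptions, not part of PetaServe or Objectivity/DB:

```python
# Sketch of the proposed access-scheduling strategy: visit each tape once,
# reading its DB files in ascending tape position. All names are illustrative.

from collections import namedtuple

TapeFile = namedtuple("TapeFile", ["db_name", "tape_id", "position"])

def schedule(requests):
    """Order requests so files on the same tape are grouped together
    and read in the order they sit on the tape."""
    return sorted(requests, key=lambda f: (f.tape_id, f.position))

requests = [
    TapeFile("RunCollection(42)", tape_id=2, position=5),
    TapeFile("RunCollection(7)",  tape_id=1, position=3),
    TapeFile("RunCollection(8)",  tape_id=1, position=1),
]

for f in schedule(requests):
    print(f.db_name)  # RunCollection(8), RunCollection(7), RunCollection(42)
```

Serving the sorted list with one reader per tape realizes both points of the strategy: no concurrent access to a single tape, and no backward seeks within a tape.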