IRATI @ RINA Workshop 2014, Dublin


Published on February 4, 2014

Author: irati-project

Source: slideshare.net

Description

Presentation of the current results of the IRATI project at the RINA Workshop held in Dublin, January 2014

Project overview, use cases, specifications, software development and experimental activities
RINA Workshop, Dublin, January 28th–29th 2014
Investigating RINA as an Alternative to TCP/IP

Agenda
• Project overview
• Use cases
  – Basic scenarios (Phases 1 and 2)
  – Advanced scenarios (Phases 2 and 3)
• Specifications
  – Shim DIF over 802.1Q
  – PDU Forwarding Table Generator
  – Y2 plans
• Software development
  – High level software architecture
  – User-space
  – Kernel-space
  – Wrap-up
• Experimental activities
  – Intro, goals, Y1 experimentation use case
  – Testbed and results at i2CAT OFELIA island
  – Testbed and results at iMinds OFELIA island
  – Conclusions

Project at a glance
• What? Main goals
  – To advance the state of the art of RINA towards an architecture reference model and specifications that are closer to enabling implementations deployable in production scenarios.
  – The design and implementation of a RINA prototype on top of Ethernet will enable the experimentation and evaluation of RINA in comparison to TCP/IP.
• Who? 5 partners (from 2014)
• 5 activities:
  – WP1: Project management
  – WP2: Architecture, Use cases and Requirements
  – WP3: Software Design and Implementation
  – WP4: Deployment into OFELIA testbed, Experimentation and Validation
  – WP5: Dissemination, Standardisation and Exploitation
• Budget
  – Total cost: 1.126.660 €
  – EC contribution: 870.000 €
• Duration: 2 years
• Start date: 1st January 2013
• External Advisory Board: Juniper Networks, ATOS, Cisco Systems, Telecom Italia

Objectives (I)
• Enhancement of the RINA specifications
  – The specification of a shim DIF over Ethernet
  – The completion of the specifications that enable DIFs that provide a level of service similar to the current Internet (low security, best-effort)
  – The project use cases
• RINA Open Source Prototype for the Linux Operating System
  – Targeting both the user and kernel spaces, allowing RINA to be used on top of different technologies (Ethernet, TCP, UDP, etc.)
  – It will provide a solid baseline for further RINA work after the project. IRATI will set up an initial open source community around the prototype.

Objectives (II)
• Experimentation with RINA and comparison with TCP/IP
  – IRATI will follow iterative cycles of research, design, implementation and experimentation, with the experimental results feeding back into the research of the next phase
  – Experiments will collect and analyse data to compare RINA and TCP/IP in various aspects such as: application API, programmability, cost of supporting multi-homing, simplicity, etc.
• Interoperability with other RINA prototypes
  – Achieving interoperability between independent implementations is a good sign that a specification is well written and complete.
  – Current RINA prototypes target different programming platforms (middleware vs. OS kernel) and work over different underlying technologies (UDP/IP vs. Ethernet) compared to the IRATI prototype.

Objectives (III)
• Provide feedback to OFELIA
  – Apart from the feedback to the OFELIA facility in terms of bug reports and suggestions for improvement, IRATI will actively contribute to improving the toolset used to run the facility.
  – Moreover, the experimentation with a non-IP based solution is an interesting use case for the OFELIA facility, since IRATI will be the first to conduct this type of experiment in the OFELIA testbed.

Project Outcomes
• Enhanced RINA architecture reference model and specifications, contributed to the Pouzin Society for experimentation. IRATI will focus on advancing the RINA state of the art in the following areas:
  – DIFs over Ethernet
  – DIFs over TCP/UDP
  – DIFs for hypervisors
  – Routing
  – Data transfer
• Linux OS kernel implementation of the RINA prototype over Ethernet
  – By the end of the project an open source community will be set up to allow the research/industrial networking community to use the prototype and/or contribute to its development
• Experimental results of the RINA prototype, compared to TCP/IP
• DIF over TCP/UDP extensions, interoperable with existing RINA prototypes

Overview of the project structure


BASIC SCENARIOS PHASES 1 AND 2

Basic use cases: Shim DIF over Ethernet
• Goal: to ensure that the shim DIF over Ethernet provides the required functionality. The purpose of a shim DIF is to provide a RINA interface to the capability of a legacy technology, rather than give the legacy technology the full capability of a RINA DIF.

Basic use cases: Turing machine DIF
• Goal: to provide a testing scenario to check that a normal DIF complies with a minimal set of functionality (the “Turing machine” DIF).

ADVANCED SCENARIOS PHASES 2 AND 3

Advanced use cases: Introduction
• RINA applied to a hybrid cloud/network provider
  – Mixed offering of connectivity (Ethernet VPN, MPLS IP VPN, Ethernet Private Line, Internet Access) + computing (Virtual Data Center)
[Figure panels: Datacenter Design, Access Network, Wide Area Network]

Advanced use cases: Modeling
[Figure: sites A, B and C of Customer 1 and Customer 2 attach through CE routers to PE routers on an MPLS backbone. Data Center 1 and Data Center 2 each contain TOR switches, hypervisors (HV) and VMs. An Internet GW connects the backbone to the public Internet and an end user.]

Advanced use cases: Enterprise VPN over operator’s network
Wide Area Network
• Logical separation of customers through: MPLS encapsulation, BGP-based MPLS VPNs and Virtual Routing and Forwarding (VRF)
Access network
• Use of Ethernet switching within metro-area networks
• Logical separation of traffic belonging to multiple customers implemented through IEEE 802.1Q

Advanced use cases: Enterprise VPN over operator’s network: Applying RINA
• Backbone DIF: provides the equivalent of the MPLS network. This DIF must be able to provide flows with “virtual circuit” characteristics, equivalent to MPLS LSPs.
• Provider top-level DIF: this DIF provides IPC services to the different customers, by connecting together the CE routers. The DIF may provide different levels of service, depending on the customer’s requirements. There may be one or more of these DIFs (one per customer, one for all the provider customers, etc.).
• Intra customer-site DIFs: a DIF whose scope is a single customer site. Its characteristics will depend on the size and needs of the customer (e.g. it could be a campus network, an enterprise network, etc.).
• Customer A DIF: can provide connectivity to all the application processes within customer A’s organization. More specialized DIFs targeting concrete application types (e.g. voice, file transfer) could be created on top.

Advanced use cases: Hypervisor integration: With TCP/IP
[Figure: two hypervisor machines, each hosting Virtual Machines 1 to 3 (eth0, addresses 192.168.1.1 to 192.168.1.3). The VMs attach through shared memory and vifN.0 interfaces to software bridges 0 and 1, which connect via the VLAN interfaces eth0.2 (VLAN 2) and eth1.5 (VLAN 5) to a Top of Rack switch with links out of the DC.]

Advanced use cases: Hypervisor integration: With RINA
[Figure: a green customer DIF spans the VMs on two hypervisors. It is supported by a shim DIF for HV and a shim DIF over 802.1Q through the TOR, with a path out of the DC (to customer VPN or Internet Gateway).]

Advanced use cases: VDC + Enterprise VPNs over the Internet: With TCP/IP
[Figure: Green and Blue customer premises (customer machines, switch, border router with NAT/gateway) and the datacenter premises (datacentre border router with NAT/gateway, interfaces eth0 to eth3), interconnected over the public Internet.]

Advanced use cases: VDC + Enterprise VPNs over the Internet: With RINA
[Figure: a green customer DIF spans the VMs in the datacenter premises and the server at the green customer premises. Elements shown in the datacenter: shim DIF for HV (shared memory), shim DIF over 802.1Q (VLAN 2) through the TOR, DC border router. A shim DIF over TCP/UDP crosses the public Internet. Elements shown at the customer premises: customer border router, shim DIF over 802.1Q (VLAN 10), layer 2 switch, server.]


SHIM DIF OVER 802.1Q

Shim DIF over Ethernet: General requirements
• The task of a shim DIF is to put as small a veneer as possible over a legacy protocol to allow a RINA DIF to use it unchanged.
• The shim DIF should provide no more service or capability than the legacy protocol provides.

Examining the Ethernet Header
• Ethernet II: specification released by DEC, Intel, Xerox (hence also called DIX Ethernet)

  Field                      Size
  Preamble                   7 bytes
  MAC dest                   6 bytes
  MAC src                    6 bytes
  802.1Q header (optional)   4 bytes
  Ethertype                  2 bytes
  Payload                    42-1500 bytes
  FCS                        4 bytes
  Interframe gap             12 bytes
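To make the header layout concrete, here is a minimal sketch of a parser for the fields a shim would care about. It assumes the buffer starts at the destination MAC (preamble, FCS and interframe gap are normally handled by the hardware), and the names `eth_info` and `eth_parse` are hypothetical, not taken from the IRATI code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6
#define TPID_8021Q   0x8100u /* Ethertype value announcing an 802.1Q tag */

/* Hypothetical parsed view of the header fields listed above. */
struct eth_info {
    uint8_t  dst[ETH_ADDR_LEN];
    uint8_t  src[ETH_ADDR_LEN];
    int      has_vlan;    /* 1 when an 802.1Q tag is present */
    uint16_t vlan_id;     /* 12-bit VLAN ID, meaningful only if has_vlan */
    uint16_t ethertype;   /* type of the encapsulated protocol */
    size_t   payload_off; /* offset of the payload inside the buffer */
};

/* Read a 16-bit big-endian (network order) value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse a frame starting at the destination MAC. Returns 0, or -1 if too short. */
int eth_parse(const uint8_t *frame, size_t len, struct eth_info *out)
{
    size_t off = 2 * ETH_ADDR_LEN;

    if (len < off + 2)
        return -1;
    memcpy(out->dst, frame, ETH_ADDR_LEN);
    memcpy(out->src, frame + ETH_ADDR_LEN, ETH_ADDR_LEN);

    out->has_vlan = 0;
    out->vlan_id = 0;
    out->ethertype = read_be16(frame + off);
    off += 2;

    if (out->ethertype == TPID_8021Q) {
        /* 802.1Q tag: TPID (just read) + 2-byte TCI, then the real Ethertype. */
        if (len < off + 4)
            return -1;
        out->has_vlan = 1;
        out->vlan_id = (uint16_t)(read_be16(frame + off) & 0x0FFFu);
        out->ethertype = read_be16(frame + off + 2);
        off += 4;
    }
    out->payload_off = off;
    return 0;
}
```

With a tag present the payload starts at offset 18 instead of 14, which accounts for the 4 optional bytes in the layout.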

Ethertype
• Identifies the syntax of the encapsulated protocol
• Layers below need to know the syntax of the layer above
• Layer violation!

Consequences of using an Ethertype
• Also means only one flow can be distinguished between an address pair
• The MAC address doubles as the connection endpoint-id

Shim DIF over Ethernet: Environment

Address Resolution Protocol
• Resolves a network address to a hardware address
  – Most ARP implementations do not conform to the standard
  – The shim IPC process assumes an RFC 826 compliant implementation

Usage of ARP
• Maps the application process name to a shim IPC Process address (MAC address)
  – The application process name is transformed into a network protocol address
    • Process name: My_IPC_Process
    • Process instance: 1
    • Entity name: Management
    • Entity instance: 2
    • Result: My_IPC_Process/1/Management/2
  – Application registration adds an entry in the local ARP cache
• A flow allocation request results in an ARP request/reply
  – Instantiates a MAC protocol machine equivalent of DTP (cf. Flow Allocator)
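The name transformation in the example can be sketched as a simple string join. The struct and function names below are hypothetical, and the "/" separator is only inferred from the single example on the slide.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for the four naming components shown on the slide. */
struct app_name {
    const char *process_name;
    const char *process_instance;
    const char *entity_name;
    const char *entity_instance;
};

/*
 * Write "<process name>/<process instance>/<entity name>/<entity instance>"
 * into buf. Returns 0 on success, -1 on error or if the result was truncated.
 */
int app_name_to_netaddr(const struct app_name *n, char *buf, size_t size)
{
    int written = snprintf(buf, size, "%s/%s/%s/%s",
                           n->process_name, n->process_instance,
                           n->entity_name, n->entity_instance);

    return (written < 0 || (size_t)written >= size) ? -1 : 0;
}
```

The resulting string is what would be used as the "network protocol address" in the ARP request, with the MAC address as the hardware address.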

PDU FORWARDING TABLE GENERATOR

PDU Forwarding Table Generator: Requirements and general choices
It’s all policy!
• Every DIF can do it its own way
• We start with a link-state routing approach
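As an illustration of what a link-state approach computes, the sketch below runs a textbook Dijkstra shortest-path search over a small cost matrix and records the first hop towards each destination. This is a generic example, not the IRATI policy: the function name, the matrix representation and the fixed `MAX_NODES` limit are assumptions.

```c
#include <limits.h>

#define MAX_NODES 8
#define NO_LINK   0 /* a cost of 0 means "no N-1 flow between these two nodes" */

/*
 * For every destination d < n, store in next_hop[d] the neighbor of src that
 * lies on a shortest path to d, or -1 if d is src itself or is unreachable.
 * cost[i][j] > 0 is the cost of the link from i to j.
 */
void compute_next_hops(int n, int cost[MAX_NODES][MAX_NODES], int src,
                       int next_hop[MAX_NODES])
{
    int dist[MAX_NODES];
    int done[MAX_NODES];

    for (int i = 0; i < n; i++) {
        dist[i] = INT_MAX;
        done[i] = 0;
        next_hop[i] = -1;
    }
    dist[src] = 0;

    for (int iter = 0; iter < n; iter++) {
        /* Pick the closest node that has not been settled yet. */
        int u = -1;
        for (int i = 0; i < n; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break; /* the remaining nodes are unreachable */
        done[u] = 1;

        /* Relax the links leaving u, inheriting the first hop used to reach u. */
        for (int v = 0; v < n; v++) {
            if (cost[u][v] == NO_LINK || done[v])
                continue;
            if (dist[u] + cost[u][v] < dist[v]) {
                dist[v] = dist[u] + cost[u][v];
                next_hop[v] = (u == src) ? v : next_hop[u];
            }
        }
    }
}
```

The `next_hop` array plays the role of a forwarding table: one output per destination, recomputed whenever the known link state changes.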

PDU Forwarding Table Generator: High-level view and relationship to other IPC Process components
[Figure: inside the IPC Process, the PDU Forwarding Table Generator interacts with the Enrollment Task, the Resource Allocator, the RIB Daemon, the PDU Forwarding Table and the Relaying and Multiplexing Task. CDAP messages to and from neighbor IPC Processes travel over the N-1 flows to nearest neighbors (layer management); PDUs travel over the N-1 flows to nearest neighbors (data transfer).]
• Events received: N-1 flow allocated, N-1 flow deallocated, N-1 flow down, N-1 flow up; enrollment completed successfully; neighbor B invoked write operation on object X
• Actions: update knowledge on N-1 flow state; propagate knowledge on N-1 flow state; invoke write operation on object X to neighbor A; recompute forwarding table
• The Relaying and Multiplexing Task looks up the PDU Forwarding Table to select the output N-1 flow for each PDU

Plans for Year 2
• Shim DIF for Hypervisors
  – Enable communications between VMs in the same physical machine without using the networking subsystem
• Updated shim DIF over TCP/UDP
  – The current version requires manual discovery of mappings of app names to IP addresses and TCP/UDP ports; investigate the use of DNS
• Updated PDU Forwarding Table Generator
  – Based on lessons learned from implementation and experimentation
• Feedback to EFCP
  – Based on implementation and experimentation experience
• Faux sockets API


INTRODUCTION

Project targets and timeline (SW)
• IRATI SW goals:
  – Release 3 SW prototypes in 2 years
  – Each prototype provides incremental functionalities
    • 1st prototype: basic functionalities (unreliable flows)
      – Comparable to UDP/IP
    • 2nd prototype: “complete” stack (reliable flows + routing)
      – Comparable to TCP/IP
    • 3rd prototype: enhancements (hardened proto + RINA over IP + …)
      – More product-like than prototype-like
      – Glancing at extendibility, portability, performance & usability
• The SW components lie in both kernel & user spaces

Problems …
• Problems are mostly SW-engineering related
  – Time constrained
    1. Ref-specs → HL arch
    2. HL arch → detailed design
    3. Detailed design → implementation, debug, integration …
• Since the IRATI stack spans user and kernel spaces…
• User-space problems (as usual):
  – Memory (e.g. corruptions, leaks)
  – Bad logic (e.g. faults)
  – Concurrency (e.g. dead-locks, starvation)
  – …
  – Nothing that special (but … time consuming for sure)

… and problems
• Kernel space problems are the user-space ones PLUS:
  – A harsher environment, e.g.
    • The develop, install & test cycle is (a lot) slower
      – Huge code-base (takes long to compile)
      – Faults in the kernel code may bring the whole host down
      – Reboots are usually required to test a new “version” (at early stages)
    • C is “the” language → less expressive than others in userland
    • No “external libraries” …
  – The kernel is “cooperative”, e.g.
    • Stack & heap handling must be “careful”, e.g.
      – Memory corruptions could propagate everywhere
  – Different mechanics, e.g.
    • Mutexes, semaphores, spinlocks, RCUs … coupled with uninterruptible sleeps
      – Syscalls may sleep … but spinlocks can’t be held while “sleeping”
    • No recursive locking
    • Memory allocation comes in different flavours: NOWAIT, NOIO, NOFS …
  – …

Outline
• Introduction
• High level software architecture
• Detailed software architecture
  – Kernel space
  – User space
• Wrap-up

Splitting the spaces: user vs kernel
Fast/slow paths → user vs kernel
• We split the “design” into different “lanes” and placed SW components there, depending on their timing requirements
  – Fast-path → stringent timings → kernel-space
  – Slow-path → loose timings → user-space
• ... looking for our optimum
  – fiddling with time/easiness/cost/problems/schedule/final-solution etc.

API & kernel
• OS processes request services from the kernel with syscalls
  – User originated (user → kernel)
  – Unicast
• Modern *NIX systems extend the user/kernel communication mechanisms
  – Netlink, uevent, devfs, procfs, sysfs etc.
• We wanted a “bus-like” mechanism: 1:1/N:1, user/kernel & user/user
  – User OR kernel originated
  – Multicast/broadcast
• We adopted syscalls and Netlink
  – Syscalls (fast-path):
    • Bootstrapping (*) & SDUs R/W (fast-path)
  – Netlink (mostly slow-path):
    • We introduced a RINA “family” and its related messages
(*) Bootstrapping needs: syscalls create kernel components which will be using Netlink functionalities later on
[Figure: M applications, N IPC Process Daemons and 1 IPC Manager Daemon in user space, connected to the kernel.]

Introducing librina
• Syscalls are “wrapped” by libc (kernel abstraction)
  – i.e. syscall(SYS_write, …) → write(…)
  – glibc on a Linux OS
• Changes to the syscalls → changes to glibc
  – Breaking glibc could break the whole host
    • Sandboxed environments are necessary
  – Dependencies invalidation → time consuming compilations
  – That sort of change is really hard to get approved upstream
  – etc.
• We introduced librina as the initial way to overcome these problems …
  – … use IRATI in a host without breaking the whole system
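The wrapping relationship mentioned above can be seen with a stock syscall on Linux. The sketch below calls `write` both through the generic `syscall(2)` entry point and through the libc wrapper; `raw_write` and `wrapped_write` are hypothetical helper names. A library such as librina could wrap new, RINA-specific syscall numbers in the first style without touching glibc, although the exact mechanism it uses is not shown in the slides.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Raw form: enter the kernel by syscall number, bypassing the libc wrapper. */
ssize_t raw_write(int fd, const void *buf, size_t len)
{
    return (ssize_t)syscall(SYS_write, fd, buf, len);
}

/* Wrapped form: the libc function that hides the syscall number. */
ssize_t wrapped_write(int fd, const void *buf, size_t len)
{
    return write(fd, buf, len);
}
```

Both functions end up in the same kernel code path; only the user-space entry differs.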

librina
• It is more a framework/middleware than a library
  – It has explicit memory allocation (no garbage collection)
  – It’s event-based
  – It’s threaded
• Completely abstracts the interactions with the kernel
  – syscalls and Netlink
• Adds functionalities upon them
• Provides them to userland (apps & daemons)
  – Static/dynamic linking (e.g. for C/C++ programs)
  – Language extensions via bindings (e.g. Java)

librina interface
• librina contains a set of “components”:
  – Internal components
  – External components
• And a portable framework to build components on top, e.g.:
  – Patterns: e.g. singletons, observers, factories, reactors
  – Concurrency: e.g. threads, mutexes, semaphores, condition variables
  – High level “objects” in its core
    • FlowSpecification, QoSCube, RIBObject etc.
• Only the “external” components are “exported” as classes

librina core (HL) SW architecture
[Figure: the application uses librina through eventPoll(), eventWait() and eventPost(). The API is grouped into common, cdap, faux-sockets, sdu-protection, ipc-process, ipc-manager and application, on top of a framework and the core components (Event Queue, NetlinkManager, NetlinkSessions, RINA Manager). Netlink traffic uses nl_send() / nl_recv() over libnl / libnl_genl; syscalls go through wrappers around syscall(SYS_*). Below the user/kernel boundary are RINA Netlink and the RINA syscalls.]
Annotations on the API groups:
• Configure PDU Forwarding Table; create / delete EFCP instances; allocation of kernel resources to support a flow
• Creation, deletion, configuration
• Allocate / deallocate flows; read / write SDUs to flows; register/unregister to 1+ DIF(s)

How to RAD, effectively?
• OO was the “natural” way to represent the RINA entities
• We embraced C++ as the “core” language for librina:
  – Careful usage produces binaries comparable to C
  – The STL reduces the dependencies
    • in the plain C vs plain C++ case
  – Producing C bindings is possible
  – …
• There was the ALBA prototype already working …
• … and ALBA has RINABand …
• BUT that prototype is Java based …

Interfacing librina to other languages
• We “adopted” SWIG: the Simplified Wrapper and Interface Generator
• SWIG “automatically” generates all the code needed to connect C/C++ programs to scripting languages
  – Such as Python, Java and many, many others …

example.h:
    int fact(int n);

example.c:
    #include "example.h"
    int fact(int n) { … }

example.i:
    /* File: example.i */
    %module example
    %{
    #include "example.h"
    %}
    int fact(int n);

[Figure: SWIG reads example.i and produces the low level wrapper example_wrap.c and the high level wrapper example.py. GCC builds the native interface libexample.so, which is then used from Python.]

librina wrapping
• Wrapping “cost”:
  – The wrappers (.i files) are small: ~480 LOCs
  – They produce ~13.5 KLOCs of bindings → ~1/28 ratio …
• The wrappers are the only thing needed to obtain the bindings for a scripting language
  – SWIG support varies with the target language, e.g.
    • Java: so-so (not all data-types mapped natively)
    • Python: good
    • …
  – Our wrappers contain only the missing data-type mappings for Java
• Java interface = C++ interface
• Bindings for other languages (e.g. Python) are expected to be straightforward

High level software architecture
[Figure: layered stack. Top: applications (RINABand HL, RINABand LL, third parties SW packages) and rinad (ipcpd, ipcmd), written in Java or in a language X. They import the SWIG HL wrappers (Java or language X), which reach the SWIG LL wrappers (C++) through JNI or the language X native interface. Below: librina with its API (C), API (C++) and Core (C++), usable through static/dynamic linking, on top of libnl / libnl-gen and the syscalls. Bottom: Netlink and the kernel.]

DETAILED SOFTWARE ARCHITECTURE: KERNEL SPACE

The Linux object model
• Linux has its “generic” object abstraction: kobject, kref and kset

    struct kref {
            atomic_t refcount;
    };

    struct kobject {
            const char         *name;
            struct list_head    entry;
            struct kobject     *parent;
            struct kset        *kset;
            struct kobj_type   *ktype;
            struct sysfs_dirent *sd;
            struct kref         kref;
            unsigned int state_initialized:1;
            unsigned int state_in_sysfs:1;
            unsigned int state_add_uevent_sent:1;
            unsigned int state_remove_uevent_sent:1;
            unsigned int uevent_suppress:1;
    };

    struct kset {
            struct list_head list;
            spinlock_t list_lock;
            struct kobject kobj;
            const struct kset_uevent_ops *uevent_ops;
    };

  Annotations on the slide: garbage collection & SysFS integration; naming & sysfs; objects (dynamic) [re-]parenting (loosely typed); objects grouping; SysFS integration; references counting (explicit)
• Generic enough to be applied “everywhere”
  – E.g. FS, HW subsystems, device drivers
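The explicit reference counting provided by kref can be illustrated in user space. The sketch below mimics the init/get/put pattern with C11 atomics; it is not kernel code, and the `ref_*` and `session` names are hypothetical stand-ins for `kref_init`, `kref_get`, `kref_put` and an embedding object.

```c
#include <stdatomic.h>
#include <stddef.h>

/* User-space stand-in for a kref: just an atomic counter. */
struct ref {
    atomic_int count;
};

/* Hypothetical object that embeds its reference counter. */
struct session {
    int        released; /* set by the release callback, for illustration */
    struct ref ref;
};

/* A new object starts with one reference, held by its creator. */
void ref_init(struct ref *r)
{
    atomic_init(&r->count, 1);
}

void ref_get(struct ref *r)
{
    atomic_fetch_add(&r->count, 1);
}

/* Drop one reference; call release() and return 1 when it was the last one. */
int ref_put(struct ref *r, void (*release)(struct ref *))
{
    if (atomic_fetch_sub(&r->count, 1) == 1) {
        release(r);
        return 1;
    }
    return 0;
}

/* Recover the embedding object from its counter, container_of style. */
void session_release(struct ref *r)
{
    struct session *s =
        (struct session *)((char *)r - offsetof(struct session, ref));

    s->released = 1;
}
```

The release callback receives only the counter, so the embedding object has to be recovered by pointer arithmetic; this is the loosely typed aspect of the generic model that the slide annotates.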

kobjects, ksets and krefs in IRATI
• They are the way to go for embracing OOD/OOP kernel-wide
• If the design has a “limited scope” the code gets bloated with:
  – Ancillary functions & data structures
  – (Unnecessary) resource usage
• We don’t need/want all these functionalities (everywhere):
  – Reduced (finite) number of classes
    • We don’t have the needs of a “generic kernel”
  – Reduced concurrency (can be missing, depending on the object)
  – Object parenting is “fixed” (obj x is always bound to obj y)
    • E.g. DTP/DTCP are bound to EFCP …
  – Not all our objects have to be published into sysfs
  – We have different lookup requirements
    • No need to “look up by name” every object
  – Inter-object bindings shouldn’t lose the object’s type
  – …

Our OOP/OOD approach
• We adopted a (slightly) different OOD/OOP approach
• (Almost) each “entity” in the stack is an “object”
• All our “objects” provide a basic common interface & behavior
• They have no implicit embedded locking semantics

    /* API: opaque type */
    struct object_t { … };

    /* vtable (if needed) */
    struct obj_ops_t {
            result_x_t (* method_1)(object_t * o, …);
            …
            result_y_t (* method_n)(object_t * o, …);
    };

    /* Static */
    int        obj_init(object_t * o, …);
    void       obj_fini(object_t * o);

    /* Dynamic */
    object_t * obj_create(…);      /* interruptible ctxt */
    object_t * obj_create_ni(…);   /* non-interruptible ctxt */
    int        obj_destroy(object_t * o);

    /* vtable proxy (if needed) */
    int        obj_<method_1>(object_t * o, …);
    ...
    int        obj_<method_n>(object_t * o, …);
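A compilable user-space rendering of this pattern might look as follows. It is only a sketch: the concrete "counter" behaviour, the `write`/`read` method names and the use of `malloc` instead of kernel allocators are assumptions made to keep the example self-contained.

```c
#include <stdlib.h>

struct object_t; /* opaque to users of the API */

/* vtable: one function pointer per method. */
struct obj_ops_t {
    int (*write)(struct object_t *o, int value);
    int (*read)(struct object_t *o);
};

struct object_t {
    const struct obj_ops_t *ops;
    int                     state;
};

/* Hypothetical concrete behaviour: an accumulator. */
static int counter_write(struct object_t *o, int value) { o->state += value; return 0; }
static int counter_read(struct object_t *o) { return o->state; }

static const struct obj_ops_t counter_ops = {
    .write = counter_write,
    .read  = counter_read,
};

/* Static lifecycle: the caller owns the storage. */
int obj_init(struct object_t *o)
{
    if (!o)
        return -1;
    o->ops = &counter_ops;
    o->state = 0;
    return 0;
}

void obj_fini(struct object_t *o)
{
    if (o)
        o->ops = NULL;
}

/* Dynamic lifecycle: allocation plus init, and fini plus free. */
struct object_t *obj_create(void)
{
    struct object_t *o = malloc(sizeof(*o));

    if (o && obj_init(o) != 0) {
        free(o);
        o = NULL;
    }
    return o;
}

int obj_destroy(struct object_t *o)
{
    if (!o)
        return -1;
    obj_fini(o);
    free(o);
    return 0;
}

/* vtable proxies: typed entry points that check the table before dispatching. */
int obj_write(struct object_t *o, int value)
{
    return (o && o->ops && o->ops->write) ? o->ops->write(o, value) : -1;
}

int obj_read(struct object_t *o)
{
    /* -1 doubles as the error value here, which a real API would avoid. */
    return (o && o->ops && o->ops->read) ? o->ops->read(o) : -1;
}
```

Because every proxy takes a `struct object_t *`, the compiler keeps checking types, which matches the "no loose-typing" goal stated for the framework.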

OOD/OOP & the framework
• This approach:
  – Reduces the stack (overall) bloating
    • No krefs, spinlocks, sysfs etc. where unnecessary
    • Only objects requiring sysfs, debugfs and/or uevents embed a kobject
  – (or it is comparable)
    • E.g. the same bloating related to _init, _fini, _create and _destroy
  – Speeds up development
  – Helps debugging
    • (Re-)parenting is constrained to specific objects
    • No loose-typing → type-checking is maintained (no casts)
  – Decouples (mildly) from the underlying kernel
• With these assumptions we built our framework
  – Basic components: robj, rmem, rqueue, rfifo, rref, rtimer, rwq, rmap, rbmp
  – OOP facilities/patterns: factories, singletons, facades, observers, flyweights, publisher/subscribers, smart pointers, etc.
  – Ownership-passing + smart-pointing memory model

The HL software architecture (Y1)
[Figure: user space holds rinad (ipcpd, ipcmd), RINABand HL and third parties SW packages, the SWIG HL and LL wrappers (Java, language X), and librina (framework, API (C), API (C++), Core (C++)) over libnl / libnl-gen, syscalls and Netlink. Kernel space holds the personality mux/demux, the framework, and the KIPCM core with RNL, KFA and the IPCP factories: the normal IPC Process (EFCP, RMT, PFT), shim-eth-vlan with RINA-ARP, and shim-dummy.]

The API exposed to user-space: KIPCM + RNL
• Kernel interface = syscalls + Netlink messages
• KIPCM:
  – Manages the syscalls
    • Syscalls: a small-numbered, well defined set of calls (#8):
      – IPCs: ipc_create and ipc_destroy
      – Flows: allocate_port and deallocate_port
      – SDUs: sdu_read, sdu_write, mgmt_sdu_read and mgmt_sdu_write
• RNL:
  – Manages the Netlink part
    • Abstracts message reception, sending, parsing & crafting
    • Netlink: #36 message types (with dynamic attributes):
      – assign_to_dif_req, assign_to_dif_resp, dif_reg_notif, dif_unreg_notif …
• Partitioning:
  – Syscalls → KIPCM → “Fast-path” (read and write SDUs)
  – Netlink → RNL → “Slow-path” (mostly conf and mgmt)

KIPCM & KFA
• The KIPCM:
  – Counterpart of the IPC Manager in user-space
  – Manages the lifecycle of the IPC Processes and the KFA
  – Abstracts IPC Process instances
    • Same API for all the IPC Processes regardless of the type
    • maps: ipc-process-id → ipc-process-instance
• The KFA:
  – Manages ports and flows
  – Ports
    • Flow handler and ID
    • Port ID Manager
  – Flows
    • maps: port-id → ipc-process-instance
• Both “bind” the kernel stack:
  – Top: user-interface (syscalls, Netlink)
  – Bottom: IPC Processes (maps)
  – When KIPCM calls KFA to inject/get SDUs:
    • N-IPCP → EFCP → RMT → PDU-FWD → Shim/IPC Process
• They are the initial point where “recursion” is transformed into “iteration”
[Figure: user space above KIPCM and KFA; below them the normal IPCP (EFCP, RMT, PDU-FWD-T) and the shim IPCP, with IN and OUT paths.]
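The port-id → ipc-process-instance map can be pictured as a small table combined with a port-id allocator. The sketch below is a simplification under stated assumptions: a fixed-size array, integer IPC Process ids greater than zero standing in for instances, no locking, and hypothetical `kfa_port_*` function names.

```c
#define MAX_PORTS 16
#define PORT_FREE 0 /* IPC Process ids are assumed to be > 0 in this sketch */

/* port-id -> IPC Process id; zero-initialised, so every port starts free. */
static int port_to_ipcp[MAX_PORTS];

/* Reserve the lowest free port-id for the given IPC Process. Returns -1 if none. */
int kfa_port_alloc(int ipcp_id)
{
    if (ipcp_id <= 0)
        return -1;
    for (int port = 0; port < MAX_PORTS; port++) {
        if (port_to_ipcp[port] == PORT_FREE) {
            port_to_ipcp[port] = ipcp_id;
            return port;
        }
    }
    return -1;
}

/* Return the IPC Process bound to a port-id, or -1 if the port is not in use. */
int kfa_port_lookup(int port)
{
    if (port < 0 || port >= MAX_PORTS || port_to_ipcp[port] == PORT_FREE)
        return -1;
    return port_to_ipcp[port];
}

/* Release a port-id so that it can be reused. Returns -1 if it was not in use. */
int kfa_port_free(int port)
{
    if (kfa_port_lookup(port) < 0)
        return -1;
    port_to_ipcp[port] = PORT_FREE;
    return 0;
}
```

A lookup of this kind is what lets an SDU written on a port be handed to the right IPC Process at each level, one iteration per layer.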

The RINA Netlink Layer (RNL)
• Integrates Netlink in the SW framework
  – Hides all the configuration, generation and destruction of Netlink sockets and messages from the user
  – Defines a Generic Netlink family (NETLINK_RINA) and its messages

The IPC Process Factories
• They are used by IPC Processes to publish/unpublish their availability
  – Publish:
    • x = kipcm_ipcp_factory_register(…, char * name, …)
  – Unpublish:
    • kipcm_ipcp_factory_unregister(x)
• The factory name is the way KIPCM can look for a specific IPC Process type
  – It’s published into sysfs too
• There are two “major” types of IPC Processes:
  – Normal
  – Shims

The IPC Process Factories Interface
• Factory operations are the same for both types
• Upon registration
  – A factory publishes its hooks
      .init      → x_init
      .fini      → x_fini
      .create    → x_create
      .destroy   → x_destroy
      .configure → x_configure
• Upon user-request (ipc_create)
  – The KIPCM creates a particular IPC Process instance
    1. Looks for the correct factory (by name)
    2. Calls the .create “method”
    3. The factory returns a “compliant” IPC Process object
    4. Binds that object into its data model
• Upon un-registration
  – The factory triggers the “destruction” of all the IPC Processes it “owns”
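The register, look up by name, then call `.create` sequence can be sketched in plain C as below. All names are hypothetical (this is not the `kipcm_ipcp_factory_register` signature, which the slides elide), the registry is a fixed array without locking, and the destruction of owned instances on un-registration is left out.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FACTORIES 4

/* Hypothetical minimal IPC Process instance. */
struct ipcp_instance {
    int         id;
    const char *type;
};

/* Hooks published by a factory (only two of the five, for brevity). */
struct ipcp_factory_ops {
    int (*create)(struct ipcp_instance *inst, int id);
    int (*destroy)(struct ipcp_instance *inst);
};

struct ipcp_factory {
    const char                    *name; /* NULL marks a free slot */
    const struct ipcp_factory_ops *ops;
};

static struct ipcp_factory registry[MAX_FACTORIES];

static struct ipcp_factory *factory_find(const char *name)
{
    for (size_t i = 0; i < MAX_FACTORIES; i++)
        if (registry[i].name && strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}

/* Publish a factory under a unique name. Returns a handle, or NULL on failure. */
struct ipcp_factory *factory_register(const char *name,
                                      const struct ipcp_factory_ops *ops)
{
    if (!name || !ops || factory_find(name))
        return NULL;
    for (size_t i = 0; i < MAX_FACTORIES; i++) {
        if (!registry[i].name) {
            registry[i].name = name;
            registry[i].ops = ops;
            return &registry[i];
        }
    }
    return NULL;
}

/* Unpublish a factory using the handle returned at registration. */
int factory_unregister(struct ipcp_factory *f)
{
    if (!f || !f->name)
        return -1;
    f->name = NULL;
    f->ops = NULL;
    return 0;
}

/* Steps 1 and 2 of the user request: find the factory by name, call .create. */
int ipc_create(const char *factory_name, int id, struct ipcp_instance *inst)
{
    struct ipcp_factory *f = factory_name ? factory_find(factory_name) : NULL;

    if (!f || !f->ops->create)
        return -1;
    return f->ops->create(inst, id);
}

/* Example hooks for a dummy shim. */
static int dummy_create(struct ipcp_instance *inst, int id)
{
    inst->id = id;
    inst->type = "shim-dummy";
    return 0;
}

static int dummy_destroy(struct ipcp_instance *inst)
{
    inst->id = -1;
    return 0;
}

const struct ipcp_factory_ops dummy_ops = {
    .create  = dummy_create,
    .destroy = dummy_destroy,
};
```

Keeping the lookup keyed on the factory name means the caller never needs to know which concrete type of IPC Process it is creating.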

IPC Process Instances
• The .create hook provided by the factories returns an IPC Process “object”
• There are two “major” types of IPC Processes:
  – Normal
  – Shims
• Regardless of its type
  – The interface is the same
  – Each IPC Process implements its “core” code:
    • Shim IPC Process:
      – Each shim IPC Process provides its own implementation
    • Normal IPC Process:
      – The stack provides an implementation for all of them

IPC Process Instances Interface
• The IPC Process “object”
  – instance_data
  – instance_ops
• The IPC Process interface is the same for all types, but each type decides which ops it will support
  – Some are specific to normal or shim, a few are common to both
• instance_ops:
    .application_register      = x_application_register
    .application_unregister    = x_application_unregister
    .assign_to_dif             = x_assign_to_dif
    .sdu_write                 = x_sdu_write
    .flow_allocate_request     = shim_allocate_request
    .flow_allocate_response    = shim_allocate_response
    .flow_deallocate           = shim_deallocate

    .connection_create         = normal_connection_create
    .connection_update         = normal_connection_update
    .connection_destroy        = normal_connection_destroy
    .connection_create_arrived = normal_connection_arrived
    .pft_add                   = normal_pft_add
    .pft_remove                = normal_pft_remove
    .pft_dump                  = normal_pft_dump
  – They support similar functionalities (except the PFT’s)
  – How they translate into ops depends on the type

Write operation
[Figure: an SDU written by the application (port_id app2) descends from user space into kernel space through KIPCM and KFA, then through IPCP 2 (EFCPC 2, EFCP 2i, DTP, RMT 2), IPCP 1 on port_id 21 (EFCPC 1, EFCP 1j, DTP, RMT 1) and the shim IPCP 0 on port_id 10.]
Calls shown in the figure, from the application down to the shim:
    sys_sdu_write(sdu, app2)
    kipcm_sdu_write(sdu, app2)
    kfa_flow_sdu_write(sdu, app2)
    normal_write(sdu, app2)
    efcp_container_write(sdu, 2i)
    efcp_write(sdu)
    dtp_write(sdu)
    rmt_send(pdu)
    kfa_flow_sdu_write(sdu*, 21)
    normal_write(sdu*, 21)
    efcp_container_write(sdu*, 1j)
    efcp_write(sdu*)
    dtp_write(sdu*)
    rmt_send(pdu*)
    kfa_flow_sdu_write(sdu**, 10)
    shim_write(sdu**, 21)

Read operation
[Figure: the application (port_id app2) reads from user space through KIPCM and KFA, while incoming data climbs from the shim IPCP 0 on port_id 10 through IPCP 1 on port_id 21 (RMT 1, EFCPC 1, EFCP 1j, DTP) and IPCP 2 (RMT 2, EFCPC 2, EFCP 2i, DTP).]
Calls shown in the figure on the application side:
    sys_sdu_read(app2)
    kipcm_sdu_read(app2)
    kfa_flow_sdu_read(app2)
Calls shown in the figure, from the shim up to the application's port:
    kfa_sdu_post(sdu**, 10)
    rmt_receive(sdu**, 10)
    efcp_container_receive(pdu*, 1j)
    efcp_receive(pdu*)
    dtp_receive(pdu*)
    kfa_sdu_post(sdu*, 21)
    rmt_receive(sdu*, 21)
    efcp_container_receive(pdu, 2i)
    efcp_receive(pdu)
    dtp_receive(pdu)
    kfa_sdu_post(sdu, app2)

Shim IPC Processes
• The shims are the “lowest” components in the kernel space
• They have two interfaces:
  – NB: the same for each shim, represented by hooks published into KIPCM factories
  – SB: depends on the technology
• There are currently 2 shims:
  – shim-dummy:
    • Confined into a single host (“loopback”)
    • Used for debugging & testing the stack
  – shim-eth-vlan:
    • As defined in the spec, runs over 802.1Q

Shim-dummy
[Figure: the IPC Process Daemon and IPC Manager Daemon in user space; in the kernel, KIPCM / KFA call shim_dummy_create and shim_dummy_destroy and use the RINA IPC API of the dummy shim IPC Process.]

Shim-eth-vlan
[Figure: the IPC Process Daemon and IPC Manager Daemon in user space; in the kernel, KIPCM / KFA call shim_eth_create and shim_eth_destroy and use the RINA IPC API of the shim IPC Process over 802.1Q. The shim uses RINARP (rinarp_add, rinarp_remove, rinarp_resolve) and reaches the devices layer through shim_eth_rcv and dev_queue_xmit.]

RINARP
[Figure: shim-eth-vlan sits on top of the RINARP API; RINARP comprises ARP826 (Core, Maps, Tables), ARM and the TX/RX paths over the devices layer.]

DETAILED SOFTWARE ARCHITECTURE: USER SPACE

Introduction to the user space framework
[Figure: in user space, the IPC Manager Daemon (main logic, IDD, RIB & RIB Daemon, management agent), the normal IPC Process Daemon (layer management: enrollment, flow allocation, resource allocation, PDU Forwarding Table generation, RIB & RIB Daemon) and the applications (application logic) each link against librina, which reaches the kernel through system calls, Netlink sockets and sysfs.]
• IPC Manager Daemon: broker between apps & IPC Processes, central point of management in the system
• IPC Process Daemon: implements the layer management components of an IPC Process
• librina: abstracts out the communication details between daemons and the kernel

Librina software architecture
(Diagram: an API layer (C++) with proxy classes, model classes, event classes and an Event Producer, used to perform actions and to get events. Below it, a core layer (C++) with the events queue, concurrency classes on top of libpthread, a Netlink message reader thread, the Netlink Manager, Netlink message parsers/formatters on top of libnl/libnl-gen, syscall wrappers and a logging framework. The kernel sits below the user space boundary.)

The IPC Process and IPC Manager Daemons
• IPC Manager Daemon
  – Manages the lifecycle of the IPC Processes
  – Broker between applications and IPC Processes
  – Local management agent
  – DIF Allocator client (to search for applications not available through local DIFs)
• IPC Process Daemon
  – Layer management components of the IPC Process
    • RIB Daemon, RIB
    • CDAP parsers/generators
    • CACEP
    • Enrollment
    • Flow Allocation
    • Resource Allocation
    • PDU Forwarding Table Generation
    • Security Management

IPC Manager Daemon
(Diagram: the IPC Manager Daemon (Java) contains console classes (a command line interface server thread serving a CLI session over a local TCP connection), the IPC Manager core classes (IPC Process Manager, Flow Manager, Application Registration Manager), configuration classes, and a bootstrapper that reads the configuration file. The console and the main event loop call operations on the IPC Manager core classes and get the operation result back; the core classes call the IPC Process Factory, IPC Process or Application Manager. The main event loop blocks on EventProducer.eventWait(). Below: SWIG wrappers (high-level, Java), the Java Native Interface (JNI), SWIG wrappers (low-level, C++), and librina (C++) with the IPC Process, IPC Process Factory, Application Manager, model classes, event classes and Event Producer, which reaches the kernel through system calls and Netlink messages.)

IPC Process Daemon
(Diagram: the IPC Process Daemon (Java) contains supporting classes (Delimiter, CDAP parser, Encoder), the layer management function classes (Enrollment Task, Flow Allocator, Resource Allocator, Registration Manager, Forwarding Table Generator), and the RIB Daemon with the Resource Information Base (RIB). Calls shown: RIBDaemon.sendCDAPMessage(), RIBDaemon.cdapMessageReceived(), KernelIPCProcess.writeMgmtSDU() and KernelIPCProcess.readMgmtSDU(), the latter used by a CDAP message reader thread. The main event loop blocks on EventProducer.eventWait() and calls IPCManager or KernelIPCProcess. Below: SWIG wrappers (high-level, Java), the Java Native Interface (JNI), SWIG wrappers (low-level, C++), and librina (C++) with KernelIPCProcess, IPC Manager, model classes, event classes and Event Producer, which reaches the kernel through system calls and Netlink messages.)

Example workflow: IPC Process creation
• The IPC Manager reads a configuration file with instructions on the IPC Processes it has to create at startup
  – Or the system administrator can request creation through the local console (CLI session over a local TCP connection)
• The configuration file also instructs the IPC Manager to register the IPC Process in one or more N-1 DIFs, and to make it a member of a DIF
Steps shown in the diagram between the IPC Manager Daemon, the IPC Process Daemon and the kernel (NL = Netlink):
1. Create IPC Process (syscall)
2. Fork (syscall)
3. Initialize librina
4. When completed, notify IPC Manager (NL)
5. IPC Process initialized (NL)
6. Register app request (NL)
7. Register app response (NL)
8. Notify IPC Process registered (NL)
9. Assign to DIF request (NL)
10. Update state and forward to kernel (NL)
11. Assign to DIF request (NL)
12. Assign to DIF response (NL)
13. Assign to DIF response (NL)

Example workflow: Flow allocation
• An application requests a flow to another application, without specifying what DIF to use
Steps shown in the diagram between Application A, the IPC Manager Daemon, the IPC Process Daemon and the kernel (NL = Netlink):
1. Allocate Flow Request (NL)
2. Check app permissions
3. Decide what DIF to use
4. Forward request to the appropriate IPC Process Daemon
5. Allocate Flow Request (NL)
6. Request port-id (syscall)
7. Create connection request (NL)
8. On create connection response (NL), write CDAP message to N-1 port (syscall)
9. On getting an incoming CDAP message response (syscall), update connection (NL)
10. On getting update connection response (NL), reply to IPC Manager (NL)
11. Allocate Flow Request Result (NL)
12. Forward response to app
13. Allocate Flow Request Result (NL)
14. Read data from the flow (syscall) or write data to the flow (syscall)
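
Steps 2 to 4 of this workflow (permission check, DIF selection, forwarding) can be illustrated with a small Python sketch. The slide does not say how the IPC Manager picks a DIF, so the policy used here (take the first local DIF in which the destination application is registered) is purely an assumption, as are the function name and the data structures.

```python
def select_ipc_process(source_app, dest_app, permissions, registrations):
    """Hypothetical model of steps 2-4 in the IPC Manager Daemon.

    permissions:   set of (source, destination) pairs allowed to talk (assumed)
    registrations: ordered list of (dif_name, ipcp_id, registered_apps) (assumed)
    Returns the (dif_name, ipcp_id) the request would be forwarded to.
    """
    # Step 2: check app permissions.
    if (source_app, dest_app) not in permissions:
        raise PermissionError(f"{source_app} may not reach {dest_app}")

    # Step 3: decide what DIF to use (assumed policy: first match wins).
    for dif_name, ipcp_id, registered_apps in registrations:
        if dest_app in registered_apps:
            # Step 4: this IPC Process Daemon receives the forwarded request.
            return dif_name, ipcp_id

    raise LookupError(f"{dest_app} is not reachable through any local DIF")


permissions = {("app-a", "app-b")}
registrations = [
    ("shim-dif-300", 1, {"ipcp-normal-1"}),
    ("normal-dif", 2, {"app-b", "app-c"}),
]
target = select_ipc_process("app-a", "app-b", permissions, registrations)
```

In the architecture described earlier, an application that is not available through local DIFs would be searched for through the DIF Allocator client; the sketch simply raises an error in that case.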

WRAP UP

Y1: Where we are / What do we have…
• 9 months, ~3700 commits and ~214 KLOCs later …
  – ~27 KLOCs in the kernel
  – ~87 KLOCs in librina (hand-written)
  – ~35 KLOCs in librina (automatically generated)
  – ~65 KLOCs in rinad
• … the project released its 1st prototype (internal release):
  – User and kernel space components providing unreliable flow functionalities
  – Build, configuration and development frameworks
  – A testing framework
    • A testing application (RINABand, compilation-time)
    • A regression framework (ad-hoc, run-time)
• We’re actively working on the 2nd prototype

Y2: Plans …
• Prototype 2:
  – Reliable flows support
  – Shim DIF for HV
    • Same schema as shim-dummy/shim-eth-vlan in prototype 1
  – Complete routing
  – Public release as FOSS (July 2014)
• Prototype 3:
  – Shim DIF over TCP/UDP
    • Same schema as prototype 2
  – Faux sockets API via
    1. FI: function interposition (dynamic linking)
    2. SCI: system call interposition (static linking)

IRATI EXPERIMENTATION GOALS

Experimentation goals (diagram labels: TCP/IP, UDP/IP, RINA prototype, use cases, specifications)

IRATI experimentation in a nutshell (diagram: testbeds used across Phases I, II and III; labels: iLab.t, the EXPERIMENTA OFELIA island, PSOC)

PROTOTYPE STATUS AND TOOLS

Available Tools
• RINABand
  – Test application for RINA
  – Java (user space)
  – Requires multiple flows between two application process instances (APIs): 1 control flow and N data flows
  – (Diagram: RINABand and RINABandClient, each with a Control AE and a Data AE, connected through a DIF)
• Echo server/client
  – Test parameters: number and size of SDUs to be sent
  – Ping-like operation
  – The test completes when either all the SDUs have been sent and received, or when more than a certain interval of time elapses without receiving an SDU
  – Client and server report statistics:
    • the number of transmitted and received SDUs
    • the time the test lasted
  – Single flow between two APIs
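
The completion rule of the echo test can be written down as a small predicate. This is a minimal Python sketch of the rule as stated on the slide, not code from the Echo application; the function and parameter names are assumptions.

```python
def echo_test_complete(total_sdus, sent, received, last_rx_time, now, max_idle):
    """Return True when the echo test should stop.

    total_sdus:   number of SDUs the test was asked to exchange
    sent:         SDUs sent so far
    received:     SDUs received so far
    last_rx_time: time (seconds) at which the last SDU was received
    now:          current time (seconds)
    max_idle:     longest tolerated gap (seconds) without receiving an SDU
    """
    all_exchanged = sent == total_sdus and received == total_sdus
    idle_too_long = (now - last_rx_time) > max_idle
    return all_exchanged or idle_too_long
```

The idle timeout is what lets the test terminate on an unreliable flow, where a lost SDU would otherwise leave the receiver waiting forever.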

First Phase Prototype capabilities
• Capabilities
  – Decision to focus on the shim-eth-vlan
  – Supports only a single flow between two application process instances (APIs)
• Ethernet frame layout
  Field                       Size
  Preamble                    7 bytes
  MAC dest                    6 bytes
  MAC src                     6 bytes
  802.1q header (optional)    4 bytes
  Ethertype                   2 bytes
  Payload                     42-1500 bytes
  FCS                         4 bytes
  Interframe gap              12 bytes
• Impact on experiments
  – Could not use RINABand
  – Rely on the Echo server/client application
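
The field sizes on this slide give an upper bound on the share of the wire time that carries payload. The Python sketch below does that arithmetic using only the byte counts listed above; the function and constant names are assumptions. The slide lists a 7-byte preamble and no separate start frame delimiter, and the sketch keeps the slide's numbers.

```python
# Per-frame overhead in bytes, taken from the frame layout on the slide.
FRAME_OVERHEAD_BYTES = {
    "preamble": 7,
    "mac_dest": 6,
    "mac_src": 6,
    "dot1q_header": 4,
    "ethertype": 2,
    "fcs": 4,
    "interframe_gap": 12,
}


def payload_share(payload_bytes):
    """Fraction of the bytes on the wire that are payload, for a tagged frame."""
    if not 42 <= payload_bytes <= 1500:
        raise ValueError("payload must be between 42 and 1500 bytes")
    overhead = sum(FRAME_OVERHEAD_BYTES.values())
    return payload_bytes / (payload_bytes + overhead)


largest = payload_share(1500)   # full-size frames
smallest = payload_share(42)    # minimum-size frames
```

With these numbers the overhead is 41 bytes per frame, so full-size frames are about 97% payload and minimum-size frames about 51%. This bounds what any stack can achieve over the link; it does not by itself explain a measured goodput figure.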

FIRST PHASE EXPERIMENTS

First phase use case

Single flow echo/bw test
• Validate stack / prototype 1
• Validate Ethernet transparency
• Measure goodput

Multiple flow echo/bw validation
• Validate multiple IPC processes
• Measure goodput

Concurrent RINA and IP
• Validate concurrency of the IP and RINA stacks
• Measure goodput

Presented by Leonardo Bergesio: FIRST PHASE RESULTS @ I2CAT

i2CAT OFELIA Island, EXPERIMENTA
• Experiment == slice
• FlowSpace:
  – Arbitrary topology
  – Partition of the vector space of OF header fields
  – Slicing by VLANs
• VMs to be used as end points or controllers
• Perfect match:
  – Slice → VLAN → Shim DIF over Ethernet

Workflow I
• Access the island using OCF. Create or access your project/slice

Workflow II
• Select FlowSpace topology and slice VLAN/s (DIFs)

Workflow III
• Create VMs → nodes and OpenFlow controller

Resources Mapping
Slice with two VLAN ids, one per DIF: 300, 301

Single flow
• Packets are sent over the Ethernet/VLAN bridge
• Goodput roughly 60% of link capacity (iperf tested)
Project: IRATIbasicusecase, Slice: multivlanslice

Multiple flows
• Flows to a shared server (B & C to D) achieved half the throughput of the single flow (A to B)
Project: IRATIbasicusecase, Slice: multivlanslice

Concurrency between IP and RINA stack
Project: IRATIbasicusecase, Slice: multivlanslice
UDP results:
  Time interval          90 s
  Number of datagrams    554915
  Data sent              778 MB
  Bandwidth              75.5 Mbps
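
The UDP figures on this slide can be cross-checked with simple arithmetic. The Python sketch below derives the mean rate and the mean datagram size from the reported totals; since the slide does not say whether "MB" means 10^6 or 2^20 bytes, both readings are computed. The function names are assumptions.

```python
DATAGRAMS = 554915      # from the slide
DATA_SENT_MB = 778      # from the slide
INTERVAL_S = 90         # from the slide


def mean_rate_mbps(data_mb, interval_s, bytes_per_mb):
    """Average rate in megabits per second (10^6 bits) over the interval."""
    return data_mb * bytes_per_mb * 8 / interval_s / 1e6


def mean_datagram_bytes(data_mb, datagrams, bytes_per_mb):
    """Average number of payload bytes per datagram."""
    return data_mb * bytes_per_mb / datagrams


rate_decimal = mean_rate_mbps(DATA_SENT_MB, INTERVAL_S, 10**6)
rate_binary = mean_rate_mbps(DATA_SENT_MB, INTERVAL_S, 2**20)
size_decimal = mean_datagram_bytes(DATA_SENT_MB, DATAGRAMS, 10**6)
size_binary = mean_datagram_bytes(DATA_SENT_MB, DATAGRAMS, 2**20)
```

Over a full 90 s the totals correspond to roughly 69 Mbps (decimal megabytes) or 72.5 Mbps (binary megabytes), with about 1402 or 1470 bytes per datagram respectively. Neither reading reproduces the reported 75.5 Mbps exactly, and the slide does not state over which interval that figure was averaged.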

FIRST PHASE RESULTS @ IMINDS

iLab.t “Virtual Wall”: Concept

Virtual Wall: Topology Control

Virtual Wall @ iMinds

Emulab: architecture (diagram labels: Internet, users, Web/DB/SNMP, switch management, power control, control switch/router, serial, PC … PC (168), programmable “patch panel”)

Emulab: programmable patch panel

Workflow (diagram labels: experiment idea; GUI; ns script; hardware mapping and swap in; additional scripting, where Emulab runs the additional scripts from the ns file)

Basic Experiment on iMinds island
• Use a LAN for the VLAN bridge

Single flow
• Packets are sent over the Ethernet/VLAN bridge
• Goodput roughly 60% of the iperf bandwidth

Multiple flows

Concurrency between IP and RINA stack (diagram labels: start Echo Server, UDP)

CONCLUSIONS

Conclusions from phase I experimentation
• The IRATI stack and the Shim DIF are running
• ~60% goodput in comparison to iperf
• No major performance problems
• When running concurrently, the IRATI stack takes precedence over the IP stack
  – our stack doesn't lose a packet from the syscalls to the devices layer
• ARP in the Shim DIF should not reuse the 0x0806 Ethertype because of incompatibility with existing implementations
• Registration to the Shim DIF over Ethernet should be explicit

Thanks for your attention! Questions?
