


This page describes the operation of the secure data infrastructure reference implementation.

Data Ingress


Whenever a Data Owner role wants to import data into the secure data infrastructure, the Data Ingress process is started. First, the Data Owner role must manually sign an agreement on data delivery ("data processing agreement") and provide metadata of the data set: (i) a list of attributes with their descriptions and the primary key, (ii) the number of records (e.g. rows), (iii) the format of the data (currently, all PostgreSQL data types are supported), and (iv) a short description of the data.
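The four metadata items can be captured in a small record and checked for completeness before the agreement is filed. The following is a minimal sketch only: the class names, field names and checks are hypothetical and not part of the reference implementation.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str
    description: str
    data_type: str  # a PostgreSQL type name, e.g. "integer" or "text"


@dataclass
class DatasetMetadata:
    attributes: list      # list of Attribute, item (i)
    primary_key: list     # attribute names forming the primary key, item (i)
    record_count: int     # number of records (rows), item (ii)
    description: str      # short description of the data, item (iv)

    def problems(self) -> list:
        """Return human-readable problems; an empty list means complete."""
        issues = []
        names = [a.name for a in self.attributes]
        if not self.attributes:
            issues.append("no attributes listed")
        if len(names) != len(set(names)):
            issues.append("duplicate attribute names")
        for a in self.attributes:
            if not a.description.strip():
                issues.append(f"attribute {a.name!r} has no description")
        if not self.primary_key:
            issues.append("no primary key given")
        for key in self.primary_key:
            if key not in names:
                issues.append(f"primary key {key!r} is not a listed attribute")
        if self.record_count < 0:
            issues.append("record count must not be negative")
        if not self.description.strip():
            issues.append("short description is missing")
        return issues


# Hypothetical example data set with two attributes.
meta = DatasetMetadata(
    attributes=[
        Attribute("id", "Record identifier", "integer"),
        Attribute("measured_at", "Time of measurement", "timestamptz"),
    ],
    primary_key=["id"],
    record_count=1200,
    description="Example sensor readings",
)
print(meta.problems())
```

Item (iii), the data format, is represented here by the per-attribute `data_type`; checking it against the list of PostgreSQL types is left out of the sketch.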

After signing the data processing agreement, the Data Owner role has to request an account. To obtain the tools required to ingress data into the system, the Data Owner role needs to disclose the following personal information: (i) first and last name, (ii) e-mail address, and (iii) a mobile phone number to which messages are sent at the end of the process. This step requires manual interaction (it will usually be automated using trusted authentication services) and currently takes a few minutes after approval before the Data Owner role can access its VM.

As shown in the figure below, the secure data infrastructure then automatically creates a new isolated Data Owner-VM, updates the firewall rules to grant the Data Owner role access to this machine, and sends the credentials needed to access it and transfer the sensitive data. After confirmation that the data has been sent completely (or after a pre-defined timeout of 100 days), the infrastructure locks the virtual machine, transfers the data from it and securely destroys it. The Data Owner role is then notified via two different channels that the transfer was successful.
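The lock-transfer-destroy sequence can be read as a small state machine. The sketch below is illustrative only: the state names and function names are hypothetical, and it assumes the 100-day timeout is counted from the creation of the Data Owner-VM, which the text does not specify.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum

TRANSFER_TIMEOUT = timedelta(days=100)  # pre-defined timeout from the text


class OwnerVmState(Enum):
    OPEN = "open"            # Data Owner may connect and transfer data
    LOCKED = "locked"        # access revoked, data is moved off the VM
    DESTROYED = "destroyed"  # VM securely destroyed after the transfer


def should_lock(created_at: datetime, confirmed: bool, now: datetime) -> bool:
    """Lock once the owner confirms completion or the timeout has elapsed."""
    return confirmed or (now - created_at) >= TRANSFER_TIMEOUT


def next_state(state, created_at, confirmed, transferred, now):
    """Advance the VM by at most one step; states are never skipped."""
    if state is OwnerVmState.OPEN and should_lock(created_at, confirmed, now):
        return OwnerVmState.LOCKED
    if state is OwnerVmState.LOCKED and transferred:
        return OwnerVmState.DESTROYED
    return state


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
later = created + timedelta(days=100)
print(next_state(OwnerVmState.OPEN, created, False, False, later))
```

Keeping the transition in one function makes it easy to see that a VM is never destroyed before it has been locked and its data transferred.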



Data ingress process


Connect via the sysadmin OpenVPN profile and run the create-owner-node.yml playbook:

```console
ansible-playbook create-owner-node.yml -e @vars/secure.yml -e "username=jdoe" --ask-vault-pass
```

```console title="Console output"
...
TASK [Check config] *******************************************************************************************
sysadmin password (FreeIPA): (hidden)
...
TASK [Print] **************************************************************************************************
ok: [localhost] => {
    "msg": [
        "Credentials for Owner Node:",
        "",
        "Identifier: Owner Node lo8phien9y",
        "Address:",
        "",
        "Username: jdoe"
    ]
}
...
```
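If the playbook is started from a script rather than typed by hand, the command line shown above can be assembled as an argument list so that the username is never interpolated into a shell string. This is a sketch under assumptions: the helper name and the username pattern are hypothetical, and only the flags that appear in the command above are used.

```python
import re
import shlex

# Assumption: a conservative username policy; the real policy may differ.
USERNAME_RE = re.compile(r"[a-z][a-z0-9_-]{0,31}")


def node_playbook_command(playbook: str, username: str) -> list:
    """Build the ansible-playbook argument list for a node playbook."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"unexpected username: {username!r}")
    return [
        "ansible-playbook", playbook,
        "-e", "@vars/secure.yml",      # variables file, as in the command above
        "-e", f"username={username}",
        "--ask-vault-pass",            # vault password is still prompted for
    ]


cmd = node_playbook_command("create-owner-node.yml", "jdoe")
print(shlex.join(cmd))  # shell-safe rendering, useful for logging
```

The list can be passed to `subprocess.run(cmd, check=True)`; because `--ask-vault-pass` prompts interactively, the call needs an attached terminal.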

Data Access


Creating a user requires proper identification of the Analyst, which can be performed by:

  1. Data Provider (on behalf of the Data Owner)
  2. Data Owner

Once the identity is verified, provide the System Administrator with the following basic information to create a user account:

  • Given name
  • Last name
  • E-Mail address
  • Phone number
  • SSH public key (e.g. RSA or Ed25519)
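Before the request is handed to the System Administrator, the listed items can be checked for completeness and the SSH public key for basic well-formedness. The sketch below is an assumption-laden illustration: the field names and function names are hypothetical. It relies only on the one-line OpenSSH public key layout, in which the base64 data begins with a length-prefixed copy of the key type.

```python
import base64
import binascii

# Assumption: these keys mirror the list above; the names are hypothetical.
REQUIRED_FIELDS = ("given_name", "last_name", "email", "phone", "ssh_public_key")


def missing_fields(request: dict) -> list:
    """Return the required fields that are absent or blank in the request."""
    return [f for f in REQUIRED_FIELDS if not str(request.get(f) or "").strip()]


def ssh_key_type(public_key: str) -> str:
    """Return the type of a one-line OpenSSH public key or raise ValueError."""
    parts = public_key.split()
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-data> [comment]'")
    key_type, blob = parts[0], parts[1]
    try:
        raw = base64.b64decode(blob, validate=True)
    except binascii.Error as exc:
        raise ValueError("key data is not valid base64") from exc
    # The decoded data starts with a 4-byte big-endian length and the type name.
    name_len = int.from_bytes(raw[:4], "big")
    if raw[4:4 + name_len].decode("ascii", "replace") != key_type:
        raise ValueError("key type does not match the encoded key data")
    return key_type


print(missing_fields({"given_name": "Jane", "last_name": "Doe"}))
```

This catches truncated or mislabeled keys early; it does not verify that the key is cryptographically sound or that the user holds the private key.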



Data access process


The System Administrator executes the create-user.yml playbook:

```console
ansible-playbook create-user.yml -e "username=jdoe" -e "role_type=owners"
```

Automated playbook tasks

  1. Create an account at the Identity Node
  2. Assign this account to a group (analysts, owners, providers, sysadmins, dbadmins)
  3. Create an OpenVPN profile (valid for 825 days by default)
  4. Create a firewall zone for this account only
  5. Print the initial password to the System Administrator
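Two details of the task list lend themselves to a quick check before the playbook is run: whether the requested `role_type` is one of the known groups, and when an OpenVPN profile issued today would expire. The sketch below is illustrative; the function names are hypothetical, and it assumes the 825 days are counted in whole days from the issue date.

```python
from datetime import date, timedelta

GROUPS = ("analysts", "owners", "providers", "sysadmins", "dbadmins")
OPENVPN_VALIDITY_DAYS = 825  # default validity from the task list


def check_role_type(role_type: str) -> str:
    """Return role_type unchanged if it names a known group, else raise."""
    if role_type not in GROUPS:
        raise ValueError(f"role_type must be one of {', '.join(GROUPS)}")
    return role_type


def profile_expiry(issued_on: date, validity_days: int = OPENVPN_VALIDITY_DAYS) -> date:
    """Return the date on which a profile issued on issued_on stops being valid."""
    return issued_on + timedelta(days=validity_days)


print(check_role_type("owners"), profile_expiry(date(2024, 1, 1)))
```

Recording the expiry date alongside the account makes it easier to renew the profile before the user loses VPN access.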

```console title="Console output"
...
TASK [Print] **************************************************************************************************
ok: [proidentity] => {
    "msg": [
        "Credentials:",
        "",
        " Username: jdoe",
        " Initial Password: 9Ae!I,*HjA<p^6PrZyl;}[",
        "",
        "The OpenVPN profile is located at /tmp/jdoe.ovpn on this computer",
        "Provide it to the user through a secure channel."
    ]
}
```

The System Administrator is then presented with the result of the playbook run and provides the following information to the user through a secure channel, e.g. encrypted e-mail:

  • Username
  • Initial password
  • OpenVPN profile password
  • OpenVPN profile file (attachment)

Now connect via the sysadmin OpenVPN profile and run the create-analyst-node.yml playbook:

```console
ansible-playbook create-analyst-node.yml -e @vars/secure.yml -e "username=jdoe" --ask-vault-pass
```

```console title="Console output"
...
TASK [Check config] *******************************************************************************************
sysadmin password (FreeIPA): (hidden)
...
TASK [Print] **************************************************************************************************
ok: [localhost] => {
    "msg": [
        "Credentials for both Nodes:",
        "",
        "Identifier: Desktop Node thu2rekaed",
        "Username: mweise",
        "Address:",
        "Password: m9ZruOmoXlCo48AMZv3O",
        "",
        "Identifier: Analyst Node thu2rekaed",
        "Username: mweise",
        "Address:",
        ""
    ]
}
...
```

Data Egress




Data egress process



Failed connection via SSH

Connect to the Analyst-VM via SSH as sysadmin and execute journalctl -xeft sshd:

Postponed keyboard-interactive