Build the dsci-lab by yourself

This file:

  • Build of xubuntu_dsci_ws21, 2021-09-29

  • Xubuntu 20.04.3 LTS (Focal Fossa), 64-bit

Important: You are not required to build the dsci-lab yourself. Instead, you are advised to download the fully configured dsci-lab as a ready-to-use virtual machine from http://jbusse.de/dsci-lab/.

However, if you are a lecturer or a deeply interested student, you might want to learn how we set up the dsci-lab. Here are the steps.

Install Oracle VM VirtualBox 6.1

https://www.oracle.com/de/virtualization/technologies/vm/downloads/virtualbox-downloads.html

Start VirtualBox. Also install the Oracle VM VirtualBox Extension Pack.

Update VirtualBox?

“Update October 19th, 2021: VirtualBox 6.1.28 released! Oracle today released a 6.1 maintenance release which improves stability and fixes regressions.”

  • Install VirtualBox 6.1.28

  • https://www.virtualbox.org/wiki/Downloads -> install the VirtualBox 6.1.28 Oracle VM VirtualBox Extension Pack

  • Reinstall the Guest Additions in the dsci-lab: done on 2021-10-26!

Create the virtual Xubuntu machine

Download an ISO image (ca. 1.8 GB) of the 64-bit LTS release of Xubuntu, 20.04 (Focal Fossa), e.g. from https://xubuntu.org/download/ > Germany > http://ftp.uni-kl.de/pub/linux/ubuntu-dvd/xubuntu/releases/20.04/release/xubuntu-20.04.3-desktop-amd64.iso: right-click on the file, choose “save as”, and save it to disk.
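Before booting from the downloaded image, it can be worth checking that the download is intact. The following is a minimal sketch, assuming that the mirror publishes a SHA256SUMS file next to the ISO (an assumption, not stated above); the function name verify_sha256 is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare a file's SHA-256 checksum with an expected value.
# verify_sha256 FILE EXPECTED_HEX -> exit status 0 if the checksums match.
verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<hash>  <filename>"; keep only the hash
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Usage (copy the real checksum from the mirror's SHA256SUMS file):
# verify_sha256 xubuntu-20.04.3-desktop-amd64.iso "<checksum>" && echo "ISO ok"
```

If the function returns a non-zero status, download the ISO again before continuing.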

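Start Oracle VM VirtualBox Manager (German UI labels in parentheses):

  • Machine > New (Maschine > Neu) > Name: e.g. xubuntu_dsci_ws21, Type: Linux, Version: Ubuntu (64-bit)

  • Memory size (Speichergröße): 4096 MB

  • Hard disk (Platte): Create a virtual hard disk now (Festplatte erzeugen)

    • Hard disk file type (Dateityp der Festplatte): VDI

    • Storage on physical hard disk (Art der Speicherung): dynamically allocated (dynamisch alloziert)

    • File location and size (Dateiname und Größe): specify at least 100 GB > Create (Erzeugen)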
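The same machine could presumably also be created from the host's command line with VBoxManage, which ships with VirtualBox. This is only a sketch under assumptions: the disk file location and the controller name SATA are hypothetical choices, and the helper run only prints each command unless EXECUTE=1 is set.

```shell
#!/usr/bin/env bash
# Sketch: command-line counterpart of the GUI steps above (run on the host).
VM_NAME="xubuntu_dsci_ws21"
DISK_FILE="$HOME/VirtualBox VMs/$VM_NAME/$VM_NAME.vdi"   # hypothetical location

# Print each command; execute it only when EXECUTE=1 is set.
run() {
    printf '%s\n' "$*"
    if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

create_dsci_vm() {
    run VBoxManage createvm --name "$VM_NAME" --ostype Ubuntu_64 --register
    run VBoxManage modifyvm "$VM_NAME" --memory 4096
    # 102400 MB = 100 GB; variant Standard means dynamically allocated
    run VBoxManage createmedium disk --filename "$DISK_FILE" --size 102400 --format VDI --variant Standard
    run VBoxManage storagectl "$VM_NAME" --name SATA --add sata
    run VBoxManage storageattach "$VM_NAME" --storagectl SATA --port 0 --device 0 --type hdd --medium "$DISK_FILE"
}

create_dsci_vm
```

Review the printed commands first; running the script with EXECUTE=1 actually creates the machine.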

Start the machine xubuntu_dsci_ws21. A window will pop up: “Select start-up disk” (“Medium für Start auswählen”)

  • Add medium (Medium hinzufügen) > locate xubuntu-20.04.3-desktop-amd64.iso on your disk (ca. 1.8 GB) > Choose (Auswählen)

  • Start (Starten)

VirtualBox will start the ISO image:

  • Select a language, e.g. “Deutsch”. The German labels shown by the installer are given in parentheses below.

  • Install Xubuntu

  • Download updates (Aktualisierungen herunterladen)

  • Installation type (Installationsart): Erase disk and install (Festplatte löschen und installieren). This wipes only your newly allocated virtual hard disk, NOT the disk of your host system. > Install (Installieren)

  • Enter name, username etc. WRITE DOWN your password in a secure location!

    • Name: Data Scientist

    • Computer name (Name des Rechners): dsci-vbox

    • Username (Benutzername): data

    • Password: datadata (this is an initial and very weak password; you MUST change it later!)

A basic version of Xubuntu will be installed.

Install Basic System

“Die Installation ist abgeschlossen. Sie müssen jetzt den Rechner neu starten, um das System zu benutzen” (the installation is complete; you must now restart the computer to use the system) > Restart now (Jetzt neu starten)

  • “Remove installation medium”: (nothing to do), “press ENTER”: press Enter!

  • Log in as user Data Scientist, password datadata

The Software Updater (“Aktualisierungsverwaltung”) will ask you to update the system: do it, “Install now” (“jetzt installieren”).

From here on we will use the command line where possible. Open a new terminal by pressing Ctrl-Alt-T (Strg-Alt-T on a German keyboard).

To update your installation, start the Software Updater (“Aktualisierungsverwaltung”) regularly. Alternatively run:

sudo apt update && sudo apt upgrade

Start the web browser (“Internetnavigator”), i.e. Firefox. Navigate to http://jbusse.de/dsci-lab/dsci-lab-build.html. Alternatively, type the following into a terminal (do not forget the ampersand “&” at the end of the line; it starts Firefox in the background so that the terminal stays usable):

firefox http://jbusse.de/dsci-lab/dsci-lab-build.html &

If your network connection works properly, Firefox will open this page in your new Linux machine. This allows you to copy and paste the commands below into your terminal. Bookmark this page in Firefox by pressing Ctrl-D.

Update this file?

To document your personal installation, you might wish to update the very file you are currently reading.

  • In Firefox, click the .md button in the upper right corner of the page and save the file to Downloads/dsci-lab-build.md.

  • Alternatively, download the Markdown source of this file from the command line:

cd ~/Downloads
wget http://jbusse.de/dsci-lab/_sources/dsci-lab-build.md
cd

Edit the file dsci-lab-build.md, e.g. with Mousepad. Alternatively, you might want to use your favourite programming editor, e.g.

sudo apt install emacs
sudo apt install vim

Basic Packages

At this stage you have a clean, brand new virtual Xubuntu machine, with VirtualBox Guest Additions enabled.

To get our dsci-lab version you have to install some more packages. You can do so by simply copying the following commands into a terminal (open a new terminal e.g. by pressing Ctrl-Alt-T):

Mindmap (incl. Java):

sudo apt install freeplane

Conda, Jupyter, Jupyterbook

Conda

cd Downloads/
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh

Also say yes to “run conda init”.

(Instead of installing Miniconda you might also want to install Anaconda (https://www.anaconda.com/products/individual). Anaconda is much more complete than Miniconda, but IMHO too “fat”. In our dsci-lab we prefer a lightweight system. This allows you to look more easily “under the hood”, to understand what’s going on, and to maintain the whole system - the dependencies in our setup are complex enough anyhow.)

After you have installed Conda, close your terminal (Ctrl-D) and open a new one (e.g. with Ctrl-Alt-T). (Why close and open? With “conda init”, the installer has added conda's setup code to your ~/.bashrc, and this code only takes effect in a newly started shell. Conda puts an extra virtual environment layer over the standard Python installation, so we can work with multiple Python configurations in parallel. To learn more about conda virtual environments see https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-conda.html)

Your terminal command line should now start with (base), which is the name of your current virtual conda environment:

(base) jb@dsci-vbox:~$

As said: Miniconda is minimalistic. Thus we have to install some modules ourselves. Some important ones are:

pip install pandas numpy matplotlib scikit-learn seaborn xgboost rdflib lxml owlrl markdownify markdown python-slugify
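To confirm that the packages above are importable, a small check like the following could be used. It is a minimal sketch: the function name check_modules is hypothetical, and it assumes that python3 resolves to the interpreter of the active conda environment. Note that some import names differ from the pip package names (scikit-learn is imported as sklearn, python-slugify as slugify).

```shell
#!/usr/bin/env bash
# check_modules MODULE... -> prints every module that fails to import;
# exit status is 0 only if all imports succeed.
check_modules() {
    local missing=0 mod
    for mod in "$@"; do
        if ! python3 -c "import $mod" >/dev/null 2>&1; then
            echo "missing: $mod"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, with the import names of the packages installed above:
# check_modules pandas numpy matplotlib sklearn seaborn xgboost rdflib lxml owlrl markdownify markdown slugify
```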

Notes:

  • We install these packages into the virtual conda environment base. If we decide to create another virtual conda environment, it will be empty again, and we have to populate it with libraries again. This is the reason (a) why we prefer lightweight environments, and (b) why we want to learn how to install libraries ourselves.

  • Caveat: Do not install Conda with sudo; install it as a normal user. Every user and every virtual conda environment is completely independent of the others. There is no system-wide installation.

Keep conda current:

conda update --all

Jupyter

sudo apt install jupyter

Install jupytext:

conda install -c conda-forge jupytext

Jupyterbook

Jupyter Book v0.8 (March 2021, cf. https://jupyterbook.org/intro.html) uses the highly sophisticated documentation tool Sphinx to create websites and (via LaTeX) a PDF book out of a collection of Jupyter notebooks.

pip install -U jupyter-book

Test the installation: Build the book according to https://jupyterbook.org/start/build.html

mkdir -p ~/b
cd ~/b
jb create test
jb build test
cd
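Whether the test build succeeded can be checked by looking for the generated start page. The sketch below assumes the default Jupyter Book output location _build/html/index.html inside the book directory; the function name book_built is hypothetical.

```shell
#!/usr/bin/env bash
# book_built BOOK_DIR -> exit status 0 if the HTML start page of the book exists.
book_built() {
    local index="$1/_build/html/index.html"
    if [ -f "$index" ]; then
        echo "ok: $index"
    else
        echo "not built: $index"
        return 1
    fi
}

# Usage for the test book created above:
# book_built ~/b/test && firefox ~/b/test/_build/html/index.html &
```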

(We will use this b directory from now on for examples local to the guest machine, as opposed to the a directory, which is set up as a shared folder and gives us access to the host file system.)

LaTeX

Optionally install LaTeX (not contained in xubuntu_dsci_ws21):

sudo apt install imagemagick
sudo apt-get install texlive-latex-recommended texlive-latex-extra \
                     texlive-fonts-recommended texlive-fonts-extra \
                     texlive-xetex latexmk

Size: about 1.9 GB (!)

Shared Folders

Create once:

mkdir -p ~/a

In the VirtualBox menu: Devices > Shared Folders (Geräte > Gemeinsame Ordner) >

  • Click the “folder+” icon: this adds a new shared folder

  • Folder path (Ordner-Pfad): choose a folder on the host machine

  • Folder name (Ordner-Name): e.g. “123abc” (any name will do, it only has to be unique)

  • Make permanent (Permanent erzeugen): check

After every boot of the virtual machine:

sudo mount  -t vboxsf -o uid=1000,gid=1000 123abc ~/a

Note: This (complicated) command is already in the bash history of the dsci-lab. Pressing Ctrl-R and typing mount brings it back immediately, so it takes no effort.
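The mount step could also be wrapped in a small helper that is safe to run repeatedly. This is a minimal sketch, not part of the dsci-lab: the function names needs_mount and mount_shared are hypothetical, and the uid and gid are taken from the current user instead of the fixed value 1000 used above.

```shell
#!/usr/bin/env bash
# needs_mount DIR -> exit status 0 if DIR is not yet a mount point.
needs_mount() {
    ! mountpoint -q "$1"
}

# mount_shared SHARE DIR -> mount the VirtualBox share SHARE on DIR
# unless something is already mounted there.
mount_shared() {
    local share="$1" dir="$2"
    mkdir -p "$dir"
    if needs_mount "$dir"; then
        sudo mount -t vboxsf -o "uid=$(id -u),gid=$(id -g)" "$share" "$dir"
    else
        echo "already mounted: $dir"
    fi
}

# Usage with the share name chosen above:
# mount_shared 123abc ~/a
```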

Install Guest Additions

To prepare the installation of new kernel modules we need gcc, make and perl:

sudo apt install gcc make perl

Oracle VirtualBox > Devices > Insert Guest Additions CD image (Geräte > Gasterweiterungen einlegen): A window pops up, showing the directory /media/data/VBox_GAs_6.1.28/ > right-click on the background, “Open Terminal Here” (“Terminal hier öffnen”); a new terminal opens. Type in:

sudo ./VBoxLinuxAdditions.run

Reboot the VM:

sudo reboot

You should now be able to resize the VM window, activate bidirectional copy and paste between the Windows host and the VM, etc.

Devices > Shared Clipboard > Bidirectional (Geräte > Gemeinsame Zwischenablage > bidirektional)
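After the reboot, one way to see from inside the guest whether the Guest Additions are active is to look for their kernel modules. The sketch below assumes the module names vboxguest and vboxsf and reads /proc/modules; the function name module_loaded is hypothetical.

```shell
#!/usr/bin/env bash
# module_loaded NAME -> exit status 0 if the kernel module NAME is loaded.
module_loaded() {
    # each line of /proc/modules starts with the module name and a space
    grep -q "^$1 " /proc/modules 2>/dev/null
}

for mod in vboxguest vboxsf; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: not loaded"
    fi
done
```

If vboxguest is reported as not loaded, running VBoxLinuxAdditions.run again and checking its output for build errors is a reasonable next step.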

Export OVA

Update the machine, insert the Guest Additions, install them, eject the Guest Additions, shut down the machine, switch the USB controller to 1.0 (in VirtualBox 6.1 the setting is labelled “USB 1.1 (OHCI) Controller”), and only then export the OVA.
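The last two steps could presumably also be done with VBoxManage on the host once the machine is powered off. This is a sketch under assumptions: the output file name is a hypothetical choice, switching off the USB 2.0 and 3.0 controllers is our reading of “switch USB to 1.0”, and the helper run only prints each command unless EXECUTE=1 is set.

```shell
#!/usr/bin/env bash
# Sketch: disable the USB 2.0/3.0 controllers and export the appliance (run on the host).
VM_NAME="xubuntu_dsci_ws21"
OVA_FILE="$VM_NAME.ova"   # hypothetical output file name

# Print each command; execute it only when EXECUTE=1 is set.
run() {
    printf '%s\n' "$*"
    if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

export_dsci_vm() {
    # USB 2.0 (EHCI) and 3.0 (xHCI) need the Extension Pack on the importing host
    run VBoxManage modifyvm "$VM_NAME" --usbehci off --usbxhci off
    run VBoxManage export "$VM_NAME" --output "$OVA_FILE"
}

export_dsci_vm
```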