Build the dsci-lab#

This file:

  • Build of dsci-lab_22.04_ss24*, 2024-03-02

  • Basis: Xubuntu 22.04.4 LTS (Jammy Jellyfish), 64 bit

previous versions:

Important: You do not need to build the dsci-lab yourself. We recommend that you download the fully configured dsci-lab as a ready-to-use virtual machine instead, see Download the dsci-lab OVA file.

However, if you are a lecturer or a deeply interested student you might want to learn how we set up the dsci-lab. Here are the steps.

Install Oracle VM VirtualBox 6.1#

Create the virtual Xubuntu machine#

Download an ISO image:

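Start Oracle VM VirtualBox Manager

  • Maschine > Neu (Machine > New) > Name: e.g. dsci-lab_22.04_ss24

  • Ort (folder): e.g. /home/jb/VirtualBox VMs

  • Typ: Linux, Version: Ubuntu (64-bit)

  • Experten-Modus (expert mode)

  • Speichergröße (memory size): at least 4096 MB, preferably more, e.g. 16000 MB ;-)

  • RAM Grafikkarte (video memory): 128 MB

  • Platte (hard disk): Festplatte erzeugen (create a virtual hard disk) > Erzeugen (Create)

  • Dateityp der Festplatte (hard disk file type): VDI

  • Art der Speicherung (storage on physical hard disk): dynamisch alloziert (dynamically allocated)

  • Dateiname und Größe (file location and size): enter at least 100 GB > Erzeugen (Create)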
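
The GUI steps above can presumably also be scripted with VirtualBox's command-line tool VBoxManage. The following is only a sketch, not the way the dsci-lab was actually built: the function name create_dsci_vm, the DRY_RUN switch and the controller name SATA are our own assumptions. By default the function merely prints the commands instead of running them.

```shell
# Sketch: create a VM similar to the one described above.
# Hypothetical helper; with DRY_RUN=1 (the default) it only prints the commands.
create_dsci_vm() {
    local name="$1"   # VM name, e.g. dsci-lab_22.04_ss24
    local vdi="$2"    # path of the virtual disk file to create
    local run=""
    [ "${DRY_RUN:-1}" = "1" ] && run="echo"

    $run VBoxManage createvm --name "$name" --ostype Ubuntu_64 --register
    # 4096 MB main memory, 128 MB video memory
    $run VBoxManage modifyvm "$name" --memory 4096 --vram 128
    # 100 GB (size is given in MB), VDI format, dynamically allocated by default
    $run VBoxManage createmedium disk --filename "$vdi" --size 102400 --format VDI
    # "SATA" is just a label chosen for this sketch
    $run VBoxManage storagectl "$name" --name SATA --add sata
    $run VBoxManage storageattach "$name" --storagectl SATA \
        --port 0 --device 0 --type hdd --medium "$vdi"
}

# Print the commands without executing them:
create_dsci_vm dsci-lab_22.04_ss24 "$HOME/VirtualBox VMs/dsci-lab.vdi"
```

To execute the commands for real, call the function with DRY_RUN=0. Attaching the ISO image and starting the installer is then done in the GUI as described in the next steps.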

Start the machine dsci-lab_22.04_ss24. A window will pop up: “Medium für Start auswählen” (select start-up disk)

  • Medium hinzufügen (add medium) > e.g. /home/jb/v/xubuntu-22.04.3-desktop-amd64.iso on your disk > Auswählen (Choose)

  • Starten (Start) > Try or install Xubuntu

VirtualBox will start the ISO image:

  • Select language, e.g. “Deutsch” > “Install Xubuntu”

  • Tastaturbelegung (keyboard layout): German > German

  • Aktualisierungen herunterladen (download updates)

  • Installationsart (installation type): Festplatte löschen und installieren (erase disk and install) > Installieren. This erases only your newly allocated virtual hard disk, NOT the disk of your host system ;-)

  • Wo befinden Sie sich? (Where are you?) > Berlin

  • “Who are you?”:

    • Name: install

    • Name des Rechners (computer name): dsci-lab-ss24

    • Benutzername (user name): install

    • Password: 1nstall (the first character is the digit “1”, not the letter “i”)

A basic version of Xubuntu will be installed.

“Die Installation ist abgeschlossen. Sie müssen jetzt den Rechner neu starten, um das System zu benutzen” (the installation is complete; you have to restart the computer to use the system) > Jetzt neu starten (restart now)

  • The VirtualBox window will turn black and “freeze”. Close the window: “X” button > die virtuelle Maschine ausschalten (power off the virtual machine)

  • Start the new machine again.

  • Log in with user install, password 1nstall

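Whisker menu > Einstellungen > Sprachen (Settings > Language Support)

  • Unvollständige Sprachunterstützung: Sprachunterstützung nachinstallieren? (Language support is incomplete: install it now?) > Ja (Yes); when prompted, enter the password, i.e. “dscidsci”

  • Sprache für Menüs und Fenster (language for menus and windows), according to taste: the dsci-lab uses English. Drag the languages into the desired order (“Ziehen Sie die Sprachen in die gewünschte Reihenfolge”), then apply system-wide (systemweit anwenden)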

Update Basic System#

“Aktualisierungsverwaltung” will start automatically and ask you to update the system: “Aktualisierte Anwendungen wurden seit der Veröffentlichung von Ubuntu 22.04 herausgegeben” > “jetzt installieren”.

Alternatively you may want to start Aktualisierungsverwaltung manually:

To update your installation, regularly start “Aktualisierungsverwaltung”. As an alternative to using “Aktualisierungsverwaltung” you can keep the system current with apt:

sudo apt update && sudo apt upgrade

Try it! Your system should already be up to date.
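
Whether an update actually calls for a restart can be checked from the command line. A minimal sketch, assuming the usual Ubuntu convention that the package system creates the flag file /var/run/reboot-required when a reboot is pending; the function name reboot_required and its optional path argument are our own and serve only for illustration:

```shell
# Sketch: succeed if the reboot flag file exists.
# The optional argument overrides the path (hypothetical, handy for trying it out).
reboot_required() {
    local flag="${1:-/var/run/reboot-required}"
    [ -f "$flag" ]
}

if reboot_required; then
    echo "A reboot is pending."
else
    echo "No reboot flag found."
fi
```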

Restart the machine afterwards. Log in again.

  • “Update standard folders to current language?” > in the dsci-lab we want English folder names, so choose “update names”

We will use the command line hereafter where possible. Open a new terminal by pressing Ctrl-Alt-T (Strg-Alt-T on a German keyboard).

VirtualBox Guest Additions#

To build new kernel modules, namely those of the Oracle VirtualBox Guest Additions, we need the GNU C compiler gcc, plus make and perl:

sudo apt install gcc make perl

In the VirtualBox window menu choose Geräte > Gasterweiterungen einlegen (Devices > Insert Guest Additions CD image). A window pops up, showing the directory /media/data/VBox_GAs_6.1.42./. Right-click on the background and choose “Open Terminal here”; a new terminal opens. Type in:

sudo ./VBoxLinuxAdditions.run

Some new kernel modules will be built. To replace the running kernel modules with the new ones you have to reboot the VM:

sudo reboot

Log in again. You should now be able to resize the VM window. Try it!
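
Besides resizing the window, you can check from the terminal whether the freshly built kernel module is loaded. A small sketch, assuming the guest module is called vboxguest; the helper module_loaded and its optional second argument are hypothetical and exist only so that the function can also be tried against a sample file:

```shell
# Sketch: succeed if the named kernel module appears in the module list.
# The list file defaults to /proc/modules.
module_loaded() {
    local module="$1"
    local list="${2:-/proc/modules}"
    # Each line of /proc/modules starts with the module name followed by a space
    grep -q "^${module} " "$list"
}

# In the guest, after the reboot:
# module_loaded vboxguest && echo "Guest kernel module is loaded"
```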

Activate bidirectional copy & paste between the host (e.g. Windows) and the VirtualBox guest:

Geräte > Gemeinsame Zwischenablage > bidirektional (Devices > Shared Clipboard > Bidirectional)

After updating VirtualBox on the host you might want to update the Guest Additions on the guest as well. Simply repeat the steps above. Don’t forget to reboot your guest machine.

Firefox#

Xubuntu 22.04 comes with a snap version of Firefox. This causes some ugly configuration issues. We remove this (so-called “rapid”) version of Firefox and exchange it for a .deb version:

Test Firefox: Whisker menu > “Internetnavigator” launches Firefox. Navigate to http://jbusse.de/dsci-lab/dsci-lab-build.html. Alternatively, type the following in a terminal (and don’t forget to add the ampersand “&” at the end of the line):

firefox http://jbusse.de/dsci-lab/r_build.html &

If your network connection works properly, Firefox will open the page you are currently reading in your new Linux machine. This allows you to copy & paste the following commands into your terminal. Bookmark this page in Firefox by pressing Ctrl-D.

Firefox ESR#

Install Firefox ESR: https://ubuntuhandbook.org/index.php/2022/03/install-firefox-esr-ubuntu/

sudo add-apt-repository ppa:mozillateam/ppa
sudo apt update
sudo apt install firefox-esr

Firefox Privacy#

Start Firefox. Press the Alt key to show the menu.

Edit > Settings (Bearbeiten > Einstellungen):

  • Home > New Windows and Tabs (Startseite und neue Fenster): set the default to https://www.startpage.com/

    • Inhalte des Firefox-Startbildschirms (Firefox Home Content): a privacy issue! However, do not simply deactivate these widgets (don’t close your eyes), but go to

  • Datenschutz und Sicherheit > Benutzerdefiniert > Cookies und Website-Daten (Privacy & Security > Custom > Cookies and Site Data)

    • Cookies und Website-Daten beim Beenden von Firefox löschen (delete cookies and site data when Firefox is closed): Check!

    • Hauptpasswort verwenden (use a primary password): Enter a safe password!

    • Chronik (History): Firefox wird eine Chronik > nach benutzerdefinierten Einstellungen anlegen (Firefox will > use custom settings for history)

      • “Die Chronik löschen, wenn Firefox geschlossen wird” (clear history when Firefox closes): Check!

    • Datenerhebung durch Firefox und deren Verwendung (Firefox Data Collection and Use)

      • (de-)select whatever you want

Extras > Add-Ons und Themes > Erweiterungen (Tools > Add-ons and Themes > Extensions)

  • Ghostery – datenschutzorientierter Werbeblocker (Ghostery, a privacy-oriented ad blocker)

    • optionally configure Ghostery: “ausführen in privaten Fenstern erlauben” (allow to run in private windows)

personalize this file#

To document subsequent personal changes to your VM installation, you might wish to update the very file you are currently reading.

  • In Firefox, click the .md button in the upper right corner of the page and save the file to Downloads/dsci-lab-build.md.

  • Alternatively, you might want to download the Markdown source of this file from the command line:

cd ~/Downloads
wget http://jbusse.de/dsci-lab/_sources/r_build.md -O dsci-lab-build.md
cd

Edit the file dsci-lab-build.md, e.g. with Mousepad. Alternatively, you may want to use your favourite programming editor. As of 2023 we recommend Microsoft Visual Studio Code (aka VS Code) https://code.visualstudio.com/docs/python/python-tutorial:

sudo snap install code --classic

Only if you are over 50 years old ;-) might you still want to stick to vim or emacs:

sudo apt install vim
sudo apt install emacs

In VS Code, go to File > Preferences > Settings, search for telemetry, and set the Telemetry: Telemetry Level setting to off. This will silence all telemetry events including crash reporting from VS Code. You will need to restart VS Code for the setting change to take effect. https://code.visualstudio.com/docs/supporting/FAQ#_how-to-disable-crash-reporting

Optionally also get the MyST-Markdown syntax extension for VS Code: https://marketplace.visualstudio.com/items?itemName=ExecutableBookProject.myst-highlight

More basic Packages#

At this stage you have a clean, brand-new virtual Xubuntu machine with the VirtualBox Guest Additions enabled.

To get our dsci-lab version you have to install some more packages. You can do so by simply copying the following commands into a terminal (open a new terminal e.g. by pressing Ctrl-Alt-T):

freeplane#

Freeplane, a mind-mapping tool (pulls in Java), and the process viewer htop:

sudo apt install freeplane htop

LaTeX#

https://jupyterbook.org/en/stable/advanced/pdf.html recommends using the texlive distribution.

We have found that in combination with jupyterbook (see below) the following packages are actually required, with a total size of about 2 GB (!):

sudo apt install imagemagick \
                 texlive-latex-recommended texlive-latex-extra \
                 texlive-fonts-recommended texlive-fonts-extra \
                 texlive-xetex latexmk texlive-lang-german
sudo apt install xindy biber texlive-science

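Test the integration of the LaTeX installation and jupyterbook (this works only once Jupyterbook is installed and the test book has been created, see the Jupyterbook section below):

cd ~/c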
jb build test --builder pdflatex
atril test/_build/latex/book.pdf &
cd

Add new non-privileged user#

sudo adduser dsci 
sudo addgroup vboxsf
sudo adduser dsci vboxsf
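
To confirm that the new user really ended up in the group, the membership can be queried with id. A minimal sketch; the helper name user_in_group is our own. Membership in vboxsf is what later lets dsci access VirtualBox shared folders.

```shell
# Sketch: succeed if the given user belongs to the given group.
user_in_group() {
    local user="$1"
    local group="$2"
    # id -nG prints the user's group names separated by spaces;
    # put each on its own line and look for an exact match
    id -nG "$user" | tr ' ' '\n' | grep -qx -- "$group"
}

# After the adduser commands above:
# user_in_group dsci vboxsf && echo "dsci is a member of vboxsf"
```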

Log out as the (admin) user of Xubuntu, then log in again as user dsci.

Login as user dsci#

(Mini-)Conda#

https://docs.conda.io/projects/conda/en/stable/user-guide/install/linux.html

Miniconda Installer: https://docs.conda.io/projects/miniconda/en/latest/ > Quick command line install:

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash

==> For changes to take effect, close and re-open your current shell. <==

Why close and re-open? The command conda init bash has added start-up code to your ~/.bashrc, and this code only runs when a new shell starts. Conda puts an extra virtual environment layer over the standard Python installation, so we can work with multiple Python configurations in parallel. To learn more about virtual environments:

Your terminal command line should now start with (base), which is the name of your current virtual environment:

(base) dsci@dsci-lab-ss2022:~$ 
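
The prompt prefix is not the only place where the active environment shows up: conda also exports its name in the environment variable CONDA_DEFAULT_ENV. A small sketch for checking it in scripts; the function name active_conda_env and the fallback text "none" are our own choices:

```shell
# Sketch: print the name of the active conda environment, or "none".
active_conda_env() {
    # conda sets CONDA_DEFAULT_ENV whenever an environment is activated
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active conda environment: $(active_conda_env)"
```

In a freshly opened terminal of user dsci this should report base; in a shell where conda has not been initialised it reports none.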

Keep conda current:

conda update --all

(Note: Instead of installing Miniconda you might want to install Anaconda (https://www.anaconda.com/products/individual). Anaconda is much more complete than Miniconda, but IMHO too “fat”. In our dsci-lab we prefer a lightweight system. This allows you to look “under the hood” more easily, to understand what’s going on, and to maintain the whole system. The dependencies in our setup are complex enough anyhow.)

Jupyter Notebook#

In WS 2023:

As an existing Python user, you may wish to install Jupyter using Python’s package manager, pip, instead of Anaconda. (https://test-jupyter.readthedocs.io/en/latest/install.html)

pip install jupyter

In WS 2022 (legacy): installation of Jupyter from the Ubuntu package repository with apt

sudo apt install jupyter

Start Jupyter. Jupyter will launch firefox. Stop Jupyter again:

jupyter notebook &
jupyter notebook stop

Install python packages into the virtual conda environment base#

As mentioned, Miniconda is minimalistic. Thus we have to install some modules ourselves.

Some important ones are:

pip install pandas numpy matplotlib scikit-learn seaborn xgboost  lxml markdown python-slugify rdflib owlrl markdownify

Notes:

  • We install these packages into the virtual conda environment base. If we decide to create another virtual conda environment, it will be empty again, and we have to populate it with libraries again. This is the reason (a) why we prefer lightweight environments, and (b) why we want to learn how to install libraries ourselves.

  • Caveat: Do not install conda with sudo, but as a normal user. Every user and every virtual conda environment is completely independent of the others. There is no system-wide installation.

Jupyterbook#

Install Jupyterbook (cf. https://pypi.org/project/jupyter-book/)

pip install jupyter-book

Install Jupytext (cf. https://jupytext.readthedocs.io/en/latest/install.html)

pip install jupytext --upgrade

Test the jupyterbook installation: Build the book according to https://jupyterbook.org/start/build.html

mkdir -p ~/c
cd ~/c
jb create test

build html:

jb build test
firefox-esr test/_build/html/index.html &

build pdf via LaTeX:

jb build test --builder pdflatex
atril test/_build/latex/book.pdf &

spaCy (not used)#

https://spacy.io/usage :

  • Linux, X86, conda, CPU

  • NO virtual env

  • Trained pipelines: English, German

conda install -c conda-forge spacy
python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm

PyCharm (not used)#

Warning: PyCharm is HUGE, we do not use it. (Rather try Visual Studio Code). However, if you want to play with PyCharm:

https://www.jetbrains.com/help/pycharm/installation-guide.html:

  • RAM: 4 GB (min), 8 GB (recommended)

  • Disk space: 2.5 GB and another 1 GB for caches (min), SSD drive with at least 5 GB of free space (recommended)

  • how to install: Standalone installation > Linux > Install using snap packages > Community Edition (same as https://snapcraft.io/install/pycharm-community/ubuntu)

sudo snap install pycharm-community --classic

get started with PyCharm: https://www.jetbrains.com/help/pycharm/quick-start-guide.html

Run pycharm-community in the Terminal.

TBD: initially configure PyCharm

  • point to our conda virtual environment, including python 3.9 interpreter

Export OVA#

Update the machine, install the Guest Additions, deactivate shared folders, eject the medium, shut down the machine, switch USB to 1.0, then export the OVA:

  • /home/jb/v/dsci-lab_22.04_ss24.ova

  • “alle Netzwerkadressen mit einbeziehen” (include all network addresses)

  • Hersteller-URL (vendor URL): jbusse.de/dsci-lab

Determine the SHA-256 checksum and the size of the file:

ls -ls v/dsci-lab_22.04_ss24.ova
# 11558000 -rw------- 1 jb jb 11835387904 Sep 30 09:20 v/dsci-lab_22.04_ss24.ova

sha256sum v/dsci-lab_22.04_ss24.ova
# XXXXXf810530ff34b1a52d6b79f96cb313b8935ccfebd8dcbf56e7868e232cf5fb0a4  v/dsci-lab_22.04_ss24.ova

Upload the file to the server, compute the sha256sum there as well, compare the two values, and only then enter them in Download the dsci-lab OVA file.
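
The comparison of the two checksums can be automated instead of being done by eye. A minimal sketch, assuming sha256sum is available on both machines; the function name verify_sha256 is our own, and the expected value has to be the full checksum computed on the other side:

```shell
# Sketch: succeed if the file's SHA-256 checksum equals the expected value.
verify_sha256() {
    local file="$1"
    local expected="$2"
    local actual
    # sha256sum prints "<checksum>  <file>"; keep only the checksum
    actual="$(sha256sum "$file" | cut -d' ' -f1)"
    [ "$actual" = "$expected" ]
}

# Example, with the checksum copied from the other machine:
# verify_sha256 v/dsci-lab_22.04_ss24.ova "<checksum from the server>" && echo "checksums match"
```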

Admin#

Update:

sudo apt update && sudo apt upgrade
sudo snap refresh

next steps#

http://jbusse.de/dsci-lab/h_shared-folders.html

http://jbusse.de/dsci-lab/h_customizing.html