Build the dsci-lab yourself

This file:

  • Build of xubuntu_dsci_ss22, 2022-02-25

  • Xubuntu 20.04.3 LTS (Focal Fossa), 64-bit

Important: You are not required to build the dsci-lab yourself. Instead, you are advised to download the fully configured dsci-lab as a ready-to-use virtual machine from http://jbusse.de/dsci-lab/.

However, if you are a lecturer or a deeply interested student you might want to learn how we set up the dsci-lab. Here are the steps.

Install Oracle VM VirtualBox 6.1

https://www.oracle.com/de/virtualization/technologies/vm/downloads/virtualbox-downloads.html

  • 2022-02-25: The latest release is version 6.1.32

Start VirtualBox. Also install the Oracle VM VirtualBox Extension Pack:

  • click the link “6.1.32 ExtPack” and select “open with Oracle VM VirtualBox (default)”

Create the virtual XUbuntu machine

Download an ISO image (ca. 1.8 GB) of the Xubuntu LTS release 20.04 (Focal Fossa), 64-bit.

Start Oracle VM Virtual Box Manager

  • Maschine > Neu > Name: e.g. xubuntu_dsci_ss22, Typ: Linux, Version: Ubuntu (64-bit)

  • Speichergröße: 4096 MB

  • Platte: Festplatte erzeugen > Erzeugen

  • Dateityp der Festplatte: VDI

  • Art der Speicherung: dynamisch alloziert

  • Dateiname und Größe: enter at least 100 GB here > Erzeugen
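
The same settings can in principle be scripted on the host with VirtualBox's command-line tool VBoxManage instead of the GUI. The following is only a minimal sketch, not the procedure used for this build: the storage controller name "SATA" and the disk location are assumptions, and the commands are merely printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create the VM from the host's command line (assumed equivalent of the GUI steps).
VM_NAME="xubuntu_dsci_ss22"      # VM name used in this guide
MEM_MB=4096                      # Speichergröße
DISK_GB=100                      # minimum disk size from this guide
DISK_MB=$((DISK_GB * 1024))      # VBoxManage expects the size in MB
DISK_FILE="$HOME/VirtualBox VMs/$VM_NAME/$VM_NAME.vdi"   # assumption: default VM folder

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

run VBoxManage createvm --name "$VM_NAME" --ostype Ubuntu_64 --register
run VBoxManage modifyvm "$VM_NAME" --memory "$MEM_MB"
# "--variant Standard" corresponds to a dynamically allocated disk.
run VBoxManage createmedium disk --filename "$DISK_FILE" --size "$DISK_MB" --format VDI --variant Standard
run VBoxManage storagectl "$VM_NAME" --name "SATA" --add sata
run VBoxManage storageattach "$VM_NAME" --storagectl "SATA" --port 0 --device 0 --type hdd --medium "$DISK_FILE"
```

Run it once as is to review the printed commands, then with DRY_RUN=0 to actually create the machine.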

Start the machine xubuntu_dsci_ss22. A window will pop up: “Medium für Start auswählen”

  • Medium hinzufügen > locate xubuntu-20.04.3-desktop-amd64.iso on your disk (ca. 1.8 GB) > Auswählen

  • Starten

VirtualBox will start the ISO image:

  • Select language, e.g. “Deutsch” > “Install XUbuntu”

  • German > German

  • Aktualisierungen herunterladen

  • Installationsart: Festplatte löschen und installieren (this erases only your newly allocated virtual hard disk, NOT the disk of your host system) > Installieren

  • Wo befinden Sie sich? > Berlin

  • “Who are You?”:

    • Name: Data Scientist

    • Name des Rechners: dsci-ss2022

    • Benutzername: dsci

    • password: dscidsci (this is an initial and very weak password, you MUST change it later!)

A basic version of xubuntu will be installed.

Install Basic System

“Die Installation ist abgeschlossen. Sie müssen jetzt den Rechner neu starten, um das System zu benutzen” > Jetzt neu starten

  • “Remove installation medium”: (nothing to do), “press ENTER”: press Enter!

  • log in with user dsci (“Data Scientist”), password dscidsci

Update system

“Aktualisierungsverwaltung” will start automatically and ask you to update the system: do so by clicking “Jetzt installieren”.

To start Aktualisierungsverwaltung manually:

  • Whisker-Menu > Einstellungen > Aktualisierungsverwaltung

  • Start “Aktualisierungsverwaltung” regularly to keep your installation up to date.

Restart the machine afterwards. Log in again.

From here on we will use the command line where possible. Open a new terminal by pressing Ctrl-Alt-T.

Instead of using “Aktualisierungsverwaltung” you can keep the system current with apt:

sudo apt update && sudo apt upgrade

Install Guest Additions

To prepare the installation of new kernel modules we need gcc, make and perl:

sudo apt install gcc make perl

Oracle VirtualBox > Geräte > Gasterweiterungen einlegen: A window pops up, showing the directory /media/data/VBox_GAs_6.1.32/ > right-click on the background and choose “Terminal hier öffnen”; a new terminal opens. Type in:

sudo ./VBoxLinuxAdditions.run

Reboot the VM:

sudo reboot

Log in again. You should now be able to resize the VM window!

Activate bidirectional copy & paste between the host (e.g. Windows) and the VM:

Geräte > Gemeinsame Zwischenablage > bidirektional

Shared Folders

Create a directory in the guest that serves as the mount point for the shared folder, e.g. ~/a:

mkdir -p ~/a

In the VirtualBox menu: Geräte > Gemeinsame Ordner > Gemeinsame Ordner

  • click the “Ordner+” icon: this adds a new shared folder

  • Ordner-Pfad: choose a folder on the host machine

  • Ordner-Name: e.g. “abc123” (any name will do, as long as it is unique)

  • Permanent erzeugen: Check

After every boot of the virtual machine:

sudo mount -t vboxsf -o uid=1000,gid=1000 abc123 ~/a

Note: In the dsci-lab this rather complicated command is already in the bash history. Press Ctrl-R and type mount to recall it immediately, so it costs no effort.
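
The mount command hard-codes uid and gid 1000, which is typically the first user created on an Ubuntu system. As a hedged sketch, the command can be assembled for whoever is logged in and skipped when the folder is already mounted. The names abc123 and ~/a are the examples from this guide; the helper mount_cmd is purely illustrative.

```shell
#!/bin/sh
# Sketch: build the mount command for a VirtualBox shared folder
# unless the mount point is already in use.
SHARE_NAME="abc123"        # Ordner-Name chosen in the VirtualBox dialog
MOUNT_POINT="$HOME/a"      # mount point directory in the guest

# Use the current user's uid/gid instead of the hard-coded 1000.
mount_cmd() {
    echo "sudo mount -t vboxsf -o uid=$(id -u),gid=$(id -g) $1 $2"
}

if command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$MOUNT_POINT"; then
    echo "$MOUNT_POINT is already mounted"
else
    # Only printed here; copy the line into a terminal to run it.
    mount_cmd "$SHARE_NAME" "$MOUNT_POINT"
fi
```

Printing rather than executing keeps the sketch harmless; paths containing spaces would need extra quoting.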

Configure Firefox

Start “Internetnavigator”, i.e. Firefox. Navigate to http://jbusse.de/dsci-lab/dsci-lab-build.html. Alternatively, type the following in a terminal (do not forget the ampersand “&” at the end of the line, which runs Firefox in the background):

firefox http://jbusse.de/dsci-lab/dsci-lab-build.html &

If your network connection works properly, Firefox will open this page in your new Linux machine. This allows you to copy & paste the commands below into your terminal. Bookmark this page in Firefox by pressing Ctrl-D.

Firefox Privacy

Start Firefox. Press the Alt key to show the menu bar.

Bearbeiten > Einstellungen

  • Startseite und neue Fenster: point default to https://www.startpage.com/

    • Inhalte des Firefox-Startbildschirms: these widgets are a privacy concern. Do not simply deactivate them, though (don’t close your eyes); instead go to

  • Datenschutz und Sicherheit > Benutzerdefiniert > Cookies und Website-Daten

    • Cookies und Website-Daten beim Beenden von Firefox löschen: Check!

    • Hauptpasswort verwenden: Enter a safe password!

    • Chronik: Firefox wird eine Chronik > nach benutzerdefinierten Einstellungen anlegen

      • “Die Chronik löschen, wenn Firefox geschlossen wird”: Check!

    • Datenerhebung durch Firefox und deren Verwendung

      • (de-)select whatever you want

Extras > Add-Ons und Themes > Erweiterungen

  • Ghostery – datenschutzorientierter Werbeblocker

View markdown with Firefox

(1) install Add-On: Extras > Add-Ons und Themes > Erweiterungen

  • Markdown Viewer Webext von Kulero, Cimbali

(2) add MIME type, following https://github.com/KeithLRobertson/markdown-viewer#support-for-local-files-on-linux: “Firefox on Linux may not know how to handle markdown files by default (see #2). There are a number of possible workarounds for this (see this SuperUser question for example). […] Another workaround (which might cover other OSs as well), is to edit Firefox’s private mime types. These mime types are stored in a file indicated by helpers.private_mime_types_file, by default it is ~/.mime.types:”

echo 'type=text/plain exts=md,mkd,mkdn,mdwn,mdown,markdown, desc="Markdown document"' >> ~/.mime.types

Then restart Firefox. Firefox should now render markdown files as HTML.
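
Note that the echo command appends the line every time it is run, so repeating the step leaves duplicate entries in ~/.mime.types. A minimal sketch of an idempotent variant follows; the function name add_mime_line is made up for illustration.

```shell
#!/bin/sh
# Sketch: add the Markdown MIME entry only if it is not there yet.
MIME_LINE='type=text/plain exts=md,mkd,mkdn,mdwn,mdown,markdown, desc="Markdown document"'

add_mime_line() {
    file="$1"
    # grep options: -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if ! grep -qxF "$MIME_LINE" "$file" 2>/dev/null; then
        echo "$MIME_LINE" >> "$file"
    fi
}

# Usage (commented out so that nothing is changed by accident):
# add_mime_line "$HOME/.mime.types"
```

Calling the function a second time leaves the file unchanged.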

Update this file?

To document later changes to your own VM, you might wish to update this very file.

  • In Firefox, click the .md button in the upper right corner of the page and save the file to Downloads/dsci-lab-build.md.

  • Alternatively, download the markdown source of this file directly:

cd ~/Downloads
wget http://jbusse.de/dsci-lab/_sources/dsci-lab-build.md
cd

Edit the file dsci-lab-build.md, e.g. with Mousepad. Alternatively, you may want to use your favourite programming editor, e.g.

sudo apt install emacs
sudo apt install vim

Basic Packages

At this stage you have a clean, brand new virtual Xubuntu machine with the VirtualBox Guest Additions enabled.

To get our dsci-lab version you have to install some more packages. You can do so by simply copying the following commands into a terminal (open a new terminal e.g. by pressing Ctrl-Alt-T):

Mindmap (incl. Java):

sudo apt install freeplane

Conda, Jupyter, Jupyterbook

sudo apt install jupyter

Get Miniconda from https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links:

cd Downloads/
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh

  • Do you accept the licence terms? yes

  • Do you wish the installer … by running conda init? yes

(Instead of installing Miniconda you might also want to install Anaconda (https://www.anaconda.com/products/individual). Anaconda is much more complete than Miniconda, but IMHO too “fat”. In our dsci-lab we prefer a lightweight system. This allows you to look more easily “under the hood”, to understand what’s going on, and to maintain the whole system - the dependencies in our setup are complex enough anyhow.)

After you have installed Miniconda, close your terminal (Ctrl-D) and open a new one (e.g. with Ctrl-Alt-T). (Why close and open? The conda init step of the installer changed your shell startup file ~/.bashrc, and these changes only take effect in a newly started shell. Conda puts an extra virtual environment layer over the standard Python installation, so we can work with multiple Python configurations in parallel. To learn more about conda virtual environments see https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-conda.html)

Your terminal command line should now start with (base), which is the name of your current virtual conda environment:

(base) jb@dsci-vbox:~$
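
Besides looking at the prompt, the active environment can also be read from the variable CONDA_DEFAULT_ENV, which conda sets when an environment is activated. A small sketch (the function name current_conda_env is illustrative):

```shell
#!/bin/sh
# Sketch: report which conda environment the current shell is in.
# CONDA_DEFAULT_ENV is set by conda when an environment is activated.
current_conda_env() {
    if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
        echo "$CONDA_DEFAULT_ENV"
    else
        echo "none"
    fi
}

echo "active conda environment: $(current_conda_env)"
```

If this reports "none" in a fresh terminal, the conda init step has probably not taken effect yet.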

As said: Miniconda is minimalistic. Thus we have to install some modules ourselves. Some important ones are:

conda install pandas numpy matplotlib scikit-learn seaborn xgboost lxml markdown python-slugify
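
To check that the installation worked, one can try to import each module. The following is a sketch under the assumption that python on the PATH is the interpreter of the conda base environment; the helper missing_modules is illustrative and prints every module that cannot be imported.

```shell
#!/bin/sh
# Sketch: list which of the given Python modules cannot be imported.
PYTHON="${PYTHON:-python}"   # assumption: "python" is the conda base interpreter

missing_modules() {
    for mod in "$@"; do
        "$PYTHON" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}

# Import names differ from some package names:
# scikit-learn -> sklearn, python-slugify -> slugify.
missing_modules pandas numpy matplotlib sklearn seaborn xgboost lxml markdown slugify
```

No output means that all modules could be imported.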

TO BE INSTALLED LATER:

# pip install rdflib owlrl markdownify

Notes:

  • We install these packages into the virtual conda environment base. If we decide to create another virtual conda environment, it will be empty again, and we have to populate it with libraries again. This is the reason (a) why we prefer lightweight environments, and (b) why we want to learn how to install libraries ourselves.

  • Caveat: Do not install Conda with sudo; install it as a normal user. Every user and every virtual conda environment are completely independent of each other. There is no system-wide installation.

Keep conda current:

conda update --all

Jupyterbook:

conda install -c conda-forge jupyter-book

Jupytext should then already be installed; the following command confirms this, or installs it if it is missing:

conda install -c conda-forge jupytext

Test the installation: Build the book according to https://jupyterbook.org/start/build.html

mkdir -p ~/b
cd ~/b
jb create test
jb build test
cd
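
A successful HTML build leaves its start page in the book's _build/html directory. As a small sketch, the result of the test build can be checked like this (the function name book_built is illustrative):

```shell
#!/bin/sh
# Sketch: check whether a Jupyter Book build produced its HTML start page.
book_built() {
    [ -f "$1/_build/html/index.html" ]
}

if book_built "$HOME/b/test"; then
    echo "build OK: open ~/b/test/_build/html/index.html in Firefox"
else
    echo "no HTML output found in ~/b/test/_build/html"
fi
```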

LaTeX

https://jupyterbook.org/advanced/pdf.html recommends using the texlive distribution (https://www.tug.org/texlive/), size about 1.9 GB (!):

sudo apt install imagemagick
sudo apt-get install texlive-latex-recommended texlive-latex-extra \
                     texlive-fonts-recommended texlive-fonts-extra \
                     texlive-xetex latexmk

Test the integration of LaTeX installation and jupyterbook:

cd ~/b
# general form: jb build mybookname/ --builder pdflatex
jb build test --builder pdflatex
atril test/_build/latex/book.pdf &
cd
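
If the PDF build fails, a first thing to check is whether the required programs are on the PATH at all. The tool names below are assumptions derived from the packages installed above (convert comes with imagemagick); the helper missing_tools is illustrative.

```shell
#!/bin/sh
# Sketch: report which LaTeX-related commands are not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tool names are assumptions based on the packages installed above.
missing_tools latexmk pdflatex xelatex convert
```

Each name that is printed points to a package that is probably not installed correctly.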

spaCy

https://spacy.io/usage :

  • Linux, X86, conda, CPU

  • NO virtual env

  • Trained pipelines: English, German

conda install -c conda-forge spacy
python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm

Zotero

Go to Zotero https://www.zotero.org/download/ > installation help > Linux > Debian/Ubuntu-based Distros > “A longtime community member maintains zotero-deb”:

wget -qO- https://apt.retorque.re/file/zotero-apt/install.sh | sudo bash
sudo apt update
sudo apt install zotero

Also install the Zotero Firefox Connector from the Zotero site: https://www.zotero.org/download/ > Install Firefox connector.

Launch Zotero for the first time:

zotero

Zotero wants to install several defaults: say yes!

Zotero’s Better BibTeX

Enhance Zotero with https://retorque.re/zotero-better-bibtex/: Follow the installation instructions from https://retorque.re/zotero-better-bibtex/installation/ > latest release:

  • Save file zotero-better-bibtex-6.2.9.xpi e.g. to Downloads

  • if not already running: start Zotero

  • and import this file: Zotero > Werkzeuge > Add-ons > Install Add-on from File > select zotero-better-bibtex-6.2.9.xpi

Zotero asks you to restart it twice: do so. You will find that Zotero has created the new directory ~/Zotero: add this directory to the list of directories you back up daily ;-)

Learn where your data reside: https://www.zotero.org/support/zotero_data.
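
A backup of the Zotero directory can be as simple as a dated tar archive. The following is a minimal sketch, not a complete backup strategy: the target directory is a placeholder, the function name backup_dir is illustrative, and as a precaution Zotero should be closed while the archive is written.

```shell
#!/bin/sh
# Sketch: pack a directory into a dated tar archive (close Zotero first).
backup_dir() {
    src="$1"        # directory to back up, e.g. "$HOME/Zotero"
    dest_dir="$2"   # where the archive goes (placeholder location)
    archive="$dest_dir/$(basename "$src")-$(date +%Y-%m-%d).tar.gz"
    mkdir -p "$dest_dir" &&
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
    echo "$archive"
}

# Usage (commented out; "$HOME/backup" is a placeholder):
# backup_dir "$HOME/Zotero" "$HOME/backup"
```

The function prints the path of the archive it has written, so the result can be passed on to a copy or sync command.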

Export OVA

Update the machine, install the guest additions, eject the medium, shut down the machine, switch USB to 1.0, then export as OVA:

  • /home/jb/v/xubuntu_dsci_ss22.ova

  • include all network addresses

  • Hersteller-URL:
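
The export itself can presumably also be done on the host with VBoxManage. The sketch below only prints the command. It assumes that the VM is powered off and that, without the nomacs option, the MAC addresses of the network adapters are kept; the USB setting is not covered here. The function name export_cmd is illustrative.

```shell
#!/bin/sh
# Sketch: print the command that exports the powered-off VM as an OVA appliance.
VM_NAME="xubuntu_dsci_ss22"
OVA_FILE="/home/jb/v/xubuntu_dsci_ss22.ova"   # path from this guide; adjust to your host

export_cmd() {
    # Assumption: without "--options nomacs" the MAC addresses are included.
    echo "VBoxManage export $1 --output $2"
}

export_cmd "$VM_NAME" "$OVA_FILE"
```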