Build the dsci-lab yourself¶
Note: You are not required to build the dsci-lab yourself. Instead, we recommend downloading the fully configured dsci-lab as a ready-to-use virtual machine.
However, if you are a lecturer or a deeply interested student, you might want to learn how we set up the dsci-lab. Here are the steps.
Download an ISO image¶
Download an ISO image (ca. 1.55 GB) of the 64-bit LTS release 20.04 (Focal Fossa) of Xubuntu, e.g. from https://xubuntu.org/download/ > http://ftp.uni-kl.de/pub/linux/ubuntu-dvd/xubuntu/releases/20.04/release/ . Direct link: http://ftp.uni-kl.de/pub/linux/ubuntu-dvd/xubuntu/releases/20.04/release/xubuntu-20.04-desktop-amd64.iso (right-click the file, choose “Save as”, and save it to disk).
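A downloaded ISO can be checked against a checksum list before it is used. The following is a minimal sketch, assuming that the mirror directory also offers a SHA256SUMS file (Ubuntu release directories usually do) and that you saved it next to the ISO. The helper name verify_download and the stand-in file demo.iso are made up for illustration.

```shell
# verify_download DIR: check every file listed in DIR/SHA256SUMS that is
# actually present in DIR (--ignore-missing skips files not downloaded).
verify_download() {
    ( cd "$1" && sha256sum --check --ignore-missing SHA256SUMS )
}

# Self-contained demo with a tiny stand-in file instead of the real ISO.
demo_dir=$(mktemp -d)
printf 'stand-in for an ISO image\n' > "$demo_dir/demo.iso"
( cd "$demo_dir" && sha256sum demo.iso > SHA256SUMS )

verify_download "$demo_dir"
```

For the real download you would call verify_download ~/Downloads after saving SHA256SUMS there; the line for the ISO should end in OK.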
Install Basic System¶
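Start Oracle VM VirtualBox Manager. (The menu labels below are those of the German user interface; an English gloss is given in parentheses.)
Maschine > Neu (machine > new) > Name: e.g. xubuntu-august-2020, Typ (type): Linux, Version: Ubuntu (64-bit)
Speichergröße (memory size): 1024 MB
Platte (hard disk): Festplatte erzeugen (create a virtual hard disk)
Dateityp der Festplatte (hard disk file type): VDI
Art der Speicherung (storage type): dynamisch alloziert (dynamically allocated)
Dateiname und Größe (file name and size): enter at least 100 GB here > Erzeugen (create)
Start the machine xubuntu-august-2020. A window will pop up: “Medium für Start auswählen” (select start-up medium)
Medium hinzufügen (add medium) > locate xubuntu-20.04-desktop-amd64.iso on your disk (ca. 1.55 GB) > Auswählen (choose)
Starten (start)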
VirtualBox will start the ISO image.
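The VM creation dialog above can also be expressed with VirtualBox's command-line tool VBoxManage. This is only a sketch of the same settings (1024 MB memory, a dynamically allocated 100 GB VDI disk), not the way the dsci-lab was actually built. The disk location and the controller name SATA are assumptions, and the helper run only prints each command unless DRY_RUN=0 is set.

```shell
# Names taken from the GUI steps; the disk path is an assumed default.
vm="xubuntu-august-2020"
disk="$HOME/VirtualBox VMs/$vm/$vm.vdi"

# run: print the command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_vm() {
    # Register a new 64-bit Ubuntu VM and give it 1024 MB of memory.
    run VBoxManage createvm --name "$vm" --ostype Ubuntu_64 --register
    run VBoxManage modifyvm "$vm" --memory 1024
    # 102400 MB = 100 GB; the Standard variant is dynamically allocated.
    run VBoxManage createmedium disk --filename "$disk" --size 102400 --format VDI --variant Standard
    # Attach the disk to a SATA controller (controller name is arbitrary).
    run VBoxManage storagectl "$vm" --name SATA --add sata
    run VBoxManage storageattach "$vm" --storagectl SATA --port 0 --device 0 --type hdd --medium "$disk"
}

create_vm
```

Printing first and executing later makes it easy to compare the commands with the GUI settings before anything is created.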
Select language, e.g. “Deutsch”.
Install XUbuntu
Aktualisierungen herunterladen (download updates)
Installationsart (installation type): Festplatte löschen und installieren (erase disk and install). This wipes only your newly allocated virtual hard disk, NOT the disk of your host system. > Installieren (install)
Enter your name, user name, etc. WRITE DOWN your password in a secure location!
Name: Data Scientist
Name des Rechners (computer name): dsci-lab-march-2021
Benutzername (user name): data
Password: datadata (this is an initial and very weak password; you MUST change it later!)
“Die Installation ist abgeschlossen. Sie müssen jetzt den Rechner neu starten, um das System zu benutzen” (the installation is complete; you must restart the computer to use the system) > Jetzt neu starten (restart now)
“Remove installation medium”: nothing to do. “Press ENTER”: press Enter.
Log in as user Data Scientist with the password datadata.
We will use the command line from here on wherever possible. Open a new terminal with Ctrl-Alt-T (Strg-Alt-T on a German keyboard).
Update your installation (refresh the package lists first, then upgrade):
sudo apt update && sudo apt upgrade
sudo reboot
Manually type in (and don’t forget to add the ampersand “&” at the end of the line):
firefox http://jbusse.de/dsci-lab/dsci-lab-build.html &
If things worked well, Firefox will open the page you are currently reading in your new Linux machine. This allows you to copy & paste the commands below into your terminal.
Install Guest Additions¶
To prepare the installation of new kernel modules we need gcc, make, and perl:
sudo apt install gcc make perl
In the VirtualBox menu: Geräte > Gasterweiterungen einlegen (devices > insert Guest Additions CD image). A window pops up, showing the directory /media/data/VBox_GAs_6.1.6/.
Right-click on the window background and choose “Terminal hier öffnen” (open terminal here). A new terminal opens. Type in:
sudo ./VBoxLinuxAdditions.run
Reboot the VM:
sudo reboot
You should now be able to resize the VM window, activate bidirectional copy & paste between the Windows host and the VM, etc.
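If resizing or copy & paste does not work, a first check is whether the guest kernel modules are loaded. The sketch below looks for the module names vboxguest and vboxsf in /proc/modules. The helper name module_loaded is made up for illustration, and a loaded module is only a hint, not proof, that the installation succeeded.

```shell
# module_loaded NAME [FILE]: succeed if kernel module NAME is listed in FILE.
# FILE defaults to /proc/modules, where each line starts with a module name.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

for mod in vboxguest vboxsf; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: not loaded"
    fi
done
```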
Basic Packages¶
At this stage you have a clean, brand-new virtual Xubuntu machine with the VirtualBox Guest Additions enabled.
To get our dsci-lab version you have to install some more packages. You can do so by simply copying the following commands into a terminal (open a new one e.g. with Ctrl-Alt-T):
LaTeX (optional, not contained in dsci-lab-march-2021):
sudo apt install texlive-xetex fonts-freefont-otf latexmk
Mindmap (incl. Java):
sudo apt install freeplane
Add your favourite programming editor, e.g.
sudo apt install emacs
sudo apt install vim
Jupyter and Conda¶
Jupyter
sudo apt install jupyter
Conda:
cd Downloads/
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
(Instead of installing Miniconda you might also want to install Anaconda (https://www.anaconda.com/products/individual). Anaconda is much more complete than Miniconda, but IMHO too “fat”. In our dsci-lab we prefer a lightweight system. This makes it easier to look “under the hood”, to understand what’s going on, and to maintain the whole system; the dependencies in our setup are complex enough anyhow.)
After you have installed Conda, close your terminal (Ctrl-D) and open a new one (e.g. with Ctrl-Alt-T). (Why close and open? If you let the installer initialize Conda, it changed your shell start-up file (~/.bashrc), and a shell reads that file only when it starts. Conda puts an extra virtual environment layer over the standard Python installation, so we can work with multiple Python configurations in parallel. To learn more about conda virtual environments see https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-conda.html)
Your terminal command line should now start with (base), which is the name of your current virtual conda environment:
(base) jb@jb-ThinkPad-X250:~$
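Besides the prompt, the shell itself can tell you which environment is active. A small sketch, assuming only that conda exports the variable CONDA_DEFAULT_ENV when an environment is activated; the helper name active_env is made up for illustration.

```shell
# active_env: print the name of the active conda environment, or "none".
# CONDA_DEFAULT_ENV is the variable conda sets on activation.
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "active conda environment: $(active_env)"

# Show which python3 the shell would start; with (base) active this should
# be a path inside the Miniconda installation directory.
command -v python3 || echo "no python3 on PATH"
```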
As said, Miniconda is minimalistic. Thus we have to install some modules ourselves. Some important ones are:
conda install pandas numpy matplotlib scikit-learn seaborn xgboost
Notes:
We install these packages into the virtual conda environment base. If we decide to create another virtual conda environment, it will be empty again, and we have to populate it with libraries again. This is the reason (a) why we prefer lightweight environments, and (b) why we want to learn how to install libraries ourselves.
Caveat: Do not install Conda with sudo; install it as a normal user. Every user and every virtual conda environment is completely independent of the others. There is no system-wide installation.
Keep conda current:
conda update conda
Install jupytext:
conda install -c conda-forge jupytext
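Because every new conda environment starts empty, it can help to keep the package list in an environment file so that an environment can be recreated with one command. The following sketch writes such a file with the packages installed above and only prints the conda commands instead of running them. The environment name dsci-test and the file location are hypothetical examples.

```shell
# Write an environment file listing the packages installed above.
# The name "dsci-test" and the file location are hypothetical examples.
env_file="${TMPDIR:-/tmp}/dsci-environment.yml"
cat > "$env_file" <<'EOF'
name: dsci-test
channels:
  - defaults
  - conda-forge
dependencies:
  - pandas
  - numpy
  - matplotlib
  - scikit-learn
  - seaborn
  - xgboost
  - jupytext
EOF

# Print (not run) the commands that would create and enter the environment.
echo "conda env create --file $env_file"
echo "conda activate dsci-test"
```

Listing defaults before conda-forge mirrors the steps above, where only jupytext is taken from conda-forge.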
Zotero¶
Download Zotero from https://www.zotero.org/download/ to /home/data/Downloads/
Install Zotero:
cd Downloads
bunzip2 Zotero-5.0.96_linux-x86_64.tar.bz2
tar -xvf Zotero-5.0.96_linux-x86_64.tar
mv Zotero_linux-x86_64/ ~/
cd
Also install the Zotero Connector for Firefox from the Zotero site.
Launch Zotero the first time:
~/Zotero_linux-x86_64/zotero &
Enhance Zotero with Better BibTeX (https://retorque.re/zotero-better-bibtex/):
Follow the installation instructions at https://retorque.re/zotero-better-bibtex/installation/:
Save the file zotero-better-bibtex-5.2.126.xpi to Downloads
…and import this file into Zotero: Werkzeuge (tools) > Add-ons > Install Add-on from File > select the saved .xpi file
Zotero will ask you to restart it several times: do it. You will find that Zotero has created the new directory ~/Zotero. Add this directory to your list of directories that are backed up daily ;-)
Zotero allows you to link to your own files within your base (i.e. home) directory. Choose a directory where you would like to store literature you download from the web. I myself use e.g. this location:
/home/data/a/l2/linked_zotero_files
(/home/data/a/l2 is a location which gets a weekly backup, as opposed e.g. to /home/data/a/l, which holds self-made data and thus gets backed up daily.)
Tell Zotero the location of your own linked_zotero_files folder:
Zotero > Erweitert > Dateien und Ordner > Basisverzeichnis für verknüpfte Dateianhänge > Auswählen (advanced > files and folders > base directory for linked attachments > choose) >
/home/data/a/l2/linked_zotero_files
(Hint: Due to a Zotero bug (as of Nov 2020) this directory must not start with the string “zotero”).
jupyter-book¶
Jupyter Book v0.8 (March 2021, cf. https://jupyterbook.org/intro.html) uses the highly sophisticated documentation tool Sphinx to create websites and (via LaTeX) a PDF book out of a bunch of Jupyter notebooks.
pip install -U jupyter-book
Test the installation by building a book according to https://jupyterbook.org/start/build.html:
mkdir test
cd test
jupyter-book create mybookname
jupyter-book build mybookname
jupyter-book build mybookname/ --builder pdflatex
atril mybookname/_build/latex/book.pdf &
cd
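To see at a glance whether both builds produced output, you can check for the expected files. A minimal sketch, assuming the HTML build writes _build/html/index.html and the PDF build writes _build/latex/book.pdf (the latter path is the one opened with atril above). The helper name check_book is made up for illustration.

```shell
# check_book DIR: report whether the expected build products exist in DIR.
# Returns 0 only if both files are present.
check_book() {
    rc=0
    for f in "_build/html/index.html" "_build/latex/book.pdf"; do
        if [ -f "$1/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            rc=1
        fi
    done
    return $rc
}

# The test book was created in ~/test/mybookname in the steps above.
check_book "$HOME/test/mybookname" || echo "build incomplete or book not found"
```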
Shared folders¶
Create once:
mkdir a
In the VirtualBox menu: Geräte > Gemeinsame Ordner (devices > shared folders) >
click the “Ordner+” (folder +) icon (“Fügt einen neuen gemeinsamen Ordner hinzu”, i.e. adds a new shared folder)
Ordner-Pfad (folder path): choose a folder on the host machine
Ordner-Name (folder name): e.g. “arbitraryname” (any name will do, it only has to be unique)
Permanent erzeugen (make permanent): check
After every boot of the virtual machine:
sudo mount -t vboxsf -o uid=1000,gid=1000 arbitraryname ~/a
Note: Once you have typed this rather complicated command, it is stored in your bash history. Press Ctrl-r and type mount to get the command back immediately, so it costs you no effort.
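As an alternative to the history search, the mount step can be kept in a small script. The sketch below builds the same mount command, but takes the numeric user and group IDs from id instead of hard-coding 1000, and skips the mount if ~/a is already a mount point. It only prints the command; the share name arbitraryname is the example from above.

```shell
share="arbitraryname"      # the folder name chosen in VirtualBox
target="$HOME/a"           # the directory created with mkdir a

# Build the mount command with the current user's numeric IDs, so the
# mounted files belong to whoever runs this script.
mount_cmd="sudo mount -t vboxsf -o uid=$(id -u),gid=$(id -g) $share $target"

if mountpoint -q "$target" 2>/dev/null; then
    echo "$target is already mounted"
else
    # Printed only; paste the printed line into the terminal to really mount.
    echo "$mount_cmd"
fi
```

For the first user created on an Ubuntu system, id -u and id -g normally both return 1000, which matches the command shown above.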