Unix/Linux

File Naming conventions

https://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html

Five Concepts of file naming and organization

  • Have a distinctive, human-readable name that describes the content.
  • Follow a consistent pattern that is machine-friendly.
  • Organize files into directories (when necessary) that follow a consistent pattern.
  • Avoid repetition of semantic elements among file and directory names.
  • Have a file extension that matches the file format (no changing extensions!)
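
A minimal sketch of why a consistent, machine-friendly pattern pays off. The file names below are hypothetical and follow an assumed date_topic_stage.extension pattern; they are created in a temporary directory so nothing real is touched.

```shell
# Work in a throwaway directory
demo=$(mktemp -d)
cd "$demo"

# Hypothetical files named with one pattern: YYYY-MM-DD_topic_stage.csv
touch 2024-01-15_survey_raw.csv 2024-02-15_survey_raw.csv 2024-02-15_survey_clean.csv

# A consistent pattern lets a single glob select exactly the files you want
raw_files=$(ls *_survey_raw.csv)
echo "$raw_files"

# ISO dates (YYYY-MM-DD) sort chronologically when sorted alphabetically
first_file=$(ls | head -n 1)
echo "$first_file"
```

With names like these, the glob picks up only the raw files, and a plain ls already lists the oldest file first.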

Tidyverse Style Guide

What is a terminal?

  • The terminal is built into Mac and Linux systems.

  • Windows users will have to install an emulator (we will use Git Bash).

  • You can also try it in Colab.

The filesystem

  • The file system is hierarchical: you create directories (folders) inside other directories

  • When you log in to a Unix/Linux computer you are in your home directory

  • The command line has a configurable prompt that lets you know the terminal is ready for input

  • Environment variables, e.g., $HOME, $PATH, control your computing environment and many are customizable by users

Files and permissions

  • within each directory you can create files and other directories

  • files whose names start with a period . are hidden by default; use ls -a to view them

  • files have permissions: use ls -lh to view details

drwxr-xr-x    4 robert  staff   128B Aug  8 10:03 bin
-rw-r--r--@   1 robert  staff    20M Jun 24 09:11 boots.png
-rw-r--r--@   1 robert  staff    26M Jun 23 15:23 boots1.pdf

  • you can read, write, or execute files, and you control who (Owner, Group, Other) can see or use them by changing permissions with chmod
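
A short sketch of changing permissions, assuming a hypothetical script called hello.sh created in a temporary directory. The letters u, g, and o stand for the owner, group, and others; r, w, and x stand for read, write, and execute.

```shell
# Create a small script in a throwaway directory (hello.sh is a made-up name)
demo=$(mktemp -d)
script="$demo/hello.sh"
printf '#!/bin/sh\necho "hello"\n' > "$script"

# New files normally have no execute permission (exact bits depend on your umask)
ls -l "$script"

# Give the owner (u) execute (x) permission
chmod u+x "$script"

# Remove all access for group (g) and others (o)
chmod go-rwx "$script"

# The first 10 characters of ls -l are the file type and the permission bits
perms=$(ls -l "$script" | cut -c1-10)
echo "$perms"   # with a typical umask: -rwx------
```

The first character is the file type (- for a file, d for a directory), followed by three rwx triplets for owner, group, and others.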

Working directory

  • The directory you are currently in.

  • Commands will generally take effect in this directory

  • see your working directory using pwd

  • list the contents of the directory (files and other directories) using ls

  • find out how ls works with man ls (long) or ls --help (medium; GNU/Linux only). Shorter options: tldr (brief examples, installed separately), whatis (one-line description), apropos (lists commands relevant to a keyword)
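
A small sketch of looking around the working directory. The file and directory names are made up for illustration and live in a temporary directory.

```shell
# Move into a throwaway directory; it becomes the working directory
demo=$(mktemp -d)
cd "$demo"
pwd

# Create a visible file, a hidden file, and a subdirectory (hypothetical names)
touch notes.txt .hidden_settings
mkdir data

# Plain ls skips names that start with a period
visible=$(ls)
echo "$visible"

# ls -a shows everything, including . and ..
all=$(ls -a)
echo "$all"

# Help commands are interactive, so they are only shown here as comments:
#   man ls      (full manual; press q to quit)
#   whatis ls   (one-line description)
```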

Paths

  • The string returned by pwd command is the full path to the working directory.

  • The full path to your home directory is stored in the environment variable $HOME.

  • You can see it by executing echo $HOME

Paths

  • The shorthand ~ is a nickname for your home directory

    • Example: the full path to docs (see the image in the previous slides) can be written as ~/docs.
  • The environment variable $PATH, which you can display using echo $PATH, tells your Unix shell where to look for commands

    • it is a list of individual paths separated by : (colon)

    • you can add other paths to $PATH so that Unix also looks for commands there (e.g., make your own bin directory)
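
A sketch of inspecting ~ and $PATH and of adding a personal bin directory. The directory location and the greet command are hypothetical; the change to PATH lasts only for the current shell session.

```shell
# ~ expands to the same path that is stored in $HOME
home_via_tilde=~
echo "$home_via_tilde"
echo "$HOME"

# Print each directory in $PATH on its own line
echo "$PATH" | tr ':' '\n'

# Make a personal bin directory (here inside a throwaway directory)
demo=$(mktemp -d)
mkdir -p "$demo/bin"

# Add a tiny made-up command called greet
printf '#!/bin/sh\necho "hi from my own command"\n' > "$demo/bin/greet"
chmod u+x "$demo/bin/greet"

# Put the new directory at the front of PATH so the shell searches it first
PATH="$demo/bin:$PATH"
first_entry=${PATH%%:*}
echo "$first_entry"

# Ask the shell where it finds greet
command -v greet
```

To make such a change permanent you would put the PATH line in one of your shell's startup files instead of typing it each time.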

Common Unix commands

https://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html

  • ls: Listing directory content

  • mkdir and rmdir: make and remove a directory

  • cd: navigating the filesystem by changing directories

  • pwd: see your working directory

  • mv: moving files

  • cp: copying files

  • rm: removing files

  • more and less: display the contents of a file
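
The commands above can be tried together in a short session. This is a sketch with made-up names (project, draft.txt, and so on), run inside a temporary directory so it is safe to experiment.

```shell
# Start in a throwaway directory
demo=$(mktemp -d)
cd "$demo"

mkdir project                    # make a directory
cd project                       # move into it
pwd                              # confirm where we are

echo "first line" > draft.txt    # create a small file
cp draft.txt backup.txt          # copy it
mv draft.txt report.txt          # rename (move) the original

listing=$(ls)                    # list what is here now
echo "$listing"

# less report.txt would page through the file interactively (press q to quit)

cd ..                            # go back up one level
rm project/backup.txt            # remove the files
rm project/report.txt
rmdir project                    # rmdir only removes empty directories
```

Note that rm does not move files to a trash folder; removed files are gone.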

Autocomplete

  • In Unix/Windows you can auto-complete by hitting tab.

  • Example: type cd d, then hit tab.

  • Unix will either auto-complete to docs, if docs is the only directory or file starting with d, or show you all directories beginning with d.

Text editors

Text editors are essential tools in a terminal environment. Here are some of the most popular command-line text editors:

  • Nano
  • Pico
  • Vi or Vim
  • Emacs

Other useful commands

  • curl - download data from the internet.

  • tar - archive files and directories into one file.

  • gzip - and other compression tools - make big files small

  • ssh - connect to another computer.

  • find - search for files by filename in your system.

  • grep - search for patterns in a file.

  • awk/sed - powerful commands to find specific strings in files and change them.

    • awk: field-based processing (columns, math, conditions)
    • sed: stream editor (find/replace, delete, rearrange lines)
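
A brief sketch of several of these commands on a tiny made-up data file (scores.txt, with a name and a number on each line). curl and ssh are left out because they need a network connection and a remote machine.

```shell
# Build a small example in a throwaway directory
demo=$(mktemp -d)
cd "$demo"
mkdir logs
printf 'alice 10\nbob 25\ncarol 7\n' > logs/scores.txt

# find: search for files by name, starting from the current directory
found=$(find . -name 'scores.txt')
echo "$found"      # ./logs/scores.txt

# grep: print the lines that match a pattern
matches=$(grep 'bob' logs/scores.txt)
echo "$matches"    # bob 25

# awk: field-based processing, here summing the second column
total=$(awk '{ sum += $2 } END { print sum }' logs/scores.txt)
echo "$total"      # 42

# sed: stream editing, here a find/replace on the output (the file is unchanged)
renamed=$(sed 's/alice/alicia/' logs/scores.txt | head -n 1)
echo "$renamed"    # alicia 10

# tar with -z: archive the directory into one file and compress it with gzip
tar -czf logs.tar.gz logs
tar -tzf logs.tar.gz   # list the archive contents without extracting
```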

Shell

  • The shell is a program to organize and facilitate your interactions with the operating system

  • echo $SHELL will display the shell you are using

  • hidden files and directories provide customization, e.g., .gitconfig or .ssh

  • most shells have two files that are commonly used for user-level customization, e.g., .zshrc and .zprofile for zsh
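
A sketch of what a startup file might contain. To keep it safe, the example writes to a stand-in file called example_rc in a temporary directory rather than to a real startup file such as ~/.zshrc or ~/.bashrc; the settings themselves are only illustrations.

```shell
# Which shell am I using?
echo "$SHELL"

# Write a stand-in startup file (hypothetical name and contents)
demo=$(mktemp -d)
rc="$demo/example_rc"
cat > "$rc" <<'EOF'
# Example customizations
export EDITOR=nano
alias ll='ls -lh'
EOF

# The shell reads its startup files when it starts; here we load ours by hand
. "$rc"
echo "$EDITOR"
```

In real use, lines like these would go in your shell's own startup file, and they take effect the next time you open a terminal.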

Resources

To get started.