Introduction to Bash/Shell & Setup
What is Bash/Shell?
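Bash (Bourne Again SHell) is one of the most widely used Unix shells and command-line interpreters. It is the default shell on most Linux distributions. macOS still ships Bash, but its default shell has been zsh since macOS Catalina (10.15). Bash is essential for automation, DevOps, system administration, and scripting.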
Why Learn Bash?
- Automate repetitive tasks
- Essential for DevOps, CI/CD, and cloud operations
- Powerful text processing with sed, awk, grep
- Works everywhere (Linux servers, macOS, WSL)
Setup & Installation
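Bash is pre-installed on most Linux distributions and on macOS. macOS ships the older 3.2 release, so features marked "Bash 4.0+" in this tutorial need a newer Bash there. On Windows, you have several options: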
- WSL (Windows Subsystem for Linux) - Recommended for full Linux compatibility
- Git Bash - Comes with Git for Windows
- Cygwin - POSIX-compatible environment
- Windows Terminal (Windows 11) - Built-in terminal application that hosts WSL or Git Bash sessions (it is not a shell itself)
Installing WSL on Windows
wsl --install
Checking Your Bash Version
bash --version
echo $BASH_VERSION
echo $SHELL
Essential First Commands
# System information
uname -a                   # Kernel info
whoami                     # Current user
pwd                        # Present working directory
date                       # Current date/time
echo $HOME                 # Home directory
echo $PATH | tr ':' '\n'   # PATH directories
Creating Your First Script
#!/bin/bash
# My first Bash script (save it as hello.sh)
echo "========================================"
echo " Welcome to Bash Scripting!"
echo "========================================"
echo "Date: $(date)"
echo "User: $(whoami)"
echo "Host: $(uname -n)"
echo "OS: $(uname -o)"   # -o is not part of POSIX; uname -s is the portable choice
echo "========================================"
# Make script executable and run
chmod +x hello.sh
./hello.sh
Create a script called system_info.sh that displays:
- Current date and time
- Logged in user
- Hostname
- Current directory
- Bash version
Write a script that:
- Creates a directory called bash_practice
- Changes into it
- Creates 3 subdirectories: scripts, logs, data
- Lists the structure
Variables & Data Types
Understanding Bash Variables
In Bash, there are no strict data types. Everything is essentially a string, but can be treated as an integer in arithmetic contexts. Variable names are case-sensitive and can contain letters, numbers, and underscores (cannot start with a number).
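To make the "everything is a string" point concrete, here is a minimal sketch. The variable names are illustrative only. The same two values are concatenated by plain expansion and added inside an arithmetic context.

```shell
#!/usr/bin/env bash
# Illustrative names (A, B, JOINED, SUM); not part of the tutorial's scripts.
A=5
B=3

JOINED="$A$B"      # plain expansion concatenates the two strings
SUM=$(( A + B ))   # arithmetic context treats them as integers

echo "Concatenated: $JOINED"   # Concatenated: 53
echo "Added: $SUM"             # Added: 8
```

The values are stored identically in both cases; only the context in which they are used decides whether they behave as text or as numbers.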
Variable Declaration Rules
# CORRECT - No spaces around =
NAME="Alice"
AGE=25
_PRICE=99.99

# WRONG - Spaces cause errors
# NAME = "Alice"   # Error: command not found
# AGE = 25         # Error: command not found

# Accessing variables
echo "Hello $NAME, you are $AGE years old."
echo "Alternative syntax: ${NAME}"
Variable Types
# String variables
# (ACCOUNT is used instead of USER, which is already set by the login environment)
ACCOUNT="john_doe"
QUOTE='Single quotes prevent $expansion'
MESSAGE="Double quotes allow $ACCOUNT expansion"

# Integer (treated as string until arithmetic)
COUNT=42
YEAR=2024

# Readonly variables (constants)
readonly PI=3.14159
readonly APP_NAME="MyApplication"
# PI=3.14   # Error: readonly variable

# Unset variables
TEMP_VAR="temporary"
unset TEMP_VAR
# echo $TEMP_VAR   # Empty, no error
Special Variables
#!/bin/bash
echo "Script name: $0"
echo "First argument: $1"
echo "Second argument: $2"
echo "All arguments: $@"
echo "Argument count: $#"
echo "Process ID: $$"
echo "Last exit code: $?"
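The difference between "$@", "$*", and an unquoted $@ only shows up when an argument contains whitespace. The sketch below simulates script arguments with set -- and counts what a hypothetical helper, count_args, receives in each case.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that prints how many arguments it received.
count_args() {
    echo "$#"
}

# Simulate a script called with two arguments, one of which contains a space.
set -- "first arg" "second"

AT_COUNT=$(count_args "$@")       # "$@" keeps each argument intact
STAR_COUNT=$(count_args "$*")     # "$*" joins all arguments into one string
UNQUOTED_COUNT=$(count_args $@)   # unquoted: split again on whitespace

echo "\"\$@\" passes $AT_COUNT, \"\$*\" passes $STAR_COUNT, unquoted passes $UNQUOTED_COUNT"
```

With the default IFS this reports 2, 1, and 3 arguments, which is why "$@" is usually the safe way to forward arguments.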
Command Substitution
# Modern syntax: $()
CURRENT_DATE=$(date +%Y-%m-%d)
FILES_COUNT=$(ls | wc -l)
USER_NAME=$(whoami)

# Legacy syntax: `` (backticks) - avoid
# OLD_WAY=`date +%Y`

echo "Today is: $CURRENT_DATE"
echo "Files in directory: $FILES_COUNT"

# Nested command substitution
FILES_WITH_EXT=$(ls $(pwd)/*.sh 2>/dev/null | wc -l)
echo "Shell scripts: $FILES_WITH_EXT"
Environment & Export
# Make variable available to child processes
export API_KEY="sk-1234567890abcdef"
export DB_HOST="localhost"
export DB_PORT=5432

# View all environment variables
env | grep "^API_"
printenv DB_HOST

# Add to PATH
export PATH="$HOME/bin:$PATH"

# Remove from environment
export -n API_KEY   # Remove export property
unset API_KEY       # Delete variable entirely
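One way to see what export changes is to start a child shell and check which variables it can read. This is a small sketch; DEMO_LOCAL and DEMO_EXPORTED are made-up names used only for the demonstration.

```shell
#!/usr/bin/env bash
# DEMO_LOCAL and DEMO_EXPORTED are hypothetical names for this demo.
DEMO_LOCAL="shell only"
export DEMO_EXPORTED="visible to children"

# A child process inherits only exported variables.
# Single quotes stop the parent shell from expanding the variables itself.
CHILD_VIEW=$(bash -c 'echo "${DEMO_LOCAL:-unset} / ${DEMO_EXPORTED:-unset}"')
echo "Child sees: $CHILD_VIEW"
```

The child reports the non-exported variable as unset, while the exported one arrives with its value.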
Create a script that stores and displays user information:
- Store username, home directory, and shell in variables
- Use command substitution to get current date/time
- Display a formatted profile card
- Export a custom variable called MY_APP
Create a script that accepts two arguments (first name and last name) and outputs:
- Full name combining both arguments
- Total number of arguments passed
- Script name that was executed
- Exit code of the last command
Operators & Expressions
Arithmetic Operators
Bash supports integer arithmetic using multiple syntaxes:
Arithmetic Expansion $(( ))
# Basic operations
RESULT=$(( 10 + 5 ))
echo "10 + 5 = $RESULT"

# All arithmetic operators
echo "Addition: $(( 20 + 10 ))"
echo "Subtraction: $(( 20 - 10 ))"
echo "Multiplication: $(( 20 * 10 ))"
echo "Division: $(( 20 / 10 ))"
echo "Modulo: $(( 20 % 7 ))"
echo "Exponent: $(( 2 ** 10 ))"

# Using variables
NUM=5
echo "NUM squared: $(( NUM * NUM ))"
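Because $(( )) is integer-only, division silently drops the fractional part. The sketch below shows the truncation and one common workaround, which is to hand the calculation to awk. Using awk here is a choice made for this example; bc is another option.

```shell
#!/usr/bin/env bash
# Integer division truncates the result.
INT_RESULT=$(( 7 / 2 ))
echo "7 / 2 in Bash: $INT_RESULT"

# For decimals, delegate to an external tool such as awk.
FLOAT_RESULT=$(awk 'BEGIN { printf "%.2f", 7 / 2 }')
echo "7 / 2 via awk: $FLOAT_RESULT"
```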
let Command
# 'let' for arithmetic - no $ needed
let "a = 5 + 3"
let "b = a * 2"
let "x += 5"   # compound: x = x + 5
let "x *= 2"   # compound: x = x * 2
Comparison Operators
# Numeric comparisons
[ "$NUM1" -eq "$NUM2" ]   # Equal
[ "$NUM1" -ne "$NUM2" ]   # Not equal
[ "$NUM1" -lt "$NUM2" ]   # Less than
[ "$NUM1" -le "$NUM2" ]   # Less or equal
[ "$NUM1" -gt "$NUM2" ]   # Greater than
[ "$NUM1" -ge "$NUM2" ]   # Greater or equal

# String comparisons
[ "$STR1" = "$STR2" ]     # Equal
[ "$STR1" != "$STR2" ]    # Not equal
[ -z "$STR" ]             # Empty (zero length)
[ -n "$STR" ]             # Not empty
File Test Operators
# Existence and type
[ -e "file" ]   # Exists
[ -f "file" ]   # Regular file
[ -d "dir" ]    # Directory
[ -L "link" ]   # Symbolic link
[ -p "pipe" ]   # Named pipe

# Permissions
[ -r "file" ]   # Readable
[ -w "file" ]   # Writable
[ -x "file" ]   # Executable
[ -s "file" ]   # Non-empty

# File comparisons
[ "file1" -nt "file2" ]   # file1 newer than file2
[ "file1" -ot "file2" ]   # file1 older than file2
Logical Operators
# AND (&& / -a)
[ -f "file" ] && [ -r "file" ]
[ -f "file" -a -r "file" ]       # -a and -o are obsolescent in POSIX; prefer && and ||

# OR (|| / -o)
[ -f "file" ] || [ -d "file" ]

# NOT (!)
[ ! -f "file" ]                  # True if "file" is missing or is not a regular file
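In Bash, the [[ ]] keyword lets && and || appear inside a single test, which is often easier to read than chaining several [ ] commands. The sketch below uses a hypothetical helper, describe_path, to combine the file tests shown earlier.

```shell
#!/usr/bin/env bash
# describe_path is a hypothetical helper written for this sketch.
describe_path() {
    local path=$1
    if [[ -f $path && -r $path ]]; then
        echo "readable file"
    elif [[ -d $path || -L $path ]]; then
        echo "directory or link"
    else
        echo "missing or other"
    fi
}

describe_path /tmp
describe_path /no/such/path
```

Note that [[ ]] is a Bash feature, so a script using it should start with a Bash shebang rather than #!/bin/sh.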
Create a script that accepts 3 arguments: two numbers and an operator (+, -, *, /), then performs the calculation and displays the result.
Write a script that checks if a file (provided as argument) exists and reports:
- File type (regular, directory, link, etc.)
- Read/write/execute permissions
- File size (empty or not)
Control Flow: If & Case
If Statements
Bash supports if/elif/else chains for complex decision making:
Basic If Structure
#!/bin/bash
AGE=25

if [ "$AGE" -ge 18 ]; then
    echo "You are an adult"
else
    echo "You are a minor"
fi

# One-liner with && and ||
# (not a true if/else: the || part also runs if the && command itself fails)
[ "$AGE" -ge 18 ] && echo "Adult" || echo "Minor"
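The && / || one-liner behaves like if/else only while the middle command succeeds. The sketch below makes the difference visible with a hypothetical function, announce_adult, that prints its message and then returns a failure status.

```shell
#!/usr/bin/env bash
# announce_adult is a hypothetical command that prints a message and then fails,
# standing in for any "then" command that can return a non-zero status.
announce_adult() {
    echo "Adult"
    return 1
}

# A && B || C: C runs whenever A or B fails, so both messages can appear.
one_liner() {
    [ "$1" -ge 18 ] && announce_adult || echo "Minor"
}

# A real if/else picks its branch from the condition alone.
if_else() {
    if [ "$1" -ge 18 ]; then
        announce_adult || true   # ignore the simulated failure
    else
        echo "Minor"
    fi
}

echo "One-liner output:"
one_liner 25
echo "if/else output:"
if_else 25
```

For age 25 the one-liner prints both "Adult" and "Minor", while the if/else version prints only "Adult".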
If-Elif-Else Chains
#!/bin/bash
SCORE=85

if [ "$SCORE" -ge 90 ]; then
    echo "Grade: A"
    echo "Excellent work!"
elif [ "$SCORE" -ge 80 ]; then
    echo "Grade: B"
    echo "Good job!"
elif [ "$SCORE" -ge 70 ]; then
    echo "Grade: C"
    echo "Satisfactory"
elif [ "$SCORE" -ge 60 ]; then
    echo "Grade: D"
    echo "Needs improvement"
else
    echo "Grade: F"
    echo "Failed - study more"
fi
Nested If Statements
#!/bin/bash
USER_TYPE="admin"
LOGGED_IN="true"

if [ "$LOGGED_IN" = "true" ]; then
    echo "User is logged in"
    if [ "$USER_TYPE" = "admin" ]; then
        echo "Admin access granted"
        echo "Showing admin dashboard..."
    elif [ "$USER_TYPE" = "user" ]; then
        echo "User access granted"
        echo "Showing user dashboard..."
    else
        echo "Unknown user type"
    fi
else
    echo "Please log in first"
fi
Case Statements
Case is cleaner than multiple if/elif for matching patterns:
Basic Case
#!/bin/bash
DAY="Monday"

case "$DAY" in
    "Monday")
        echo "Start of work week"
        ;;
    "Friday")
        echo "Almost weekend!"
        ;;
    "Saturday"|"Sunday")
        echo "Weekend!"
        ;;
    *)
        echo "Regular work day"
        ;;
esac
Case with Command Arguments
#!/bin/bash
# Usage: ./script.sh [start|stop|restart|status]
COMMAND=$1

case "$COMMAND" in
    start|s)
        echo "Starting service..."
        # start commands here
        ;;
    stop|st)
        echo "Stopping service..."
        # stop commands here
        ;;
    restart|r)
        echo "Restarting service..."
        # restart commands here
        ;;
    status|stat)
        echo "Checking status..."
        # status commands here
        ;;
    help|-h|--help)
        echo "Usage: $0 {start|stop|restart|status}"
        ;;
    *)
        echo "Unknown command: $COMMAND"
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac
Pattern Matching in Case
#!/bin/bash
FILENAME=$1

case "$FILENAME" in
    *.txt)
        echo "Text file"
        ;;
    *.sh)
        echo "Shell script"
        ;;
    *.jpg|*.jpeg|*.png|*.gif)
        echo "Image file"
        ;;
    *.tar.gz|*.tgz|*.zip)
        echo "Compressed archive"
        ;;
    [0-9]*)
        echo "Starts with a number"
        ;;
    *)
        echo "Unknown file type"
        ;;
esac
Write a script that takes a number and classifies it:
- Positive/Negative/Zero
- Even or Odd
- Range: small (1-10), medium (11-100), large (>100)
Create a case-based menu script with options:
- List files in current directory
- Show disk usage
- Create a backup of specific file
- Exit
Loops
For Loops
Bash offers multiple ways to iterate - choose the right one for your use case:
C-style For Loop
# Traditional C-style loop (Bash 2.04+)
for (( i=0; i<5; i++ )); do
    echo "Iteration: $i"
done

# Countdown
for (( i=10; i>0; i-- )); do
    echo "Countdown: $i"
    sleep 1
done
echo "Blast off!"

# Multiple variables
for (( i=0, j=10; i<=10; i++, j-- )); do
    echo "i=$i, j=$j"
done
Range-based For Loop
# Brace expansion {start..end}
for i in {1..5}; do
    echo "Number: $i"
done

# With step/increment (Bash 4.0+)
for i in {0..10..2}; do
    echo "Even number: $i"
done

# Reverse order
for i in {10..1}; do
    echo "Counting down: $i"
done
For Loop with Lists
# Iterate over list of strings
for color in red green blue yellow; do
    echo "Color: $color"
done

# Iterate over array
FRUITS=("apple" "banana" "cherry")
for fruit in "${FRUITS[@]}"; do
    echo "Fruit: $fruit"
done

# Iterate over command output
for user in $(who | awk '{print $1}' | sort -u); do
    echo "Logged in user: $user"
done
File Globbing in For Loops
# Process all text files
for file in *.txt; do
    [ -f "$file" ] || continue   # Skip if no matches
    echo "Processing: $file"
    wc -l "$file"
done

# Files in another directory (a plain glob does not descend into subdirectories)
for file in /var/log/*.log; do
    [ -f "$file" ] || continue
    # stat -c%s is the GNU form; BSD/macOS stat uses -f%z instead
    echo "$(basename "$file"): $(stat -c%s "$file") bytes"
done

# Multiple patterns
for file in *.sh *.bash; do
    [ -f "$file" ] || continue
    chmod +x "$file"
    echo "Made executable: $file"
done
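The [ -f "$file" ] || continue guard is needed because an unmatched pattern is passed to the loop as literal text. An alternative is Bash's nullglob option, which makes an unmatched pattern expand to nothing. The sketch below compares the two behaviours in a throwaway directory; the .nomatch extension is made up so that nothing matches.

```shell
#!/usr/bin/env bash
# Work in an empty temporary directory so the result does not depend on local files.
WORK_DIR=$(mktemp -d)
cd "$WORK_DIR" || exit 1

# Default behaviour: the unmatched pattern is passed through literally,
# so the loop body runs once with file='*.nomatch'.
DEFAULT_COUNT=0
for file in *.nomatch; do
    DEFAULT_COUNT=$(( DEFAULT_COUNT + 1 ))
done

# With nullglob, the unmatched pattern expands to nothing and the body never runs.
shopt -s nullglob
NULLGLOB_COUNT=0
for file in *.nomatch; do
    NULLGLOB_COUNT=$(( NULLGLOB_COUNT + 1 ))
done
shopt -u nullglob

echo "Iterations without nullglob: $DEFAULT_COUNT"
echo "Iterations with nullglob: $NULLGLOB_COUNT"

# Return to the previous directory and remove the empty temp directory.
cd "$OLDPWD" || exit 1
rmdir "$WORK_DIR"
```

Turning nullglob off again afterwards keeps the change local, since some commands behave unexpectedly when a pattern expands to no arguments at all.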
While Loops
# Basic while loop
COUNTER=0
while [ "$COUNTER" -lt 5 ]; do
    echo "Counter: $COUNTER"
    ((COUNTER++))
done

# Reading file line by line
while IFS= read -r line; do
    echo "Line: $line"
done < file.txt

# Reading from command output
while read -r line; do
    echo "User: $line"
done < <(who)

# Infinite loop with break
while true; do
    read -r -p "Enter command (or 'quit'): " cmd
    [ "$cmd" = "quit" ] && break
    eval "$cmd"   # Caution: eval runs whatever is typed; never use it on untrusted input
done
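The process substitution form, done < <(command), matters when the loop changes variables. Piping into while normally runs the loop in a subshell, so changes made inside it are lost. The sketch below counts lines both ways, assuming Bash's default settings (the lastpipe option is off). The sample input is made up.

```shell
#!/usr/bin/env bash
# Made-up sample input with three lines.
INPUT=$'alpha\nbeta\ngamma'

# Piping into while runs the loop in a subshell: the count is lost afterwards.
PIPE_COUNT=0
printf '%s\n' "$INPUT" | while IFS= read -r line; do
    PIPE_COUNT=$(( PIPE_COUNT + 1 ))
done

# Process substitution keeps the loop in the current shell: the count survives.
PROC_COUNT=0
while IFS= read -r line; do
    PROC_COUNT=$(( PROC_COUNT + 1 ))
done < <(printf '%s\n' "$INPUT")

echo "After pipe: $PIPE_COUNT"
echo "After process substitution: $PROC_COUNT"
```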
Until Loops
# Until runs while condition is FALSE
# Perfect for "wait until something happens"
COUNT=10
until [ "$COUNT" -eq 0 ]; do
    echo "Count: $COUNT"
    ((COUNT--))
done

# Wait for file to exist
until [ -f "/tmp/ready.flag" ]; do
    echo "Waiting for file..."
    sleep 2
done
echo "File detected!"
Loop Control: break & continue
# break - exit loop completely
for i in {1..100}; do
    if [ "$i" -eq 50 ]; then
        echo "Found 50, stopping"
        break
    fi
done

# continue - skip to next iteration
for i in {1..20}; do
    if (( i % 2 == 0 )); then
        continue   # Skip even numbers
    fi
    echo "Odd: $i"
done

# break N - break out of N nested loops
for i in {1..3}; do
    for j in {a..c}; do
        echo "i=$i, j=$j"
        [ "$j" = "b" ] && break 2   # Break both loops
    done
done
Select Statement (Interactive Menus)
#!/bin/bash
PS3="Choose an option: "   # Prompt

select choice in "List Files" "Show Date" "Current User" "Exit"; do
    case "$choice" in
        "List Files")
            ls -la
            ;;
        "Show Date")
            date
            ;;
        "Current User")
            whoami
            ;;
        "Exit")
            echo "Goodbye!"
            break
            ;;
        *)
            echo "Invalid option"
            ;;
    esac
done
Create a script that generates a multiplication table. The user should input a number, and the script prints the table from 1 to 10.
Write a script that:
- Reads a log file line by line
- Counts ERROR, WARNING, and INFO messages
- Stops processing when it finds 5 ERROR entries
- Prints a summary report
Functions & Scripting
Function Basics
Functions are reusable blocks of code that make scripts modular and easier to maintain.
Defining Functions
#!/bin/bash
# Style 1: function_name() { ... }
greet() {
    echo "Hello, World!"
}

# Style 2: function function_name { ... }
function say_goodbye {
    echo "Goodbye!"
}

# Call functions
greet
say_goodbye
Function Arguments
#!/bin/bash
# Function with parameters
greet_user() {
    echo "Hello, $1!"                # $1 is first argument
    echo "You are $2 years old."     # $2 is second argument
    echo "Total arguments: $#"       # $# is argument count
    echo "All arguments: $@"         # $@ is all arguments
}

# Call with arguments
greet_user "Alice" 25
greet_user "Bob" 30 "extra"
Return Values
#!/bin/bash
# Return exit codes (0-255)
check_file() {
    if [ -f "$1" ]; then
        return 0   # Success
    else
        return 1   # Failure
    fi
}

# Check return code with $?
check_file "existing.txt"
if [ $? -eq 0 ]; then
    echo "File exists!"
fi

# Return string output using echo (command substitution)
get_full_name() {
    echo "$1 $2"
}

FULL_NAME=$(get_full_name "John" "Doe")
echo "Full name: $FULL_NAME"
Variable Scope
#!/bin/bash
# Global variable (default)
GLOBAL_VAR="I am global"

# Function with local variables
demo_scope() {
    local LOCAL_VAR="I am local"
    GLOBAL_MOD="Modified in function"
    echo "Inside function:"
    echo " LOCAL_VAR: $LOCAL_VAR"
    echo " GLOBAL_VAR: $GLOBAL_VAR"
}

demo_scope

echo "Outside function:"
echo " LOCAL_VAR: '$LOCAL_VAR' (empty - not accessible)"
echo " GLOBAL_MOD: $GLOBAL_MOD"
Function Libraries
#!/bin/bash
# Library of utility functions (save as utils.sh)

# Logging functions ("$*" joins all arguments into one message string)
log_info() {
    echo "[INFO] $(date '+%Y-%m-%d %H:%M:%S') - $*"
}

log_error() {
    echo "[ERROR] $(date '+%Y-%m-%d %H:%M:%S') - $*" >&2
}

log_warn() {
    echo "[WARN] $(date '+%Y-%m-%d %H:%M:%S') - $*"
}

# File utilities
file_exists() {
    [ -f "$1" ]
}

dir_exists() {
    [ -d "$1" ]
}

# Validation functions
is_number() {
    [[ "$1" =~ ^-?[0-9]+$ ]]
}

is_email() {
    [[ "$1" =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]
}

# Usage in main script:
# source ./utils.sh
# log_info "Starting process"
Recursive Functions
#!/bin/bash
# Factorial calculation
factorial() {
    local n=$1
    if (( n <= 1 )); then
        echo 1
    else
        local prev=$(factorial $((n - 1)))
        echo $((n * prev))
    fi
}

echo "5! = $(factorial 5)"
echo "10! = $(factorial 10)"

# Directory tree listing (recursive)
list_tree() {
    local dir=$1
    local indent=$2
    local item   # keep the loop variable local to each level of recursion
    for item in "$dir"/*; do
        [ -e "$item" ] || continue
        echo "${indent}$(basename "$item")"
        if [ -d "$item" ]; then
            list_tree "$item" "  $indent"
        fi
    done
}
Error Handling in Functions
#!/bin/bash
# Safe division with error checking
safe_divide() {
    local dividend=$1
    local divisor=$2

    # Validate inputs
    if ! [[ "$dividend" =~ ^-?[0-9]+$ ]]; then
        echo "Error: First argument must be a number" >&2
        return 1
    fi
    if ! [[ "$divisor" =~ ^-?[0-9]+$ ]]; then
        echo "Error: Second argument must be a number" >&2
        return 1
    fi
    if [ "$divisor" -eq 0 ]; then
        echo "Error: Cannot divide by zero" >&2
        return 1
    fi

    echo $((dividend / divisor))
    return 0
}

# Usage
result=$(safe_divide 100 5)
if [ $? -eq 0 ]; then
    echo "Result: $result"
else
    echo "Operation failed"
fi
Create a library file with functions for:
- add(), subtract(), multiply(), divide()
- is_even(), is_prime()
- max(), min() (variable arguments)
Write a script with functions that:
- create_backup() - backs up a file with timestamp
- restore_backup() - restores from backup
- list_backups() - shows available backups
- cleanup_old_backups() - removes backups older than 7 days
Arrays & String Manipulation
Indexed Arrays
Arrays allow us to store multiple values inside a single variable. Instead of creating many separate variables, arrays help organize data efficiently. In Bash, indexed arrays use numeric indexes starting from 0.
Arrays are commonly used for handling file lists, user data, logs, automation tasks, and command outputs.
# Declare arrays
FRUITS=("Apple" "Banana" "Cherry" "Date")
NUMBERS=(10 20 30 40 50)

# Access elements (0-indexed)
echo "${FRUITS[0]}"    # Apple (first element)
echo "${FRUITS[2]}"    # Cherry (third element)

# Last element (Bash 4.2+)
echo "${FRUITS[-1]}"

# Array length
echo "${#FRUITS[@]}"

# Print all elements
echo "${FRUITS[@]}"
Adding and Updating Elements
Bash arrays are dynamic, meaning elements can be added or modified at runtime. This makes arrays very useful in scripts that continuously process data.
# Initial array
LANGUAGES=("Python" "JavaScript")

# Add new elements
LANGUAGES+=("Rust")
LANGUAGES+=("Go" "Java")

# Update existing element
LANGUAGES[1]="TypeScript"

echo "${LANGUAGES[@]}"
Looping Through Arrays
Arrays become extremely powerful when combined with loops. We can iterate through every element automatically instead of accessing values manually.
SERVERS=("Web" "Database" "Cache")

# Loop through values
for server in "${SERVERS[@]}"; do
    echo "Server: $server"
done

# Loop using indexes
for i in "${!SERVERS[@]}"; do
    echo "$i -> ${SERVERS[$i]}"
done
Array Slicing
Slicing allows extracting a portion of an array. This is useful when processing subsets of data.
COLORS=("Red" "Blue" "Green" "Yellow")

# Get elements from index 1
echo "${COLORS[@]:1}"

# Get 2 elements starting from index 1
echo "${COLORS[@]:1:2}"
Associative Arrays (Bash 4.0+)
Associative arrays store data using custom keys instead of numeric indexes. They work similarly to objects or dictionaries in other programming languages.
Associative arrays are useful for storing structured information such as:
- User profiles
- Configuration settings
- Server information
- API responses
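# Declare associative array
# (named PROFILE rather than USER, because USER is usually already set by the
# login environment and its old value would end up inside the array)
declare -A PROFILE
PROFILE[name]="Alice"
PROFILE[role]="Administrator"
PROFILE[country]="Canada"

# Access values
echo "${PROFILE[name]}"
echo "${PROFILE[role]}"

# Print all keys
echo "${!PROFILE[@]}"

# Print all values
echo "${PROFILE[@]}"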
Looping Through Associative Arrays
declare -A PRODUCTS
PRODUCTS[laptop]=1200
PRODUCTS[mouse]=25
PRODUCTS[keyboard]=70

for item in "${!PRODUCTS[@]}"; do
    echo "$item costs ${PRODUCTS[$item]}"
done
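Associative arrays do not keep their keys in insertion order, so the loop above may print items in any order. When a stable order matters, one approach is to sort the keys first. The sketch below does this with sort; the print_sorted function and PRICES array are names invented for the example.

```shell
#!/usr/bin/env bash
# Requires Bash 4.0+ for associative arrays. PRICES is a made-up example.
declare -A PRICES=([laptop]=1200 [mouse]=25 [keyboard]=70)

# print_sorted is a hypothetical helper: it lists the entries in key order.
print_sorted() {
    local item
    while IFS= read -r item; do
        echo "$item costs ${PRICES[$item]}"
    done < <(printf '%s\n' "${!PRICES[@]}" | sort)
}

print_sorted
```

This sketch assumes the keys contain no newlines, since each key is passed to sort on its own line.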
String Manipulation
Strings are heavily used in Bash scripting because shell scripts mainly work with text, file paths, command outputs, and logs.
Bash provides powerful built-in operations for extracting, replacing, formatting, and processing strings efficiently.
String Length & Extraction
TEXT="Learning Bash Scripting"

# String length
echo "${#TEXT}"

# Extract first 8 characters
echo "${TEXT:0:8}"

# Extract from position 9
echo "${TEXT:9}"
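Besides offset-based extraction, Bash can strip a matching prefix or suffix from a string, which is handy for file paths. The sketch below is illustrative; FILE_PATH is a made-up path and nothing is read from disk.

```shell
#!/usr/bin/env bash
# FILE_PATH is a made-up example path.
FILE_PATH="/var/log/app/server.log.gz"

BASENAME="${FILE_PATH##*/}"    # remove longest prefix matching */   -> server.log.gz
DIRNAME="${FILE_PATH%/*}"      # remove shortest suffix matching /*  -> /var/log/app
NO_LAST_EXT="${BASENAME%.*}"   # remove shortest suffix matching .*  -> server.log
NO_ALL_EXT="${BASENAME%%.*}"   # remove longest suffix matching .*   -> server

echo "Base name: $BASENAME"
echo "Directory: $DIRNAME"
echo "Without last extension: $NO_LAST_EXT"
echo "Without any extension: $NO_ALL_EXT"
```

A single # or % removes the shortest match and a doubled ## or %% removes the longest, with # working from the front of the string and % from the end.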
Replacing Text
String replacement is commonly used for cleaning filenames, editing logs, or formatting output data.
MESSAGE="I love Java"

# Replace first occurrence
echo "${MESSAGE/Java/Python}"

# Replace all occurrences
TEXT="apple apple apple"
echo "${TEXT//apple/orange}"
Case Conversion
WORD="Linux"

# These case-conversion expansions require Bash 4.0+

# Uppercase
echo "${WORD^^}"

# Lowercase
echo "${WORD,,}"

# Capitalize first character
echo "${WORD^}"
String Comparison
ROLE="admin"
if [[ "$ROLE" == "admin" ]]; then
    echo "Administrator access granted"
fi

EMAIL=""
if [[ -z "$EMAIL" ]]; then
    echo "Email is empty"
fi
Splitting Strings
DATA="red,green,blue"

IFS=',' read -ra COLORS <<< "$DATA"

for color in "${COLORS[@]}"; do
    echo "Color: $color"
done
Best Practices
- Always quote string variables to avoid word splitting
- Use meaningful names for arrays and variables
- Prefer associative arrays for structured data
- Use loops instead of repetitive code
- Validate string data before processing
Create a contact manager using associative arrays.
- Add contacts with phone numbers
- Search contact by name
- Delete contacts
- Display all contacts
Create a script that cleans messy filenames using string manipulation.
- Convert uppercase to lowercase
- Replace spaces with underscores
- Remove unwanted symbols
- Rename files automatically
Data Import & Cleaning
Learn how to read CSV and Excel files and clean messy data using tidyr.
Introduction to Data Import
In real-world data analysis projects, data usually comes from external files such as CSV files, Excel spreadsheets, databases, or APIs. Before analyzing data, we first need to import it into R.
R provides powerful functions and libraries that make data importing simple and efficient. Once the data is imported, the next important step is cleaning the data to remove errors, missing values, and inconsistencies.
Reading CSV Files
CSV (Comma-Separated Values) files are one of the most common file formats used in data science and analytics. Each line in a CSV file represents a row, and commas separate the column values.
data <- read.csv("students.csv")
data
The read.csv() function imports the CSV file into a Data Frame.
- "students.csv" is the file name
- The imported data is stored in the variable data
Viewing Imported Data
After importing data, it is important to inspect and understand the dataset structure.
head(data)
str(data)
summary(data)
These functions help us understand the dataset:
- head() displays the first few rows
- str() shows the structure and data types
- summary() provides statistical summaries
Reading Excel Files
Excel files are widely used in businesses and organizations for storing data. To read Excel files in R, we commonly use the readxl package.
library(readxl)
data <- read_excel("students.xlsx")
data
The read_excel() function imports Excel spreadsheets directly into R.
Understanding Dirty Data
Dirty data refers to incomplete, inconsistent, or incorrect data that can affect analysis results.
Common problems in datasets include:
- Missing values
- Duplicate rows
- Incorrect formatting
- Extra spaces in text
- Empty columns
Data cleaning is an important step because clean data produces more accurate analysis and better machine learning models.
Handling Missing Values
Missing values are represented as NA in R. We can identify and remove missing values using built-in functions.
is.na(data)
na.omit(data)
Here:
- is.na() checks for missing values
- na.omit() removes rows containing missing values
Removing Duplicate Data
Duplicate records can create incorrect analysis results. R provides functions to detect and remove duplicates.
unique(data)
The unique() function removes duplicate rows from the dataset.
Introduction to tidyr
The tidyr package is used for cleaning and organizing messy datasets. It helps convert raw data into a tidy format that is easier to analyze.
library(tidyr)
Using drop_na()
The drop_na() function removes rows containing missing values.
data %>%
drop_na()
Using replace_na()
Instead of removing missing values, we can replace them with meaningful values.
data %>%
replace_na(list(marks = 0))
In this example, missing values in the marks column are replaced with 0.
Cleaning Text Data
Text data often contains unnecessary spaces or inconsistent capitalization. Cleaning text improves data quality and consistency.
data$name <- trimws(data$name)
The trimws() function removes leading and trailing whitespace from text values. Assigning the result back to data$name updates the column.
Renaming Columns
Clear and meaningful column names make datasets easier to understand.
colnames(data) <- c("Name", "Age", "Marks")
This statement changes the column names of the dataset.
Exporting Cleaned Data
After cleaning the dataset, we can save it back to a CSV file for future use.
write.csv(data, "cleaned_data.csv")
The cleaned dataset will be stored in a new CSV file.
Advantages of Data Cleaning
- Improves data accuracy
- Removes inconsistencies
- Prepares data for analysis
- Enhances machine learning performance
- Makes reports more reliable
Summary
In this lecture, we learned how to import CSV and Excel files into R using read.csv() and read_excel().
We also explored data cleaning techniques such as handling missing values, removing duplicates, renaming columns, and organizing datasets using the tidyr package.
Clean data is essential for accurate analysis, reporting, and machine learning applications.
Process Management & Signals
Process Management
# List all processes
ps aux                       # BSD style (detailed)
ps -ef                       # UNIX style
ps aux | grep "nginx"        # Filter for specific process

# Get current shell PID
echo "My PID: $$"

# Get parent PID
echo "Parent PID: $PPID"

# Find process by name
pidof nginx                  # Get PID of nginx
pgrep nginx                  # List all nginx PIDs
pgrep -f "python script.py"  # Match full command line

# Check if process is running
if pgrep -x "sshd" > /dev/null; then
    echo "SSH service is running"
fi
Killing & Signaling Processes
# Common signals (numbers for SIGCONT/SIGSTOP/SIGTSTP are the usual Linux values;
# run `kill -l` to list the signals on your system)
# 1  (SIGHUP)  - Hang up (often used to reload config)
# 2  (SIGINT)  - Interrupt (Ctrl+C)
# 9  (SIGKILL) - Kill immediately (cannot be caught)
# 15 (SIGTERM) - Terminate gracefully (default)
# 18 (SIGCONT) - Continue (resume)
# 19 (SIGSTOP) - Stop (pause; cannot be caught)
# 20 (SIGTSTP) - Terminal stop (sent by Ctrl+Z; can be caught)

# Kill by PID
kill 1234        # Send SIGTERM (15) to PID 1234
kill -9 1234     # Force kill (SIGKILL)
kill -15 1234    # Graceful termination

# Kill by name
killall firefox  # Kill all firefox processes
pkill nginx      # Kill by pattern

# Send custom signal
kill -HUP 1234   # Reload config (SIGHUP)
Background & Foreground Jobs
# Run command in background
sleep 60 &         # & runs in background
# Output: [1] 12345   (job number [1], PID 12345)

# List jobs
jobs               # Show all background jobs
jobs -l            # Include PIDs
jobs -r            # Running jobs only
jobs -s            # Stopped jobs only

# Bring job to foreground
fg %1              # Bring job [1] to foreground
fg                 # Most recent job

# Send to background
# Press Ctrl+Z to suspend the current foreground job, then:
bg %1              # Resume job [1] in background

# Wait for background job
wait %1            # Wait for job [1] to complete
wait               # Wait for all background jobs

# Disown (prevent SIGHUP on logout)
disown %1          # Job survives logout
nohup script.sh &  # Run immune to hangups
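When a script starts several background jobs, wait can also report how each one finished. This is a minimal sketch: the two subshells stand in for real commands, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Two stand-in background tasks: one succeeds, one exits with status 3.
( sleep 1; exit 0 ) &
PID_OK=$!          # $! holds the PID of the most recent background job
( sleep 1; exit 3 ) &
PID_FAIL=$!

# wait PID returns that job's exit status; record it when it is non-zero.
STATUS_OK=0
wait "$PID_OK" || STATUS_OK=$?
STATUS_FAIL=0
wait "$PID_FAIL" || STATUS_FAIL=$?

echo "First job exited with $STATUS_OK"
echo "Second job exited with $STATUS_FAIL"
```

Saving $! immediately after each & is the important step, because the next background command overwrites it.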
Signal Handling in Scripts
#!/bin/bash
# trap - handle signals gracefully

# Cleanup function
cleanup() {
    echo "Cleaning up..."
    rm -f "/tmp/tempfile_$$"
    echo "Goodbye!"
}

# Set traps
trap cleanup EXIT        # Run on script exit
trap 'exit 130' SIGINT   # Ctrl+C: exit, which runs the EXIT trap exactly once
trap '' SIGTERM          # Ignore SIGTERM

# Reset trap
trap - EXIT              # Remove EXIT trap

# Example: temp file handler
TEMP_FILE=$(mktemp)
# Single quotes delay expansion of $TEMP_FILE until the trap runs
trap 'rm -f "$TEMP_FILE"' EXIT
trap 'exit 1' INT TERM   # Exiting triggers the EXIT trap above

# Script continues here...
echo "Using temp file: $TEMP_FILE"
# ... do work ...

# Explicit cleanup (trap also runs on exit)
rm -f "$TEMP_FILE"
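To check that an EXIT trap really cleans up, the sketch below runs the temp-file pattern inside a subshell and then tests whether the file is gone. MARKER and SCRATCH are names invented for this demonstration. The subshell is used only so that the trap fires while the rest of the script can still inspect the result.

```shell
#!/usr/bin/env bash
# MARKER records the scratch file's path so the parent shell can check it later.
MARKER=$(mktemp)

(
    SCRATCH=$(mktemp)               # temporary file that must not be left behind
    # Single quotes delay expansion until the trap actually runs.
    trap 'rm -f "$SCRATCH"' EXIT
    echo "$SCRATCH" > "$MARKER"
    echo "Working with $SCRATCH"
)   # the subshell ends here, which triggers its EXIT trap

SCRATCH_PATH=$(cat "$MARKER")
if [ -e "$SCRATCH_PATH" ]; then
    echo "Temp file still exists"
else
    echo "Temp file was removed by the EXIT trap"
fi
rm -f "$MARKER"
```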
Write a script that:
- Monitors if a process (by name) is running
- Restarts it if it dies
- Logs restarts with timestamp
- Handles SIGTERM gracefully
Create a script that:
- Runs multiple commands in parallel (background)
- Limits concurrent jobs to N at a time
- Collects exit codes
- Reports which succeeded/failed
RegEx & grep
Powerful Text Processing
grep -E "^[0-9]+" data.txt      # Find lines starting with numbers
sed -i 's/old/new/g' file.txt   # Replace in file (GNU sed; BSD/macOS sed needs -i '')
awk '{print $1}' log.txt        # Print first column
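The three commands above can be tried without any files by feeding them text on standard input. The sketch below uses a made-up SAMPLE string. The sed call omits -i, so it prints the result and modifies nothing.

```shell
#!/usr/bin/env bash
# SAMPLE is made-up input for the demo.
SAMPLE=$'42 apples\nno digits here\n7 pears\nline 9 has a digit later'

# -E enables extended regular expressions; ^[0-9]+ requires digits at line start.
MATCHES=$(printf '%s\n' "$SAMPLE" | grep -E '^[0-9]+')

# -c prints the number of matching lines instead of the lines themselves.
MATCH_COUNT=$(printf '%s\n' "$SAMPLE" | grep -cE '^[0-9]+')

# Without -i, sed writes the edited text to standard output.
REPLACED=$(printf '%s\n' "$SAMPLE" | sed 's/apples/oranges/g' | head -n 1)

# awk splits each line on whitespace; $1 is the first field.
FIRST_FIELDS=$(printf '%s\n' "$SAMPLE" | awk '{print $1}')

echo "Matching lines:"
echo "$MATCHES"
echo "Match count: $MATCH_COUNT"
echo "After sed: $REPLACED"
```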
DevOps & Automation
Cron Jobs
Schedule scripts to run automatically.
# Edit crontab
crontab -e

# Run every day at midnight
0 0 * * * /path/to/backup.sh
Capstone Project: System Monitor
Create a script that monitors disk usage, CPU, and memory, and sends an alert if limits are exceeded.
# Project requirements:
# 1. Use df, top, and free commands
# 2. Implement logic to check thresholds
# 3. Write results to a log file
# 4. (Optional) Send an email/webhook notification
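As a starting point for requirement 2, here is a rough sketch of a disk-usage threshold check. The THRESHOLD value and both helper functions are assumptions made for illustration, not part of the project specification. The parsing assumes the POSIX output format of df -P, where the fifth column of the second line is the usage percentage.

```shell
#!/usr/bin/env bash
# THRESHOLD and the helper names below are assumptions for this sketch.
THRESHOLD=80

# Print the usage percentage (without the % sign) of the filesystem holding a path.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Succeed (status 0) when usage is at or above the limit, fail otherwise.
exceeds_threshold() {
    local usage=$1 limit=$2
    [ "$usage" -ge "$limit" ]
}

USAGE=$(disk_usage_percent /)
if exceeds_threshold "$USAGE" "$THRESHOLD"; then
    echo "ALERT: / is at ${USAGE}% (limit ${THRESHOLD}%)"
else
    echo "OK: / is at ${USAGE}% (limit ${THRESHOLD}%)"
fi
```

Keeping the comparison in its own function makes it reusable for the CPU and memory checks, which would need their own parsing of top and free output.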
Advanced Scripting Patterns
Content coming soon...
Final Project: Production Shell Scripts
Content coming soon...