
megashell

Published @ 08-02-2024

#unix #c #codam #shell #syscall

Unix shell written in C.

The goal of this project was to build a Unix shell in C.
As a side quest, we also wanted to learn about teamwork, a working git strategy, and some basic unit testing.
This was a group project I did together with the one and only ✨ Iris Van Melsen.

You can check out the finished project over here.


What is a shell?

Simply put, a shell is a program that takes commands from the keyboard and gives them to the operating system to perform. In the old days, it was the only user interface available on a Unix-like system such as Linux. Nowadays, we have graphical user interfaces (GUIs) in addition to command line interfaces (CLIs) such as the shell.

Features


  • ' (single quote) prevents the shell from interpreting meta-characters in the quoted sequence.
  • " (double quote) works like the single quote, except that $ is still interpreted, allowing the use of variables inside the double-quoted sequence.
  • < infile STDIN redirection.
  • << delimiter reads STDIN from a heredoc.
  • > outfile output redirection.
  • >> outfile output redirection in append mode.
  • cmd1 | cmd2 pipeline: redirect the STDOUT from cmd1 into the STDIN of cmd2.
  • $? expands to the most recent exit code of a command.
  • CTRL-D exits the shell.
  • CTRL-C displays a new line.
  • CTRL-\ does nothing.
  • Prompt color indicates the latest exit code.


Builtins

A builtin is a command that is part of the shell itself, instead of a separate program installed on the system.
We added the following builtins:


| Builtin | Description |
| --- | --- |
| echo [-n] | Displays a line of text, with or without a newline at the end. |
| cd | Changes the current working directory. |
| pwd | Prints the current working directory. |
| exit [code] | Exits the shell, with a status code if given. |
| history | Displays the shell's command history. |
| export [name[=value]] | Exports an environment variable to new child processes. |
| env | Prints the current environment. |
| unset [name] | Removes a variable from the environment. |
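
Before running a command, the shell has to decide whether it is one of these builtins. A minimal sketch of that check, assuming a hypothetical helper named is_builtin (the name and the lookup table are illustrative, not taken from the project):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when name is one of the builtins listed above. */
bool	is_builtin(const char *name)
{
	static const char *const	names[] = {
		"echo", "cd", "pwd", "exit", "history", "export", "env", "unset"
	};
	size_t						i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
	{
		if (strcmp(name, names[i]) == 0)
			return (true);
	}
	return (false);
}
```

A real shell would probably map each name to a function pointer instead of returning a bool, so the lookup and the dispatch happen in one step.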



Implementation

The shell reads a line of commands from STDIN using the readline library. The project involves a lot of parsing of that line, so it is recommended to split the code into the following four modules.


Lexer

After reading the string from STDIN using readline, the lexer splits it into smaller pieces known as “tokens”. It stores these tokens in a linked list, so that later stages can go over them with a better understanding of the context they were used in.

For example:
input: ls | cat > $OUT
output: ["ls", TOKEN_TEXT] ["|", TOKEN_PIPE] ["cat", TOKEN_TEXT] [">", TOKEN_GREATER_THAN] ["$OUT", TOKEN_DOLLAR]

As you can see, the input that was just a plain string is now a series of tokens. This is useful because later in the program we can easily distinguish between them and handle them accordingly.
For example, we only want to expand the TOKEN_DOLLAR tokens, and this token type can easily be picked out with an if statement in the Expander.
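
To make the idea concrete, here is a rough sketch of a lexer that produces the token list from the example above. The struct layout and the function names (t_token, lex, tokens_free) are assumptions for illustration, and it only knows the four token types shown here; quotes, < and >> are left out.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum e_token_type
{
	TOKEN_TEXT,
	TOKEN_PIPE,
	TOKEN_GREATER_THAN,
	TOKEN_DOLLAR
}	t_token_type;

typedef struct s_token
{
	char			*text;
	t_token_type	type;
	struct s_token	*next;
}	t_token;

/* Free every token in the list. */
void	tokens_free(t_token *head)
{
	t_token	*next;

	while (head)
	{
		next = head->next;
		free(head->text);
		free(head);
		head = next;
	}
}

/* Copy len bytes of s into a new token and append it to the list. */
static int	token_push(t_token **head, const char *s, size_t len,
		t_token_type type)
{
	t_token	*tok;

	tok = malloc(sizeof(*tok));
	if (!tok)
		return (-1);
	tok->text = malloc(len + 1);
	if (!tok->text)
		return (free(tok), -1);
	memcpy(tok->text, s, len);
	tok->text[len] = '\0';
	tok->type = type;
	tok->next = NULL;
	while (*head)
		head = &(*head)->next;
	*head = tok;
	return (0);
}

/* Split line into tokens. Returns NULL on empty input or failed allocation. */
t_token	*lex(const char *line)
{
	t_token			*head = NULL;
	t_token_type	type;
	size_t			start;
	size_t			i = 0;

	while (line[i])
	{
		if (isspace((unsigned char)line[i]))
		{
			i++;
			continue ;
		}
		start = i;
		if (line[i] == '|')
			type = TOKEN_PIPE;
		else if (line[i] == '>')
			type = TOKEN_GREATER_THAN;
		else if (line[i] == '$')
			type = TOKEN_DOLLAR;
		else
			type = TOKEN_TEXT;
		i++;
		/* Words and variables run until whitespace or an operator. */
		while ((type == TOKEN_TEXT || type == TOKEN_DOLLAR) && line[i]
			&& !isspace((unsigned char)line[i])
			&& line[i] != '|' && line[i] != '>')
			i++;
		if (token_push(&head, line + start, i - start, type) != 0)
			return (tokens_free(head), NULL);
	}
	return (head);
}
```

Calling lex("ls | cat > $OUT") yields the five tokens from the example, in order. Note that in this simplified version a $ in the middle of a word (such as a$B) stays part of a TOKEN_TEXT token.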


Expander

The expander takes the list of tokens from the lexer, finds all of the TOKEN_DOLLAR tokens, and expands them to their corresponding value defined in the shell’s environment variables. For example, $USER would be expanded to joppe, which is my username. If a variable has no corresponding value, it expands to an empty string.
It will also expand all the quote blocks, keeping the spaces intact.
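
The variable lookup itself can be sketched in a few lines. This version assumes a hypothetical function expand_variable and uses the standard getenv, whereas the project may well keep its own copy of the environment; $? and quote handling are not covered.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated expansion of a token text.
 * "$NAME" becomes the value of NAME, or "" when NAME is not set.
 * Any other text is copied unchanged. The caller frees the result.
 */
char	*expand_variable(const char *text)
{
	const char	*value;
	char		*copy;
	size_t		len;

	value = text;
	if (text[0] == '$' && text[1] != '\0')
	{
		value = getenv(text + 1);
		if (!value)
			value = "";
	}
	len = strlen(value);
	copy = malloc(len + 1);
	if (!copy)
		return (NULL);
	memcpy(copy, value, len + 1);
	return (copy);
}
```

The expander would call something like this for every TOKEN_DOLLAR token and replace the token's text with the result.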


Parser

The parser takes the expanded token list and groups each command with its corresponding options, such as the command arguments, STDIN/STDOUT redirections, and the optional heredoc. It puts them into a list of t_cmd_frame, which the executor uses to run the commands.
If the tokens are in an invalid order (for example, a pipe with no command after it), the parser reports a syntax error.
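
As an illustration of the kind of order check this involves, here is a small sketch. The rules are assumptions modelled on how bash behaves, not the project's exact rules, and the function syntax_ok is hypothetical; it works on a plain array of token types to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum e_token_type
{
	TOKEN_TEXT,
	TOKEN_PIPE,
	TOKEN_GREATER_THAN
}	t_token_type;

/* Return true when the token order could form a valid command line. */
bool	syntax_ok(const t_token_type *types, size_t n)
{
	size_t	i;

	for (i = 0; i < n; i++)
	{
		/* A pipe cannot be first, last, or directly after another pipe. */
		if (types[i] == TOKEN_PIPE
			&& (i == 0 || i + 1 == n || types[i - 1] == TOKEN_PIPE))
			return (false);
		/* A redirection must be followed by a file name. */
		if (types[i] == TOKEN_GREATER_THAN
			&& (i + 1 == n || types[i + 1] != TOKEN_TEXT))
			return (false);
	}
	return (true);
}
```

With these rules, ls | cat > out passes, while ls |, | cat and ls > are rejected.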

The t_cmd_frame struct.


| Field | Description |
| --- | --- |
| char **argv | Command and its arguments. |
| char *infile | Path to the STDIN file. |
| char *outfile | Path to the STDOUT file. |
| char *hd_delim | Heredoc delimiter. If this is set, the infile is ignored. |
| bool is_append | Whether to append to the outfile instead of overwriting it. |
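
Written out in C, the struct could look roughly like this. The field names and types come from the description above; the struct tag s_cmd_frame and the helper outfile_flags are assumptions added to show how is_append might be used when opening the outfile.

```c
#include <fcntl.h>
#include <stdbool.h>

typedef struct s_cmd_frame
{
	char	**argv;		/* command and its arguments, NULL-terminated */
	char	*infile;	/* path to the STDIN file, or NULL */
	char	*outfile;	/* path to the STDOUT file, or NULL */
	char	*hd_delim;	/* heredoc delimiter; when set, infile is ignored */
	bool	is_append;	/* append to outfile instead of overwriting it */
}	t_cmd_frame;

/* Hypothetical helper: pick the open(2) flags for the outfile. */
int	outfile_flags(const t_cmd_frame *frame)
{
	if (frame->is_append)
		return (O_WRONLY | O_CREAT | O_APPEND);
	return (O_WRONLY | O_CREAT | O_TRUNC);
}
```

For an input like cat < in.txt >> out.txt, the frame would hold argv = {"cat", NULL}, infile = "in.txt", outfile = "out.txt" and is_append = true.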


Executor

This module handles the actual running of commands. It does this by spawning child processes and setting up IO redirections.

For each heredoc in the list it will:

  • Create a pipe.
  • Spawn a child process in which the heredoc prompt runs.
  • Save the pipe’s read end, allowing us to later hook that end up to the STDIN of the command that the heredoc is associated with.
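
The three steps above can be sketched as follows. This is a simplified version under a few assumptions: the function name heredoc_read_end is made up, and the lines are read from a FILE * with fgets rather than through a readline prompt.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Read lines from in until one equals delim and write them into a pipe.
 * Returns the pipe's read end, or -1 on error.
 * The parent waits for the child before anything reads the pipe, so the
 * heredoc must fit in the pipe buffer in this simplified version.
 */
int	heredoc_read_end(FILE *in, const char *delim)
{
	char	line[1024];
	int		fds[2];
	pid_t	pid;
	size_t	len;

	if (pipe(fds) == -1)
		return (-1);
	pid = fork();
	if (pid == -1)
	{
		close(fds[0]);
		close(fds[1]);
		return (-1);
	}
	if (pid == 0)
	{
		/* Child: only writes, so it does not need the read end. */
		close(fds[0]);
		while (fgets(line, sizeof(line), in))
		{
			len = strcspn(line, "\n");
			if (len == strlen(delim) && strncmp(line, delim, len) == 0)
				break ;
			if (write(fds[1], line, strlen(line)) == -1)
				break ;
		}
		close(fds[1]);
		_exit(0);
	}
	/* Parent: keep only the read end for the command's STDIN. */
	close(fds[1]);
	waitpid(pid, NULL, 0);
	return (fds[0]);
}
```

The returned file descriptor can later be passed to dup2 to become the STDIN of the command the heredoc belongs to.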

After the heredocs have run and the pipes are filled with their content, it will start setting up the pipeline.

The pipeline works as follows:

  • Create a pipe().
  • Spawn a child process using fork().
  • If there is more than one command in the pipeline, set up its IO to read from the previous command’s output through that command’s pipe, and/or to write to the current pipe so the next command can read from it.
  • If the command is a builtin, just run it; otherwise search for its location in the PATH variable or the current working directory.
  • (In case of a non-builtin) check if the user has the right permissions.
  • Actually execute the command using execve.

When all the commands have run, use waitpid to gather their exit codes and handle any errors that occurred.
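
A small sketch of that last step, assuming the child pids were collected in an array while spawning. The function name wait_all is made up, and reporting 128 plus the signal number for a command killed by a signal is the common shell convention rather than something confirmed for this project.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Wait for every child in pids and return the exit code of the last one,
 * which is the value $? would expand to afterwards.
 */
int	wait_all(const pid_t *pids, size_t n)
{
	int		status;
	int		code = 0;
	size_t	i;

	for (i = 0; i < n; i++)
	{
		if (waitpid(pids[i], &status, 0) == -1)
			code = 1;
		else if (WIFEXITED(status))
			code = WEXITSTATUS(status);
		else if (WIFSIGNALED(status))
			code = 128 + WTERMSIG(status);
	}
	return (code);
}
```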

Yup, that’s about it.

© 2024 Joppe Boeve. All rights reserved.