42sh Implementation

At all times, you MUST keep in mind that this is the order of priority of instructions in case of conflicting information:

  1. The subject
  2. The SCL
  3. The behavior of bash --posix

1. Non-existing Shell Script

What is the expected behavior when trying to parse a command from a shell script which does not exist? For instance:

42sh$ 42sh doesnotexist.sh

Answer:

This will not be tested so you can do whatever you want.

2. Variable Expansions in for Loops

Related News

When running the given test, we notice that bash --posix displays "hello world" with one space. However, our program displays "hello  world" with two spaces. Is that due to us not having handled the IFS yet?

Given test:

for a in " hello $@ world "
do
echo $a;
done

Answer:

Before Step 4, you do not handle field splitting, so it is normal for the output to differ. Since $a is not quoted in echo $a, its expansion undergoes field splitting at that point in the program: bash --posix splits the value on the IFS characters, and echo prints the resulting fields separated by a single space.
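The difference can be observed by printing the loop variable both unquoted and quoted. This is a minimal sketch, assuming the script is run without positional parameters (forced here with set --); run_loop is a hypothetical helper name, and the brackets are only there to make the spaces visible.

```shell
# Hypothetical helper reproducing the given test.
run_loop() {
    set --                       # assumption: no positional parameters, so $@ adds nothing
    for a in " hello $@ world "
    do
        echo $a                  # unquoted: field splitting yields the fields "hello" and "world"
        echo "[$a]"              # quoted: the value is kept as is, spaces included
    done
}

run_loop
```

With field splitting implemented, the first line has a single space between the two words, while the second keeps both inner spaces as well as the leading and trailing ones.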

3. Implementation Defined Features

Related News 1

The SCL specifies in rule #1 of cd, regarding the HOME environment variable, that the default behavior is implementation-defined. What should we do about that?

Related News 2

Section 2.8.2 of the SCL reads:

The exit status of a command that terminated because it received a signal shall be reported as greater than 128.

How can we check that we have the correct exit status?

Answer:

This means that you are free to do whatever you deem best-fitting. These cases will not be tested.
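For the second question, one way to look at the reported status is to run a command that terminates itself with a signal and then read $?. The sketch below assumes a POSIX sh is available on the machine; the SCL only requires a value greater than 128, so the exact number you see depends on the shell running the script.

```shell
# Start a child shell that sends SIGTERM to itself, and capture its exit status.
status=0
sh -c 'kill -TERM $$' || status=$?

# The SCL only guarantees "greater than 128" for a signal-terminated command.
if [ "$status" -gt 128 ]; then
    echo "terminated by a signal (status $status)"
fi
```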

4. "**" Operator in Arithmetic Expansion

Related News

It is said that we have to implement the "**" operator. However, it is not POSIX compliant. Do we have to implement it regardless?

Answer:

Yes. You can mimic the behavior of bash --posix.
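To see what there is to mimic, the reference shell can be queried directly. This is a small sketch, assuming bash is installed; the variable names power and chained are only illustrative.

```shell
# "**" is exponentiation in bash arithmetic expansion.
power=$(bash --posix -c 'echo $((2 ** 10))')

# It groups from right to left: 2 ** (3 ** 2), not (2 ** 3) ** 2.
chained=$(bash --posix -c 'echo $((2 ** 3 ** 2))')

printf '%s %s\n' "$power" "$chained"
```

The right-to-left grouping is worth covering in your own tests, since (2 ** 3) ** 2 would give 64 instead of 512.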

5. Redirection Grammar Imprecision

Related News

For redirections, the grammar does not expect an IONUMBER at the end of the rule:

redirection =
[IONUMBER] ( '>' | '<' | '>>' | '>&' | '<&' | '>|' | '<>' ) WORD
;

Should we not handle cases such as 2>&1? If not, I do not understand the use of <& when an IONUMBER is not given at the end.

Answer:

Both the subject and the SCL specify a WORD token and NOT an IONUMBER at the end. However, in the case of >& and <&, if the WORD is neither a valid number nor -, the behavior is not specified (sections 2.7.5 and 2.7.6). Thus, we do not test this.
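As an illustration of how 2>&1 fits the rule, the sketch below uses a WORD that happens to be a valid number. The variable name merged is only illustrative.

```shell
# ">&2" has no IONUMBER, so it applies to fd 1; its WORD is "2".
# "2>&1" has IONUMBER 2; its WORD is "1".
merged=$( { echo oops >&2; } 2>&1 )

printf '%s\n' "$merged"
```

The message written to stderr inside the braces ends up in the captured stdout, because the trailing WORD 1 is read as a file descriptor number.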

6. Non-XBD Name Expansion Inconsistency

Related News 1 Related News 2

In a for Loop, when the variable does not correspond to an XBD name such as:

for % in a b c; do echo $%; done;

Should we return an exit code between 1 and 125, inclusive, and halt execution because the error is within the expansion, or should we proceed by displaying 2?

Answer:

In this case, bash --posix is not compliant with the SCL: the SCL specifies that the for Loop variable must respect XBD for the grammar to be valid. However, bash --posix allows, at parsing, for non-XBD-compliant variables. Thus, the error is only noticed at execution, and only halts execution when --posix is invoked. We will not test this.

7. Redirection Towards Stream

Related News

When testing echo toto 1>&- using bash --posix and dash, the SCL's requested behavior is not observed. Normally, fd 1 (stdout) should be closed. However, if an echo is run after, such as echo tata, tata is displayed on stdout. Which behavior should we follow?

Answer:

Redirections are only effective in the execution context of the command(s) they are associated with. For this example, fd 1 should be closed when executing echo toto, then restored. This applies for all types of redirections.
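The scoping described above can be checked by capturing the output of both commands. This is a minimal sketch; the error message of the first echo is silenced and its failure ignored so that only the effect on stdout is visible.

```shell
# fd 1 is closed only while the first echo runs, so its write fails.
# fd 1 is restored afterwards, so the second echo writes normally.
out=$(
    echo toto 1>&- 2>/dev/null || true
    echo tata
)

printf '%s\n' "$out"
```

Only tata is captured: the redirection on the first command does not leak into the second one.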

8. Variable Declaration Inconsistency with Builtins

Related News

Using bash --posix and dash, the following script

abc=3 . ./non-existing.file
echo $abc

results in:

.: cannot open ./non-existing.file
3

The same script with another builtin does not declare the variable abc:

abc=3 echo test$abc
echo test$abc

results in:

test
test

In the SCL, the dot builtin's description does not specify this behavior. Should we reproduce it or is it an imprecision?

Answer:

dot is a Special Built-in Utility. According to rule 2.9.1 of the SCL, if the command to execute is a Special Built-in Utility, then the assignment should affect the shell's environment, and not just the command itself. However, this will not be tested. Furthermore, echo is described as neither a Built-in Utility nor a Special Built-in Utility, but as a simple Utility. Therefore, every variable assignment linked to the command must remain within the scope of the command itself.
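The distinction can be reproduced with two commands that never fail. In this sketch, : stands in for a Special Built-in Utility and true for a regular one; both are substitutions made for the illustration, not commands taken from the question, and bash is assumed to be installed.

```shell
# ":" is a Special Built-in Utility: the assignment persists after the command.
special=$(bash --posix -c 'abc=3 :; echo "test$abc"')

# "true" is not a Special Built-in Utility: the assignment is gone afterwards.
regular=$(bash --posix -c 'abc=3 true; echo "test$abc"')

printf '%s\n%s\n' "$special" "$regular"
```

The first line printed is test3 and the second is test, which matches the rule quoted from section 2.9.1.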

9. Shell Interpretation and Quoting

Related News

When testing ./src/42sh -c "echo \t", the shell in which the command is executed interprets the backslash, which means that the actual input to the src/42sh command is echo t. However, using single quotes such as ./src/42sh -c 'echo \t' does not interpret the backslash, and the input is echo \t. Should we reproduce this behavior?

Answer:

Yes. You need to take it into account in your tests. However, this behavior is independent of your shell implementation.

In your tests, you should use double quotes rather than single quotes. Even if single quotes might seem to work for escape sequences (e.g., \t), they will not work when you want to test the expansion of such sequences. Indeed, a single quote cannot appear between a pair of single quotes (for instance, 'to'to' is not valid).

To preserve escape sequences when calling your binary, you can simply escape the backslash by doubling it. For example, if you want to verify that echo -e "to\tto" works, you should call your binary as ./src/42sh -c "echo -e \"to\\tto\"" so that the backslash is preserved after the expansion performed by your shell. Note that you must also escape the double quotes in this case.
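To check which string actually reaches your binary, it can help to replace the binary with something that prints its argument verbatim. In the sketch below, show_arg is a hypothetical helper standing in for ./src/42sh -c; it uses printf with %s so that nothing is interpreted on the way out.

```shell
# Hypothetical stand-in for "./src/42sh -c": print the received argument as is.
show_arg() {
    printf '%s\n' "$1"
}

# Doubled backslash and escaped double quotes, as recommended above.
doubled=$(show_arg "echo -e \"to\\tto\"")

# Single quotes: the calling shell leaves every character untouched.
single=$(show_arg 'echo \t')

printf '%s\n%s\n' "$doubled" "$single"
```

The first line shows echo -e "to\tto", which is the command your shell is expected to parse, and the second shows echo \t.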

10. Parse Execution Loop

While reading the grammar provided on the trove, we found no sequence of rules that would allow writing several simple commands separated only by newlines (as one would typically find in a script). With bash --posix, this is possible.

Here is an example of the kind of code:

echo toto
echo tata

According to the trove grammar, we have:

list -> and_or -> pipeline -> command -> simple_command

Is this due to a misunderstanding, an omission in the subject, or a deliberate design choice for 42sh?

Answer:

As a matter of fact, this sequence is invalid according to the grammar. However, your shell should implement a parse-execution loop that allows executing several commands one after the other.

In the given example, the first echo toto corresponds to a valid simple_command, which is executed and ends the first list. After its execution, your shell should then parse the next input, echo tata, which is also a valid simple_command, and execute it as well. Eventually, an EOF will be encountered, which will terminate the parse-execution loop.
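The shape of such a loop can be sketched in shell itself. This is only an illustration under a strong simplifying assumption: each input line is treated as one complete list, with read playing the role of the parser and eval the role of the executor. parse_execute_loop is a hypothetical name, and a real implementation would rely on its own lexer and parser instead.

```shell
# Hypothetical sketch: one list per line, parsed then executed before reading on.
parse_execute_loop() {
    while IFS= read -r line; do   # "parse" the next list; read fails at EOF
        eval "$line"              # execute it before parsing anything else
    done
}

printf 'echo toto\necho tata\n' | parse_execute_loop
```

Each command is executed as soon as it has been read, and the loop stops when the end of the input is reached.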