I have an existing script that I’m trying to modify to delete certain information from a file called portscan.txt. The file contains port information for domains and I want to delete the information only if certain conditions are met. The conditions include:
- The file with the domains should have 2 or fewer lines.
- The ports should only be 80 or 443 (i.e., I don’t want to delete the file if 8080, 8443, or any other port exists in the file).
I have tried scripting this out and here’s what I have:
#!/bin/bash
lines=$(cat portscan.txt | wc -l)
ports=$(cat portscan.txt | grep -Pv '(^|[^0-9])(80|443)($|[^0-9])')
if [[ $lines < 3 ]] && (( $ports != 80 || $ports != 443 )); then
echo "I need to delete this"
else
echo "I will NOT delete this..."
fi
I attempted nested if statements because I was unable to do a condition like this: “IF portscan.txt is less than two lines AND the ports are NOT 80 OR 443.” I also tried using (( because I read that this is better to be used with arithmetic functions, but I’m not as bash savvy with conditional arguments when it should be like: “This AND that OR that”. Any help would be greatly appreciated!
3 Answers
Introduction
In Linux, Bash is a popular shell language used for scripting and automation tasks. When working with files, it’s common to have to delete certain information based on specific conditions. In this blog post, we will explore how to remove file content if multiple conditions are met using Bash scripting. We will use an example of deleting information from a file called portscan.txt only if certain conditions are satisfied.
Understanding the Problem
The first step to solving this problem is to understand the conditions that need to be met before deleting information from the file. The conditions are:
- The file with the domains should have 2 or fewer lines.
- The ports should only be 80 or 443 (i.e., I don’t want to delete the file if 8080, 8443, or any other port exists in the file).
We need to check if these conditions are satisfied before deleting any information from the file. If any of these conditions are not met, we should not delete any information from the file.
Writing the Bash Script
Let’s take a look at the Bash script that we have so far:
#!/bin/bash
lines=$(cat portscan.txt | wc -l)
ports=$(cat portscan.txt | grep -Pv '(^|[^0-9])(80|443)($|[^0-9])')
if [[ $lines < 3 ]] && (( $ports != 80 || $ports != 443 )); then
echo "I need to delete this"
else
echo "I will NOT delete this..."
fi
The script starts with the shebang line, which tells the system to use the Bash shell to execute the script. The next two lines store the number of lines in the file in the variable lines and the output of grep in the variable ports.
The if statement then tests both conditions: that the file has fewer than 3 lines, and that the value in ports is not equal to 80 or 443. If both tests succeed, the script prints “I need to delete this”. Otherwise, it prints “I will NOT delete this…”.
Understanding the Code
Let’s break down the code and understand how it works.
The first line of the script is the shebang line. It tells the system to use the Bash shell to execute the script.
#!/bin/bash
The next two lines of the script store the number of lines in the file and the output of grep in the variables lines and ports, respectively.
lines=$(cat portscan.txt | wc -l)
ports=$(cat portscan.txt | grep -Pv '(^|[^0-9])(80|443)($|[^0-9])')
The cat command writes the contents of the file to standard output, which is piped to wc -l. The wc -l command counts the newline characters in its input, that is, the number of lines, and the result is stored in the lines variable.
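As a quick check of how wc -l behaves, the sketch below counts lines in a throwaway file. The sample host:port lines and the temporary file are assumptions for illustration, not the real portscan.txt.

```shell
#!/bin/bash
# Hypothetical sample data; the host:port layout is an assumption.
tmp=$(mktemp)
printf 'example.com:80\nexample.com:443\n' > "$tmp"

# Redirecting with < makes wc print only the number, with no file name.
lines=$(wc -l < "$tmp")
echo "lines=$lines"

# wc -l counts newline characters, so a final line without one is not counted.
printf 'example.com:80\nexample.com:443' > "$tmp"
unterminated=$(wc -l < "$tmp")
echo "unterminated=$unterminated"

rm -f "$tmp"
```

Because wc -l counts newline characters, a file whose last line is not newline-terminated reports one line fewer than it appears to have, which matters for a threshold as small as 2.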
The grep -Pv '(^|[^0-9])(80|443)($|[^0-9])' command prints the lines in the file that do not contain the ports 80 or 443. The -P option enables Perl-compatible regular expressions. The -v option inverts the search, so grep selects the lines that do not match the pattern. The regular expression '(^|[^0-9])(80|443)($|[^0-9])' matches 80 or 443 only when the number is not preceded or followed by another digit, so 8080 and 8443 do not match. The output of the grep command is stored in the ports variable, which is therefore empty when every line contains 80 or 443.
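To see which lines the pattern selects, here is a small sketch on made-up input. The host:port lines are an assumption about the file layout, and -E is used in place of -P because this particular pattern needs no Perl-only syntax.

```shell
#!/bin/bash
# Hypothetical scan output; the host:port layout is an assumption.
sample=$'example.com:80\nexample.com:443\nexample.com:8080\nexample.com:8443'

# Same pattern as in the script; it is also valid as an extended regex (-E).
pattern='(^|[^0-9])(80|443)($|[^0-9])'

# Lines that contain 80 or 443 as a standalone number.
standard=$(grep -E "$pattern" <<< "$sample")

# -v inverts the match: lines with no standalone 80 or 443.
other=$(grep -Ev "$pattern" <<< "$sample")

printf 'standard ports:\n%s\n' "$standard"
printf 'other ports:\n%s\n' "$other"
```

With this input, the 8080 and 8443 lines end up in other, which is the value the script stores in ports.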
The next line of the script checks if the conditions are met before deleting any information from the file.
if [[ $lines < 3 ]] && (( $ports != 80 || $ports != 443 )); then
echo "I need to delete this"
else
echo "I will NOT delete this..."
fi
The if
statement checks if the number of lines in the file is less than 3 and if the ports in the file are not equal to 80 or 443. If both conditions are satisfied, the script prints “I need to delete this”. Otherwise, it prints “I will NOT delete this…”.
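Two details of this test are easy to check in isolation. The following sketch, using made-up values for lines and port, compares how [[ ]] with <, [[ ]] with -lt, and (( )) treat the same numbers, and what an "or" of two != comparisons evaluates to.

```shell
#!/bin/bash
lines=10   # made-up value for illustration

# Inside [[ ]], < compares strings, and "10" sorts before "3".
if [[ $lines < 3 ]]; then string_result=true; else string_result=false; fi

# -lt inside [[ ]] and < inside (( )) compare integers.
if [[ $lines -lt 3 ]]; then lt_result=true; else lt_result=false; fi
if (( lines < 3 )); then arith_result=true; else arith_result=false; fi

# x != 80 || x != 443 holds for every number, because no value equals both.
port=80    # made-up value for illustration
if (( port != 80 || port != 443 )); then or_result=true; else or_result=false; fi

echo "string: $string_result, -lt: $lt_result, (( )): $arith_result, or-test: $or_result"
```

The string comparison reports that 10 is less than 3, while the two integer comparisons do not, and the "or" test succeeds even when the port is 80.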
Improving the Code
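The Bash script we have so far does not behave as intended. Inside [[ ]], the < operator compares strings rather than integers, and (( )) is an arithmetic test, so it cannot evaluate the lines of text that grep returns; even for a single number, $ports != 80 || $ports != 443 is always true. We can fix this and also make the script more efficient and readable.
Instead of using the cat command to read the file and then piping it to another command, we can use the < operator to redirect the file to the command's standard input. This is more efficient because it avoids starting an extra cat process and a pipe.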
lines=$(wc -l < portscan.txt)
ports=$(grep -Pv '(^|[^0-9])(80|443)($|[^0-9])' < portscan.txt)
We can also simplify the regular expression used in the grep command by using the -w option, which matches whole words only. The -E option enables extended regular expressions, so that | means “or”.
ports=$(grep -Ewv '80|443' < portscan.txt)
The grep -Ewv '80|443' command prints the lines in the file that do not contain the ports 80 or 443. With -w, a match must not have a letter, digit, or underscore directly before or after it, so 80 does not match inside 8080 and 443 does not match inside 8443. The -v option inverts the match.
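The effect of -w on port numbers can be checked with a short sketch. The sample lines are made up and assume a host:port layout; -E is added so that | acts as alternation.

```shell
#!/bin/bash
# Hypothetical scan output; the host:port layout is an assumption.
sample=$'example.com:80\nexample.com:8080\nexample.com:443\nexample.com:8443'

# -w requires the match to be bounded by non-word characters or line edges,
# so 80 does not match inside 8080, and 443 does not match inside 8443.
whole_word=$(grep -Ew '80|443' <<< "$sample")

# Without -w the same pattern also matches inside 8080 and 8443.
substring=$(grep -E '80|443' <<< "$sample")

printf 'with -w:\n%s\n' "$whole_word"
printf 'without -w:\n%s\n' "$substring"
```

One caveat of -w is that it looks at the whole line, so a hostname such as 80.example.com would also count as containing the word 80.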
The improved Bash script looks like this:
#!/bin/bash
lines=$(wc -l < portscan.txt)
ports=$(grep -Ewv '80|443' < portscan.txt)
if [[ $lines -lt 3 ]] && [[ -z $ports ]]; then
echo "I need to delete this"
else
echo "I will NOT delete this..."
fi
The [[ -z $ports ]] condition checks if the ports variable is empty. If it is empty, grep found no lines that lack the ports 80 or 443, which means every line contains one of them.
Conclusion
In this blog post, we explored how to remove file content if multiple conditions are met using Bash scripting. We used an example of deleting information from a file called portscan.txt only if certain conditions were satisfied. We learned how to use the cat command to print the contents of a file, how to use the wc command to count the number of lines in a file, and how to use the grep command to search for patterns in a file. We also learned how to use the < operator to redirect the contents of a file to another command, how to simplify regular expressions using the -w option, and how to use the -z test to check if a variable is empty.
To delete the file portscan.txt if certain conditions are met, you can use the following script:
#!/bin/bash
# Count the number of lines in the file
lines=$(wc -l < portscan.txt)
# Check if the number of lines is less than or equal to 2
if [[ $lines -le 2 ]]; then
    # Check if any line lacks a standalone 80 or 443 (-q suppresses grep's output)
    if grep -Pqv '(^|[^0-9])(80|443)($|[^0-9])' portscan.txt; then
        # If there are any ports that are not 80 or 443, do not delete the file
        echo "I will NOT delete this..."
    else
        # If all the ports are 80 or 443, delete the file
        rm portscan.txt
        echo "I have deleted the file."
    fi
else
    # If the number of lines is greater than 2, do not delete the file
    echo "I will NOT delete this..."
fi
This script first checks if the number of lines in the file is less than or equal to 2. If it is, it then checks if all the ports in the file are either 80 or 443. If all the ports are 80 or 443, it deletes the file. If any of the ports are not 80 or 443, or if the number of lines is greater than 2, it does not delete the file.
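The question asked for a single “this AND that” condition rather than nested if statements. One possible sketch is shown here; should_delete is a hypothetical helper name, the sample files assume a host:port layout, and -E replaces -P because the pattern needs no Perl-only syntax.

```shell
#!/bin/bash
# should_delete is a hypothetical helper, not part of the original script.
# It returns 0 when the file has at most 2 lines AND no line lacks a
# standalone 80 or 443.
should_delete() {
    local file=$1
    local pattern='(^|[^0-9])(80|443)($|[^0-9])'
    # grep -q prints nothing; with -v it succeeds if any line lacks the pattern.
    (( $(wc -l < "$file") <= 2 )) && ! grep -Eqv "$pattern" "$file"
}

# Demonstration on throwaway files (assumed host:port layout).
dir=$(mktemp -d)
printf 'example.com:80\nexample.com:443\n' > "$dir/ok.txt"
printf 'example.com:80\nexample.com:8080\n' > "$dir/odd_port.txt"
printf 'a.com:80\nb.com:443\nc.com:80\n' > "$dir/long.txt"

for f in "$dir"/*.txt; do
    if should_delete "$f"; then
        echo "${f##*/}: would delete"
    else
        echo "${f##*/}: keep"
    fi
done

rm -r "$dir"
```

Using grep's exit status directly avoids storing its output in a variable, and the && between the two commands expresses the AND without nesting.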
I hope this helps! Let me know if you have any questions.
checkfile() {
    awk '
        BEGIN {
            FS = ":"
            status = 1
            ports[80] = 1
            ports[443] = 1
        }
        NR == 3 || !($2 in ports) {status = 0; exit}
        END {exit status}
    ' "$1"
}
file=portscan.txt
checkfile "$file" || echo rm -- "$file"
If I run the awk command on a file, it will exit with a status of 0 if the file has a third line or if it contains a “non-standard” port. On the other hand, if the file has two or fewer lines and only contains “standard” ports, the function will return a non-zero value, and the rm command will be printed. I can remove the echo command if the results look right.
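A dry run on throwaway files shows the exit-status convention in practice. This is only a sketch: the sample files assume one host:port pair per line, and report is a hypothetical wrapper added for the demonstration.

```shell
#!/bin/bash
# Same logic as checkfile above: exit 0 means "keep", non-zero means "safe to delete".
checkfile() {
    awk '
        BEGIN { FS = ":"; status = 1; ports[80] = 1; ports[443] = 1 }
        NR == 3 || !($2 in ports) { status = 0; exit }
        END { exit status }
    ' "$1"
}

# report is a hypothetical wrapper that turns the exit status into a word.
report() {
    if checkfile "$1"; then echo keep; else echo delete; fi
}

dir=$(mktemp -d)
printf 'example.com:80\nexample.com:443\n' > "$dir/ok.txt"
printf 'example.com:8080\n' > "$dir/odd_port.txt"
printf 'a.com:80\nb.com:443\nc.com:80\n' > "$dir/long.txt"

ok_result=$(report "$dir/ok.txt")
odd_result=$(report "$dir/odd_port.txt")
long_result=$(report "$dir/long.txt")
echo "ok.txt: $ok_result, odd_port.txt: $odd_result, long.txt: $long_result"

rm -r "$dir"
```

Only the file with two lines and standard ports is reported as deletable; the other two are kept.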
Alternately:
checkfile() {
    # if more than 2 lines, keep the file
    (( $(wc -l < "$1") > 2 )) && return 0
    # if a "non-standard" port exists, keep the file
    grep -qv -e ':80$' -e ':443$' "$1" && return 0
    # delete the file
    return 1
}
or, more tersely
checkfile() {
    (( $(wc -l < "$1") > 2 )) || grep -qv -e ':80$' -e ':443$' "$1"
}
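Because the grep patterns are anchored with $, the port has to be the last thing on the line. The sketch below, on made-up input, shows one consequence under that assumption: with GNU grep on Linux, a file with Windows (CRLF) line endings is treated as having a “non-standard” port and is kept.

```shell
#!/bin/bash
# Same terse checkfile as above: exit 0 means "keep", non-zero means "delete".
checkfile() {
    (( $(wc -l < "$1") > 2 )) || grep -qv -e ':80$' -e ':443$' "$1"
}

f=$(mktemp)

# Unix line ending: the line ends in :80, so nothing is "non-standard".
printf 'example.com:80\n' > "$f"
if checkfile "$f"; then plain=keep; else plain=delete; fi

# CRLF line ending: a carriage return sits between 80 and the end of the line.
printf 'example.com:80\r\n' > "$f"
if checkfile "$f"; then crlf=keep; else crlf=delete; fi

rm -f "$f"
echo "plain=$plain crlf=$crlf"
```

Erring toward keeping the file is the safe direction here, but it is worth knowing if a scan file unexpectedly never gets deleted.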